Splitting this conversation into a separate thread...
While I agree that being able to transmit type information with logs
is a noble goal, there are many nuances, especially across JSON and
XML.
JSON handles a few basic types well, namely string, int, double,
boolean, but will require additional work to support other types, such
as datetime.
We need to determine if this is worth addressing. Seeing that the most
popular format will probably be JSON over Syslog, we will lose the
type information if it is not made available.
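To make the datetime gap concrete, here is a minimal Python sketch. The
field names and the ISO 8601 fallback are my own assumptions for
illustration, not an agreed convention. The basic types survive a round
trip, but the timestamp comes back as a plain string:

```python
import json
from datetime import datetime, timezone

# Hypothetical event; the field names are illustrative only.
event = {
    "msg": "login failed",   # string
    "port": 22,              # int
    "load": 0.75,            # double
    "success": False,        # boolean
    "time": datetime(2012, 3, 21, 16, 12, tzinfo=timezone.utc),
}

def encode_extra(value):
    """Fallback for values the json module cannot encode natively.

    Writing datetimes as ISO 8601 strings is an assumed convention.
    """
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f"cannot encode {type(value).__name__}")

wire = json.dumps(event, default=encode_extra)
decoded = json.loads(wire)

# The receiver sees only a string here; the fact that it was a
# datetime is gone unless it is conveyed some other way.
print(type(decoded["port"]).__name__, type(decoded["time"]).__name__)
```

Without the `default=` hook, `json.dumps` raises a TypeError on the
datetime value, which is the "additional work" in question.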
XML has more flexibility with typing, but only in combination with XML
Schema. This means that you either have to define all of the field
names a priori in XML Schema, or define a minimal schema that binds
type information to predefined type elements.
For example, in order to support this
`<Event><dst_ip>1.2.3.4</dst_ip></Event>`
I need to have a related XML Schema that defines dst_ip as having a
type of IPv4 Address (otherwise it will be treated as a string or
duck-typed into an IPv4 address).
Alternatively, the type can be carried in an attribute:
`<Event><dst_ip type="ipv4">1.2.3.4</dst_ip></Event>`
This poses similar issues. XML Schema cannot validate the dst_ip value
against its @type attribute (though this is fairly trivial to do with
XSLT or similar). There is also the question of which type wins if
dst_ip is defined as xs:int in the schema but @type is "ipv4".
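As a rough illustration of doing that check outside of XML Schema, here
is a Python sketch that compares each field's value with its @type
attribute. The VALIDATORS table and the type names in it are
hypothetical, and this only covers atomic values:

```python
import ipaddress
import xml.etree.ElementTree as ET

def is_ipv4(text):
    """True if text parses as a dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False

def is_int(text):
    """True if text parses as an integer."""
    try:
        int(text)
        return True
    except ValueError:
        return False

# Hypothetical mapping from @type names to checks; not from any schema.
VALIDATORS = {"ipv4": is_ipv4, "int": is_int}

def check_types(xml_text):
    """Return (field, declared type, value) for every mismatch."""
    root = ET.fromstring(xml_text)
    mismatches = []
    for field in root:
        declared = field.get("type")
        validator = VALIDATORS.get(declared)
        # Fields with no @type, or an unknown one, are skipped.
        if validator is not None and not validator(field.text or ""):
            mismatches.append((field.tag, declared, field.text))
    return mismatches

print(check_types('<Event><dst_ip type="int">1.2.3.4</dst_ip></Event>'))
```

This catches a value that does not fit its own @type. It does not
decide what to do when the schema and @type disagree.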
Also, this approach works well for atomic types, but does not work as
well if the value is a structure and contains child elements.
For the best compatibility with XML Schema:
`<Event><ipv4 name="dst_ip">1.2.3.4</ipv4></Event>`
This works better for XML Schema validation, but it is not as natural
to use as the former examples.
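For what it is worth, a consumer of the type-element form is simple to
write. The sketch below is Python and rests on assumptions: the
CONVERTERS table, the "int" and "string" type elements, and the
dst_port field are all made up for the example.

```python
import ipaddress
import xml.etree.ElementTree as ET

# Hypothetical set of type elements; a real minimal schema would fix
# this list.
CONVERTERS = {
    "ipv4": ipaddress.IPv4Address,
    "int": int,
    "string": str,
}

def parse_event(xml_text):
    """Build a {name: typed value} dict from type-named elements."""
    root = ET.fromstring(xml_text)
    fields = {}
    for el in root:
        convert = CONVERTERS[el.tag]  # KeyError on an unknown type element
        fields[el.get("name")] = convert(el.text or "")
    return fields

event = parse_event(
    '<Event>'
    '<ipv4 name="dst_ip">1.2.3.4</ipv4>'
    '<int name="dst_port">514</int>'
    '</Event>'
)
print(event)
```

The element name carries the type and @name carries the field name, so
the type is known without looking anything up per field. Structures
with child elements are not handled here.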
I have no problem with either of the above solutions. After some
thought, option #2 might be the best, but we need to figure out how to
handle/represent structures and make this representable with XML
Schema. As I mentioned above, this is fairly trivial for atomic types,
but I don't know how to do it for structures.
On Wed, Mar 21, 2012 at 4:12 PM, Botond Botyanszki <boti(a)nxlog.org> wrote:
> On Wed, 21 Mar 2012 14:15:47 -0400
> William Heinbockel <wheinbockel(a)gmail.com> wrote:
>
>> On Wed, Mar 21, 2012 at 2:12 PM, Dmitri Pal <dpal(a)redhat.com> wrote:
>> > On 03/20/2012 12:00 PM, david(a)lang.hm wrote:
>> >> On Tue, 20 Mar 2012, Gergely Nagy wrote:
>> >>
>> >>> david(a)lang.hm writes:
>> >>>
>> >>>> I think that we are going to need a type system before long.
>> >>>
>> >>> Yeah, but not in JSON, where it would be bolted upon.
>> >>
>> >> That's reasonable. It just means we need to support more than just
>> >> JSON soon :-)
>> >
>> > Type system of JSON is good enough. It might be a good compromise between
>> > no types and everything has a schema.
> I'd call it 'better than nothing'. There are some types lacking, most
> notably the DateTime type, which are mostly essential in our case.
>
>> +1
>> While I have nothing against explicit typing, I don't see the need.
> -1
> If you only think about forwarding and storing text (based logs),
> probably there is no need for that. But once you need to analyze the data
> where you compare and sort values, knowing the type of the value is pretty
> much required.
>
>> I would like to have some way to align the JSON structures with XML
>> representations, though. The only real issue here is the mapping of
>> JSON arrays to a similar XML structure.
> I think mapping arrays is pretty straightforward:
> JSON:
> { "addr":["1.2.3.4","2.3.4.5"] }
> XML:
> <event>
> <addr>1.2.3.4</addr>
> <addr>2.3.4.5</addr>
> </event>
> The problem here is mapping the type information, which we discussed
> earlier and mostly agreed that squeezing it into JSON gets a little ugly.
>
Yep
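
The array mapping quoted above can be sketched in a few lines of
Python. This is only an illustration for flat events; the function name
and the root_tag parameter are my own:

```python
import json
import xml.etree.ElementTree as ET

def json_to_xml(json_text, root_tag="event"):
    """Map a flat JSON object to XML, repeating the element for arrays."""
    data = json.loads(json_text)
    root = ET.Element(root_tag)
    for key, value in data.items():
        # A scalar is treated like a one-element array.
        values = value if isinstance(value, list) else [value]
        for item in values:
            child = ET.SubElement(root, key)
            child.text = str(item)
    return ET.tostring(root, encoding="unicode")

xml_text = json_to_xml('{ "addr":["1.2.3.4","2.3.4.5"] }')
print(xml_text)
```

Note that the reverse direction is lossy: a single `<addr>` element
could have come from either a scalar or a one-element array.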