This repository has been archived by the owner on Jan 22, 2019. It is now read-only.

Support hierarchical delimiters and qualifiers. #21

Open
prb opened this issue Aug 10, 2013 · 4 comments


@prb
Member

prb commented Aug 10, 2013

The goal would be to transform nested delimiters into nested structures; sample input:

a,b;c;d,e;f

Sample output:

[
  [a],
  [b,c,d],
  [e,f]
]
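A minimal sketch of the transformation above, assuming the comma is the outer (column) delimiter and the semicolon is the inner one. The class `NestedSplit` and its `parse` method are hypothetical names used for illustration only, not part of any Jackson API, and quoting is ignored here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class NestedSplit {
    // Hypothetical helper: split a line into columns on the outer delimiter,
    // then split each column into values on the inner delimiter.
    static List<List<String>> parse(String line, String outer, String inner) {
        List<List<String>> result = new ArrayList<>();
        // A limit of -1 keeps trailing empty values instead of dropping them.
        for (String column : line.split(Pattern.quote(outer), -1)) {
            result.add(Arrays.asList(column.split(Pattern.quote(inner), -1)));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [[a], [b, c, d], [e, f]]
        System.out.println(parse("a,b;c;d,e;f", ",", ";"));
    }
}
```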

(Brought over here from a discussion in FasterXML/jackson#2.)

@cowtowncoder
Member

One quick mental note: I think this needs to be an opt-in feature, since it will require re-scanning of column values. And/or it needs support from the higher-level data-binder; we already get a "hint" from the data-binder if an array value is expected (needed to support XML arrays).

Actually, come to think of it, "isExpectedStartArray" is probably needed anyway to support single-element arrays reliably. In addition, the separator needs to be configurable; a default of semicolon seems reasonable.

On the output side it might be possible to make this work with fewer extra settings... a START_ARRAY could indicate a mode in which values are appended with the separator. So perhaps the implementation could start with the output side, as that should be simpler to complete first.
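To illustrate the output-side idea, here is a rough sketch in which each inner list (what a START_ARRAY would introduce) is written as a single column whose values are joined with the inner separator. `NestedJoin` and `write` are hypothetical names, not generator API, and no quoting or escaping is applied.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class NestedJoin {
    // Hypothetical helper: every inner list becomes one column, with its
    // values joined by the inner separator; columns are joined by the outer one.
    static String write(List<List<String>> row, String outer, String inner) {
        return row.stream()
                .map(cell -> String.join(inner, cell))
                .collect(Collectors.joining(outer));
    }

    public static void main(String[] args) {
        List<List<String>> row = Arrays.asList(
                Arrays.asList("a"),
                Arrays.asList("b", "c", "d"),
                Arrays.asList("e", "f"));
        // prints a,b;c;d,e;f
        System.out.println(write(row, ",", ";"));
    }
}
```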

@cowtowncoder
Member

Ok, so: an "inner delimiter" itself sounds reasonable. But how about quoting it? One possibility would be to use doubling (similar to quotes), although it would mean that one could not omit values (i.e. use an empty String as a marker for null).

I think that ideally this should work in a way that allows two-phase tokenization, which is much simpler to implement than the (theoretically more efficient) single-pass, multi-state tokenization.

On the other hand, single-pass tokenization would allow use of an escape character for inner values as well, whereas two-phase does not (because the first pass handles unescaping and thereby makes it impossible for the second pass to skip delimiters that were escaped).
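The trade-off can be seen in a small single-pass sketch: because escapes are resolved in the same loop that recognizes both delimiters, an escaped inner delimiter stays part of the value. `SinglePassTokenizer` is a hypothetical class, the backslash escape is an assumption, and quote handling is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

class SinglePassTokenizer {
    // Hypothetical single-pass parser: one loop handles the escape character,
    // the inner delimiter, and the outer delimiter.
    static List<List<String>> parse(String line, char outer, char inner, char escape) {
        List<List<String>> columns = new ArrayList<>();
        List<String> current = new ArrayList<>();
        StringBuilder value = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == escape && i + 1 < line.length()) {
                // Take the next character literally, even if it is a delimiter.
                value.append(line.charAt(++i));
            } else if (c == inner) {
                current.add(value.toString());
                value.setLength(0);
            } else if (c == outer) {
                current.add(value.toString());
                value.setLength(0);
                columns.add(current);
                current = new ArrayList<>();
            } else {
                value.append(c);
            }
        }
        // Flush the last value and column.
        current.add(value.toString());
        columns.add(current);
        return columns;
    }

    public static void main(String[] args) {
        // Input text is: a,b\;c;d,e
        // prints [[a], [b;c, d], [e]]
        System.out.println(parse("a,b\\;c;d,e", ',', ';', '\\'));
    }
}
```

In a two-phase design, the first pass would already have turned `b\;c;d` into `b;c;d`, so the second pass could no longer tell the escaped semicolon from the real separator.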

@mlvn23

mlvn23 commented Mar 4, 2015

Maybe implement this feature like this: https://github.com/Keyang/node-csvtojson#empowered-json-parser. I think this allows arbitrary nesting of data in JSON while still being easily editable in spreadsheets. TL;DR: it describes the data hierarchy in the header, like data.field[0].property.
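As a rough illustration of the header-path idea (not how node-csvtojson or Jackson implements it), the sketch below turns dotted column names into nested maps. `HeaderPaths` and `nest` are hypothetical names, and array indices such as `field[0]` are deliberately not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HeaderPaths {
    // Hypothetical helper: build nested maps from dotted header names.
    // Assumes no header is both a leaf and a parent of another header.
    @SuppressWarnings("unchecked")
    static Map<String, Object> nest(String[] headers, String[] values) {
        Map<String, Object> root = new LinkedHashMap<>();
        for (int i = 0; i < headers.length; i++) {
            String[] path = headers[i].split("\\.");
            Map<String, Object> node = root;
            // Walk (or create) intermediate objects for all but the last segment.
            for (int j = 0; j < path.length - 1; j++) {
                node = (Map<String, Object>) node.computeIfAbsent(
                        path[j], k -> new LinkedHashMap<String, Object>());
            }
            node.put(path[path.length - 1], values[i]);
        }
        return root;
    }

    public static void main(String[] args) {
        String[] headers = {"id", "address.city", "address.zip"};
        String[] values = {"1", "Austin", "78701"};
        // prints {id=1, address={city=Austin, zip=78701}}
        System.out.println(nest(headers, values));
    }
}
```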

@cowtowncoder
Member

Yes, use of naming convention can help. FWIW, use of @JsonUnwrapped already works, so while not as convenient, this is already partially doable. But more fluent support is sort of planned; it just requires integration with introspection (traversal of POJO properties) to produce logical names, and then handling nesting.
And by planned I just mean "thought about regarding feasibility" :)
