Support other formats than JSON #34

Zalastax · 2015-04-01T09:27:59Z

To add support for other formats we will have to separate the conversion code from the JSON-specific things. I see two potentially viable ways of doing it: An intermediate format or the strategy pattern.
The intermediate format would be good because the converter code would have a common target, and the format code could do whatever it wants as long as it provides a valid intermediate format object. The problem would be coming up with the format and avoiding boilerplate code.
The strategy pattern would mean replacing every call to the JSON object with a call to an object that knows how to extract/insert data from/to the format. This would probably not be as flexible as the intermediate format, but it would be easier to implement.

BlackEdder · 2015-04-01T10:54:32Z

I would be in favour of the intermediate format. I think the other option would make adding another format quite a lot of work (almost as much as creating a whole separate library to support that format).

I half remember reading about a general Var type in dlang that can be used for any type (and also that JSONValue was derived from that). Such a type could be used for an intermediate format.

Another question is whether an intermediate format should support specific data structures. For example, JSON supports arrays/ranges and maps. That seems quite a sensible basic set.

Zalastax · 2015-04-01T11:36:18Z

Is this what you are thinking of? https://github.com/adamdruppe/arsd/blob/master/jsvar.d or http://dlang.org/phobos/std_variant.html
I think we should try to specialize the supported types as much as possible. If the implementer doesn't need the extra info, they can just ignore it. Could this be made easier for the implementer if we used tags (and a tool for pattern matching somehow)? A custom map using a struct would have the tags "userType", "struct", "map", and if you only need to know it's a map you ignore the other tags.
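
A minimal sketch of how such tag matching might look. The `Tagged` wrapper and the `hasTag` helper are hypothetical names for illustration only; nothing like them exists in painlessjson today.

```d
import std.algorithm.searching : canFind;

// Hypothetical wrapper: a value in the intermediate format carries a list
// of tags describing where it came from.
struct Tagged
{
    string[] tags;
}

// A format that only cares about one tag checks for it and ignores the rest.
bool hasTag(Tagged item, string tag)
{
    return item.tags.canFind(tag);
}

unittest
{
    auto customMap = Tagged(["userType", "struct", "map"]);
    assert(customMap.hasTag("map"));
    assert(!customMap.hasTag("array"));
}
```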

BlackEdder · 2015-04-01T13:32:53Z

I meant the variant indeed: http://dlang.org/phobos/std_variant.html

In many ways the ultimate specialization is to keep the types as they are, so we do need to make some kind of generalization. For example representing a struct/class with an associative array seems a natural generalization to me.
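
To illustrate that generalization, here is a rough sketch that turns the fields of a struct into an associative array keyed by field name. `toAA` and `Point` are hypothetical names, and the values are flattened to strings only to keep the example short; a real intermediate format would keep richer values.

```d
import std.conv : to;

// Hypothetical helper: represent a struct as an associative array
// from field name to (stringified) field value.
string[string] toAA(T)(in T value) if (is(T == struct))
{
    string[string] result;
    foreach (i, field; value.tupleof)
    {
        // __traits(identifier, ...) yields the field name at compile time.
        result[__traits(identifier, T.tupleof[i])] = to!string(field);
    }
    return result;
}

struct Point
{
    int x;
    int y;
}

unittest
{
    auto aa = toAA(Point(1, 2));
    assert(aa["x"] == "1");
    assert(aa["y"] == "2");
}
```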

Zalastax · 2015-04-01T14:32:26Z

That seems reasonable, and is the way JavaScript does it. But we should add as much metadata as possible to allow the format to do what it needs. std.variant could be a good fit if we use the Algebraic type (A closed discriminated union with a limited type universe). It would reduce some boilerplate code for the base types (and allow for using switch on the type using typeid?).
So we need to decide on the intermediate format. It should be easy to implement a new format (XML, BSON etc.). As much metadata as possible should be available, but it should not get in the way. If we can, we should make pattern matching possible.
This is a rough sketch to help us get started:

    struct SerializationData
    {
        Algebraic!(int, double, string, SerializationData[], SerializationData[string]) item; // Add more as needed
        alias item this;
        string[] tags; // Fill with things like userDefined
    }
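
As a sketch of the pattern-matching idea for the base types, `std.variant` already provides `visit`, which dispatches on the type currently held by an `Algebraic`. The alias `Base` and the function `render` below are assumptions made for this example and cover only three types.

```d
import std.conv : to;
import std.variant : Algebraic, visit;

// Assumption: only three base types, to keep the sketch small.
alias Base = Algebraic!(int, double, string);

// Hypothetical format back end: renders a base value as text by matching
// on the type that is currently stored.
string render(Base value)
{
    return value.visit!(
        (int i) => to!string(i),
        (double d) => to!string(d),
        (string s) => `"` ~ s ~ `"`
    );
}

unittest
{
    assert(render(Base(42)) == "42");
    assert(render(Base("hi")) == `"hi"`);
}
```

`visit` requires a handler for every type in the union, so forgetting one is a compile-time error rather than a run-time surprise.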

rcorre · 2015-04-07T02:57:02Z

After our discussion about how similar jsonizer and painlessjson are, I got interested in this idea too.
std.variant is definitely one obvious option for an intermediate format, but I was wondering how much overhead it will have.

I've been messing around with an experiment that attempts to work directly with whatever data it is given.
The idea is to provide a framework that knows how to populate a complex object from an arbitrary data type D as long as D provides:

1. asPrimitive!T, which converts a D into a primitive type
2. asRange, which converts the data into an input range, where each entry is a data type that supports 1-3
3. getEntry(key), which extracts a named entry from the data. Again, the data type returned should support 1-3

Support for data types (JSON, XML, YAML, etc.) could be provided as plugins that implement the above functions for a given data type. For example, a plugin based on std.json might provide:

    T asPrimitive(T : long)(JSONValue json) { return cast(T) json.integer; }
    auto asRange(JSONValue json) { return json.array[]; }
    auto getEntry(JSONValue json, string key) { return json.object[key]; }
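
To make the idea more concrete, here is a rough sketch of how the framework side could use those three functions. `populate` and `Config` are hypothetical names, and only integers, strings, arrays and plain structs are handled; this is an illustration of the concept, not the experiment itself.

```d
import std.json : JSONValue, parseJSON;
import std.traits : isIntegral;

// Plugin for std.json, providing the three functions listed above.
T asPrimitive(T)(JSONValue json)
{
    static if (isIntegral!T)
        return cast(T) json.integer;
    else static if (is(T == string))
        return json.str;
    else
        static assert(0, "unsupported primitive type");
}

JSONValue[] asRange(JSONValue json) { return json.array; }
JSONValue getEntry(JSONValue json, string key) { return json.object[key]; }

// Hypothetical framework function: it only talks to the data through
// asPrimitive, asRange and getEntry, so any data type D with a matching
// plugin would work.
T populate(T, D)(D data)
{
    static if (is(T == string) || isIntegral!T)
    {
        return data.asPrimitive!T;
    }
    else static if (is(T == E[], E))
    {
        T result;
        foreach (entry; data.asRange)
            result ~= populate!E(entry);
        return result;
    }
    else static if (is(T == struct))
    {
        T result;
        foreach (i, ref field; result.tupleof)
        {
            enum name = __traits(identifier, T.tupleof[i]);
            field = populate!(typeof(field))(data.getEntry(name));
        }
        return result;
    }
    else
        static assert(0, "unsupported type");
}

struct Config
{
    string name;
    int[] ports;
}

unittest
{
    auto cfg = populate!Config(parseJSON(`{"name": "web", "ports": [80, 443]}`));
    assert(cfg.name == "web");
    assert(cfg.ports == [80, 443]);
}
```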

I haven't gotten very far but I thought I'd throw the concept out there.

rcorre · 2015-04-07T04:15:30Z

Actually, now that I've described it, it sounds needlessly complicated. Maybe just using std.variant as an intermediate format would make things simpler.

Zalastax · 2015-04-07T09:39:03Z

What you described is the strategy pattern I talked about in the first comment. It made me think, and I realized that two goals are in conflict: a simple, unified interface vs. CTFE. We might be able to satisfy both. In its current state, painlessJSON can generate code that is tailored to a specific type, so we have the possibility of generating code that is as fast as if it were handwritten. The intermediate format as proposed takes some information away from compile time and forces it into run time (I haven't tested this, so please prove me wrong). In return, it gives the implementer of the format flexibility and ease of use:

The format doesn't have to be structured in the same way as JSON. I think it's essential that the format is not forced into a hierarchy the way your proposal forces it (asRange, for example, requires that the format can be broken into smaller parts you can pass around).
You can structure your code however you want, as long as you produce and consume the intermediate format.

The strategy pattern instead keeps information at compile time which enables optimizations, but it forces a hierarchical structure and a coding style.
I conclude that the solution has to be free from forced hierarchy and provide as much information as is available at compile time. For me the sensible middle ground is a templated intermediate format. Implementing a new format would look like the code we use for the simple parts right now (example:

painlessjson/source/painlessjson/painlessjson.d

Lines 261 to 307 in 4de0fca

    
    private T defaultFromJSONImpl(T)(in JSONValue json) if (is(T == JSONValue))
    {
        return json;
    }

    private T defaultFromJSONImpl(T)(in JSONValue json) if (isIntegral!T)
    {
        return to!T(json.integer);
    }

    private T defaultFromJSONImpl(T)(in JSONValue json) if (isFloatingPoint!T)
    {
        if (json.type == JSON_TYPE.INTEGER)
            return to!T(json.integer);
        else
            return to!T(json.floating);
    }

    private T defaultFromJSONImpl(T)(in JSONValue json) if (is(T == string))
    {
        return to!T(json.str);
    }

    private T defaultFromJSONImpl(T)(in JSONValue json) if (isBoolean!T)
    {
        if (json.type == JSON_TYPE.TRUE)
            return true;
        else
            return false;
    }

    private T defaultFromJSONImpl(T)(in JSONValue json) if (isArray!T && !is(T == string))
    {
        T t; //Se is we can find another way of finding t.front
        return map!((js) => fromJSON!(typeof(t.front))(js))(json.array).array;
    }

    private T defaultFromJSONImpl(T)(in JSONValue json) if (isAssociativeArray!T)
    {
        T t;
        const JSONValue[string] jsonAA = json.object;
        foreach (k, v; jsonAA)
        {
            t[fromJSON!(typeof(t.keys.front))(parseJSON(k))] = fromJSON!(typeof(t.values.front))(v);
        }
        return t;
    }

).
Extraction of nested types from the intermediate format has to be done somehow too. I would like to give total freedom, while also offering an intermediate format walker that would enable you to do the implementation as if we were using the strategy pattern. This could be done as a mixin or as a function/class that takes a format-class with the right methods as a template parameter.
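
A rough sketch of what such a walker could look like as a function that takes the format as a template parameter. Everything here is hypothetical: `Node` stands in for the intermediate format (which is still undecided), and `SExprFormat` with its `integer`, `text` and `list` methods is just one possible shape for "a format-class with the right methods".

```d
import std.array : join;
import std.conv : to;

// Assumed stand-in for the intermediate format: a small tagged tree.
struct Node
{
    enum Kind { integer, text, list }
    Kind kind;
    long integerValue;
    string textValue;
    Node[] children;
}

// Hypothetical format: it only says how to write each kind of value.
struct SExprFormat
{
    static string integer(long value) { return to!string(value); }
    static string text(string value) { return `"` ~ value ~ `"`; }
    static string list(string[] items) { return "(" ~ items.join(" ") ~ ")"; }
}

// The walker owns the traversal; the format is a compile-time parameter,
// so the calls into it are resolved statically.
string walk(Format)(Node node)
{
    final switch (node.kind)
    {
        case Node.Kind.integer:
            return Format.integer(node.integerValue);
        case Node.Kind.text:
            return Format.text(node.textValue);
        case Node.Kind.list:
        {
            string[] items;
            foreach (child; node.children)
                items ~= walk!Format(child);
            return Format.list(items);
        }
    }
}

unittest
{
    auto tree = Node(Node.Kind.list, 0, null,
        [Node(Node.Kind.integer, 1), Node(Node.Kind.text, 0, "a")]);
    assert(walk!SExprFormat(tree) == `(1 "a")`);
}
```

A format that needs total freedom could skip `walk` and consume `Node` directly; the walker would only be a convenience for formats that happen to be hierarchical.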
