Support other formats than JSON #34

Open
Zalastax opened this issue Apr 1, 2015 · 7 comments

Comments

@Zalastax
Collaborator

Zalastax commented Apr 1, 2015

To add support for other formats we will have to separate the conversion code from the JSON-specific parts. I see two potentially viable ways of doing it: an intermediate format or the strategy pattern.
The intermediate format would be good because the converter code would have a common target, and the format code could do whatever it wants as long as it provides a valid intermediate format object. The problem would be coming up with the format and avoiding boilerplate code.
The strategy pattern would mean replacing every call to the JSON object with a call to an object that knows how to extract/insert data from/to the format. This would probably not be as flexible as the intermediate format, but would be easier to implement.

@BlackEdder
Owner

I would be in favour of the intermediate format. I think the other option would make adding another format quite a lot of work (almost as much as just creating a whole separate library to support that format).

I half remember reading about a general Var type in dlang that can be used for any type (and also that JSONValue was derived from that). Such a type could be used for an intermediate format.

Another question is whether an intermediate format should support specific data structures. For example json supports arrays/ranges and maps. That seems quite a sensible basic set.

@Zalastax
Collaborator Author

Zalastax commented Apr 1, 2015

Is this what you are thinking of? https://github.com/adamdruppe/arsd/blob/master/jsvar.d or http://dlang.org/phobos/std_variant.html
I think we should try to specialize the supported types as much as possible. If the implementer doesn't need the extra info, they can just ignore it. Could this be made easier for the implementer if we used tags (and a tool for pattern matching somehow)? A custom map implemented as a struct would have the tags "userType", "struct", "map", and if you only need to know it's a map you ignore the other tags.

@BlackEdder
Owner

I meant the variant indeed: http://dlang.org/phobos/std_variant.html

In many ways the ultimate specialization is to keep the types as they are, so we do need to make some kind of generalization. For example representing a struct/class with an associative array seems a natural generalization to me.
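
A minimal sketch of that generalization, assuming every field can be turned into a string: the fields of a struct are flattened into an associative array keyed by field name. `Point` and `toAssocArray` are hypothetical names used only for illustration, not part of painlessjson.

```d
import std.conv : to;

// Hypothetical example type, not taken from the library.
struct Point
{
    int x;
    int y;
    string label;
}

// Flatten the fields of a struct into a string-keyed associative array.
string[string] toAssocArray(T)(T value) if (is(T == struct))
{
    string[string] result;
    // foreach over tupleof is unrolled at compile time, one step per field.
    foreach (i, field; value.tupleof)
    {
        // The field name is known at compile time; the value is converted at run time.
        result[__traits(identifier, T.tupleof[i])] = field.to!string;
    }
    return result;
}

// Run with: dmd -unittest -main -run <file>
unittest
{
    auto aa = toAssocArray(Point(1, 2, "origin"));
    assert(aa.length == 3);
    assert(aa["x"] == "1");
    assert(aa["label"] == "origin");
}
```

Converting everything to `string` loses the original types, which is one reason the thread goes on to look at a variant-style value type.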

@Zalastax
Collaborator Author

Zalastax commented Apr 1, 2015

That seems reasonable, and is the way JavaScript does it. But we should add as much metadata as possible to allow the format to do what it needs. std.variant could be a good fit if we use the Algebraic type (A closed discriminated union with a limited type universe). It would reduce some boilerplate code for the base types (and allow for using switch on the type using typeid?).
So we need to decide the intermediate format. It should be easy to implement a new format (XML, BSON etc.). As much metadata as possible should be available, but not be in the way. If we can we should make pattern matching possible.
This is a rough sketch to help us get started:

 import std.variant : Algebraic;

 struct SerializationData
 {
    Algebraic!(int, double, string, SerializationData[], SerializationData[string]) item; // Add more as needed
    alias item this;
    string[] tags; // Fill with things like userDefined
 }
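
To make the pattern-matching idea concrete, here is a hedged sketch using `visit` from std.variant. It assumes a simplified, non-recursive item type (no nested arrays or maps), and `Item`, `TaggedItem`, `describe` and `hasTag` are hypothetical names rather than anything from painlessjson.

```d
import std.algorithm.searching : canFind;
import std.variant : Algebraic, visit;

// Simplified, non-recursive stand-in for the proposed item type (assumption).
alias Item = Algebraic!(int, double, string);

// Hypothetical wrapper: a payload plus free-form tags, as in the sketch above.
struct TaggedItem
{
    Item item;
    string[] tags;
}

// Dispatch on the stored type; visit needs a handler for each member type
// (or a parameterless fallback handler).
string describe(TaggedItem data)
{
    return data.item.visit!(
        (int i) => "int",
        (double d) => "double",
        (string s) => "string"
    );
}

// A format that only cares about one tag can ignore all the others.
bool hasTag(TaggedItem data, string tag)
{
    return data.tags.canFind(tag);
}

unittest
{
    auto data = TaggedItem(Item(42), ["userType", "struct"]);
    assert(describe(data) == "int");
    assert(hasTag(data, "struct"));
    assert(!hasTag(data, "map"));
}
```

The type dispatch here happens at run time, which is the overhead questioned later in the thread.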

@rcorre

rcorre commented Apr 7, 2015

After our discussion about how similar jsonizer and painlessjson are, I got interested in this idea too.
std.variant is definitely one obvious option for an intermediate format, but I was wondering how much overhead it will have.

I've been messing around with an experiment that attempts to work directly with whatever data it is given.
The idea is to provide a framework that knows how to populate a complex object from an arbitrary data type D as long as D provides:

  1. asPrimitive!T, which converts a D into a primitive type
  2. asRange, which converts the data into an input range, where each entry is a data type that supports 1-3
  3. getEntry(key), which extracts a named entry from the data. Again, the data type returned should support 1-3

Support for data types (JSON, XML, YAML, etc.) could be provided as plugins that implement the above functions for a given data type. For example, a plugin based on std.json might provide:

T asPrimitive(T : long)(JSONValue json) { return cast(T) json.integer; }
auto asRange(JSONValue json) { return json.array[]; }
auto getEntry(JSONValue json, string key) { return json.object[key]; }

I haven't gotten very far but I thought I'd throw the concept out there.
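
For illustration, a rough sketch of how such a framework could drive the three functions, assuming the std.json plugin lives in the same module. `populate` and `Player` are hypothetical names, and only integers, strings, dynamic arrays and plain structs are handled.

```d
import std.json : JSONValue, parseJSON;
import std.traits : isDynamicArray, isIntegral;

// Plugin side: the three functions, implemented for std.json.
T asPrimitive(T)(JSONValue json) if (isIntegral!T) { return cast(T) json.integer; }
T asPrimitive(T)(JSONValue json) if (is(T == string)) { return json.str; }
JSONValue[] asRange(JSONValue json) { return json.array; }
JSONValue getEntry(JSONValue json, string key) { return json.object[key]; }

// Framework side: knows nothing about JSON beyond the three functions.
T populate(T, D)(D data)
{
    static if (is(T == struct))
    {
        T result;
        // One getEntry call per field, looked up by the field's name.
        foreach (i, ref field; result.tupleof)
        {
            enum name = __traits(identifier, T.tupleof[i]);
            field = populate!(typeof(field))(getEntry(data, name));
        }
        return result;
    }
    else static if (isDynamicArray!T && !is(T == string))
    {
        alias E = typeof(T.init[0]);
        T result;
        foreach (entry; asRange(data))
            result ~= populate!E(entry);
        return result;
    }
    else
    {
        return asPrimitive!T(data);
    }
}

// Hypothetical example type.
struct Player
{
    string name;
    int score;
    int[] levels;
}

unittest
{
    auto json = parseJSON(`{"name": "ann", "score": 7, "levels": [1, 2, 3]}`);
    auto player = populate!Player(json);
    assert(player.name == "ann");
    assert(player.score == 7);
    assert(player.levels == [1, 2, 3]);
}
```

Because `populate` is a template, the choice between struct, array and primitive handling is made at compile time for each target type.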

@rcorre

rcorre commented Apr 7, 2015

Actually, now that I describe that it sounds needlessly complicated. Maybe just using std.variant as an intermediate format would make things simpler.

@Zalastax
Collaborator Author

Zalastax commented Apr 7, 2015

What you described is the strategy pattern I talked about in the first comment. It made me think, and I realized that two goals are in conflict: a simple, unified interface versus CTFE. We might be able to satisfy both. Currently painlessJSON can generate code that is tailored to a specific type, so we have the possibility of generating code that is as fast as if it were handwritten. The intermediate format as proposed removes some information from compile time and forces it into run time (I haven't tested this, so please prove me wrong). In return, it gives the implementer of the format flexibility and ease of use:

  • The format doesn't have to be structured in the same way as JSON. I think it's essential that it should be possible to not have any hierarchy in the forced way you proposed (asRange(JSONValue json) { return json.array[]; } requires that the format can be broken into smaller parts you can pass around).
  • You can structure your code however you want, as long as you produce and consume the intermediate format.

The strategy pattern instead keeps information at compile time which enables optimizations, but it forces a hierarchical structure and a coding style.
I conclude that the solution has to be free from forced hierarchy and provide as much information as is available at compile time. For me the sensible middle ground is a templatable intermediate format. Implementing a new format would look like the code we use for the simple parts right now (example:

private T defaultFromJSONImpl(T)(in JSONValue json) if (is(T == JSONValue))
{
    return json;
}
private T defaultFromJSONImpl(T)(in JSONValue json) if (isIntegral!T)
{
    return to!T(json.integer);
}
private T defaultFromJSONImpl(T)(in JSONValue json) if (isFloatingPoint!T)
{
    if (json.type == JSON_TYPE.INTEGER)
        return to!T(json.integer);
    else
        return to!T(json.floating);
}
private T defaultFromJSONImpl(T)(in JSONValue json) if (is(T == string))
{
    return to!T(json.str);
}
private T defaultFromJSONImpl(T)(in JSONValue json) if (isBoolean!T)
{
    if (json.type == JSON_TYPE.TRUE)
        return true;
    else
        return false;
}
private T defaultFromJSONImpl(T)(in JSONValue json) if (isArray!T && !is(T == string))
{
    T t; // See if we can find another way of finding t.front
    return map!((js) => fromJSON!(typeof(t.front))(js))(json.array).array;
}
private T defaultFromJSONImpl(T)(in JSONValue json) if (isAssociativeArray!T)
{
    T t;
    const JSONValue[string] jsonAA = json.object;
    foreach (k, v; jsonAA)
    {
        t[fromJSON!(typeof(t.keys.front))(parseJSON(k))] = fromJSON!(typeof(t.values.front))(v);
    }
    return t;
}
).
Extraction of nested types from the intermediate format has to be done somehow too. I would like to give total freedom, while also offering an intermediate format walker that would enable you to do the implementation as if we were using the strategy pattern. This could be done as a mixin or as a function/class that takes a format-class with the right methods as a template parameter.
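
One way the "function that takes a format-class as a template parameter" variant might look, shown for the serializing direction. This is only a sketch under assumptions: `JsonFormat`, its `leaf` and `node` methods, `walk` and `Item3` are hypothetical names, not an existing painlessjson API.

```d
import std.json : JSONValue;

// Hypothetical format type: it only has to supply a Data type plus
// one method for leaves and one for named groups of children.
struct JsonFormat
{
    alias Data = JSONValue;

    static Data leaf(T)(T value) { return JSONValue(value); }
    static Data node(Data[string] fields) { return JSONValue(fields); }
}

// Generic walker: the recursion over struct fields is written once,
// and the format is chosen at compile time through the template parameter.
Format.Data walk(Format, T)(T value)
{
    static if (is(T == struct))
    {
        Format.Data[string] fields;
        foreach (i, field; value.tupleof)
            fields[__traits(identifier, T.tupleof[i])] = walk!Format(field);
        return Format.node(fields);
    }
    else
    {
        return Format.leaf(value);
    }
}

// Hypothetical example type.
struct Item3
{
    int x;
    string name;
}

unittest
{
    auto json = walk!JsonFormat(Item3(1, "a"));
    assert(json["x"].integer == 1);
    assert(json["name"].str == "a");
}
```

A format with no natural hierarchy could implement `node` however it likes, for example by flattening the field names into prefixed keys.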

3 participants