Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to create a custom body parser #4223

Open
matthieusieben opened this issue Jan 20, 2021 · 26 comments
Open

Add ability to create a custom body parser #4223

matthieusieben opened this issue Jan 20, 2021 · 26 comments
Labels
feature New functionality or improvement

Comments

@matthieusieben
Copy link

Support plan

  • is this issue currently blocking your project? (yes/no): no
  • is this issue affecting a production system? (yes/no): no

Context

  • node version: 14
  • module version: 20.0.3
  • environment (e.g. node, browser, native): node
  • used with (e.g. hapi application, another framework, standalone, ...): hapi
  • any other relevant information: n.a.

What problem are you trying to solve?

I want to create a plugin that enables transparent yaml parsing of request payload. By transparent I mean that I would create an equivalence between yaml and json and transparently parse yaml and allow it to be used with routes expecting JSON.

My current workaround consists of monkey patching Subtext.parse to do something like:

const origParse = Subtext.parse
Subtext.parse = async function (req, tap, options) {
  const { payload, mime } = await origParse.call(this, req, tap, {
    ...options,
   allow: options.allow?.some(isJsonMime)
     ? [ ...options.allow, 'text/vnd.yaml', 'text/yaml', 'text/x-yaml', 'application/x-yaml' ]
     : options.allow,
  });

  if (options.parse === true && isYamlMime(mime)) {
    // Expose Yaml as plain JSON
    return { payload: parseYaml(payload), mime: 'application/json' };
  }

  return { payload, mime };
}

Do you have a new or modified API suggestion to solve the problem?

nope

@matthieusieben matthieusieben added the feature New functionality or improvement label Jan 20, 2021
@Nargonath
Copy link
Member

I think that it seems like an interesting feature but I feel this would probably require some heavy work in the codebase. I'm not sure of how wide the changes would need to be.

@kanongil
Copy link
Contributor

Yes, this is probably a substantial change. If we look into this, we could also consider supporting something like protobuf, where some kind of schema is needed to parse it.

@Nargonath
Copy link
Member

Your suggestion about protobuf rings a bell and remind me about a vision that Eran expressed a while back and that the TSC shares regarding hapi becoming protocol agnostic rather than HTTP centric. That way it could supports many different protocols (RPC, HTTP, etc.). I expect this rework to take place in a bit of a distant future so perhaps there is something we can do to accomodate in the meantime.

@matthieusieben
Copy link
Author

matthieusieben commented Feb 15, 2021

Given that all that was required for me to achieve this is to monkey patch Subtext.parse, it might not be complicated to allow customizing the parser to use through a route option. I am not sure about the exact API but I see a few places that could be adjusted to make this happen:

  • Using a this._extensions.onPayload during the lifecycle (similar to credentials)
  • Using a custom parser (defaulting to Subtext.parse) in internals.payload
  • Using an additional transform stream after (or before) the request._tap()
  • Allow providing a custom additionnal parser to @hapi/subtext's internals.object (instead of throwing) (similar to what's already done here)

@matthieusieben
Copy link
Author

matthieusieben commented Feb 15, 2021

Maybe something like a list of parsers to try out before throwing from @hapi/subtext when parse is true.

route.options.payload.parsers = [
  { mime: (mime) => mime === 'text/yaml', parse: (buffer, mime) => parseYaml(buffer.toString('utf8')) },
  { mime: 'text/yaml', parse: (buffer, mime) => parseYaml(buffer.toString('utf8')) },
  { mime: ['text/yaml'], parse: (buffer, mime) => parseYaml(buffer.toString('utf8')) },
  { mime: /^text\/yaml$/, parse: (buffer, mime) => parseYaml(buffer.toString('utf8')) },
]

@matthieusieben
Copy link
Author

matthieusieben commented Feb 15, 2021

This makes me thinks that it might actually make sense to extract completely the internals.object method from subtext and use a list of parsers in the options instead. This would improve separation of concerns for @hapi/subtext, and reduce the external dependencies of that module (allowing dependent modules to only "load" the parsers they actually require).

Hapi would have a default route.options.payload.parsers options for routes that would look like:

[
  // Binary

  {
    mime: 'application/octet-stream',
    parse: (payload) => (payload.length ? payload : null),
  },

  // Text

  {
    mime: /^text\/.+$/,
    parse: (payload) => payload.toString('utf8'),
  },

  // JSON

  {
    mime: /^application\/(?:.+\+)?json$/,
    parse: (payload, mime, options) => {
      if (!payload.length) {
        return null
      }

      try {
        return Bourne.parse(payload.toString('utf8'), { protoAction: options.protoAction })
      } catch (err) {
        const error = Boom.badRequest('Invalid request payload JSON format', err)
        error.raw = payload
        throw error
      }
    },
  },

  // Form-encoded

  {
    mime: 'application/x-www-form-urlencoded',
    parse: (payload, mime, options) => {
      const parse = options.querystring || Querystring.parse
      return payload.length ? parse(payload.toString('utf8')) : {}
    },
  },
]

@matthieusieben
Copy link
Author

@Nargonath @kanongil Any thoughts on this ?

@kanongil
Copy link
Contributor

kanongil commented Nov 1, 2021

The payload parsing could definitely do with a rework, and making it pluggable seems very reasonable.

I'm not sure that your specific suggestion works that well though. We can't really have a complicated default value. How would you even document that?

Maybe more something like the encoder / decoder registration, and then you just list the parsers you want to enable?

server.parser('yaml', {
    match: ['text/vnd.yaml', 'text/yaml', 'text/x-yaml', 'application/x-yaml'],
    parse(payload, options) {
        
    }
});

server.route({,
    config: {
        payload: {
             parse: ['json', {
                 parser: 'yaml',
                 options: {}
             }]
        }
    }
}

I guess the default would be: ['binary', 'text', 'json', 'qs']

Possibly the match could be excluded by making a lookup into the server.mime db.

@SK-FComputer
Copy link

I've used patch-package to patch @hapi/[email protected] for the project I'm working on, to create what is needed to support custom parsers.

Here is the diff that solved my problem:

subtext

diff --git a/node_modules/@hapi/subtext/lib/index.js b/node_modules/@hapi/subtext/lib/index.js
index 28b91de..888a432 100644
--- a/node_modules/@hapi/subtext/lib/index.js
+++ b/node_modules/@hapi/subtext/lib/index.js
@@ -107,7 +107,7 @@ internals.parse = async function (req, tap, options, contentType) {
     // Output: 'data'
 
     const payload = await Wreck.read(source, { timeout: options.timeout, maxBytes: options.maxBytes });
-    return internals.object(options, payload, contentType.mime);
+    return await internals.object(options, payload, contentType.mime);
 };
 
 
@@ -172,42 +172,73 @@ internals.raw = async function (req, tap, options) {
 };
 
 
-internals.object = function (options, payload, mime) {
-
-    // Binary
-
-    if (mime === 'application/octet-stream') {
-        return payload.length ? payload : null;
-    }
-
-    // Text
+internals.object = async function (options, payload, mime) {
+
+    // Default parsers
+    let parsers = [
+        {
+            // Binary
+            mime: 'application/octet-stream',
+            parse: (unparsedPayload) =>
+                unparsedPayload.length ? unparsedPayload : null,
+        },
+        {
+            // Text
+            mime: /^text\/.+$/,
+            parse: (unparsedPayload) => unparsedPayload.toString('utf8'),
+        },
+        {
+            // JSON
+            mime: /^application\/(?:.+\+)?json$/,
+            parse: (unparsedPayload) => {
+                if (!unparsedPayload.length) {
+                    return null;
+                }
 
-    if (mime.match(/^text\/.+$/)) {
-        return payload.toString('utf8');
+                try {
+                    return Bourne.parse(unparsedPayload.toString('utf8'), {
+                        protoAction: options.protoAction,
+                    });
+                } catch (err) {
+                    const error = Boom.badRequest(
+                        'Invalid request payload JSON format',
+                        err
+                    );
+                    error.raw = payload;
+                    throw error;
+                }
+            },
+        },
+        {
+            // Form-encoded
+            mime: 'application/x-www-form-urlencoded',
+            parse: (unparsedPayload) => {
+                const parse = options.querystring || Querystring.parse;
+                return unparsedPayload.length
+                    ? parse(unparsedPayload.toString('utf8'))
+                    : {};
+            },
+        },
+    ];
+
+    if (options.customParsers && options.customParsers.length) {
+        parsers = options.customParsers.concat(parsers);
     }
 
-    // JSON
-
-    if (/^application\/(?:.+\+)?json$/.test(mime)) {
-        if (!payload.length) {
-            return null;
-        }
-
-        try {
-            return Bourne.parse(payload.toString('utf8'), { protoAction: options.protoAction });
-        }
-        catch (err) {
-            const error = Boom.badRequest('Invalid request payload JSON format', err);
-            error.raw = payload;
-            throw error;
+    const parser = parsers.find((parser) => {
+        if (parser.mime) {
+            if (parser.mime instanceof RegExp) {
+                return parser.mime.test(mime);
+            } else if (parser.mime instanceof String) {
+                return parser.mime === mime;
+            }
         }
-    }
 
-    // Form-encoded
+        return false;
+    });
 
-    if (mime === 'application/x-www-form-urlencoded') {
-        const parse = options.querystring || Querystring.parse;
-        return payload.length ? parse(payload.toString('utf8')) : {};
+    if (parser) {
+        return await parser.parse(payload);
     }
 
     const error = Boom.unsupportedMediaType();

hapi

diff --git a/node_modules/@hapi/hapi/lib/config.js b/node_modules/@hapi/hapi/lib/config.js
index 5279a18..9cd765e 100644
--- a/node_modules/@hapi/hapi/lib/config.js
+++ b/node_modules/@hapi/hapi/lib/config.js
@@ -142,6 +142,7 @@ internals.routeBase = Validate.object({
     payload: Validate.object({
         output: Validate.valid('data', 'stream', 'file').default('data'),
         parse: Validate.boolean().allow('gunzip').default(true),
+        customParsers: Validate.array().items(Validate.object({mime: Validate.alternatives(Validate.object(), Validate.string()), parse: Validate.function()})),
         multipart: Validate.object({
             output: Validate.valid('data', 'stream', 'file', 'annotated').required()
         })

@types/hapi

diff --git a/node_modules/@types/hapi__hapi/index.d.ts b/node_modules/@types/hapi__hapi/index.d.ts
index 564ead8..acc7273 100644
--- a/node_modules/@types/hapi__hapi/index.d.ts
+++ b/node_modules/@types/hapi__hapi/index.d.ts
@@ -1306,6 +1306,18 @@ export type PayloadOutput = 'data' | 'stream' | 'file';
  */
 export type PayloadCompressionDecoderSettings = object;
 
+
+export interface RouteOptionsPayloadCustomParser {
+     /**
+     * A string or a RegExp instance to test with content-type mime of request.
+     */
+    mime: string | RegExp;
+    /**
+     * A function that takes the request payload as a Buffer, and returns the desired output.
+     */
+    parse: (unparsedPayload: Buffer) => Promise<any>;
+}
+
 /**
  * Determines how the request payload is processed.
  * [See docs](https://github.com/hapijs/hapi/blob/master/API.md#-routeoptionspayload)
@@ -1320,7 +1332,7 @@ export interface RouteOptionsPayload {
      * * multipart/form-data
      * * text/*
      * A string or an array of strings with the allowed mime types for the endpoint. Use this settings to limit the set of allowed mime types. Note that allowing additional mime types not listed
-     * above will not enable them to be parsed, and if parse is true, the request will result in an error response.
+     * above will not enable them to be parsed, and if parse is true, the request will result in an error response if a custom parser is not defined.
      * [See docs](https://github.com/hapijs/hapi/blob/master/API.md#-routeoptionspayloadallow)
      */
     allow?: string | string[] | undefined;
@@ -1403,6 +1415,12 @@ export interface RouteOptionsPayload {
      */
     parse?: boolean | 'gunzip' | undefined;
 
+    /**
+     * @default none.
+     * Add custom parsers for unsupported content-types.
+     */
+    customParsers?: RouteOptionsPayloadCustomParser[] | undefined;
+
     /**
      * @default to 10000 (10 seconds).
      * Payload reception timeout in milliseconds. Sets the maximum time allowed for the client to transmit the request payload (body) before giving up and responding with a Request Timeout (408)

@SK-FComputer
Copy link

@kanongil any thoughts ?

@kanongil
Copy link
Contributor

kanongil commented Nov 1, 2021

@SK-FComputer Yes, I think that you are spamming, posting this 3 places. Just post once, and link it from other issues. Also, it's quite rude to @ people for an immediate response when there is no urgency.

@SK-FComputer
Copy link

SK-FComputer commented Nov 1, 2021

@kanongil
Sorry for spamming, where should i leave this comment so that i can link to ?
The 2 other places are in another git repository than here, and i was leaving those comments for other people to use if they ever find it.

Sorry, but i don't see tagging as rude, unless tagging someone at random.
Your last response was an hour ago, so i thought a tag was warranted, and you could answer when the time is right.

@devinivy
Copy link
Member

devinivy commented Nov 1, 2021

Hey @matthieusieben do you have any thoughts or feedback on your and Gil's proposals so far? I think it would be nice to plan out this work, and (with hapi v21 in mind) decide on whether we think there would need to be any breaking changes to enable this.

Ref. hapijs/subtext#90

@matthieusieben
Copy link
Author

matthieusieben commented Nov 2, 2021

In my opinion, @kanongil 's proposal looks great. It has the advantage to be much less verbose. Plugins could cause conflicts (e.g. two different plugins declare a yaml parser but with different match/parse()) but I don't think that's really a problem.

Additionally, this proposal seems compatible with my suggestion in hapijs/subtext#90 (HapiJS would build the list of parsers passed as option to subtext, on a per-route basis).

@matthieusieben
Copy link
Author

I would add a way to allow a parser to be used globally (= add a parser to the list of "default" parsers used by a server), similar to how we can define auth methods (server default and/or route based)

@SK-FComputer
Copy link

SK-FComputer commented Nov 2, 2021

@matthieusieben @devinivy so i've hit another case where this is "a problem".
When uploading a file with these options:

options: {
	payload: {
		allow: 'multipart/form-data',
		parse: true,
		output: 'data',
		multipart: { output: 'annotated' },
	},
},

The file is "parsed" thorugh the default parsers, and if a parser is not defined, it returns code 415 Unsupported Media Type. For example XML files can hit application/xml, seen in the files annotated parts.

@SK-FComputer
Copy link

SK-FComputer commented Nov 2, 2021

Also this is caught in a try catch, so with my patch #4223 (comment) (Which i have expanded on further, @ me for updates if needed.)
https://github.com/hapijs/subtext/blob/035f4ae9e33afeb25277be0f53759472c128aca5/lib/index.js#L383-L390

Will catch a parser error, IE. XML missing closing tag etc., and return a buffer to the request validate if defined, and then to the request handler, and not a error to failAction in options.payload.failAction

@kanongil
Copy link
Contributor

kanongil commented Nov 2, 2021

It seems that you found a minor bug. Unknown content-types are documented to return a 400 Bad Request response here: https://hapi.dev/api?v=20.2.0#-routeoptionspayloadparse - but really a 415 response is returned.

I would say that the existing implementation is correct, and that the docs need fixing.

@SK-FComputer
Copy link

SK-FComputer commented Nov 3, 2021

@kanongil The 415 error code is not returned when parsing fails through #4223 (comment) settings, and failAction is never hit.
Which i think it should, unless there is a case where something else i expected.

@SK-FComputer
Copy link

I guess this wil never be implemented, just FIY, there is a new version of HAPI now, and this should be possible.

Here is the diff that solved my problem:

diff --git a/node_modules/@hapi/subtext/lib/index.js b/node_modules/@hapi/subtext/lib/index.js
index bd6fa72..6bd863f 100755
--- a/node_modules/@hapi/subtext/lib/index.js
+++ b/node_modules/@hapi/subtext/lib/index.js
@@ -175,44 +175,81 @@ internals.raw = async function (req, tap, options) {
 };
 
 
-internals.object = function (options, payload, mime) {
 
-    // Binary
 
-    if (mime === 'application/octet-stream') {
-        return payload.length ? payload : null;
-    }
-
-    // Text
-
-    if (mime.match(/^text\/.+$/)) {
-        return payload.toString('utf8');
+internals.object = async function (options, payload, mime) {        
+    // Default parsers
+    let parsers = [
+        {
+            // Binary
+            mime: 'application/octet-stream',
+            parse: (unparsedPayload) =>
+                unparsedPayload.length ? unparsedPayload : null,
+        },
+        {
+            // Text
+            mime: /^text\/.+$/,
+            parse: (unparsedPayload) => unparsedPayload.toString('utf8'),
+        },
+        {
+            // JSON
+            mime: /^application\/(?:.+\+)?json$/,
+            parse: (unparsedPayload) => {
+                if (!unparsedPayload.length) {
+                    return null;
+                }
+                try {
+                    return Bourne.parse(unparsedPayload.toString('utf8'), {
+                        protoAction: options.protoAction,
+                    });
+                } catch (err) {
+                    const error = Boom.badRequest(
+                        'Invalid request payload JSON format',
+                        err
+                    );
+                    error.raw = payload;
+                    throw error;
+                }
+            },
+        },
+        {
+            // Form-encoded
+            mime: 'application/x-www-form-urlencoded',
+            parse: (unparsedPayload) => {
+                const parse = options.querystring || Querystring.parse;
+                return unparsedPayload.length
+                    ? parse(unparsedPayload.toString('utf8'))
+                    : {};
+            },
+        },
+    ];
+
+    if (options.customParsers && options.customParsers.length) {
+        parsers = options.customParsers.concat(parsers);
     }
 
-    // JSON
-
-    if (/^application\/(?:.+\+)?json$/.test(mime)) {
-        if (!payload.length) {
-            return null;
+    const parser = parsers.find((parser) => {
+        if (parser.mime) {
+            if (parser.mime instanceof RegExp) {
+                return parser.mime.test(mime);
+            } else if (parser.mime instanceof String) {
+                return parser.mime === mime;
+            }
         }
+        return false;
+    });
         
+    if (parser) {
         try {
-            return Bourne.parse(payload.toString('utf8'), { protoAction: options.protoAction });
-        }
-        catch (err) {
-            const error = Boom.badRequest('Invalid request payload JSON format', err);
-            error.raw = payload;
-            throw error;
+            return await parser.parse(payload);
+        } catch (error) {
+            console.error(error);
+            const boomError = Boom.badRequest(`Error parsing data with content-type ${mime}, check content for errors!`)
+            boomError.raw = payload;
+            throw boomError;
         }
     }
     
-    // Form-encoded
-
-    if (mime === 'application/x-www-form-urlencoded') {
-        const parse = options.querystring ?? Querystring.parse;
-        return payload.length ? parse(payload.toString('utf8')) : {};
-    }
-
     const error = Boom.unsupportedMediaType();
     error.raw = payload;
     throw error;
@@ -384,7 +421,7 @@ internals.part = async function (part, output, set, options) {
     }
 
     try {
-        const object = internals.object(options, payload, mime);
+        const object = await internals.object(options, payload, mime);
         annotate(object);
     }
     catch (err) {

This issue body was partially generated by patch-package.

@SK-FComputer
Copy link

SK-FComputer commented Nov 9, 2022

And this for @hapi/hapi:

diff --git a/node_modules/@hapi/hapi/lib/config.js b/node_modules/@hapi/hapi/lib/config.js
index e3768ae..cd6c44f 100755
--- a/node_modules/@hapi/hapi/lib/config.js
+++ b/node_modules/@hapi/hapi/lib/config.js
@@ -144,6 +144,7 @@ internals.routeBase = Validate.object({
     payload: Validate.object({
         output: Validate.valid('data', 'stream', 'file').default('data'),
         parse: Validate.boolean().allow('gunzip').default(true),
+        customParsers: Validate.array().items(Validate.object({mime: Validate.alternatives(Validate.object(), Validate.string()), parse: Validate.function()})),
         multipart: Validate.object({
             output: Validate.valid('data', 'stream', 'file', 'annotated').required()
         })

@matthieusieben
Copy link
Author

Looks nice !

@SK-FComputer
Copy link

Has been nice for 1 year now, and still nothing.

image

@matthieusieben
Copy link
Author

Did you make a pr ?

@SK-FComputer
Copy link

I was rejected at the time, and has since just used patch-package.

@SK-FComputer
Copy link

Almost another year went by and i still don't see a way to do this..
2 years running now

I personally can't upgrade multiple projects because of this issue, and the time has almost come for a refactor of some.
So if this isn't implemented soonish, i rather rewrite in some other API framework, than figure out what is breaking my patches

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New functionality or improvement
Projects
None yet
Development

No branches or pull requests

5 participants