
performance reduction (Node.js vs Browser) #52

Open
YSMull opened this issue Dec 16, 2021 · 3 comments

YSMull commented Dec 16, 2021

I wrote a benchmark in our Node.js project comparing msgpackr against JSON.stringify / JSON.parse. The results looked good and met our needs, but when I ran the same benchmark in Chrome, the results were inverted.

I guess the reason for the performance reduction is that the byte representation changes from Buffer to Uint8Array, since the browser environment doesn't have Buffer.

In the browser, is there still room to optimize unpack's performance? (Otherwise, replacing JSON.parse introduces additional overhead.)

Node.js 17.2.0 (V8 version: 9.6.180.14)
[screenshot: benchmark results]

Chrome 96.0.4664.93 (V8 version: 9.6.180.20)
[screenshot: benchmark results]

benchmark code:

const _ = require('lodash');
const msgpackr = require('msgpackr');

class PayloadTest {
    constructor(testObj) {
        this.TEST_COUNT = 100;
        this.testObj = testObj;
        this.sendBaseline = undefined;
        this.recvBaseline = undefined;
        this.originSize = new TextEncoder().encode(JSON.stringify(testObj)).length;
    }
    
    _doTest(name, packFn, unPackFn) {
        console.log(`[${name}]:`);
        let sendData = packFn(this.testObj);
        let sendSize;
        if (typeof sendData === 'string') {
            // JSON.stringify returns a string; measure its UTF-8 byte length.
            sendSize = Buffer.byteLength(sendData, 'utf8');
        } else {
            // msgpackr returns a Buffer/Uint8Array; length is already in bytes.
            sendSize = sendData.length;
        }
        console.log(`data size: ${(sendSize / 1024).toFixed(2)}kb (${((sendSize / this.originSize) * 100).toFixed(2)}%, saved: ${((this.originSize - sendSize) / 1024).toFixed(2)}kb)`);
        {
            let start = new Date().getTime();
            for (let i = 0; i < this.TEST_COUNT; i++) {
                packFn(this.testObj);
            }
            let end = new Date().getTime();
            if (name === 'JSON.stringify') {
                this.sendBaseline = end - start;
                console.log(`per send baseline cost: ${(this.sendBaseline / this.TEST_COUNT).toFixed(1)}ms`);
            } else {
                console.log(`per send cost: ${((end - start) / this.TEST_COUNT).toFixed(1)}ms (${end - start - this.sendBaseline > 0 ? '+' : ''}${((end - start - this.sendBaseline) / this.TEST_COUNT).toFixed(1)}ms)`);
            }
        }

        {
            let start = new Date().getTime();
            for (let i = 0; i < this.TEST_COUNT; i++) {
                unPackFn(sendData);
            }
            let end = new Date().getTime();
            if (name === 'JSON.stringify') {
                this.recvBaseline = end - start;
                console.log(`per recv baseline cost: ${(this.recvBaseline / this.TEST_COUNT).toFixed(1)}ms`);
            } else {
                console.log(`per recv cost: ${((end - start) / this.TEST_COUNT).toFixed(1)}ms (${end - start - this.recvBaseline > 0 ? '+' : ''}${((end - start - this.recvBaseline) / this.TEST_COUNT).toFixed(1)}ms)`);
            }
        }
        
        let recvData = unPackFn(sendData);
        if (_.isEqual(recvData, this.testObj)) {
            console.log('obj equal check passed!');
        } else {
            console.warn('obj equal check FAILED!');
        }

        console.log();
    }

    startTest() {
        console.log(`-------${(this.originSize / 1024).toFixed(2)}kb, ${this.TEST_COUNT} times-------`);
        this._doTest('JSON.stringify', obj => JSON.stringify(obj), str => JSON.parse(str));
        this._doTest('msgpackr: pack', obj => msgpackr.pack(obj), buf => msgpackr.unpack(buf));
        let packr = new msgpackr.Packr();
        this._doTest('msgpackr: packr', obj => packr.pack(obj), buf => packr.unpack(buf));
    }
}

module.exports = PayloadTest;
kriszyp commented Dec 17, 2021

In terms of performance, the primary difference between Node.js and the browser is that in Node.js, msgpackr is able to use a native add-on that significantly boosts the performance of extracting/deserializing strings. Unfortunately, the browser environment has pretty poor facilities for fast string decoding. The only options in browsers are TextDecoder and decoding character-by-character in plain JS, both of which are relatively slow compared to msgpackr's native extractor (Node's Buffer also includes string decoding that is faster than TextDecoder, but not faster than msgpackr's native extractor). I would be curious what your data looks like (amount of strings, string length, latin vs non-latin chars), but I assume it has enough string data that this is probably the primary differentiator in performance.

I don't believe there are any real differences in performance between reading from Buffer and Uint8Array themselves (Node has some extra machinery to reuse blocks of memory for Buffer allocation, but Buffer is actually a subclass of Uint8Array, so once you have an instance the interaction is the same).

msgpackr switches between TextDecoder and plain-JS decoding based on string length, using TextDecoder for strings above 64 characters. There might be slight tweaks possible on some of that, but I believe most of it is already well adjusted and optimized.

To achieve really substantial performance gains in browser decoding, we need a way to bundle all the string data into a single sequential block that can be decoded in one pass. TextDecoder has a high per-call overhead but is plenty fast in terms of per-character performance, and if it were used to decode the entirety of the string data for a msgpack structure in one pass, I believe it would be very fast. I have been thinking about making some type of custom extension for such a string bundle, and it seems likely that is the type of thing that could improve performance in your situation.

@YSMull YSMull closed this as completed Dec 17, 2021
@YSMull YSMull reopened this Dec 17, 2021
YSMull commented Dec 17, 2021

Thanks for your reply!

Here is my test data.
We build a BI product, so the response data for a given chart can be fairly large (full of arrays).

Our front-end code uses the socket.io library (over WebSocket) for data transmission. If the message payload is a string, socket.io tries to parse the text message into a JSON object; if the payload is binary, the raw ArrayBuffer is passed through to the user code subscribed to socket.io events.

Then, for example, if the binary is a MessagePack payload, we use msgpackr's unpack method to turn it into a JavaScript object. If its performance beats JSON.parse, that would be great, because we would not only compress the data but also speed up deserialization!

By the way, before calling unpack we have to use new Uint8Array(buf) to turn the ArrayBuffer into a valid input type for unpack (which causes a negligible overhead even on huge data). It would be more convenient if msgpackr's unpack method supported ArrayBuffer as input.

Thanks again!

kriszyp added a commit that referenced this issue Dec 20, 2021
kriszyp commented Dec 21, 2021

I have added a new bundleStrings flag that enables string bundling, which I believe should significantly improve decoding performance in browsers (more in line with Node.js). It is a custom extension, so it requires msgpackr on both client and server (but I think that is your setup, IIUC). This is published in v1.5.2.

kriszyp added a commit that referenced this issue Dec 26, 2021
…for very large data structures to reduce memory entrapment and facilitate streaming in the future, #52