feat: Add BufferedOutputStream #12052
base: main
This pull request was exported from Phabricator. Differential Revision: D67997655
Summary:

Context: I plan to update PrestoBatchVectorSerializer to write directly to the OutputStream rather than going through VectorStreams, as I've seen this can greatly improve the speed. However, this results in many small writes to the OutputStream, which is typically a ByteOutputStream, and that leads to a lot of small allocations, which is counterproductive to the optimization I'm trying to make.

To address this, I add a BufferedOutputStream that wraps another OutputStream, coalesces writes in a buffer, and flushes them as large writes to the wrapped OutputStream when the buffer fills up or as needed. In my experiments, the cost of the additional copy is far less than the cost of the tiny allocations.

Differential Revision: D67997655
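To illustrate the coalescing behavior described above, here is a minimal, self-contained sketch. It is not the code from this PR and does not use Velox's actual classes: `Sink`, `RecordingSink`, and `BufferedSink` are hypothetical names introduced only for this example, and the fixed-capacity buffer with an explicit `flush()` is an assumed design.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical minimal sink interface (an assumption for this sketch,
// not Velox's OutputStream).
class Sink {
 public:
  virtual ~Sink() = default;
  virtual void write(const char* data, size_t size) = 0;
};

// Hypothetical sink that records bytes and counts how many writes reach it.
class RecordingSink : public Sink {
 public:
  void write(const char* data, size_t size) override {
    bytes.append(data, size);
    ++numWrites;
  }

  std::string bytes;
  size_t numWrites = 0;
};

// Hypothetical buffered wrapper: copies incoming bytes into a fixed-size
// buffer and forwards them to the wrapped sink only when the buffer is
// full or flush() is called.
class BufferedSink : public Sink {
 public:
  BufferedSink(Sink* wrapped, size_t capacity)
      : wrapped_(wrapped), buffer_(capacity) {
    assert(wrapped != nullptr && capacity > 0);
  }

  void write(const char* data, size_t size) override {
    while (size > 0) {
      // Copy as much as fits into the remaining buffer space.
      const size_t n = std::min(size, buffer_.size() - used_);
      std::memcpy(buffer_.data() + used_, data, n);
      used_ += n;
      data += n;
      size -= n;
      if (used_ == buffer_.size()) {
        flush();
      }
    }
  }

  // Forwards any buffered bytes as a single write. In this sketch the
  // caller must call flush() before reading the wrapped sink's contents.
  void flush() {
    if (used_ > 0) {
      wrapped_->write(buffer_.data(), used_);
      used_ = 0;
    }
  }

 private:
  Sink* wrapped_;
  std::vector<char> buffer_;
  size_t used_ = 0;
};
```

With an 8-byte buffer, twenty 1-byte writes reach the wrapped sink as two 8-byte writes plus one 4-byte write on the final `flush()`, rather than as twenty separate writes. The trade-off is the extra `memcpy` into the buffer, which the summary reports as far cheaper than the small allocations it avoids.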