Parquet sub-rowgroup reading. #14360
Conversation
… been changed to ::parquet::detail, ::parquet::gpu has been renamed to ::parquet::detail, and several detail-style files which were just using ::parquet have been moved into parquet::detail.
…. More work remains though.
Hello @etseidl, we think this work (#14360) will have some conflicts with your decoder addition in #14101. Our plan is to complete the work on #14101 first and then resolve the conflicts in this PR. In the meantime, would you please take a look at @nvdbaranec's work here?
It's a lot to digest, but looks great so far. I have a few questions. First, will this help with skip_rows? I'm thinking of the predicate case where an index gives you a range of rows to read from the middle of a rowgroup. Can this work be modified to process (or does it already handle processing) just the pages needed to satisfy the predicate, along with any needed dictionary pages? Second, if the size statistics from #14000 are available, would you still use this mechanism but feed in the stats, or would it be better to have an entirely different path for stats-driven chunked reading?
…uncompressed data. Add a couple of simple tests.
still not done, but getting close
the last couple of nits.
Thank you for leaving extensive comments; this would be unapproachable otherwise.
/merge
closes #14270
Implementation of sub-rowgroup reading of Parquet files. This PR implements an additional layer on top of the existing chunking system. Currently, the reader takes two parameters:
input_pass_read_limit
which specifies a limit on temporary memory usage when reading and decompressing file data; andoutput_pass_read_limit
which specifies a limit on how large an output chunk (a table) can be.Currently when the user specifies a limit via
Currently, when the user specifies a limit via `input_pass_read_limit`, the reader will perform multiple `passes` over the file at row-group granularity. That is, it will control how many row groups it will read at once to conform to the specified limit.

However, there are cases where this is not sufficient. So this PR changes things so that we now have `subpasses` below the top-level `passes`. It works as follows:

- We read a set of input chunks based on the `input_pass_read_limit`, but we do not decompress them immediately. This constitutes a `pass`.
- Within each `pass`, we progressively decompress batches of pages as `subpasses`.
- Within each `subpass` we apply the output limit to produce `chunks`.

So the overall structure of the reader is: (read) `pass` -> (decompress) `subpass` -> (decode) `chunk`.
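To make the nesting concrete, here is an illustrative control-flow sketch. The types and helper names below are simplified stand-ins, not the actual symbols used by the reader:

```cpp
// Illustrative sketch only -- not the actual implementation in reader_impl_chunking.cu.
#include <cstddef>
#include <vector>

struct table_chunk {};                                    // stand-in for one decoded output table
struct pass    { std::vector<int> compressed_pages; };    // loaded row groups, still compressed
struct subpass { std::vector<int> decompressed_pages; };  // one batch of decompressed pages

// Placeholder helpers mirroring the roles of compute_input_passes,
// compute_next_subpass and compute_chunks_for_subpass.
std::vector<pass> split_into_passes(std::size_t /*input_limit*/) { return {}; }
bool has_more_subpasses(pass const&) { return false; }
subpass decompress_next_subpass(pass const&, std::size_t /*input_limit*/) { return {}; }
std::vector<table_chunk> decode_chunks(subpass const&, std::size_t /*output_limit*/) { return {}; }

void read_all(std::size_t input_limit, std::size_t output_limit)
{
  for (auto const& p : split_into_passes(input_limit)) {      // (read) pass
    while (has_more_subpasses(p)) {                           // (decompress) subpass
      auto const s = decompress_next_subpass(p, input_limit);
      for (auto const& c : decode_chunks(s, output_limit)) {  // (decode) chunk
        (void)c;  // each chunk is what read_chunk() hands back to the caller
      }
    }
  }
}
```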
Major sections of code changes:

- Previously, the incoming page data in the file was unsorted. To handle this we later produced a `page_index` that could be applied to the array to get the pages in schema-sorted order. This was getting very unwieldy, so I now just sort the pages up front and the `page_index` array has gone away.
- There are now two sets of pages to be aware of in the code. Within each `pass_intermediate_data` there is the set of all pages within the current set of loaded row groups. And then within the `subpass_intermediate_data` struct there is a separate array of pages representing the current batch of decompressed data we are processing. To keep the confusion down, I changed a good amount of code to always reference its page array through its associated struct, i.e. `pass.pages` or `subpass.pages`. In addition, I removed the `page_info` from `ColumnChunkDesc` to help prevent the kernels from getting confused. `ColumnChunkDesc` now only has a `dict_page` field, which is constant across all subpasses. (A simplified sketch of this layout follows the list.)
- The primary entry point for the chunking mechanism is `handle_chunking`. Here we iterate through passes, subpasses and output chunks. Successive subpasses are computed and preprocessed through here.
- The volume of diffs you'll see in `reader_impl_chunking.cu` is a little deceptive. A lot of it is just functions (or pieces of functions) that have been moved over from either `reader_impl_preprocess.cu` or `reader_impl_helpers.cpp`. The most relevant actual changes are in `handle_chunking`, `compute_input_passes`, `compute_next_subpass`, and `compute_chunks_for_subpass`.
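As a rough illustration of the two page arrays described in the second bullet (struct names taken from the description above, members simplified and not matching the actual cudf definitions):

```cpp
// Simplified sketch of the pass/subpass page layout; the real structs carry
// considerably more state than shown here.
#include <memory>
#include <vector>

struct PageInfo { /* per-page metadata */ };

struct ColumnChunkDesc {
  PageInfo const* dict_page = nullptr;  // only the dictionary page lives here now;
                                        // it is constant across all subpasses
};

struct subpass_intermediate_data {
  std::vector<PageInfo> pages;  // just the pages decompressed for the current subpass
};

struct pass_intermediate_data {
  std::vector<ColumnChunkDesc> chunks;             // column chunks for the loaded row groups
  std::vector<PageInfo> pages;                     // all pages in the pass, sorted up front
  std::unique_ptr<subpass_intermediate_data> subpass;  // current decompression batch
};

// Code now always says pass.pages or subpass.pages, so it is unambiguous
// which set of pages a kernel or host function is operating on.
```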
Note on tests: I renamed `parquet_chunked_reader_tests.cpp` to `parquet_chunked_reader_test.cu` as I needed to use thrust. The only actual changes in the file are the addition of the `ParquetChunkedReaderInputLimitConstrainedTest` and `ParquetChunkedReaderInputLimitTest` test suites at the bottom.