-
Notifications
You must be signed in to change notification settings - Fork 4
Bulk Operations
Andrew Geweke edited this page Nov 2, 2013
·
2 revisions
One of the principles of low_card_tables
is that everything should be done in bulk, with one-by-one operations merely an instance of a bulk operation that happens to only operate on one item. Consequently:
- Reading the low-card table is done in O(1) SQL statements.
- Creating new low-card rows is done in O(1) SQL statements.
This also means that low_card_tables
provides the tools you need to process large numbers of rows in a constant number of database queries. You will want to see the API for more details. But, as an example...
Say we have a database of 25,000,000 users with a brand-new low-card table, and we want to update those users to point to the right low-card rows. Here's what I would do:
- Use ActiveRecord::Batches, and in particular
#find_in_batches
, to find chunks of, say, 10,000 users at a time. - Iterate through this chunk, extracting from each record the attribute values for that record — for example,
{ :deleted => false, :deceased => false, :gender => 'female', :donation_level => 5 }
. - Build a
Hash
that maps each of these attributeHash
es to anArray
of all of the IDs of all theUser
models that have that particular set of attributes. - At the end of the chunk, call
UserStatus.low_card_find_or_create_ids_for
, and pass in the array of uniqueHash
es. - The return value will be a
Hash
that maps each of thoseHash
es to the correct low-card ID for that set of attributes. - Now, using the
Hash
you built above, fire off one update statement per unique set of attributes, setting the low-card ID where the user ID isIN (...)
. (Alternatively,activerecord-import
and itson_duplicate_key_update
support can actually bulk-update all of them in one fell swoop.) - Repeat for the next chunk of users.
In general, any operation you could want to do with low_card_tables
, you can do in bulk, and it is efficiently implemented that way.