# Explanation of batching modes
Azure Table Storage supports batch operations (entity group transactions) for insert, replace and merge. They're very convenient when you're working on large sets of rows, as collecting entities into batches can significantly reduce the number of required table API calls. There are, however, limitations to batching:
- Azure Table Storage natively supports up to 100 entities per batch, or 4 MB of payload. We can overcome these by splitting the set of entities into multiple batches.
- Different operation types cannot be mixed within one batch. This is not a limitation of Azure Table Storage itself, which does allow mixing insert, update and merge operations; rather, the TableDataStore does not provide an interface for collecting the different types of requests together.
- All entities within a batch must have the same Partition Key.
- An entity can only exist once in a batch.
- Operations within a batch are processed atomically; that is, all operations in the change set either succeed or fail.
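The first limitation above can be worked around by splitting a large set of entities into valid sub-batches. The following Python sketch illustrates the idea only; it is not the TableDataStore implementation, and all names in it (`split_into_batches`, `size_of`, the constants) are illustrative assumptions.

```python
# Illustrative sketch: split entities into sub-batches that respect both the
# 100-entity and the 4 MB payload limits of an entity group transaction.
# These names are hypothetical, not part of the TableDataStore API.

MAX_BATCH_ENTITIES = 100           # Azure Table Storage batch entity limit
MAX_BATCH_BYTES = 4 * 1024 * 1024  # 4 MB batch payload limit

def split_into_batches(entities, size_of):
    """Group entities into sub-batches that stay within both limits.

    size_of is a caller-supplied function estimating an entity's payload size.
    """
    batches, current, current_bytes = [], [], 0
    for entity in entities:
        entity_bytes = size_of(entity)
        # Start a new sub-batch when adding this entity would break a limit.
        if current and (len(current) >= MAX_BATCH_ENTITIES
                        or current_bytes + entity_bytes > MAX_BATCH_BYTES):
            batches.append(current)
            current, current_bytes = [], 0
        current.append(entity)
        current_bytes += entity_bytes
    if current:
        batches.append(current)
    return batches
```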
The full description of batches, or entity group transactions, can be found in the Microsoft documentation.
Since the TableDataStore needs to support batch operations for more complex cases as well, different batching modes were explicitly implemented; they behave as described below.
## No batching, or BatchingMode.None
This mode executes all operations separately. Each table operation is followed by its related blob operations, if there are any. Table entities are processed individually with one API call per entity, although these calls are made in parallel to a degree. This is inefficient for large numbers of operations, but generally the simplest case to handle when it comes to errors and error handling. Blob operations will only be performed after successful table operations.
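The per-entity flow can be sketched as below: each entity gets its own call, run with bounded parallelism, and blob work happens only after the table operation succeeds. This is a minimal illustration, not the TableDataStore implementation; `process_individually`, `table_op` and `blob_ops` are hypothetical names.

```python
# Illustrative sketch of BatchingMode.None-style processing: one table call
# per entity, run in parallel up to a limit, with blob operations executed
# only after the table operation for that entity has succeeded.
from concurrent.futures import ThreadPoolExecutor

def process_individually(entities, table_op, blob_ops, max_parallel=8):
    def handle(entity):
        table_op(entity)        # one table API call per entity
        for blob_op in blob_ops(entity):
            blob_op()           # blob work only after table success
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # Consuming the map re-raises any exception from a failed entity.
        list(pool.map(handle, entities))
```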
## BatchingMode.Strict
The Strict batching mode follows the default Azure Table Storage batch rules:
- Maximum of 100 entities per batch
- All entities in the batch must belong to the same partition
- Entity size may not exceed 1 MB
- Batch size may not exceed 4 MB
The batch is essentially a transaction: either all operations succeed, or none of them do. In addition, the entity data model may not contain LargeBlob properties (with inserts being an exception, when all the LargeBlob property values are set to null). This is because the Strict mode represents the Azure Table Storage batch rules exactly, and it would no longer be a transaction if non-transactional blob operations were involved.
If these requirements are not fulfilled, the operation will throw an exception.
You'll generally want to use this mode when you want explicit control over your table operations and blobs are not involved.
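The rules above amount to a validation step before the batch is submitted. The following is a hedged sketch of that kind of check, not the actual TableDataStore code; `validate_strict_batch`, `partition_key_of` and `size_of` are illustrative names.

```python
# Sketch of Strict-mode validation against the Azure Table Storage batch
# rules. All names here are hypothetical, not the TableDataStore API.

MAX_ENTITIES = 100
MAX_ENTITY_BYTES = 1 * 1024 * 1024   # 1 MB per entity
MAX_BATCH_BYTES = 4 * 1024 * 1024    # 4 MB per batch

def validate_strict_batch(entities, partition_key_of, size_of):
    """Raise ValueError if the batch violates any Strict-mode rule."""
    if len(entities) > MAX_ENTITIES:
        raise ValueError("batch exceeds 100 entities")
    if len({partition_key_of(e) for e in entities}) > 1:
        raise ValueError("all entities must share the same Partition Key")
    total_bytes = 0
    for entity in entities:
        entity_bytes = size_of(entity)
        if entity_bytes > MAX_ENTITY_BYTES:
            raise ValueError("entity exceeds 1 MB")
        total_bytes += entity_bytes
    if total_bytes > MAX_BATCH_BYTES:
        raise ValueError("batch exceeds 4 MB")
```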
## BatchingMode.Strong
Strong mode is basically the same as Strict, with these big differences:
- Any number of entities can be handled; they are grouped internally into sub-batches to overcome the 100-entity / 4 MB limitations
- Entities may belong to any partition, and are grouped internally into per-partition sub-batches
This mode is especially useful when you have a lot of entity operations and your entities come from different partitions. All the heavy lifting is done automatically. Individual sub-batches may however fail, so this is no longer a single transaction.
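The grouping described above can be sketched as a two-step process: bucket entities by partition key, then chunk each bucket into sub-batches of at most 100 entities. This is a simplified illustration under assumed names (`group_into_sub_batches`, `partition_key_of`); the real implementation would also account for the 4 MB payload limit.

```python
# Sketch of Strong-mode grouping: per-partition buckets, each chunked into
# sub-batches of at most 100 entities. Names are hypothetical.
from collections import defaultdict

def group_into_sub_batches(entities, partition_key_of, max_per_batch=100):
    by_partition = defaultdict(list)
    for entity in entities:
        by_partition[partition_key_of(entity)].append(entity)
    sub_batches = []
    for group in by_partition.values():
        # Chunk each partition's entities into batches of max_per_batch.
        for i in range(0, len(group), max_per_batch):
            sub_batches.append(group[i:i + max_per_batch])
    return sub_batches
```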
## BatchingMode.Loose
Loose mode combines table entities into sub-batches like Strong mode does, but also allows blob operations to be performed along with them. This mode is no longer transactional in any meaningful way. Each sub-batch of entities is followed by its related blob operations, and there is no guarantee that the produced outcome is fully consistent: one or more blob operations could fail while the related table entity batch succeeds, potentially leaving some entities in a broken state regarding their blobs.
This mode exists to allow batch operations with data models that contain LargeBlobs (see LargeBlobs), but you should be aware of the issues that may come along with it.
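The inconsistency described above follows directly from the execution order: the table sub-batch commits first, and blob failures afterwards cannot roll it back. The sketch below illustrates that flow only; `execute_loose`, `commit_table_batch` and `blob_ops_for` are hypothetical names, not the TableDataStore API.

```python
# Illustrative sketch of the Loose-mode flow: each sub-batch of table
# entities commits transactionally, then its blob operations run. A failing
# blob operation does not undo the already-committed table entities, which
# is the potential inconsistency described above.
def execute_loose(sub_batches, commit_table_batch, blob_ops_for):
    failed_blob_ops = []
    for batch in sub_batches:
        commit_table_batch(batch)           # transactional per sub-batch
        for entity in batch:
            for blob_op in blob_ops_for(entity):
                try:
                    blob_op()
                except Exception as exc:    # table data stays committed
                    failed_blob_ops.append((entity, exc))
    return failed_blob_ops
```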