Replies: 5 comments 9 replies
-
What is wrong with sequentially applying File Append Transactions and waiting for a receipt before applying the next one?
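For reference, a minimal sketch of that sequential pattern, assuming a hypothetical `FileClient` interface and an illustrative 4 KiB chunk size (neither is an actual SDK name):

```java
import java.util.Arrays;

public class SequentialUploader {
    // Hypothetical interface standing in for whatever client submits the append
    // transaction and blocks until its receipt is available.
    public interface FileClient {
        void appendAndWait(String fileId, byte[] chunk) throws Exception;
    }

    private static final int CHUNK_SIZE = 4 * 1024; // illustrative per-transaction chunk size

    // Uploads strictly in order: each append must reach consensus (receipt)
    // before the next chunk is submitted, so latency grows linearly with chunk count.
    public static void upload(FileClient client, String fileId, byte[] contents) throws Exception {
        for (int offset = 0; offset < contents.length; offset += CHUNK_SIZE) {
            int end = Math.min(offset + CHUNK_SIZE, contents.length);
            byte[] chunk = Arrays.copyOfRange(contents, offset, end);
            client.appendAndWait(fileId, chunk); // blocks on the receipt before continuing
        }
    }
}
```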
-
What if we allowed for out-of-order assembly of a file? Suppose that each chunk had a byte offset associated with it. The server could get chunks in any order and assemble them on the fly. Since files have an upper size limit (I think it is 1 MB, I would need to check), it should be OK. When a chunk comes in, we read the file data and apply the chunk. If the chunk start is < the file size, then we just apply the chunk, overwriting anything that is there. If the chunk start is > the file size, then we fill with 0's until we get to the right point and then insert the chunk. By allowing chunks to come in out of order we should get maximum performance, because the sender could break the file up and just start submitting transactions to multiple nodes.

I was thinking of yet another solution, where we actually store a file's chunks as unique entries in the merkle tree and combine them only on queries. But that introduces a lot of complexity if I have two existing chunks and a new one is uploaded that overlaps with both. So probably we don't want to attempt that trick unless we have to.
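A minimal sketch of that assembly rule, assuming an in-memory byte array for the file and an assumed 1 MiB cap (the exact limit would need to be confirmed):

```java
import java.util.Arrays;

public class OutOfOrderAssembler {
    private static final int MAX_FILE_SIZE = 1024 * 1024; // assumed 1 MiB upper limit

    /**
     * Applies a chunk at the given byte offset to the current file contents.
     * If the chunk starts inside the existing data it overwrites what is there;
     * if it starts past the end, the gap is filled with zeros first.
     */
    public static byte[] applyChunk(byte[] current, int offset, byte[] chunk) {
        if (offset < 0 || (long) offset + chunk.length > MAX_FILE_SIZE) {
            throw new IllegalArgumentException("chunk exceeds maximum file size");
        }
        int newLength = Math.max(current.length, offset + chunk.length);
        // copyOf zero-fills any bytes beyond current.length, which handles the gap case.
        byte[] result = Arrays.copyOf(current, newLength);
        System.arraycopy(chunk, 0, result, offset, chunk.length);
        return result;
    }
}
```

Note that with this rule the final content is independent of arrival order only when chunks do not overlap; overlapping chunks resolve to "last applied wins".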
-
Before diving too deep into the specifications, I think we need to agree on the requirements. I made a first draft of the motivation and user stories:
Did I miss anything?
-
I'd like to see an API that takes the file offset and total file size along with the current chunk's size + data when uploading a chunk. Then the first chunk through consensus could allocate the file, all subsequent chunks could be written at the proper offset into the file, and you could submit all chunks simultaneously and wait for all of them to complete - which could be in the very same block. We could also fix the chunk size rather than letting it be arbitrary (except for the last chunk). This would help with the "rolling hash" issue if the hash were "segmented" by chunk. When people are uploading a file they know everything about the file in advance - it's not like they're generating it on the fly one sequential chunk at a time. So the file size is known in advance.
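A sketch of what that transaction and its handling might look like; the class and field names here (`ChunkTransaction`, `totalFileSize`, etc.) are hypothetical, not existing protobuf definitions:

```java
import java.util.HashMap;
import java.util.Map;

public class OffsetChunkUpload {
    public static final int CHUNK_SIZE = 4 * 1024; // fixed chunk size; only the last chunk may be shorter

    // Hypothetical transaction payload: the uploader knows the total size up front.
    public record ChunkTransaction(String fileId, int totalFileSize, int offset, byte[] data) {}

    private final Map<String, byte[]> files = new HashMap<>();

    public void handle(ChunkTransaction tx) {
        if (tx.offset() % CHUNK_SIZE != 0) {
            throw new IllegalArgumentException("offset must be a multiple of the fixed chunk size");
        }
        if (tx.offset() + tx.data().length > tx.totalFileSize()) {
            throw new IllegalArgumentException("chunk overruns the declared file size");
        }
        // Whichever chunk reaches consensus first allocates the full file;
        // every later chunk just writes into its slot, so arrival order no longer matters.
        byte[] file = files.computeIfAbsent(tx.fileId(), id -> new byte[tx.totalFileSize()]);
        System.arraycopy(tx.data(), 0, file, tx.offset(), tx.data().length);
    }
}
```

Fixing the chunk size also means a "segmented" hash can be computed per fixed-size slot, independent of the order in which slots arrive.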
-
Here is an initial suggestion for the API that allows streaming. This API introduces the concept of an ongoing upload. It starts with a … If the new behavior is chosen, we store all uploaded fragments separately. Once the server gets the signal that all fragments have been uploaded, we assemble the new file content from the fragments and replace the stored file. This ensures that all read requests provide a valid file, even while an upload is ongoing.
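A sketch of the storage side of that idea, with hypothetical class and method names; the key property is that reads always see the last fully assembled file, never a half-applied upload:

```java
import java.util.Map;
import java.util.TreeMap;

public class StreamingFileUpload {
    private byte[] committedContent = new byte[0];                        // what read queries see
    private final TreeMap<Integer, byte[]> fragments = new TreeMap<>();   // pending fragments, keyed by offset

    // Called for each uploaded fragment; nothing visible to readers changes yet.
    public synchronized void addFragment(int offset, byte[] data) {
        fragments.put(offset, data.clone());
    }

    // Called once the client signals that all fragments have been uploaded:
    // assemble the new content from the fragments and replace the stored file in one step.
    public synchronized void complete() {
        int newSize = 0;
        for (Map.Entry<Integer, byte[]> e : fragments.entrySet()) {
            newSize = Math.max(newSize, e.getKey() + e.getValue().length);
        }
        byte[] assembled = new byte[newSize];
        for (Map.Entry<Integer, byte[]> e : fragments.entrySet()) {
            System.arraycopy(e.getValue(), 0, assembled, e.getKey(), e.getValue().length);
        }
        committedContent = assembled;
        fragments.clear();
    }

    // Reads always return a complete, consistent file, even while an upload is ongoing.
    public synchronized byte[] read() {
        return committedContent.clone();
    }
}
```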
-
We should improve the way we segment a large file into transactions and upload them to the network, to be more resilient to failure and to have greater control over when the file is fully uploaded. We had two ideas:

1. Add to the FileUpdateTransactionBody or FileCreateTransactionBody a "1 of X segments" field. This way, the server (and people consuming records in the mirror node) can determine whether all expected file segments have been uploaded.
2. In the FileAppendTransactionBody we could have the rolling hash of the file up to that point in time. In this way, you could have a strong guarantee that the file in state is what you expect before applying a segment (sketched below).

Maybe we do both of the above.
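A sketch combining both ideas, with hypothetical field names: the segment count tells the handler (and mirror node consumers) when the file is complete, and the rolling hash lets it verify the current state before applying the next segment. SHA-256 over the prior contents is used purely for illustration.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class SegmentedFileState {
    private byte[] contents = new byte[0];
    private int segmentsApplied = 0;
    private int totalSegments = -1; // set by the create/update ("1 of X segments" field)

    // Hypothetical shape of an append carrying both proposed fields.
    public record Segment(int index, int totalSegments, byte[] expectedPriorHash, byte[] data) {}

    public void startUpload(int totalSegments) {
        this.totalSegments = totalSegments;
        this.segmentsApplied = 0;
        this.contents = new byte[0];
    }

    public void applySegment(Segment seg) {
        // Rolling-hash guarantee: refuse to apply unless the file in state
        // is exactly what the uploader expected it to be at this point.
        if (!Arrays.equals(sha256(contents), seg.expectedPriorHash())) {
            throw new IllegalStateException("file state does not match expected rolling hash");
        }
        byte[] next = Arrays.copyOf(contents, contents.length + seg.data().length);
        System.arraycopy(seg.data(), 0, next, contents.length, seg.data().length);
        contents = next;
        segmentsApplied++;
    }

    // "1 of X segments": anyone reading the record stream can tell whether the file is complete.
    public boolean isComplete() {
        return totalSegments > 0 && segmentsApplied == totalSegments;
    }

    private static byte[] sha256(byte[] input) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(input);
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e);
        }
    }
}
```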