Replies: 5 comments 9 replies
-
What is wrong with sequentially applying File Append Transactions and waiting for a receipt before applying the next one?
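For reference, a minimal sketch of that sequential pattern, assuming a hypothetical `FileClient` interface and an illustrative 4 KiB chunk size (neither is an actual SDK name):

```java
import java.util.Arrays;

public class SequentialUploader {
    // Hypothetical interface standing in for whatever client submits the append
    // transaction and blocks until its receipt is available.
    public interface FileClient {
        void appendAndWait(String fileId, byte[] chunk) throws Exception;
    }

    private static final int CHUNK_SIZE = 4 * 1024; // illustrative per-transaction chunk size

    // Uploads strictly in order: each append must reach consensus (receipt)
    // before the next chunk is submitted, so latency grows linearly with chunk count.
    public static void upload(FileClient client, String fileId, byte[] contents) throws Exception {
        for (int offset = 0; offset < contents.length; offset += CHUNK_SIZE) {
            int end = Math.min(offset + CHUNK_SIZE, contents.length);
            byte[] chunk = Arrays.copyOfRange(contents, offset, end);
            client.appendAndWait(fileId, chunk); // blocks on the receipt before continuing
        }
    }
}
```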
-
What if we allowed for out-of-order assembly of a file? Suppose that each chunk had a byte offset associated with it. The server could get chunks in any order and assemble them on the fly. Since files have an upper size limit (I think it is 1 MB, I would need to check), it should be OK. When a chunk comes in, we read the file data and apply the chunk. If the chunk start is < the file size, then we just apply the chunk, overwriting anything that is there. If the chunk start is > the file size, then we fill with 0's until we get to the right point and then insert the chunk. By allowing chunks to come in out of order we should get maximum performance, because the sender could break the file up and just start submitting transactions to multiple nodes.

I was thinking of yet another solution, where we actually store a file's chunks as unique entries in the merkle tree and combine them only on queries. But that introduces a lot of complexity if I have two existing chunks and a new one is uploaded that overlaps with both. So probably we don't want to attempt that trick unless we have to.
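A minimal sketch of that assembly rule, assuming an in-memory byte array for the file and an assumed 1 MiB cap (the exact limit would need to be confirmed):

```java
import java.util.Arrays;

public class OutOfOrderAssembler {
    private static final int MAX_FILE_SIZE = 1024 * 1024; // assumed 1 MiB upper limit

    /**
     * Applies a chunk at the given byte offset to the current file contents.
     * If the chunk starts inside the existing data it overwrites what is there;
     * if it starts past the end, the gap is filled with zeros first.
     */
    public static byte[] applyChunk(byte[] current, int offset, byte[] chunk) {
        if (offset < 0 || (long) offset + chunk.length > MAX_FILE_SIZE) {
            throw new IllegalArgumentException("chunk exceeds maximum file size");
        }
        int newLength = Math.max(current.length, offset + chunk.length);
        // copyOf zero-fills any bytes beyond current.length, which handles the gap case.
        byte[] result = Arrays.copyOf(current, newLength);
        System.arraycopy(chunk, 0, result, offset, chunk.length);
        return result;
    }
}
```

Note that with this rule the final content is independent of arrival order only when chunks do not overlap; overlapping chunks resolve to "last applied wins".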
-
Before diving too deep into the specifications, I think we need to agree on the requirements. I made a first draft of the motivation and user stories:
Did I miss anything?
-
I'd like to see an API that takes the file offset and total file size along with the current chunk's size + data when uploading a chunk. Then the first chunk through consensus could allocate the file, all subsequent chunks could be written at the proper offset into the file, and you could submit all chunks simultaneously and wait for all of them to complete - which could be in the very same block. We could also fix the chunk size rather than letting it be arbitrary (except for the last chunk). This would help with the "rolling hash" issue if the hash were "segmented" by chunk. When people are uploading a file they know everything about the file in advance - it's not like they're generating it on the fly one sequential chunk at a time. So the file size is known in advance.
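A sketch of what that transaction and its handling might look like; the class and field names here (`ChunkTransaction`, `totalFileSize`, etc.) are hypothetical, not existing protobuf definitions:

```java
import java.util.HashMap;
import java.util.Map;

public class OffsetChunkUpload {
    public static final int CHUNK_SIZE = 4 * 1024; // fixed chunk size; only the last chunk may be shorter

    // Hypothetical transaction payload: the uploader knows the total size up front.
    public record ChunkTransaction(String fileId, int totalFileSize, int offset, byte[] data) {}

    private final Map<String, byte[]> files = new HashMap<>();

    public void handle(ChunkTransaction tx) {
        if (tx.offset() % CHUNK_SIZE != 0) {
            throw new IllegalArgumentException("offset must be a multiple of the fixed chunk size");
        }
        if (tx.offset() + tx.data().length > tx.totalFileSize()) {
            throw new IllegalArgumentException("chunk overruns the declared file size");
        }
        // Whichever chunk reaches consensus first allocates the full file;
        // every later chunk just writes into its slot, so arrival order no longer matters.
        byte[] file = files.computeIfAbsent(tx.fileId(), id -> new byte[tx.totalFileSize()]);
        System.arraycopy(tx.data(), 0, file, tx.offset(), tx.data().length);
    }
}
```

Fixing the chunk size also means a "segmented" hash can be computed per fixed-size slot, independent of the order in which slots arrive.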
-
Here is an initial suggestion for the API that allows streaming. This API introduces the concept of an ongoing upload. It starts with a … If the new behavior is chosen, we store all uploaded fragments separately. Once the server gets the signal that all fragments have been uploaded, we assemble the new file content from the fragments and replace the stored file. This ensures that all read requests provide a valid file, even while an upload is ongoing.
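A sketch of the storage side of that idea, with hypothetical class and method names; the key property is that reads always see the last fully assembled file, never a half-applied upload:

```java
import java.util.Map;
import java.util.TreeMap;

public class StreamingFileUpload {
    private byte[] committedContent = new byte[0];                        // what read queries see
    private final TreeMap<Integer, byte[]> fragments = new TreeMap<>();   // pending fragments, keyed by offset

    // Called for each uploaded fragment; nothing visible to readers changes yet.
    public synchronized void addFragment(int offset, byte[] data) {
        fragments.put(offset, data.clone());
    }

    // Called once the client signals that all fragments have been uploaded:
    // assemble the new content from the fragments and replace the stored file in one step.
    public synchronized void complete() {
        int newSize = 0;
        for (Map.Entry<Integer, byte[]> e : fragments.entrySet()) {
            newSize = Math.max(newSize, e.getKey() + e.getValue().length);
        }
        byte[] assembled = new byte[newSize];
        for (Map.Entry<Integer, byte[]> e : fragments.entrySet()) {
            System.arraycopy(e.getValue(), 0, assembled, e.getKey(), e.getValue().length);
        }
        committedContent = assembled;
        fragments.clear();
    }

    // Reads always return a complete, consistent file, even while an upload is ongoing.
    public synchronized byte[] read() {
        return committedContent.clone();
    }
}
```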
-
We should improve the way we segment a large file into transactions and upload them to the network, to be more resilient to failure and to have greater control over when the file is fully uploaded. We had two ideas:

1. Add to the FileUpdateTransactionBody or FileCreateTransactionBody a "1 of X segments" field. This way, the server (and people consuming records in the mirror node) can determine whether all expected file segments have been uploaded.
2. In the FileAppendTransactionBody we could have the rolling hash of the file up to that point in time. In this way, you could have a strong guarantee that the file in state is what you expect before applying a segment (sketched below).

Maybe we do both of the above.
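A sketch combining both ideas, with hypothetical field names: the segment count tells the handler (and mirror node consumers) when the file is complete, and the rolling hash lets it verify the current state before applying the next segment. SHA-256 over the prior contents is used purely for illustration.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class SegmentedFileState {
    private byte[] contents = new byte[0];
    private int segmentsApplied = 0;
    private int totalSegments = -1; // set by the create/update ("1 of X segments" field)

    // Hypothetical shape of an append carrying both proposed fields.
    public record Segment(int index, int totalSegments, byte[] expectedPriorHash, byte[] data) {}

    public void startUpload(int totalSegments) {
        this.totalSegments = totalSegments;
        this.segmentsApplied = 0;
        this.contents = new byte[0];
    }

    public void applySegment(Segment seg) {
        // Rolling-hash guarantee: refuse to apply unless the file in state
        // is exactly what the uploader expected it to be at this point.
        if (!Arrays.equals(sha256(contents), seg.expectedPriorHash())) {
            throw new IllegalStateException("file state does not match expected rolling hash");
        }
        byte[] next = Arrays.copyOf(contents, contents.length + seg.data().length);
        System.arraycopy(seg.data(), 0, next, contents.length, seg.data().length);
        contents = next;
        segmentsApplied++;
    }

    // "1 of X segments": anyone reading the record stream can tell whether the file is complete.
    public boolean isComplete() {
        return totalSegments > 0 && segmentsApplied == totalSegments;
    }

    private static byte[] sha256(byte[] input) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(input);
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e);
        }
    }
}
```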