near-lake-torrents

This project provides a simple way to download NEAR blockchain archival data from NEAR Lake over BitTorrent.

File Structure

NEAR Lake on S3 is organized into folders by block height only. Each folder is named with the zero-padded block height, e.g. 000042007123/, and contains all the data for that block. Each folder holds several files:

  • block.json - block header
  • shard_0.json - shard 0 data
  • shard_*.json - data for the other shards
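
As a rough illustration of this layout, the sketch below prints the object keys that would belong to one block. The helper name lake_keys and its NUM_SHARDS argument are hypothetical; the number of shards is not stated here and must be supplied by the caller.

```shell
# Hypothetical helper: print the NEAR Lake keys for one block.
# Usage: lake_keys HEIGHT NUM_SHARDS
# HEIGHT must be a plain decimal number without leading zeros.
lake_keys() (
    dir=$(printf '%012d' "$1")            # folder name: zero-padded height
    printf '%s/block.json\n' "$dir"       # block header
    shard=0
    while [ "$shard" -lt "$2" ]; do       # one file per shard
        printf '%s/shard_%d.json\n' "$dir" "$shard"
        shard=$(( shard + 1 ))
    done
)

lake_keys 42007123 2
```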

This layout works poorly on a local file system (and, as a result, in torrents) because there are too many folders at the top level. The data is also uncompressed, which makes transfers costlier than necessary.

To solve this, we have a script that takes NEAR Lake data and reorganizes it into a more efficient structure. It also compresses the data to save on transfer costs.

See load-raw-near-lake for more details.

File structure generated by the script:

Top level:

  • block - block header
  • 0 - data for shard 0
  • * - data for other shards

Inside each shard folder:

  • 000042/007/000042007120.tgz - data for blocks 42007120-42007124 (assuming 5 blocks per archive)
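
The path of the archive holding a given block can be derived from the block height. The sketch below is inferred from the single example above, so treat it as an assumption rather than the script's actual logic: it assumes the first path component is the height divided by one million, the second is the next three digits, and the file name is the first block of the archive. The helper name archive_path is hypothetical.

```shell
# Hypothetical helper: print the archive path, relative to a shard folder,
# that should contain a given block.
# Usage: archive_path HEIGHT [BLOCKS_PER_ARCHIVE]
# HEIGHT must be a plain decimal number without leading zeros.
archive_path() (
    height=$1
    per_archive=${2:-5}                          # assumption: 5 blocks per archive
    start=$(( height - height % per_archive ))   # first block in the archive
    millions=$(( start / 1000000 ))              # first 6 digits of the padded height
    thousands=$(( start / 1000 % 1000 ))         # next 3 digits
    printf '%06d/%03d/%012d.tgz\n' "$millions" "$thousands" "$start"
)

archive_path 42007123   # prints 000042/007/000042007120.tgz
```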

Inside each .tgz archive:

  • 000042007120.json - data for block 42007120
  • 000042007121.json - data for block 42007121
  • 000042007122.json - data for block 42007122
  • 000042007123.json - data for block 42007123
  • 000042007124.json - data for block 42007124
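
To read a single block without unpacking the whole archive, tar can write one member to standard output. This is a minimal sketch: the helper name read_block is hypothetical, and it assumes the members are stored at the archive root under exactly the names listed above (no leading ./ or directory prefix).

```shell
# Hypothetical helper: print one block's JSON from a downloaded archive.
# Usage: read_block ARCHIVE HEIGHT
# HEIGHT must be a plain decimal number without leading zeros.
read_block() (
    archive=$1
    member=$(printf '%012d.json' "$2")   # e.g. 000042007123.json
    # -x extract, -z gunzip, -O write to stdout, -f archive file
    tar -xzOf "$archive" "$member"
)

# Example (path is illustrative):
#   read_block 0/000042/007/000042007120.tgz 42007123
```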

Currently, each range of one million blocks is published as a separate torrent, so you can download only the data you need.
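
To work out which million-block range a block falls into, integer division is enough. The sketch below assumes the ranges are aligned to multiples of one million, which matches the 000042/ top-level folder in the example but is not confirmed here. The helper name torrent_range is hypothetical, and how each range maps to a line in the magnet link files is not covered.

```shell
# Hypothetical helper: print the first and last block height of the
# one-million-block range containing HEIGHT.
# Usage: torrent_range HEIGHT
torrent_range() (
    first=$(( $1 / 1000000 * 1000000 ))   # round down to a multiple of one million
    printf '%d %d\n' "$first" "$(( first + 999999 ))"
)

torrent_range 42007123   # prints 42000000 42999999
```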

How to use

  1. Get the magnet links for the data you need. They are stored in this repo as network_id/shard_id.csv, e.g. mainnet/0.csv.
  2. Add the desired magnet links to your BitTorrent client. For example, with Transmission using transmission-remote:
    # Download all data for shard 0
    for magnet in $(cat mainnet/0.csv); do
        transmission-remote -a "$magnet" --download-dir /global/path/to/download/0
    done