Use GGUF to store model weights #69

Merged: 10 commits merged into main from gguf on Mar 17, 2024
Conversation

@certik (Owner) commented on Mar 15, 2024

Here are the two smallest models. 124M:

$ gguf-dump model_fastgpt_124M_v2.gguf
* Loading: model_fastgpt_124M_v2.gguf
* File is LITTLE endian, script is running on a LITTLE endian host.

* Dumping 4 key/value pair(s)
      1: UINT32     |        1 | GGUF.version = 3
      2: UINT64     |        1 | GGUF.tensor_count = 22
      3: UINT64     |        1 | GGUF.kv_count = 1
      4: INT32      |        1 | general.data_offset = 1088

* Dumping 22 tensor(s)
      1:         12 |    12,     1,     1,     1 | I32     | header
      2:   38597376 |   768, 50257,     1,     1 | F32     | wte
      3:     786432 |   768,  1024,     1,     1 | F32     | wpe
      4:   28311552 |  3072,   768,    12,     1 | F32     | mlp_fc_w
      5:      36864 |  3072,    12,     1,     1 | F32     | mlp_fc_b
      6:   28311552 |   768,  3072,    12,     1 | F32     | mlp_proj_w
      7:       9216 |   768,    12,     1,     1 | F32     | mlp_proj_b
      8:   21233664 |  2304,   768,    12,     1 | F32     | attn_w
      9:      27648 |  2304,    12,     1,     1 | F32     | attn_b
     10:    7077888 |   768,   768,    12,     1 | F32     | attn_proj_w
     11:       9216 |   768,    12,     1,     1 | F32     | attn_proj_b
     12:       9216 |   768,    12,     1,     1 | F32     | ln1_b
     13:       9216 |   768,    12,     1,     1 | F32     | ln1_g
     14:       9216 |   768,    12,     1,     1 | F32     | ln2_b
     15:       9216 |   768,    12,     1,     1 | F32     | ln2_g
     16:        768 |   768,     1,     1,     1 | F32     | lnf_b
     17:        768 |   768,     1,     1,     1 | F32     | lnf_g
     18:      50258 | 50258,     1,     1,     1 | I32     | idx
     19:     356735 | 356735,    1,     1,     1 | I8      | decoder_txt
     20:      50002 | 50002,     1,     1,     1 | I32     | vocab_idx
     21:     406304 | 406304,    1,     1,     1 | I8      | vocab_txt
     22:        256 |   256,     1,     1,     1 | I32     | byte_decoder

and 355M:

$ gguf-dump model_fastgpt_355M_v2.gguf 
* Loading: model_fastgpt_355M_v2.gguf
* File is LITTLE endian, script is running on a LITTLE endian host.

* Dumping 4 key/value pair(s)
      1: UINT32     |        1 | GGUF.version = 3
      2: UINT64     |        1 | GGUF.tensor_count = 22
      3: UINT64     |        1 | GGUF.kv_count = 1
      4: INT32      |        1 | general.data_offset = 1088

* Dumping 22 tensor(s)
      1:         12 |    12,     1,     1,     1 | I32     | header
      2:   51463168 |  1024, 50257,     1,     1 | F32     | wte
      3:    1048576 |  1024,  1024,     1,     1 | F32     | wpe
      4:  100663296 |  4096,  1024,    24,     1 | F32     | mlp_fc_w
      5:      98304 |  4096,    24,     1,     1 | F32     | mlp_fc_b
      6:  100663296 |  1024,  4096,    24,     1 | F32     | mlp_proj_w
      7:      24576 |  1024,    24,     1,     1 | F32     | mlp_proj_b
      8:   75497472 |  3072,  1024,    24,     1 | F32     | attn_w
      9:      73728 |  3072,    24,     1,     1 | F32     | attn_b
     10:   25165824 |  1024,  1024,    24,     1 | F32     | attn_proj_w
     11:      24576 |  1024,    24,     1,     1 | F32     | attn_proj_b
     12:      24576 |  1024,    24,     1,     1 | F32     | ln1_b
     13:      24576 |  1024,    24,     1,     1 | F32     | ln1_g
     14:      24576 |  1024,    24,     1,     1 | F32     | ln2_b
     15:      24576 |  1024,    24,     1,     1 | F32     | ln2_g
     16:       1024 |  1024,     1,     1,     1 | F32     | lnf_b
     17:       1024 |  1024,     1,     1,     1 | F32     | lnf_g
     18:      50258 | 50258,     1,     1,     1 | I32     | idx
     19:     356735 | 356735,    1,     1,     1 | I8      | decoder_txt
     20:      50002 | 50002,     1,     1,     1 | I32     | vocab_idx
     21:     406304 | 406304,    1,     1,     1 | I8      | vocab_txt
     22:        256 |   256,     1,     1,     1 | I32     | byte_decoder

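For reference, the same metadata can be read back programmatically. Below is a minimal sketch, assuming the `gguf` Python package from llama.cpp (the same package that provides `gguf-dump`); the file name comes from the dump above, and the exact API (`GGUFReader`, `.fields`, `.tensors`) may differ across package versions.

```python
from gguf import GGUFReader  # pip install gguf (llama.cpp's gguf-py)

reader = GGUFReader("model_fastgpt_124M_v2.gguf")

# Key/value metadata: in these files the only custom pair is
# general.data_offset (version, tensor_count, kv_count are implicit).
for name in reader.fields:
    print(name)

# Tensor metadata: name, shape, dtype, element count.
for t in reader.tensors:
    print(f"{t.name:14s} {list(t.shape)!s:28s} {t.tensor_type.name:4s} {t.n_elements}")

# Sanity check: summing the F32 tensor sizes reproduces the model's
# parameter count (the remaining I32/I8 tensors hold the tokenizer).
params = sum(int(t.n_elements) for t in reader.tensors
             if t.tensor_type.name == "F32")
print("F32 parameters:", params)
```

Summing the F32 element counts in the 124M dump by hand gives 124,439,808 (wte + wpe + all block weights, biases, and layer norms), which matches the usual GPT-2 small parameter count, so the dump above is self-consistent.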
Review threads: driver.f90 (resolved); ci/build.sh (outdated, resolved)
certik marked this pull request as draft on March 15, 2024 at 20:23
certik marked this pull request as ready for review on March 16, 2024 at 00:42
certik merged commit caf364a into main on March 17, 2024; 2 checks passed
certik deleted the gguf branch on March 17, 2024 at 17:46