Implements Reformer: The Efficient Transformer in pytorch.

Rick-McCoy/Reformer-pytorch

Reformer-pytorch

Implements Reformer: The Efficient Transformer in pytorch. (Work in progress)

Prerequisites

  • Tested with Python 3.7.5 and PyTorch 1.4.0.
  • This code is built on the pytorch-lightning framework.
  • Install the dependencies with pip install -r requirements.txt

How to train

Datasets

  • If you want to modify trainer.py or model\model.py, it is recommended that you familiarize yourself with the pytorch-lightning library beforehand.
  • A custom copy task and a music dataset have been implemented under datasets\dataloader.py. Modify as needed.
  • A config yaml file must be placed under config. See the provided yaml files for the basic structure.
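As a rough illustration of what a copy task can look like, the sketch below generates sequences of the form 0 w 0 w, where w is a random payload. This is one common formulation, not necessarily what datasets\dataloader.py does; the function names, the use of token 0 as a separator, and the plain-list output are all assumptions made for this example.

```python
import random


def make_copy_example(length=32, vocab=128, rng=random):
    """Build one copy-task sequence of the form 0 w 0 w.

    Assumption: token 0 is reserved as a separator and the payload w
    draws tokens from 1..vocab-1, so the full sequence has `length` tokens.
    """
    if length % 2 != 0 or length < 4:
        raise ValueError("length must be an even number >= 4")
    half = length // 2 - 1  # payload tokens per half, excluding the separator
    payload = [rng.randrange(1, vocab) for _ in range(half)]
    return [0] + payload + [0] + payload


def make_copy_batch(batch_size, length=32, vocab=128, seed=None):
    """Return `batch_size` independent copy-task sequences as lists of ints."""
    rng = random.Random(seed)
    return [make_copy_example(length, vocab, rng) for _ in range(batch_size)]
```

The model is then expected to reproduce the second half of each sequence from the first. Converting the lists to tensors and wrapping them in a dataset class is left out here.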

Running the code

  • python3 trainer.py -c \path\to\config\yaml -n [name of run] -b [batch size] -f [fast dev run] -v [version number]
  • The -f flag is used for debugging; only one batch of training, validation, and testing will be calculated.
  • The -v flag is used for resuming from checkpoints; leave empty for new version.
  • A toy copy task with sequence length 32 and vocabulary size 128 converges in roughly 6k steps with a batch size of 1024, a learning rate of 1e-3, and the Adam optimizer. The checkpoint is located under checkpoints\.
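For readers who want to see how flags like these are typically wired up, here is a minimal argparse sketch. It is not taken from trainer.py: the long option names, the defaults, and the str2bool helper are assumptions for illustration, and only the short flags (-c, -n, -b, -f, -v) come from the usage line above.

```python
import argparse


def str2bool(value):
    """Parse a command-line string such as 'True' or 'false' into a bool."""
    lowered = value.lower()
    if lowered in ("true", "t", "1", "yes", "y"):
        return True
    if lowered in ("false", "f", "0", "no", "n"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Hypothetical parser mirroring the short flags listed in this README."""
    parser = argparse.ArgumentParser(description="Sketch of a trainer CLI")
    parser.add_argument("-c", "--config", required=True,
                        help="path to the config yaml")
    parser.add_argument("-n", "--name", required=True,
                        help="name of the run")
    parser.add_argument("-b", "--batch-size", type=int, default=None,
                        help="batch size (assumed to override the config)")
    parser.add_argument("-f", "--fast-dev-run", type=str2bool, default=False,
                        help="run a single batch of train/val/test for debugging")
    parser.add_argument("-v", "--version", type=int, default=None,
                        help="version to resume from; omit to start a new one")
    return parser
```

Leaving -v unset yields None, which matches the "leave empty for new version" behaviour described above.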

Validation accuracy

How to sample

Preparing the checkpoints

  • A complete checkpoint folder must be placed under logs\. Use the entire folder that pytorch-lightning saves automatically.

Running the code

  • The corresponding version number must be provided with the -v flag.
  • Run the code with the -s flag set to True. When using the music dataset, this generates one sample under sample\.
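To check that a checkpoint folder is in place before sampling, a small helper like the one below can locate the newest checkpoint file. The directory layout it assumes (logs/<run name>/version_<n>/checkpoints/*.ckpt) is a guess based on how pytorch-lightning loggers commonly arrange their output, and the function name is hypothetical; adjust the path to match what your run actually produced.

```python
from pathlib import Path


def find_latest_checkpoint(log_root, run_name, version):
    """Return the most recently modified .ckpt file for a given run and version.

    Assumed layout: <log_root>/<run_name>/version_<version>/checkpoints/*.ckpt
    """
    ckpt_dir = Path(log_root) / run_name / f"version_{version}" / "checkpoints"
    # glob yields nothing if the directory is missing, so one check covers both cases
    candidates = sorted(ckpt_dir.glob("*.ckpt"), key=lambda p: p.stat().st_mtime)
    if not candidates:
        raise FileNotFoundError(f"no .ckpt files found in {ckpt_dir}")
    return candidates[-1]
```

A FileNotFoundError here usually means the folder was copied only partially or the version number does not match.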

To-do

  • Implement general framework of Reformer
  • Rewrite using pytorch-lightning framework
  • Implement Label Smoothing
  • Implement LSH attention
  • Implement reversible layer
  • Implement autoregressive sampling
  • Implement various datasets

Implementation Authors

License

MIT License

Acknowledgements

  • The general structure of this code is based on The Annotated Transformer, albeit heavily modified.
  • I am aware that reformer-lm exists. However, I was frustrated with the original trax implementation that the authors provided, and decided to rewrite the entire thing from the ground up. Naturally, expect bugs everywhere.
  • Thanks to MINDsLab for providing training resources.
