[1] Y. Tay et al., "Synthesizer: Rethinking Self-Attention in Transformer Models," arXiv preprint arXiv:2005.00743, 2020.
[2] M. Xu, S. Li, and X.-L. Zhang, "Transformer-based End-to-End Speech Recognition with Local Dense Synthesizer Attention," in Proc. ICASSP, 2021, pp. 5899-5903, doi: 10.1109/ICASSP39728.2021.9414353.