Project Awesome

Recent Language Models > MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers, Wenhui Wang, et al., arXiv, 2020.
