BPE
Japanese and Korean voice search
https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/37842.pdf
Proposes wordpieces, 2012
Neural Machine Translation of Rare Words with Subword Units
https://arxiv.org/abs/1508.07909
Applies BPE to subword segmentation for NMT, 2016
Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
https://arxiv.org/abs/1804.10959
2018
SentencePiece: A Simple and Language Independent Subword Tokenizer and Detokenizer for Neural Text Processing
https://arxiv.org/abs/1808.06226
2018