Chirkova, Nadezhda
6 publications
ICLR
2023
CodeBPE: Investigating Subtokenization Options for Large Language Model Pretraining on Source Code
ICLRW
2022
CodeBPE: Investigating Subtokenization Options for Large Language Model Pretraining on Source Code