Eosc/norbert
Working Notes for Norwegian BERT-Like Models
Available Text Corpora
Preprocessing and Tokenization
The SentencePiece library finds 157 unique characters in the Norwegian Wikipedia dump.
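A character count like the one above can be roughly sanity-checked without SentencePiece. The sketch below is plain Python and is not how SentencePiece computes its alphabet internally: it applies no normalization and treats whitespace as an ordinary character, so its number may differ from the one reported by the library. The function names `count_characters` and `chars_for_coverage` and the sample sentences are hypothetical, introduced only for illustration.

```python
from collections import Counter
from typing import Iterable


def count_characters(lines: Iterable[str]) -> Counter:
    """Count how often each character occurs across all lines."""
    counts: Counter = Counter()
    for line in lines:
        # Drop the trailing newline so it is not counted as a character.
        counts.update(line.rstrip("\n"))
    return counts


def chars_for_coverage(counts: Counter, coverage: float) -> int:
    """Smallest number of most frequent characters that covers
    at least `coverage` (0..1) of all character occurrences."""
    total = sum(counts.values())
    if total == 0:
        return 0
    covered = 0
    for n, (_, freq) in enumerate(counts.most_common(), start=1):
        covered += freq
        if covered / total >= coverage:
            return n
    return len(counts)


if __name__ == "__main__":
    # Hypothetical sample; for a real corpus, pass an open text file instead.
    sample = ["Blåbærsyltetøy er godt.", "Å være eller ikke være."]
    counts = count_characters(sample)
    print("unique characters:", len(counts))
    print("characters needed for 99.95% coverage:",
          chars_for_coverage(counts, 0.9995))
```

The second helper relates to SentencePiece's `character_coverage` training option, which controls what share of character occurrences the model's alphabet must cover; rare characters beyond that share are treated as unknown.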