nlp

Language Models

Posted by kifish on May 9, 2018

Collectively, the different units into which you can break down text (words, characters, or n-grams) are called tokens, and breaking text into such tokens is called tokenization.
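
As a quick illustration of these three granularities, here is a minimal Python sketch; the example sentence, the naive whitespace split, and the `ngrams` helper are my own placeholders, not part of the quoted text:

```python
# Illustrative tokenization at three granularities: words, characters, n-grams.
text = "language models predict the next token"

# Word-level tokens: split on whitespace (a real tokenizer handles punctuation etc.).
word_tokens = text.split()

# Character-level tokens: every character becomes a token.
char_tokens = list(text)

# Word n-grams (here bigrams): consecutive groups of n word tokens.
def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

bigrams = ngrams(word_tokens, 2)

print(word_tokens)     # ['language', 'models', 'predict', 'the', 'next', 'token']
print(char_tokens[:8]) # ['l', 'a', 'n', 'g', 'u', 'a', 'g', 'e']
print(bigrams)         # [('language', 'models'), ('models', 'predict'), ...]
```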

- https://radimrehurek.com/gensim/models/doc2vec.html
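
The link above points to gensim's Doc2Vec documentation. A minimal usage sketch follows, assuming gensim 3.x or later; the toy corpus and hyperparameter values are placeholders for illustration, not recommendations:

```python
# Minimal Doc2Vec sketch with gensim (toy corpus, placeholder hyperparameters).
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

corpus = [
    "language models assign probabilities to token sequences",
    "doc2vec learns a fixed length vector for each document",
    "tokens can be words characters or n-grams",
]

# Each training document is a TaggedDocument: a token list plus a tag.
documents = [
    TaggedDocument(words=text.split(), tags=[i])
    for i, text in enumerate(corpus)
]

model = Doc2Vec(documents, vector_size=50, window=2, min_count=1, epochs=40)

# Infer a vector for an unseen document from its tokens.
vector = model.infer_vector("a new document about language models".split())
print(vector.shape)  # (50,)
```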

To be organized.