April 16, 2018
Extracting keyphrases from texts: unsupervised algorithm TopicRank
Keyphrase extraction is the task of identifying single- or multi-word expressions that represent the main topics of a document. There are two approaches to extracting topics (and/or keyphrases) from a text: supervised and unsupervised.
Supervised approach
This is a multi-label, multi-class classification problem, where the following features can be used as input:
- the text converted to a bag-of-words representation;
- the text treated as a sequence of vectors, where each vector is a pre-trained word embedding.
For the bag-of-words representation, a linear SVM is a good classifier.
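To make the bag-of-words plus linear SVM setup concrete, here is a minimal sketch using scikit-learn. The library choice, the toy documents, and the topic labels are all assumptions made for illustration and are not taken from the original post; the multi-label part is handled here by training one linear SVM per label (one-vs-rest), which is one common way to do it, not the only one.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Toy corpus (hypothetical): each document may carry several topic labels.
docs = [
    "the striker scored a late goal in the cup final",
    "the central bank raised interest rates again",
    "the football club signed a new sponsorship deal with a bank",
    "stock markets fell after the interest rate decision",
]
labels = [["sport"], ["finance"], ["sport", "finance"], ["finance"]]

# Bag-of-words: one row per document, one column per vocabulary word.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)

# Multi-label targets: one binary column per topic.
binarizer = MultiLabelBinarizer()
Y = binarizer.fit_transform(labels)

# One linear SVM per label (one-vs-rest) covers the multi-label case.
classifier = OneVsRestClassifier(LinearSVC())
classifier.fit(X, Y)

# Predict topics for an unseen document, reusing the fitted vocabulary.
new_doc = ["the bank sponsors the football final"]
predicted = classifier.predict(vectorizer.transform(new_doc))
predicted_topics = binarizer.inverse_transform(predicted)
print(predicted_topics)
```

On a real corpus the counts would usually be replaced or reweighted (for example with TF-IDF), and the label set would come from annotated keyphrases or topics rather than a hand-written list.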