python - Gensim: Extracting the TF-IDF value of a word in a corpus
I'm using the gensim TfidfModel. My code:
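    from gensim import corpora, models

    # Build the token -> id mapping by streaming the file one line at a time.
    dictionary = corpora.Dictionary(line.lower().split() for line in open('aaa.txt'))

    class MyCorpus(object):
        def __iter__(self):
            # Yield one bag-of-words vector per line without loading the whole file.
            for line in open('aaa.txt'):
                yield dictionary.doc2bow(line.lower().split())

    corpus = MyCorpus()
    tfidf = models.TfidfModel(corpus)
    corpus_tfidf = tfidf[corpus]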
Now I want to extract the TF-IDF value of each word. I know the values are in the corpus_tfidf variable, and I tried the code below to get a view of the words and their TF-IDF values. For example, I have the word 'banana' and I want to find its TF-IDF value. I can look up the id of each word in the dictionary with dictionary.token2id['banana'], but how can I get the TF-IDF of each word?
    {dictionary.get(id): value for doc in corpus_tfidf for id, value in doc}
My corpus has 6501598 documents, 585499 features, and 64106768 non-zero entries, so it's important to get the value of each word in minimum time.
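One thing worth noting about the attempt above: a TF-IDF weight belongs to a (word, document) pair, so a dict keyed only by word keeps just the last document's value for each word. Below is a minimal sketch of a single streaming pass that collects every weight for a chosen set of token ids. The helper name collect_weights and the toy data are hypothetical, and it assumes only that corpus_tfidf yields documents as lists of (token_id, weight) pairs, as in the code above.

```python
from collections import defaultdict

def collect_weights(corpus_tfidf, wanted_ids):
    """Scan the corpus once and keep the weights of the requested token ids.

    corpus_tfidf: iterable of documents, each a list of (token_id, weight) pairs
    wanted_ids:   token ids to keep, e.g. {dictionary.token2id['banana']}
    Returns {token_id: [(doc_index, weight), ...]}.
    """
    wanted_ids = set(wanted_ids)  # set membership tests are O(1)
    hits = defaultdict(list)
    for doc_index, doc in enumerate(corpus_tfidf):
        for token_id, weight in doc:
            if token_id in wanted_ids:
                hits[token_id].append((doc_index, weight))
    return dict(hits)

# Toy stand-in for corpus_tfidf (made-up numbers, not real gensim output).
toy_corpus = [
    [(0, 0.6), (1, 0.8)],
    [(1, 1.0)],
    [(0, 0.3), (2, 0.95)],
]
banana_id = 0  # hypothetical id; in the real code use dictionary.token2id['banana']
result = collect_weights(toy_corpus, {banana_id})
print(result)
```

With the real corpus this would be called as collect_weights(corpus_tfidf, {dictionary.token2id['banana']}). It still has to stream through all the documents once, so passing every word of interest in one call avoids repeated passes. If only the per-word IDF factor is needed rather than the per-document weights, I believe the TfidfModel keeps it in its idfs attribute, keyed by token id, which would avoid scanning the corpus at all.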