Saturday, October 6, 2018

Gensim NLP and TF-IDF document search example


In this post I'll use an NLP (natural language processing) library to:
- learn some basic concepts, and
- search a data set of book metadata (author, title, summary) to answer a user query

Setup
You need:
- Anaconda (a Python environment management tool)
Anaconda creates an isolated installation of Python with the packages you need. It doesn't affect other environments, so you can run different versions of Python and of other Python libraries side by side without conflicts.
- code: https://github.com/blabadi/gensim-books

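- set up the environment (in a shell window or in the Anaconda prompt) by running:
conda create -n gensim_env python=3.5 gensim nltk jupyter
this creates an environment named gensim_env and installs the packages needed
- activate the environment:
conda activate gensim_env
(older conda versions use "source activate gensim_env", or "activate gensim_env" on Windows)
- navigate to the code directory and run:
jupyter notebook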

this opens the Jupyter server page in your browser, where you can select the notebook we need: gensimTFIDF
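If the notebook fails on an import, it usually means Jupyter was started outside the new environment. As a quick sanity check, a small script like the one below can be run with the environment's Python. It is only a sketch: the function name missing_packages is my own, and it checks just the two libraries named in the setup command.

```python
import importlib.util

# Packages the setup step is expected to have installed.
REQUIRED = ("gensim", "nltk")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not importable from this environment:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

An empty result means the interpreter you ran it with can see both libraries; anything listed points at the wrong environment or an incomplete install.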

Code
https://github.com/blabadi/gensim-books/blob/master/gensimTFIDF.ipynb
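The notebook linked above holds the actual gensim code. To give a feel for the idea behind it, here is a minimal pure-Python sketch of TF-IDF search. It is not the notebook's code and does not use gensim: the three book entries are made up, the helper names (tokenize, build_index, to_vector, search) are my own, and the smoothed IDF formula is one common variant, so scores from gensim will not match these numbers.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for the books metadata, flattened to "author | title | summary".
BOOKS = [
    "Jane Doe | Stars Beyond | A crew travels through space to find a new home",
    "John Roe | Kitchen Basics | Simple recipes for cooking at home",
    "Ann Poe | Language Notes | An introduction to neural networks and language models",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def to_vector(tokens, idf):
    """Build an L2-normalised TF-IDF vector (as a dict) from known tokens."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_index(docs):
    """Compute smoothed IDF weights and one TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Assumption: smoothed IDF, so terms present in every document keep a small weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [to_vector(tokens, idf) for tokens in tokenized]
    return idf, vectors


def search(query, idf, vectors, top_n=3):
    """Rank documents by cosine similarity to the query; drop zero scores."""
    q = to_vector(tokenize(query), idf)
    scored = [
        (sum(w * vec.get(t, 0.0) for t, w in q.items()), i)
        for i, vec in enumerate(vectors)
    ]
    ranked = sorted(scored, reverse=True)[:top_n]
    return [(i, score) for score, i in ranked if score > 0]


if __name__ == "__main__":
    idf, vectors = build_index(BOOKS)
    for doc_id, score in search("cooking recipes", idf, vectors):
        print(round(score, 3), BOOKS[doc_id])
```

Because the vectors are normalised, the dot product in search is the cosine similarity, and words that appear in fewer documents contribute more to the score than words shared by many.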
