A text-based search engine.
This search engine crawls the README files of various GitHub projects and stores them as documents for the retrieval system. Each document is then indexed and stored for future use. When a user submits a query, the engine returns the relevant documents, ordered by a ranking scheme (to be included later).
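To make the index-then-query flow concrete, here is a minimal sketch of an inverted index over a few in-memory documents. It is not this project's implementation: the function names (`tokenize`, `build_index`, `search`), the sample documents, and the scoring rule (summed term frequency) are all assumptions chosen for illustration, since the actual ranking is still to be added.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to a Counter of {doc_id: term frequency}."""
    index = defaultdict(Counter)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term][doc_id] += 1
    return index


def search(index, query):
    """Rank documents by the summed frequency of the query terms.

    Placeholder scoring, assumed for this sketch only.
    """
    scores = Counter()
    for term in tokenize(query):
        # .get avoids creating empty entries in the defaultdict
        for doc_id, freq in index.get(term, {}).items():
            scores[doc_id] += freq
    return [doc_id for doc_id, _ in scores.most_common()]


# Hypothetical README snippets standing in for crawled documents
docs = {
    "repo-a": "A fast search engine written in Python",
    "repo-b": "Python bindings for an image library",
    "repo-c": "Search search search: a tiny search toolkit",
}
index = build_index(docs)
results = search(index, "python search")
print(results)
```

Because the score is a raw term count, a document that repeats a query word many times outranks one that matches every query word once. A weighting scheme such as TF-IDF is a common way to reduce that effect.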
- Python (preferably version 3.7)
- Pip and Pipenv
- Git
- Clone the repo using this command in your preferred directory
git clone https://www.github.com/PranjalGupta2199/open-source-search.git
- Change your working directory to the repo's codebase
cd open-source-search
- Create a virtual environment and install the dependencies using Pipenv, then activate it.
pipenv install
pipenv shell
- Open a Python interpreter and download the required NLTK data
>>> import nltk
>>> nltk.download('punkt')
>>> nltk.download('stopwords')
>>> nltk.download('wordnet')
>>> exit()
- Run search.py to make a query.
python search.py
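The NLTK download step above can also be scripted instead of typed into the interpreter. The following is a sketch only: the helper name `download_resources` and its `download` parameter are hypothetical and not part of this repository. It assumes `nltk` is installed in the active Pipenv environment and relies on `nltk.download` returning a boolean success flag.

```python
# The three NLTK data packages listed in the setup steps
NLTK_RESOURCES = ("punkt", "stopwords", "wordnet")


def download_resources(resources=NLTK_RESOURCES, download=None):
    """Download each resource and return {name: succeeded}.

    `download` is a hypothetical hook for substituting another
    callable; by default nltk.download is used.
    """
    if download is None:
        import nltk  # imported lazily, only when a real download is needed
        download = nltk.download
    return {name: bool(download(name)) for name in resources}
```

Calling `download_resources()` inside `pipenv shell` should fetch the same three packages as the interactive session, and the returned dictionary shows which downloads succeeded.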