Text Depot is a tool to search and analyze topics of interest within a large database of text data. The Text Depot dashboard (this repo) provides a front end to a set of indexes in Elasticsearch. To use this repository, you must provide one or more Elasticsearch indexes in a particular format.
- Set up an Elasticsearch server.
- Create one or more indexes using the Text Depot mappings.
- Clone this repo.
- Run
  cp .configs_sample .configs
  and fill in the relevant values.
- Build and run the Docker container:
  DOCKER_BUILDKIT=1 docker build -t text_depot_dashboard . && docker run -it -p 8080:3838 text_depot_dashboard
- Open the dashboard in your browser: http://localhost:8080
Each data source should be stored in its own Elasticsearch index. For more information on how to configure your Elasticsearch server, see elasticsearch/.
Our workflow contained the following components:
This repository contains the dashboard code (blue above) for Text Depot. The green components were scheduled as cron jobs and keep the indexes in the Elasticsearch database up to date. We wrote a custom Parser for each data source, and a single Annotator class that adds the [neighbourhoods, sentiment, embeddings] fields to each document and inserts the documents into the index. The orange components were added for authentication and embeddings-based search, and are optional.
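The Parser and Annotator flow described above can be sketched roughly as follows. This is a minimal illustration, not the code used in our pipeline: the class and function names (`Document`, `InMemoryIndex`, `parse_pipe_delimited`, `Annotator.annotate_and_insert`), the input format, the neighbourhood names, and the placeholder enrichers are all assumptions made for the example. A plain Python dictionary stands in for an Elasticsearch index so the sketch runs without a server.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical neighbourhood names, used only for this illustration.
KNOWN_NEIGHBOURHOODS = ["Northside", "Riverbend"]


@dataclass
class Document:
    """A parsed document; the field layout is an assumption for this sketch."""
    doc_id: str
    text: str
    fields: Dict[str, object] = field(default_factory=dict)


class InMemoryIndex:
    """Stand-in for one Elasticsearch index (one index per data source)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.docs: Dict[str, Dict[str, object]] = {}

    def insert(self, doc: Document) -> None:
        # Store the text together with the annotated fields, keyed by id.
        self.docs[doc.doc_id] = {"text": doc.text, **doc.fields}


def parse_pipe_delimited(raw: str) -> List[Document]:
    """Example source-specific parser: one 'id|text' record per line."""
    docs = []
    for line in raw.splitlines():
        if not line.strip():
            continue  # skip blank lines
        doc_id, text = line.split("|", 1)
        docs.append(Document(doc_id.strip(), text.strip()))
    return docs


def find_neighbourhoods(text: str) -> List[str]:
    # Naive substring match; a real annotator would need something sturdier.
    return [n for n in KNOWN_NEIGHBOURHOODS if n.lower() in text.lower()]


def placeholder_sentiment(text: str) -> float:
    # Placeholder only: always neutral. Swap in a real sentiment model.
    return 0.0


def placeholder_embedding(text: str) -> List[float]:
    # Placeholder only: a one-dimensional "embedding" based on text length.
    return [float(len(text))]


class Annotator:
    """Adds the configured fields to each document, then inserts it."""

    def __init__(self, enrichers: Dict[str, Callable[[str], object]]) -> None:
        self.enrichers = enrichers

    def annotate_and_insert(self, docs: List[Document], index: InMemoryIndex) -> int:
        for doc in docs:
            for field_name, enrich in self.enrichers.items():
                doc.fields[field_name] = enrich(doc.text)
            index.insert(doc)
        return len(docs)


annotator = Annotator({
    "neighbourhoods": find_neighbourhoods,
    "sentiment": placeholder_sentiment,
    "embeddings": placeholder_embedding,
})

index = InMemoryIndex("example_source")
parsed = parse_pipe_delimited("a1|Road work in Northside\na2|Budget update")
inserted = annotator.annotate_and_insert(parsed, index)
```

Keeping the parser source-specific and the annotator shared means that adding a new data source only requires a new parser and a new index. The annotation step stays the same for every source.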