Development was carried out in the following environment:
- Ruby 2.4.1p111
- Postgres 9.6.8 (TODO: check the lowest viable version given app usage of the database)
- fribidi 0.3.6 (https://github.com/fribidi/fribidi/)
- tesseract 4.0 (https://github.com/tesseract-ocr/tesseract)
- leptonica 1.74.2 (https://github.com/DanBloomberg/leptonica)
- libtiff
All of the above libraries can likely be installed from the system's packages, with the probable exception of tesseract and leptonica.
Ruby dependencies are handled as usual via the Gemfile. Some gems may require external libraries, all of which should be available as system packages.
- Node (v8.6.0)
- Yarn (1.2.1)
All JavaScript dependencies should be easy to install with yarn alone.
The app expects certain environment variables to be set in order to operate properly:
[email protected]
CORPUSBUILDER_HOST=http://corpusbuilder.org
CORPUSBUILDER_PORT=1234
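
As a minimal sketch of how a deployment script might verify these settings before starting the app, the snippet below checks the two variables named above and builds a base URL from them. The constant and both method names are hypothetical and are not part of CorpusBuilder itself; the app may also require variables that this sketch does not check.

```ruby
# Hypothetical helper, not part of CorpusBuilder: only the two variable
# names below come from this README.
REQUIRED_VARS = %w[CORPUSBUILDER_HOST CORPUSBUILDER_PORT].freeze

# Returns the names of required variables that are unset or blank in `env`.
# `env` defaults to the process environment but accepts any Hash-like object.
def missing_corpusbuilder_vars(env = ENV)
  REQUIRED_VARS.select { |name| env[name].to_s.strip.empty? }
end

# Joins host and port into a single base URL string.
# Raises KeyError if a variable is missing and ArgumentError if the port
# is not an integer.
def corpusbuilder_base_url(env = ENV)
  host = env.fetch("CORPUSBUILDER_HOST")
  port = Integer(env.fetch("CORPUSBUILDER_PORT"))
  "#{host}:#{port}"
end
```

Calling `missing_corpusbuilder_vars` with no arguments inspects the real environment; passing a plain Hash makes the check easy to exercise in tests without touching `ENV`.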
The app isn't meant to be used directly. Its purpose is to serve as both the database of corpuses and the set of tools for working on them, all to be consumed by an external application.
This means that to use it, one needs to integrate it with some other, existing app.
For the Ruby world, a helper integration gem is provided: https://github.com/berkmancenter/corpusbuilder-ruby-client
A more complete integration has been made with the SHARIAsource platform. SHARIAsource is open-source software and can be found on GitHub: https://github.com/berkmancenter/SHARIAsource
CorpusBuilder was built in collaboration with SHARIAsource and OpenITI.
CorpusBuilder is licensed under the GNU AGPL 3.0 License.
Copyright 2019 President and Fellows of Harvard College.