Copyright (C) Wrocław University of Science and Technology (PWr), 2013-2020. All rights reserved.
Developed within CLARIN-PL project.
Inforex is a web system for text corpora construction. It supports parallel access and resource sharing among many users. The system assists semantic annotation of texts on several levels, such as marking text references, creating new references, or marking word senses.
- Michał Marcińczuk,
- Adam Kaczmarek,
- Jan Kocoń,
- Marcin Ptak,
- Mikołaj Szewczyk,
- Marcin Oleksy,
- Wojciech Rauk.
Marcińczuk, M. & Oleksy, M. (2019). Inforex — a Collaborative System for Text Corpora Annotation and Analysis Goes Open. In Proceedings of the International Conference on Recent Advances in Natural Language Processing, RANLP 2019, pages 711–719. Varna, Bulgaria. INCOMA Ltd.
[Bibtex]
@inproceedings{marcinczuk-oleksy-2019-inforex,
title = "{I}nforex {---} a Collaborative System for Text Corpora Annotation and Analysis Goes Open",
author = "Marci{\'n}czuk, Micha{\l} and
Oleksy, Marcin",
booktitle = "Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)",
month = sep,
year = "2019",
address = "Varna, Bulgaria",
publisher = "INCOMA Ltd.",
url = "https://www.aclweb.org/anthology/R19-1083",
doi = "10.26615/978-954-452-056-4_083",
pages = "711--719",
}
The dependencies are installed within a Docker container, and the Inforex source code is mounted into the container as external storage.
Before building the container, install Composer, Docker and Docker Compose by running the following command:
sudo apt-get install composer docker docker-compose
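To confirm that the prerequisites are actually available before building, a quick check along these lines may help. This is a minimal sketch assuming a POSIX shell; `missing_tools` is a hypothetical helper and not part of Inforex, and the binary names are assumed to match the package names used above.

```shell
# Hypothetical helper (not part of Inforex): print each named tool
# that cannot be found on the PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools named in the install command above (binary names are an assumption).
missing=$(missing_tools composer docker docker-compose)
if [ -n "$missing" ]; then
    echo "Missing prerequisites:" $missing
else
    echo "All prerequisites found."
fi
```

If anything is reported as missing, rerun the install command or check that the tool is on your `PATH`.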
Then build the Docker container by running the following script:
./docker-dev-up.sh
Links:
- http://localhost:9080/inforex (default admin account: admin/admin),
- http://localhost:7080 (phpMyAdmin, default account: inforex/password).
After adding new source files, reload the Composer dependencies by running the following command:
composer update
See INSTALL.md.
- Faster Docker rebuilds using a multi-stage build (PR#99)
- Corpus exporter to CoNLL and JSON formats (PR#94)
- Import documents from a zip file as a background process (PR#93)
- Batch document deletion (PR#92)
- Custom corpus CSS styles (PR#91)
- UI improvements (PR#98, PR#88, PR#84, PR#82)
- Export raw text content (PR#83).