The Smarti installation guide is written for developers, operators and administrators who like to run Smarti on a server, as professional service or in production.
Installation packages are provided for Debian- and RedHat-based systems. As Smarti is a Spring Boot application it can be started by directly launching the executable jar:
$ java -jar smarti-${smarti-version}-exec.jar
Important
|
To run Smarti in production it is recommended to deploy the deb or rpm package on your target environment instead executing the jar directly.
If you like to setup your development client or a non-productive server read section Build and Run explains how to do start Smarti by executing the executable jar.
|
$ docker run -d --name mongodb mongo
$ docker run -d --name smarti --link mongodb:mongo -p 8080:8080 redlinkgmbh/smarti
The provided docker-image does not contain some required (GPL-licensed) dependencies. To build a contained image, use the following snippet:
$ docker build -t smarti-with-stanfordnlp -<EOF
FROM redlinkgmbh/smarti
USER root
ADD [\
"https://repo1.maven.org/maven2/edu/stanford/nlp/stanford-corenlp/3.8.0/stanford-corenlp-3.8.0.jar", \
"https://repo1.maven.org/maven2/edu/stanford/nlp/stanford-corenlp/3.8.0/stanford-corenlp-3.8.0-models-german.jar", \
"/opt/ext/"]
RUN chmod -R a+r /opt/ext/
USER smarti
EOF
To install additional components, e.g. Stanford-NLP, add the respective libraries into the folder /var/lib/smarti/ext
.
Stanford-NLP can be installed using the following command:
$ wget -P /var/lib/smarti/ext -nH -nd -c \
https://repo1.maven.org/maven2/edu/stanford/nlp/stanford-corenlp/3.8.0/stanford-corenlp-3.8.0.jar \
https://repo1.maven.org/maven2/edu/stanford/nlp/stanford-corenlp/3.8.0/stanford-corenlp-3.8.0-models-german.jar
When installed using one of the provided packages (deb
, rpm
), smarti runs as it’s own system user smarti
. This user is created during the installation
process if it does not already exists.
To run smarti as a different user (assistify
in this example), do the following steps:
-
Create the new working-directory
$ mkdir -p /data/smarti $ chown -R assistify: /data/smarti
-
Populate the new working-directory with the required configuration files, e.g. by copying the default settings
$ cp /etc/smarti/* /data/smarti
-
Update the systemd configuration for smarti
$ systemctl edit smarti
and add the following content
[Service] User = assistify WorkingDirectory = /data/smarti
-
Restart the smarti
$ systemctl try-restart smarti
-
/etc/default/smarti
-
JVM and JVM Options (e.g.
-Xmx4g
) -
Program Arguments (overwriting settings from the
application.properties
)
To check the service status use the daemon
$ service smarti status
This should result in something like
● smarti.service - smarti
Loaded: loaded (/lib/systemd/system/smarti.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2017-08-26 12:12:28 UTC; 6s ago
Main PID: 4606 (bash)
Tasks: 22
Memory: 647.8M
CPU: 12.518s
CGroup: /system.slice/smarti.service
-
Working-Directory at
/var/lib/smarti/
-
Log-Files available under
/var/lib/smarti/logs/
, a symlink to/var/log/smarti/
-
Log-Configuration under
/var/lib/smarti/logback.xml
, a symlink to/etc/smarti/logback.xml
Note
|
Please keep in mind, that if smarti runs as a different user it probably also has a custom working directory. In such case, logs are stored in the new working directory (or whatever is configured in the logging-configuration). |
To Check if Smarti is started search in /var/log/smarti/main.log
for something like
[main] INFO io.redlink.smarti.Application - smarti started: http://${smarti-host}:${smarti-port}/
A system-wide health check is available at http://${smarti-host}:${smarti-port}/system/health
.
In case of any problems you can call this URL in your browser or send a curl
in order to check if Smarti is up and running.
$ curl -X GET http://${smarti-host}:${smarti-port}/system/health
On success, if everything is running it returns: {"status":"UP"}
else {"status":"DOWN"}
.
Note
|
{"status":"DOWN"} is also reported, if Stanford NLP libraries are not present in the classpath.
|
You can get a detailed information about the build version that is running by calling
$ curl -X GET http://${smarti-host}:${smarti-port}/system/info
When installing via one of the provided packages (rpm
, deb
) a list of used third-party libraries and their licenses are available under
-
/usr/share/doc/smarti/THIRD-PARTY.txt
(backend), and -
/usr/share/doc/smarti/UI-THIRD-PARTY.json
(frontend, UI)
From the running system, similar files are served.
Backend
$ curl http://${smarti-host}:${smarti-port}/THIRD-PARTY.txt
Frontend UI
$ curl http://${smarti-host}:${smarti-port}/3rdPartyLicenses.json
Once Smarti is up and running, go ahead and read the next section Application Configuration about Smarti’s powerful configuration options.
Note
|
The default configuration of Smarti is optimized for 4 GByte of Memory (JVM parameter: -Xmx4g ) and 2 CPU cores.
|
Smarti performs text analysis over conversations. Such analysis are based on machine learning models that require a considerable amount of memory and CPU time. The selection of those models and the configuration of those components highly influence the memory requirements.
Analysis are precessed asynchronously in a thread pool with a fixed size. The size of this pool should not exceed the number of available CPU cores. For a full utilization of the CPU resources the number of threads should be close or equals to the number of CPU cores. However every processing thread will also require additional memory. While this number highly depends on the analysis configuration 500 MByte is a good number to start with.
As an example running Smarti with the default Analysis configuration on a host with 16 CPU cores should use a thread pool with 15 threads and will require 3 GByte + 15 * 500 MByte = 10.5 GByte Memory.
-
smarti.processing.numThreads = 2
: The number of analysis processing threads. This number should be close or equals to the number of CPUs. Each thread will also require additional memory to handle processing information. This amount highly depends on the analysis configuration. Especially of the Stanford NLP Parser component (see below). -
Standord NLP includes several configuration options that influence the memory (both base and processing) memory requirements.
-
nlp.stanfordnlp.de.nerModel = edu/stanford/nlp/models/ner/german.conll.hgc_175m_600.crf.ser.gz
: Name Finder models are the biggest contribution to the base memory requirement. The default model uses about 300 MByte -
Parser: The parsed component has the highest impact on the memory requirements of Smarti. The defautl configuration uses the Shift Reduce Parser as it requires the least memory. As potential configuration options do have a huge impact it is recommended to look at additional information provided by the Stanford Parser FAQ (NOTE that values provided on this page are for a 32bit JVM. So memory requirements on modern systems will be double those values).
-
nlp.stanfordnlp.de.parseModel = edu/stanford/nlp/models/lexparser/germanPCFG.ser.gz
: The parse model to use. Stanford does support three models that have hugely different memory and cpu requirements. Also not that those memory requirement are for processing memory, what means that this memory needs to be provided per analysis thread! -
nlp.stanfordnlp.de.parseMaxlen = 30
: Parsing longer sentences requires square more memory. So restricting the maximum number of tokens of sentences is essential. 30 Tokens seams to be a good tradeoff. In settings where memory is not a concern this value can be increased to40
. If memory consumption is a problem on can decrease this value to 20 tokens.
-
-
-
Open NLP provides an additional Named Entity Finder model for German. The model is provided by IXA Nerc and requires about 500 MByte of memory.