The Data Ingestion Server is the "Ingestion Firewall and Data Cleaning Middleware" of IUDX. It enables Providers and Delegates to publish data through the IUDX API, as per the data descriptor, using HTTP over TLS (HTTPS).
- Allows IUDX Data Providers and Delegates to publish data into the IUDX platform
- Allows the IUDX admin to register and delete ingestion streams for one or more data resources using standard APIs
- Integrated with the IUDX authorization server (token introspection) to authorize data publication
- Secure data publication over TLS
- Scalable, service-mesh based implementation using open-source components: the Vert.x API framework and RabbitMQ as the data broker
- Hazelcast and Zookeeper based cluster management and service discovery
The API docs can be found here.
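For orientation, a publish request might look like the following sketch. The endpoint path, `token` header, and payload fields here are assumptions for illustration; the actual contract is defined in the API docs linked above.

```sh
# Hypothetical publish call -- endpoint path, auth header, and payload
# schema are assumptions; consult the API docs for the actual contract.
curl -X POST "https://<di-domain-name>/ngsi-ld/v1/entities" \
  -H "token: <iudx-auth-token>" \
  -H "Content-Type: application/json" \
  -d '{"id": "<resource-id>", "observationDateTime": "2020-01-01T00:00:00+05:30"}'
```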
The data ingestion pipeline connects to external dependencies, namely:
- RabbitMQ
Find the installation instructions for the above, along with the configuration needed to modify the database URL, port, and associated credentials, in the appropriate sections here.
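For local development, a throwaway RabbitMQ instance can be brought up with Docker. This is a convenience sketch only, not the recommended production setup:

```sh
# Quick local RabbitMQ with the management UI (development sketch only)
docker run -d --name rabbitmq \
  -p 5672:5672 -p 15672:15672 \
  rabbitmq:3-management
```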
Make a config file based on the template in ./configs/config-example.json
- Generate a certificate using Let's Encrypt or other methods
- Make a Java keystore file and mention its path and password in the appropriate sections (a keystore-creation sketch follows this list)
- Modify the database url and associated credentials in the appropriate sections
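One way to produce the keystore from a Let's Encrypt certificate is to bundle the PEM files into a PKCS12 archive and import it with `keytool`. File names, alias, and password below are placeholders; adjust them to your deployment:

```sh
# Sketch: convert Let's Encrypt PEM files into a Java keystore.
# fullchain.pem, privkey.pem, the alias, and the password are placeholders.
openssl pkcs12 -export \
  -in fullchain.pem -inkey privkey.pem \
  -out keystore.p12 -name di-server -passout pass:changeit
keytool -importkeystore \
  -srckeystore keystore.p12 -srcstoretype PKCS12 -srcstorepass changeit \
  -destkeystore keystore.jks -deststoretype JKS -deststorepass changeit
```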
- Install docker and docker-compose
- Clone this repo
- Build the images
./docker/build.sh
- Modify the `docker-compose.yml` file to map the config file you just created
- Start the server in production (prod) or development (dev) mode using docker-compose
docker-compose up prod
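To run in the background and tail the logs, standard docker-compose usage applies (assuming the service is named `prod` as above):

```sh
docker-compose up -d prod      # start the production service detached
docker-compose logs -f prod    # follow the server logs
```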
- Install Java 11 and Maven
- Use the Maven exec plugin based starter to start the server
mvn clean compile exec:java@data-ingestion-server
- Install Java 11 and Maven
- Set environment variables
export DI_URL=https://<rs-domain-name>
export LOG_LEVEL=INFO
- Use maven to package the application as a JAR
mvn clean package -Dmaven.test.skip=true
- 2 JAR files would be generated in the `target/` directory:
  - `iudx.data.ingestion.server-cluster-0.0.1-SNAPSHOT-fat.jar` - clustered Vert.x, containing Micrometer metrics
  - `iudx.data.ingestion.server-dev-0.0.1-SNAPSHOT-fat.jar` - non-clustered Vert.x, does not contain Micrometer metrics
Note: The clustered JAR requires Zookeeper to be installed. Refer here to learn more about how to set up Zookeeper. Additionally, the `zookeepers` key in the config being used needs to be updated with the IP address/domain of the system running Zookeeper.
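If the config follows the usual layout with a top-level `zookeepers` array (an assumption; check `configs/config-example.json` for the actual layout), the key can be updated with `jq`, for example:

```sh
# Sketch: point the "zookeepers" key at the Zookeeper host.
# Assumes a top-level "zookeepers" array in the config file.
jq '.zookeepers = ["zookeeper.example.com"]' configs/config.json \
  > configs/config.tmp && mv configs/config.tmp configs/config.json
```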
The clustered JAR requires 3 runtime arguments when running:
- --config/-c : path to the config file
- --hostname/-i : the hostname for clustering
- --modules/-m : comma separated list of module names to deploy
e.g.
java -jar ./fatjar.jar --hostname $(hostname) -c configs/config.json -m iudx.data.ingestion.server.ApiServerVerticle
Use the `--help`/`-h` argument for more information. You may additionally set a `DI_JAVA_OPTS` environment variable containing any Java options to pass to the application, e.g.
$ export DI_JAVA_OPTS="-Xmx4096m"
$ java $DI_JAVA_OPTS -jar target/iudx.data.ingestion.server-cluster-0.0.1-SNAPSHOT-fat.jar ...
The dev JAR requires 1 runtime argument when running:
- --config/-c : path to the config file
e.g.
java -Dvertx.logger-delegate-factory-class-name=io.vertx.core.logging.Log4j2LogDelegateFactory -jar target/iudx.data.ingestion.server-dev-0.0.1-SNAPSHOT-fat.jar -c configs/config.json
Use the `--help`/`-h` argument for more information. You may additionally set a `DI_JAVA_OPTS` environment variable containing any Java options to pass to the application, e.g.
$ export DI_JAVA_OPTS="-Xmx1024m"
$ java $DI_JAVA_OPTS -jar target/iudx.data.ingestion.server-dev-0.0.1-SNAPSHOT-fat.jar ...
- Run the server through either Docker, Maven or redeployer
- Run the unit tests and generate a surefire report
mvn clean test-compile surefire:test surefire-report:report
- Reports are stored in
./target/
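The surefire-report plugin writes its HTML report under `target/site/` by default (standard plugin behavior; verify the exact path locally):

```sh
# Open the generated surefire HTML report (default plugin output path;
# xdg-open is Linux-specific -- use your platform's equivalent)
xdg-open target/site/surefire-report.html
```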
Integration tests are run through Postman/Newman; the collection can be found here.
- Install prerequisites
- Example Postman environment can be found here
- Run the server through either Docker, Maven or redeployer
- Run the integration tests and generate the newman report
newman run <postman-collection-path> -e <postman-environment> --insecure -r htmlextra --reporter-htmlextra-export .
- Reports are stored in
./target/
We follow a Git merge based workflow
- Fork this repo.
- Create a new feature branch in your fork. Multiple features must have a hyphen-separated name, or refer to a milestone name as mentioned in GitHub -> Projects.
- Commit to your fork and raise a Pull Request with upstream.