Merge pull request #30 from interTwin-eu/dev-slangarita
Dev slangarita
esparig authored Oct 21, 2024
2 parents 51ea76a + 32d7c8c commit 49a7973
Showing 14 changed files with 77 additions and 86 deletions.
@@ -3,7 +3,7 @@ sidebar_position: 3
---
# S3

The S3 Source captures an ObjectCreated event from an AWS S3 bucket. DCNiOS creates S3 bucket event redirections to SQS queue. Then, Apache NiFi captures the event and introduces it to the dataflow. The whole pipeline is created using DCNiOS. But, SQS queue is deleted with DCNiOS, but the Event Notification in the S3 section needs to be removed manually.
The S3 Source captures an ObjectCreated event from an AWS S3 bucket. DCNiOS creates the event redirection from the S3 bucket to an SQS queue. Then, Apache NiFi captures the event and introduces it into the dataflow. The whole pipeline is created using DCNiOS. The SQS queue is also deleted with DCNiOS, but the Event Notification in the S3 section needs to be removed manually.

The S3 Source requires:
- An identifier name of the process. It must be unique. Required.
File renamed without changes.
@@ -16,11 +16,11 @@ DCNiOS can use some AWS as input. A valid pair of AWS Access Key and AWS Secret
aws_access_key_id = AK<>
aws_secret_access_key = <>
```
- From the file of DCNiOS workflow file named `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.
- From the DCNiOS workflow file, using the arguments `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.



AWS_DEFAULT_REGION is mandatory in any Source that uses AWS in the configuration file. These ProcessGroups can employ AWS credentials:
`AWS_DEFAULT_REGION` is mandatory for any Source that uses AWS in the configuration file. These Process Groups use AWS credentials:
- [SQS](/docs/Sources/SQS)
- [S3](/docs/Sources/S3)
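
A minimal sketch of how these credentials might be passed through the workflow file is shown below; the exact nesting under a Source (here an SQS Source named `sqs-example`) is an assumption based on the argument names above, not a documented layout:

```
SQS:
  - name: sqs-example                   # hypothetical Source name
    AWS_ACCESS_KEY_ID: AK<access-key>
    AWS_SECRET_ACCESS_KEY: <secret-key>
    AWS_DEFAULT_REGION: eu-west-1       # mandatory for any AWS Source
```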

27 changes: 14 additions & 13 deletions docpage/docs/03.- Sources/Kafka.md
@@ -4,26 +4,27 @@ sidebar_position: 2
# Kafka



The Kafka Source allows us to consume a Kafka topic. It requires this information:
- An identifier name of the process. It must be unique. Required.
- Kafka `bootstrap_servers`: the IP and the port, as `<ip>:<port>`. Required.
- The topic name that is going to be consumed. Required.
- The group identifier indicates the consumer group. Required.
- [IM](https://www.grycap.upv.es/im/index.php) serves a recipe that supports the SASL_SSL security protocol, so the user `sasl_username` and password `sasl_password` must be set. These parameters are set at Kafka deployment time. Required.
- In case the topics you are consuming follow a `key:value` pattern set the argument `separate_by_key` as true and select the demarcator with `message_demarcator`
- If the consumed topic follows a `key:value` pattern, set the argument `separate_by_key` to `true` and select the demarcator with `message_demarcator`.

Also, it is necessary an SSL connection between NiFi and Kafka. This connection is made by a PKCS12 certificate and the password of the certificate.
An SSL connection between NiFi and Kafka is necessary. A PKCS12 certificate and the certificate's password must be provided.

```
# Before:
Kafka:
  - name: kafka
    bootstrap_servers: <ip>:<port>
    topic: <kafka-topic-name>
    group_id: "<kafka-group-id>"
    sasl_username: <kafka-sasl-user>
    sasl_password: <kafka-sasl-password>
    ssl_context:
      Truststore_Filename: <certificate-file.p12>
      Truststore_Password: <certificate-file-password>

# After:
Kafka:
  - name: kafka
    bootstrap_servers: <ip>:<port>
    topic: <topic>
    group_id: "1"
    sasl_username: <sasl-user>
    sasl_password: <sasl-password>
    #separate_by_key: "false"
    #message_demarcator: ";"
    ssl_context:
      Truststore_Filename: <name-of-p12>
      Truststore_Password: "<password-of-p12>"
```
2 changes: 1 addition & 1 deletion docpage/docs/03.- Sources/dcache.md
@@ -7,7 +7,7 @@ dCache is a Source that listens into a dCache instance. The following values mus
- An identifier name of the process. It must be unique. Required.
- Endpoint, user, and password of a dCache instance. Required.
- The dCache folder on which an active listening is kept. Required.
- Statefile is the name of the file that will store the state. `dcache` value is not recommended. It creates misbehavior. Required.
- Statefile is the file that will store the state. Do not use `dcache` as its name, as it may cause misbehavior. Required.
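
A hypothetical sketch of a dCache Source entry follows; the key names (`endpoint`, `user`, `password`, `folder`, `statefile`) are assumptions inferred from the requirement list above, not confirmed field names:

```
dCache:
  - name: dcache-listener            # unique identifier
    endpoint: <dcache-endpoint>      # assumed key name
    user: <dcache-user>              # assumed key name
    password: <dcache-password>      # assumed key name
    folder: <folder-to-watch>        # assumed key name
    statefile: mystate               # avoid naming it `dcache`
```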

The dCache Source only works when the NiFi cluster is deployed with the image `ghcr.io/grycap/nifi-sse:latest`. It is composed of:
- ExecuteProcess
27 changes: 0 additions & 27 deletions docpage/docs/03.- Sources/generic.md

This file was deleted.

13 changes: 13 additions & 0 deletions docpage/docs/03.- Sources/index.md
@@ -0,0 +1,13 @@
---
sidebar_position: 3
---
# Sources

`Sources` defines the information of the third-party elements to which NiFi connects, waiting for events.


The supported Sources are:
- [dCache](/docs/Sources/dcache)
- [KAFKA](/docs/Sources/Kafka)
- [S3](/docs/Sources/AWS/S3)
- [SQS](/docs/Sources/AWS/SQS)
2 changes: 1 addition & 1 deletion docpage/docs/04.- Destinations/OSCAR.md
@@ -8,7 +8,7 @@ The OSCAR Destination invokes an OSCAR service asynchronously:
- An identifier name of the process. It must be unique. Required.
- Endpoint. Required.
- Service in OSCAR. Required.
- Token or user/password. The user/password will be first if both authentication processes are defined. Do not edit the OSCAR services. Required.
- Token or user/password. The user/password has priority over the token if both are defined. Do not edit the OSCAR services. Required.
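
A hypothetical sketch of an OSCAR Destination entry is shown below; the key names (`endpoint`, `service`, `user`, `password`, `token`) are assumptions drawn from the requirement list above rather than documented fields:

```
OSCAR:
  - name: edgan3                     # unique identifier
    endpoint: <oscar-endpoint>       # assumed key name
    service: <oscar-service-name>    # assumed key name
    user: <oscar-user>               # user/password has priority over the token
    password: <oscar-password>
    #token: <oscar-token>            # alternative authentication
```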


The Destination is composed of this component:
6 changes: 6 additions & 0 deletions docpage/docs/04.- Destinations/index.md
@@ -0,0 +1,6 @@
---
sidebar_position: 4
---
# Destinations

`Destinations` defines the information of the third-party elements to which NiFi sends the data or events. Only [OSCAR](/docs/Destinations/OSCAR) is available.
2 changes: 1 addition & 1 deletion docpage/docs/05.- Alterations/Decode.md
@@ -5,7 +5,7 @@ sidebar_position: 2
# Decode

Alteration's Decode decodes the data flow from the chosen encoding. The user must ensure the input data is encoded using the selected encoding.
Three encodes are available: `base64`, `base32` and `hex`. It is similar to the command `base64 -d` or `base32 -d`. For example, If the input data is a string in base64 with the value `aGVsbG8K` or in base32 with the value `NBSWY3DPBI======`. The output data is be the same in both cases, `hello`.
Three encodings are available: `base64`, `base32`, and `hex`. They behave like the commands `base64 -d` and `base32 -d`, and like a hex decoder, respectively. For example, if the input data is a base64 string with the value `aGVsbG8K` or a base32 string with the value `NBSWY3DPBI======`, the output data is the same in both cases: `hello`.


Here is the YAML example.
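
The documented example itself is cut off in this diff view. Purely as a hypothetical sketch — the `action: Decode` form follows the Alterations pages, while the key used to choose the encoding is an assumption:

```
alterations:
  - action: Decode
    encoding: base64      # assumed key name; base32 and hex also available
```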
2 changes: 1 addition & 1 deletion docpage/docs/05.- Alterations/index.md
@@ -4,7 +4,7 @@ sidebar_position: 5

# Alterations

The subsection `alterations`, is located inside a Sources elements, and it changes the input data format. These alterations are applied as a descendent definition. These steps are helpful to be okay with the input Sources format and to re-use the Sources with no changes.
The `alterations` subsection is located inside a Source element and changes the input data format. The alterations are applied in the order in which they are defined. They are helpful for adapting the input data format so that the Sources can be reused without changes.



8 changes: 4 additions & 4 deletions docpage/docs/Introduction.md
@@ -9,12 +9,12 @@ DCNiOS is an open-source command-line tool that easily manages the creation of e

![DCNiOS images](/../static/img/dcnios-logo-hor.png)

Apache NiFi Process Group is a group of Processors that compose a dataflow. DCNiOS uses predefined Process Groups that make simple actions like interacting with a third-party component (e.g., consuming from Kafka) or changing the data content (e.g.encoding the data in base64) to compose a complete dataflow.
An Apache NiFi Process Group is a group of Processors that composes a dataflow. DCNiOS uses predefined Process Groups that perform simple actions, like interacting with third-party elements (e.g., consuming from Kafka) or changing the data content (e.g., encoding the data in base64), to compose a complete dataflow.

In DCNiOS documentation, the Process Groups are split by purpose into three main groups: 'Sources', 'Destinations', and 'Alterations'.
- 'Sources' interact with a third-party component as the input data receiver.
- 'Destinations' interact with a third-party component as an output data sender.
- 'Alterations' that do not interact with third-party components and change the format of the data flow.
- 'Sources' interact with third-party elements as the input data receiver.
- 'Destinations' interact with third-party elements as an output data sender.
- 'Alterations' do not interact with third-party elements; they change the format of the data flow.



40 changes: 19 additions & 21 deletions docpage/docs/Users.md
@@ -4,7 +4,7 @@

# Users Guide

Here, you will find an explanation of the main concepts of DCNiOS, such as the DCNiOS commands, how to define a workflow, involved sections, and the commun sections for all the third-party connections.
This page explains the main concepts of DCNiOS, such as the DCNiOS commands and workflow definition.

## Commands

@@ -40,11 +40,11 @@ python dcnios-cli.py changeSchedule --host={nifi-endpoint} \

## Workflow configuration file structure (YAML)

Here, we will explain the workflow definition, the structure of the configuration file, and the information the user has to know about each third-party connection. DCNiOS deploys and configures all the definitions in Apache NiFi.
Here, we explain the workflow definition, the structure of the configuration file, and the information the user has to know about each third-party connection. DCNiOS deploys and configures all the definitions in Apache NiFi.

### Apache NiFi credentials:

In this 'nifi' section, the Apache NiFi credentials will be defined. Inside this section will be defined the Sources that will be deployed and the conection between them.
In this `nifi` section, set the Apache NiFi credentials. Inside this section, define the workflow.

```
nifi:
  ...
```
@@ -59,38 +59,40 @@ nifi:
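
The credentials block above is truncated in this diff view. A hypothetical sketch with assumed key names (`endpoint`, `user`, `password`) — not confirmed by this page — might look like:

```
nifi:
  endpoint: <nifi-endpoint>    # assumed key name
  user: <nifi-user>            # assumed key name
  password: <nifi-password>    # assumed key name
```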
Moreover, it is necessary to define the source and destination of data.

Sources:
- [dCache](https://www.dcache.org/)
- [KAFKA](https://kafka.apache.org/)
- [S3](https://aws.amazon.com/es/s3/)
- [dCache](/docs/Sources/dcache)
- [KAFKA](/docs/Sources/Kafka)
- [S3](/docs/Sources/AWS/S3)
- [SQS](/docs/Sources/AWS/SQS)

Destinations:
- [OSCAR](https://oscar.grycap.net/)
- [OSCAR](/docs/Destinations/OSCAR)

Alterations:
- Merge
- Encoded
- Decoded

The input data format from Sources can be changed using Alterations.

#### Components Subsection
Alterations:
- [Merge](/docs/Alterations/Merge)
- [Encode](/docs/Alterations/Encode)
- [Decode](/docs/Alterations/Decode)

The subsection `components`, inside Sources and Destinations, is employed to change the configuration of a single Processor of Apache NiFi. It is necessary to know the name of the component. Then, the scheduled time, the seconds between executions (execution ratio), and the kind of NiFi node on which it is going to execute can be changed.

#### Components Subsection

The `components` subsection changes the behavior of an internal process. When you deploy an element, some processes run in the background. You can change the seconds between executions (execution ratio) and select which node will perform the execution (PRIMARY or ALL). However, it is necessary to know the name of the process. For example, the OSCAR Destination has the component InvokeOSCAR, which sends an HTTP call.


```
components:
  - name: InvokeOSCAR      # was: GetFile
    seconds: 2
    node: (ALL | PRIMARY)
```


#### Alterations

The subsection `alterations`, inside Sources, change the data format. These alterations are applied as a descendent definition. In this example, the input data is merged into one message. Then, the merge message is encoded.
[Alterations](/docs/Alterations), located inside [Sources](/docs/Sources), are employed to modify the format of data. The alterations are applied in the specified order. In the following example, the input data is merged into one message. Then, the merged message is encoded in base64 format.


```
- action: Merge
  ...
```
@@ -102,9 +104,7 @@ The subsection `alterations`, inside Sources, change the data format. These alte
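
The full example is cut off above. A hedged sketch of the merge-then-encode flow described in the text — the `Encode` action name comes from the Alterations list, while the encoding key is an assumption:

```
alterations:
  - action: Merge
  - action: Encode
    encoding: base64      # assumed key name
```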

### Connections



In the Connections section, the connections between sources and destinations are established by employing the `from` and `to` keys.
The Connections section defines the links between Sources and Destinations.

```
connection:
  ...
```
@@ -137,6 +137,4 @@ nifi:
```
connection:
  - from: dcache
    to: edgan3
```
28 changes: 14 additions & 14 deletions docpage/src/css/custom.css
@@ -6,25 +6,25 @@

/* You can override the default Infima variables here. */
:root {
--ifm-color-primary: #2e8555;
--ifm-color-primary-dark: #29784c;
--ifm-color-primary-darker: #277148;
--ifm-color-primary-darkest: #205d3b;
--ifm-color-primary-light: #33925d;
--ifm-color-primary-lighter: #359962;
--ifm-color-primary-lightest: #3cad6e;
--ifm-color-primary: #7a7abc;
--ifm-color-primary-dark: #5454a9;
--ifm-color-primary-darker: #5050a7;
--ifm-color-primary-darkest: #2c2c95;
--ifm-color-primary-light: #7676ba;
--ifm-color-primary-lighter: #8f8fc7;
--ifm-color-primary-lightest: #9f9fcf;
--ifm-code-font-size: 95%;
--docusaurus-highlighted-code-line-bg: rgba(0, 0, 0, 0.1);
}

/* For readability concerns, you should choose a lighter palette in dark mode. */
[data-theme='dark'] {
--ifm-color-primary: #25c2a0;
--ifm-color-primary-dark: #21af90;
--ifm-color-primary-darker: #1fa588;
--ifm-color-primary-darkest: #1a8870;
--ifm-color-primary-light: #29d5b0;
--ifm-color-primary-lighter: #32d8b4;
--ifm-color-primary-lightest: #4fddbf;
--ifm-color-primary: #7a7abc;
--ifm-color-primary-dark: #5454a9;
--ifm-color-primary-darker: #5050a7;
--ifm-color-primary-darkest: #2c2c95;
--ifm-color-primary-light: #7676ba;
--ifm-color-primary-lighter: #8f8fc7;
--ifm-color-primary-lightest: #9f9fcf;
--docusaurus-highlighted-code-line-bg: rgba(0, 0, 0, 0.3);
}
