Merge pull request #294 from jembi/CU-86bzwgbuv_Update-Architecture-S…
…ection

Cu 86bzwgbuv update architecture section
MatthewErispe authored Aug 8, 2024
2 parents 75abfa4 + 1ddddd9 commit 4bd660b
Showing 6 changed files with 105 additions and 46 deletions.
Binary file modified documentation/.gitbook/assets/0
Binary file not shown.
Binary file modified documentation/.gitbook/assets/2
Binary file not shown.
Binary file modified documentation/.gitbook/assets/3
Binary file not shown.
Binary file modified documentation/.gitbook/assets/4
Binary file not shown.
Binary file modified documentation/.gitbook/assets/5
Binary file not shown.
151 changes: 105 additions & 46 deletions documentation/architecture.md
@@ -8,18 +8,16 @@ The JeMPI Client Registry is a system that incorporates a microservice architect

**The synchronous and asynchronous flow diagrams are shown below.**

#### [Asynchronous flow](https://drive.google.com/file/d/1rcbF3UJ5Lh-4bjXl8GVpJVnYxA1diRjl/view?usp=sharing) <a href="#_2v012h2bohjt" id="_2v012h2bohjt"></a>
#### [Asynchronous flow](https://drive.google.com/file/d/1G3_-BZNwRSOeriad6IbR6rFERQUnx1AK/view?usp=sharing) <a href="#_2v012h2bohjt" id="_2v012h2bohjt"></a>

![](.gitbook/assets/0)

## JeMPI_AsyncReceiver <a href="#_6om7ih1t1k41" id="_6om7ih1t1k41"></a>

**Description:** A microservice that sends the content of an uploaded CSV file to the JeMPI_ETL service. The JeMPI_AsyncReceiver service produces Kafka messages, each containing one row of the uploaded CSV file, and publishes them to a Kafka topic.

The base version of JeMPI supports only 10 columns in the following order **\[for the current version]**:
The base version uses a reference implementation with the fields below:

**String** uid,\
**SourceId** sourceId,\
**String** auxId,\
**String** givenName,\
**String** familyName,\
@@ -35,54 +33,118 @@ The base version of JeMPI supports only 10 columns in the following order **\[fo
Example of input file:

```
ID,Given_Name,Family_Name,Gender_at_Birth,Date_of_Birth,City,Phone_Number,National_ID,Dummy1,Dummy2,Dummy3
rec-00000000-aaa-0,Endalekachew,Onyango,male,20171114,Nairobi,091-749-4674,198804042874913,19940613,19781023,19660406
rec-00000001-aaa-0,Fikadu,Mwendwa,male,19840626,Nairobi,022-460-8846,199403050409528,20190317,19400321,20190104
rec-00000002-bbb-0,Biniyam,Maalim,male,20191022,Nairobi,098-119-7244,200006231841948,,20190302,
ID,Given_Name,Family_Name,Gender_at_Birth,Date_of_Birth,City,Phone_Number,National_ID
rec-00000000-aaa-0,Endalekachew,Onyango,male,20171114,Nairobi,091-749-4674,198804042874913
rec-00000001-aaa-0,Fikadu,Mwendwa,male,19840626,Nairobi,022-460-8846,199403050409528
rec-00000002-bbb-0,Biniyam,Maalim,male,20191022,Nairobi,098-119-7244,200006231841948
```
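The row-per-message behaviour described above can be sketched as follows. This is a minimal illustration, not JeMPI's implementation: `rows_to_messages` is a hypothetical helper, the JSON payload shape is an assumption, and publishing to Kafka is left out so the sketch runs without a broker.

```python
import csv
import io
import json


def rows_to_messages(csv_text):
    """Yield one JSON-encoded message per data row of the CSV text.

    Hypothetical helper: the real service's message format may differ.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        # Each row becomes its own message, keyed by the CSV header names.
        yield json.dumps(row)


SAMPLE = (
    "ID,Given_Name,Family_Name,Gender_at_Birth,Date_of_Birth,City,Phone_Number,National_ID\n"
    "rec-00000000-aaa-0,Endalekachew,Onyango,male,20171114,Nairobi,091-749-4674,198804042874913\n"
    "rec-00000001-aaa-0,Fikadu,Mwendwa,male,19840626,Nairobi,022-460-8846,199403050409528\n"
)

messages = list(rows_to_messages(SAMPLE))
```

In a real deployment each message would be handed to a Kafka producer instead of being collected in a list.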

**Output**

The service will save the data from the CSV file, one line at a time.\
Kafka topic: _TOPIC_INTERACTION_ASYNC_ETL="JeMPI-async-etl"_
Kafka topic: _TOPIC_INTERACTION_ETL="JeMPI-interactions-etl"_

## JeMPI_ETL <a href="#_r783bgaxx08b" id="_r783bgaxx08b"></a>

**Description:** A microservice that processes the input coming from the JeMPI_AsyncReceiver. The JeMPI_ETL service performs some data transformation, e.g. lowercasing the patient's name or stripping the formatting from the date of birth. The resulting data is sent as JSON (JSON Streaming) to the JeMPI_Controller service.
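The kind of transformation described above might look like the sketch below. It is an assumption-laden illustration rather than the service's actual code: `clean_record`, `TEXT_FIELDS` and the field names are hypothetical, and the real ETL rules may differ.

```python
import re

# Hypothetical set of free-text fields that get lowercased.
TEXT_FIELDS = {"given_name", "family_name", "gender", "city"}


def clean_record(record):
    """Return a copy of the record with names lowercased and the
    date of birth reduced to digits only (e.g. 1979-10-14 -> 19791014)."""
    cleaned = {}
    for key, value in record.items():
        value = value.strip()
        if key in TEXT_FIELDS:
            value = value.lower()
        elif key == "dob":
            # Drop every non-digit character so the date is unformatted.
            value = re.sub(r"\D", "", value)
        cleaned[key] = value
    return cleaned


cleaned = clean_record(
    {"given_name": " Esther ", "family_name": "Zulu", "dob": "1979-10-14"}
)
```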

**Input:**

Data coming from the the JeMPI_AsyncReciever service.\
Kafka topic: _TOPIC_PATIENT_ASYNC_PREPROCESSOR="JeMPI-async-etl"_
Data coming from the JeMPI_AsyncReceiver service.\
Kafka topic: _TOPIC_INTERACTION_ETL="JeMPI-interaction-etl"_

**Output:**

Data transformed into JSON that will be sent to the JeMPI_Controller. It will be stored in the Kafka topic: _TOPIC_PATIENT_CONTROLLER="JeMPI-patient-controller"_
Data transformed into JSON that will be sent to the JeMPI_Controller. It will be stored in the Kafka topic: _TOPIC_INTERACTION_CONTROLLER="JeMPI-interaction-controller"_

Example of a Kafka message coming from the patient controller topic:
Example of a Kafka message coming from the interaction controller topic:

<figure><img src=".gitbook/assets/3" alt=""><figcaption></figcaption></figure>

```json
{
"contentType": "BATCH_INTERACTION",
"tag": "csv/import-1050836091564327170.csv",
"stan": "2023/09/06 08:29:13:0000008",
"tag": "import-5334297603633827819uploadConfig",
"stan": "2024/08/07 08:24:07:0000001",
"interaction": {
"sourceId": { "facility": "FA4", "patient": "197910145001067" },
"sourceId": {
"facility": "FA2",
"patient": "patient_id"
},
"uniqueInteractionData": {
"auxDateCreated": "2023-09-06T08:29:13.426518561",
"auxId": "rec-0000000002--5",
"auxClinicalData": "RANDOM DATA(975)"
"auxDateCreated": "2024-08-07T08:24:08.174750419",
"auxUserFields": [
{
"scTag": "aux_id",
"tag": "auxId",
"value": "rec-0000000708-02"
},
{
"scTag": "aux_clinical_data",
"tag": "auxClinicalData",
"value": "RANDOM DATA(865)"
}
]
},
"demographicData": {
"givenName": "esther",
"familyName": "zulu",
"gender": "female",
"dob": "19791014",
"city": "mufulira",
"phoneNumber": "0157172342",
"nationalId": "197910145001067"
"fields": [
{
"tag": "given_name",
"value": "patricia"
},
{
"tag": "family_name",
"value": "solis"
},
{
"tag": "gender",
"value": "female"
},
{
"tag": "dob",
"value": "19821106"
},
{
"tag": "city",
"value": "chicago"
},
{
"tag": "phone_number",
"value": "0133705553"
},
{
"tag": "national_id",
"value": "198211065001099"
}
]
}
},
"sessionMetadata": {
"commonMetaData": {
"stan": "2024/08/07 08:24:07:0000001",
"uploadConfig": {
"reportingRequired": false,
"uploadWorkflow": 0,
"minThreshold": 0.65,
"linkThreshold": 0.7,
"maxThreshold": 0.75,
"marginWindowSize": 0.1
}
},
"uiMetadata": {
"timeStamp": null
},
"asyncReceiverMetadata": {
"timeStamp": "2024/08/07 08:24:08"
},
"etlMetadata": {
"timeStamp": "2024/08/07 08:27:05"
},
"controllerMetadata": {
"timeStamp": null
},
"linkerMetadata": {
"timeStamp": null
}
}
}
@@ -92,50 +154,47 @@ Example or a Kafka message coming from the patient controller topic:

**Description:** The JeMPI_Controller service has multiple tasks:

- Send the data coming from the JeMPI_ETL to both the JeMPI_Linker and the JeMPI_EM services. The data will be stored in their respective Kafka topics accessed (consumed) by those services.
- Control and manage the optimization of the M & U value computation by activating or stopping the linkage process of the JeMPI_Linker service. The new M & U values are fetched from the JeMPI_EM service and then provided to the JeMPI_Linker service.
- Send the data coming from the JeMPI_ETL to either the JeMPI_Linker or the JeMPI_EM service, based on the workflow selected by the user on import. The data will be stored in the respective Kafka topic consumed by that service.

**Input:**
Data coming from the JeMPI_ETL service
Kafka topic: _TOPIC_PATIENT_CONTROLLER="JeMPI-patient-controller"_.
Kafka topic: _TOPIC_INTERACTION_CONTROLLER="JeMPI-interaction-controller"_.

Values of the M & U computed in the JeMPI_EM service
Kafka topic: _TOPIC_MU_CONTROLLER="JeMPI-mu-controller"_

**Output:**
Send the data to the JeMPI_EM

- Kafka topic: _TOPIC_PATIENT_EM="JeMPI-patient-em"_
Send the data to the JeMPI_Linker
- Kafka topic: _TOPIC_PATIENT_LINKER="JeMPI-patient-linker"_

MU process: Kafka topic: _TOPIC_MU_LINKER="JeMPI-mu-linker"_
1. Send the data to the JeMPI_EM
- Kafka topic: _TOPIC_INTERACTION_EM="JeMPI-interaction-em"_
- MU process: Kafka topic: _TOPIC_MU_LINKER="JeMPI-mu-linker"_

![](.gitbook/assets/4) ![](.gitbook/assets/5)

2. Send the data to the JeMPI_Linker
- Kafka topic: _TOPIC_INTERACTION_LINKER="JeMPI-interaction-linker"_

## JeMPI_EM <a href="#_7tf3t1atn1ab" id="_7tf3t1atn1ab"></a>

**Description:** A microservice that creates an object containing the m and u values computed from the patient records that go into the EM algorithm: the quality (m) and the uniqueness (u) per field. This object is used in the linker for matching patients. The service uses a machine learning algorithm called Expectation-Maximisation (EM) to optimize these values; it is launched after receiving the number of records specified in the configuration.
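To make the role of the per-field m and u values concrete, here is a small sketch of how such values are commonly turned into a match weight in probabilistic record linkage. It assumes a Fellegi-Sunter-style model, which this document does not confirm for JeMPI; `field_weight`, `record_weight` and the example numbers are all hypothetical.

```python
import math


def field_weight(m, u, agrees):
    """Log2 weight contributed by one field.

    Assumes m = P(field agrees | records match) and
    u = P(field agrees | records do not match).
    """
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def record_weight(mu_by_field, agreement):
    """Sum the field weights over every field that has m and u values."""
    return sum(
        field_weight(m, u, agreement[name])
        for name, (m, u) in mu_by_field.items()
    )


# Illustrative values only, not taken from JeMPI.
MU = {"given_name": (0.8, 0.1), "national_id": (0.9, 0.01)}
score = record_weight(MU, {"given_name": True, "national_id": False})
```

Under this assumed model, a field with high m and low u adds a large positive weight when it agrees and a negative weight when it disagrees.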

**Input:** Kafka topic: _TOPIC_PATIENT_LINKER="JeMPI-patient-linker"_
**Input:** Kafka topic: _TOPIC_INTERACTION_LINKER="JeMPI-interaction-linker"_

**Output:** Kafka topic: _TOPIC_MU_CONTROLLER="JeMPI-mu-controller"_

## JeMPI_Linker <a href="#_111ah0ssrp64" id="_111ah0ssrp64"></a>

**Description:** A microservice that will interact with Dgraph database to do the matching of the patients. The Linker uses thresholds to drive the linking and notifications for review processes. These thresholds are the following:

- **A single match or no match threshold:** the encounter will automatically be linked to the highest golden record candidate above the threshold. If no candidate has a score above the threshold, a new golden record is created. This is typically used for fully autonomous linking.
- **Window around the match/no match threshold:** if the highest score generated for the candidates falls within this window, a notification is sent for the Admin to review the encounter.
- **Margin threshold:** if another candidate falls within a margin from the highest score and this highest score is above the match/no match threshold, a notification is sent for the Admin to review the linked encounter.
- **A single match or no match threshold:** the interaction will automatically be linked to the highest golden record candidate above the threshold. If no candidate has a score above the threshold, a new golden record is created. This is typically used for fully autonomous linking.
- **Window around the match/no match threshold:** if the highest score generated for the candidates falls within this window, a notification is sent for the Admin to review the interaction.
- **Margin threshold:** if another candidate falls within a margin from the highest score and this highest score is above the match threshold, a notification is sent for the Admin to review the linked interaction.
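The three threshold rules above can be sketched as a single decision function. This is an illustrative reading of the rules, not the Linker's code: `decide` and `Decision` are hypothetical names, the default values are borrowed from the `uploadConfig` in the sample message, and mapping `minThreshold`/`maxThreshold` to the review window and `marginWindowSize` to the margin is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Decision:
    action: str               # "link" or "new_golden_record"
    candidate: Optional[str]  # golden record id when linking
    notify: bool              # whether an Admin review notification is sent


def decide(scores: Dict[str, float], link_threshold: float = 0.7,
           min_threshold: float = 0.65, max_threshold: float = 0.75,
           margin: float = 0.1) -> Decision:
    """Apply the match, window and margin rules to candidate scores."""
    if not scores:
        return Decision("new_golden_record", None, False)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_id, best = ranked[0]
    # Window rule: the best score is close to the match/no-match threshold.
    in_window = min_threshold <= best <= max_threshold
    if best < link_threshold:
        # No candidate reaches the threshold: create a new golden record.
        return Decision("new_golden_record", None, in_window)
    # Margin rule: a runner-up scores within the margin of the best candidate.
    close_rival = len(ranked) > 1 and (best - ranked[1][1]) <= margin
    return Decision("link", best_id, in_window or close_rival)
```

For example, `decide({"g1": 0.9, "g2": 0.85})` links to `g1` but flags the link for review, because the runner-up falls within the margin.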

**Input:**
Kafka topic: _TOPIC_PATIENT_EM="JeMPI-patient-em"_\
Kafka topic: _TOPIC_MU_LINKER="JeMPI-mu-linker"_
Kafka topic: _TOPIC_INTERACTION_EM="JeMPI-interaction-em"_\

**Output:**

- Interact with the Dgraph database using GraphQL queries/mutations, save the patients and the links.
- Interact with the Dgraph database using GraphQL queries/mutations, save the interactions and the links.
- Send response of either the link info or the list of candidates to the Controller
- Save response to Kafka topic: _TOPIC_notifications="JeMPI_notifications"_

@@ -147,17 +206,17 @@ Component linked:

- **Dgraph Ratel:** A tool for data visualization and cluster management. Ratel can be used with Dgraph to manage cluster settings, run DQL queries and mutations and see results of the mentioned operations.
- **Dgraph Alpha:** Expose and host endpoints of the indexes.
- **Dgraph Zero:** Plays a role similar to ZooKeeper in Kafka: it controls the Alpha instances by assigning them to a group and rebalances the data between them.
- **Dgraph Zero:** Plays a role similar to ZooKeeper/KRaft in Kafka: it controls the Alpha instances by assigning them to a group and rebalances the data between them.

## JeMPI_Kafka <a href="#_lhpqpufx5pyy" id="_lhpqpufx5pyy"></a>

**Description:** Kafka is the message bus; it hosts all the topics used by the other components described above.

## JeMPI_API <a href="#_ioszcxv7tpj" id="_ioszcxv7tpj"></a>

**Description:** The JeMPI_API service contains the endpoints needed to interact with JeMPI. aside from acting as an access point to the JeMPI system, this service
**Description:** The JeMPI_API service contains the endpoints needed to interact with JeMPI.

It will do the following actions:
It performs the following functions:

- Read data from the Kafka topic _TOPIC_notifications="JeMPI_notifications"_
- Save data related to the administration in the PostgreSQL DB