Merge pull request #332 from ggservice007/main
feat:add the openai support and remove async in the most functions.
nkwangleiGIT authored Dec 12, 2023
2 parents 61b9e33 + 9e48732 commit b242c29
Showing 39 changed files with 3,122 additions and 970 deletions.
6 changes: 3 additions & 3 deletions data-processing/.gitignore
@@ -2,8 +2,8 @@
__pycache__
.ipynb_checkpoints

mock_data
data_manipulation/mock_data

log
data_manipulation/log

file_handle/temp_file
data_manipulation/file_handle/temp_file
5 changes: 5 additions & 0 deletions data-processing/Dockerfile
@@ -23,6 +23,11 @@ ENV MINIO_API_URL=localhost:9000
ENV MINIO_SECURE=False
ENV MINIO_DATASET_PREFIX=dataset

ENV LLM_USE_TYPE=xxxxx
ENV LLM_QA_RETRY_COUNT=xxxxx
ENV OPEN_AI_DEFAULT_KEY=xxxxx
ENV OPEN_AI_DEFAULT_BASE_URL=xxxxx
ENV OPEN_AI_DEFAULT_MODEL=xxxxx
ENV ZHIPUAI_API_KEY=xxxxx

ENV KNOWLEDGE_CHUNK_SIZE=500
98 changes: 97 additions & 1 deletion data-processing/README.md
@@ -34,4 +34,100 @@ Install the Python dependencies in the requirements.txt file

### Running

Run the server.py file in the data_manipulation directory
Run the server.py file in the data_manipulation directory

# isort
isort is a tool for sorting imports alphabetically within your Python code. It helps maintain a consistent and clean import order.

## install
```shell
pip install isort
```

## isort a file
```shell
isort server.py
```

## isort a directory
```shell
isort data_manipulation
```


# config.yml
## dev phase
An example config.yml looks like this:
```yaml
minio:
  access_key: '${MINIO_ACCESSKEY: hpU4SCmj5jixxx}'
  secret_key: '${MINIO_SECRETKEY: xxx}'
  api_url: '${MINIO_API_URL: 172.22.96.136.nip.io}'
  secure: '${MINIO_SECURE: True}'
  dataset_prefix: '${MINIO_DATASET_PREFIX: dataset}'

zhipuai:
  api_key: '${ZHIPUAI_API_KEY: 871772ac03fcb9db9d4ce7b1e6eea27.VZZVy0mCox0WrzAG}'

llm:
  use_type: '${LLM_USE_TYPE: zhipuai_online}' # zhipuai_online or open_ai
  qa_retry_count: '${LLM_QA_RETRY_COUNT: 100}'

open_ai:
  key: '${OPEN_AI_DEFAULT_KEY: fake}'
  base_url: '${OPEN_AI_DEFAULT_BASE_URL: http://172.22.96.167.nip.io/v1/}'
  model: '${OPEN_AI_DEFAULT_MODEL_NAME: cb219b5f-8f3e-49e1-8d5b-f0c6da481186}'

knowledge:
  chunk_size: '${KNOWLEDGE_CHUNK_SIZE: 500}'
  chunk_overlap: '${KNOWLEDGE_CHUNK_OVERLAP: 50}'

backendPg:
  host: '${PG_HOST: localhost}'
  port: '${PG_PORT: 5432}'
  user: '${PG_USER: postgres}'
  password: '${PG_PASSWORD: 123456}'
  database: '${PG_DATABASE: arcadia}'
```
Each value uses the form `${NAME: default}`. For example, in `${MINIO_ACCESSKEY: hpU4SCmj5jixxx}`, `MINIO_ACCESSKEY` is the environment variable name and `hpU4SCmj5jixxx` is the default value used when that environment variable is not set.

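
To illustrate the placeholder rule described above, here is a minimal sketch of how such values could be resolved. The function names `resolve_placeholder` and `resolve_config`, and the regular expression, are hypothetical and written only for this example; the project's actual config loader is not shown here and may work differently.

```python
import os
import re

# Hypothetical pattern for '${NAME: default}', inferred from the example config.
_PLACEHOLDER = re.compile(
    r"^\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)\s*:\s*(?P<default>.*)\}$"
)


def resolve_placeholder(value, environ=None):
    """Return the environment value for a '${NAME: default}' string, else the default."""
    environ = os.environ if environ is None else environ
    match = _PLACEHOLDER.match(value.strip())
    if match is None:
        # Plain value without a placeholder: return it unchanged.
        return value
    return environ.get(match.group("name"), match.group("default").strip())


def resolve_config(node, environ=None):
    """Recursively resolve placeholders in nested dicts and lists of config values."""
    if isinstance(node, dict):
        return {key: resolve_config(val, environ) for key, val in node.items()}
    if isinstance(node, list):
        return [resolve_config(val, environ) for val in node]
    if isinstance(node, str):
        return resolve_placeholder(node, environ)
    return node
```

Note that in this sketch every resolved value is a string (for example `'5432'` for the port), so a caller would still need to convert numeric or boolean settings itself.
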
## release phase
An example config.yml looks like this:
```yaml
minio:
  access_key: hpU4SCmj5jixxx
  secret_key: xxx
  api_url: 172.22.96.136.nip.io
  secure: True
  dataset_prefix: dataset

zhipuai:
  api_key: 871772ac03fcb9db9d4ce7b1e6eea27.VZZVy0mCox0WrzAG

llm:
  use_type: zhipuai_online # zhipuai_online or open_ai
  qa_retry_count: 100

open_ai:
  key: fake
  base_url: http://172.22.96.167.nip.io/v1/
  model: cb219b5f-8f3e-49e1-8d5b-f0c6da481186

knowledge:
  chunk_size: 500
  chunk_overlap: 50

backendPg:
  host: localhost
  port: 5432
  user: admin
  password: 123456
  database: arcadia
```
In Kubernetes, you can use a ConfigMap to provide the /arcadia_app/data_manipulation/config.yml file.