Enhancement of the documentation
NohTow committed Oct 2, 2024
1 parent bf113d7 commit 2583f5e
Showing 2 changed files with 4 additions and 3 deletions.
2 changes: 1 addition & 1 deletion docs/documentation/.pages

````diff
@@ -4,4 +4,4 @@ nav:
   - Datasets: datasets.md
   - Retrieval: retrieval.md
   - Evaluation: evaluation.md
-  - Serving: server.md
+  - FastAPI: fastapi.md
````
````diff
@@ -1,4 +1,4 @@
-# Serve the embeddings of a PyLate model
+# Serve the embeddings of a PyLate model using FastAPI
 The ```server.py``` script (located in the ```server``` folder) allows you to create a FastAPI server to serve the embeddings of a PyLate model.
 To use it, you need to install the API dependencies: ```pip install "pylate[api]"```
 Then, run ```python server.py``` to launch the server.
@@ -15,7 +15,8 @@ curl -X POST http://localhost:8002/v1/embeddings \
 ```
 If you want to encode queries, simply set ```is_query``` to ```True```.
 
-Note that the server leverages [batched](https://github.com/mixedbread-ai/batched), so you can do batch processing by sending multiple separate calls and it will create batches dynamically to fill up the GPU.
+???+ tip
+    Note that the server leverages [batched](https://github.com/mixedbread-ai/batched), so you can do batch processing by sending multiple separate calls and it will create batches dynamically to fill up the GPU.
 
 For now, the server only supports one loaded model, which you can define by using the ```--model``` argument when launching the server.
````
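The documented flow (POST to `/v1/embeddings`, toggle `is_query`, send separate calls concurrently so `batched` can group them into one GPU batch) can be sketched as a small client. This is a hypothetical sketch: the JSON field names (`input`, `is_query`) are assumptions modeled on the curl example above, so check `server.py` for the schema the server actually expects.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Port taken from the docs' curl example; adjust if you launch differently.
API_URL = "http://localhost:8002/v1/embeddings"


def build_payload(texts, is_query=False):
    """Build the request body; field names are assumed, see server.py."""
    return {"input": texts, "is_query": is_query}


def embed(texts, is_query=False):
    """POST a batch of texts and return the parsed JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(texts, is_query)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    # Many separate calls in flight at once: the server's `batched`
    # wrapper gathers them into dynamic batches to fill up the GPU.
    queries = [f"example query {i}" for i in range(8)]
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(lambda q: embed([q], is_query=True), queries))
```

Note that sending one text per call from several threads, as above, still yields efficient GPU batches server-side, which is the point of the `batched` integration.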
