
any plan to support LLMs #2626

Open
chgxtony opened this issue Jun 2, 2023 · 4 comments
Labels
enhancement New feature or request

Comments


chgxtony commented Jun 2, 2023

Is there any plan to support LLMs like OpenAI models or LLaMA? Something like LangChain, but in Java?

chgxtony added the enhancement label on Jun 2, 2023
frankfliu (Contributor) commented:

We are working on LLM support, see: #2547

We have no plans to support LangChain. However, if you are looking for multimodal support, you can take a look at: https://github.com/deepjavalibrary/djl-serving/blob/master/serving/docs/workflows.md


sandys commented Jun 20, 2023

@frankfliu does #2547 need to be merged before any LLM can be used?
I was trying to use MPT-7B; example code here: https://github.com/arakoodev/onnx-djl-example

The ONNX model loads fine, but I can't get it to produce answers.

frankfliu (Contributor) commented:

@sandys

If you just want to deploy llama, you can already do it with DJLServing: deepjavalibrary/djl-serving#844

Technically, you can run LLMs with DJL in pure Java as well. What's missing is post-processing and result token search; you can implement those yourself if you want to.
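The "result token search" mentioned above can be sketched as a plain greedy-decoding loop: run the model on the token sequence, take the argmax of the final logits, append it, and repeat until an end-of-sequence token or a length limit. The sketch below is purely illustrative and uses no DJL API; the `GreedySearch` class, the `Function`-based model stub, and the toy token ids are all hypothetical. In a real setup, the model function would wrap a DJL `Predictor` over the loaded ONNX model and return the logits for the last position.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class GreedySearch {

    /** Index of the largest logit, i.e. the greedy next-token choice. */
    static int argmax(float[] logits) {
        int best = 0;
        for (int i = 1; i < logits.length; i++) {
            if (logits[i] > logits[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Feed the growing sequence back in until EOS or maxNewTokens is hit. */
    static List<Integer> generate(
            Function<List<Integer>, float[]> model,
            List<Integer> promptIds,
            int eosTokenId,
            int maxNewTokens) {
        List<Integer> ids = new ArrayList<>(promptIds);
        for (int i = 0; i < maxNewTokens; i++) {
            // model.apply returns logits over the vocabulary for the next token
            int next = argmax(model.apply(ids));
            if (next == eosTokenId) {
                break;
            }
            ids.add(next);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Toy stand-in for the model: prefers the token after the last one
        // (vocab of 5), so it eventually emits id 0, which we treat as EOS.
        Function<List<Integer>, float[]> toyModel = ids -> {
            float[] logits = new float[5];
            int last = ids.get(ids.size() - 1);
            logits[(last + 1) % 5] = 1.0f;
            return logits;
        };
        List<Integer> out = generate(toyModel, List.of(1), 0, 10);
        System.out.println(out); // [1, 2, 3, 4]
    }
}
```

Post-processing would then be the tokenizer's decode step over the generated ids; swapping argmax for sampling or beam search fits in the same loop.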


sandys commented Jun 22, 2023

@frankfliu hey, thanks so much for your reply!
Two questions:

If you just want to deploy llama, you can already do it with DJLServing: deepjavalibrary/djl-serving#844

Llama has a licensing issue, so I'm wondering if you have tested on any other LLMs? Any at all?

Technically, you can run LLMs with DJL in pure java fashion as well.

Happy to try to implement this (and contribute back), starting from our early rough work here: https://github.com/arakoodev/onnx-djl-example. But I'm not sure how to do it. I don't need any optimisation at all; I just need it to reply for now. Any pointers on how to do this? We are attempting it with MPT-7B, but I'll take anything with an open license.
