🤗💬 Chain-Q/A

RAG application for chat completion and question answering over PDF documents, built with the LangChain 🦜🔗 framework, served with LangServe 🦜️🏓, and fronted by a Streamlit UI. It uses the Mixtral MoE 8x7B Instruct chat-completion LLM from Fireworks AI and Cohere embeddings for text encoding.


Features ✅

  • Entire application (all chains / runnables) deployed with LangServe as a single REST API.
  • In-memory session history to keep track of the chat history between user and assistant.
  • Streamed token generation for responsive output.
  • Message trimming to fit within the model's context length (QnA chain only).
  • Two chains: one for generic QnA / interaction and one for question answering over PDF documents.

Architecture 📐

[Architecture diagram]

LangServe Encapsulation 🦜️🏓

The chains are served through FastAPI endpoints on the same server (see the mounting sketch below):

  • QnA-Chain: /chain
  • RAG-chain: /rag_chain
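
A minimal sketch of how the two runnables might be mounted in server.py with LangServe's add_routes; the app setup and port are assumptions inferred from the client URLs used below:

from fastapi import FastAPI
from langserve import add_routes

app = FastAPI(title="Chain-QnA")

# mount each runnable under its own path on one server (names assumed)
add_routes(app, with_message_history, path="/chain")
add_routes(app, rag_chain, path="/rag_chain")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="localhost", port=8000)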

RAG-Chain 📑🔗

PDF document content is posted from the client side to the /chunk endpoint, where it is recursively split and indexed into a Chroma vector DB for similarity retrieval. For a given user query, the retriever pulls the relevant chunks and passes them as context to the model, which generates the response.
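
A sketch of what the /chunk endpoint could look like, assuming RecursiveCharacterTextSplitter, Cohere embeddings, and an in-process Chroma store; the splitter settings, embedding model name, and the format_docs helper are assumptions:

from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_cohere import CohereEmbeddings

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)

def format_docs(docs):
    # join retrieved chunks into one context string for the prompt
    return "\n\n".join(d.page_content for d in docs)

@app.post("/chunk")
def chunk(payload: dict):
    global retriever
    # recursively split the posted PDF text and index it in Chroma
    docs = splitter.create_documents([payload["text"]])
    vectordb = Chroma.from_documents(docs, CohereEmbeddings(model="embed-english-v3.0"))
    retriever = vectordb.as_retriever()
    return {"status": "ok", "chunks": len(docs)}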

from langchain import hub
from langchain_core.runnables import RunnablePassthrough

# prompt pulled from the LangChain Hub; retriever and format_docs as above
rag_prompt = hub.pull("rlm/rag-prompt")
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | model
)
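
Once a PDF has been posted to /chunk, the chain can be exercised directly; a minimal usage sketch (the question is illustrative):

answer = rag_chain.invoke("What is this document about?")
print(answer.content)  # the model returns a chat message; .content is the text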

QnA-Chain 💬🔗

Each session is assigned a user_id and a conversation_id to maintain an in-memory chat history. The chain is wrapped with RunnableWithMessageHistory, and the stored history is passed through a trimmer so it fits the model's context length.
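
The trimmer referenced in the chain below could be built with langchain_core's trim_messages; the token budget here is an assumption:

from langchain_core.messages import trim_messages

# keep only the most recent messages that fit the context window
trimmer = trim_messages(
    max_tokens=4096,      # assumed budget; tune to the deployed model
    strategy="last",
    token_counter=model,  # let the chat model count tokens
    include_system=True,
    start_on="human",
)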

from operator import itemgetter
from langchain_core.runnables import ConfigurableFieldSpec, RunnablePassthrough
from langchain_core.runnables.history import RunnableWithMessageHistory

# first_step and prompt are defined elsewhere in server.py
chain = (
    first_step
    | RunnablePassthrough.assign(messages=itemgetter("history") | trimmer)
    | prompt
    | model
)
with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history=get_session_history,
    input_messages_key="question",
    history_messages_key="history",
    history_factory_config=[
        ConfigurableFieldSpec(
            id="user_id",
            annotation=str,
            name="User ID",
            description="Unique identifier for the user.",
            default="",
            is_shared=True,
        ),
        ConfigurableFieldSpec(
            id="conversation_id",
            annotation=str,
            name="Conversation ID",
            description="Unique identifier for the conversation.",
            default="",
            is_shared=True,
        ),
    ],
)
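
A possible in-memory implementation of get_session_history, keyed by the two configurable fields (a sketch; the store name and exact history class are assumptions):

from langchain_core.chat_history import BaseChatMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory

store = {}  # (user_id, conversation_id) -> ChatMessageHistory

def get_session_history(user_id: str, conversation_id: str) -> BaseChatMessageHistory:
    # create a fresh history the first time a session pair is seen
    if (user_id, conversation_id) not in store:
        store[(user_id, conversation_id)] = ChatMessageHistory()
    return store[(user_id, conversation_id)]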

Streamlit Client 🔺

  • When PDF chat is disabled, all user queries are directed to the QnA chain.

from langserve import RemoteRunnable
import requests

remote_chain_qa = RemoteRunnable("http://localhost:8000/chain/")

  • For PDF chat, the PDF content is first posted to the /chunk endpoint, and all subsequent PDF-QnA queries are directed to the RAG chain.

requests.post("http://localhost:8000/chunk/", json={"text": text})
remote_chain_rag = RemoteRunnable("http://localhost:8000/rag_chain/")

  • Chain responses are streamed for a smoother UX.
def stream_data(query, remote_chain):
    '''Streaming output generator.'''
    # session identifiers consumed by RunnableWithMessageHistory
    config = {"user_id": "user_id", "conversation_id": "conversation_id"}
    for r in remote_chain.stream(
        query,
        config={"configurable": config},
    ):
        yield r.content
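
On the Streamlit side, the generator can be rendered incrementally, e.g. with st.write_stream (a sketch; the app's actual rendering may differ):

import streamlit as st

# render tokens as they arrive and capture the full reply
reply = st.write_stream(stream_data(query, remote_chain_qa))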

Setup 💻

Create a Python virtual environment, then clone the repository and install the dependencies:

git clone https://github.com/MuhammadBinUsman03/Chain-QnA.git
cd Chain-QnA
pip install -r requirements.txt

Start the LangServe server:

python server.py

Start the Streamlit app in a separate terminal:

streamlit run app.py

📫 Get in Touch

LinkedIn Hugging Face Medium X Substack
