step-by-step intent detection (#308)
IANTHEREAL authored Sep 25, 2024
1 parent 0b54548 commit ae9ccee
Showing 8 changed files with 161 additions and 538 deletions.
26 changes: 11 additions & 15 deletions backend/app/rag/knowledge_graph/intent.py
@@ -14,7 +14,7 @@ class RelationshipReasoning(Relationship):

reasoning: str = Field(
description=(
"Category reasoning for the relationship, e.g., 'the main concerns of the user', 'the problem the user is facing', 'the use case scenario', etc."
"Explanation of the user's intention for this step."
)
)

@@ -23,32 +23,28 @@ class DecomposedFactors(BaseModel):
"""Decomposed factors extracted from the query to form the knowledge graph"""

relationships: List[RelationshipReasoning] = Field(
description="List of relationships to represent critical concepts and their relationships extracted from the query."
description="List of relationships representing the user's prerequisite and step-by-step intentions extracted from the query."
)


class DecomposeQuery(dspy.Signature):
"""You are a knowledge base graph expert and are very good at building knowledge graphs. Now you are assigned to extract the most critical concepts and their relationships from the query. Step-by-Step Analysis:
1. Extract meaningful user intents and questions:
- Identify the question the user intentionally asked, focusing on the critical information about the user's main concerns/questions/problems/use cases, etc.
- Make this question simple and clear, and ensure that it is directly related to the user's main concerns. A simple, clear question can improve search accuracy.
2. Establish Relationships to describe the user's intents:
- Define relationships that accurately represent the user's query intent and information needs.
- Format each relationship as: (Source Entity) - [Relationship] -> (Target Entity), where the relationship describes what the user wants to know about the connection between these entities.
"""You are a knowledge base graph expert and are very good at building knowledge graphs. Now you are assigned to extract the user's step-by-step intentions from the query.
## Instructions:
- Limit to no more than 3 pairs. These pairs must accurately reflect the user's real (sub)questions.
- Ensure that the extracted pairs are of high quality and do not introduce unnecessary search elements.
- Ensure that the relationships and intents are grounded and factual, based on the information provided in the query.
- Break down the user's query into a sequence of prerequisite questions (e.g., identifying specific versions) and step-by-step intentions.
- Represent each prerequisite and intention as a relationship: (Source Entity) - [Relationship] -> (Target Entity).
- Provide reasoning for each relationship, explaining the user's intention at that step.
- Limit to no more than 5 relationships.
- Ensure that the extracted relationships accurately reflect the user's real intentions.
- Ensure that the relationships and intentions are grounded and factual, based on the information provided in the query.
"""

query: str = dspy.InputField(
desc="The query text to extract the most critical concepts and their relationships from the query."
desc="The query text to extract the user's step-by-step intentions."
)
factors: DecomposedFactors = dspy.OutputField(
desc="Factors representation of the critical concepts and their relationships extracted from the query."
desc="Representation of the user's step-by-step intentions extracted from the query."
)
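
To make the relationship format in the new prompt concrete, here is a minimal standalone sketch using only the standard library. It does not use dspy or the repository's `Relationship` model: the `IntentStep` dataclass, `format_step`, `clamp_steps`, and `MAX_RELATIONSHIPS` are hypothetical names, and the field names are assumed from the keys in `decompose_query_samples.json`.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical constant mirroring the prompt's "no more than 5 relationships" rule.
MAX_RELATIONSHIPS = 5


@dataclass
class IntentStep:
    """Hypothetical stand-in for RelationshipReasoning (field names assumed)."""

    source_entity: str
    target_entity: str
    relationship_desc: str
    reasoning: str


def format_step(step: IntentStep) -> str:
    """Render a step as (Source Entity) - [Relationship] -> (Target Entity)."""
    return f"({step.source_entity}) - [{step.relationship_desc}] -> ({step.target_entity})"


def clamp_steps(steps: List[IntentStep]) -> List[IntentStep]:
    """Keep at most MAX_RELATIONSHIPS steps, preserving their order."""
    return steps[:MAX_RELATIONSHIPS]


# A prerequisite question first, then the main intention.
steps = clamp_steps([
    IntentStep(
        "TiDB",
        "Latest Version",
        "What is the latest version of TiDB?",
        "The prerequisite question that needs to be figured out.",
    ),
    IntentStep(
        "Feature Changes",
        "TiDB 7.0 to Latest Version",
        "The feature changes from TiDB 7.0 to the latest version.",
        "The main question the user is asking.",
    ),
])

for step in steps:
    print(format_step(step))
```

Keeping the prerequisite step ahead of the main intention reflects the ordering the prompt asks for; the clamp is only an illustration of the limit and is not something the commit itself enforces in code.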


39 changes: 15 additions & 24 deletions backend/dspy_compiled_program/decompose_query_program

Large diffs are not rendered by default.

135 changes: 135 additions & 0 deletions backend/dspy_compiled_program/decompose_query_samples.json
@@ -0,0 +1,135 @@
[
{
"query":"Chat2query is returning an error message saying \"Query timeout expired\". Additionally, I am unable to locate this SQL query in the slow query log.",
"source_entity":"Chat2query",
"target_entity":"Error Message",
"relationship_desc":"Chat2query is returning an error message saying 'Query timeout expired'.",
"reasoning":"The main problem the user is facing."
},
{
"query":"Chat2query is returning an error message saying \"Query timeout expired\". Additionally, I am unable to locate this SQL query in the slow query log.",
"source_entity":"SQL Query",
"target_entity":"Slow Query Log",
"relationship_desc":"Why the SQL query cannot be found in the slow query log.",
"reasoning":"The secondary problem the user is facing."
},
{
"query":"I am current using tidb serverless, but as my product grows, I really need a dalicated cluster. Is there a solution helps finish the migration?",
"source_entity":"TiDB Serverless",
"target_entity":"Dedicated Cluster",
"relationship_desc":"How to migrate from TiDB serverless to TiDB dedicated cluster?",
"reasoning":"The main concern of the user."
},
{
"query":"I am designing a table based on TiDB's TTL feature, but when I try to create the table using a cluster created with Serverless, I get a `'TTL' is not supported on TiDB Serverless` error.\n\nI plan to use Dedicated on my production environment and Serverless on my development environment, so it would be helpful if the TTL feature could be used in a Serverless environment.\n\nI've read the documentation that says Serverless will support TTL features in the future, but is there a specific timeline for this?\n\nAlso, is it possible to prevent TTL syntax from causing errors in Serverless?",
"source_entity":"TTL Feature",
"target_entity":"TiDB Serverless",
"relationship_desc":"The TTL feature is not currently supported in TiDB Serverless.",
"reasoning":"The problem the user is facing."
},
{
"query":"I am designing a table based on TiDB's TTL feature, but when I try to create the table using a cluster created with Serverless, I get a `'TTL' is not supported on TiDB Serverless` error.\n\nI plan to use Dedicated on my production environment and Serverless on my development environment, so it would be helpful if the TTL feature could be used in a Serverless environment.\n\nI've read the documentation that says Serverless will support TTL features in the future, but is there a specific timeline for this?\n\nAlso, is it possible to prevent TTL syntax from causing errors in Serverless?",
"source_entity":"TTL Feature",
"target_entity":"Roadmap Support Timeline",
"relationship_desc":"What is the roadmap timeline for supporting the TTL feature in TiDB Serverless?",
"reasoning":"The main question the user is asking."
},
{
"query":"I am designing a table based on TiDB's TTL feature, but when I try to create the table using a cluster created with Serverless, I get a `'TTL' is not supported on TiDB Serverless` error.\n\nI plan to use Dedicated on my production environment and Serverless on my development environment, so it would be helpful if the TTL feature could be used in a Serverless environment.\n\nI've read the documentation that says Serverless will support TTL features in the future, but is there a specific timeline for this?\n\nAlso, is it possible to prevent TTL syntax from causing errors in Serverless?",
"source_entity":"TTL SQL Syntax",
"target_entity":"Workaround for SQL Syntax Error",
"relationship_desc":"Workaround to prevent TTL feature SQL syntax from causing errors in TiDB Serverless.",
"reasoning":"The secondary question the user is asking."
},
{
"query":"Upgrade TiDB Serverless to 7.4 or latest for enhanced MySQL 8.0 compatibility",
"source_entity":"TiDB 7.4 or Latest version",
"target_entity":"MySQL 8.0 Compatibility",
"relationship_desc":"TiDB 7.4 or the latest version enhances compatibility with MySQL 8.0",
"reasoning":"The reason why the user wants to upgrade TiDB Serverless to 7.4 or the latest version for enhanced MySQL 8.0 compatibility."
},
{
"query":"Upgrade TiDB Serverless to 7.4 or latest for enhanced MySQL 8.0 compatibility",
"source_entity":"TiDB Serverless",
"target_entity":"Upgrade",
"relationship_desc":"How to upgrade TiDB Serverless?",
"reasoning":"The basic question that the user intentionally asked."
},
{
"query":"We are new to TiDB and don't quite understand the potential impact on our application architecture. We are using TiDB for audit logs and continue to direct traffic to TiDB. We noticed a sudden jump ID from 1 to 30,001. Are there any impacts? Do we need to address this? If we have 100 connections from several applications, what will happen? In summary, what should we do for Auto Increment or do nothing?",
"source_entity":"Auto Increment",
"target_entity":"ID Jump",
"relationship_desc":"Why does Auto Increment in TiDB cause a sudden increase in the ID values?",
"reasoning":"The main concern that the user intentionally asked about."
},
{
"query":"We are new to TiDB and don't quite understand the potential impact on our application architecture. We are using TiDB for audit logs and continue to direct traffic to TiDB. We noticed a sudden jump ID from 1 to 30,001. Are there any impacts? Do we need to address this? If we have 100 connections from several applications, what will happen? In summary, what should we do for Auto Increment or do nothing?",
"source_entity":"Connections Impact",
"target_entity":"TiDB",
"relationship_desc":"How do 100 connections from several applications affect TiDB, especially when Auto Increment causes a sudden jump in ID values?",
"reasoning":"The second most important question that the user intentionally asked."
},
{
"query":"We are new to TiDB and don't quite understand the potential impact on our application architecture. We are using TiDB for audit logs and continue to direct traffic to TiDB. We noticed a sudden jump ID from 1 to 30,001. Are there any impacts? Do we need to address this? If we have 100 connections from several applications, what will happen? In summary, what should we do for Auto Increment or do nothing?",
"source_entity":"TiDB",
"target_entity":"Audit Logs",
"relationship_desc":"TiDB is used for storing audit logs and receiving continuous traffic.",
"reasoning":"The use case that the user wants to achieve."
},
{
"query":"tidb lighting to sync to serverless cluster,but the load command and the tidb-lighting tools dont have the tls config like --ssl-ca or --ca. so i can not sync to the full back data to the serverless",
"source_entity":"TiDB Lightning",
"target_entity":"Serverless Cluster",
"relationship_desc":"Sync data to a serverless cluster using TiDB Lightning.",
"reasoning":"The use case that the user wants to achieve."
},
{
"query":"tidb lighting to sync to serverless cluster,but the load command and the tidb-lighting tools dont have the tls config like --ssl-ca or --ca. so i can not sync to the full back data to the serverless",
"source_entity":"Load Command and TiDB Lightning Tools",
"target_entity":"TLS Configuration",
"relationship_desc":"How to configure TLS for TiDB Lightning?",
"reasoning":"The basic question that the user intentionally asked."
},
{
"query":"tidb lighting to sync to serverless cluster,but the load command and the tidb-lighting tools dont have the tls config like --ssl-ca or --ca. so i can not sync to the full back data to the serverless",
"source_entity":"Lack of TLS Configuration",
"target_entity":"Sync Issue",
"relationship_desc":"The sync issue is caused by the lack of TLS configuration options for TiDB Lightning.",
"reasoning":"The problem that the user is facing."
},
{
"query":"summary the performance improvement from version 6.5 to newest version for TiDB",
"source_entity":"TiDB",
"target_entity":"the newest version",
"relationship_desc":"What is the newest version of TiDB?",
"reasoning":"The prerequisite question that needs to be figured out."
},
{
"query":"summary the performance improvement from version 6.5 to newest version for TiDB",
"source_entity":"Performance Improvement",
"target_entity":"TiDB 6.5 to Newest Version",
"relationship_desc":"The performance improvement from TiDB 6.5 to the newest version.",
"reasoning":"The main question the user is asking."
},
{
"query":"What are the feature changes in the latest version compared to v7.0 for TiDB?",
"source_entity":"TiDB",
"target_entity":"Latest Version",
"relationship_desc":"What is the latest version of TiDB?",
"reasoning":"The prerequisite question that needs to be figured out."
},
{
"query":"What are the feature changes in the latest version compared to v7.0 for TiDB?",
"source_entity":"New and Deprecated Features",
"target_entity":"TiDB v7.0",
"relationship_desc":"What new features have been added, and which features have been deprecated or removed, since TiDB v7.0?",
"reasoning":"The sub-question needed to answer the main question."
},
{
"query":"What are the feature changes in the latest version compared to v7.0 for TiDB?",
"source_entity":"Feature Changes",
"target_entity":"TiDB 7.0 to Latest Version",
"relationship_desc":"The feature changes from TiDB 7.0 to the latest version.",
"reasoning":"The main question the user is asking."
}
]
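
Because the sample file is a flat list with one record per relationship, the step-by-step sequence for a given query only becomes visible once records are grouped by their `query` value. The following is a rough sketch of that grouping, assuming the five keys seen in the samples above; `group_by_query`, `REQUIRED_KEYS`, and the inlined `SAMPLES_JSON` (two records adapted from the file) are illustrative names, not part of the repository.

```python
import json
from collections import defaultdict
from typing import Dict, List

# Keys assumed from the records in decompose_query_samples.json.
REQUIRED_KEYS = {"query", "source_entity", "target_entity", "relationship_desc", "reasoning"}

# Two records adapted from the sample file, inlined so the sketch runs on its own.
SAMPLES_JSON = """
[
  {
    "query": "What are the feature changes in the latest version compared to v7.0 for TiDB?",
    "source_entity": "TiDB",
    "target_entity": "Latest Version",
    "relationship_desc": "What is the latest version of TiDB?",
    "reasoning": "The prerequisite question that needs to be figured out."
  },
  {
    "query": "What are the feature changes in the latest version compared to v7.0 for TiDB?",
    "source_entity": "Feature Changes",
    "target_entity": "TiDB 7.0 to Latest Version",
    "relationship_desc": "The feature changes from TiDB 7.0 to the latest version.",
    "reasoning": "The main question the user is asking."
  }
]
"""


def group_by_query(records: List[dict]) -> Dict[str, List[dict]]:
    """Group flat sample records by query, keeping the file order within each group."""
    grouped = defaultdict(list)
    for index, record in enumerate(records):
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            raise ValueError(f"record {index} is missing keys: {sorted(missing)}")
        grouped[record["query"]].append(record)
    return dict(grouped)


grouped = group_by_query(json.loads(SAMPLES_JSON))
for query, steps in grouped.items():
    print(query)
    for number, step in enumerate(steps, start=1):
        print(f"  {number}. {step['relationship_desc']} ({step['reasoning']})")
```

Grouping this way also makes it easy to spot queries whose record count exceeds the five-relationship limit stated in the updated prompt.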
21 changes: 0 additions & 21 deletions backend/dspy_compiled_program/sql_sample.csv

This file was deleted.
