
Commit

add new use case, update blog (#2721)
colegottdank authored Oct 4, 2024
1 parent 20eb858 commit 8781c06
Showing 4 changed files with 318 additions and 26 deletions.
8 changes: 4 additions & 4 deletions bifrost/app/blog/blogs/replaying-llm-sessions/metadata.json
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
{
"title": "Replaying LLM Sessions for Iterative AI Agent Improvement",
"title1": "Replaying LLM Sessions for Iterative AI Agent Improvement",
"title2": "Replaying LLM Sessions for Iterative AI Agent Improvement",
"description": "Learn how to enhance your AI agents by replaying and modifying LLM sessions using Helicone. Apply changes directly to real user interactions to gain authentic context, reveal hidden effects, and accelerate iteration.",
"title": "Optimizing AI Agents: How Replaying LLM Sessions Enhances Performance",
"title1": "Optimizing AI Agents: How Replaying LLM Sessions Enhances Performance",
"title2": "Optimizing AI Agents: How Replaying LLM Sessions Enhances Performance",
"description": "Learn how to optimize your AI agents by replaying LLM sessions using Helicone. Enhance performance, uncover hidden issues, and accelerate AI agent development with this comprehensive guide.",
"images": "/static/blog/replaying-llm-sessions/sessions.webp",
"time": "15 minute read",
"author": "Cole Gottdank",
51 changes: 30 additions & 21 deletions bifrost/app/blog/blogs/replaying-llm-sessions/src.mdx
@@ -1,35 +1,42 @@
![sessions](/static/blog/replaying-llm-sessions/sessions.webp)
Experimenting with prompts in isolation **limits your understanding**. To truly grasp how a prompt change impacts an entire session, you need to **apply changes directly to real user interactions**. **<span style={{color: '#0ea5e9'}}>Replaying LLM sessions with Helicone unlocks this capability</span>**, providing insights unattainable through isolated testing.
![Optimizing AI Agents](/static/blog/replaying-llm-sessions/sessions.webp)

**Why is this powerful?**
Are you looking to **<span style={{color: '#0ea5e9'}}>optimize your AI agents</span>** and enhance their performance? Understanding how changes impact your AI agents in real-world interactions is crucial. By **<span style={{color: '#0ea5e9'}}>replaying LLM sessions</span>** with Helicone, you can directly apply modifications to actual AI agent sessions, providing valuable insights that traditional isolated testing may miss.

- **Authentic Context**: By leveraging actual production data, you see how changes affect real user experiences.
- **Unveiling Hidden Effects**: Discover unintended consequences that only emerge over full sessions.
- **Accelerated Iteration**: Automate testing with real inputs, streamlining your optimization process.
**Why Replay LLM Sessions for AI Agents?**

**<span style={{color: '#0ea5e9'}}>Helicone empowers you to replay any complex session</span>**—a capability no other platform offers. Due to our adaptability, more mature product teams often build bespoke solutions atop Helicone to store, aggregate, and analyze their AI workflows, enhancing performance with genuine user data without reinventing the wheel.
- **Deep Insights into Agent Behavior**: See how your AI agents perform in authentic scenarios using production data.
- **Uncover Hidden Issues**: Identify and address problems that only arise during full session interactions.
- **Accelerate Development**: Streamline your AI agent development process by testing changes efficiently.

In this guide, we'll **<span style={{color: '#0ea5e9'}}>demonstrate how to leverage Helicone to replay LLM sessions</span>**. You'll learn how to set up an initial session, query session data, and replay sessions with modifications. We'll also share tips on customizing this approach for your unique needs.
In this guide, we'll show you **<span style={{color: '#0ea5e9'}}>how to optimize your AI agents by replaying LLM sessions with Helicone</span>**, providing step-by-step instructions and best practices.

---

## Overview of the Replay Process With Helicone
## What is an AI Agent?

The process of replaying LLM sessions with Helicone involves three main steps:
An **<span style={{color: '#0ea5e9'}}>AI agent</span>** is a software entity that performs tasks on behalf of users with some degree of autonomy, using AI techniques. Optimizing these agents ensures they deliver accurate, efficient, and reliable outcomes.

1. **<span style={{color: '#0ea5e9'}}>Setting Up the Initial Session</span>**: Instrument your LLM calls to include Helicone session metadata so that they can be tracked and logged.
2. **<span style={{color: '#0ea5e9'}}>Querying Helicone for Session Data</span>**: Use Helicone's API to retrieve the logs of past sessions that you want to replay.
3. **<span style={{color: '#0ea5e9'}}>Replaying the Session with Modifications</span>**: Programmatically modify the retrieved session data as needed and send requests to the LLM to observe the effects.
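The heart of step 3 can be sketched as a small pure transform: sort the logged requests chronologically, then apply your modification to each. The field name `request_created_at` here is an assumption about the query response shape; adjust it to the payload you actually receive.

```javascript
// Sketch of step 3's core: order the logged requests chronologically,
// then apply a caller-supplied modification to each one.
// `request_created_at` is an assumed field name from the query response.
function prepareReplay(requests, modify) {
  return [...requests]
    .sort(
      (a, b) => new Date(a.request_created_at) - new Date(b.request_created_at)
    )
    .map(modify);
}
```

Keeping this step pure makes it easy to unit-test your modifications before resending anything to the LLM.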
---

## Why Optimize AI Agents by Replaying LLM Sessions?

Let's explore each of these steps in detail by following an example.
Replaying LLM sessions allows you to:

## Example: AI Debate Application
- **Test Modifications Safely**: Experiment with changes without affecting live users.
- **Understand Contextual Performance**: See how adjustments impact the agent's behavior over entire sessions.
- **Improve User Experience**: Deliver more accurate and helpful interactions to users.

---

## Step-by-Step Guide to Enhancing AI Agent Performance

### Example Application: AI Debate

We'll walk through an example of a debate session between a user and an assistant. After each argument, an impartial assistant scores it from 1 to 10.

### Step 1: Setting Up the Initial Session
### Step 1: Setting Up Your AI Agent with Helicone

Before you can replay sessions, you need to log them properly in Helicone. By adding **<span style={{color: '#0ea5e9'}}>only 3 headers</span>** to your LLM API requests, you can tag and group them into sessions.
Instrument your AI agent’s LLM calls to include Helicone session metadata for tracking and logging.
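As a minimal sketch, the three session headers can be grouped in a small helper. The header names come from Helicone's Sessions feature; the helper itself is only illustrative.

```javascript
// Sketch: the three Helicone session headers as a reusable helper.
// The header names are Helicone's; the helper function is illustrative.
function sessionHeaders(sessionId, sessionName, sessionPath) {
  return {
    "Helicone-Session-Id": sessionId,
    "Helicone-Session-Name": sessionName,
    "Helicone-Session-Path": sessionPath,
  };
}
```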

#### Instrumenting Your LLM Calls

@@ -146,7 +153,9 @@ _Go fullscreen for the best experience._

_Read more about how to implement Helicone sessions [here](https://docs.helicone.ai/features/sessions)._

### Step 2: Querying the Session Data from Helicone
### Step 2: Retrieving Session Data

Use Helicone's API to fetch session data for analysis.
```javascript
const response = await fetch("https://api.helicone.ai/v1/request/query", {
@@ -170,9 +179,9 @@ const data = await response.json();
Read more about Helicone's API [here](https://docs.helicone.ai/rest/request/post-v1requestquery).

### Step 3: Processing and Modifying the Session Data
### Step 3: Replaying and Modifying Sessions

Now that you have the session data, you'll need to process it.
Modify session data to test improvements.

1. **Parse and sort the requests**

@@ -294,7 +303,7 @@ _Alternatively, as described above, you can manually modify the prompts after re
### Conclusion
By replaying and modifying LLM sessions with Helicone, you gain deeper insights into how changes affect the entire workflow. This method provides context-rich, real-world data that leads to more effective optimizations and a comprehensive understanding of your AI's behavior.
By focusing on **replaying LLM sessions**, you can significantly **enhance the performance of your AI agents**. Helicone provides the tools necessary to make this process efficient and effective, leading to better user experiences and more robust AI applications.
---
3 changes: 2 additions & 1 deletion docs/mint.json
@@ -256,7 +256,8 @@
"use-cases/experiments",
"use-cases/enable-stream-usage",
"use-cases/resell-a-model",
"use-cases/bill-by-usage"
"use-cases/bill-by-usage",
"use-cases/replay-session"
]
},
{
282 changes: 282 additions & 0 deletions docs/use-cases/replay-session.mdx
@@ -0,0 +1,282 @@
---
title: "Replaying LLM Sessions"
sidebarTitle: "Replay Sessions"
description: "Learn how to replay and modify LLM sessions using Helicone to optimize your AI agents and improve their performance."
"twitter:title": "Replaying LLM Sessions - Helicone OSS LLM Observability"
---

import QuestionsSection from "/snippets/questions-section.mdx";

Understanding how changes impact your AI agents in real-world interactions is crucial. By **replaying LLM sessions** with Helicone, you can apply modifications to actual AI agent sessions, providing valuable insights that traditional isolated testing may miss.

## Use Cases

- **Optimize AI Agents**: Enhance agent performance by testing modifications on real session data.
- **Debug Complex Interactions**: Identify issues that only arise during full session interactions.
- **Accelerate Development**: Streamline your AI agent development process by efficiently testing changes.

<Steps>
<Step title="Record Sessions with Helicone Metadata">

Instrument your AI agent’s LLM calls to include Helicone session metadata for tracking and logging.

**Example: Setting Up Session Metadata**

````javascript Setting Up Session Metadata
const { Configuration, OpenAIApi } = require("openai");
const { randomUUID } = require("crypto");

// Generate unique session identifiers
const sessionId = randomUUID();
const sessionName = "AI Debate";
const sessionPath = "/debate/climate-change";

// Initialize OpenAI client with Helicone baseURL and auth header
const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
  basePath: "https://oai.helicone.ai/v1",
  baseOptions: {
    headers: {
      "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
    },
  },
});
const openai = new OpenAIApi(configuration);
````

**Include the Helicone session headers in your requests:**

````javascript Including Helicone Session Headers
const completionParams = {
  model: "gpt-3.5-turbo",
  messages: conversation,
};

const response = await openai.createChatCompletion(completionParams, {
  headers: {
    "Helicone-Session-Id": sessionId,
    "Helicone-Session-Name": sessionName,
    "Helicone-Session-Path": sessionPath,
    "Helicone-Prompt-Id": "assistant-response",
  },
});
````

**Initialize the conversation with the assistant:**

````javascript Initializing Conversation
const topic = "The impact of climate change on global economies";

const conversation = [
  {
    role: "system",
    content:
      "You're an AI debate assistant. Engage with the user by presenting arguments for or against the topic. Keep responses concise and insightful.",
  },
  {
    role: "assistant",
    content: `Welcome to our debate! Today's topic is: "${topic}". I will argue in favor, and you will argue against. Please present your opening argument.`,
  },
];
````

**Loop through the debate turns:**

````javascript Looping Through Debate Turns
const MAX_TURNS = 3;
let turn = 1;

// Simulated user arguments, declared once so each call to
// getUserArgument() advances to the next one
const userArguments = [
  "I believe climate change is a natural cycle and not significantly influenced by human activities.",
  "Economic resources should focus on immediate human needs rather than combating climate change.",
  "Strict environmental regulations can hinder economic growth and affect employment rates.",
];

while (turn <= MAX_TURNS) {
  // Get user's argument (simulate user input)
  const userArgument = await getUserArgument();
  conversation.push({ role: "user", content: userArgument });

  // Assistant responds with a counter-argument
  const assistantResponse = await generateAssistantResponse(
    conversation,
    sessionId,
    sessionName,
    sessionPath
  );
  conversation.push(assistantResponse);

  turn++;
}

// Function to simulate user input
async function getUserArgument() {
  // Return the next argument in order
  return userArguments.shift();
}

// Function to generate assistant's response
async function generateAssistantResponse(
  conversation,
  sessionId,
  sessionName,
  sessionPath
) {
  const completionParams = {
    model: "gpt-3.5-turbo",
    messages: conversation,
  };

  const response = await openai.createChatCompletion(completionParams, {
    headers: {
      "Helicone-Session-Id": sessionId,
      "Helicone-Session-Name": sessionName,
      "Helicone-Session-Path": sessionPath,
      "Helicone-Prompt-Id": "assistant-response",
    },
  });

  const assistantMessage = response.data.choices[0].message;
  return assistantMessage;
}
````

**After setting up and running your session, you can view it in the Helicone dashboard:**

<Frame>
<video width="100%" controls>
<source
src="https://marketing-assets-helicone.s3.us-west-2.amazonaws.com/session_debate.mp4"
type="video/mp4"
/>
Your browser does not support the video tag.
</video>
</Frame>

*Go fullscreen for the best experience.*

</Step>

<Step title="Retrieve Session Data">

Use Helicone's [Request API](/rest/request/post-v1requestquery) to fetch session data.

**Example: Querying Session Data**
````bash Querying Session Data
curl --request POST \
  --url https://api.helicone.ai/v1/request/query \
  --header 'Content-Type: application/json' \
  --header 'authorization: Bearer sk-<your-helicone-api-key>' \
  --data '{
    "limit": 100,
    "offset": 0,
    "sort_by": {
      "key": "request_created_at",
      "direction": "asc"
    },
    "filter": {
      "properties": {
        "Helicone-Session-Id": {
          "equals": "<session-id>"
        }
      }
    }
  }'
````
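The same query body can also be built programmatically. This sketch mirrors the JSON in the curl example above; only the helper function name is ours.

```javascript
// Sketch: build the request-query body for one session id,
// mirroring the JSON in the curl example above.
function sessionQuery(sessionId, limit = 100) {
  return {
    limit,
    offset: 0,
    sort_by: { key: "request_created_at", direction: "asc" },
    filter: {
      properties: {
        "Helicone-Session-Id": { equals: sessionId },
      },
    },
  };
}
```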
</Step>
<Step title="Modify and Replay the Session">
Retrieve the original requests, apply modifications, and resend them to observe the impact.

**Example: Modifying Requests and Replaying**
````javascript Modifying Requests and Replaying
const fetch = require("node-fetch");
const { randomUUID } = require("crypto");

const HELICONE_API_KEY = process.env.HELICONE_API_KEY;
const OPENAI_API_KEY = process.env.OPENAI_API_KEY;
const REPLAY_SESSION_ID = randomUUID();

async function replaySession(requests) {
  for (const request of requests) {
    const modifiedRequest = modifyRequestBody(request);
    await sendRequest(modifiedRequest);
  }
}

function modifyRequestBody(request) {
  // Implement modifications to the request body as needed
  // For example, enhancing the system prompt for better responses
  if (request.prompt_id === "assistant-response") {
    const systemMessage = request.body.messages.find(
      (msg) => msg.role === "system"
    );
    if (systemMessage) {
      systemMessage.content +=
        " Take the persona of a field expert and provide more persuasive arguments.";
    }
  }
  return request;
}

async function sendRequest(modifiedRequest) {
  const { body, request_path, path, prompt_id } = modifiedRequest;

  const response = await fetch(request_path, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${OPENAI_API_KEY}`,
      "Helicone-Auth": `Bearer ${HELICONE_API_KEY}`,
      "Helicone-Session-Id": REPLAY_SESSION_ID,
      "Helicone-Session-Name": "Replayed Session",
      "Helicone-Session-Path": path,
      "Helicone-Prompt-Id": prompt_id,
    },
    body: JSON.stringify(body),
  });

  const data = await response.json();
  // Handle the response as needed
}
````
**Note:** In the `modifyRequestBody` function, we're enhancing the assistant's system prompt to make the responses more persuasive by taking the persona of a field expert.
</Step>
<Step title="Analyze the Replayed Session">
After replaying, use Helicone's dashboard to compare the original and modified sessions to evaluate improvements.

<Frame>
<video width="100%" controls>
<source
src="https://marketing-assets-helicone.s3.us-west-2.amazonaws.com/session_debate_replay.mp4"
type="video/mp4"
/>
Your browser does not support the video tag.
</video>
</Frame>

*Go fullscreen for the best experience.*

</Step>
</Steps>

## Additional Tips

- **Version Your Prompts**: Use Helicone's [Prompt Versioning](/features/prompts) to manage different prompt versions and see which yields the best results.
- **Use Evaluations**: Utilize Helicone's [Evaluation Features](/features/evaluation) to score and compare responses.

## Conclusion

By replaying LLM sessions with Helicone, you can effectively **optimize your AI agents**, leading to improved performance and better user experiences.

<QuestionsSection />
