Sweep(slow): count tokens on the server side #17
Here's the PR! #20.

⚡ Sweep Basic Tier: I'm using GPT-4. You have 4 GPT-4 tickets left for the month and 3 for the day. (tracking ID: 1bfb7247a1) For more GPT-4 tickets, visit our payment portal. For a one-week free trial, try Sweep Pro (unlimited GPT-4 tickets).

Actions (click)

Sandbox Execution ✓
Here are the sandbox execution logs prior to making any changes: Sandbox logs for
```typescript
import { Index, Match, Show, Switch, batch, createEffect, createSignal, onMount } from 'solid-js'
import { Toaster, toast } from 'solid-toast'
import { useThrottleFn } from 'solidjs-use'
import { generateSignature } from '@/utils/auth'
import { fetchModeration, fetchTitle } from '@/utils/misc'
import { audioChunks, getAudioBlob, startRecording, stopRecording } from '@/utils/record'
import { countTokens } from '@/utils/tiktoken'
import { MessagesEvent } from '@/utils/events'
import IconClear from './icons/Clear'
import MessageItem from './MessageItem'
import SystemRoleSettings from './SystemRoleSettings'
import ErrorMessageItem from './ErrorMessageItem'
import TokenCounter, { encoder } from './TokenCounter'
import type { ChatMessage, ErrorMessage } from '@/types'
import type { Setter } from 'solid-js'

export const minMessages = Number(import.meta.env.PUBLIC_MIN_MESSAGES ?? 3)
export const maxTokens = Number(import.meta.env.PUBLIC_MAX_TOKENS ?? 3000)
```
free-chat/src/pages/api/generate.ts
Lines 1 to 15 in 117c9ef
```typescript
// #vercel-disable-blocks
import { ProxyAgent, fetch } from 'undici'
// #vercel-end
import { generatePayload, parseOpenAIStream } from '@/utils/openAI'
import { verifySignature } from '@/utils/auth'
import type { APIRoute } from 'astro'

const apiKey = import.meta.env.OPENAI_API_KEY
const httpsProxy = import.meta.env.HTTPS_PROXY
const baseUrl = ((import.meta.env.OPENAI_API_BASE_URL) || 'https://api.openai.com').trim().replace(/\/$/, '')
const sitePassword = import.meta.env.SITE_PASSWORD
const ua = import.meta.env.UNDICI_UA
const FORWARD_HEADERS = ['origin', 'referer', 'cookie', 'user-agent', 'via']
```
free-chat/src/components/Generator.tsx
Lines 217 to 260 in 117c9ef
```typescript
const storagePassword = localStorage.getItem('pass')
try {
  const controller = new AbortController()
  setController(controller)
  const requestMessageList = [...messageList()]
  let limit = maxTokens
  const systemMsg = currentSystemRoleSettings()
    ? {
        role: 'system',
        content: currentSystemRoleSettings(),
      } as ChatMessage
    : null
  systemMsg && (limit -= countTokens(encoder()!, [systemMsg])!.total)
  while (requestMessageList.length > minMessages && countTokens(encoder()!, requestMessageList)!.total > limit)
    requestMessageList.shift()
  systemMsg && requestMessageList.unshift(systemMsg)
  const timestamp = Date.now()
  const response = await fetch('/api/generate', {
    method: 'POST',
    body: JSON.stringify({
      model: localStorage.getItem('model') || 'gpt-3.5-turbo-1106',
      messages: requestMessageList,
      time: timestamp,
      pass: storagePassword,
      sign: await generateSignature({
        t: timestamp,
        m: requestMessageList?.[requestMessageList.length - 1]?.content || '',
      }),
    }),
    signal: controller.signal,
    headers: localStorage.getItem('apiKey') ? { authorization: `Bearer ${localStorage.getItem('apiKey')}` } : {},
  })
  if (!response.ok) {
    const error = await response.json()
    console.error(error.error)
    setCurrentError(error.error)
    throw new Error('Request failed')
  }
```
free-chat/src/utils/tiktoken.ts
Lines 1 to 38 in 117c9ef
```typescript
import type { ChatMessage } from '@/types'
import type { Tiktoken } from 'tiktoken'

const countTokensSingleMessage = (enc: Tiktoken, message: ChatMessage) => {
  return 4 + enc.encode(message.content).length // im_start, im_end, role/name, "\n"
}

export const countTokens = (enc: Tiktoken | null, messages: ChatMessage[]) => {
  if (messages.length === 0) return
  if (!enc) return { total: Infinity }
  const lastMsg = messages.at(-1)
  const context = messages.slice(0, -1)
  const countTokens: (message: ChatMessage) => number = countTokensSingleMessage.bind(null, enc)
  const countLastMsg = countTokens(lastMsg!)
  const countContext = context.map(countTokens).reduce((a, b) => a + b, 3) // im_start, "assistant", "\n"
  return { countContext, countLastMsg, total: countContext + countLastMsg }
}

const cl100k_base_json = import.meta.env.PUBLIC_CL100K_BASE_JSON_URL || '/cl100k_base.json'
const tiktoken_bg_wasm = import.meta.env.PUBLIC_TIKTOKEN_BG_WASM_URL || '/tiktoken_bg.wasm'

async function getBPE() {
  return fetch(cl100k_base_json).then(r => r.json())
}

export const initTikToken = async() => {
  const { init } = await import('tiktoken/lite/init')
  const [{ bpe_ranks, special_tokens, pat_str }, { Tiktoken }] = await Promise.all([
    getBPE().catch(console.error),
    import('tiktoken/lite/init'),
    fetch(tiktoken_bg_wasm).then(r => r.arrayBuffer()).then(wasm => init(imports => WebAssembly.instantiate(wasm, imports))),
  ])
  return new Tiktoken(bpe_ranks, special_tokens, pat_str)
}
```
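To make the per-message arithmetic above concrete, here is a small illustration of the same counting scheme using a stub encoder in place of Tiktoken (hypothetical: it counts one token per whitespace-separated word). It shows how the 4-token per-message framing and the 3-token reply priming combine into the totals that countTokens returns:

```typescript
// Stub standing in for a Tiktoken encoder (hypothetical: one token per word).
type ChatMessage = { role: string, content: string }
const enc = { encode: (s: string) => s.split(/\s+/).filter(Boolean) }

const countTokensSingleMessage = (message: ChatMessage) =>
  4 + enc.encode(message.content).length // im_start, im_end, role/name, "\n"

const messages: ChatMessage[] = [
  { role: 'user', content: 'hello there' }, // 4 + 2 = 6
  { role: 'assistant', content: 'hi' },     // 4 + 1 = 5
]
const lastMsg = messages[messages.length - 1]
const countLastMsg = countTokensSingleMessage(lastMsg)
const countContext = messages.slice(0, -1)
  .map(countTokensSingleMessage)
  .reduce((a, b) => a + b, 3) // im_start, "assistant", "\n" priming the reply

console.log(countContext, countLastMsg, countContext + countLastMsg) // 9 5 14
```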
Step 2: ⌨️ Coding
- Create src/utils/tiktoken-server.ts ✓ 09d7244
Create src/utils/tiktoken-server.ts with contents:
• Create a new utility file named `tiktoken-server.ts` in the `src/utils` directory for the server-side token counting logic.
• Use `tiktoken-js` instead of `tiktoken` as the server-side equivalent library.
• Define and export a function `countTokensServer` that implements the same logic as `countTokens` from `src/utils/tiktoken.ts`.
• Ensure the function interface matches that of the client-side `countTokens`, taking an encoder and a list of messages as arguments and returning an object with the total token count.
• Make sure to wrap any initializations that are not available on the server, such as fetching base configurations or initializing WebAssembly modules, in a server-compatible manner.
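Under those constraints, the new module could look like the sketch below. The `Encoder` interface is an assumption standing in for whatever encoder tiktoken-js provides; only the counting arithmetic is carried over from the existing src/utils/tiktoken.ts:

```typescript
// Sketch of src/utils/tiktoken-server.ts. Assumption: the tiktoken-js encoder
// exposes an encode(text) method returning something with a .length, mirroring
// the tiktoken interface used on the client.
interface Encoder { encode: (text: string) => ArrayLike<unknown> }
type ChatMessage = { role: string, content: string }

const countTokensSingleMessage = (enc: Encoder, message: ChatMessage) =>
  4 + enc.encode(message.content).length // im_start, im_end, role/name, "\n"

// Same contract as the client-side countTokens; export this from the module.
const countTokensServer = (enc: Encoder | null, messages: ChatMessage[]) => {
  if (messages.length === 0) return
  if (!enc) return { total: Infinity }
  const lastMsg = messages[messages.length - 1]
  const countLastMsg = countTokensSingleMessage(enc, lastMsg)
  const countContext = messages.slice(0, -1)
    .map(m => countTokensSingleMessage(enc, m))
    .reduce((a, b) => a + b, 3) // im_start, "assistant", "\n"
  return { countContext, countLastMsg, total: countContext + countLastMsg }
}
```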
- Running GitHub Actions for src/utils/tiktoken-server.ts ✓
Check src/utils/tiktoken-server.ts with contents:
Ran GitHub Actions for 09d72442ebe25ea72693afd406fe601d703d1b27:
• Vercel Preview Comments: ✓
- Modify src/pages/api/generate.ts ✓ 30a5ea4
Modify src/pages/api/generate.ts with contents:
• In the `post` method of the API route, import the `countTokensServer` function from `src/utils/tiktoken-server.ts`.
• After retrieving the request body, apply the token counting logic to trim the `messages` array, ensuring it remains under a defined token limit.
• Use the constants defined in `src/components/Generator.tsx` like `minMessages` and `maxTokens` to set the lower message limit and token count limit. These may need to be moved to a shared constants file if they are not already.
• Ensure that after implementing the logic, the trimmed `messages` are then passed on for the rest of the processing where the generation payload is created.
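The trimming step itself can live in a small pure helper so the API route and the client share one implementation. The following is a sketch under assumptions: `trimMessages` is a hypothetical name, and the counter callback stands in for `countTokensServer`; the loop mirrors the while loop in Generator.tsx:

```typescript
type ChatMessage = { role: string, content: string }
type Counter = (messages: ChatMessage[]) => { total: number } | undefined

// Hypothetical shared helper: drop the oldest messages until the list fits
// the token budget or shrinks to the minimum message count.
const trimMessages = (
  messages: ChatMessage[],
  count: Counter,
  limit: number,
  minMessages: number,
): ChatMessage[] => {
  const trimmed = [...messages]
  while (trimmed.length > minMessages && (count(trimmed)?.total ?? 0) > limit)
    trimmed.shift()
  return trimmed
}
```

In the route, the body's `messages` would be passed through this helper (with the system message budget subtracted first, as the client does) before being handed to `generatePayload`.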
- Running GitHub Actions for src/pages/api/generate.ts ✓
Check src/pages/api/generate.ts with contents:
Ran GitHub Actions for 30a5ea4d0bdc06e092563c96327c3e11eeb3cff2:
• Vercel Preview Comments: ✓
Step 3: 🔁 Code Review
I have finished reviewing the code for completeness. I did not find errors for sweep/server-side-token-counting_1.
🎉 Latest improvements to Sweep:
- Sweep uses OpenAI's latest Assistant API to plan code changes and modify code! This is 3x faster and significantly more reliable as it allows Sweep to edit code and validate the changes in tight iterations, the same way as a human would.
- Sweep now uses the rope library to refactor Python! Check out Large Language Models are Bad at Refactoring Code. To have Sweep refactor your code, try sweep: Refactor <your_file>.py!
💡 To recreate the pull request edit the issue title or description. To tweak the pull request, leave a comment on the pull request.
Details
In src/pages/api/generate.ts, add the same messages-trimming logic that src/components/Generator.tsx uses:
Note, however, that the tiktoken library cannot be used on the server side; only the tiktoken-js library can, and the two should have a similar interface.
Checklist
- Create src/utils/tiktoken-server.ts ✓ 09d7244
- Running GitHub Actions for src/utils/tiktoken-server.ts ✓
- Modify src/pages/api/generate.ts ✓ 30a5ea4
- Running GitHub Actions for src/pages/api/generate.ts ✓