feat: Streaming azure openai #244

Open · wants to merge 55 commits into base: main
Changes shown below are from 27 of the 55 commits.

Commits (55)
ea0c687
debug code
ZhongpinWang Oct 21, 2024
5d0985f
Make streaming work
ZhongpinWang Oct 22, 2024
a74cf6b
fix: remove await
ZhongpinWang Oct 22, 2024
9025451
fix: await again
ZhongpinWang Oct 22, 2024
fc12de0
small changes
ZhongpinWang Oct 23, 2024
d5d38bd
chore: add missing javadoc
ZhongpinWang Oct 23, 2024
1f466a7
Merge branch 'feat-streaming-azure-openai-playground' into feat-strea…
ZhongpinWang Oct 23, 2024
07dda35
wip
ZhongpinWang Oct 23, 2024
c218211
feat: pipe streams
ZhongpinWang Oct 23, 2024
4ebd37d
feat: wrap chunk to see usage and finish reason
ZhongpinWang Oct 23, 2024
02d1939
refactor: pipe streams
ZhongpinWang Oct 24, 2024
7a00fc3
refactor
ZhongpinWang Oct 24, 2024
dd02651
refactor: change streamString to streamContent
ZhongpinWang Oct 24, 2024
50142e2
fix: lint
ZhongpinWang Oct 24, 2024
c8611d8
refactor
ZhongpinWang Oct 25, 2024
7386dc5
feat: demo streaming in sample-code
ZhongpinWang Oct 25, 2024
a7ec23a
fix: end res in sample code when finish
ZhongpinWang Oct 25, 2024
2a1d3be
Merge branch 'main' into feat-streaming-azure-openai
ZhongpinWang Oct 28, 2024
bc03fed
fix: lint
ZhongpinWang Oct 28, 2024
c399f09
refactor
ZhongpinWang Oct 28, 2024
b3f4e71
fix: check public-api
ZhongpinWang Oct 28, 2024
fa91209
chore: add tests for stream chunk response
ZhongpinWang Oct 28, 2024
56e6197
fix: Changes from lint
Oct 28, 2024
6297626
fix: chunk type inference
ZhongpinWang Oct 29, 2024
f22bed7
refactor: change some types
ZhongpinWang Oct 30, 2024
1348b97
wip
ZhongpinWang Oct 30, 2024
8086b70
fix: internal.js.map issue
ZhongpinWang Oct 30, 2024
40ad3d2
chore: add tests for chat completion stream
ZhongpinWang Oct 30, 2024
dcb6d54
refactor: move stream files
ZhongpinWang Oct 30, 2024
4bde96b
fix: remove duplicated file
ZhongpinWang Oct 30, 2024
3d5554c
refactor: rename stream
ZhongpinWang Oct 30, 2024
0b79c66
refactor: openai stream
ZhongpinWang Oct 30, 2024
7104fc5
chore: add tests for sse-stream (copied from openai)
ZhongpinWang Oct 30, 2024
2c5247a
refactor: rename test responses
ZhongpinWang Nov 4, 2024
3ff4c9e
Merge branch 'main' into feat-streaming-azure-openai
ZhongpinWang Nov 4, 2024
6570bd2
refactor: replace streamContent with a method
ZhongpinWang Nov 11, 2024
9187988
feat: support multiple choices
ZhongpinWang Nov 11, 2024
0bd1c92
fix: Changes from lint
Nov 11, 2024
0510c2a
fix: add abortcontroler and fix sample code
ZhongpinWang Nov 11, 2024
2bf0e7e
fix: add controller signal to axios
ZhongpinWang Nov 11, 2024
050d0db
fix: Changes from lint
Nov 11, 2024
1399a91
chore: add unit test for stream()
ZhongpinWang Nov 11, 2024
ad65518
fix: Changes from lint
Nov 11, 2024
2a940b1
fix: stream finish reason index 0
ZhongpinWang Nov 11, 2024
8bc6364
lint
ZhongpinWang Nov 11, 2024
841d452
fix: type test
ZhongpinWang Nov 11, 2024
39675b5
fix: make toContentStream return AzureOpenAiChatCompletionStream
ZhongpinWang Nov 11, 2024
658d1bc
fix: lint
ZhongpinWang Nov 11, 2024
d5d817a
Merge branch 'main' into feat-streaming-azure-openai
ZhongpinWang Nov 11, 2024
df6ee3f
feat: throw if sse payload invalid
ZhongpinWang Nov 11, 2024
d3ba1d8
fix: Changes from lint
Nov 11, 2024
819692f
refactor: interface
ZhongpinWang Nov 12, 2024
a06cd03
refactor
ZhongpinWang Nov 12, 2024
6a6e403
Merge branch 'main' into feat-streaming-azure-openai
ZhongpinWang Nov 12, 2024
862ff0f
chore: add changeset
ZhongpinWang Nov 12, 2024
azure-openai-chat-client.test.ts
@@ -7,7 +7,7 @@ import {
} from '../../../../test-util/mock-http.js';
import { AzureOpenAiChatClient } from './azure-openai-chat-client.js';
import { apiVersion } from './model-types.js';
-import type { AzureOpenAiCreateChatCompletionResponse } from './client/inference/schema';
+import type { AzureOpenAiCreateChatCompletionResponse } from './client/inference/schema/index.js';

describe('Azure OpenAI chat client', () => {
const chatCompletionEndpoint = {
azure-openai-chat-client.ts
@@ -6,6 +6,10 @@ import {
} from '@sap-ai-sdk/ai-api/internal.js';
import { apiVersion, type AzureOpenAiChatModel } from './model-types.js';
import { AzureOpenAiChatCompletionResponse } from './azure-openai-chat-completion-response.js';
import { AzureOpenAiChatCompletionStreamResponse } from './azure-openai-chat-completion-stream-response.js';
import { AzureOpenAiChatCompletionStream } from './azure-openai-chat-completion-stream.js';
import type { AzureOpenAiChatCompletionStreamChunkResponse } from './azure-openai-chat-completion-stream-chunk-response.js';
import type { HttpResponse } from '@sap-cloud-sdk/http-client';
import type { AzureOpenAiCreateChatCompletionRequest } from './client/inference/schema/index.js';

/**
@@ -28,12 +32,60 @@ export class AzureOpenAiChatClient {
data: AzureOpenAiCreateChatCompletionRequest,
requestConfig?: CustomRequestConfig
): Promise<AzureOpenAiChatCompletionResponse> {
const response = await this.executeRequest(data, requestConfig);
return new AzureOpenAiChatCompletionResponse(response);
}

/**
* Creates a completion stream for the chat messages.
* @param data - The input parameters for the chat completion.
* @param requestConfig - The request configuration.
* @returns The completion stream.
*/
async stream(
data: AzureOpenAiCreateChatCompletionRequest,
requestConfig?: CustomRequestConfig
): Promise<
AzureOpenAiChatCompletionStreamResponse<AzureOpenAiChatCompletionStreamChunkResponse>
> {
const response =
new AzureOpenAiChatCompletionStreamResponse<AzureOpenAiChatCompletionStreamChunkResponse>();
response.stream = (await this.createStream(data, requestConfig))
.pipe(AzureOpenAiChatCompletionStream.processChunk)
.pipe(AzureOpenAiChatCompletionStream.processFinishReason, response)
.pipe(AzureOpenAiChatCompletionStream.processTokenUsage, response);
return response;
}

/**
* Creates a completion stream of the delta content for the chat messages.
* @param data - The input parameters for the chat completion.
* @param requestConfig - The request configuration.
* @returns The completion stream of the delta content.
*/
async streamContent(
data: AzureOpenAiCreateChatCompletionRequest,
requestConfig?: CustomRequestConfig
): Promise<AzureOpenAiChatCompletionStreamResponse<string>> {
const response = new AzureOpenAiChatCompletionStreamResponse<string>();
response.stream = (await this.createStream(data, requestConfig))
.pipe(AzureOpenAiChatCompletionStream.processChunk)
.pipe(AzureOpenAiChatCompletionStream.processFinishReason, response)
.pipe(AzureOpenAiChatCompletionStream.processTokenUsage, response)
.pipe(AzureOpenAiChatCompletionStream.processContent, response);
return response;
}

private async executeRequest(
data: AzureOpenAiCreateChatCompletionRequest,
requestConfig?: CustomRequestConfig
): Promise<HttpResponse> {
const deploymentId = await getDeploymentId(
this.modelDeployment,
'azure-openai'
);
const resourceGroup = getResourceGroup(this.modelDeployment);
-const response = await executeRequest(
+return executeRequest(
{
url: `/inference/deployments/${deploymentId}/chat/completions`,
apiVersion,
@@ -42,6 +94,25 @@
data,
requestConfig
);
-return new AzureOpenAiChatCompletionResponse(response);
}

private async createStream(
data: AzureOpenAiCreateChatCompletionRequest,
requestConfig?: CustomRequestConfig
): Promise<AzureOpenAiChatCompletionStream<any>> {
const response = await this.executeRequest(
{
...data,
stream: true,
stream_options: {
include_usage: true
}
},
{
...requestConfig,
responseType: 'stream'
}
);
return AzureOpenAiChatCompletionStream.create(response);
}
}
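
For orientation, here is a minimal consumption sketch of the two public entry points. It assumes the client is exported from `@sap-ai-sdk/foundation-models`, that the constructor accepts a model name (`gpt-35-turbo` is illustrative), and that the returned stream is async-iterable, as the tests further below exercise:

```ts
import { AzureOpenAiChatClient } from '@sap-ai-sdk/foundation-models';

async function main(): Promise<void> {
  const client = new AzureOpenAiChatClient('gpt-35-turbo'); // illustrative model

  // streamContent() yields plain string deltas.
  const response = await client.streamContent({
    messages: [{ role: 'user', content: 'Tell me a short story.' }]
  });

  for await (const delta of response.stream) {
    process.stdout.write(delta);
  }

  // Populated by the processFinishReason and processTokenUsage pipes
  // once the stream has been fully consumed.
  console.log(`\nFinish reason: ${response.finishReason}`);
  console.log(`Token usage: ${JSON.stringify(response.usage)}`);
}

main().catch(console.error);
```

`stream()` works the same way but yields chunk responses instead of strings, so each iteration can inspect `getDeltaContent()`, `getFinishReason()`, and `getTokenUsage()` per chunk.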
azure-openai-chat-completion-stream-chunk-response.test.ts
@@ -0,0 +1,77 @@
import { parseMockResponse } from '../../../../test-util/mock-http.js';
import { AzureOpenAiChatCompletionStreamChunkResponse } from './azure-openai-chat-completion-stream-chunk-response.js';

describe('OpenAI chat completion stream chunk response', () => {
  let mockResponses: {
    tokenUsage: any;
    finishReason: any;
    deltaContent: any;
  };
  let azureOpenAiChatCompletionStreamChunkResponses: {
    tokenUsage: AzureOpenAiChatCompletionStreamChunkResponse;
    finishReason: AzureOpenAiChatCompletionStreamChunkResponse;
    deltaContent: AzureOpenAiChatCompletionStreamChunkResponse;
  };

  beforeAll(async () => {
    mockResponses = {
      tokenUsage: await parseMockResponse<any>(
        'foundation-models',
        'azure-openai-chat-completion-stream-chunk-response-token-usage.json'
      ),
      finishReason: await parseMockResponse<any>(
        'foundation-models',
        'azure-openai-chat-completion-stream-chunk-response-finish-reason.json'
      ),
      deltaContent: await parseMockResponse<any>(
        'foundation-models',
        'azure-openai-chat-completion-stream-chunk-response-delta-content.json'
      )
    };
    azureOpenAiChatCompletionStreamChunkResponses = {
      tokenUsage: new AzureOpenAiChatCompletionStreamChunkResponse(
        mockResponses.tokenUsage
      ),
      finishReason: new AzureOpenAiChatCompletionStreamChunkResponse(
        mockResponses.finishReason
      ),
      deltaContent: new AzureOpenAiChatCompletionStreamChunkResponse(
        mockResponses.deltaContent
      )
    };
  });

  it('should return the chat completion stream chunk response', () => {
    expect(
      azureOpenAiChatCompletionStreamChunkResponses.tokenUsage.data
    ).toStrictEqual(mockResponses.tokenUsage);
    expect(
      azureOpenAiChatCompletionStreamChunkResponses.finishReason.data
    ).toStrictEqual(mockResponses.finishReason);
    expect(
      azureOpenAiChatCompletionStreamChunkResponses.deltaContent.data
    ).toStrictEqual(mockResponses.deltaContent);
  });

  it('should get token usage', () => {
    expect(
      azureOpenAiChatCompletionStreamChunkResponses.tokenUsage.getTokenUsage()
    ).toMatchObject({
      completion_tokens: expect.any(Number),
      prompt_tokens: expect.any(Number),
      total_tokens: expect.any(Number)
    });
  });

  it('should return finish reason', () => {
    expect(
      azureOpenAiChatCompletionStreamChunkResponses.finishReason.getFinishReason()
    ).toBe('stop');
  });

  it('should return delta content with default index 0', () => {
    expect(
      azureOpenAiChatCompletionStreamChunkResponses.deltaContent.getDeltaContent()
    ).toBe(' is');
  });
});
azure-openai-chat-completion-stream-chunk-response.ts
@@ -0,0 +1,37 @@
/**
 * Azure OpenAI chat completion stream chunk response.
 */
export class AzureOpenAiChatCompletionStreamChunkResponse {
  // TODO: Change `any` to `CreateChatCompletionStreamResponse` once the preview spec becomes stable.
  constructor(public readonly data: any) {}

  /**
   * Usage of tokens in the chunk response.
   * @returns Token usage.
   */
  getTokenUsage(): this['data']['usage'] {
    return this.data.usage;
  }

  /**
   * Reason why the completion stream stopped for the given choice.
   * @param choiceIndex - The index of the choice to parse.
   * @returns The finish reason.
   */
  getFinishReason(
    choiceIndex = 0
  ): this['data']['choices'][0]['finish_reason'] {
    return this.data.choices[choiceIndex]?.finish_reason;
  }

  /**
   * Parses the chunk response and returns the delta content.
   * @param choiceIndex - The index of the choice to parse.
   * @returns The message delta content.
   */
  getDeltaContent(choiceIndex = 0): string | undefined | null {
    return this.data.choices[choiceIndex]?.delta?.content;
  }
}
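
A small sketch of the wrapper in action, using a fabricated chunk payload shaped like an Azure OpenAI SSE delta event (all field values are made up for illustration):

```ts
import { AzureOpenAiChatCompletionStreamChunkResponse } from './azure-openai-chat-completion-stream-chunk-response.js';

// Fabricated mid-stream chunk: carries a content delta, but no
// finish reason or usage yet.
const chunk = new AzureOpenAiChatCompletionStreamChunkResponse({
  choices: [{ index: 0, delta: { content: ' is' }, finish_reason: null }],
  usage: null
});

chunk.getDeltaContent(); // ' is'
chunk.getFinishReason(); // null; only the final content chunk carries 'stop'
chunk.getTokenUsage();   // null; usage arrives in the last chunk because
                         // createStream() sets stream_options.include_usage
```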
azure-openai-chat-completion-stream-response.ts
@@ -0,0 +1,44 @@
import type { AzureOpenAiCompletionUsage } from './client/inference/schema/index.js';
import type { AzureOpenAiChatCompletionStream } from './azure-openai-chat-completion-stream.js';

/**
 * Azure OpenAI chat completion stream response.
 */
export class AzureOpenAiChatCompletionStreamResponse<T> {
  private _usage: AzureOpenAiCompletionUsage | undefined;
  private _finishReason: string | undefined;
  private _stream: AzureOpenAiChatCompletionStream<T> | undefined;

  public get usage(): AzureOpenAiCompletionUsage {
    if (!this._usage) {
      throw new Error('Response usage is undefined.');
    }
    return this._usage;
  }

  public set usage(usage: AzureOpenAiCompletionUsage) {
    this._usage = usage;
  }

  public get finishReason(): string {
    if (!this._finishReason) {
      throw new Error('Response finish reason is undefined.');
    }
    return this._finishReason;
  }

  public set finishReason(finishReason: string) {
    this._finishReason = finishReason;
  }

  public get stream(): AzureOpenAiChatCompletionStream<T> {
    if (!this._stream) {
      throw new Error('Response stream is undefined.');
    }
    return this._stream;
  }

  public set stream(stream: AzureOpenAiChatCompletionStream<T>) {
    this._stream = stream;
  }
}
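
The throwing getters make the aggregation contract explicit: `usage` and `finishReason` are only set by the stream pipes after the relevant chunks have been consumed, so early access fails loudly rather than silently returning `undefined`. A quick sketch, assuming the generated `AzureOpenAiCompletionUsage` type carries the token fields asserted in the tests above:

```ts
const response = new AzureOpenAiChatCompletionStreamResponse<string>();

try {
  response.usage; // throws: the processTokenUsage pipe has not run yet
} catch (e) {
  console.error((e as Error).message);
}

// Normally done by the processTokenUsage pipe, not by user code.
response.usage = { completion_tokens: 7, prompt_tokens: 5, total_tokens: 12 };
console.log(response.usage.total_tokens); // 12
```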
azure-openai-chat-completion-stream.test.ts
@@ -0,0 +1,39 @@
import { parseFileToString } from '../../../../test-util/mock-http.js';
import { AzureOpenAiChatCompletionStream } from './azure-openai-chat-completion-stream.js';
import { LineDecoder } from './azure-openai-line-decoder.js';
import { SSEDecoder } from './azure-openai-sse-decoder.js';

describe('OpenAI chat completion stream', () => {
  let sseChunks: any[];
  let originalChatCompletionStream: AzureOpenAiChatCompletionStream<any>;

  beforeEach(async () => {
    const rawChunksString = await parseFileToString(
      'foundation-models',
      'azure-openai-chat-completion-stream-chunks.txt'
    );
    const lineDecoder = new LineDecoder();
    const sseDecoder = new SSEDecoder();
    const rawLines: string[] = lineDecoder.decode(
      Buffer.from(rawChunksString, 'utf-8')
    );

    sseChunks = rawLines
      .map((chunk) => sseDecoder.decode(chunk))
      .filter((sse) => sse !== null)
      .filter((sse) => !sse.data.startsWith('[DONE]'))
      .map((sse) => JSON.parse(sse.data));

    async function* iterator(): AsyncGenerator<any> {
      for (const sseChunk of sseChunks) {
        yield sseChunk;
      }
    }
    originalChatCompletionStream = new AzureOpenAiChatCompletionStream(
      iterator
    );
  });

  it('should wrap the raw chunk', async () => {
    for await (const chunk of AzureOpenAiChatCompletionStream.processChunk(
      originalChatCompletionStream
    )) {
      expect(chunk).toBeDefined();
      console.log(chunk.getDeltaContent());
    }
  });
});
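
For context, the `azure-openai-chat-completion-stream-chunks.txt` fixture presumably holds standard OpenAI-style server-sent events: `data:`-prefixed JSON chunks terminated by a `data: [DONE]` sentinel, which the test filters out above. An illustrative (made-up) excerpt:

```
data: {"choices":[{"index":0,"delta":{"content":" is"},"finish_reason":null}],"usage":null}

data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":null}

data: {"choices":[],"usage":{"prompt_tokens":5,"completion_tokens":7,"total_tokens":12}}

data: [DONE]
```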