This project converts the Vertex AI API to the OpenAI API format.
- Provides an OpenAI-compatible API
- Forwards requests to your Vertex AI API
- Easy configuration and setup
Before you begin, ensure you have:
- A Google Cloud Platform account with Vertex AI API enabled
- Service account credentials with necessary permissions
Configure the following environment variables (they can be set in a `.env` file):

- `ANTHROPIC_VERTEX_PROJECT_ID`: Your GCP project ID
- `CLOUD_ML_REGION`: The GCP region for Vertex AI (e.g., `us-east5`)
- `GOOGLE_APPLICATION_CREDENTIALS`: Path to your service account JSON file

Optional:

- `HTTPS_PROXY`: Set this if you need to route requests through a proxy
Note: The file referenced by `GOOGLE_APPLICATION_CREDENTIALS` contains privileged service account credentials. Keep it secure.
- Download the latest release for your platform (Windows/Mac/Linux) from the releases page.
- Create or edit a `.env` file in the same directory as the binary.
- Run the binary.
```bash
git clone https://github.com/lenML/gcp-claude-openai-api-server.git
cd gcp-claude-openai-api-server
pnpm install
pnpm start
```
Here's a sample `.env` file:
```env
# Anthropic Vertex AI configuration
ANTHROPIC_VERTEX_PROJECT_ID=
CLOUD_ML_REGION=us-east5
GOOGLE_APPLICATION_CREDENTIALS=./anthropic-vertex-credentials.json

# Network proxy settings
HTTP_PROXY=
HTTPS_PROXY=

# Conversation processing settings

# Handling of the first message
# Possible values: `continue` | `remove`
# default value: `remove`
ENSURE_FIRST_MODE=continue

# Message merging mode
# Possible values: `all` | `only_system`
# default value: `only_system`
PROMPT_MERGE_MODE=only_system

# System message merging mode
# Possible values: `merge_all` | `merge_top_user` | `merge_top_assistant` | `only_first_user` | `only_first_assistant` | `only_first_remove`
# default value: `merge_top_user`
SYSTEM_MERGE_MODE=merge_top_user

# Maximum token length for the conversation
# default value: 4096
MAX_TOKEN_LENGTH=4096
```
```ts
export const ensure_first_mode = process.env.ENSURE_FIRST_MODE ?? "remove";
```
The `ENSURE_FIRST_MODE` setting determines how the system handles the first message in the conversation. It has two possible values:

- `"remove"` (default): If the first message is not from the user, all leading assistant messages are removed until the first user message is found. This ensures that the conversation always starts with a user message.
- `"continue"`: If the first message is not from the user, a new user message with the content "continue" is added at the beginning of the conversation. This preserves all existing messages while still ensuring that the first message is from the user.
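The two modes can be sketched roughly as follows. This is an illustration of the behavior described above, not the project's actual code; the `Message` shape is an assumption.

```typescript
type Role = "system" | "user" | "assistant";
interface Message { role: Role; content: string; }

// Hypothetical sketch of ENSURE_FIRST_MODE (actual implementation may differ).
function ensureFirstIsUser(messages: Message[], mode: "remove" | "continue"): Message[] {
  if (messages.length === 0 || messages[0].role === "user") return messages;
  if (mode === "continue") {
    // Preserve everything; prepend a synthetic "continue" user message.
    return [{ role: "user", content: "continue" }, ...messages];
  }
  // "remove": drop leading assistant messages until the first user message.
  const firstUser = messages.findIndex((m) => m.role === "user");
  return firstUser === -1 ? [] : messages.slice(firstUser);
}
```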
```ts
export const prompt_merge_mode = process.env.PROMPT_MERGE_MODE ?? "only_system";
```
The `PROMPT_MERGE_MODE` setting determines how messages are merged in the conversation. It corresponds to the `PromptMergeMode` enum and has two possible values:

- `"only_system"` (default): Only system messages are merged, while other messages remain separate.
- `"all"`: All messages except system messages are merged into a single user message.
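A minimal sketch of the two merge modes, under the same assumed `Message` shape; the "role: content" line format used when collapsing messages is an assumption, not necessarily what the project emits.

```typescript
type Role = "system" | "user" | "assistant";
interface Message { role: Role; content: string; }

// Hypothetical sketch of PROMPT_MERGE_MODE (actual implementation may differ).
function mergePrompts(messages: Message[], mode: "all" | "only_system"): Message[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  // Collapse all system messages into one.
  const mergedSystem: Message[] =
    system.length > 0
      ? [{ role: "system", content: system.map((m) => m.content).join("\n") }]
      : [];
  if (mode === "only_system") return [...mergedSystem, ...rest];
  // "all": collapse every non-system message into a single user message.
  const merged = rest.map((m) => `${m.role}: ${m.content}`).join("\n");
  return [...mergedSystem, { role: "user", content: merged }];
}
```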
```ts
export const system_merge_mode = process.env.SYSTEM_MERGE_MODE ?? "merge_top_user";
```
The `SYSTEM_MERGE_MODE` setting determines how system messages are handled and merged. It corresponds to the `SystemMergeMode` enum and has several possible values:

- `"merge_top_user"` (default): Merges all top system messages together, and treats the remaining system messages as user prompt suffixes.
- `"merge_all"`: Merges all system messages together, ignoring the order of other roles.
- `"merge_top_assistant"`: Merges all top system messages together, and treats the remaining system messages as assistant prompt suffixes.
- `"only_first_user"`: Uses only the first system message as the system prompt, and treats the remaining system messages as user prompt suffixes.
- `"only_first_assistant"`: Uses only the first system message as the system prompt, and treats the remaining system messages as assistant prompt suffixes.
- `"only_first_remove"`: Uses only the first system message as the system prompt and ignores the rest.
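As an illustration, the default `"merge_top_user"` mode could look like the following sketch (again with an assumed `Message` shape; the real implementation may differ):

```typescript
type Role = "system" | "user" | "assistant";
interface Message { role: Role; content: string; }

// Hypothetical sketch of "merge_top_user": leading system messages become
// the system prompt; later system messages are re-tagged as user messages
// (prompt suffixes).
function mergeTopUser(messages: Message[]): { system: string; messages: Message[] } {
  let i = 0;
  const top: string[] = [];
  while (i < messages.length && messages[i].role === "system") {
    top.push(messages[i].content);
    i++;
  }
  const rest = messages
    .slice(i)
    .map((m): Message => (m.role === "system" ? { ...m, role: "user" } : m));
  return { system: top.join("\n"), messages: rest };
}
```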
```ts
export const max_token_length = parseInt(process.env.MAX_TOKEN_LENGTH ?? "4096");
```
The `MAX_TOKEN_LENGTH` setting determines the maximum number of tokens allowed in the conversation. It is parsed as an integer from the environment variable, with a default value of 4096 if not specified.

This setting is only used when merging prompts; it does not affect API call parameters.
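One way such a budget could be enforced during prompt merging is sketched below. Both the `length / 4` token estimate and the oldest-first trimming policy are assumptions for illustration, not the project's actual tokenizer or behavior.

```typescript
type Role = "system" | "user" | "assistant";
interface Message { role: Role; content: string; }

// Rough character-based token estimate (assumption, not a real tokenizer).
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Drop the oldest messages until the estimated total fits the budget,
// always keeping at least the most recent message.
function trimToBudget(messages: Message[], maxTokens: number): Message[] {
  const result = [...messages];
  let total = result.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  while (total > maxTokens && result.length > 1) {
    total -= estimateTokens(result.shift()!.content);
  }
  return result;
}
```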
Contributions are welcome! Please feel free to submit a Pull Request.