
Commit

Merge pull request #292 from weaviate/feature-custom-deployment
Add Custom Deployment
thomashacker authored Sep 23, 2024
2 parents 1a91a27 + cbb562d commit 33805a5
Showing 23 changed files with 184 additions and 425 deletions.
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,17 @@

All notable changes to this project will be documented in this file.

## [2.1.0] Importastic

## Added

- Added new deployment type: Custom
- Added new port configuration

## Fixed

- Catch Exception when trying to access the OpenAI API Embedding endpoint to retrieve model names

## [2.0.0] Importastic

## Added
26 changes: 14 additions & 12 deletions README.md
@@ -42,7 +42,7 @@ Verba is a fully-customizable personal assistant utilizing [Retrieval Augmented

### Watch our newest Verba video here:

[![VIDEO LINK](https://github.com/weaviate/Verba/blob/main/img/thumbnail.png)](https://www.youtube.com/watch?v=swKKRdLBhas&t)
[![VIDEO LINK](https://github.com/weaviate/Verba/blob/main/img/thumbnail.png)](https://www.youtube.com/watch?v=2VCy-YjRRhA&t=40s&ab_channel=Weaviate%E2%80%A2VectorDatabase)

## Feature Lists

@@ -63,15 +63,15 @@ Verba is a fully-customizable personal assistant utilizing [Retrieval Augmented
| VoyageAI || Embedding Models by VoyageAI |
| OpenAI || Embedding Models by OpenAI |

| 📁 Data Support | Implemented | Description |
| ------------------------------------------------------- | ----------- | -----------------------------------------------|
| [UnstructuredIO](https://docs.unstructured.io/welcome) || Import Data through Unstructured |
| [Firecrawl](https://www.firecrawl.dev/) || Scrape and Crawl URL through Firecrawl |
| PDF Ingestion || Import PDF into Verba |
| GitHub & GitLab || Import Files from Github and GitLab |
| CSV/XLSX Ingestion || Import Table Data into Verba |
| .DOCX || Import .docx files |
| Multi-Modal (using [AssemblyAI](https://assemblyai.com))|| Import and Transcribe Audio through AssemblyAI |
| 📁 Data Support | Implemented | Description |
| -------------------------------------------------------- | ----------- | ---------------------------------------------- |
| [UnstructuredIO](https://docs.unstructured.io/welcome) || Import Data through Unstructured |
| [Firecrawl](https://www.firecrawl.dev/) || Scrape and Crawl URL through Firecrawl |
| PDF Ingestion || Import PDF into Verba |
| GitHub & GitLab || Import Files from Github and GitLab |
| CSV/XLSX Ingestion || Import Table Data into Verba |
| .DOCX || Import .docx files |
| Multi-Modal (using [AssemblyAI](https://assemblyai.com)) || Import and Transcribe Audio through AssemblyAI |

| ✨ RAG Features | Implemented | Description |
| ----------------------- | ----------- | ------------------------------------------------------------------------- |
@@ -213,8 +213,6 @@ Verba supports importing documents through Unstructured IO (e.g plain text, .pdf

Verba supports importing documents through AssemblyAI (audio files or audio from video files). To use them you need the `ASSEMBLYAI_API_KEY` environment variable. You can get it from [AssemblyAI](https://assemblyai.com)



## OpenAI

Verba supports OpenAI Models such as Ada, GPT3, and GPT4. To use them, you need to specify the `OPENAI_API_KEY` environment variable. You can get it from [OpenAI](https://openai.com/)
@@ -359,6 +357,10 @@ RUN pip install -e '.'

## Verba Walkthrough

### Select your Deployment

The first screen you'll see is the deployment screen, where you can choose between a `Local`, `Docker`, `Weaviate Cloud`, or `Custom` deployment. `Local` uses Weaviate Embedded, which starts a Weaviate instance behind the scenes. `Docker` connects to a separate Weaviate instance running inside the same Docker network. `Weaviate Cloud` connects to an instance hosted on Weaviate Cloud Services (WCS). `Custom` lets you specify the URL, port, and API key of your own Weaviate instance.

### Import Your Data

First thing you need to do is to add your data. You can do this by clicking on `Import Data` and selecting either `Add Files`, `Add Directory`, or `Add URL` tab. Here you can add all your files that you want to ingest.
4 changes: 3 additions & 1 deletion frontend/app/api.ts
@@ -78,7 +78,8 @@ export const fetchHealth = (): Promise<HealthPayload | null> =>
export const connectToVerba = async (
deployment: string,
url: string,
apiKey: string
apiKey: string,
port: string
): Promise<ConnectPayload | null> => {
const host = await detectHost();
const response = await fetch(`${host}/api/connect`, {
@@ -92,6 +93,7 @@ export const connectToVerba = async (
url: url,
key: apiKey,
},
port: port,
}),
});
const data = await response.json();
103 changes: 72 additions & 31 deletions frontend/app/components/Login/LoginView.tsx
@@ -19,6 +19,7 @@ import { GrConnect } from "react-icons/gr";
import { CgWebsite } from "react-icons/cg";
import { FaBackspace } from "react-icons/fa";
import { HiMiniSparkles } from "react-icons/hi2";
import { TbDatabaseEdit } from "react-icons/tb";

import { connectToVerba } from "@/app/api";

@@ -161,11 +162,12 @@ const LoginView: React.FC<LoginViewProps> = ({
const [errorText, setErrorText] = useState("");

const [selectedDeployment, setSelectedDeployment] = useState<
"Weaviate" | "Docker" | "Local"
"Weaviate" | "Docker" | "Local" | "Custom"
>("Local");

const [weaviateURL, setWeaviateURL] = useState(credentials.url);
const [weaviateAPIKey, setWeaviateAPIKey] = useState(credentials.key);
const [port, setPort] = useState("8080");

useEffect(() => {
const timer = setTimeout(() => {
@@ -175,18 +177,26 @@ const LoginView: React.FC<LoginViewProps> = ({
return () => clearTimeout(timer);
}, []);

const connect = async (deployment: "Local" | "Weaviate" | "Docker") => {
const connect = async (
deployment: "Local" | "Weaviate" | "Docker" | "Custom"
) => {
setErrorText("");
setIsConnecting(true);
const response = await connectToVerba(
deployment,
weaviateURL,
weaviateAPIKey
weaviateAPIKey,
port
);
if (response) {
if (response.error) {
if (!("error" in response)) {
setIsLoggedIn(false);
setErrorText(response.error);
setErrorText(JSON.stringify(response));
} else if (response.connected == false) {
setIsLoggedIn(false);
setErrorText(
response.error == "" ? "Couldn't connect to Weaviate" : response.error
);
} else {
setIsLoggedIn(true);
setCredentials({
@@ -287,6 +297,16 @@ const LoginView: React.FC<LoginViewProps> = ({
}}
loading={isConnecting && selectedDeployment == "Docker"}
/>
<VerbaButton
title="Custom"
Icon={TbDatabaseEdit}
disabled={isConnecting}
onClick={() => {
setSelectedDeployment("Custom");
setSelectStage(false);
}}
loading={isConnecting && selectedDeployment == "Custom"}
/>
<VerbaButton
title="Local"
Icon={FaLaptopCode}
@@ -336,18 +356,35 @@ const LoginView: React.FC<LoginViewProps> = ({
connect(selectedDeployment);
}}
>
<label className="input flex items-center gap-2 border-none shadow-md bg-bg-verba">
<FaDatabase className="text-text-alt-verba" />
<input
type="text"
name="username"
value={weaviateURL}
onChange={(e) => setWeaviateURL(e.target.value)}
placeholder="Weaviate URL"
className="grow bg-button-verba text-text-alt-verba hover:text-text-verba w-full"
autoComplete="username"
/>
</label>
<div className="flex gap-2 items-center justify-between">
<label className="input flex items-center gap-2 border-none shadow-md w-full bg-bg-verba">
<FaDatabase className="text-text-alt-verba" />
<input
type="text"
name="username"
value={weaviateURL}
onChange={(e) => setWeaviateURL(e.target.value)}
placeholder="Weaviate URL"
className="grow bg-button-verba text-text-alt-verba hover:text-text-verba w-full"
autoComplete="username"
/>
</label>
{selectedDeployment == "Custom" && (
<label className="input flex items-center gap-2 border-none shadow-md bg-bg-verba">
<p className="text-text-alt-verba text-xs">Port</p>
<input
type="text"
name="Port"
value={port}
onChange={(e) => setPort(e.target.value)}
placeholder="Port"
className="grow bg-button-verba text-text-alt-verba hover:text-text-verba w-full"
autoComplete="port"
/>
</label>
)}
</div>

<label className="input flex items-center gap-2 border-none shadow-md bg-bg-verba mt-4">
<FaKey className="text-text-alt-verba" />
<input
@@ -371,18 +408,20 @@ const LoginView: React.FC<LoginViewProps> = ({
selected_color="bg-primary-verba"
loading={isConnecting}
/>
<VerbaButton
Icon={CgWebsite}
title="Register"
type="button"
disabled={isConnecting}
onClick={() =>
window.open(
"https://console.weaviate.cloud",
"_blank"
)
}
/>
{selectedDeployment == "Weaviate" && (
<VerbaButton
Icon={CgWebsite}
title="Register"
type="button"
disabled={isConnecting}
onClick={() =>
window.open(
"https://console.weaviate.cloud",
"_blank"
)
}
/>
)}
<VerbaButton
Icon={FaBackspace}
title="Back"
@@ -402,8 +441,10 @@ const LoginView: React.FC<LoginViewProps> = ({
</div>
)}
{errorText && (
<div className="bg-warning-verba p-4 rounded w-full">
<p className="flex w-full whitespace-pre-wrap">{errorText}</p>
<div className="bg-warning-verba p-4 rounded w-full h-full overflow-auto">
<p className="flex w-full h-full whitespace-pre-wrap">
{errorText}
</p>
</div>
)}
</div>
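
The new branching in `connect` (LoginView.tsx) is hard to follow in diff form, so here is a minimal Python restatement of the decision order as it appears in the added lines. This is a sketch, not code from this commit: `interpret_connect_response` is a hypothetical name, and the payload is assumed to be a plain dict with optional `error` and `connected` keys.

```python
import json
from typing import Any, Dict, Tuple

DEFAULT_ERROR = "Couldn't connect to Weaviate"


def interpret_connect_response(response: Dict[str, Any]) -> Tuple[bool, str]:
    """Return (logged_in, error_text) for a connect payload (hypothetical helper)."""
    # 1. A payload without an "error" key has an unexpected shape,
    #    so the raw payload is shown to the user.
    if "error" not in response:
        return False, json.dumps(response)

    # 2. An explicit connected == False is a failed connection; an empty
    #    error string is replaced by a generic message.
    if response.get("connected") is False:
        error = response["error"]
        return False, DEFAULT_ERROR if error == "" else error

    # 3. Anything else counts as a successful login.
    return True, ""
```

Note that step 1 treats a missing `error` key as a failure even if the payload otherwise looks healthy, which matches the order of the checks in the hunk above.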
2 changes: 1 addition & 1 deletion frontend/app/types.ts
@@ -1,5 +1,5 @@
export type Credentials = {
deployment: "Weaviate" | "Docker" | "Local";
deployment: "Weaviate" | "Docker" | "Local" | "Custom";
url: string;
key: string;
};
2 changes: 1 addition & 1 deletion frontend/package.json
@@ -1,6 +1,6 @@
{
"name": "verba",
"version": "1.0.4",
"version": "2.1.0",
"private": true,
"scripts": {
"dev": "next dev",
32 changes: 20 additions & 12 deletions goldenverba/components/embedding/OpenAIEmbedder.py
@@ -104,20 +104,28 @@ async def vectorize(self, config: dict, content: List[str]) -> List[List[float]]
@staticmethod
def get_models(token: str, url: str) -> List[str]:
"""Fetch available embedding models from OpenAI API."""
if token is None:
try:
if token is None:
return [
"text-embedding-ada-002",
"text-embedding-3-small",
"text-embedding-3-large",
]

import requests # Import here to avoid dependency if not needed

headers = {"Authorization": f"Bearer {token}"}
response = requests.get(f"{url}/models", headers=headers)
response.raise_for_status()
return [
model["id"]
for model in response.json()["data"]
if "embedding" in model["id"]
]
except Exception as e:
msg.info(f"Failed to fetch OpenAI embedding models: {str(e)}")
return [
"text-embedding-ada-002",
"text-embedding-3-small",
"text-embedding-3-large",
]

import requests # Import here to avoid dependency if not needed

headers = {"Authorization": f"Bearer {token}"}
response = requests.get(f"{url}/models", headers=headers)
response.raise_for_status()
return [
model["id"]
for model in response.json()["data"]
if "embedding" in model["id"]
]
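
The `get_models` change wraps the whole lookup in a `try`/`except` and falls back to a static model list on any failure. A minimal, dependency-free sketch of that pattern follows. It is an illustration only: `get_models_with_fallback` and the injected `fetch_model_ids` callable are hypothetical names standing in for the `requests` call, and `print` stands in for the project's logger.

```python
from typing import Callable, List, Optional

DEFAULT_EMBEDDING_MODELS = [
    "text-embedding-ada-002",
    "text-embedding-3-small",
    "text-embedding-3-large",
]


def get_models_with_fallback(
    token: Optional[str],
    fetch_model_ids: Callable[[str], List[str]],
) -> List[str]:
    """Return embedding model ids, or the static defaults on any failure."""
    # Without a token there is nothing to query, so return the defaults.
    if token is None:
        return list(DEFAULT_EMBEDDING_MODELS)
    try:
        # Keep only ids that look like embedding models.
        return [m for m in fetch_model_ids(token) if "embedding" in m]
    except Exception as exc:
        # Any error (network, auth, malformed payload) falls back to defaults.
        print(f"Failed to fetch OpenAI embedding models: {exc}")
        return list(DEFAULT_EMBEDDING_MODELS)
```

Injecting the fetch function keeps the fallback logic testable without network access; the commit itself calls `requests.get` inline.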
38 changes: 19 additions & 19 deletions goldenverba/components/managers.py
@@ -175,32 +175,32 @@ async def connect_to_docker(self, w_url):
),
)

async def connect_to_custom(self, host, w_key):
async def connect_to_custom(self, host, w_key, port):
# Extract the port from the host
parsed_url = urlparse(host)
port = parsed_url.port
if port is None:
raise Exception("No port specified in the host URL")
_host = parsed_url.hostname # Use only the hostname part
msg.info(f"Connecting to Weaviate Custom")

if host is None or host == "":
raise Exception("No Host URL provided")

if w_key is None or w_key == "":
return weaviate.use_async_with_local(
host=_host,
port=port,
host=host,
port=int(port),
skip_init_checks=True,
additional_config=AdditionalConfig(
timeout=Timeout(init=60, query=300, insert=300)
),
)
else:
return weaviate.use_async_with_local(
host=host,
port=int(port),
skip_init_checks=True,
auth_credentials=AuthApiKey(w_key),
additional_config=AdditionalConfig(
timeout=Timeout(init=60, query=300, insert=300)
),
)

return weaviate.use_async_with_custom(
http_host=_host,
http_port=port,
auth_credentials=AuthApiKey(w_key),
additional_config=AdditionalConfig(
timeout=Timeout(init=60, query=300, insert=300)
),
)

async def connect_to_embedded(self):
msg.info(f"Connecting to Weaviate Embedded")
@@ -211,7 +211,7 @@ async def connect_to_embedded(self):
)

async def connect(
self, deployment: str, weaviateURL: str, weaviateAPIKey: str
self, deployment: str, weaviateURL: str, weaviateAPIKey: str, port: str = "8080"
) -> WeaviateAsyncClient:
try:

@@ -226,7 +226,7 @@ async def connect(
elif deployment == "Local":
client = await self.connect_to_embedded()
elif deployment == "Custom":
client = await self.connect_to_custom(weaviateURL, weaviateAPIKey)
client = await self.connect_to_custom(weaviateURL, weaviateAPIKey, port)
else:
raise Exception(f"Invalid deployment type: {deployment}")
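
After this change, `connect_to_custom` passes `host` through unchanged and converts the port with `int(port)`, so a host that includes a scheme or a non-numeric port would only fail at connection time. One possible way to validate both up front is sketched below. It is not part of this commit: `normalize_host_port` is a hypothetical helper, and letting a port embedded in the URL take precedence over the separate field is an assumption made for illustration.

```python
from typing import Tuple
from urllib.parse import urlparse


def normalize_host_port(host: str, port: str = "8080") -> Tuple[str, int]:
    """Split user input into (hostname, port) (hypothetical helper)."""
    if host is None or host.strip() == "":
        raise ValueError("No Host URL provided")
    host = host.strip()

    # urlparse only fills .hostname when a scheme or a leading "//" is present.
    parsed = urlparse(host if "://" in host else f"//{host}")
    hostname = parsed.hostname
    if not hostname:
        raise ValueError(f"Could not parse a hostname from {host!r}")

    # Assumption: a port embedded in the URL wins over the separate field.
    # int() raises ValueError for non-numeric input such as "abc".
    port_number = parsed.port if parsed.port is not None else int(port)
    if not 1 <= port_number <= 65535:
        raise ValueError(f"Port out of range: {port_number}")
    return hostname, port_number
```

For example, `normalize_host_port("http://example.com:9090")` yields the bare hostname and the embedded port, while `normalize_host_port("localhost", "8080")` uses the separate port field.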
