diff --git a/notebooks/real-time-recommendation-engine/meta.toml b/notebooks/real-time-recommendation-engine/meta.toml new file mode 100644 index 0000000..4690a2c --- /dev/null +++ b/notebooks/real-time-recommendation-engine/meta.toml @@ -0,0 +1,8 @@ +[meta] +title="Real Time Recommendation Engine" +description="""\ +We demonstrate how to build and host a real-time recommendation engine for free with SingleStore. The notebook also leverages our new SingleStore Job Service to ensure that the latest data is ingested and used in providing recommendations.\ + """ +icon="crystal-ball" +tags=["openai", "vercel", "realtime", "vectordb"] +destinations=["spaces"] diff --git a/notebooks/real-time-recommendation-engine/notebook.ipynb b/notebooks/real-time-recommendation-engine/notebook.ipynb new file mode 100644 index 0000000..fa378d4 --- /dev/null +++ b/notebooks/real-time-recommendation-engine/notebook.ipynb @@ -0,0 +1,945 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "6c991811-dee6-4315-b831-320573e8e06f", + "metadata": {}, + "source": [ + "
\n", + "
\n", + " \n", + "
\n", + "
\n", + "
SingleStore Notebooks
\n", + "

Real Time Recommendation Engine

\n", + "
\n", + "
" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# How to build a real-time recommendation engine with SingleStore & Vercel" + ] + }, + { + "attachments": { + "c7f1d715-a955-408e-87f4-fdc1e1b3dc05.png": { + "image/png": "" + } + }, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will demonstrate how to build a modern real-time AI application for free using a Shared Tier Database, SingleStore Notebooks, and Job Service.\n", + "\n", + "A Free SingleStore Starter Workspace enables you to execute hybrid search, real-time analytics, and point read/writes/updates in a single database. With SingleStore Notebooks and our Job Service, you easily bring in data from various sources (APIs, MySQL / Mongo endpoints) in real-time. You can also execute Python-based transforms, such as adding embeddings, ensuring that real-time data is readily available for your downstream LLMs and applications.\n", + "\n", + "We will showcase the seamless transition from a prototype to an end-application using SingleStore. The final application will be hosted on Vercel. You can see the App we've built following this notebook [here](https://llm-recommender.vercel.app/)\n", + "### Architecture:\n", + "\n", + "![Screenshot 2024-01-12 at 2.13.37 PM.png](attachment:c7f1d715-a955-408e-87f4-fdc1e1b3dc05.png)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Scenario:\n", + "Building a recommendation engine on what LLM you should be using for your use-case. Bringing together semantic search + real-time analytics on the performance of the LLM to make the recommendations.\n", + "\n", + "Here are the requirements we've set out for this recommendation engine:\n", + "1. Pull data from [Hugging Face Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) on various Open source LLM models and their scores. Pull updated scores on these models every hour.\n", + "2. 
For each of these models, pull data from Twitter and GitHub on what developers are saying about these models, and how they are being used in active projects. Pull this data every hour.\n", + "3. Provide an easy 'search' interface to users where they can describe their use-case. When users describe their use-case, perform a hybrid search (vector + full-text search) across the descriptions of these models, what users are saying about them on Twitter, and which GitHub repos are using these LLMs.\n", + "4. Combine the results of the semantic search with analytics on the public benchmarks, # likes, # downloads of these models.\n", + "6. Power the app entirely on a single SingleStore Free Shared Tier Workspace.\n", + "7. Ensure that all of the latest posts / scores are reflected in the App. Power this entirely with SingleStore Notebook and Job Service" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Contents\n", + "- Step 1: Creating a Starter Workspace\n", + "- Step 2: Installing & Importing required libraries\n", + "- Step 3: Setting Key Variables\n", + "- Step 4: Designing your table schema on SingleStore\n", + "- Step 5: Creating Helper Functions to load data into SingleStore\n", + "- Step 6: Loading data with embeddings into SingleStore\n", + "- Step 7: Building the Recommendation Engine Algorithm on Vercel" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 1. Create a Starter Workspace\n", + "\n", + "Create a new Workspace Group and select a Starter Workspace. If you do not have this enabled, email pm@singlestore.com" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 2. 
Install and import required libraries" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "%pip install singlestoredb openai tiktoken beautifulsoup4 pandas python-dotenv Markdown praw tweepy --quiet\n", + "\n", + "import re\n", + "import json\n", + "import openai\n", + "import tiktoken\n", + "import json\n", + "import requests\n", + "import getpass\n", + "import pandas as pd\n", + "import singlestoredb as s2\n", + "import tweepy\n", + "import praw\n", + "from bs4 import BeautifulSoup\n", + "from markdown import markdown\n", + "from datetime import datetime\n", + "from time import time, sleep" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 3. Setting environment variables\n", + "\n", + "### 3.1. Set the app common variables. Do not change these" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "MODELS_LIMIT = 100\n", + "MODELS_TABLE_NAME = 'models'\n", + "MODEL_READMES_TABLE_NAME = 'model_readmes'\n", + "MODEL_TWITTER_POSTS_TABLE_NAME = 'model_twitter_posts'\n", + "MODEL_REDDIT_POSTS_TABLE_NAME = 'model_reddit_posts'\n", + "MODEL_GITHUB_REPOS_TABLE_NAME = 'model_github_repos'\n", + "LEADERBOARD_DATASET_URL = 'https://llm-recommender.vercel.app/datasets/leaderboard.json'\n", + "TOKENS_LIMIT = 2047\n", + "TOKENS_TRASHHOLD_LIMIT = TOKENS_LIMIT - 128" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.2. Set the OpenAI variables\n", + "\n", + "We will be using OpenAI's embedding models to create vectors representing our data. The vectors will be stored in the SingleStore Starter Workspace as a column in the relevant tables.\n", + "\n", + "Using OpenAI's LLMs we will also generate output text after we complete the Retrieval-Augmented Generation (RAG) steps.\n", + "1. [Open the OpenAI API keys page](https://platform.openai.com/api-keys)\n", + "2. 
Create a new key\n", + "3. Copy the key and paste it into the `OPENAI_API_KEY` variable" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "OPENAI_API_KEY = getpass.getpass(\"enter openAI apikey here\")" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.3. Set the HuggingFace variables\n", + "\n", + "We will be pulling data from HuggingFace about the different models, the usage of these models, and how they score on several evaluation metrics.\n", + "1. [Open the HuggingFace Access Tokens page](https://huggingface.co/settings/tokens)\n", + "2. Create a new token\n", + "3. Copy the token and paste it into the `HF_TOKEN` variable" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "HF_TOKEN = getpass.getpass(\"enter HuggingFace apikey here\")" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.4. Set the Twitter variables\n", + "We will be pulling data from Twitter about what users might be saying about these models. Since the quality of these models may change over time, we want to capture the sentiment of what people are talking about and using on Twitter.\n", + "1. [Open the Twitter Developer Projects & Apps page](https://developer.twitter.com/en/portal/projects-and-apps)\n", + "2. Add a new app\n", + "3. Fill the form\n", + "4. 
Generate a Bearer Token and paste it into the `TWITTER_BEARER_TOKEN` variable" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "TWITTER_BEARER_TOKEN = getpass.getpass(\"enter Twitter Bearer Token here\")" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 3.5 Set the GitHub variables\n", + "We will also be pulling data from various Github repos on which models are being referenced and used for which scenarios.\n", + "1. [Open the Register new GitHub App page](https://github.com/settings/apps/new)\n", + "2. Fill the form\n", + "3. Get an access token and paste it into the `GITHUB_ACCESS_TOKEN` variable" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "GITHUB_ACCESS_TOKEN = getpass.getpass(\"enter Github Access Token here\")" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 4. Designing and creating your table schemas in SingleStore\n", + "\n", + "We will be storing all of this data in a single Free Shared Tier Database. 
Through this database, you can write hybrid search queries, run analytics on the model's performance, and get real-time reads/updates.\n", + "\n", + "- `connection` - database connection to execute queries\n", + "- `create_tables` - function that creates empty tables in the database\n", + "- `drop_table` - helper function to drop a table\n", + "- `get_models` - helper function to get models from the models table\n", + "- `db_get_last_created_at` - helper function to get last `created_at` value from a table\n", + "\n", + "The `create_tables` function creates the following tables:\n", + "- `models` - table with all models data from the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)\n", + "- `model_readmes` - table with model readme texts from the HuggingFace model pages (used in semantic search)\n", + "- `model_twitter_posts` - table with tweets related to models (used in semantic search)\n", + "- `model_github_repos` - table with GitHub readme texts related to models (used in semantic search)" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [], + "source": [ + "connection = s2.connect(connection_url)\n", + "\n", + "\n", + "def create_tables():\n", + " def create_models_table():\n", + " with connection.cursor() as cursor:\n", + " cursor.execute(f'''\n", + " CREATE TABLE IF NOT EXISTS {MODELS_TABLE_NAME} (\n", + " id INT AUTO_INCREMENT PRIMARY KEY,\n", + " name VARCHAR(512) NOT NULL,\n", + " author VARCHAR(512) NOT NULL,\n", + " repo_id VARCHAR(1024) NOT NULL,\n", + " score DECIMAL(5, 2) NOT NULL,\n", + " arc DECIMAL(5, 2) NOT NULL,\n", + " hellaswag DECIMAL(5, 2) NOT NULL,\n", + " mmlu DECIMAL(5, 2) NOT NULL,\n", + " truthfulqa DECIMAL(5, 2) NOT NULL,\n", + " winogrande DECIMAL(5, 2) NOT NULL,\n", + " gsm8k DECIMAL(5, 2) NOT NULL,\n", + " link VARCHAR(255) NOT NULL,\n", + " downloads INT,\n", + " likes INT,\n", + " still_on_hub BOOLEAN NOT NULL,\n", + " created_at TIMESTAMP,\n", + " embedding 
BLOB\n", + " )\n", + " ''')\n", + "\n", + " def create_model_readmes_table():\n", + " with connection.cursor() as cursor:\n", + " cursor.execute(f'''\n", + " CREATE TABLE IF NOT EXISTS {MODEL_READMES_TABLE_NAME} (\n", + " id INT AUTO_INCREMENT PRIMARY KEY,\n", + " model_repo_id VARCHAR(512),\n", + " text LONGTEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n", + " clean_text LONGTEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n", + " created_at TIMESTAMP,\n", + " embedding BLOB\n", + " )\n", + " ''')\n", + "\n", + " def create_model_twitter_posts_table():\n", + " with connection.cursor() as cursor:\n", + " cursor.execute(f'''\n", + " CREATE TABLE IF NOT EXISTS {MODEL_TWITTER_POSTS_TABLE_NAME} (\n", + " id INT AUTO_INCREMENT PRIMARY KEY,\n", + " model_repo_id VARCHAR(512),\n", + " post_id VARCHAR(256),\n", + " clean_text LONGTEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n", + " created_at TIMESTAMP,\n", + " embedding BLOB\n", + " )\n", + " ''')\n", + "\n", + " def create_model_github_repos_table():\n", + " with connection.cursor() as cursor:\n", + " cursor.execute(f'''\n", + " CREATE TABLE IF NOT EXISTS {MODEL_GITHUB_REPOS_TABLE_NAME} (\n", + " id INT AUTO_INCREMENT PRIMARY KEY,\n", + " model_repo_id VARCHAR(512),\n", + " repo_id INT,\n", + " name VARCHAR(512) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n", + " description TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n", + " clean_text LONGTEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n", + " link VARCHAR(256),\n", + " created_at TIMESTAMP,\n", + " embedding BLOB\n", + " )\n", + " ''')\n", + "\n", + " create_models_table()\n", + " create_model_readmes_table()\n", + " create_model_twitter_posts_table()\n", + " create_model_github_repos_table()\n", + "\n", + "\n", + "def drop_table(table_name: str):\n", + " with connection.cursor() as cursor:\n", + " cursor.execute(f'DROP TABLE IF EXISTS {table_name}')\n", + "\n", + "\n", + "def get_models(select='*', query='', 
as_dict=True):\n", + " with connection.cursor() as cursor:\n", + " _query = f'SELECT {select} FROM {MODELS_TABLE_NAME}'\n", + "\n", + " if query:\n", + " _query += f' {query}'\n", + "\n", + " cursor.execute(_query)\n", + "\n", + " if as_dict:\n", + " columns = [desc[0] for desc in cursor.description]\n", + " return [dict(zip(columns, row)) for row in cursor.fetchall()]\n", + "\n", + " return cursor.fetchall()\n", + "\n", + "\n", + "def db_get_last_created_at(table, repo_id, to_string=False):\n", + " with connection.cursor() as cursor:\n", + " cursor.execute(f\"\"\"\n", + " SELECT UNIX_TIMESTAMP(created_at) FROM {table}\n", + " WHERE model_repo_id = %s\n", + " ORDER BY created_at DESC\n", + " LIMIT 1\n", + " \"\"\", (repo_id,))\n", + "\n", + " rows = cursor.fetchone()\n", + " created_at = float(rows[0]) if rows and rows[0] else None\n", + "\n", + " if (created_at and to_string):\n", + " created_at = datetime.fromtimestamp(created_at)\n", + " created_at = created_at.strftime('%Y-%m-%dT%H:%M:%SZ')\n", + "\n", + " return created_at" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 5. Creating helper functions to load data into SingleStore\n", + "\n", + "### 5.1. Setting up the `openai.api_key`" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "openai.api_key = OPENAI_API_KEY" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 5.2. Create the `create_embedding` function\n", + "This function creates an embedding for the input passed to it. We will apply it to all data pulled from GitHub, HuggingFace and Twitter. The resulting vector embeddings will be stored in the same SingleStore table as a separate column." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [], + "source": [ + "def count_tokens(text: str):\n", + " enc = tiktoken.get_encoding('cl100k_base')\n", + " return len(enc.encode(text, disallowed_special={}))\n", + "\n", + "def create_embedding(input):\n", + " try:\n", + " data = openai.embeddings.create(input=input, model='text-embedding-ada-002').data\n", + " return data[0].embedding\n", + " except Exception as e:\n", + " print(e)\n", + " return [[]]" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 5.3. Create the function/Utils to help parse the data ingested from the various sources\n", + "This is a set of functions that ensure the JSON is in the right format and can be stored in SingleStore as a JSON column. In your Free Shared Tier workspace you can bring data of various formats (JSON, Geospatial, Vector) and interact with this data with SQL and MongoDB API." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [], + "source": [ + "class JSONEncoder(json.JSONEncoder):\n", + " def default(self, obj):\n", + " if isinstance(obj, datetime):\n", + " return obj.strftime('%Y-%m-%d %H:%M:%S')\n", + " return super().default(obj)\n", + "\n", + "def list_into_chunks(lst, chunk_size=100):\n", + " return [lst[i:i + chunk_size] for i in range(0, len(lst), chunk_size)]\n", + "\n", + "def string_into_chunks(string: str, max_tokens=TOKENS_LIMIT):\n", + " if count_tokens(string) <= max_tokens:\n", + " return [string]\n", + "\n", + " delimiter = ' '\n", + " words = string.split(delimiter)\n", + " chunks = []\n", + " current_chunk = []\n", + "\n", + " for word in words:\n", + " if count_tokens(delimiter.join(current_chunk + [word])) <= max_tokens:\n", + " current_chunk.append(word)\n", + " else:\n", + " chunks.append(delimiter.join(current_chunk))\n", + " current_chunk = [word]\n", + "\n", + " if current_chunk:\n", + " 
chunks.append(delimiter.join(current_chunk))\n", + "\n", + " return chunks\n", + "\n", + "def clean_string(string: str):\n", + " def strip_html_elements(string: str):\n", + " html = markdown(string)\n", + " soup = BeautifulSoup(html, \"html.parser\")\n", + " text = soup.get_text()\n", + " return text.strip()\n", + "\n", + " def remove_unicode_escapes(string: str):\n", + " return re.sub(r'[^\\x00-\\x7F]+', '', string)\n", + "\n", + " def remove_string_spaces(strgin: str):\n", + " new_string = re.sub(r'\\n+', '\\n', strgin)\n", + " new_string = re.sub(r'\\s+', ' ', new_string)\n", + " return new_string\n", + "\n", + " def remove_links(string: str):\n", + " url_pattern = r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\\\(\\\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'\n", + " return re.sub(url_pattern, '', string)\n", + "\n", + " new_string = strip_html_elements(string)\n", + " new_string = remove_unicode_escapes(new_string)\n", + " new_string = remove_string_spaces(new_string)\n", + " new_string = re.sub(r'\\*\\*+', '*', new_string)\n", + " new_string = re.sub(r'--+', '-', new_string)\n", + " new_string = re.sub(r'====+', '=', new_string)\n", + " new_string = remove_links(new_string)\n", + "\n", + " return new_string" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 6. Loading Data into SingleStore" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 6.1. Load Data on all Open-Source LLM models from [HuggingFace Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)\n", + "This function loads a pre-generated Open LLM Leaderboard dataset. Based on this dataset, all model data is created and inserted into the database.\n", + "We will also create embeddings for all of this data pulled using the OpenAI Embedding Model." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [], + "source": [ + "def leaderboard_get_df():\n", + " response = requests.get(LEADERBOARD_DATASET_URL)\n", + "\n", + " if response.status_code == 200:\n", + " data = json.loads(response.text)\n", + " df = pd.DataFrame(data).head(MODELS_LIMIT)\n", + " return df\n", + " else:\n", + " print(\"Failed to retrieve JSON file\")\n", + "\n", + "def leaderboard_insert_model(model):\n", + " try:\n", + " _model = {key: value for key, value in model.items() if key != 'readme'}\n", + " to_embedding = json.dumps(_model, cls=JSONEncoder)\n", + " embedding = str(create_embedding(to_embedding))\n", + " model_to_insert = {**_model, 'embedding': embedding}\n", + " readmes_to_insert = []\n", + "\n", + " if model['readme']:\n", + " readme = {\n", + " 'model_repo_id': model['repo_id'],\n", + " 'text': model['readme'],\n", + " 'created_at': time()\n", + " }\n", + "\n", + " if count_tokens(readme['text']) <= TOKENS_TRASHHOLD_LIMIT:\n", + " readme['clean_text'] = clean_string(readme['text'])\n", + " to_embedding = json.dumps({\n", + " 'model_repo_id': readme['model_repo_id'],\n", + " 'clean_text': readme['clean_text'],\n", + " })\n", + " readme['embedding'] = str(create_embedding(to_embedding))\n", + " readmes_to_insert.append(readme)\n", + " else:\n", + " for i, chunk in enumerate(string_into_chunks(readme['text'])):\n", + " _readme = {\n", + " **readme,\n", + " 'text': chunk,\n", + " 'created_at': time()\n", + " }\n", + "\n", + " _readme['clean_text'] = clean_string(chunk)\n", + " to_embedding = json.dumps({\n", + " 'model_repo_id': _readme['model_repo_id'],\n", + " 'clean_text': _readme['clean_text'],\n", + " })\n", + " _readme['embedding'] = str(create_embedding(to_embedding))\n", + " readmes_to_insert.append(_readme)\n", + "\n", + " with connection.cursor() as cursor:\n", + " cursor.execute(f'''\n", + " INSERT INTO {MODELS_TABLE_NAME} (name, author, repo_id, score, link, still_on_hub, arc, hellaswag, mmlu, 
truthfulqa, winogrande, gsm8k, downloads, likes, created_at, embedding)\n", + " VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, FROM_UNIXTIME(%s), JSON_ARRAY_PACK(%s))\n", + " ''', tuple(model_to_insert.values()))\n", + "\n", + " for chunk in list_into_chunks([tuple(readme.values()) for readme in readmes_to_insert]):\n", + " with connection.cursor() as cursor:\n", + " cursor.executemany(f'''\n", + " INSERT INTO {MODEL_READMES_TABLE_NAME} (model_repo_id, text, created_at, clean_text, embedding)\n", + " VALUES (%s, %s, FROM_UNIXTIME(%s), %s, JSON_ARRAY_PACK(%s))\n", + " ''', chunk)\n", + " except Exception as e:\n", + " print('Error leaderboard_insert_model: ', e)\n", + "\n", + "\n", + "def leaderboard_process_models():\n", + " print('Processing models')\n", + "\n", + " existed_model_repo_ids = [i[0] for i in get_models('repo_id', as_dict=False)]\n", + " leaderboard_df = leaderboard_get_df()\n", + "\n", + " for i, row in leaderboard_df.iterrows():\n", + " if not row['repo_id'] in existed_model_repo_ids:\n", + " leaderboard_insert_model(row.to_dict())" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 6.2 Loading Data from Github about model usage\n", + "We will search the Github API by keyword based on the model names we have above to find their usage across repos. 
We will then pull data from the READMEs of the repos that reference a particular model and create an embedding for each.\n", + "\n", + "This allows us to see which kinds of scenarios developers are using a particular LLM in, and to incorporate that into our recommendation.\n", + "\n", + "In the first step we search for the model using the GitHub API" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [], + "source": [ + "def github_search_repos(keyword: str, last_created_at):\n", + " repos = []\n", + " headers = {'Authorization': f'token {GITHUB_ACCESS_TOKEN}'}\n", + " query = f'\"{keyword}\" in:name,description,readme'\n", + "\n", + " if last_created_at:\n", + " query += f' created:>{last_created_at}'\n", + "\n", + " try:\n", + " repos_response = requests.get(\n", + " \"https://api.github.com/search/repositories\",\n", + " headers=headers,\n", + " params={'q': query}\n", + " )\n", + "\n", + " if repos_response.status_code == 403:\n", + " # Handle rate limiting\n", + " rate_limit = repos_response.headers.get('X-RateLimit-Reset')\n", + " if not rate_limit:\n", + " return repos\n", + "\n", + " sleep_time = int(rate_limit) - int(time())\n", + " if sleep_time > 0:\n", + " print(f\"Rate limit exceeded. 
Retrying in {sleep_time} seconds.\")\n", + " sleep(sleep_time)\n", + " return github_search_repos(keyword, last_created_at)\n", + "\n", + " if repos_response.status_code != 200:\n", + " return repos\n", + "\n", + " for repo in repos_response.json().get('items', []):\n", + " try:\n", + " readme_response = requests.get(repo['contents_url'].replace('{+path}', 'README.md'), headers=headers)\n", + " if readme_response.status_code != 200:\n", + " continue\n", + "\n", + " readme_file = readme_response.json()\n", + " if readme_file['size'] > 7000:\n", + " continue\n", + "\n", + " readme_text = requests.get(readme_file['download_url']).text\n", + " if not readme_text:\n", + " continue\n", + "\n", + " repos.append({\n", + " 'repo_id': repo['id'],\n", + " 'name': repo['name'],\n", + " 'link': repo['html_url'],\n", + " 'created_at': datetime.strptime(repo['created_at'], '%Y-%m-%dT%H:%M:%SZ').timestamp(),\n", + " 'description': repo.get('description') or '',\n", + " 'readme': readme_text,\n", + " })\n", + " except Exception:\n", + " continue\n", + " except Exception:\n", + " return repos\n", + "\n", + " return repos" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "After we conduct this search, we will insert the results into another table in the database. The data inserted will have embeddings associated with it." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [], + "source": [ + "def github_insert_model_repos(model_repo_id, repos):\n", + " for repo in repos:\n", + " try:\n", + " values = []\n", + " value = {\n", + " 'model_repo_id': model_repo_id,\n", + " 'repo_id': repo['repo_id'],\n", + " 'name': repo['name'],\n", + " 'description': repo['description'],\n", + " 'clean_text': clean_string(repo['readme']),\n", + " 'link': repo['link'],\n", + " 'created_at': repo['created_at'],\n", + " }\n", + "\n", + " to_embedding = {\n", + " 'model_repo_id': model_repo_id,\n", + " 'name': value['name'],\n", + " 'description': value['description'],\n", + " 'clean_text': value['clean_text']\n", + " }\n", + "\n", + " if count_tokens(value['clean_text']) <= TOKENS_TRASHHOLD_LIMIT:\n", + " embedding = str(create_embedding(json.dumps(to_embedding)))\n", + " values.append({**value, 'embedding': embedding})\n", + " else:\n", + " for chunk in string_into_chunks(value['clean_text']):\n", + " embedding = str(create_embedding(json.dumps({\n", + " **to_embedding,\n", + " 'clean_text': chunk\n", + " })))\n", + " values.append({**value, 'clean_text': chunk, 'embedding': embedding})\n", + "\n", + " for chunk in list_into_chunks([list(value.values()) for value in values]):\n", + " with connection.cursor() as cursor:\n", + " cursor.executemany(f'''\n", + " INSERT INTO {MODEL_GITHUB_REPOS_TABLE_NAME} (model_repo_id, repo_id, name, description, clean_text, link, created_at, embedding)\n", + " VALUES (%s, %s, %s, %s, %s, %s, FROM_UNIXTIME(%s), JSON_ARRAY_PACK(%s))\n", + " ''', chunk)\n", + " except Exception as e:\n", + " print('Error github_insert_model_repos: ', e)\n", + "\n", + "\n", + "def github_process_models_repos(existed_models):\n", + " print('Processing GitHub posts')\n", + "\n", + " for model in existed_models:\n", + " try:\n", + " repo_id = model['repo_id']\n", + " last_created_at = db_get_last_created_at(MODEL_GITHUB_REPOS_TABLE_NAME, repo_id, 
True)\n", + " keyword = model['name'] if re.search(r'\\d', model['name']) else repo_id\n", + " found_repos = github_search_repos(keyword, last_created_at)\n", + "\n", + " if len(found_repos):\n", + " github_insert_model_repos(repo_id, found_repos)\n", + " except Exception as e:\n", + " print('Error github_process_models_repos: ', e)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 6.3. Load Data from Twitter about these models." + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "First, we will search Twitter based on the model names we have using the API." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [], + "source": [ + "twitter = tweepy.Client(TWITTER_BEARER_TOKEN)\n", + "def twitter_search_posts(keyword, last_created_at):\n", + " posts = []\n", + "\n", + " try:\n", + " tweets = twitter.search_recent_tweets(\n", + " query=f'{keyword} -is:retweet',\n", + " tweet_fields=['id', 'text', 'created_at'],\n", + " start_time=last_created_at,\n", + " max_results=100\n", + " )\n", + "\n", + " for tweet in tweets.data:\n", + " posts.append({\n", + " 'post_id': tweet.id,\n", + " 'text': tweet.text,\n", + " 'created_at': tweet.created_at,\n", + " })\n", + " except Exception:\n", + " return posts\n", + "\n", + " return posts" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we will add the text from the posts per model into another table. This table will also have embeddings associated with it." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [], + "source": [ + "def twitter_insert_model_posts(model_repo_id, posts):\n", + " for post in posts:\n", + " try:\n", + " values = []\n", + "\n", + " value = {\n", + " 'model_repo_id': model_repo_id,\n", + " 'post_id': post['post_id'],\n", + " 'clean_text': clean_string(post['text']),\n", + " 'created_at': post['created_at'],\n", + " }\n", + "\n", + " to_embedding = {\n", + " 'model_repo_id': value['model_repo_id'],\n", + " 'clean_text': value['clean_text']\n", + " }\n", + "\n", + " embedding = str(create_embedding(json.dumps(to_embedding)))\n", + " values.append({**value, 'embedding': embedding})\n", + "\n", + " for chunk in list_into_chunks([list(value.values()) for value in values]):\n", + " with connection.cursor() as cursor:\n", + " cursor.executemany(f'''\n", + " INSERT INTO {MODEL_TWITTER_POSTS_TABLE_NAME} (model_repo_id, post_id, clean_text, created_at, embedding)\n", + " VALUES (%s, %s, %s, %s, JSON_ARRAY_PACK(%s))\n", + " ''', chunk)\n", + " except Exception as e:\n", + " print('Error twitter_insert_model_posts: ', e)\n", + "\n", + "def twitter_process_models_posts(existed_models):\n", + " print('Processing Twitter posts')\n", + "\n", + " for model in existed_models:\n", + " try:\n", + " repo_id = model['repo_id']\n", + " last_created_at = db_get_last_created_at(MODEL_TWITTER_POSTS_TABLE_NAME, repo_id, True)\n", + " keyword = model['name'] if re.search(r'\\d', model['name']) else repo_id\n", + " found_posts = twitter_search_posts(keyword, last_created_at)\n", + "\n", + " if len(found_posts):\n", + " twitter_insert_model_posts(repo_id, found_posts)\n", + " except Exception as e:\n", + " print('Error twitter_process_models_posts: ', e)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 6.4. 
Run the functions we've created above to load the data into SingleStore\n", + "First, the notebook creates tables in the database if they don't exist.\n", + "Next, the notebook retrieves the specified number of models from the Open LLM Leaderboard dataset, creates embeddings, and enters the data into the `models` and `model_readmes` tables.\n", + "Next, it executes a query to retrieve all the models in the database. Based on these models, Twitter posts and GitHub repositories are searched, converted into embeddings and inserted into tables.\n", + "\n", + "Finally, we get a ready set of data for finding the most appropriate model for any use case using semantic search." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [], + "source": [ + "create_tables()\n", + "leaderboard_process_models()\n", + "existed_models = get_models('repo_id, name', f'ORDER BY score DESC LIMIT {MODELS_LIMIT}')\n", + "twitter_process_models_posts(existed_models)\n", + "github_process_models_repos(existed_models)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### (Optional) 6.5 Run this notebook every hour using our built-in Job Service\n", + "By scheduling this notebook to run every hour, the latest data from Hugging Face will be pulled on new models, their scores and their likes/downloads.\n", + "This will ensure that you can capture the latest sentiment and usage from developers on Twitter and GitHub.\n", + "\n", + "SingleStore Notebook + Job Service makes it really easy to bring real-time data to your vector-based searches and AI/ML models downstream. You can ensure that the data is in the right format and apply Python-based transformations, like creating embeddings, on the most recently ingested data. 
This would previously have required a combination of several serverless technologies alongside your database, as we described in an [earlier blog post](https://www.singlestore.com/blog/a-serverless-architecture-for-creating-openai-embeddings-with-singlestoredb/)." + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## (Optional) Step 7: Host the app with Vercel" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Follow our GitHub [repo](https://github.com/singlestore-labs/llm-recommender/tree/main), where we showcase how to write the front-end code of the app, which performs the vector similarity search to provide the results. The front end is built with our [Elegance SDK](https://elegancesdk.com/) and hosted with Vercel.\n", + "\n", + "See our [guide](https://docs.singlestore.com/cloud/integrate-with-singlestoredb-cloud/connect-with-vercel/) on our Vercel integration with SingleStore. We have a public version of the app running for free [here](https://llm-recommender.vercel.app/)." + ] + }, + { + "cell_type": "markdown", + "id": "996c0586-1c4b-4c1f-aa37-240d11f544eb", + "metadata": {}, + "source": [ + "
\n", + "
" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.6" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/notebooks/real-time-recommendation-engine/singlestore_bundle.pem b/notebooks/real-time-recommendation-engine/singlestore_bundle.pem new file mode 100644 index 0000000..f9b1e41 --- /dev/null +++ b/notebooks/real-time-recommendation-engine/singlestore_bundle.pem @@ -0,0 +1,130 @@ +-----BEGIN CERTIFICATE----- +MIIF3jCCA8agAwIBAgIQAf1tMPyjylGoG7xkDjUDLTANBgkqhkiG9w0BAQwFADCB +iDELMAkGA1UEBhMCVVMxEzARBgNVBAgTCk5ldyBKZXJzZXkxFDASBgNVBAcTC0pl +cnNleSBDaXR5MR4wHAYDVQQKExVUaGUgVVNFUlRSVVNUIE5ldHdvcmsxLjAsBgNV +BAMTJVVTRVJUcnVzdCBSU0EgQ2VydGlmaWNhdGlvbiBBdXRob3JpdHkwHhcNMTAw +MjAxMDAwMDAwWhcNMzgwMTE4MjM1OTU5WjCBiDELMAkGA1UEBhMCVVMxEzARBgNV +BAgTCk5ldyBKZXJzZXkxFDASBgNVBAcTC0plcnNleSBDaXR5MR4wHAYDVQQKExVU +aGUgVVNFUlRSVVNUIE5ldHdvcmsxLjAsBgNVBAMTJVVTRVJUcnVzdCBSU0EgQ2Vy +dGlmaWNhdGlvbiBBdXRob3JpdHkwggIiMA0GCSqGSIb3DQEBAQUAA4ICDwAwggIK +AoICAQCAEmUXNg7D2wiz0KxXDXbtzSfTTK1Qg2HiqiBNCS1kCdzOiZ/MPans9s/B +3PHTsdZ7NygRK0faOca8Ohm0X6a9fZ2jY0K2dvKpOyuR+OJv0OwWIJAJPuLodMkY +tJHUYmTbf6MG8YgYapAiPLz+E/CHFHv25B+O1ORRxhFnRghRy4YUVD+8M/5+bJz/ +Fp0YvVGONaanZshyZ9shZrHUm3gDwFA66Mzw3LyeTP6vBZY1H1dat//O+T23LLb2 +VN3I5xI6Ta5MirdcmrS3ID3KfyI0rn47aGYBROcBTkZTmzNg95S+UzeQc0PzMsNT +79uq/nROacdrjGCT3sTHDN/hMq7MkztReJVni+49Vv4M0GkPGw/zJSZrM233bkf6 +c0Plfg6lZrEpfDKEY1WJxA3Bk1QwGROs0303p+tdOmw1XNtB1xLaqUkL39iAigmT +Yo61Zs8liM2EuLE/pDkP2QKe6xJMlXzzawWpXhaDzLhn4ugTncxbgtNMs+1b/97l +c6wjOy0AvzVVdAlJ2ElYGn+SNuZRkg7zJn0cTRe8yexDJtC/QV9AqURE9JnnV4ee +UB9XVKg+/XRjL7FQZQnmWEIuQxpMtPAlR1n6BB6T1CZGSlCBst6+eLf8ZxXhyVeE 
+Hg9j1uliutZfVS7qXMYoCAQlObgOK6nyTJccBz8NUvXt7y+CDwIDAQABo0IwQDAd +BgNVHQ4EFgQUU3m/WqorSs9UgOHYm8Cd8rIDZsswDgYDVR0PAQH/BAQDAgEGMA8G +A1UdEwEB/wQFMAMBAf8wDQYJKoZIhvcNAQEMBQADggIBAFzUfA3P9wF9QZllDHPF +Up/L+M+ZBn8b2kMVn54CVVeWFPFSPCeHlCjtHzoBN6J2/FNQwISbxmtOuowhT6KO +VWKR82kV2LyI48SqC/3vqOlLVSoGIG1VeCkZ7l8wXEskEVX/JJpuXior7gtNn3/3 +ATiUFJVDBwn7YKnuHKsSjKCaXqeYalltiz8I+8jRRa8YFWSQEg9zKC7F4iRO/Fjs +8PRF/iKz6y+O0tlFYQXBl2+odnKPi4w2r78NBc5xjeambx9spnFixdjQg3IM8WcR +iQycE0xyNN+81XHfqnHd4blsjDwSXWXavVcStkNr/+XeTWYRUc+ZruwXtuhxkYze +Sf7dNXGiFSeUHM9h4ya7b6NnJSFd5t0dCy5oGzuCr+yDZ4XUmFF0sbmZgIn/f3gZ +XHlKYC6SQK5MNyosycdiyA5d9zZbyuAlJQG03RoHnHcAP9Dc1ew91Pq7P8yF1m9/ +qS3fuQL39ZeatTXaw2ewh0qpKJ4jjv9cJ2vhsE/zB+4ALtRZh8tSQZXq9EfX7mRB +VXyNWQKV3WKdwrnuWih0hKWbt5DHDAff9Yk2dDLWKMGwsAvgnEzDHNb842m1R0aB +L6KCq9NjRHDEjf8tM7qtj3u1cIiuPhnPQCjY/MiQu12ZIvVS5ljFH4gxQ+6IHdfG +jjxDah2nGN59PRbxYvnKkKj9 +-----END CERTIFICATE----- +-----BEGIN CERTIFICATE----- +MIIGEzCCA/ugAwIBAgIQfVtRJrR2uhHbdBYLvFMNpzANBgkqhkiG9w0BAQwFADCB +iDELMAkGA1UEBhMCVVMxEzARBgNVBAgTCk5ldyBKZXJzZXkxFDASBgNVBAcTC0pl +cnNleSBDaXR5MR4wHAYDVQQKExVUaGUgVVNFUlRSVVNUIE5ldHdvcmsxLjAsBgNV +BAMTJVVTRVJUcnVzdCBSU0EgQ2VydGlmaWNhdGlvbiBBdXRob3JpdHkwHhcNMTgx +MTAyMDAwMDAwWhcNMzAxMjMxMjM1OTU5WjCBjzELMAkGA1UEBhMCR0IxGzAZBgNV +BAgTEkdyZWF0ZXIgTWFuY2hlc3RlcjEQMA4GA1UEBxMHU2FsZm9yZDEYMBYGA1UE +ChMPU2VjdGlnbyBMaW1pdGVkMTcwNQYDVQQDEy5TZWN0aWdvIFJTQSBEb21haW4g +VmFsaWRhdGlvbiBTZWN1cmUgU2VydmVyIENBMIIBIjANBgkqhkiG9w0BAQEFAAOC +AQ8AMIIBCgKCAQEA1nMz1tc8INAA0hdFuNY+B6I/x0HuMjDJsGz99J/LEpgPLT+N +TQEMgg8Xf2Iu6bhIefsWg06t1zIlk7cHv7lQP6lMw0Aq6Tn/2YHKHxYyQdqAJrkj +eocgHuP/IJo8lURvh3UGkEC0MpMWCRAIIz7S3YcPb11RFGoKacVPAXJpz9OTTG0E +oKMbgn6xmrntxZ7FN3ifmgg0+1YuWMQJDgZkW7w33PGfKGioVrCSo1yfu4iYCBsk +Haswha6vsC6eep3BwEIc4gLw6uBK0u+QDrTBQBbwb4VCSmT3pDCg/r8uoydajotY +uK3DGReEY+1vVv2Dy2A0xHS+5p3b4eTlygxfFQIDAQABo4IBbjCCAWowHwYDVR0j +BBgwFoAUU3m/WqorSs9UgOHYm8Cd8rIDZsswHQYDVR0OBBYEFI2MXsRUrYrhd+mb ++ZsF4bgBjWHhMA4GA1UdDwEB/wQEAwIBhjASBgNVHRMBAf8ECDAGAQH/AgEAMB0G 
+A1UdJQQWMBQGCCsGAQUFBwMBBggrBgEFBQcDAjAbBgNVHSAEFDASMAYGBFUdIAAw +CAYGZ4EMAQIBMFAGA1UdHwRJMEcwRaBDoEGGP2h0dHA6Ly9jcmwudXNlcnRydXN0 +LmNvbS9VU0VSVHJ1c3RSU0FDZXJ0aWZpY2F0aW9uQXV0aG9yaXR5LmNybDB2Bggr +BgEFBQcBAQRqMGgwPwYIKwYBBQUHMAKGM2h0dHA6Ly9jcnQudXNlcnRydXN0LmNv +bS9VU0VSVHJ1c3RSU0FBZGRUcnVzdENBLmNydDAlBggrBgEFBQcwAYYZaHR0cDov +L29jc3AudXNlcnRydXN0LmNvbTANBgkqhkiG9w0BAQwFAAOCAgEAMr9hvQ5Iw0/H +ukdN+Jx4GQHcEx2Ab/zDcLRSmjEzmldS+zGea6TvVKqJjUAXaPgREHzSyrHxVYbH +7rM2kYb2OVG/Rr8PoLq0935JxCo2F57kaDl6r5ROVm+yezu/Coa9zcV3HAO4OLGi +H19+24rcRki2aArPsrW04jTkZ6k4Zgle0rj8nSg6F0AnwnJOKf0hPHzPE/uWLMUx +RP0T7dWbqWlod3zu4f+k+TY4CFM5ooQ0nBnzvg6s1SQ36yOoeNDT5++SR2RiOSLv +xvcRviKFxmZEJCaOEDKNyJOuB56DPi/Z+fVGjmO+wea03KbNIaiGCpXZLoUmGv38 +sbZXQm2V0TP2ORQGgkE49Y9Y3IBbpNV9lXj9p5v//cWoaasm56ekBYdbqbe4oyAL +l6lFhd2zi+WJN44pDfwGF/Y4QA5C5BIG+3vzxhFoYt/jmPQT2BVPi7Fp2RBgvGQq +6jG35LWjOhSbJuMLe/0CjraZwTiXWTb2qHSihrZe68Zk6s+go/lunrotEbaGmAhY +LcmsJWTyXnW0OMGuf1pGg+pRyrbxmRE1a6Vqe8YAsOf4vmSyrcjC8azjUeqkk+B5 +yOGBQMkKW+ESPMFgKuOXwIlCypTPRpgSabuY0MLTDXJLR27lk8QyKGOHQ+SwMj4K +00u/I5sUKUErmgQfky3xxzlIPK1aEn8= +-----END CERTIFICATE----- +-----BEGIN CERTIFICATE----- +MIIFazCCA1OgAwIBAgIRAIIQz7DSQONZRGPgu2OCiwAwDQYJKoZIhvcNAQELBQAw +TzELMAkGA1UEBhMCVVMxKTAnBgNVBAoTIEludGVybmV0IFNlY3VyaXR5IFJlc2Vh +cmNoIEdyb3VwMRUwEwYDVQQDEwxJU1JHIFJvb3QgWDEwHhcNMTUwNjA0MTEwNDM4 +WhcNMzUwNjA0MTEwNDM4WjBPMQswCQYDVQQGEwJVUzEpMCcGA1UEChMgSW50ZXJu +ZXQgU2VjdXJpdHkgUmVzZWFyY2ggR3JvdXAxFTATBgNVBAMTDElTUkcgUm9vdCBY +MTCCAiIwDQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBAK3oJHP0FDfzm54rVygc +h77ct984kIxuPOZXoHj3dcKi/vVqbvYATyjb3miGbESTtrFj/RQSa78f0uoxmyF+ +0TM8ukj13Xnfs7j/EvEhmkvBioZxaUpmZmyPfjxwv60pIgbz5MDmgK7iS4+3mX6U +A5/TR5d8mUgjU+g4rk8Kb4Mu0UlXjIB0ttov0DiNewNwIRt18jA8+o+u3dpjq+sW +T8KOEUt+zwvo/7V3LvSye0rgTBIlDHCNAymg4VMk7BPZ7hm/ELNKjD+Jo2FR3qyH +B5T0Y3HsLuJvW5iB4YlcNHlsdu87kGJ55tukmi8mxdAQ4Q7e2RCOFvu396j3x+UC +B5iPNgiV5+I3lg02dZ77DnKxHZu8A/lJBdiB3QW0KtZB6awBdpUKD9jf1b0SHzUv 
+KBds0pjBqAlkd25HN7rOrFleaJ1/ctaJxQZBKT5ZPt0m9STJEadao0xAH0ahmbWn +OlFuhjuefXKnEgV4We0+UXgVCwOPjdAvBbI+e0ocS3MFEvzG6uBQE3xDk3SzynTn +jh8BCNAw1FtxNrQHusEwMFxIt4I7mKZ9YIqioymCzLq9gwQbooMDQaHWBfEbwrbw +qHyGO0aoSCqI3Haadr8faqU9GY/rOPNk3sgrDQoo//fb4hVC1CLQJ13hef4Y53CI +rU7m2Ys6xt0nUW7/vGT1M0NPAgMBAAGjQjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNV +HRMBAf8EBTADAQH/MB0GA1UdDgQWBBR5tFnme7bl5AFzgAiIyBpY9umbbjANBgkq +hkiG9w0BAQsFAAOCAgEAVR9YqbyyqFDQDLHYGmkgJykIrGF1XIpu+ILlaS/V9lZL +ubhzEFnTIZd+50xx+7LSYK05qAvqFyFWhfFQDlnrzuBZ6brJFe+GnY+EgPbk6ZGQ +3BebYhtF8GaV0nxvwuo77x/Py9auJ/GpsMiu/X1+mvoiBOv/2X/qkSsisRcOj/KK +NFtY2PwByVS5uCbMiogziUwthDyC3+6WVwW6LLv3xLfHTjuCvjHIInNzktHCgKQ5 +ORAzI4JMPJ+GslWYHb4phowim57iaztXOoJwTdwJx4nLCgdNbOhdjsnvzqvHu7Ur +TkXWStAmzOVyyghqpZXjFaH3pO3JLF+l+/+sKAIuvtd7u+Nxe5AW0wdeRlN8NwdC +jNPElpzVmbUq4JUagEiuTDkHzsxHpFKVK7q4+63SM1N95R1NbdWhscdCb+ZAJzVc +oyi3B43njTOQ5yOf+1CceWxG1bQVs5ZufpsMljq4Ui0/1lvh+wjChP4kqKOJ2qxq +4RgqsahDYVvTH9w7jXbyLeiNdd8XM2w9U/t7y0Ff/9yi0GE44Za4rF2LN9d11TPA +mRGunUHBcnWEvgJBQl9nJEiU0Zsnvgc/ubhPgXRR4Xq37Z0j4r7g1SgEEzwxA57d +emyPxgcYxn/eR44/KJ4EBs+lVDR3veyJm+kXQ99b21/+jh5Xos1AnX5iItreGCc= +-----END CERTIFICATE----- +-----BEGIN CERTIFICATE----- +MIIFFjCCAv6gAwIBAgIRAJErCErPDBinU/bWLiWnX1owDQYJKoZIhvcNAQELBQAw +TzELMAkGA1UEBhMCVVMxKTAnBgNVBAoTIEludGVybmV0IFNlY3VyaXR5IFJlc2Vh +cmNoIEdyb3VwMRUwEwYDVQQDEwxJU1JHIFJvb3QgWDEwHhcNMjAwOTA0MDAwMDAw +WhcNMjUwOTE1MTYwMDAwWjAyMQswCQYDVQQGEwJVUzEWMBQGA1UEChMNTGV0J3Mg +RW5jcnlwdDELMAkGA1UEAxMCUjMwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEK +AoIBAQC7AhUozPaglNMPEuyNVZLD+ILxmaZ6QoinXSaqtSu5xUyxr45r+XXIo9cP +R5QUVTVXjJ6oojkZ9YI8QqlObvU7wy7bjcCwXPNZOOftz2nwWgsbvsCUJCWH+jdx +sxPnHKzhm+/b5DtFUkWWqcFTzjTIUu61ru2P3mBw4qVUq7ZtDpelQDRrK9O8Zutm +NHz6a4uPVymZ+DAXXbpyb/uBxa3Shlg9F8fnCbvxK/eG3MHacV3URuPMrSXBiLxg +Z3Vms/EY96Jc5lP/Ooi2R6X/ExjqmAl3P51T+c8B5fWmcBcUr2Ok/5mzk53cU6cG +/kiFHaFpriV1uxPMUgP17VGhi9sVAgMBAAGjggEIMIIBBDAOBgNVHQ8BAf8EBAMC +AYYwHQYDVR0lBBYwFAYIKwYBBQUHAwIGCCsGAQUFBwMBMBIGA1UdEwEB/wQIMAYB 
+Af8CAQAwHQYDVR0OBBYEFBQusxe3WFbLrlAJQOYfr52LFMLGMB8GA1UdIwQYMBaA +FHm0WeZ7tuXkAXOACIjIGlj26ZtuMDIGCCsGAQUFBwEBBCYwJDAiBggrBgEFBQcw +AoYWaHR0cDovL3gxLmkubGVuY3Iub3JnLzAnBgNVHR8EIDAeMBygGqAYhhZodHRw +Oi8veDEuYy5sZW5jci5vcmcvMCIGA1UdIAQbMBkwCAYGZ4EMAQIBMA0GCysGAQQB +gt8TAQEBMA0GCSqGSIb3DQEBCwUAA4ICAQCFyk5HPqP3hUSFvNVneLKYY611TR6W +PTNlclQtgaDqw+34IL9fzLdwALduO/ZelN7kIJ+m74uyA+eitRY8kc607TkC53wl +ikfmZW4/RvTZ8M6UK+5UzhK8jCdLuMGYL6KvzXGRSgi3yLgjewQtCPkIVz6D2QQz +CkcheAmCJ8MqyJu5zlzyZMjAvnnAT45tRAxekrsu94sQ4egdRCnbWSDtY7kh+BIm +lJNXoB1lBMEKIq4QDUOXoRgffuDghje1WrG9ML+Hbisq/yFOGwXD9RiX8F6sw6W4 +avAuvDszue5L3sz85K+EC4Y/wFVDNvZo4TYXao6Z0f+lQKc0t8DQYzk1OXVu8rp2 +yJMC6alLbBfODALZvYH7n7do1AZls4I9d1P4jnkDrQoxB3UqQ9hVl3LEKQ73xF1O +yK5GhDDX8oVfGKF5u+decIsH4YaTw7mP3GFxJSqv3+0lUFJoi5Lc5da149p90Ids +hCExroL1+7mryIkXPeFM5TgO9r0rvZaBFOvV2z0gp35Z0+L4WPlbuEjN/lxPFin+ +HlUjr8gRsI3qfJOQFy/9rKIJR0Y/8Omwt/8oTWgy1mdeHmmjk7j1nYsvC9JSQ6Zv +MldlTTKB3zhThV1+XWYp6rjd5JW1zbVWEkLNxE7GJThEUG3szgBVGP7pSWTUTsqX +nLRbwHOoq7hHwg== +-----END CERTIFICATE-----