feat(forge): Improvement in web components (#7068)
- Add `duckduckgo_backend` field to `WebSearchComponent` configuration
- Add `selenium_proxy` to `WebSeleniumComponent` configuration
- Update docs
amirdaaee authored Jul 25, 2024
1 parent 22b6dbb commit 3b0cd95
Showing 4 changed files with 18 additions and 7 deletions.
1 change: 1 addition & 0 deletions autogpt/.env.template
@@ -105,6 +105,7 @@
## HUGGINGFACE_API_TOKEN - HuggingFace API token (Default: None)
# HUGGINGFACE_API_TOKEN=


### Stable Diffusion (IMAGE_PROVIDER=sdwebui)

## SD_WEBUI_AUTH - Stable Diffusion Web UI username:password pair (Default: None)
12 changes: 7 additions & 5 deletions docs/content/forge/components/built-in-components.md
@@ -155,11 +155,12 @@ Allows agent to search the web. Google credentials aren't required for DuckDuckG

### `WebSearchConfiguration`

-| Config variable | Details | Type | Default |
-| -------------------------------- | ----------------------------------------------------------------------- | ----- | ------- |
-| `google_api_key` | Google API key, *ENV:* `GOOGLE_API_KEY` | `str` | `None` |
-| `google_custom_search_engine_id` | Google Custom Search Engine ID, *ENV:* `GOOGLE_CUSTOM_SEARCH_ENGINE_ID` | `str` | `None` |
-| `duckduckgo_max_attempts` | Maximum number of attempts to search using DuckDuckGo | `int` | `3` |
+| Config variable | Details | Type | Default |
+| -------------------------------- | ----------------------------------------------------------------------- | --------------------------- | ------- |
+| `google_api_key` | Google API key, *ENV:* `GOOGLE_API_KEY` | `str` | `None` |
+| `google_custom_search_engine_id` | Google Custom Search Engine ID, *ENV:* `GOOGLE_CUSTOM_SEARCH_ENGINE_ID` | `str` | `None` |
+| `duckduckgo_max_attempts` | Maximum number of attempts to search using DuckDuckGo | `int` | `3` |
+| `duckduckgo_backend` | Backend to be used for DDG sdk | `"api" \| "html" \| "lite"` | `"api"` |

### DirectiveProvider

@@ -183,6 +184,7 @@ Allows agent to read websites using Selenium.
| `headless` | Run browser in headless mode | `bool` | `True` |
| `user_agent` | User agent used by the browser | `str` | `"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"` |
| `browse_spacy_language_model` | Spacy language model used for chunking text | `str` | `"en_core_web_sm"` |
+| `selenium_proxy` | Http proxy to use with Selenium | `str` | `None` |

### DirectiveProvider

7 changes: 5 additions & 2 deletions forge/forge/components/web/search.py
@@ -1,7 +1,7 @@
import json
import logging
import time
-from typing import Iterator, Optional
+from typing import Iterator, Literal, Optional

from duckduckgo_search import DDGS
from pydantic import BaseModel, SecretStr
@@ -24,6 +24,7 @@ class WebSearchConfiguration(BaseModel):
None, from_env="GOOGLE_CUSTOM_SEARCH_ENGINE_ID", exclude=True
)
duckduckgo_max_attempts: int = 3
+duckduckgo_backend: Literal["api", "html", "lite"] = "api"


class WebSearchComponent(
@@ -89,7 +90,9 @@ def web_search(self, query: str, num_results: int = 8) -> str:
if not query:
return json.dumps(search_results)

-search_results = DDGS().text(query, max_results=num_results)
+search_results = DDGS().text(
+query, max_results=num_results, backend=self.config.duckduckgo_backend
+)

if search_results:
break
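
The `search.py` change above threads the configured backend into each retry of the search call. A minimal, standard-library-only sketch of that pattern follows. `SearchConfigSketch`, `search_with_retries`, and `search_fn` are hypothetical names used for illustration; they are not part of forge, and `search_fn` merely stands in for `DDGS().text`. The real configuration is a pydantic model, which validates `Literal` fields itself, so the explicit check here is only needed because a plain dataclass does not.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Literal, get_args

# The three values accepted by the new `duckduckgo_backend` field in the diff.
DuckDuckGoBackend = Literal["api", "html", "lite"]


@dataclass
class SearchConfigSketch:
    """Hypothetical stand-in for WebSearchConfiguration (not the real class)."""

    duckduckgo_max_attempts: int = 3
    duckduckgo_backend: DuckDuckGoBackend = "api"

    def __post_init__(self) -> None:
        # Dataclasses do not check Literal values at runtime, so validate here.
        if self.duckduckgo_backend not in get_args(DuckDuckGoBackend):
            raise ValueError(f"unsupported backend: {self.duckduckgo_backend!r}")


def search_with_retries(
    config: SearchConfigSketch,
    query: str,
    num_results: int,
    search_fn: Callable[..., Iterable[dict]],
) -> list[dict]:
    """Call search_fn up to duckduckgo_max_attempts times, passing the backend."""
    results: list[dict] = []
    if not query:
        return results
    for _ in range(config.duckduckgo_max_attempts):
        # search_fn is an assumed stand-in for DDGS().text in the commit.
        results = list(
            search_fn(query, max_results=num_results, backend=config.duckduckgo_backend)
        )
        if results:
            break
    return results
```

Passing the search callable as a parameter keeps the sketch runnable without network access; in the commit itself the call goes straight to `DDGS().text`.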
5 changes: 5 additions & 0 deletions forge/forge/components/web/selenium.py
@@ -68,6 +68,8 @@ class WebSeleniumConfiguration(BaseModel):
"""User agent used by the browser"""
browse_spacy_language_model: str = "en_core_web_sm"
"""Spacy language model used for chunking text"""
+selenium_proxy: Optional[str] = None
+"""Http proxy to use with Selenium"""


class WebSeleniumComponent(
@@ -301,6 +303,9 @@ async def open_page_in_browser(self, url: str) -> WebDriver:
options.add_argument("--headless=new")
options.add_argument("--disable-gpu")

+if self.config.selenium_proxy:
+options.add_argument(f"--proxy-server={self.config.selenium_proxy}")
+
self._sideload_chrome_extensions(options, self.data_dir / "assets" / "crx")

if (chromium_driver_path := Path("/usr/bin/chromedriver")).exists():
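
The `selenium.py` hunk above adds Chrome's `--proxy-server` switch only when a proxy is configured. The sketch below isolates that decision as a pure function so it can be checked without launching a browser. `build_chrome_arguments` is a hypothetical helper, not a forge function, and treating `--headless=new` and `--disable-gpu` as conditional on a `headless` flag is an assumption based on the surrounding context lines.

```python
from typing import List, Optional


def build_chrome_arguments(
    headless: bool = True, selenium_proxy: Optional[str] = None
) -> List[str]:
    """Return the Chrome command-line switches for the given settings."""
    arguments: List[str] = []
    if headless:
        # Assumed to mirror the headless branch visible in the diff context.
        arguments.append("--headless=new")
        arguments.append("--disable-gpu")
    if selenium_proxy:
        # Same switch the commit adds; None or "" leaves Chrome unproxied.
        arguments.append(f"--proxy-server={selenium_proxy}")
    return arguments
```

With Selenium, each returned string would be passed to `options.add_argument(...)`, as the commit does. The proxy value is forwarded verbatim, so something like `http://127.0.0.1:8080` (an example address, not one from the commit) has to be well formed before it reaches this point.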