Show rate limit issues in the UI #3913

Open
rbren opened this issue Sep 17, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@rbren
Collaborator

rbren commented Sep 17, 2024

What problem or use case are you trying to solve?

I'm getting rate limited by Anthropic. But it just looks like the agent is kinda stuck while it cools down.

Describe the UX of the solution you'd like

I'd like the indicator to turn yellow and show a relevant message about rate limits.

[Screenshot 2024-09-17 at 10:41:33 AM]

Do you have thoughts on the technical implementation?

@tobitege has done a little preliminary work here. Basically, I think we need to turn the status badge from "agent status" into "system status" (see the sketch below).

Describe alternatives you've considered

Additional context
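
To make the badge idea concrete, here's a minimal sketch of what a backend-side "system status" could look like. All names here (SystemStatus, emit_status, call_llm_with_status) are hypothetical illustrations, not the actual OpenHands API; in practice the status would be pushed over the session's event stream so the frontend can recolor the indicator.

```python
# Hypothetical sketch -- SystemStatus, emit_status, and call_llm_with_status
# are illustrative names, not the actual OpenHands API.
from enum import Enum

import litellm


class SystemStatus(Enum):
    RUNNING = "running"            # agent actively working (green badge)
    RATE_LIMITED = "rate_limited"  # cooling down after a 429 (yellow badge)
    ERROR = "error"                # unrecoverable failure (red badge)


def emit_status(status: SystemStatus, message: str = "") -> None:
    # Placeholder: in practice this would emit an event to the frontend
    # (e.g. over the session websocket) so the UI can update the badge.
    print(f"[status] {status.value}: {message}")


def call_llm_with_status(**completion_kwargs):
    emit_status(SystemStatus.RUNNING)
    try:
        return litellm.completion(**completion_kwargs)
    except litellm.RateLimitError:
        emit_status(
            SystemStatus.RATE_LIMITED,
            "Rate limited by the LLM provider; cooling down before retrying.",
        )
        raise
```

The key change is that the badge would reflect system-level conditions (rate limits, provider outages) rather than only what the agent itself is doing.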

@rbren rbren added the enhancement New feature or request label Sep 17, 2024
@tobitege
Collaborator

Btw, during benchmark runs since yesterday, I've been getting server error 502 with an HTML error message (lots of file edits back and forth in a short amount of time), but I have a feeling that's the same error you've experienced when getting rate limited?

@rbren
Collaborator Author

rbren commented Sep 17, 2024

Yeah, exactly. I think it was due to file editing issues.

@tobitege
Collaborator

litellm completion calls can take a cooldown parameter (a number of seconds to cool down after hitting rate limits), i.e. the backoff happens automatically without raising an exception.
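
For reference, here's a minimal sketch of that cooldown behavior via litellm's Router; the parameter names (num_retries, allowed_fails, cooldown_time) follow litellm's Router docs and may vary between versions.

```python
# Minimal sketch of litellm's Router cooldown behavior; parameter names
# may differ across litellm versions.
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "claude-3-5-sonnet",
            "litellm_params": {
                # API key is read from the environment (ANTHROPIC_API_KEY).
                "model": "anthropic/claude-3-5-sonnet-20240620",
            },
        },
    ],
    num_retries=3,     # retry failed calls before giving up
    allowed_fails=1,   # failures before a deployment is cooled down
    cooldown_time=60,  # seconds to keep a rate-limited deployment on ice
)

response = router.completion(
    model="claude-3-5-sonnet",
    messages=[{"role": "user", "content": "ping"}],
)
```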

@tobitege
Collaborator

tobitege commented Sep 19, 2024

Just an example I found in my logs (linebreaks added for readability; 429 is the default status code for rate limiting in litellm):

18:04:47 - openhands:ERROR: llm.py:128 - litellm.RateLimitError: RateLimitError: OpenAIException - Error code: 429 - 
{'error': {'message': 'No deployments available for selected model, Try again in 60 seconds. Passed model=claude-3-5-sonnet@20240620. pre-call-checks=False,
allowed_model_region=n/a, cooldown_list=[(\'75365eba-c184-48b9-8195-f845d4b812ab\', 
{\'Exception Received\': \'litellm.RateLimitError: BedrockException - {"message":"Too many requests, please wait before trying again.
You have sent too many requests.  Wait before trying again."}\', \'Status Code\': \'429\'}), 
(\'0fba6cb1-2b22-45a1-9ec4-f292d74213d4\', {\'Exception Received\': \'litellm.RateLimitError: litellm.RateLimitError: VertexAIException - 
{\\n  "error": {\\n    "code": 429,\\n    "message": "Online prediction request quota exceeded for anthropic-claude-3-5-sonnet.
Please try again later with backoff.",\\n    "status": "RESOURCE_EXHAUSTED"\\n  }\\n}\\n\', \'Status Code\': \'429\'})]', 
'type': 'None', 'param': 'None', 'code': '429'}}. Attempt #1 | You can customize these settings in the configuration.
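
The trailing "Attempt #1" suggests the call site already retries on RateLimitError; here's a minimal sketch of that retry-with-backoff pattern using tenacity (the actual decorator arguments in llm.py may differ).

```python
# Sketch of the retry pattern behind the "Attempt #1" log line above;
# the exact configuration in OpenHands' llm.py may differ.
import litellm
from tenacity import (
    retry,
    retry_if_exception_type,
    stop_after_attempt,
    wait_random_exponential,
)


@retry(
    retry=retry_if_exception_type(litellm.RateLimitError),
    wait=wait_random_exponential(min=1, max=60),  # jittered exponential backoff
    stop=stop_after_attempt(5),
    reraise=True,
)
def completion_with_retry(**kwargs):
    return litellm.completion(**kwargs)
```

Surfacing the attempt number and wait time through a status event (as sketched earlier) would give the UI everything it needs to show a yellow "rate limited, retrying" badge.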
