Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: self discover agent #3437

Closed

Conversation

sb-git-cloud
Copy link
Contributor

What is the problem that this fixes or functionality that this introduces? Does it fix any open issues?

Implements a Self Discover agent, which provides a step-by-step plan to CodeActAgent. This is related to the open PR #46.


Give a summary of what the PR does, explaining any non-trivial design decisions
The agent first chooses reasoning modules that may be relevant for the user-given task. Then it adapts those to the specific task and finally it creates a plan, which is then sent to CodeAct for execution.

As of now it is in beta version.


Other references

@sb-git-cloud sb-git-cloud changed the title Self discover agent feat: self discover agent Aug 17, 2024
@tobitege
Copy link
Collaborator

Thanks for building and contributing this! 🤗


SYSTEM_SUFFIX = """Let the following principles guide your response:
- Before responding read all information carefully.
- Take time to think.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just commenting: I wonder whether this instruction was ever followed. 😂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good question. I don't have any data to answer...

Copy link
Collaborator

@tobitege tobitege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would love to see some test for this, if possible. Maybe in a follow up PR.

You MUST NOT include any other text besides the JSON response.
"""

# TODO: modify example so that it is consistent with prompting
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to keep this commented out code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Contributor Author

@sb-git-cloud sb-git-cloud Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, will remove the commented-out code for now and just leave the Todo comment

@li-boxuan
Copy link
Collaborator

Can you showcase some simple examples? Screenshots would suffice. Then maybe we could run some evaluation against it. C.C. @xingyaoww

Copy link
Contributor

@xingyaoww xingyaoww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! Overall this looks good to me! I don't see major road block to get this merged. A few things:

Btw, @sb-git-cloud if you are interested in running some SWE-Bench eval to validate the agent, i'd be happy to sponsor some API credits - just lmk in the slack :D

@@ -195,6 +195,11 @@ def _get_messages(self, state: State) -> list[Message]:
),
]

if 'task' in state.inputs:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this absolutely necessary for CodeAct to be delegated by other agents? cc @li-boxuan

Copy link
Collaborator

@enyst enyst Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, at least last I checked. The task was not accessible to the agent via an event.

I've been thinking we could return it from get_current_user_intent() though. It would be easier to use like goal=state.get_user_intent() for everything, delegate or not.

The issue is, if I recall correctly, that children start with zero history. The task is included in the DelegateAction, but the child doesn't have access to that action. We'll need Li Boxuan to confirm, though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The contract is messy and there lacks a clear API between agents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like for now the best is to leave it as is, and once the API between agents is defined we can update the agent with a follow-up PR.

implement_example = IMPLEMENT_EXAMPLE.format(
implement_state_key=SelfDiscoverState.IMPLEMENT.value,
task_key=TASK_KEY,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be nice if we can use the newly implemented Jinja template for this agent - but no pressure - we can do it follow up PR.

https://github.com/OpenDevin/OpenDevin/blob/14a4e45cbbbb01eba01b47e5d874c2583decdfd8/agenthub/codeact_agent/codeact_agent.py#L76-L80

If the template is too rigid, we can figure out a way to generalize it a bit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, given your comment below let's defer to a future PR.

"""


def get_prompt(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a second thought: the prompt here is probably a little bit too complex to support for the Jinja2 template in its current shape - we can defer that to future PR.

You MUST NOT include any other text besides the JSON response.
"""

# TODO: modify example so that it is consistent with prompting
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Collaborator

@tobitege tobitege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR still needs to register the agent in the agenthub registry (agenthub/__init__.py)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, can you add some README.md under this folder to explain more context about self-discover agent? I also think we need some evalution result on SWE bench before merge.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little curious on this, what result are we looking for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sounds good. Thanks for the suggestion. I will add a readme and run some evaluations.

@tobitege tobitege dismissed their stale review August 20, 2024 04:43

deferring to other maintainers

@sb-git-cloud
Copy link
Contributor Author

Thank you all for your reviews! Will work on the related changes.

@mamoodi
Copy link
Collaborator

mamoodi commented Sep 7, 2024

Hi @sb-git-cloud thanks so much for your contribution. Just wanted to check if this is something that is generally on your radar?

@sb-git-cloud
Copy link
Contributor Author

Hi @sb-git-cloud thanks so much for your contribution. Just wanted to check if this is something that is generally on your radar?

Thanks for checking in. Yes. Haven't had the chance to run evaluations. Hopefully I will get to it next week.

@neubig
Copy link
Contributor

neubig commented Sep 23, 2024

Hey @sb-git-cloud , if you're interested in running evaluations, things are now a bit easier with our remote runtime: https://github.com/All-Hands-AI/OpenHands/tree/main/evaluation/swe_bench#run-inference-on-remoteruntime-experimental

If you'd like to use that we'd be happy to help out.

Copy link
Contributor

This PR is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale Inactive for 30 days label Oct 24, 2024
Copy link
Contributor

This PR was closed because it has been stalled for over 30 days with no activity.

@github-actions github-actions bot closed this Oct 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Stale Inactive for 30 days
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants