
Add GoalManager, and make a DeployableMicrochainAgent with this feature #413

Merged — 30 commits merged into main from evan/goal-manager on Aug 29, 2024

Conversation

evangriffiths (Contributor) commented on Aug 21, 2024:

The GoalManager is a way of generating and evaluating goals for an agent. It can be used as a mechanism for the agent to learn from its experience, and to add more structure to each session.

The motivation is that, in theory, we already have a way for the agent to learn from its experience through a combination of RememberPastActions + UpdateMySystemPrompt, but in practice it's not clear how effective this is for our deployed agents.

The Goals generated for an agent are lower-level than the description of the agent ("you are a PM trader agent, make money"), but higher-level than individual tools ("buy Yes tokens"). This makes them suited to being proposed and evaluated on a per-session basis, which (hopefully!) gives the agent some signal for how to adapt from one session to the next.

An agent can use the GoalManager to (see the sketch after this list):

  • propose new goals (from scratch, or considering its goal history, including evaluations of whether they were completed)
  • retry previous goals that were not completed
  • evaluate whether the goal was completed in a session
  • save goals and their evaluations in a db
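
As a rough illustration of that flow, here is a minimal sketch of how a session might use the GoalManager. The method names (get_goal, to_prompt, evaluate_goal_progress, save_evaluated_goal) and the iterations value are assumptions pieced together from this description and the review thread below, not the exact implementation.

    # Minimal sketch only; method names and arguments are assumptions, not the exact API.
    def run_session_with_goals(agent, goal_manager) -> None:
        # 1. Propose a goal: from scratch, or informed by previously evaluated
        #    goals loaded from the DB (retrying uncompleted ones if allowed).
        goal = goal_manager.get_goal()

        # 2. The goal becomes the user prompt for this session.
        agent.prompt = goal.to_prompt()
        agent.run(iterations=50)  # iterations value is illustrative

        # 3. Judge from the chat history whether the goal was completed, then
        #    save the goal together with its evaluation for future sessions.
        evaluation = goal_manager.evaluate_goal_progress(
            goal=goal,
            chat_history=agent.history,  # or however the session's chat history is extracted
        )
        goal_manager.save_evaluated_goal(goal=goal, evaluation=evaluation)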

This feature is added to the microchain agent, but is default-off. A new deployable agent (DeployableMicrochainWithGoalManagerAgent0) is added for testing out this feature. You can see it in action below:

A goal is generated, and used as the user-prompt for the agent

It seems to work quite nicely with the microchain agent without having to tweak any prompting. It reasons that it has completed the goal, and calls Stop() to terminate the program. The goal evaluation is appended to the chat history for the user to see.

Fixes #233

coderabbitai bot (Contributor) commented on Aug 21, 2024:

Warning

Rate limit exceeded

@evangriffiths has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 10 minutes and 24 seconds before requesting another review.


Commits

Files that changed from the base of the PR and between 499b1f6 and d98d14a.

Walkthrough

The updates introduce a GoalManager class to the prediction market agent, enabling structured goal management and evaluation. This class incorporates Pydantic models for data validation, facilitates goal generation, and monitors goal progress based on chat history. Additionally, modifications to the DeployableMicrochainAgent class integrate the GoalManager, enhancing the agent's capability to manage and report on goals during execution.

Changes

  • prediction_market_agent/agents/goal_manager.py — Introduced GoalManager for managing and evaluating goals using structured templates. Added data models for Goal, GoalEvaluation, etc.
  • prediction_market_agent/agents/microchain_agent/deploy.py — Enhanced DeployableMicrochainAgent to include a GoalManager, allowing for goal retrieval and progress evaluation during execution.
  • prediction_market_agent/run_agent.py — Added a new agent type, DeployableMicrochainWithGoalManagerAgent0, to the RunnableAgent enumeration, expanding agent capabilities.
  • prediction_market_agent/agents/microchain_agent/microchain_agent.py — Introduced get_functions_summary_list function for summarizing engine functions.
  • prediction_market_agent/agents/microchain_agent/prompts.py — Added new system prompt TRADING_AGENT_SYSTEM_PROMPT_MINIMAL and updated the SystemPromptChoice enum to support the minimal prompt variant.

Assessment against linked issues

  • Add ability for agent to create and monitor (sub-)goals (#233)
  • Add ability to set a goal in app (#416)


Comment on lines +218 to +219
TODO add the ability to continue from a previous session if the goal
is not complete.

Comment on lines +163 to +164
TODO support generation of long-horizon goals with a specified
completion date, until which the goal's status is 'pending'.

coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 4

Outside diff range, codebase verification and nitpick comments (3)
prediction_market_agent/db/evaluated_goal_table_handler.py (2)

24-34: Clarify type hint for get_latest_evaluated_goals.

The return type hint list[EvaluatedGoalModel] might be misleading if the method can return an empty list. Consider specifying that an empty list is possible.

def get_latest_evaluated_goals(self, limit: int) -> list[EvaluatedGoalModel]:
    ...

36-43: Improve docstring clarity.

The docstring for delete_all_evaluated_goals could be more descriptive. Consider specifying that it deletes goals for the given agent_id.

"""
Delete all evaluated goals associated with the specified `agent_id`.
"""
scripts/delete_agent_db_entries.py (1)

Line range hint 17-44: Update docstring to reflect new functionality.

The docstring for the main function should be updated to include the new delete_goals parameter and its purpose.

"""
Delete all memories, prompts, and evaluated goals for a given agent, defined by the session_id.
"""
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between c8b311e and e0e2593.

Files ignored due to path filters (2)
  • poetry.lock is excluded by !**/*.lock, !**/*.lock
  • pyproject.toml is excluded by !**/*.toml
Files selected for processing (12)
  • prediction_market_agent/agents/goal_manager.py (1 hunks)
  • prediction_market_agent/agents/microchain_agent/deploy.py (5 hunks)
  • prediction_market_agent/agents/microchain_agent/memory.py (2 hunks)
  • prediction_market_agent/agents/microchain_agent/microchain_agent.py (2 hunks)
  • prediction_market_agent/agents/utils.py (1 hunks)
  • prediction_market_agent/db/evaluated_goal_table_handler.py (1 hunks)
  • prediction_market_agent/db/models.py (1 hunks)
  • prediction_market_agent/run_agent.py (3 hunks)
  • scripts/delete_agent_db_entries.py (3 hunks)
  • tests/agents/test_goal_manager.py (1 hunks)
  • tests/db/test_evaluated_goal_table_handler.py (1 hunks)
  • tests/test_chat_history.py (2 hunks)
Additional context used
Ruff
tests/agents/test_goal_manager.py

152-152: Local variable assistant_message is assigned to but never used

Remove assignment to unused variable assistant_message

(F841)


205-205: Comparison to None should be cond is None

Replace with cond is None

(E711)


278-278: Comparison to None should be cond is None

Replace with cond is None

(E711)

prediction_market_agent/agents/goal_manager.py

203-211: Return the condition directly

Inline condition

(SIM103)

Additional comments not posted (28)
prediction_market_agent/db/evaluated_goal_table_handler.py (1)

9-19: Ensure SQLHandler initialization is correct.

The EvaluatedGoalTableHandler class initializes a SQLHandler with a model and optional database URL. Ensure that the SQLHandler is correctly handling the model and URL, especially if the URL can be None.

Verify that the SQLHandler can handle a None value for sqlalchemy_db_url without issues.

prediction_market_agent/db/models.py (1)

34-50: Ensure field types align with database schema.

The EvaluatedGoalModel class defines fields with specific types. Ensure that these types align with the database schema and that fields like datetime_ are correctly indexed if needed for performance.

Verify that the database schema supports these field types and consider indexing datetime_ for efficient queries.
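
To illustrate the indexing suggestion: with SQLModel, a column can be marked as indexed directly in the model definition. The snippet below is a hypothetical, trimmed-down model, not the actual EvaluatedGoalModel from prediction_market_agent/db/models.py.

    # Hypothetical sketch showing how `datetime_` could be indexed with SQLModel;
    # the real EvaluatedGoalModel has more fields and may differ.
    from datetime import datetime
    from typing import Optional

    from sqlmodel import Field, SQLModel


    class EvaluatedGoalSketch(SQLModel, table=True):
        id: Optional[int] = Field(default=None, primary_key=True)
        agent_id: str = Field(index=True)         # goals are queried per agent
        description: str
        datetime_: datetime = Field(index=True)   # speeds up "latest N goals" queries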

scripts/delete_agent_db_entries.py (1)

38-44: Ensure exception handling is robust.

The logic for deleting evaluated goals raises an exception if entries are not deleted. Ensure that this exception handling is robust and provides meaningful error messages.

Verify that the exception handling provides clear and actionable error messages to aid in debugging.

tests/test_chat_history.py (1)

80-89: Good addition: Test for stringified chat history.

The test function test_stringified_chat_history correctly validates the string representation of a ChatHistory object. This enhances test coverage for the chat history functionality.

tests/db/test_evaluated_goal_table_handler.py (4)

14-21: Well-structured fixture for table handler setup.

The table_handler fixture correctly sets up an in-memory SQLite DB for testing. This is a good practice for isolated and repeatable tests.


24-40: Comprehensive test for saving and loading evaluated goals.

The test_save_load_evaluated_goal_0 function effectively tests the saving and loading of an evaluated goal, ensuring data integrity.


43-81: Effective test for LIFO order and multiple goal saving/loading.

The test_save_load_evaluated_goal_1 function correctly verifies the LIFO order and the ability to handle multiple goals. The use of different limits enhances the test robustness.


84-114: Appropriate test for handling multiple agents.

The test_save_load_evaluated_goal_multiple_agents function ensures that goals are correctly associated with different agents, which is crucial for multi-agent scenarios.

prediction_market_agent/run_agent.py (1)

28-28: Seamless integration of new agent type.

The addition of DeployableMicrochainWithGoalManagerAgent0 to the RunnableAgent enum and RUNNABLE_AGENTS dictionary is well-implemented, enhancing the agent system's capabilities.

Also applies to: 59-59, 80-80
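
For context, the registration pattern being approved here roughly looks like the following sketch. The enum member names and values are assumptions; only RunnableAgent, RUNNABLE_AGENTS and the agent class names come from this review.

    # Sketch of the run_agent.py registration pattern; enum member names/values
    # are assumptions, only the class and container names come from the review.
    from enum import Enum

    from prediction_market_agent.agents.microchain_agent.deploy import (
        DeployableMicrochainAgent,
        DeployableMicrochainWithGoalManagerAgent0,
    )


    class RunnableAgent(str, Enum):
        microchain = "microchain"
        microchain_with_goal_manager_agent_0 = "microchain_with_goal_manager_agent_0"


    RUNNABLE_AGENTS = {
        RunnableAgent.microchain: DeployableMicrochainAgent,
        RunnableAgent.microchain_with_goal_manager_agent_0: DeployableMicrochainWithGoalManagerAgent0,
    }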

prediction_market_agent/agents/utils.py (1)

32-32: Addition of MICROCHAIN_AGENT_OMEN_WITH_GOAL_MANAGER is approved.

The new enumeration value extends the capabilities of the AgentIdentifier class without affecting existing logic.

prediction_market_agent/agents/microchain_agent/memory.py (2)

26-28: Addition of __str__ method in ChatMessage is approved.

The method enhances readability by providing a formatted string representation of the message.


104-106: Addition of __str__ method in ChatHistory is approved.

The method provides a clear overview of chat messages, improving debugging and logging.
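
To make concrete what these __str__ additions buy, here is a hypothetical sketch; the exact formatting in memory.py is not shown in this thread, so the output format below is an assumption.

    # Hypothetical sketch of the __str__ additions discussed above; the actual
    # classes in memory.py are Pydantic-based and may format differently.
    from dataclasses import dataclass, field


    @dataclass
    class ChatMessage:
        role: str
        content: str

        def __str__(self) -> str:
            return f"{self.role}: {self.content}"


    @dataclass
    class ChatHistory:
        chat_messages: list[ChatMessage] = field(default_factory=list)

        def __str__(self) -> str:
            # One message per line, so the whole history can be logged or fed
            # into a goal-evaluation prompt as plain text.
            return "\n".join(str(m) for m in self.chat_messages)


    print(ChatHistory([ChatMessage("user", "Buy YES"), ChatMessage("assistant", "Done")]))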

prediction_market_agent/agents/microchain_agent/deploy.py (5)

5-5: Import of GoalManager is approved.

The import is necessary for integrating goal management features.


36-36: Introduction of goal_manager attribute is approved.

The optional goal_manager attribute allows the agent to manage and evaluate goals, enhancing its functionality.


57-61: Conditional logic for goal_manager is approved.

The logic checks for the presence of a goal_manager and retrieves the current goal if available, enhancing the agent's capabilities.


81-92: Goal evaluation and history update logic is approved.

The logic evaluates the goal's progress and appends the evaluation to the agent's history, improving its operational capabilities.


136-152: Addition of DeployableMicrochainWithGoalManagerAgent0 is approved.

The new class integrates goal management features, enhancing the agent's decision-making and performance in trading tasks.

prediction_market_agent/agents/microchain_agent/microchain_agent.py (2)

207-208: LGTM!

The addition of the prompt parameter in step_end_callback is consistent with its usage in build_agent.


150-150: LGTM! Verify the usage of the prompt parameter.

The addition of the prompt parameter enhances the agent's flexibility.

Ensure that the prompt parameter is correctly utilized throughout the codebase.

Run the following script to verify the usage of the prompt parameter:

Also applies to: 207-208

Verification successful

The prompt parameter is correctly utilized throughout the codebase.

The prompt parameter is consistently used in various parts of the codebase, including agent initialization and prompt handling functionalities. No issues were found with its integration.

  • The prompt parameter is used in the build_agent function to set the agent.prompt.
  • It is referenced in test files and other scripts related to prompt handling.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of the `prompt` parameter in the codebase.

# Test: Search for occurrences of the `prompt` parameter. Expect: Consistent usage.
rg --type python -A 5 $'prompt'

Length of output: 61


Script:

#!/bin/bash
# Description: Verify the usage of the `prompt` parameter in the codebase.

# Test: Search for occurrences of the `prompt` parameter. Expect: Consistent usage.
rg --type py -A 5 $'prompt'

Length of output: 96659

tests/agents/test_goal_manager.py (6)

12-61: Comprehensive test for retry limit functionality.

The test effectively covers various scenarios for the retry limit in GoalManager.


64-100: Effective test for evaluated goals string conversion.

The test correctly verifies the string representation of evaluated goals.


103-132: Comprehensive test for goal generation.

The test effectively covers different scenarios for goal generation in GoalManager.


135-147: Effective test for chat history retrieval after goal prompt.

The test correctly verifies the retrieval of chat history after a goal prompt.


150-163: Effective test for error handling in chat history retrieval.

The test correctly verifies the error handling when the goal prompt is not found in the chat history.

Tools
Ruff

152-152: Local variable assistant_message is assigned to but never used

Remove assignment to unused variable assistant_message

(F841)


208-244: Effective test for evaluating completed goals with output.

The test correctly verifies the evaluation of a completed goal with output.

prediction_market_agent/agents/goal_manager.py (3)

51-65: Well-defined Goal class.

The class effectively encapsulates a goal's attributes and provides a method to convert it to a prompt.


68-83: Well-defined GoalEvaluation class.

The class effectively encapsulates a goal evaluation's attributes and provides a method to convert it to a string.


86-129: Well-defined EvaluatedGoal class.

The class effectively extends Goal with additional attributes and provides methods for conversion to/from a model and to a goal.

Comment on lines 141 to 150
high_level_description="You are a trader agent in prediction markets to maximise your profit.",
agent_capabilities=(
"You are able to:"
"\n- List all binary markets that can be traded."
"\n- List the current outcome probabilities for each open market."
"\n- Predict the outcome probability for a market."
"\n- Buy, sell and hold outcome tokens in a market."
"\n- Query your wallet balance, and the positions you hold in open markets."
"\n- Query the past bets you've made, and their outcomes."
),
evangriffiths (Contributor Author) commented:

I'm not super-keen on how the high_level_description and agent_capabilities are defined here. It feels like some duplication of the system prompt and engine.help, respectively. But I couldn't think of a better way of doing this:

  • high_level_description is different enough from the system prompt, and I didn't think it was a good solution to try and derive it from the system prompt via an llm call
  • agent_capabilities could be derived from agent.engine.functions, but there would be some messiness if I tried to define this after build_agent_functions is called.

A reviewer (Contributor) replied:

This is good enough, but maybe you can use (as you mentioned) agent.engine.functions (or agent.help) to get a long list of functions, and have an LLM generate a summary of those and output a bullet-point list?
At least you stay up-to-date with the most recent functions.

evangriffiths (Contributor Author) replied:

Done 😃

coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 3

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between e0e2593 and 5fe0235.

Files selected for processing (1)
  • tests/agents/test_goal_manager.py (1 hunks)
Additional context used
Ruff
tests/agents/test_goal_manager.py

155-155: Local variable assistant_message is assigned to but never used

Remove assignment to unused variable assistant_message

(F841)


209-209: Comparison to None should be cond is None

Replace with cond is None

(E711)


284-284: Comparison to None should be cond is None

Replace with cond is None

(E711)

Additional comments not posted (5)
tests/agents/test_goal_manager.py (5)

14-63: LGTM! Effective test for retry limit functionality.

The test cases cover various scenarios for retry limits effectively.


66-102: LGTM! Effective test for string conversion of evaluated goals.

The test verifies the output string format correctly.


105-135: LGTM! Effective test for goal generation.

The test cases effectively verify the goal generation logic, considering previous evaluated goals.


153-166: LGTM! Effective test for error handling in chat history retrieval.

The test correctly verifies the error handling logic.

Tools
Ruff

155-155: Local variable assistant_message is assigned to but never used

Remove assignment to unused variable assistant_message

(F841)


212-249: LGTM! Effective test for evaluating completed goals with output.

The test correctly verifies the evaluation logic for a completed goal with output.

coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 5fe0235 and 8df7f40.

Files selected for processing (2)
  • prediction_market_agent/agents/goal_manager.py (1 hunks)
  • tests/agents/test_goal_manager.py (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • prediction_market_agent/agents/goal_manager.py
Additional comments not posted (8)
tests/agents/test_goal_manager.py (8)

14-64: Comprehensive test for retry limit logic.

The test cases effectively cover various scenarios for the retry limit in GoalManager.


67-103: Effective test for evaluated goals string conversion.

The test correctly verifies the string representation of evaluated goals.


106-136: Thorough test for goal generation logic.

The test effectively verifies the goal generation based on previous evaluations.


139-151: Effective test for chat history retrieval.

The test correctly verifies the retrieval of chat history after a goal prompt.


154-166: Correct test for error handling in chat history retrieval.

The test appropriately raises and checks a ValueError when the goal prompt is not found.


169-209: Effective test for evaluating completed goals with None output.

The test correctly verifies the evaluation of a completed goal with a None output.


212-249: Effective test for evaluating completed goals with non-None output.

The test correctly verifies the evaluation of a completed goal with specific output content.


252-284: Effective test for evaluating incomplete goals.

The test correctly verifies the evaluation of an incomplete goal.

coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 8df7f40 and 1f48aa7.

Files selected for processing (1)
  • prediction_market_agent/agents/goal_manager.py (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • prediction_market_agent/agents/goal_manager.py
Additional context used
Learnings (1)
Common learnings
Learnt from: evangriffiths
PR: gnosis/prediction-market-agent#413
File: prediction_market_agent/db/evaluated_goal_table_handler.py:21-22
Timestamp: 2024-08-22T12:57:57.324Z
Learning: The current use case for the `GoalManager` involves generating only one goal per session, so batch saving is not needed at the moment, but it may be considered for future use.

coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 3b0359a and 3d39d17.

Files selected for processing (1)
  • prediction_market_agent/agents/microchain_agent/deploy.py (5 hunks)
Files skipped from review as they are similar to previous changes (1)
  • prediction_market_agent/agents/microchain_agent/deploy.py

coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 3d39d17 and e8a9464.

Files ignored due to path filters (2)
  • poetry.lock is excluded by !**/*.lock, !**/*.lock
  • pyproject.toml is excluded by !**/*.toml
Files selected for processing (1)
  • prediction_market_agent/agents/microchain_agent/deploy.py (6 hunks)
Files skipped from review as they are similar to previous changes (1)
  • prediction_market_agent/agents/microchain_agent/deploy.py

coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between e8a9464 and 3168372.

Files selected for processing (1)
  • prediction_market_agent/run_agent.py (3 hunks)
Files skipped from review as they are similar to previous changes (1)
  • prediction_market_agent/run_agent.py

coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 0

Outside diff range, codebase verification and nitpick comments (1)
prediction_market_agent/agents/microchain_agent/deploy.py (1)

40-44: Clarify the purpose of the build_goal_manager method.

This method currently returns None, which suggests it's intended to be overridden in subclasses. Consider adding a docstring to clarify its purpose and usage.

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 3168372 and 399abb5.

Files ignored due to path filters (2)
  • poetry.lock is excluded by !**/*.lock, !**/*.lock
  • pyproject.toml is excluded by !**/*.toml
Files selected for processing (2)
  • prediction_market_agent/agents/microchain_agent/deploy.py (6 hunks)
  • prediction_market_agent/agents/microchain_agent/microchain_agent.py (1 hunks)
Additional comments not posted (3)
prediction_market_agent/agents/microchain_agent/deploy.py (2)

153-162: Ensure correct initialization of GoalManager.

The build_goal_manager method properly initializes the GoalManager with specific parameters. Verify that the agent_capabilities are dynamically generated and accurate. Consider adding error handling for potential issues during initialization.

Consider adding error handling to manage potential initialization failures, enhancing robustness.

Verification successful

Dynamic generation of agent_capabilities is correctly implemented.

The get_functions_summary_list function dynamically generates the agent_capabilities by summarizing the functions available to the agent. Consider adding error handling in the build_goal_manager method to manage potential initialization failures, enhancing robustness.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the dynamic generation of `agent_capabilities`.

# Test: Search for the implementation of `get_functions_summary_list` to ensure it dynamically generates capabilities.
rg --type python -A 5 $'get_functions_summary_list'

Length of output: 81


Script:

#!/bin/bash
# Description: Verify the dynamic generation of `agent_capabilities`.

# Corrected Test: Search for the implementation of `get_functions_summary_list` to ensure it dynamically generates capabilities.
rg --type py -A 5 $'get_functions_summary_list'

Length of output: 1210


Line range hint 77-105: Review the integration of GoalManager in the run method.

The logic correctly checks for the presence of a goal_manager and handles goal retrieval and evaluation. However, ensure that the goal_manager is robustly integrated and tested, especially in scenarios where it might be None.

Would you like me to help with creating unit tests for this integration?

prediction_market_agent/agents/microchain_agent/microchain_agent.py (1)

248-249: Review the implementation of get_functions_summary_list.

The function correctly generates a summary list of functions. Ensure that the function list is always correctly populated and consider adding error handling for cases where the function list might be empty or malformed.

Add error handling to manage cases where the input list is empty or contains invalid data, improving robustness.
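
The implementation isn't quoted in this thread, but a helper in the spirit of get_functions_summary_list might look like the sketch below; the signature and the `description` attribute access are assumptions.

    # Assumed sketch, not the repo's actual get_functions_summary_list: produce
    # one human-readable summary line per engine function, usable as the
    # GoalManager's agent_capabilities.
    def summarize_engine_functions(functions: dict[str, object]) -> list[str]:
        # `functions` is assumed to map function names to objects exposing a
        # `description` attribute, as microchain engine functions do.
        return [
            f"- {name}: {getattr(fn, 'description', '')}"
            for name, fn in functions.items()
        ]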

coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 399abb5 and 14c13bf.

Files selected for processing (2)
  • prediction_market_agent/agents/microchain_agent/deploy.py (6 hunks)
  • prediction_market_agent/agents/microchain_agent/microchain_agent.py (1 hunks)
Files skipped from review as they are similar to previous changes (2)
  • prediction_market_agent/agents/microchain_agent/deploy.py
  • prediction_market_agent/agents/microchain_agent/microchain_agent.py

coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 14c13bf and b220db1.

Files selected for processing (2)
  • prediction_market_agent/agents/microchain_agent/deploy.py (6 hunks)
  • prediction_market_agent/agents/microchain_agent/prompts.py (4 hunks)
Additional comments not posted (6)
prediction_market_agent/agents/microchain_agent/prompts.py (3)

41-50: New minimal system prompt added successfully.

The TRADING_AGENT_SYSTEM_PROMPT_MINIMAL is well-defined and aligns with the intended use of providing a minimalistic approach for the trading agent. This allows for greater flexibility in scenarios where user-provided instructions will guide the agent's actions.

The addition of this new prompt is approved.


93-93: Enum updated to include new minimal prompt choice.

The addition of TRADING_AGENT_MINIMAL to the SystemPromptChoice enum is a necessary update to support the new minimal trading agent prompt. This change ensures that the system can properly recognize and handle different types of prompts based on the agent's operational context.

The update to the enum is approved.


115-118: Integration of new prompt into system configurations.

The updates to the FunctionsConfig class and the SYSTEM_PROMPTS dictionary are well-executed. These changes ensure that the new minimal prompt is fully integrated into the system, allowing the agent to utilize this prompt effectively during its operations.

The updates to the system configurations are approved.

Also applies to: 136-136

prediction_market_agent/agents/microchain_agent/deploy.py (3)

40-44: Base implementation of build_goal_manager method.

The method build_goal_manager in DeployableMicrochainAgent is designed to return None by default, which is a sensible default that allows derived classes to override this method based on their specific needs. This design supports flexibility and extensibility in how goal management is integrated into different agents.

The base implementation of this method is approved.


154-163: Robust integration of GoalManager in new agent class.

The method build_goal_manager in DeployableMicrochainWithGoalManagerAgent0 is effectively implemented to return a GoalManager instance configured with specific capabilities and a retry limit. This setup aligns well with the PR objectives of enhancing the agent's functionality through structured goal management.

The integration of GoalManager in this new agent class is approved.
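
Pulling the reviewed pieces together, the override presumably looks roughly like the sketch below; the agent_id, the retry_limit value, and the use of get_functions_summary_list to derive agent_capabilities are assumptions based on the surrounding discussion, not the merged code.

    # Rough sketch of the subclass override; parameter names appear in this
    # review thread, but exact values and wiring are assumptions.
    from microchain import Agent

    from prediction_market_agent.agents.goal_manager import GoalManager
    from prediction_market_agent.agents.microchain_agent.deploy import (
        DeployableMicrochainAgent,
    )
    from prediction_market_agent.agents.microchain_agent.microchain_agent import (
        get_functions_summary_list,
    )
    from prediction_market_agent.agents.utils import AgentIdentifier


    class DeployableMicrochainWithGoalManagerAgent0(DeployableMicrochainAgent):
        def build_goal_manager(self, agent: Agent) -> GoalManager:
            return GoalManager(
                agent_id=AgentIdentifier.MICROCHAIN_AGENT_OMEN_WITH_GOAL_MANAGER,
                high_level_description="You are a trader agent in prediction markets to maximise your profit.",
                # Derive capabilities from the engine so they stay in sync with
                # the functions actually available to the agent.
                agent_capabilities="\n".join(get_functions_summary_list(agent.engine.functions)),
                retry_limit=1,  # assumed value: retry an unfinished goal at most once
            )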


Line range hint 77-106: Comprehensive goal management logic in run method.

The goal management logic in the run method of DeployableMicrochainAgent is well-implemented. It includes checks for the presence of a goal_manager, retrieves the current goal, and handles goal evaluation and saving. This comprehensive approach ensures that the agent can effectively manage and report on its goals, aligning with the PR objectives.

The implementation of goal management logic in the run method is approved.

)
latest_evaluated_goals_str = self.evaluated_goals_to_str(latest_evaluated_goals)
llm = ChatOpenAI(
temperature=0,
A reviewer (Contributor) commented:

This might be better with super low temperature constant I introduced somewhere, if you want consistency as much as possible

evangriffiths (Contributor Author) replied:

I still don't feel totally convinced by the "very small temperature is best for consistency" argument (I guess the same point expressed in the comment thread on the original PR https://github.com/gnosis/prediction-market-agent-tooling/pull/307/files#r1675590362).

I re-googled to see if there was any new info, and haven't found anything to convince one way or the other. In the anthropic API docs (https://docs.anthropic.com/en/api/complete) it says:

Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks.

Note that even with temperature of 0.0, the results will not be fully deterministic.

which suggests 0.0 is more consistent than 0.000001, but still too vague to be sure!

Anyway, all that to say, if it's alright with you I'll stick with 0.0 until we know more (and we just have inconsistent code 😅)

kongzii (Contributor) commented on Aug 29, 2024:

Sounds good! I'm now curious to do some manual experiment about this 🙈 Maybe Friday afternoon mini project 😄

@@ -60,6 +74,10 @@ def run(
),
)

    if goal_manager := self.build_goal_manager(agent=agent):
        goal = goal_manager.get_goal()
    agent.prompt = goal.to_prompt() if goal else None
A reviewer (Contributor) commented:

Isn't the goal undefined if the if statement above is evaluated as false?

Suggested change
agent.prompt = goal.to_prompt() if goal else None
agent.prompt = goal.to_prompt() if goal else None

(not sure if that's the right fix 😄 )

evangriffiths (Contributor Author) replied:

Ah, eagle-eyed spot, thanks!

Since GoalManager.get_goal always returns a goal, it can be a little simpler:

        if goal_manager := self.build_goal_manager(agent=agent):
            goal = goal_manager.get_goal()
            agent.prompt = goal.to_prompt()

    def build_goal_manager(
        self,
        agent: Agent,
    ) -> GoalManager | None:
A reviewer (Contributor) commented:

Suggested change
) -> GoalManager | None:
) -> GoalManager:

@@ -29,6 +29,7 @@ class AgentIdentifier(str, Enum):
MICROCHAIN_AGENT_OMEN_LEARNING_2 = "general-agent-2"
MICROCHAIN_AGENT_OMEN_LEARNING_3 = "general-agent-3"
MICROCHAIN_AGENT_STREAMLIT = "microchain-streamlit-app"
MICROCHAIN_AGENT_OMEN_WITH_GOAL_MANAGER = "general-agent-4-with-goal-manager"
A reviewer (Contributor) commented:

Suggested change
MICROCHAIN_AGENT_OMEN_WITH_GOAL_MANAGER = "general-agent-4-with-goal-manager"
MICROCHAIN_AGENT_OMEN_WITH_GOAL_MANAGER = "trader-agent-4-with-goal-manager"

The general agents above are the ones that are learning; this one (based on the trader prompt) should be trader-only. If that's right, can we distinguish it?

evangriffiths (Contributor Author) replied:

Ah true. How about trader-agent-0-with-goal-manager, as there are no other trader-agent-*s in the list?

A reviewer (Contributor) replied:

Sure!

Co-authored-by: Peter Jung <[email protected]>
coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between b220db1 and a539091.

Files ignored due to path filters (1)
  • pyproject.toml is excluded by !**/*.toml
Files selected for processing (2)
  • prediction_market_agent/agents/microchain_agent/prompts.py (4 hunks)
  • prediction_market_agent/run_agent.py (3 hunks)
Files skipped from review as they are similar to previous changes (2)
  • prediction_market_agent/agents/microchain_agent/prompts.py
  • prediction_market_agent/run_agent.py

coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between a539091 and 499b1f6.

Files ignored due to path filters (2)
  • poetry.lock is excluded by !**/*.lock, !**/*.lock
  • pyproject.toml is excluded by !**/*.toml
Files selected for processing (2)
  • prediction_market_agent/agents/microchain_agent/deploy.py (6 hunks)
  • prediction_market_agent/run_agent.py (3 hunks)
Files skipped from review as they are similar to previous changes (2)
  • prediction_market_agent/agents/microchain_agent/deploy.py
  • prediction_market_agent/run_agent.py

evangriffiths merged commit 1e8a5b1 into main on Aug 29, 2024 — 7 checks passed
evangriffiths deleted the evan/goal-manager branch on August 29, 2024 at 15:13
coderabbitai bot mentioned this pull request on Sep 18, 2024
Development

Successfully merging this pull request may close these issues.

Add ability for agent to create and monitor (sub-)goals
3 participants