How to associate inputs and outputs of multiple ActionNodes #1445

Open
chenk-gd opened this issue Aug 9, 2024 · 6 comments

@chenk-gd

chenk-gd commented Aug 9, 2024

What is the recommended practice for using the output of one ActionNode as part of the prompt for the next ActionNode's fill method? In the current MetaGPT implementation (strgy='complex'), the individual ActionNodes are unrelated to one another. The parent just collects the output of each child ActionNode and stores it in its own instruct_content variable.

        elif strgy == "complex":
            # This implicitly assumes the node has children
            tmp = {}
            for _, i in self.children.items():
                if exclude and i.key in exclude:
                    continue
                child = await i.simple_fill(schema=schema, mode=mode, images=images, timeout=timeout, exclude=exclude)
                tmp.update(child.instruct_content.model_dump())
            cls = self._create_children_class()
            self.instruct_content = cls(**tmp)
            return self

Or is there a recommended ActionNode implementation for this kind of requirement?

@iorisa
Collaborator

iorisa commented Aug 10, 2024

If you want to use the output of the previous ActionNode as part of the fill method prompt for the next ActionNode, that means the previous action has already finished. So these should be two different actions.
ActionNode is only responsible for modularizing the different blocks of a prompt within one action and does not support cross-action associations.

@chenk-gd
Author

If ActionNode is responsible for modularizing the prompt, then strgy='complex' seems inconsistent with that role: with this parameter, each ActionNode builds its own prompt and calls the LLM separately to get a result. What scenarios is this strategy intended for?

Consider the following task. The first step analyzes a document to extract the relevant part A. The next step generates B, C, and D from A (an example is given for each). The final step generates the end result from B, C, and D, based on another example. What is the recommended way to implement this task in the Action/ActionNode framework?

Another question: ActionNode supports output examples, but how can it represent examples that contain both inputs and outputs?

@chenk-gd
Author

Also, the example that prints the Fibonacci series using ActionNode takes the output of the previous ActionNode (SIMPLE_THINK_NODE) as the input to the next ActionNode (SIMPLE_CHECK_NODE):

        elif strgy == "complex":
            # This implicitly assumes the node has children
            child_context = context  # The input context is the context of the first child node
            for _, i in self.children.items():
                i.set_context(child_context)  # Set the context of the child node
                child = await i.simple_fill(schema=schema, mode=mode)
                child_context = child.content  # Use the returned content (child.content) as the context of the next child node
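
The chaining loop above can be tried in isolation with a minimal sketch. SketchNode, fake_llm, and chained_fill are hypothetical names invented for this illustration; they are not MetaGPT classes, and the fake model only echoes its prompt. The sketch shows one thing: each node's context is the previous node's content.

```python
import asyncio
from dataclasses import dataclass


async def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call: it just echoes the prompt."""
    return f"<answer to: {prompt}>"


@dataclass
class SketchNode:
    """Hypothetical, simplified stand-in for ActionNode (not MetaGPT's class)."""

    key: str
    instruction: str
    context: str = ""
    content: str = ""

    def set_context(self, context: str) -> None:
        self.context = context

    async def simple_fill(self) -> "SketchNode":
        # Build this node's own prompt from its context and instruction.
        prompt = f"{self.context}\n## {self.key}\n{self.instruction}"
        self.content = await fake_llm(prompt)
        return self


async def chained_fill(nodes: list[SketchNode], context: str) -> list[SketchNode]:
    child_context = context  # the first node sees the caller's context
    for node in nodes:
        node.set_context(child_context)
        child = await node.simple_fill()
        child_context = child.content  # the next node sees this node's output
    return nodes


if __name__ == "__main__":
    think = SketchNode("Think", "Outline the approach.")
    check = SketchNode("Check", "Review the outline above.")
    asyncio.run(chained_fill([think, check], "Print the Fibonacci series."))
    print(check.context == think.content)
```

Because every node needs the previous node's result, the calls in this sketch are necessarily sequential.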

@iorisa
Collaborator

iorisa commented Aug 12, 2024

1. ActionNode

write_prd_an.py provides some usage examples.

NODES = [
    LANGUAGE,
    PROGRAMMING_LANGUAGE,
    ORIGINAL_REQUIREMENTS,
    PROJECT_NAME,
    PRODUCT_GOALS,
    USER_STORIES,
    COMPETITIVE_ANALYSIS,
    COMPETITIVE_QUADRANT_CHART,
    REQUIREMENT_ANALYSIS,
    REQUIREMENT_POOL,
    UI_DESIGN_DRAFT,
    ANYTHING_UNCLEAR,
]

REFINED_NODES = [
    LANGUAGE,
    PROGRAMMING_LANGUAGE,
    REFINED_REQUIREMENTS,
    PROJECT_NAME,
    REFINED_PRODUCT_GOALS,
    REFINED_USER_STORIES,
    COMPETITIVE_ANALYSIS,
    COMPETITIVE_QUADRANT_CHART,
    REFINED_REQUIREMENT_ANALYSIS,
    REFINED_REQUIREMENT_POOL,
    UI_DESIGN_DRAFT,
    ANYTHING_UNCLEAR,
]

WRITE_PRD_NODE = ActionNode.from_children("WritePRD", NODES)
REFINED_PRD_NODE = ActionNode.from_children("RefinedPRD", REFINED_NODES)

NODES and REFINED_NODES reuse some prompt modules, such as LANGUAGE and PROGRAMMING_LANGUAGE:

LANGUAGE = ActionNode(
    key="Language",
    expected_type=str,
    instruction="Provide the language used in the project, typically matching the user's requirement language.",
    example="en_us",
)

PROGRAMMING_LANGUAGE = ActionNode(
    key="Programming Language",
    expected_type=str,
    instruction="Python/JavaScript or other mainstream programming language.",
    example="Python",
)

  1. The simple strgy merges the inputs of all ActionNode objects into a single prompt:

    async def simple_fill(
        self, schema, mode, images: Optional[Union[str, list[str]]] = None, timeout=USE_CONFIG_TIMEOUT, exclude=None
    ):
        prompt = self.compile(context=self.context, schema=schema, mode=mode, exclude=exclude)
        ......
        content, scontent = await self._aask_v1(
            prompt, class_name, mapping, images=images, schema=schema, timeout=timeout
        )
        ......

  2. The complex strgy runs each ActionNode object in isolation and merges all outputs into a single dict:

    tmp = {}
    for _, i in self.children.items():
        if exclude and i.key in exclude:
            continue
        child = await i.simple_fill(schema=schema, mode=mode, images=images, timeout=timeout, exclude=exclude)
        tmp.update(child.instruct_content.model_dump())

For example:

WRITE_PRD_NODE.fill(strgy="simple", ...)  # merge all children ActionNode object inputs into a single prompt.
WRITE_PRD_NODE.fill(strgy="complex", ...)  # merge all children ActionNode object outputs into a single dict.
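
The difference between the two strategies can be illustrated with a small self-contained sketch. Everything in it is hypothetical (SketchNode, CountingLLM, compile_prompt, fill_simple, fill_complex) and only mimics the behavior described above; it does not call MetaGPT. The fake model records each prompt it receives, so the number of model calls per strategy is visible.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class SketchNode:
    """Hypothetical, simplified stand-in for ActionNode (not MetaGPT's class)."""

    key: str
    instruction: str


class CountingLLM:
    """Hypothetical fake model that records every prompt it receives."""

    def __init__(self) -> None:
        self.prompts: list[str] = []

    async def ask(self, prompt: str) -> dict[str, str]:
        self.prompts.append(prompt)
        # Pretend the model answers every "## key" section found in the prompt.
        keys = [line[3:] for line in prompt.splitlines() if line.startswith("## ")]
        return {key: f"value for {key}" for key in keys}


def compile_prompt(context: str, nodes: list[SketchNode]) -> str:
    sections = [f"## {n.key}\n{n.instruction}" for n in nodes]
    return "\n".join([context, *sections])


async def fill_simple(llm: CountingLLM, context: str, nodes: list[SketchNode]) -> dict[str, str]:
    # One prompt containing every node, hence one model call.
    return await llm.ask(compile_prompt(context, nodes))


async def fill_complex(llm: CountingLLM, context: str, nodes: list[SketchNode]) -> dict[str, str]:
    # One prompt and one model call per node; the outputs are merged afterwards.
    merged: dict[str, str] = {}
    for node in nodes:
        merged.update(await llm.ask(compile_prompt(context, [node])))
    return merged


if __name__ == "__main__":
    nodes = [SketchNode("Language", "Give the language."), SketchNode("Project Name", "Name it.")]
    simple_llm, complex_llm = CountingLLM(), CountingLLM()
    asyncio.run(fill_simple(simple_llm, "Write a PRD.", nodes))
    asyncio.run(fill_complex(complex_llm, "Write a PRD.", nodes))
    print(len(simple_llm.prompts), len(complex_llm.prompts))
```

In this sketch both strategies return a dict with the same keys. What differs is how many prompts are sent and whether a node's prompt contains the other nodes' instructions.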

2. DAG Flow

You can refer to the qa_engineer.py to build the flow you need.

    async def _act(self) -> Message:
        ......
        code_filters = any_to_str_set({PrepareDocuments, SummarizeCode})
        test_filters = any_to_str_set({WriteTest, DebugError})
        run_filters = any_to_str_set({RunCode})
        for msg in self.rc.news:
            # Decide what to do based on observed msg type, currently defined by human,
            # might potentially be moved to _think, that is, let the agent decides for itself
            if msg.cause_by in code_filters:
                # engineer wrote a code, time to write a test for it
                await self._write_test(msg) # publish_message(AIMessage(cause_by=WriteTest, send_to=self))
            elif msg.cause_by in test_filters:
                # I wrote or debugged my test code, time to run it
                await self._run_code(msg) # publish_message(AIMessage(cause_by=RunCode, send_to=self))
            elif msg.cause_by in run_filters:
                # I ran my test code, time to fix bugs, if any
                await self._debug_error(msg) # publish_message(AIMessage(cause_by=DebugError, send_to=self))
            elif msg.cause_by == any_to_str(UserRequirement):
                return await self._parse_user_requirement(msg)  # publish_message(AIMessage(cause_by=PrepareDocuments, send_to=self))
        ......

Where:

  1. for msg in self.rc.news processes each message sent to itself one by one;
  2. Executing self._write_test(msg) will send a new WriteTest message to itself, and this message will be added to self.rc.news;
  3. Executing self._run_code(msg) will send a new RunCode message to itself, and this message will be added to self.rc.news;
  4. Executing self._debug_error(msg) will send a new DebugError message to itself, and this message will be added to self.rc.news.

You can refer to QaEngineer's message passing approach to implement your DAG flow.
You can use memory or external storage to wait until the results of B, C, and D are all collected, and then publish a new message to trigger the subsequent workflow.
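
As a rough, framework-free illustration of this message-passing style, the sketch below routes messages by their cause_by value and feeds each step's output back into the queue. Msg, NEXT_STEP, and run_flow are hypothetical names for this example, the routing table is a simplified linear chain (it omits the DebugError loop), and no MetaGPT API is used.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Msg:
    """Hypothetical minimal message: only the fields this sketch needs."""

    cause_by: str
    content: str


# Hypothetical routing table: the step to run when a message
# produced by the given step (the key) is observed.
NEXT_STEP = {
    "UserRequirement": "PrepareDocuments",
    "PrepareDocuments": "WriteTest",
    "WriteTest": "RunCode",
}


def run_flow(first: Msg) -> list[str]:
    news = deque([first])  # plays the role of the role's incoming message queue
    trace: list[str] = []
    while news:
        msg = news.popleft()
        step = NEXT_STEP.get(msg.cause_by)
        if step is None:
            continue  # nothing reacts to this message, so the flow ends here
        trace.append(step)
        # Running a step publishes a new message tagged with that step.
        news.append(Msg(cause_by=step, content=f"{step} output for: {msg.content}"))
    return trace


if __name__ == "__main__":
    print(run_flow(Msg("UserRequirement", "test the login module")))
```

The flow is defined entirely by which message types each handler reacts to, so adding an edge to the graph means adding one entry to the routing table.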

More Details: Agent Communication

https://docs.deepwisdom.ai/main/en/guide/in_depth_guides/agent_communication.html

@chenk-gd
Author

Thank you for answering.
When constructing a DAG flow, suppose there are 3 actions A, B and C. If A and B are independent but C depends both on A and B, how is this case implemented?

@iorisa
Collaborator

iorisa commented Aug 16, 2024

  1. Whoever consumes the data is responsible for determining whether the conditions are met.
  2. At the end of its execution, a role just emits its results, regardless of who consumes the data downstream.
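
One way to read these two points for the A, B, C case is sketched below: A and B simply emit their results, and the consumer that runs C buffers what it observes until both inputs have arrived. Msg and JoinConsumer are hypothetical names for this illustration, and the in-memory dict stands in for whatever memory or external storage is actually used.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Msg:
    """Hypothetical minimal message: only the fields this sketch needs."""

    cause_by: str
    content: str


@dataclass
class JoinConsumer:
    """Hypothetical consumer for action C: waits for results from both A and B."""

    required: frozenset = frozenset({"A", "B"})
    received: dict = field(default_factory=dict)

    def observe(self, msg: Msg) -> Optional[Msg]:
        if msg.cause_by not in self.required:
            return None  # not an input of C, ignore it
        self.received[msg.cause_by] = msg.content
        if set(self.received) != set(self.required):
            return None  # condition not met yet, keep waiting
        # All inputs collected: run C and emit its result.
        inputs = [self.received[name] for name in sorted(self.required)]
        self.received.clear()
        return Msg(cause_by="C", content=" + ".join(inputs))


if __name__ == "__main__":
    consumer = JoinConsumer()
    print(consumer.observe(Msg("A", "result of A")))  # still waiting for B
    print(consumer.observe(Msg("B", "result of B")))  # both arrived, C runs
```

A and B never need to know that C exists; the readiness check lives entirely in the consumer, which matches the division of responsibility described above.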
