Feature/maestro ai #1906

bartekpacia · 2024-08-12T09:48:03Z

Intro

This PR adds 2 new commands: assertWithAI and assertNoDefectsWithAI.

See the documentation PR for an introduction to these 2 commands.

Notes

At least for now, we don't consider AI output to be part of the test results. It's an additional stream of information, just like logs or screenshots. Reason: we're not sure how AI output relates to formats such as JUnit, i.e. how it should be serialized into them.

Data structure of AI output

flow 1 \
   assertVisualAI 1
     screenshot
       - defect 1
       - defect 2
       - defect 3
flow 2 \
   assertVisualAI 1
     screenshot
       - defect 1
       - defect 2
       - defect 3
   assertVisualAI 2
     screenshot
       - defect 1
       - defect 2
       - defect 3
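
To make the nesting above concrete, here is a minimal Kotlin sketch of the same shape: a flow holds AI assertions, each assertion holds one screenshot and a list of possible defects. All type, property, and file names here are hypothetical and chosen for illustration; they are not the classes this PR introduces.

```kotlin
// Hypothetical names throughout; these are NOT the types used in this PR.

/** One possible defect reported for a screenshot. */
data class Defect(val category: String, val reasoning: String)

/** Output of a single AI assertion: the screenshot it inspected and the defects found. */
data class AssertionOutput(
    val command: String,
    val screenshotPath: String,
    val defects: List<Defect>,
)

/** All AI output collected while running one flow. */
data class FlowAIOutput(val flowName: String, val assertions: List<AssertionOutput>)

/** Total number of possible defects across every assertion in a flow. */
fun FlowAIOutput.defectCount(): Int = assertions.sumOf { it.defects.size }

fun main() {
    // Mirrors "flow 2" from the tree above: 2 assertions with 3 defects each.
    val flow2 = FlowAIOutput(
        flowName = "flow 2",
        assertions = listOf(
            AssertionOutput("assertVisualAI 1", "screenshot-1.png", List(3) { Defect("layout", "defect ${it + 1}") }),
            AssertionOutput("assertVisualAI 2", "screenshot-2.png", List(3) { Defect("layout", "defect ${it + 1}") }),
        ),
    )
    println("${flow2.flowName}: ${flow2.defectCount()} possible defects")
}
```

Keeping the screenshot next to its defects means a report can render each assertion on its own, without looking anything up in the regular test results.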

In the longer run, I'd like to integrate normal test results with AI test results, e.g. by linking them to each other from the HTML reports or, preferably, by merging their HTML reports.

To test

works in Maestro Studio
~~works in Maestro Cloud~~ todo in the near future
works in continuous mode

Off-topic things noticed

Output directories

Maestro test output and AI output should be placed in the root directory of the project. This would allow relative paths to be used in e.g. test outputs. Benefit: files can be moved out of CI without breaking paths.
Maestro debug output can stay where it is right now.
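
As a rough illustration of the relative-path idea, the sketch below turns an absolute output path into one relative to the project root using only `java.nio.file`. The helper name `relativeToRoot` and the directory names are assumptions made for this example; Maestro's actual output layout may differ.

```kotlin
import java.nio.file.Path
import java.nio.file.Paths

/**
 * Returns [file] expressed relative to [projectRoot], joined with forward slashes
 * so the result can be embedded in a report regardless of the host OS.
 * Hypothetical helper, not part of Maestro.
 */
fun relativeToRoot(projectRoot: Path, file: Path): String =
    projectRoot.toAbsolutePath().normalize()
        .relativize(file.toAbsolutePath().normalize())
        .joinToString("/")

fun main() {
    // Hypothetical CI workspace and output directory names.
    val root = Paths.get("/ci/workspace/my-app")
    val screenshot = root.resolve("maestro-output/flow-1/screenshot.png")
    println(relativeToRoot(root, screenshot))
}
```

A report that stores only such relative paths keeps working after the whole project directory is copied off the CI machine.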

Coroutines

Maestro uses runBlocking { } a lot. Sometimes that is justified, but often it is bad practice.

I think most of the code in maestro-client and maestro-orchestra (Maestro, Driver) should use suspending functions.

maestro-cli/src/main/java/maestro/cli/report/HtmlAITestSuiteReporter.kt

maestro-ai/README.md

maestro-ai/src/main/java/maestro/ai/AI.kt

amanjeetsingh150 · 2024-08-28T18:28:21Z

maestro-ai/src/main/java/maestro/ai/DemoApp.kt

+ * maestro-ai-demo foo_bad_1.png
+ * ```
+ *
+ * ### Input format


What should the expected output be? It would be helpful to document that as well.

Added expected output.

Also, I think we should add a link where the fixtures can be downloaded from. Otherwise the barrier to entry for this DemoApp is very high, because the screenshots have to be taken manually.

Currently the fixtures dataset I use is from https://github.com/mobile-dev-inc/copilot/pull/188.

I will upload that dataset to GCP and paste the link here. Do you see any problem with it? I think the customer apps that are in https://github.com/mobile-dev-inc/copilot/pull/188 are okay with that?

Let's not publish the GCS link for now; you can make it work with any storage link. Let's write an internal runbook for this in case someone from our team wants to evaluate it. Ideally, this repository should have access to general data (from open source apps) rather than data from customer apps.

Good points, though I don't agree.

I'd prefer to have this dataset public. Internal runbooks tend to rot, and no one uses them unless really needed. But if we made this testing dataset public, anybody could play around with LLM outputs and submit a PR that improves them.

If this requires removing our customers' apps and using screenshots from more popular, well-known apps instead (like Uber, Bolt), then I still think it's worth the effort. We should strive to make working on Maestro possible and easy for people outside of @mobile-dev-inc.

PS 1: The perfect situation would be to generate the screenshots for each app with takeScreenshot. But I think it's too much work to automate this, and what we have now is enough.

That said: I will take no action for now.

maestro-cli/src/main/java/maestro/cli/App.kt

maestro-cli/src/main/java/maestro/cli/report/HtmlAITestSuiteReporter.kt

amanjeetsingh150 · 2024-08-28T18:35:41Z

maestro-cli/src/main/java/maestro/cli/runner/TestRunner.kt

+                                view = resultView,
+                                commands = commands,
+                                debugOutput = FlowDebugOutput(),
+                                // TODO(bartekpacia): make AI outputs work in continuous mode


Better to create an issue as well after the release!

Created #1972 to track this

maestro-orchestra-models/src/main/java/maestro/orchestra/Commands.kt

maestro-orchestra/src/main/java/maestro/orchestra/Orchestra.kt

bartekpacia force-pushed the feature/maestro_ai branch 4 times, most recently from 7b35e60 to 6ba3d0a August 13, 2024 12:41

bartekpacia added 6 commits August 14, 2024 16:29

implement a basic assertVisualAI command

a7b5969

MaestroCommandRunner: remove mini-cleanup

3916aff

Orchestra: fix nasty typo

e58ceb9

basic done

0d7f025

work on saving files

8237e63

saving LLM output: text done, screenshots WIP

2b29c53

bartekpacia force-pushed the feature/maestro_ai branch from 65fd303 to 2b29c53 August 14, 2024 15:29

bartekpacia added 3 commits August 15, 2024 03:08

more wip

d089f0c

Merge branch 'main' into feature/maestro_ai

fc57417

fix screenshots taken by assertVisualAI being empty

d23a25f

bartekpacia force-pushed the feature/maestro_ai branch from 9d929f8 to d23a25f August 16, 2024 11:27

bartekpacia added 2 commits August 16, 2024 13:00

make assertVisualAI save screenshots when running many flows at once

0f7b418

Orchestra: create sealed class CommandOutput

b76d5e1

bartekpacia force-pushed the feature/maestro_ai branch from 675db0a to b76d5e1 August 16, 2024 14:57

generate HTML output

c7e3c8f

bartekpacia force-pushed the feature/maestro_ai branch 2 times, most recently from fc919e5 to 52ffba1 August 19, 2024 16:06

display possible defect count

33f0c84

bartekpacia force-pushed the feature/maestro_ai branch from 52ffba1 to 33f0c84 August 19, 2024 16:08

bartekpacia added 7 commits August 20, 2024 12:54

take assertion into account

1c0072b

App: add debug output from HTML AI reporter

18345a3

generate nicer HTML

f2e9db6

styling improvements

b928f67

introduce TestDebugReporter.saveSuggestions()

19a60c4

more UI improvements

0dca65e

fixie

85f46c6

bartekpacia added 3 commits August 23, 2024 18:09

add Xjdk-release param

d9e060e

split assertVisualAI into assertNoDefectsWithAI and assertWithAI

6f041c9

implement the new commands

c0b69c5

This was referenced Aug 27, 2024

Add docs for 2 new AI commands mobile-dev-inc/maestro-docs#76

Merged

More AI providers and models should be supported #1957

Open

bartekpacia requested a review from amanjeetsingh150 August 27, 2024 13:21

bartekpacia marked this pull request as ready for review August 27, 2024 13:21

finish adding the 2 new commands

ba69273

bartekpacia force-pushed the feature/maestro_ai branch from 0aed452 to ba69273 August 27, 2024 13:31

lucaswiechmann reviewed Aug 28, 2024

maestro-cli/src/main/java/maestro/cli/report/HtmlAITestSuiteReporter.kt (outdated, resolved)

delete obsolete comments

ba04550

amanjeetsingh150 reviewed Aug 28, 2024

bartekpacia added 4 commits August 29, 2024 00:39

move askForDefectsSchema from AI to Prediction

2ef50d5

DemoApp: update comment to reflect updated photo naming schema

df6a274

DemoApp: specify output format

78c04a3

DemoApp: add more command invocation examples

674a717

bartekpacia force-pushed the feature/maestro_ai branch from 2584b86 to 674a717 August 28, 2024 23:05

bartekpacia added 2 commits August 29, 2024 12:14

update comment to make more sense

c2e8623

delete scaffold code for previousFalsePositives feature, we are not sure about it

f848f31

amanjeetsingh150 approved these changes Aug 29, 2024

bartekpacia added 2 commits August 29, 2024 12:52

update comment

b8e08d7

make HttpClient a dependency of AI clients

0bfac77

bartekpacia force-pushed the feature/maestro_ai branch from 1efa308 to 0bfac77 August 29, 2024 11:01

remove debug printlns

a927e05

bartekpacia mentioned this pull request Aug 29, 2024

Merge default HTML reports and AI reports #1973

Open

add link to issue

8668cad

bartekpacia force-pushed the feature/maestro_ai branch from 9455f47 to 8668cad August 29, 2024 11:23

improve CLI logs when AI assertion evaluates to false

507afea

bartekpacia merged commit 582bdcc into main Aug 29, 2024
3 checks passed

bartekpacia deleted the feature/maestro_ai branch August 29, 2024 11:37
