Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove some deepcopy to speed up workflow conductor #256

Open
wants to merge 34 commits into
base: master
Choose a base branch
from

Conversation

guzzijones
Copy link
Contributor

@guzzijones guzzijones commented Jul 7, 2023

  1. removing deepcopy where we can
  2. change deepcopy where still needed to custom copy command to increase speed using instead a shallow copy
  3. add some data classes * tsc meeting comment

@guzzijones
Copy link
Contributor Author

The use of 'latest' is causing my tox unit testing to fail it appears.

  orquesta.exceptions.PluginFactoryError: Unable to load plugin orquesta.composers.native. No 'orquesta.composers' driver found, looking for 'native'

I also had to disable python 3.6 for now.

set ubuntu-20.04
stevedore is not working in the python 3,8 unit tests
@guzzijones
Copy link
Contributor Author

@m4dcoder can you please take a look at this python 3.8 test failing. It looks like stevedore is not able to find the native, mock plugins to run the tests in python 3.8.

@guzzijones guzzijones self-assigned this Jul 8, 2023
@guzzijones guzzijones requested review from cognifloyd and removed request for nzlosh and m4dcoder July 10, 2023 17:17
@guzzijones
Copy link
Contributor Author

guzzijones commented Jul 11, 2023

I did some testing with st2 and this definitely breaks with-items no longer report failure to the parent workflow.

(virtualenv) [vagrant@stackstorm]~/st2/tools$ st2 execution get 64adc15887cff69baae67294
id: 64adc15887cff69baae67294
action.ref: test_pack.test_fail_workflow
parameters:
  number: 5
status: running (635s elapsed)
start_timestamp: Tue, 11 Jul 2023 20:53:44 UTC
end_timestamp:
log:
  - status: requested
    timestamp: '2023-07-11T20:53:44.168000Z'
  - status: scheduled
    timestamp: '2023-07-11T20:53:44.245000Z'
  - status: running
    timestamp: '2023-07-11T20:53:44.317000Z'
result:
  output: null
+--------------------------+------------------------+-----------+---------------------+-------------------------------+
| id                       | status                 | task      | action              | start_timestamp               |
+--------------------------+------------------------+-----------+---------------------+-------------------------------+
| 64adc1581982b0a045f91e70 | succeeded (0s elapsed) | do_action | test_pack.test_fail | Tue, 11 Jul 2023 20:53:44 UTC |
| 64adc1591982b0a045f91e85 | succeeded (0s elapsed) | with_test | test_pack.test_fail | Tue, 11 Jul 2023 20:53:45 UTC |
| 64adc1591982b0a045f91e88 | succeeded (1s elapsed) | with_test | test_pack.test_fail | Tue, 11 Jul 2023 20:53:45 UTC |
| 64adc1591982b0a045f91e8b | failed (1s elapsed)    | with_test | test_pack.test_fail | Tue, 11 Jul 2023 20:53:45 UTC |
| 64adc1591982b0a045f91e8f | succeeded (1s elapsed) | with_test | test_pack.test_fail | Tue, 11 Jul 2023 20:53:45 UTC |
| 64adc1591982b0a045f91e92 | succeeded (1s elapsed) | with_test | test_pack.test_fail | Tue, 11 Jul 2023 20:53:45 UTC |
+--------------------------+------------------------+-----------+---------------------+-------------------------------+

@Kami
Copy link
Member

Kami commented Aug 9, 2023

On a related note to my comment above^ - I think it would be good to add a feature flag / config option for this functionality.

This will serve multiple purposes:

  1. We will be able to use feature flag to benchmark both scenarios - regular deepcopy vs fast_deepcopy
  2. In case user encounters issues with fast_deepcopy() which we didn't catch here, they can easily opt-out of this functionality and continue using StackStorm without needing to downgrade or similar.

Kami
Kami previously requested changes Aug 9, 2023
Copy link
Member

@Kami Kami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution.

I think those changes are good if we can get them to work, but I also think it's important that for changes like that which affect performance we have at least some basic micro benchmarks.

And in addition to that, per my in-line comment, I think we should also put them behind a feature / config option flag (we can debate later if the default should be enabled or disabled).

@guzzijones
Copy link
Contributor Author

We won't be able to add a feature flag without a change in ST2. Orquesta does not have the Stackstorm config object afaik.
I will put some benchmarks in the PR and also make sure we double check that errors is also copied.

@Kami
Copy link
Member

Kami commented Aug 10, 2023

I will put some benchmarks in the PR and also make sure we double check that errors is also copied.

Thanks.

We won't be able to add a feature flag without a change in ST2. Orquesta does not have the Stackstorm config object afaik.

Yeah, I assume it would probably also require a change on the StackStorm side. If you have cycles, I still think that would be a good idea.

Still much easier to flip the feature flag / config option in case something goes wrong versus needing to downgrade to a previous release.

Especially if it's combined with a breaking backward incompatible change in StackStorm core (live action FK change - there is only a forward migration in that PR, but not a backward one).

@rebrowning
Copy link

I've been looking into some slowness for workflows with a large amount of items inside of a with items step, I saw this PR and I'm curious what items are we actually looking to have gated by a feature flag in this PR?

@guzzijones
Copy link
Contributor Author

task render speed gain when removing deep copy for large parameters

@guzzijones
Copy link
Contributor Author

I added a micro benchmark to show how much faster it is to NOT do a deep copy. I could add it for the serialize and deserialize, but the results will be the same. Copying large objects is costly.

@guzzijones
Copy link
Contributor Author

all the 3.9 tests passed with these updates minus the lint checks).
rpm is here if @rebrowning wants to test it out.

@guzzijones
Copy link
Contributor Author

I am running this additional speed up on our internal fork of St2.
I will report back if I see any issues.

It looks like ci/cd failed because black changed with their latest release.

@rebrowning
Copy link

@Kami @guzzijones are we requiring the feature flag to switch between deepcopy and fastdeepcopy for this to move forward?

@FileMagic
Copy link

Note, this will help with "with items" speed issues.

@guzzijones guzzijones dismissed Kami’s stale review September 13, 2024 19:15

@Kami is not active on this project anymore it seems

@guzzijones
Copy link
Contributor Author

@nzlosh @cognifloyd please take a peak at this. I adds significant speed improvements by eliminating some deep copies. This helps with larger context objects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants