GitHub actions log is >2hr behind actual AWS run #474

Open
corneliusroemer opened this issue Aug 22, 2024 · 8 comments
Labels
bug Something isn't working

Comments

@corneliusroemer (Member)

corneliusroemer commented Aug 22, 2024

I just noticed that ingest has seemingly been stuck for at least 2 hours without doing anything. It hasn't yet started any curation. It's not fetching from GenBank or from GISAID, yet nothing else is happening either. This is very strange.

@joverlee521 do you have an idea of what is going on? I ran these with fetching deselected.

https://github.com/nextstrain/ncov-ingest/actions/runs/10513067722/job/29127832018

corneliusroemer added the bug (Something isn't working) label on Aug 22, 2024
@corneliusroemer (Member, Author)

Maybe it's just the log output to GitHub that's stuck instead? I didn't check locally or on AWS, just in the GitHub Actions logs.

@joverlee521 (Contributor)

The logs are further ahead on AWS.

Maybe the GH Actions logs are behind due to Python output buffering?
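
For illustration, here's a hypothetical sketch of the buffering effect (not taken from the actual ingest job): when Python's stdout is redirected to a pipe, as it is under a CI runner, it is block-buffered by default, so lines only show up once the buffer fills or the process exits.

```python
# demo.py -- hypothetical example, not part of ncov-ingest.
# When stdout is a pipe (e.g. `python demo.py | cat`), CPython block-buffers it,
# so these lines appear in a burst rather than once per second.
import time

for i in range(10):
    print(f"progress update {i}")  # sits in the buffer until it fills or the program exits
    time.sleep(1)
```

Running the same script with `PYTHONUNBUFFERED=1`, or using `print(..., flush=True)`, makes each line appear immediately.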

@corneliusroemer (Member, Author)

Can we somehow fix the output buffering? I love being able to see status in the GH logs, but it's useless if it's behind.

corneliusroemer changed the title from "Ingest not doing anything for 2+ hours after downloading caches. What's going on?" to "GitHub actions log is >2hr behind actual AWS run" on Aug 22, 2024
@joverlee521 (Contributor)

joverlee521 commented Aug 22, 2024

I think this is something we would have to fix in the Nextstrain CLI: nextstrain/cli#393

Nvm, I'm connected to the same job locally and my logs are in sync with the AWS CloudWatch logs. This is something specific to the GH Actions workflow.

@joverlee521 (Contributor)

Ah, this is an issue in the GH Actions runners: actions/runner#1031

There is a slowdown because the runner has to scan every line to mask any potential secrets in the logs.

@tsibley (Member)

tsibley commented Aug 27, 2024

Wowee.

@victorlin (Member)

From actions/runner#1031:

Once a log message gets up to about 6k characters it looks like the runner starts to take a few seconds to process.
At 100k characters this can be 10 minutes or so

The log linked in the issue description has 2,583 lines, all under 1,500 characters long and most under 400. That doesn't seem so bad, so maybe something else is going on. There's likely nothing we can do here if this is specific to GH Actions?

@tsibley (Member)

tsibley commented Aug 27, 2024

Is the issue that stdout is block-buffered since GitHub Actions redirects it to a non-interactive destination? If Python's default block size is high on the runners—it can vary machine to machine, IIRC—it might take a bit to fill the block up before it's flushed. We could try running under PYTHONUNBUFFERED=1 in the GitHub workflow to diagnose this. If it helps, then we could switch explicitly to line-buffering in Nextstrain CLI; line-buffering is a nice middle ground for this sort of output, between fully unbuffered and block-buffered.
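
As an illustration of the line-buffering option, here's a minimal sketch; it's one way to do it, not necessarily how Nextstrain CLI would implement it.

```python
# Hypothetical sketch: switch the standard streams to line-buffering so each
# newline triggers a flush, even when output is redirected to a pipe as in GH Actions.
import sys

sys.stdout.reconfigure(line_buffering=True)
sys.stderr.reconfigure(line_buffering=True)

print("flushed as soon as the newline is written")
```

For the diagnostic step, `PYTHONUNBUFFERED=1` can be set via an `env:` entry on the relevant step in the GitHub workflow, which disables Python's output buffering entirely.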
