[Feature Request]: Python sdk docker images released in tandem with pypi #32731

Closed
1 of 17 tasks
RobMcKiernan opened this issue Oct 10, 2024 · 9 comments

Comments

@RobMcKiernan

What would you like to happen?

I've noticed that the PyPI releases of the Beam Python SDK are often out of sync with the Docker images. This causes a bit of annoyance with my dependency update tools.

For instance, at the time of writing, PyPI is on 2.59.0: https://pypi.org/project/apache-beam/
whereas there is a Docker image on Docker Hub that's on 2.60.0: https://hub.docker.com/r/apache/beam_python3.12_sdk/tags

I'd like both to be released at the same time.

Issue Priority

Priority: 2 (default / most feature requests should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@liferoad
Collaborator

@Abacn did we already release the Docker images with 2.60.0? Both should be released together.

@Abacn
Contributor

Abacn commented Oct 11, 2024

This is working as intended. The Docker image with the "latest" tag is still on 2.59.0.

@liferoad
Collaborator

I see. So docker pull apache/beam_python3.12_sdk:latest is what gives you the official latest Beam release. A bit confusing.

@github-actions github-actions bot added this to the 2.61.0 Release milestone Oct 11, 2024
@RobMcKiernan
Author

I pulled the Docker image tagged "latest" to verify that it's on 2.59.0, but it seems to be on 2.57.0, at least according to apache_beam.version.__version__:

➜ docker pull apache/beam_python3.12_sdk:latest
latest: Pulling from apache/beam_python3.12_sdk
fea1432adf09: Pull complete 
5651b5803b18: Pull complete 
3873416e6a33: Pull complete 
8a142b8b0e69: Pull complete 
c1ab5c0b7cab: Pull complete 
541445b689ae: Pull complete 
2160297bfeb9: Pull complete 
6aa46f5fa6a3: Pull complete 
7819e6b66eb2: Pull complete 
6f34129bac21: Pull complete 
163e928ec7e8: Pull complete 
254158c9034b: Pull complete 
db59b7ea230b: Pull complete 
4f4fb700ef54: Pull complete 
Digest: sha256:c73e4ae03eb2f257cf5984458b429de84e72a1eaa94b99be7f8c389b54aa6baa
Status: Downloaded newer image for apache/beam_python3.12_sdk:latest
docker.io/apache/beam_python3.12_sdk:latest

➜ docker images|grep beam_python3.12_sdk
apache/beam_python3.12_sdk                               latest                   4fa1bf687b40   3 months ago    2.8GB

➜ docker run -it --entrypoint=/bin/bash 4fa1bf687b40
root@b148a708f868:/# python -c "import apache_beam;print(apache_beam.version.__version__)"
2.57.0

Not sure what's going on here? @Abacn can you shed some light on this?
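
The manual check above could be scripted so that several images can be compared against the expected release. The following is only a sketch: it assumes Docker is installed locally and that the image has python on its PATH (as the shell session above suggests). The helper names version_check_command, installed_beam_version and is_stale are made up for illustration and are not part of Beam.

```python
import subprocess


def version_check_command(image: str) -> list[str]:
    """Build a docker command that prints the Beam version inside `image`.

    Overriding the entrypoint with `python` mirrors the manual check above,
    where the container was started with a custom entrypoint.
    """
    return [
        "docker", "run", "--rm",
        "--entrypoint", "python",
        image,
        "-c", "import apache_beam; print(apache_beam.version.__version__)",
    ]


def installed_beam_version(image: str) -> str:
    """Run the check and return the reported version (requires Docker)."""
    result = subprocess.run(
        version_check_command(image),
        capture_output=True,
        text=True,
        check=True,  # raise if the container exits with an error
    )
    return result.stdout.strip()


def is_stale(image_version: str, expected_version: str) -> bool:
    """True when the image reports a different version than expected."""
    return image_version != expected_version


if __name__ == "__main__":
    image = "apache/beam_python3.12_sdk:latest"
    expected = "2.59.0"  # the version "latest" was expected to be at the time
    found = installed_beam_version(image)
    print(image, found, "STALE" if is_stale(found, expected) else "ok")
```

Running the same loop over each beam_pythonX.Y_sdk image would show quickly whether only one image is lagging behind.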

@liferoad
Collaborator

cc @damccorm

@damccorm
Contributor

Looks like latest got updated for everything but 3.12 (e.g. https://hub.docker.com/r/apache/beam_python3.11_sdk/tags).

Additionally, I don't see 3.12 getting picked up in the finalize_release run - https://github.com/apache/beam/actions/runs/10819961501/job/30018975145

@damccorm
Contributor

damccorm commented Oct 11, 2024

I think https://github.com/apache/beam/blob/07322cc86d35fd2af5c32228796e7936f58416d6/.github/workflows/finalize_release.yml#L58C18-L58C77 is likely the problem - it looks like maybe docker search doesn't respect anything after the /.

dannymccormick-macbookpro:~ dannymccormick$ docker search apache/beam_ --format "{{.Name}}" --limit 100
apache/airflow
apache/superset
bitnami/apache
apache/couchdb
apache/solr-operator
apache/tika
apache/apisix
apache/nifi-registry
...

(doesn't include beam_python3.12_sdk, but does include the others)

@damccorm
Contributor

Interestingly, searching for apache/beam produces much better results. So that may be the quick fix:

docker search "apache/beam" --format "{{.Name}}" --limit 100
apache/beam_go_sdk
apache/beam_java11_sdk
apache/beam_python3.8_sdk
apache/beam_python3.7_sdk
apache/beam_java8_sdk
apache/beam_python3.10_sdk
apache/beam_python3.11_sdk
apache/beam_python3.9_sdk
apache/beam_java17_sdk
apache/beam_java_sdk
apache/beam_python3.6_sdk
apache/beam_spark_job_server
apache/beam_flink1.13_job_server
apache/beam_spark3_job_server
apache/beam_flink1.12_job_server
apache/beam_java21_sdk
apache/beam_flink1.14_job_server
apache/beam_flink1.16_job_server
apache/beam_flink1.15_job_server
apache/beam_flink1.10_job_server
apache/beam_flink1.9_job_server
apache/beam_python2.7_sdk
apache/beam_flink1.8_job_server
apache/beam_flink1.11_job_server
apache/beam_python3.5_sdk
apache/airflow
apache/superset
apache/beam_python3.12_sdk
...

It still doesn't feel as robust as I'd like it to be, though, so maybe it makes sense to make this a real docker task alongside

tasks.register("pushAllDockerImages") {
eventually
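
Since the search output above mixes unrelated repositories (apache/airflow, apache/superset) with the Beam ones and can silently omit an image, one possible safeguard is to filter the results by prefix and compare them with an explicit list of expected images. This is only a sketch of that idea, not what finalize_release.yml or #32750 actually does; beam_repos and missing_repos are hypothetical helper names, and the sample text is an excerpt of the output shown above.

```python
def beam_repos(search_output: str, prefix: str = "apache/beam_") -> list[str]:
    """Keep only repository names that start with `prefix`, in order."""
    names = (line.strip() for line in search_output.splitlines())
    return [name for name in names if name.startswith(prefix)]


def missing_repos(found: list[str], expected: list[str]) -> list[str]:
    """Return the expected repositories that the search did not return."""
    found_set = set(found)
    return [name for name in expected if name not in found_set]


# Excerpt of the `docker search "apache/beam"` output from this thread.
sample = """\
apache/beam_python3.5_sdk
apache/airflow
apache/superset
apache/beam_python3.12_sdk
"""

found = beam_repos(sample)
print(found)

# If the search had stopped short of the last line, the gap would be reported
# instead of being skipped silently.
truncated = beam_repos("apache/beam_python3.5_sdk\napache/airflow\n")
print(missing_repos(truncated, ["apache/beam_python3.12_sdk"]))
```

Failing the workflow when missing_repos returns anything would have surfaced the 3.12 gap at release time rather than through a user report.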

@damccorm damccorm mentioned this issue Oct 11, 2024
3 tasks
@damccorm
Contributor

I reran the finalize release workflow after updating the script in #32750 and now things should be correct. Thanks for reporting this!
