Container Registry Images From Manifests Never Cleaned Up #32053

Open
nephatrine opened this issue Sep 16, 2024 · 2 comments

Comments

@nephatrine
Contributor

nephatrine commented Sep 16, 2024

Description

I build and push architecture-specific container images to Gitea's container registry, then create a manifest that combines them into one multi-architecture tag and push that manifest as the image I expect users to actually use.

It appears that when I do this, Gitea creates duplicate packages for those images, named only by their sha256 digest, instead of referencing the existing packages. When I push a new version of all my tags and the manifest, the packages listed only by their bare digests are left behind. They are not visible in the normal Packages section of the web UI for any user or organization unless I delete all the named tags/versions first. They are visible in the Package Management section of the site administration UI, though, and only there can I see that I have literally thousands of old unreferenced container images that Gitea thinks are still active.

These do not seem to be cleaned up or deleted by Gitea. The UI also provides no way to delete them other than one by one, which makes cleaning this up an extremely daunting task. Is there a) a way to stop this from continuing to happen aside from "don't use manifests" and b) any way to batch delete container packages?

Gitea Version

1.22.2

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

No response

Screenshots

image

alpine-s6:edge is a manifest containing the tags for amd64, x86, aarch64, armv7, and riscv64 architectures. I have drawn connections showing which packages are duplicates of one another.

Git Version

No response

Operating System

Alpine v3.20

How are you running Gitea?

self-built docker container

Database

SQLite

@nephatrine
Contributor Author

The extra packages seem to be created when I do this:

docker manifest create MANIFESTNAME IMAGE1 IMAGE2 IMAGE3
docker manifest push --purge MANIFESTNAME 

but does not occur when I do this:

docker buildx imagetools create -t MANIFESTNAME IMAGE1 IMAGE2 IMAGE3

I'm not sure if those are expected to function differently, but I at least seem to be able to switch to imagetools create to avoid more crud piling up.

I am still completely lost as to how I'm supposed to clean up all the existing packages, though. Gitea shows me as having almost 1TB worth of packages spread across nearly 4000 packages, with roughly 95% of those being container images that need to be deleted.
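
For anyone in the same situation, a batch cleanup might be scripted against Gitea's packages REST API instead of clicking through the admin panel. The following is only a minimal sketch under several assumptions: the instance URL, owner and token are placeholders; it assumes the list endpoint (GET /api/v1/packages/{owner}) actually returns the digest-named versions, which has not been verified here; and it does not check whether a digest is still referenced by a live manifest, so it defaults to a dry run.

```python
import json
import urllib.request
from urllib.parse import quote


def is_digest_version(version: str) -> bool:
    """True for versions named by a bare digest, e.g. 'sha256:ab12...'."""
    return version.startswith("sha256:")


def delete_url(base_url: str, owner: str, pkg: dict) -> str:
    """Build the DELETE /api/v1/packages/{owner}/{type}/{name}/{version} URL."""
    parts = [owner, pkg["type"], pkg["name"], pkg["version"]]
    # Encode every segment fully: versions contain ':' and names may contain '/'.
    path = "/".join(quote(part, safe="") for part in parts)
    return f"{base_url.rstrip('/')}/api/v1/packages/{path}"


def plan_deletions(base_url: str, owner: str, packages: list) -> list:
    """Return delete URLs for container versions that are named only by digest."""
    return [
        delete_url(base_url, owner, pkg)
        for pkg in packages
        if pkg["type"] == "container" and is_digest_version(pkg["version"])
    ]


def list_container_packages(base_url: str, owner: str, token: str, limit: int = 50) -> list:
    """Page through the owner's container packages until an empty page comes back."""
    packages, page = [], 1
    while True:
        url = (
            f"{base_url.rstrip('/')}/api/v1/packages/{quote(owner, safe='')}"
            f"?type=container&page={page}&limit={limit}"
        )
        req = urllib.request.Request(url, headers={"Authorization": f"token {token}"})
        with urllib.request.urlopen(req) as resp:
            batch = json.load(resp)
        if not batch:
            return packages
        packages.extend(batch)
        page += 1


def execute(urls: list, token: str, dry_run: bool = True) -> None:
    """Print the planned deletions, or send them when dry_run is False."""
    for url in urls:
        if dry_run:
            print("would DELETE", url)
            continue
        req = urllib.request.Request(
            url, method="DELETE", headers={"Authorization": f"token {token}"}
        )
        with urllib.request.urlopen(req) as resp:
            print(resp.status, url)
```

A cautious way to use it would be to call list_container_packages, pass the result to plan_deletions, review the dry-run output of execute, and only then rerun with dry_run=False. Digest-named versions that a current multi-arch tag still points at should be excluded by hand first, since this sketch cannot tell them apart from orphans.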

@nephatrine
Contributor Author

So the issue seems to occur when creating a manifest in one package path that references images in another. This is how DockerHub handles things (https://github.com/docker-library/official-images#architectures-other-than-amd64): the architecture-specific images live under their own separate users and packages (like riscv64/ubuntu:devel) and the multiarch manifest is pushed to the unprefixed path (ubuntu:devel). This makes it much nicer and cleaner to view the tags for those multiarch images.

If I push all those images to the same path in Gitea, like nephatrine/package:latest-amd64, nephatrine/package:latest-i386, etc., and then push the manifest to that same path, like nephatrine/package:latest, that seems to work perfectly... except that it creates a lot of spam/noise by listing all those tags/versions when someone views the container package. At least when new versions of those tags are pushed, the old ones don't persist indefinitely.
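
The same-path layout described above can be sketched as a small helper that derives the per-architecture references and the manifest reference, then assembles the imagetools create command from the earlier comment. The registry host and the function names are hypothetical, and the tag suffix scheme simply mirrors the latest-amd64 / latest-i386 examples.

```python
import shlex


def same_path_refs(registry: str, repo: str, tag: str, arches: list) -> tuple:
    """Return (manifest_ref, per_arch_refs), all within one package path."""
    manifest = f"{registry}/{repo}:{tag}"
    # Per-arch images share the repo and differ only by a tag suffix.
    images = [f"{manifest}-{arch}" for arch in arches]
    return manifest, images


def imagetools_create_argv(manifest: str, images: list) -> list:
    """Build the argv for: docker buildx imagetools create -t MANIFEST IMAGE..."""
    return ["docker", "buildx", "imagetools", "create", "-t", manifest, *images]


if __name__ == "__main__":
    # "gitea.example.com" is a placeholder host, not a real registry.
    manifest, images = same_path_refs(
        "gitea.example.com", "nephatrine/package", "latest", ["amd64", "i386"]
    )
    # Print the command for review instead of running it.
    print(shlex.join(imagetools_create_argv(manifest, images)))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; passing the argv list to subprocess.run would execute it on a machine with docker buildx installed.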

The main issue to me is that Gitea does not warn about this behaviour with manifests, and there's no indication to the package uploader or owner that these garbage packages are piling up, because they're only visible in the site administration page. I do not think this is a weird corner case, as again this is how DockerHub pushes their multi-arch manifests. A non-administrator user would have no idea that they are doing terrible things to the packages system, since they can't see all those packages listed in the web UI. They might not find out until the site administrator investigates the high disk usage or package count, at which point it's probably already a huge mess that only the admin can fix. Either there should be a clear warning in the documentation that Gitea can't handle this type of workflow and you should not do this, or those sha256: versions should be visible alongside the tagged ones in the UI, where it would be obvious what was happening under the hood and where they could be cleaned up or maintained by the package owner rather than the site admin.

As an aside, I think these minor feature requests are related, though I don't know whether I should open separate feature request issues for them. 1) Either the ability to "hide" images with specific suffixes from the normal web UI for non-owners of the package, or the ability to push manifests to separate packages without creating hidden ghost package versions in the first place. 2) Checkboxes and a delete button in the site administration packages panel to delete multiple packages at once more quickly.
