build-release: Validate collections MANIFEST.json and FILES.json against repository tag #321

Open
dmsimard opened this issue Oct 19, 2021 · 7 comments

Comments

@dmsimard
Contributor

Two files are added to a built collection tarball: FILES.json and MANIFEST.json.

We should validate:

  • That FILES.json sha256sum matches what is in MANIFEST.json
  • That each file in FILES.json indeed matches the reported sha256sum
  • That each file sha256sum in FILES.json matches the tagged version from the collection source repository

Samples of these files below:

> cat MANIFEST.json 
{
 "collection_info": {
  "namespace": "community",
  "name": "general",
  "version": "3.7.0",
  "authors": [
   "Ansible (https://github.com/ansible)"
  ],
  "readme": "README.md",
  "tags": [
   "community"
  ],
  "description": null,
  "license": [],
  "license_file": "COPYING",
  "dependencies": {},
  "repository": "https://github.com/ansible-collections/community.general",
  "documentation": "https://docs.ansible.com/ansible/latest/collections/community/general/",
  "homepage": "https://github.com/ansible-collections/community.general",
  "issues": "https://github.com/ansible-collections/community.general/issues"
 },
 "file_manifest_file": {
  "name": "FILES.json",
  "ftype": "file",
  "chksum_type": "sha256",
  "chksum_sha256": "ec3acfc23701b8e4d876ee274cdcc928d4e5690c6b66fa9233f0cf7a25051bbf",
  "format": 1
 },
 "format": 1
}
> head -n50 FILES.json
{
 "files": [
  # [...]
  {
   "name": ".github/ISSUE_TEMPLATE/bug_report.yml",
   "ftype": "file",
   "chksum_type": "sha256",
   "chksum_sha256": "61579cadde5af3a949928993bcf53e5ad593ecc95fe3a9c598e56cfa9b5af89e",
   "format": 1
  },
  {
   "name": ".github/ISSUE_TEMPLATE/documentation_report.yml",
   "ftype": "file",
   "chksum_type": "sha256",
   "chksum_sha256": "a6f7269155262cf2612f378421b260aed37260822b9f94b09236f220b25de176",
   "format": 1
  },
  # [...]
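
The first two checks can be sketched in a few lines of Python. This is only a rough illustration, not how any existing tool does it: the function name `verify_collection_tarball` is made up, and it assumes the tarball stores `MANIFEST.json`, `FILES.json` and every listed file at the archive root under the names used in `FILES.json`. Entries whose `ftype` is not `file` are skipped.

```python
import hashlib
import json
import tarfile


def sha256_hex(data):
    """Return the hex sha256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def verify_collection_tarball(fileobj):
    """Return a list of problems; an empty list means every checksum matched.

    Hypothetical helper: assumes members sit at the archive root.
    """
    problems = []
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:

        def read(name):
            # extractfile() raises KeyError for unknown names and
            # returns None for members that are not regular files.
            member = tar.extractfile(name)
            if member is None:
                raise KeyError(name)
            return member.read()

        manifest = json.loads(read("MANIFEST.json"))
        files_json = read("FILES.json")

        # Check 1: FILES.json matches the checksum recorded in MANIFEST.json.
        expected = manifest["file_manifest_file"]["chksum_sha256"]
        if sha256_hex(files_json) != expected:
            problems.append("FILES.json does not match MANIFEST.json")

        # Check 2: every listed file matches its recorded checksum.
        for entry in json.loads(files_json)["files"]:
            if entry["ftype"] != "file":
                continue
            try:
                actual = sha256_hex(read(entry["name"]))
            except KeyError:
                problems.append(f"{entry['name']}: missing from tarball")
                continue
            if actual != entry["chksum_sha256"]:
                problems.append(f"{entry['name']}: checksum mismatch")
    return problems
```

It would be called with an open binary file, for example `verify_collection_tarball(open("community-general-3.7.0.tar.gz", "rb"))`.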
@dmsimard
Contributor Author

I have bits of code that might be helpful:

Ideally we would be able to check out the git version of the collection and then compare if the files match what we have in FILES.json.

I expect some challenges, such as repositories that tag releases in different formats (or not at all), but it is hopefully not impossible.
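
One way to cope with differing tag formats is to try a short list of likely tag names for a version and use the first one that exists. The sketch below is purely illustrative: the three formats (`1.9.0`, `v1.9.0`, `<name>-1.9.0`) and the function names are assumptions, not a survey of what collections actually use.

```python
def candidate_tags(version, name=None):
    """List tag names a repository might plausibly use for a version.

    The formats here are guesses for illustration only.
    """
    candidates = [version, f"v{version}"]
    if name:
        candidates.append(f"{name}-{version}")
    return candidates


def resolve_tag(version, available_tags, name=None):
    """Return the first candidate present in available_tags, or None."""
    available = set(available_tags)
    for tag in candidate_tags(version, name):
        if tag in available:
            return tag
    return None
```

A `None` result covers the "not tagged at all" case, which a validator could report as a warning rather than a failure.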

@felixfontein
Collaborator

We should validate:

* That FILES.json sha256sum matches what is in MANIFEST.json

* That each file in FILES.json indeed matches the reported sha256sum

ansible-galaxy collection verify already does some (or even all?) of that verification. Using that should be easy.

@dmsimard
Contributor Author

dmsimard commented Oct 19, 2021

Thanks for the tip @felixfontein, I didn't even realize that command existed.

I tried it quickly and still need to dig further to understand exactly what it does (and how), but at first glance:

> ansible-galaxy collection verify --help
usage: ansible-galaxy collection verify [-h] [-s API_SERVER] [--token API_KEY] [-c] [-v] [-p COLLECTIONS_PATH] [-i] [--offline] [-r REQUIREMENTS] [collection_name ...]

positional arguments:
  collection_name       The collection(s) name or path/url to a tar.gz collection artifact. This is mutually exclusive with --requirements-file.

optional arguments:
  -h, --help            show this help message and exit
  -s API_SERVER, --server API_SERVER
                        The Galaxy API server URL
  --token API_KEY, --api-key API_KEY
                        The Ansible Galaxy API key which can be found at https://galaxy.ansible.com/me/preferences.
  -c, --ignore-certs    Ignore SSL certificate validation errors.
  -v, --verbose         verbose mode (-vvv for more, -vvvv to enable connection debugging)
  -p COLLECTIONS_PATH, --collections-path COLLECTIONS_PATH
                        One or more directories to search for collections in addition to the default COLLECTIONS_PATHS. Separate multiple paths with ':'.
  -i, --ignore-errors   Ignore errors during verification and continue with the next specified collection.
  --offline             Validate collection integrity locally without contacting server for canonical manifest hash.
  -r REQUIREMENTS, --requirements-file REQUIREMENTS
                        A file containing a list of collections to be verified.

> wget https://galaxy.ansible.com/download/community-general-3.8.0.tar.gz
--2021-10-19 12:37:27--  https://galaxy.ansible.com/download/community-general-3.8.0.tar.gz
Resolving galaxy.ansible.com (galaxy.ansible.com)... 104.26.1.234, 172.67.68.251, 104.26.0.234, ...
Connecting to galaxy.ansible.com (galaxy.ansible.com)|104.26.1.234|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://ansible-galaxy.s3.amazonaws.com/artifact/4c/0f4b5be02d12235c4209af98501bc19fe305a066ff8dead9509b314af67ca8?response-content-disposition=attachment%3B%20filename%3Dcommunity-general-3.8.0.tar.gz&AWSAccessKeyId=AKIAJZZ23S6M5JUH2EOA&Signature=nJ2AkYIdRJ%2B9n0gkQ1QNxKiItzY%3D&Expires=1634665048 [following]
--2021-10-19 12:37:28--  https://ansible-galaxy.s3.amazonaws.com/artifact/4c/0f4b5be02d12235c4209af98501bc19fe305a066ff8dead9509b314af67ca8?response-content-disposition=attachment%3B%20filename%3Dcommunity-general-3.8.0.tar.gz&AWSAccessKeyId=AKIAJZZ23S6M5JUH2EOA&Signature=nJ2AkYIdRJ%2B9n0gkQ1QNxKiItzY%3D&Expires=1634665048
Resolving ansible-galaxy.s3.amazonaws.com (ansible-galaxy.s3.amazonaws.com)... 52.216.8.243
Connecting to ansible-galaxy.s3.amazonaws.com (ansible-galaxy.s3.amazonaws.com)|52.216.8.243|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2220728 (2.1M) [application/octet-stream]
Saving to: ‘community-general-3.8.0.tar.gz’

community-general-3.8.0.tar.gz                                         100%[============================================================================================================================================================================>]   2.12M  8.82MB/s    in 0.2s    

2021-10-19 12:37:28 (8.82 MB/s) - ‘community-general-3.8.0.tar.gz’ saved [2220728/2220728]

> ansible-galaxy collection verify community-general-3.8.0.tar.gz
ERROR! 'file' type is not supported. The format namespace.name is expected.

> ansible-galaxy collection verify https://galaxy.ansible.com/download/community-general-3.8.0.tar.gz
Downloading https://galaxy.ansible.com/download/community-general-3.8.0.tar.gz to /home/dmsimard/.ansible/tmp/ansible-local-821546820f8vxq/tmplac4iqqg/community-general-3.8.0-jg8r9knz
ERROR! 'url' type is not supported. The format namespace.name is expected.

> ansible-galaxy collection verify community.general
Downloading https://galaxy.ansible.com/download/community-general-3.7.0.tar.gz to /home/dmsimard/.ansible/tmp/ansible-local-8215998k2x4b04/tmpbict5ymn/community-general-3.7.0-qvh97n7n
Verifying 'community.general:3.7.0'.
Installed collection found at '/home/dmsimard/.ansible/collections/ansible_collections/community/general'
MANIFEST.json hash: cf2fd70ec4ca3a314da9fa2df876f6f90e4696c17bb6e1e4fb8d8b5674e1c347
Successfully verified that checksums for 'community.general:3.7.0' match the remote collection.

My understanding is that it downloads the version of the collection you have installed and verifies that it matches.
This is still somewhat useful but it's not exactly what I am looking for.

Consider the following scenario:

  • I push a version tag, say 1.9.0, for a collection on GitHub that isn't published automatically by Zuul (i.e., someone publishes it manually or through some other process)
  • I have the collection checked out on my laptop but it's "dirty", with either unrelated files or files that differ from the version I just pushed
  • I run ansible-galaxy collection build . and publish the result to Galaxy as 1.9.0

The contents of the tarball and the contents of the 1.9.0 tag would be different with mismatched checksums.

Another example would be how https://github.com/CheckPointSW/CheckPointAnsibleMgmtCollection doesn't have a meta/runtime.yml file even though it's in the tarball. A testing implementation (as described in this issue) would have picked up that the file was in the tarball but not in the source repo and raised a flag.

This would avoid mistakes, bugs or worse: unintended or malicious code changes.
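
A rough sketch of that comparison, assuming the tagged version has already been checked out into a local directory. The helper names `hash_tree` and `compare_with_source` are hypothetical. Only files listed in `FILES.json` are checked, because the source repository can legitimately contain files that are left out of the build.

```python
import hashlib
import os


def hash_tree(root):
    """Map each file under root (relative, '/'-separated path) to its sha256."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into git metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            with open(path, "rb") as handle:
                digests[rel] = hashlib.sha256(handle.read()).hexdigest()
    return digests


def compare_with_source(files_json, source_digests):
    """Compare the 'file' entries of a parsed FILES.json with a source tree.

    Returns (missing_from_source, mismatched) as two lists of file names.
    """
    missing, mismatched = [], []
    for entry in files_json["files"]:
        if entry["ftype"] != "file":
            continue
        name = entry["name"]
        if name not in source_digests:
            missing.append(name)
        elif source_digests[name] != entry["chksum_sha256"]:
            mismatched.append(name)
    return missing, mismatched
```

In the meta/runtime.yml example, that file would show up in the first list, and a tarball built from a dirty checkout would show up in the second.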

@dmsimard
Contributor Author

I created an issue about the misleading help text for ansible-galaxy collection verify: ansible/ansible#76087

@s-hertel

s-hertel commented Oct 21, 2021

  • That each file sha256sum in FILES.json matches the tagged version from the collection source repository

Yeah, if the tag isn't published that won't work. Verify is for checking an installed collection against a published one (not source repository).

Regarding the scenario in #321 (comment), how about using the --offline option to check if the local collection is dirty? That just skips the download and trusts the local FILES.json.

@dmsimard
Contributor Author

I thought it would be worth mentioning that there's an interesting tangential improvement planned in ansible-core 2.13: ansible-galaxy CLI collection verification, source, and trust.

From asking around, my understanding is that it will allow triggering the build of a collection from a git ref. In other words, instead of building a collection locally and then uploading the resulting tarball, the ansible-galaxy CLI would be able to request a build of a specific git ref, and the galaxy server would be the one checking out the repository and doing the build.

While we may not be able to leverage it right away in the community galaxy, we can look forward to it once that is eventually migrated to galaxyng.

@gotmax23
Contributor

I created an RFC in #556. It needs some work, but I wanted to get the discussion started.
