Collect Debian data live, aka. purl2meta #245

Closed
8 tasks done
pombredanne opened this issue Dec 13, 2023 · 3 comments

@pombredanne
Contributor

pombredanne commented Dec 13, 2023

I would like the live, on-demand PurlDB API endpoints for metadata and scans to support live, synchronous collection for a system/distro ecosystem, starting with Debian distro packages (we can expand this to other distros later):

The design would include updated API end-point(s) for metadata and scan for Debian packages (aka. purl2meta and purl2scancode)

@pombredanne
Contributor Author

A simple use case: given a container image scan in SCIO, we should be able to collect metadata from Debian for all its PURLs.
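As a rough illustration of the first step in that use case, here is a minimal sketch that picks the Debian-style PURLs out of a list of package PURLs. The function name `select_deb_purls` and the sample PURLs are hypothetical, and the check is a plain string test rather than a full PURL parser.

```python
def select_deb_purls(purls, namespaces=("debian", "ubuntu")):
    """Return the purls of type `deb` whose namespace is in `namespaces`.

    Hypothetical helper: it relies on the canonical `pkg:deb/<namespace>/...`
    layout and does not validate the rest of the purl.
    """
    prefix = "pkg:deb/"
    selected = []
    for purl in purls:
        if not purl.startswith(prefix):
            continue
        remainder = purl[len(prefix):]
        if "/" not in remainder:
            # No namespace segment, so we cannot tell which distro it is from.
            continue
        namespace = remainder.split("/", 1)[0]
        if namespace in namespaces:
            selected.append(purl)
    return selected


# Illustrative input only; these are not taken from a real scan.
sample_purls = [
    "pkg:deb/debian/curl@7.50.3-1?arch=i386",
    "pkg:pypi/requests@2.31.0",
    "pkg:deb/ubuntu/bash@5.1-6?arch=amd64",
    "pkg:deb/curl@1.0",
]
deb_purls = select_deb_purls(sample_purls)
```

The selected PURLs would then be the ones sent on for metadata collection.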

@pombredanne
Contributor Author

@AyanSinhaMahapatra is this completed now?

@AyanSinhaMahapatra
Contributor

Yes, this is completed. The changes are also merged and released in the other projects (SCTK, SCIO and debian-inspector) so that they can be included in purldb.

In debian-inspector (released as v31.1.0):

In scancode-toolkit (released as v32.1.0):

  • Improve debian source/binary package detection nexB/scancode-toolkit#3682:

    • added debian namespace detection from clues and correct namespace based on other detected packages.
    • use changes in debian-inspector v31.1.0
    • fix purl qualifier bugs
    • support scanning .dsc and copyright files and improving debian source/binary scan results for SCIO pipelines
  • Get debian source purl from parsing debian status files nexB/scancode-toolkit#3661
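To make the idea of deriving a source purl from a status file concrete, here is a minimal sketch. It is not the scancode-toolkit implementation. It assumes the usual Debian control-file convention that a binary package's `Source` field names the source package, optionally followed by a version in parentheses, and that a missing `Source` field means the source shares the binary package's name and version. The helper names are hypothetical, `arch=source` is used as the qualifier for source packages, and percent-encoding of special characters in versions is left out.

```python
import re


def parse_status_paragraph(text):
    """Parse one dpkg status paragraph into a dict of field name -> value."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and current:
            # Continuation line of a multi-line field such as Description.
            fields[current] += "\n" + line.strip()
            continue
        name, _, value = line.partition(":")
        current = name.strip()
        fields[current] = value.strip()
    return fields


def source_purl_from_status(fields, namespace="debian"):
    """Build a source purl string from parsed status fields (sketch only)."""
    name = fields["Package"]
    version = fields["Version"]
    source = fields.get("Source")
    if source:
        # "Source: foo" or "Source: foo (1.2-3)"
        match = re.match(r"^(\S+)(?:\s+\(([^)]+)\))?$", source)
        if match:
            name = match.group(1)
            version = match.group(2) or version
    # Assumption: no percent-encoding is applied to name or version here.
    return f"pkg:deb/{namespace}/{name}@{version}?arch=source"


# Hypothetical status paragraph for illustration.
status = (
    "Package: libfoo1\n"
    "Status: install ok installed\n"
    "Source: foo (1.2-3)\n"
    "Version: 1.2-3+b1\n"
    "Description: example library\n"
    " more text\n"
)
source_purl = source_purl_from_status(parse_status_paragraph(status))
```

In this example the binary package `libfoo1` maps to the source package `foo`, which is why the binary purl alone does not identify the source package.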

In scancode.io (released as v34.1.0):

  • Update debian support scancode.io#1096:

    • Add support for sending source purls for debian packages in populate_purldb pipeline
    • Add debian namespace from debian distro info detected
    • use latest SCTK v32.1.0
  • Purldb: Get metadata for and scan debian packages from Purls #300

    • update collect/ endpoints and all related functionality to accept and use source purls with purls
    • get debian binary and source package links, and metadata, .dsc and copyright links, from the purl and source purl together (the purl alone is not sufficient)
    • support both debian and ubuntu packages

We also have the latest SCTK and SCIO releases with all these changes merged in purldb: #357

See also #242 and aboutcode-org/scancode.io#1110 for related support to fetch metadata and get scancode scans.

To test this out:

  1. Either provide debian PackageURLs taken from debian/ubuntu docker images, or scan the latest debian docker images: https://scancodeio.readthedocs.io/en/latest/tutorial_web_ui_analyze_docker_image.html
  2. Either send individual purls to the api/collect/?purl= endpoint, or use the populate purldb pipeline to send these purls to purldb, either to get the metadata only or to collect and store both the metadata and scans of the binary/source packages.
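For the first option in step 2, a request URL could be assembled roughly as follows. This is a sketch under stated assumptions: the base URL is a placeholder, and the path `api/collect/` and the query parameter names `purl` and `source_purl` are assumptions that should be checked against the PurlDB API documentation. The function name `build_collect_url` is hypothetical.

```python
from urllib.parse import urlencode


def build_collect_url(base_url, purl, source_purl=None):
    """Build a collect request URL for a purl and an optional source purl.

    Assumptions: the endpoint path and the `purl` / `source_purl` parameter
    names are illustrative, not confirmed against the PurlDB API.
    """
    params = {"purl": purl}
    if source_purl:
        params["source_purl"] = source_purl
    # urlencode percent-encodes the purl so its own "?" and "=" characters
    # are not confused with the query string of the request.
    return f"{base_url.rstrip('/')}/api/collect/?{urlencode(params)}"


# Placeholder host and example purls for illustration.
url = build_collect_url(
    "https://purldb.example.org/",
    "pkg:deb/debian/curl@7.50.3-1?arch=i386",
    source_purl="pkg:deb/debian/curl@7.50.3-1?arch=source",
)
```

Encoding the purl matters because a purl with qualifiers contains its own `?` and `=` characters.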

This is tested end-to-end and working nicely for all the debian purls in ubuntu and debian latest docker images.
