Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No error when BOM upload fails due to PURL address size limit exceeding #3876

Open
2 tasks done
eugenhoffmann opened this issue Jun 24, 2024 · 3 comments
Open
2 tasks done
Labels
defect Something isn't working p2 Non-critical bugs, and features that help organizations to identify and reduce risk size/M Medium effort

Comments

@eugenhoffmann
Copy link

Current Behavior

When uploading a BOM file containing PURL addresses that exceed the 786-characters limit, the upload fails and is logged as an error.

stack_trace: javax.jdo.JDOFatalUserException: Attempt to store value "..." in column ""PURL"" that has maximum length of 786. Please correct your data!

However, the message "BOM uploaded" can be misleading, as it suggests that the BOM upload was successful.
When using the /v1/bom REST API, a success status code (200) is returned without any error message, which can cause confusion.

Steps to Reproduce

  • Upload the BOM file via GUI large_purl.json. Project - Tab Components - Upload BOM
  • Upload the BOM file large_purl.json file using the /v1/bom REST API

Expected Behavior

Return an error message indicating that the PURL addresses exceed the 786-character limit

Dependency-Track Version

4.11.4

Dependency-Track Distribution

Container Image

Database Server

PostgreSQL

Database Server Version

15.5

Browser

Google Chrome

Checklist

@eugenhoffmann eugenhoffmann added defect Something isn't working in triage labels Jun 24, 2024
@nscuro nscuro added p2 Non-critical bugs, and features that help organizations to identify and reduce risk size/S Small effort size/M Medium effort and removed in triage size/S Small effort labels Jun 24, 2024
@nscuro
Copy link
Member

nscuro commented Jun 24, 2024

I knew people were abusing PURLs, but exceeding even 786 characters is new to me 😆


Note, we might need some custom pre-flight checks after schema validation, but before dispatching the BOM for asynchronous processing, that check for DT-specific limitations like this.

Another offender could be components having an empty name field, which the CycloneDX schema currently does not cover: CycloneDX/specification#461

Will need to keep in mind that uploaded BOMs can be large, so repeated parsing will cause significant overhead. Either we find a way to parse once, and re-use that across schema- and custom validation, or the custom validation needs to make use of streaming, similar to

private Version detectSchemaVersionFromJson(final byte[] bomBytes) throws IOException {
try (final com.fasterxml.jackson.core.JsonParser jsonParser = jsonMapper.createParser(bomBytes)) {
JsonToken currentToken = jsonParser.nextToken();
if (currentToken != JsonToken.START_OBJECT) {
final String currentTokenAsString = Optional.ofNullable(currentToken)
.map(JsonToken::asString).orElse(null);
throw new JsonParseException(jsonParser, "Expected token %s, but got %s"
.formatted(JsonToken.START_OBJECT.asString(), currentTokenAsString));
}

@marob
Copy link

marob commented Sep 2, 2024

PURL specification is not defining any size limit and we should expect it to be arbitrary long (see package-url/purl-spec#289).
I'd suggest storing as TEXT (or similar) in database to remove any size limit in order to be compliant to the spec.

@nscuro
Copy link
Member

nscuro commented Sep 2, 2024

@marob We are limited by databases offering different degrees of support for this: #2076 (comment)

The limitation can be lifted in Dependency-Track v5, where we will focus entirely on PostgreSQL.

I'd also argue that while the PURL spec doesn't dictate any limit, lengths above 786 are borderline abuse of PURLs, and you should really question if cramming so much data into it is really necessary.

See also: https://github.com/package-url/purl-spec/blob/master/PURL-SPECIFICATION.rst#known-qualifiers-keyvalue-pairs

Note: Do not abuse qualifiers: it can be tempting to use many qualifier keys but their usage should be limited to the bare minimum for proper package identification to ensure that a purl stays compact and readable in most cases.

Additional, separate external attributes stored outside of a purl are the preferred mechanism to convey extra long and optional information such as a download URL, vcs URL or checksums in an API, database or web form.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defect Something isn't working p2 Non-critical bugs, and features that help organizations to identify and reduce risk size/M Medium effort
Projects
None yet
Development

No branches or pull requests

3 participants