Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkcs7+der: mixed BER/DER encoding with undefined length #779

Open
smndtrl opened this issue Nov 30, 2022 · 17 comments
Open

pkcs7+der: mixed BER/DER encoding with undefined length #779

smndtrl opened this issue Nov 30, 2022 · 17 comments

Comments

@smndtrl
Copy link
Contributor

smndtrl commented Nov 30, 2022

Hi,

while looking into the pkcs7 crate for a CMS usecase around the Apple world I discovered that their detached signatures use the BER indefinite length encoding for some of the SEQUENCE which of course is not supported by the der crate.

Has there been any thoughts/discussions around if that's something the formats repo should address or if that is out of scope (for now) and left to other crates.

I started with adding the signed-data content to pkcs7 validating against a DER encoded signature and discovered the problem afterwards with the ones Apple generates :( Link to example

Simon

@tarcieri
Copy link
Member

We've talked about introducing a lax mode to the der crate which supports a limited number of BER productions necessary to interop with real-world use cases like this. So far there hasn't been one until now.

@smndtrl
Copy link
Contributor Author

smndtrl commented Dec 19, 2022

I gave it a shot with my limited experience with Rust and also this codebase. It is meant as a starting point.

I used it successfully with changes to pkcs7/CMS crate that are not yet ready/

@tarcieri
Copy link
Member

tarcieri commented Dec 19, 2022

@smndtrl rather than trying to change how the existing parsing works in the der crate, you can write your own "lax" implementation of Decode for a specific type in the pkcs7 crate

@smndtrl
Copy link
Contributor Author

smndtrl commented Dec 20, 2022

@tarcieri Wouldn't that require me changing how DecodeValue is implemented on Sequence? My PKCS7 data starts with 30 80 and that already makes the Length::decode fail without reaching the first DecodeValue code for ContentInfo in pkcs7.

For making indefinite length work a solution for Length and Readers is required. The ASN1 I encountered had indefinite lengths for SEQUENCE and OCTET STRING but also can be other fields that can be constructed in BER. For OctetStringRef I'm not sure how to handle those cases as it then is non-contiguous memory (or at least broken up with Tags/Lengths values. For Sequence it via the Reader.

I'm fine with carrying this patch on my own as it works well for my use case. I might revisit the suggested approach later on.

@tarcieri
Copy link
Member

@smndtrl you'll need to write all the impls by hand if you go that route, and you will have to avoid impl'ing one of DecodeValue or FixedTag (I would suggest DecodeValue unless it needs to be in an IMPLICIT and context-specific), as otherwise it will receive a blanket impl of Decode.

If you can't figure it out I can take a look at some point based on your lax branch. It should be possible to impl the "lax" logic for just a single type.

@bkstein
Copy link
Contributor

bkstein commented Aug 10, 2023

We've talked about introducing a lax mode to the der crate which supports a limited number of BER productions necessary to interop with real-world use cases like this. So far there hasn't been one until now.

We've found a real-world use case: EJBCA sends (degenerate certificates-only) CMS messages encoded with BER. Unfortunately, that's in line with RFC 5652:

   ...
   The CMS values are generated using ASN.1 [X.208-88], using BER-
   encoding (Basic Encoding Rules) [X.209-88].

@carl-wallace
Copy link
Contributor

BER was an issue working on PKCS #12 as well. Exports from Firefox are BER. In a cert collection I use for testing, there are several several AIAs hosting P7s that are BER encoded.

@tarcieri
Copy link
Member

We're aware this is a real issue. It's something I've been meaning to address.

@tarcieri
Copy link
Member

I think we can probably use an approach like #810 but we need some way to control the granularity so enabling it doesn't cause der to revert to BER parsing globally

@bkstein
Copy link
Contributor

bkstein commented Nov 13, 2023

FYI, I have implemented a simple parser/converter (Berder), that replaces all indefinite length specifications in a BER message by definite lengths. The parser recurses through the complete message (though there is a limit for recursion depth 🙂). It doesn't fix ordering, but as RustCrypto's der crate fixes it, I think it should work reliably. I tested it against EJBCA messages. The usage is simple. Just prepare your BER data before you try to parse it, e.g.:

    ...
    let der = Berder::ber_to_der(ber.as_slice())?; // Indefinite -> definite lengths
    let ci = ContentInfo::from_der(der.as_slice())?;
    let pki_message = SignedData::from_der(ci.content.to_der()?;

This could be implemented as Decode::from_ber(), but could also be integrated into Decode::from_der(). However, the latter would introduce some performance loss.
What do you think?

@tarcieri
Copy link
Member

tarcieri commented Nov 13, 2023

I think we can probably implement Decode::from_ber in a way that doesn't require transcoding the entire document to DER first, using an approach similar to #810 (we have since added an IndefiniteLength type), and perhaps carrying the flag for using BER-like rules on the Reader, for lack of a better option.

@bkstein
Copy link
Contributor

bkstein commented Nov 14, 2023

I think we can probably implement Decode::from_ber in a way that doesn't require transcoding the entire document to DER first, using an approach similar to #810 (we have since added an IndefiniteLength type), and perhaps carrying the flag for using BER-like rules on the Reader, for lack of a better option.

@tarcieri I have started doing that based on #810. I think I got pretty far with this implementation, but there are one or two issues left (correct handling of the end-of-content markers). The main idea was, to modify the der parser to do look-ahead evaluation of indefinite lengths. Whenever the parser encounters an indefinite length, it stores the current position and makes a "fast-forward" until the end of the current TLV is reached. Then, the length is "definite"ly known and we can resume parsing with the definite length at the stored position. This look-ahead must be recursive, even if e.g. an OctetString is parsed, because it might contain other (nested) TLVs with indefinite lengths.
I have uploaded this to my repo, if you want to have a look at it.
And as you mentioned IndefiniteLength: I changed der::Header:

pub struct Header {
    /// Tag representing the type of the encoded value
    pub tag: Tag,

    /// Length of the encoded value
    pub length: IndefiniteLength,
}

This implementation is not complete, but I think it is already more complete, than the current #810.

@tarcieri
Copy link
Member

@bkstein I'd like to avoid making changes to the existing APIs like that. The idea would be to use IndefiniteLength as an encoding type to parse lengths in "BER mode", but then convert back to a Length in the public API.

However, I'd definitely like to also preserve the way the existing implementation works so indefinite lengths are rejected when parsing DER.

@bkstein
Copy link
Contributor

bkstein commented Nov 16, 2023

Yes, the Header change would break compatibility. I saw, that signature's crates dsa and ecdsa access Header::length, so this would actually affect compatibility here.
However, I introduced a flag is_parsing_ber in Reader. Is that approximately, what you intend with "BER mode"? In "DER mode", parsing is unchanged.

@tarcieri
Copy link
Member

I was thinking an enum like Encoding::Ber which would improve clarity, I think, but yes that's the basic idea

tarcieri added a commit that referenced this issue Jan 9, 2024
Adds an enum with `Ber` and `Der` (default) variants which can be used
to selectively allow a limited number of BER productions when decoding
certain BER-based security-oriented formats, e.g. CMS, PKCS#8.

Currently this doesn't actually do anything, however the goal is to
address #779, where we can't decode CMS generated by Apple tooling.

PR #810 is an example of how the rules could be relaxed to support
`IndefiniteLength`s.
@tarcieri
Copy link
Member

tarcieri commented Jan 9, 2024

I've bumped der's version to v0.8.0-pre.0 so we're now allowed to make breaking changes, which should make this easier.

I opened #1321 which adds an EncodingRules enum with Ber and Der variants, as well as a Reader::encoding_rules method to interrogate them.

tarcieri added a commit that referenced this issue Jan 9, 2024
Adds an enum with `Ber` and `Der` (default) variants which can be used
to selectively allow a limited number of BER productions when decoding
certain BER-based security-oriented formats, e.g. CMS, PKCS#8.

Currently this doesn't actually do anything, however the goal is to
address #779, where we can't decode CMS generated by Apple tooling.

PR #810 is an example of how the rules could be relaxed to support
`IndefiniteLength`s.
@Firstyear
Copy link

@tarcieri For reference, we were interested in the BER capability as we want to unify our LDAP and Kerberos parsers on one trusted encoding/decoding stack. Currently we have a "ber only" parser, but we would like to move to this crate. I'll do some testing soon to see what we need for extending this to support BER in our cases.

@tarcieri tarcieri changed the title pkcs7, der: mixed BER/DER encoding with undefined length pkcs7+der: mixed BER/DER encoding with undefined length Aug 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants