Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid global path conditions in Kani's library #2394

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

tautschnig
Copy link
Member

Description of changes:

There is no need to maintain global path conditions in order to constrain non-deterministic data: we can locally pick an arbitrary one of the permitted values. This avoids propagating guards on input data across all properties subsequently seen in symbolic execution.

Resolved issues:

n/a

Related RFC:

n/a

Call-outs:

n/a

Testing:

  • How is this change tested? Tested locally and found fewer path conditions in CBMC's --program-only output.

  • Is this a refactor change? No.

Checklist

  • Each commit message has a non-empty body, explaining why the change was made
  • n/a Methods or procedures are documented
  • Regression or unit tests are included, or existing tests cover the modified code
  • My PR is restricted to a single feature or bugfix

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

@tautschnig tautschnig requested a review from a team as a code owner April 20, 2023 11:42
@tautschnig tautschnig force-pushed the no-assume branch 2 times, most recently from 8df4908 to 5068b06 Compare April 20, 2023 12:54
There is no need to maintain global path conditions in order to
constrain non-deterministic data: we can locally pick an arbitrary one
of the permitted values. This avoids propagating guards on input data
across all properties subsequently seen in symbolic execution.
Copy link
Contributor

@zhassan-aws zhassan-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tautschnig. For my own education, can you clarify how this may improve CBMC's performance? My understanding is that an assume expands to something along the lines of if (!cond) exit(0). If we replace it with:

if value is valid {
    return value
} else {
    return a default valid value
}

Would this result in more guards propagated (for the two branches) in symex?

library/kani/src/arbitrary.rs Show resolved Hide resolved
let val = u32::any();
crate::assume(val <= 0xD7FF || (0xE000..=0x10FFFF).contains(&val));
unsafe { char::from_u32_unchecked(val) }
let val = u32::any() & 0x0FFFFF;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this eliminate valid char values like 0x10FFFE?

Copy link
Contributor

@celinval celinval Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could add a harness that this logic covers all valid char values and that it doesn't produce any invalid ones. For example, I wrote a harness that uses the same logic here, but add two assertions (if I understood this correctly):

#[kani::proof]
fn any() -> char {
    // Generate an arbitrary u32 and constrain it to make it a valid representation of char.
    let val = kani::any::<u32>() & 0x0FFFFF;
    if val & 0xD800 != 0 {
        // val > 0xD7FF && val < 0xE000
        // This branch should only cover values that cannot be converted to UTF-8.
        assert!(char::from_u32(val).is_none());
        unsafe { char::from_u32_unchecked(0x10FFFF) }
    } else {
        // This branch should only convert valid values.
        assert!(char::from_u32(val).is_some());
        unsafe { char::from_u32_unchecked(val) }
    }
}

The assertion of the if branch fails. The CEX I got was for 69632.

Copy link
Contributor

@celinval celinval Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following harness uses cover to check if our logic covers all characters:

/// Same logic as `kani::any::<char>()` but that takes the u32 as input.
fn any_char(input: u32) -> char {
    let val = input & 0x0FFFFF;
    if val & 0xD800 != 0 {
        // val > 0xD7FF && val < 0xE000
        unsafe { char::from_u32_unchecked(0x10FFFF) }
    } else {
        unsafe { char::from_u32_unchecked(val) }
    }
}

/// Checks that we generate all possible valid characters.
#[kani::proof]
fn check_all_valid_char() {
    let input: u32 = kani::any();
    if let Some(valid) = char::from_u32(input) {
        if valid != any_char(input) {
            // If this is unreachable, we produce all valid characters.
            // If this is satisfiable, we may produce all valid characters.
            // If this is unsat, we do not produce value for all valid characters.
            kani::cover!(any_char(kani::any()) == valid)
        }
    }
}

This cover is unsatisfiable.

@@ -100,8 +103,11 @@ macro_rules! nonzero_arbitrary {
#[inline(always)]
fn any() -> Self {
let val = <$base>::any();
crate::assume(val != 0);
unsafe { <$type>::new_unchecked(val) }
if val == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it useful to eliminate branches like you did in the bool case? If so, we can use:

   let val = val | (((val == 0) as $base) & 0x1);

or something simpler (if possible). This sets the last bit only if the value is 0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will do it, right?

   let val = val | (val == 0) as $base;

@tautschnig tautschnig marked this pull request as draft April 20, 2023 19:40
@celinval
Copy link
Contributor

Thanks @tautschnig. For my own education, can you clarify how this may improve CBMC's performance? My understanding is that an assume expands to something along the lines of if (!cond) exit(0). If we replace it with:

if value is valid {
    return value
} else {
    return a default valid value
}

Would this result in more guards propagated (for the two branches) in symex?

Isn't this similar to the problem we had when we added an assume(!cond) after every assert!(cond)? If I remember correctly, any assumption condition becomes a reachability condition for every code that follows its execution.

@tautschnig tautschnig added the Z-BenchCI Tag a PR to run benchmark CI label Sep 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Z-BenchCI Tag a PR to run benchmark CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants