Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bufferoverrun Analysis - small fixes and improvements #1736

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

sjxer723
Copy link

Hi, this pr contains some improvements and fixes to buffer overrun analysis. The two fixes were originally inspired due to the two false positives/negatives we found during using it.

  1. The buffer overrun checker ignores checking the case when the index may be +oo while the array size is less than +oo. It may cause a false negative. For example,
int a[1];
for(int i=0; a[i]; i++) {}

There will be a buffer overrun issue within the condition judgment of the for statement since i will finally be equal to 1 and the statement will check whether a[1] is nonzero.

  1. The buffer overrun checker makes a rough estimation when adding two upper bounds, i.e., when i1 <= max {1, i1}, i2 <= i2, BO only let i1 + i2 <= 1 + i2, while the true upper bound should be max{1 + i2, i1 + i2}. Hence it will cause a false positive when checking the codes below:
#define MAX_LEN (16)

void foo(const unsigned char *additional, size_t len, size_t nonce_len, size_t entropy_len)
{
        unsigned char seed[MAX_LEN];
        size_t        seedlen = 0;

        if(entropy_len > MAX_LEN) 
            return;
        if(nonce_len > MAX_LEN - entropy_len) 
            return;
        if(len > MAX_LEN - entropy_len - nonce_len) 
            return;

        seedlen +=  entropy_len;
        if(nonce_len != 0)
                seedlen += nonce_len;
        if(additional != NULL && len != 0){
                memcpy(seed + seedlen, additional, len);
                seedlen += len;
        }
}

int main()
{
        unsigned char arr[20] = {0};
        foo(arr, 8, 0, 5);
        return 0;
}

Before executing the fourth if statement, BO will regard the value of seedlen as max{16, entropy_len}. However, when executing the fourth if statement, by adding the two variables seedlen and nonce_len, BO makes a rough estimation as above description, and will regard the upper bound of seedlen as 16 + nonce_len. Hence, it will cause a false positive when executing the call to the memcpy function.

For the two issues above, I have made the following improvements.

  1. Within the checking function of array access, I let BO report a L3-level error when the upper bound of index is +oo and the upper bound of the array size is limited.
  2. I have implemented a more precise plus of bounds. It handles the case when one operator is c1 + max{x1, d1} and another is c2 + d2.

… and the upper bound of size is smaller than +oo

Summary:
The buffer overrun checker misses to handle the case when the upper bound of offset is +oo, and the upper bound of array size is less than +oo, which will causes false negative in some test cases.

For instance, for the following program,

```
int a[1];
for(int i=0; a[i]; i++) {}
```

The variable `i` will eventually be equal 1 and causes an overrun error within the loop statement. However, this error was missed by buffer overrun checker.
@facebook-github-bot
Copy link
Contributor

@skcho has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Comment on lines +350 to +353
else if
Bound.is_infty (ItvPure.ub real_idx) && Bound.is_not_infty (ItvPure.ub size)
then {report_issue_type= Issue IssueType.buffer_overrun_l3; propagate= false}
(* su < iu = +oo, probably an error *)
Copy link
Contributor

@skcho skcho Feb 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, this error was missed by buffer overrun checker.

a.c:4: error: Buffer Overrun L4
  Offset: [0, +oo] Size: 2.
  2. void foo() {
  3. int a[2];
  4. for(int i=0; a[i]; i++) {}
                  ^
  5. }

For the example case, it reports L4 issue as above. The reason you did not see the issue is that it suppresses the issue types that are likely to be false positive, e.g. offset or size interval includes an infinity bound like above.

let buffer_overrun_l4 =
register ~enabled:false ~id:"BUFFER_OVERRUN_L4" Error BufferOverrunChecker
~user_documentation:"See [BUFFER_OVERRUN_L1](#buffer_overrun_l1)"

Note that ~enabled:false is given when defining the L4 issue type.

To enable all issue types, you can pass --no-filtering option, or --enable-issue-type to enable each of them.

Comment on lines +80 to +86
let mem = (
match Tenv.lookup tenv typname with
| Some {fields} ->
let (mem, _) = decl_local_struct_fields_loc model_env loc fields (mem, 1) in mem
| None -> mem
)
in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to pass current dimension and inst_num, to distinguish the allocation sites.

        let mem, inst_num =
          match Tenv.lookup tenv typname with
          | Some {fields} ->
              decl_local_struct_fields_loc model_env loc fields ~inst_num ~dimension mem
          | None ->
              (mem, inst_num)
        in

Comment on lines +953 to +958
| MinMax (c1, Plus, Max, d1, x1), Linear (c2, x2)
| Linear (c2, x2), MinMax (c1, Plus, Max, d1, x1) ->
mk_MinMaxB (Max, (Linear (Z.(c1 + d1 + c2), x2)), (Linear (Z.(c1 + c2), SymLinear.plus (SymLinear.singleton_one x1) x2)))
| MinMax (c1, Minus, Min, d1, x1), Linear (c2, x2)
| Linear (c2, x2), MinMax (c1, Minus, Min, d1, x1) ->
mk_MinMaxB (Max, (Linear (Z.(c1 - d1 + c2), x2)), (Linear (Z.(c1 + c2), SymLinear.plus (SymLinear.singleton_minus_one x1) x2)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While MinMaxB was introduced to express more complex min/max expressions, it is NOT always a better choice than others. By its complex nature, many operations may easily lose precision on MinMaxB bound, for example, during widening it is more likely to have infinity bounds according to my experience. Therefore, if we want to introduce MinMaxB here for the plus operation, we should evaluate overall precision impacts. Let me check if it is fine for our use cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants