Skip to content

Query API

Florian Wendland edited this page Apr 7, 2022 · 20 revisions

We want to create a way to create "rules" or "checks", that check for certain patterns in the code. Therefore we need to decide, if we want to have a "algorithmic" or a "descriptive" way to declare such as check.

Syntax explanation: |x| means, that x should be "resolved", either through constant propagation or other fancy algorithms.

The following examples check that no such bug is present.

Array out of bounds exception

Part of: CWE XXX Do we need to think of negative indices?

forall (n: ArraySubscriptionExpression): |max(n.index)| < |min(n.array.capacity)|

Null pointer exception

forall (n: Node): n.has_base() => |n.base| != null

Memcpy too large source (Buffer Overflow)

Part of CWE 120: Buffer Copy without Checking Size of Input ('Classic Buffer Overflow') --> do we also need to find selfwritten copy functions for buffers?

forall (n: CallExpression): n.invokes.name == "memcpy" => |sizeof(n.arguments[0])| < |sizeof(n.arguments[2])|

Memcpy too small source

forall (n: CallExpression): n.invokes.name == "memcpy" => |sizeof(n.arguments[1])| >= |sizeof(n.arguments[2])|

Division by 0

forall (n: BinaryOperator): n.op == "/" => not exists n.rhs == 0

Integer Overflow/Underflow

For assignments:

forall (n: BinaryOperator): n.op == "=" => |max(n.rhs)| < |max_size(n.type)| land |max(n.rhs)| > |min_size(n.type)|

For other expressions, we need to compute the effect of the operator

Use after free

Intuition: No node which is reachable from a node free(x) must use x. Use EOG for reachability but I'm not sure how to say "don't use x". This is the most basic form.

forall (n: CallExpression): n.invokes.name == "free" => forall u in |EOG(n)|: u != n.arguments[0]

Double Free

Format string attack

arg0 of functions such as printf must not be user input. Since I'm not aware that we have a general model for "this is user input" (yet), we could say that all options for the argument must be a Literal (not sure if the proposed notation makes sense though).

vuln_fcs = ["fprint", "printf", "sprintf", "snprintf", "vfprintf", "vprintf", "vsprintf", "vsnprintf"];
forall (n: CallExpression): n.invokes.name in vuln_fcs => forall u in |backwards_DFG(n.arguments[0])|: u is Literal

Since many classical vulns. (injection) are related to user input, we probably need a way to specify sources of user input (or "sources" in general). To reduce FP, we probably also want to check some conditions over the path between the source and the sink (e.g. some checks are in place to check for critical characters/substrings, do escaping, etc.). Problem: There are tons of options.

Access of Uninitialized Pointer (CWE 824)

Access of Invalid Memory Address

Unsecure Default Return Value

Missing Return Value Validation (Error checking)

Command Injection

Proper Nulltermination of Strings (C specific)

Improper Certificate Validation (CWE 306)

Use of Hard-coded Credentials (CWE 798)

Idea: when crypto API is known, we could follow to input argument for passwords / keys ...

Scribbles

Test arguments of call expression

forall (n: CallExpression): n.invokes.name == "<function name>" => |n.arguments[<no.>]| == <value>

Track return value of call expression

data_flow(from, to) -- there exists a dataflow of the value from into a "sink" to

forall (n1: CallExpression, n2: CallExpression): n1.invokes.name == "<function name 1>" && n2.invokes.name == "<function name 2>" => data_flow(n1.returnValue, n2.arguments[<no.>])

Ensure path property

forall (n: CallExpression, v: Value) : n.invokes.name == "<function name>" && data_flow(v, n.arguments[<no.>]) => inferred_property(v, <property>)

Example:

val algo = read_from_file(/* some file */);
if (val != "AES") {
  throw Exception();
}
val cipher = initialize_cipher(algo); // at this point one can infer that val must have the value "AES"
Clone this wiki locally