Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[analyzer] Clean up list of taint propagation functions (llvm#91635)
This commit refactors GenericTaintChecker and performs various improvements in the list of taint propagation functions: 1. The matching mode (usually `CDM::CLibrary` or `CDM::CLibraryMaybeHardened`) was specified to avoid matching e.g. C++ methods or functions from a user-defined namespace that happen to share the name of a well-known library function. 2. With these matching modes, a `CallDescription` can automatically match builtin variants of the functions, so entries that explicitly specified a builtin function were removed. This eliminated inconsistencies where the "normal" and the builtin variant of the same function was handled differently (e.g. `__builtin_strlcat` was covered, while plain `strlcat` wasn't; while `__builtin_memcpy` and `memcpy` were both on the list with different propagation rules). 3. The modeling of the functions `strlcat` and `strncat` was updated to propagate taint from the first argument (index 0), because a tainted string should remain tainted even if we append something else to it. Note that this was already applied to `strcat` and `wcsncat` by commit 6ceb1c0. 4. Some functions were updated to propagate taint from a size/length argument to the result: e.g. `memcmp(p, q, get_tainted_int())` will now return a tainted value (because the attacker can manipulate it). This principle was already present in some propagation rules (e.g. `__builtin_memcpy` was handled this way), and even after this commit there are still some functions where it isn't applied. (I only aimed for consistency within the same function family.) 5. Functions that have hardened `__FOO_chk()` variants are matched in `CDM:CLibraryMaybeHardened` to ensure consistent handling of the "normal" and the hardened variant. I added special handling for the hardened variants of "sprintf" and "snprintf" because there the extra parameters are inserted into the middle of the parameter list. 6. Modeling of `sscanf_s` was added, to complete the group of `fscanf`, `fscanf_s` and `sscanf`. 7. The `Source()` specifications for `gets`, `gets_s` and `wgetch` were ill-formed: they were specifying variadic arguments starting at argument index `ReturnValueIndex`. (That is, in addition to the return value they were propagating taint to all arguments.) 8. Functions that were related to each other were grouped together. (I know that this makes the diff harder to read, but I felt that the full list is unreadable without some reorganization.) 9. I spotted and removed some redundant curly braces. Perhaps would be good to switch to a cleaner layout with less nested braces... 10. I updated some obsolete comments and added two TODOs for issues that should be fixed in followup commits.
- Loading branch information