Add helpers to perform consistency checks on slab cache #226

Open
wants to merge 13 commits into
base: main

@imran-kn imran-kn commented Dec 1, 2022

Both slub_debug (for the SLUB allocator) and CONFIG_DEBUG_SLAB (for the SLAB allocator) are useful for debugging memory errors involving slab cache objects. In both cases, however, error reporting may be delayed or missed entirely, depending on other conditions. Furthermore, slub_debug needs to include slub_debug=F (i.e. consistency checks) to produce error reports. slub_debug=F has noticeable overhead because it checks each object right before allocation and free.

If slub_debug=F is not enabled due to performance constraints or if errors were not reported by SLAB/SLUB debuggers at run time, we can still comb through a vmcore (for a kernel where SLUB/SLAB debugging was enabled) manually and look for objects that have their poison or redzone values wrong.

This changeset introduces a helper that performs consistency checks on slab-cache objects in a vmcore.

For example, in a vmcore (kernel booted with slub_debug=FPZU) where all objects are fine, we get a report like the one below:

kmalloc_64_cache = find_slab_cache(prog, "kmalloc-64")
slab_cache_check_consistency(kmalloc_64_cache, "void *")
Starting consistency check for: kmalloc-64 (struct kmem_cache *)0xffff905e7f407740
Start checking individual slabs.
Checking slab: 0xfffffcf4c1ab3500
Checking slab: 0xfffffcf4c1ab4600
...................
...................
Checking slab: 0xfffffcf4c1fdbf80
Checking slab: 0xfffffcf4c1fded00
Finished checking individual slabs.
Start checking free objects.
Finished checking free objects.
Start checking allocated objects.
Finished checking allocated objects.
Finished consistency check for slab-cache: kmalloc-64
Number of checked slabs: 125
Number of checked allocated objects: 1974
Number of checked free objects: 26

On the other hand, in a vmcore where some kmalloc-64 objects have been corrupted
due to OOB access, we get a report like the one below:

kmalloc_64_cache = find_slab_cache(prog, "kmalloc-64")
slab_cache_check_consistency(kmalloc_64_cache, "void *")
Starting consistency check for: kmalloc-64 (struct kmem_cache *)0xffff96c13d407740
Start checking individual slabs.
Checking slab: 0xfffff1bb41a34280
Checking slab: 0xfffff1bb41a37800
Checking slab: 0xfffff1bb41a37c00
.................
.................
Checking slab: 0xfffff1bb41f5ad80
Checking slab: 0xfffff1bb41f5bf80
Checking slab: 0xfffff1bb41f5ed00
Finished checking individual slabs.
Start checking free objects.
Finished checking free objects.
Start checking allocated objects.
Slab-cache: b'kmalloc-64' Object: 0xffff96c13919ee40 Right Redzone overwritten
Info: 0xffff96c13919ee80 - 0xffff96c13919ee80 @offset= 64 First byte 55 instead of 0xcc
Slab-cache: b'kmalloc-64' Object: 0xffff96c1391b4240 Right Redzone overwritten
Info: 0xffff96c1391b4280 - 0xffff96c1391b4280 @offset= 64 First byte 55 instead of 0xcc
Slab-cache: b'kmalloc-64' Object: 0xffff96c1392aca40 Right Redzone overwritten
Info: 0xffff96c1392aca80 - 0xffff96c1392aca80 @offset= 64 First byte 55 instead of 0xcc
Slab-cache: b'kmalloc-64' Object: 0xffff96c13cc1ea40 Right Redzone overwritten
Info: 0xffff96c13cc1ea80 - 0xffff96c13cc1ea80 @offset= 64 First byte 55 instead of 0xcc
Slab-cache: b'kmalloc-64' Object: 0xffff96c13cc1ec40 Right Redzone overwritten
Info: 0xffff96c13cc1ec80 - 0xffff96c13cc1ec80 @offset= 64 First byte 55 instead of 0xcc
Finished checking allocated objects.
Finished consistency check for slab-cache: kmalloc-64
Number of checked slabs: 126
Number of checked allocated objects: 1981
Number of checked free objects: 35
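
To read the report above: each flagged object starts at the address after "Object:", and the mismatch sits at offset 64, the first byte past the 64-byte payload (0xffff96c13919ee40 + 64 = 0xffff96c13919ee80). The following is a minimal sketch of that kind of right-redzone check on a raw byte buffer, not the helper added in this PR. The function name `find_redzone_mismatch`, the 8-byte redzone size, and the 0x55 stand-in byte are assumptions made for illustration; the redzone constants are the ones defined in include/linux/poison.h.

```python
SLUB_RED_ACTIVE = 0xCC    # redzone byte of an allocated object (include/linux/poison.h)
SLUB_RED_INACTIVE = 0xBB  # redzone byte of a free object


def find_redzone_mismatch(buf, object_size, redzone_size, expected):
    """Return (offset, byte) of the first wrong right-redzone byte, or None.

    buf is a hypothetical raw copy of one object: payload followed by redzone.
    """
    for off in range(object_size, object_size + redzone_size):
        if buf[off] != expected:
            return off, buf[off]
    return None


# Simulated allocated kmalloc-64 object: 64 payload bytes plus an assumed
# 8-byte right redzone filled with SLUB_RED_ACTIVE.
obj_addr = 0xFFFF96C13919EE40
buf = bytearray(b"\x00" * 64 + bytes([SLUB_RED_ACTIVE]) * 8)
buf[64] = 0x55  # simulated one-byte out-of-bounds write (arbitrary value)

hit = find_redzone_mismatch(buf, 64, 8, SLUB_RED_ACTIVE)
if hit is not None:
    off, val = hit
    print(f"Object: {obj_addr:#x} redzone overwritten at {obj_addr + off:#x} "
          f"@offset={off}: {val:#04x} instead of {SLUB_RED_ACTIVE:#04x}")
```

Against a vmcore, the bytes would come from reading the object's memory rather than from a hand-built buffer.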

From another vmcore, where kmalloc-64 objects have been corrupted because of UAF:

kmalloc_64_cache = find_slab_cache(prog, "kmalloc-64")
slab_cache_check_consistency(kmalloc_64_cache, "void *")
Starting consistency check for: kmalloc-64 (struct kmem_cache *)0xffff8e2aff407740
Start checking individual slabs.
Checking slab: 0xffffe18ac1ab2480
Checking slab: 0xffffe18ac1ac2880
Checking slab: 0xffffe18ac1ac4b80
Checking slab: 0xffffe18ac1ac4d00
..................
..................
Checking slab: 0xffffe18ac1fdad80
Checking slab: 0xffffe18ac1fdbf80
Checking slab: 0xffffe18ac1fded00
Finished checking individual slabs.
Start checking free objects.
Slab-cache: b'kmalloc-64' Object: 0xffff8e2aeac92c40 Poison overwritten
Info: 0xffff8e2aeac92c4a - 0xffff8e2aeac92c4a @offset= 10 First byte 66 instead of 0x6b
Slab-cache: b'kmalloc-64' Object: 0xffff8e2aeac93440 Poison overwritten
Info: 0xffff8e2aeac9344a - 0xffff8e2aeac9344a @offset= 10 First byte 66 instead of 0x6b
Slab-cache: b'kmalloc-64' Object: 0xffff8e2aeac93c40 Poison overwritten
Info: 0xffff8e2aeac93c4a - 0xffff8e2aeac93c4a @offset= 10 First byte 66 instead of 0x6b
Slab-cache: b'kmalloc-64' Object: 0xffff8e2aeb309c40 Poison overwritten
Info: 0xffff8e2aeb309c4a - 0xffff8e2aeb309c4a @offset= 10 First byte 66 instead of 0x6b
Slab-cache: b'kmalloc-64' Object: 0xffff8e2aeb308040 Poison overwritten
Info: 0xffff8e2aeb30804a - 0xffff8e2aeb30804a @offset= 10 First byte 66 instead of 0x6b
Finished checking free objects.
Start checking allocated objects.
Finished checking allocated objects.
Finished consistency check for slab-cache: kmalloc-64
Number of checked slabs: 124
Number of checked allocated objects: 1976
Number of checked free objects: 8
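
In the UAF report, the mismatch is inside the payload of a free object: offset 10 from the object start (0xffff8e2aeac92c40 + 10 = 0xffff8e2aeac92c4a), where the poison byte 0x6b was expected. Below is a minimal sketch of such a poison check on a raw payload, assuming the usual SLUB convention that a free object's payload is filled with POISON_FREE (0x6b) and ends with POISON_END (0xa5), both from include/linux/poison.h. The function name `find_poison_mismatch` and the 0x66 stand-in byte are illustrative assumptions, not taken from the PR's code.

```python
POISON_FREE = 0x6B  # fill byte for the payload of a free object
POISON_END = 0xA5   # last payload byte of a free object


def find_poison_mismatch(payload):
    """Return (offset, byte, expected) for the first wrong poison byte, or None."""
    last = len(payload) - 1
    for off, val in enumerate(payload):
        expected = POISON_END if off == last else POISON_FREE
        if val != expected:
            return off, val, expected
    return None


# Simulated free kmalloc-64 payload with one byte written after the free.
obj_addr = 0xFFFF8E2AEAC92C40
payload = bytearray([POISON_FREE] * 63 + [POISON_END])
payload[10] = 0x66  # simulated use-after-free write (arbitrary value)

hit = find_poison_mismatch(payload)
if hit is not None:
    off, val, expected = hit
    print(f"Object: {obj_addr:#x} poison overwritten at {obj_addr + off:#x} "
          f"@offset={off}: {val:#04x} instead of {expected:#04x}")
```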

Besides adding the above-mentioned helper, this changeset introduces some other helpers
that can be used for other debugging purposes.

I have added unit tests for some of the helpers added here and am working on unit tests for the remaining ones (slab_cache_for_each_slab, slab_cache_check_slab, slab_cache_check_object_address and slab_cache_check_object). In the meantime, I am posting these changes for review so that I can get feedback on their usability and possible improvements.

Having dedicated helpers to iterate over all slabs of a slab-cache
not only helps in the current use case(s) (i.e. iterating over objects
of a slab-cache) but also helps in cases where we intend to
perform sanity checks on individual slabs.
For example, we can iterate over all slabs to check that the number of
objects is within the allowed limit or to check that the number of inuse
objects is not more than the total number of objects on the slab.

Signed-off-by: Imran Khan <[email protected]>
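
The per-slab sanity checks described in this commit message can be sketched in plain Python as follows. This is only an illustration of the two comparisons (objects against an allowed limit, inuse against objects). The `SlabCounts` class, the `check_slab_counts` function, the limit of 32, and the addresses are all hypothetical; with a vmcore the counts would be read from each slab yielded by the iteration helper.

```python
from dataclasses import dataclass


@dataclass
class SlabCounts:
    """Hypothetical per-slab summary of the counters being checked."""
    address: int
    objects: int  # total number of objects on the slab
    inuse: int    # number of objects currently allocated


def check_slab_counts(slabs, max_objects):
    """Return a list of (address, message) for slabs with implausible counts."""
    problems = []
    for slab in slabs:
        if slab.objects > max_objects:
            problems.append(
                (slab.address, f"objects={slab.objects} exceeds limit {max_objects}"))
        if slab.inuse > slab.objects:
            problems.append(
                (slab.address, f"inuse={slab.inuse} exceeds objects={slab.objects}"))
    return problems


slabs = [
    SlabCounts(0x1000, objects=16, inuse=16),  # full slab, consistent
    SlabCounts(0x2000, objects=16, inuse=17),  # inuse larger than objects
    SlabCounts(0x3000, objects=40, inuse=3),   # more objects than the limit
]
problems = check_slab_counts(slabs, max_objects=32)
for address, message in problems:
    print(f"slab {address:#x}: {message}")
```
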
…_each_slab.

slab_cache_for_each_slab was introduced in the previous commit. It has other uses, but
for_each_allocated_object can already make use of it.

Signed-off-by: Imran Khan <[email protected]>
Right now _page_objects returns the allocated objects of a slab for
both SLUB and SLAB. In certain debug scenarios we may be interested
in all objects, or only the free objects, of a slab.
The next changes introduce such helpers.
So rename _page_objects to _page_allocated_objects to indicate its
current functionality and to allow the addition of other helpers.

Signed-off-by: Imran Khan <[email protected]>
These are useful when we want to traverse the list of free objects
of a slab-cache.

Signed-off-by: Imran Khan <[email protected]>
This is useful when we intend to iterate through all objects
of a slab (and by extension a slab-cache) to perform sanity checks
on individual objects.

Signed-off-by: Imran Khan <[email protected]>
With slab debugging we expect certain magic values to be present
in specific object areas.
These object areas are checked as part of slab consistency checks.
Add the relevant poison patterns and helpers to check for the presence of
slub debug options.

Signed-off-by: Imran Khan <[email protected]>
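
As a rough sketch of what checking for slub debug options amounts to: the cache flags are tested against the debug flag bits, each of which corresponds to one slub_debug letter. The flag values below are the ones in include/linux/slab.h for kernels of this period, but treat them as an assumption and verify them against the kernel that produced the vmcore. The helper name `debug_option_letters` is hypothetical and is not the name used in this PR.

```python
# Debug flag bits (assumed values from include/linux/slab.h).
SLAB_CONSISTENCY_CHECKS = 0x00000100  # slub_debug=F
SLAB_RED_ZONE = 0x00000400            # slub_debug=Z
SLAB_POISON = 0x00000800              # slub_debug=P
SLAB_STORE_USER = 0x00010000          # slub_debug=U

_OPTION_LETTERS = {
    SLAB_CONSISTENCY_CHECKS: "F",
    SLAB_RED_ZONE: "Z",
    SLAB_POISON: "P",
    SLAB_STORE_USER: "U",
}


def debug_option_letters(flags):
    """Return the slub_debug letters whose flag bits are set in flags."""
    return "".join(
        letter for bit, letter in _OPTION_LETTERS.items() if flags & bit)


# Flags of a hypothetical cache on a kernel booted with slub_debug=FPZU.
flags = (SLAB_CONSISTENCY_CHECKS | SLAB_RED_ZONE
         | SLAB_POISON | SLAB_STORE_USER)
print(debug_option_letters(flags))
```

Redzone checks only make sense when the Z bit is set and poison checks only when the P bit is set, which is why the validator needs this information first.
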
This can be used to verify whether pointers lying on regular or
lockless freelists are valid addresses of slab objects.

Signed-off-by: Imran Khan <[email protected]>
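
One way to picture the address check: a freelist pointer is plausible only if it falls inside the slab and lands exactly on an object boundary. The sketch below is a simplified model under stated assumptions, not the PR's implementation. The function name `is_valid_object_address`, the slab base address, the 128-byte stride, and the object count are all made up, and the left redzone padding that SLUB may place before objects is ignored.

```python
def is_valid_object_address(ptr, slab_base, stride, num_objects):
    """Return True if ptr is the start of one of the slab's objects.

    stride is the distance between consecutive objects (payload plus any
    debug metadata); left redzone padding is ignored in this sketch.
    """
    offset = ptr - slab_base
    if offset < 0 or offset >= stride * num_objects:
        return False  # pointer lies outside the slab
    return offset % stride == 0  # pointer must sit on an object boundary


# Hypothetical slab: 32 objects spaced 128 bytes apart.
slab_base = 0xFFFF8E2AEAC92000
stride = 128
num_objects = 32

candidates = [
    slab_base + 3 * stride,       # start of object 3
    slab_base + 3 * stride + 8,   # inside object 3, not at its start
    slab_base + 32 * stride,      # one past the last object
]
for ptr in candidates:
    verdict = "valid" if is_valid_object_address(
        ptr, slab_base, stride, num_objects) else "invalid"
    print(f"{ptr:#x}: {verdict}")
```
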
This can be used to verify different object counts (max,
current), slab padding, etc.

Signed-off-by: Imran Khan <[email protected]>
This is useful for validating an entire object if slub/slab debugging
has been enabled. It verifies the redzone and padding areas around objects.
For free objects it also verifies that the payload area of the object contains
the correct poison values, if poisoning has been enabled.

Signed-off-by: Imran Khan <[email protected]>
Both slub_debug (for the SLUB allocator) and CONFIG_DEBUG_SLAB (for the SLAB
allocator) are useful for debugging memory errors involving slab cache
objects. In both cases, however, error reporting may be delayed or
missed entirely, depending on other conditions.
Furthermore, slub_debug needs to include slub_debug=F (i.e. consistency
checks) to produce error reports. slub_debug=F has noticeable overhead because
it checks each object right before allocation and free.

If slub_debug=F is not enabled due to performance constraints or if
errors were not reported by SLAB/SLUB debuggers at run time, we can
still comb through a vmcore (for a kernel where SLUB/SLAB debugging
was enabled) manually and look for objects that have their poison or
redzone values wrong.

The helper introduced in this change is useful in such cases. As an
example, the snippet below shows UAF errors involving 5 kmalloc-64 objects:

...
Finished checking individual slabs.
Start checking free objects.
Slab-cache: b'kmalloc-64' Object: 0xffff8e2aeac92c40 Poison overwritten
Info:  0xffff8e2aeac92c4a - 0xffff8e2aeac92c4a  @offset= 10 First byte 66 instead of 0x6b
Slab-cache: b'kmalloc-64' Object: 0xffff8e2aeac93440 Poison overwritten
Info:  0xffff8e2aeac9344a - 0xffff8e2aeac9344a  @offset= 10 First byte 66 instead of 0x6b
Slab-cache: b'kmalloc-64' Object: 0xffff8e2aeac93c40 Poison overwritten
Info:  0xffff8e2aeac93c4a - 0xffff8e2aeac93c4a  @offset= 10 First byte 66 instead of 0x6b
Slab-cache: b'kmalloc-64' Object: 0xffff8e2aeb309c40 Poison overwritten
Info:  0xffff8e2aeb309c4a - 0xffff8e2aeb309c4a  @offset= 10 First byte 66 instead of 0x6b
Slab-cache: b'kmalloc-64' Object: 0xffff8e2aeb308040 Poison overwritten
Info:  0xffff8e2aeb30804a - 0xffff8e2aeb30804a  @offset= 10 First byte 66 instead of 0x6b
Finished checking free objects.
Start checking allocated objects.
Finished checking allocated objects.
Finished consistency check for slab-cache: kmalloc-64
Number of checked slabs: 124
Number of checked allocated objects: 1976
Number of checked free objects: 8
...

Signed-off-by: Imran Khan <[email protected]>
Add unit tests for slab helpers added in this change set.

Signed-off-by: Imran Khan <[email protected]>
@imran-kn
Contributor Author

I have renamed the consistency-checking functions to validators:
slab_cache_check_object_address ---> slab_cache_validate_object_address
slab_cache_check_object ---> slab_cache_validate_object
slab_cache_check_slab ---> slab_cache_validate_slab
slab_cache_check_consistency ---> slab_cache_validate

Also, the changes have been rebased on the current slab.py.
