KernelGPT: Enhanced Kernel Fuzzing via Large Language Models

ise-uiuc/KernelGPT

Important

We are continuing to improve the documentation and add more implementation details. Please watch README-DEV.md for more information.

Contact: Chenyuan Yang, Zijie Zhao, Lingming Zhang.

About

  • KernelGPT is a novel approach that uses Large Language Models (LLMs) to automatically infer Syzkaller specifications for enhanced kernel fuzzing.
  • KernelGPT leverages an iterative approach to automatically infer all the necessary specification components, and further leverages the validation feedback to repair/refine the initial specifications.
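
The validate-then-repair idea in the second point can be pictured as a small loop. The following is only a minimal sketch under stated assumptions, not KernelGPT's actual implementation: `refine_spec`, `validate`, `repair`, and `max_rounds` are hypothetical names introduced for illustration, with the validator and repairer passed in as plain callables.

```python
from typing import Callable, List


def refine_spec(
    initial_spec: str,
    validate: Callable[[str], List[str]],
    repair: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> str:
    """Validate a spec and feed any errors back into a repair step.

    All names here are hypothetical; in practice `validate` would wrap a
    spec checker and `repair` would wrap an LLM call.
    """
    spec = initial_spec
    for _ in range(max_rounds):
        errors = validate(spec)
        if not errors:
            # The spec passed validation, so no further repair is needed.
            break
        # Hand the validation feedback to the repair step and try again.
        spec = repair(spec, errors)
    return spec
```

Bounding the loop with `max_rounds` is a design choice of this sketch: it keeps a spec that never validates from triggering an unbounded number of repair attempts.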

Important

  • KernelGPT has detected 19 new bugs 🐛 in the Linux kernel, 8 of which have been assigned CVEs❗ and 8 of which have been fixed.
  • A number of specifications generated by KernelGPT have already been merged into Syzkaller.

🔨 Installation

To install the required packages, run the following command:

pip install -r requirements.txt

Linux & Syzkaller

You need to clone the linux and syzkaller repositories to run the code. You can do this by running the following command:

git submodule update --init --recursive

Please refer to the Syzkaller documentation for setup instructions.

Image

cd image && bash create-image.sh

🔍 Usage

Parsing

You first need to compile the kernel with Clang and record the compile commands with bear. To do this, run the following commands:

cd linux
make CC=clang HOSTCC=clang allyesconfig
bear -- make CC=clang HOSTCC=clang -j$(nproc)
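
Before moving on, it can be worth checking that bear produced a usable compile_commands.json. The sketch below assumes the standard Clang JSON compilation database layout: an array of objects, each with `directory`, `file`, and either `command` or `arguments`. The helper names `count_malformed` and `summarize_compile_commands` are hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def count_malformed(entries):
    """Count entries missing the fields a compilation database entry needs."""
    malformed = 0
    for entry in entries:
        # Each entry needs either a "command" string or an "arguments" list.
        has_cmd = isinstance(entry, dict) and (
            "command" in entry or "arguments" in entry
        )
        if not (has_cmd and "file" in entry and "directory" in entry):
            malformed += 1
    return malformed


def summarize_compile_commands(path):
    """Return (entry_count, malformed_count) for a compile_commands.json file."""
    entries = json.loads(Path(path).read_text())
    if not isinstance(entries, list):
        raise ValueError("expected a JSON array of compile commands")
    return len(entries), count_malformed(entries)
```

An empty database or a non-zero malformed count usually means the build was not actually recorded, for example because nothing was recompiled while bear was tracing.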

To parse the Linux repository, first build the analyzer by running the following commands:

cd spec-gen/analyzer
make all

This creates two executables, analyze and usage, in the spec-gen/analyzer directory.

⚠️ Possible issues: You need to install `clang` and `libclang-dev` to compile the `analyze` and `usage` executables. Specifically, Clang version 14 is required. You can install it by running the following command:

sudo apt-get install clang-14 libclang-dev

Please refer to the analyzer README for more information.

Once the executables are built, run analyze on the compile commands recorded in the Linux tree:

./analyze -p /path/to/linux/compile_commands.json

Then run the process_output.py script:

python process_output.py --linux-path /path/to/linux

Then collect the usage information:

./usage -p /path/to/linux/compile_commands.json

Then run the process_output.py script again, this time with the --usage flag:

python process_output.py --linux-path /path/to/linux --usage
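
The four steps above always run in the same order, so they can be gathered into one small driver. This is a sketch rather than a script shipped with the repository: `analyzer_commands` and `run_all` are hypothetical helper names, and it assumes every command is launched from the spec-gen/analyzer directory with compile_commands.json at the top of the Linux tree.

```python
import subprocess
from pathlib import Path


def analyzer_commands(linux_path):
    """Return the four analyzer steps, in order, as argv lists."""
    db = str(Path(linux_path) / "compile_commands.json")
    linux = str(linux_path)
    return [
        ["./analyze", "-p", db],
        ["python", "process_output.py", "--linux-path", linux],
        ["./usage", "-p", db],
        ["python", "process_output.py", "--linux-path", linux, "--usage"],
    ]


def run_all(linux_path, analyzer_dir="spec-gen/analyzer"):
    """Run each step in turn and stop at the first one that fails."""
    for cmd in analyzer_commands(linux_path):
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(cmd, cwd=analyzer_dir, check=True)
```

Calling `run_all("/path/to/linux")` from the repository root would then execute the whole sequence; keeping command construction separate from execution makes the order easy to inspect without running anything.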

After that, you will get the following files under the spec-gen/analyzer directory:

processed_enum.json
processed_enum-typedef.json
processed_func.json
processed_handlers.debug.json
processed_handlers.json
processed_ioctl_filtered.json
processed_ioctl.json
processed_struct.json
processed_struct-typedef.json
processed_usage.json
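
A quick way to confirm that both passes completed is to check for the files listed above. The helper below is a minimal sketch; `missing_outputs` is a hypothetical name, and the file list is copied from this section.

```python
from pathlib import Path

# File names as listed above; all are expected under spec-gen/analyzer.
EXPECTED_OUTPUTS = [
    "processed_enum.json",
    "processed_enum-typedef.json",
    "processed_func.json",
    "processed_handlers.debug.json",
    "processed_handlers.json",
    "processed_ioctl_filtered.json",
    "processed_ioctl.json",
    "processed_struct.json",
    "processed_struct-typedef.json",
    "processed_usage.json",
]


def missing_outputs(analyzer_dir):
    """Return the expected output files that are absent from analyzer_dir."""
    base = Path(analyzer_dir)
    return [name for name in EXPECTED_OUTPUTS if not (base / name).is_file()]
```

For example, `missing_outputs("spec-gen/analyzer")` returns an empty list when every file is present. If only processed_usage.json is reported missing, the --usage pass has probably not been run yet.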

Specification Generation

To generate the specification, first put your OpenAI API key in the openai_key file under the spec-gen directory. Then run the following command:

python gen_spec.py -d analyzer/processed_handlers.json -o spec-output -n 1

This will generate one specification file in the spec-output directory.
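
Before running gen_spec.py, it can help to confirm that the key file is in place. The sketch below only checks that an openai_key file exists under the given directory and is not empty; `api_key_ready` is a hypothetical helper, and the assumption that the file holds the key as plain text is not confirmed by this README.

```python
from pathlib import Path


def api_key_ready(spec_gen_dir):
    """Return True if spec_gen_dir/openai_key exists and is non-empty."""
    key_file = Path(spec_gen_dir) / "openai_key"
    # Assumption: the key is stored as plain text; whitespace-only counts as empty.
    return key_file.is_file() and key_file.read_text().strip() != ""
```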

Then you can validate and repair the specification by running the following command:

python eval_spec.py -u -s spec-output/_generated --output-name debug -o eval-output

This validates the specification and writes the repaired specification to the eval-output directory. Internally, it invokes spec-eval/run-specs.py.