GitHub - shobhitk18/Kernel-Hacking-And-Debugging: Injecting bugs in the kernel and demonstrating use of kernel-hacking option to debug them

As part of this assignment, we demonstrate 13 kernel hacking options (KHOs) by triggering various bugs in kernel code and catching them.

System Design :
To make it easy to trigger the bugs that the KHOs catch, we have implemented a new system call, "__NR_trigger_bug".
The user program takes a command number from the user and triggers the corresponding bug.

The syscall program is named "sys_trigger_bug.c". The user program is named "xtrigger_bug.c".
The options are defined in a common header file, "common.h", which is shared between the user and kernel code. This design is efficient
because multiple bugs can be triggered with a single kernel config and a single module. Separate configs are maintained for conflicting KHOs.

We have 3 config files:
1. kernel_main.config
- config file to run all kernel hacking options except BUG_KERNEL_MEM_LEAK (bug code 3) and BUG_DEADLOCK (bug code 5)
2. memleak_deadlock.config
- config file to run 2 kernel hacking options: BUG_KERNEL_MEM_LEAK (bug code 3) and BUG_DEADLOCK (bug code 5)
3. without_kho.config
- config file without kernel hacking options enabled

User level validations :
1. Missing argument
If the user doesn't give an option along with the command, an error message is returned.
2. Invalid argument
If the user gives an invalid option, for which no bug code is defined, an error message is returned to the user.
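
The two checks above could look roughly like the following user-space sketch. The range 1 to 13 mirrors the bug codes listed in this README, but the names BUG_CODE_MIN, BUG_CODE_MAX and parse_bug_code(), as well as the -EINVAL return convention, are assumptions made for illustration and are not taken from common.h or xtrigger_bug.c.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical bounds: this README numbers the options 1 through 13. */
enum { BUG_CODE_MIN = 1, BUG_CODE_MAX = 13 };

/*
 * Validate the command line and extract the bug code.
 * argv[0] is the program name and argv[1] the requested option.
 * Returns 0 on success, or -EINVAL if the argument is missing or
 * does not name a defined bug code.
 */
int parse_bug_code(int argc, char **argv, int *code)
{
    char *end;
    long val;

    if (argc < 2 || argv[1][0] == '\0')
        return -EINVAL;                 /* missing argument */

    errno = 0;
    val = strtol(argv[1], &end, 10);
    if (errno != 0 || *end != '\0')
        return -EINVAL;                 /* not a plain decimal number */
    if (val < BUG_CODE_MIN || val > BUG_CODE_MAX)
        return -EINVAL;                 /* no bug code defined for it */

    *code = (int)val;
    return 0;
}
```

A caller would print its error message whenever the helper returns a non-zero value, and only invoke the system call when it returns 0.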

We have detected various classes of bugs, such as memory management bugs, locking bugs, stalls and delays,
linked list manipulation errors, and device driver related errors. The options used and their implementations are listed below.

Run Instructions:
The Makefile is located in the CSE-506 directory.
There is also a script, "install_module.sh".
This script runs make and also loads and removes the module, so there is no need to run make separately.
Simply run: sh install_module.sh

Kernel Hacking Options :

1. BUG_RW_SEMAPHORE : [1]
Option Enabled :
RW Semaphore debugging: basic checks (CONFIG_DEBUG_RWSEMS)
Implementation:
This kernel hacking option detects misuse of RW semaphore locking.
RW locks can be taken in exclusive (write) or shared (read) mode. A lock must be released in the same mode in which it was taken;
otherwise the critical section can be corrupted or left in an inconsistent state.
This kernel hacking option was introduced to detect such mismatches. To trigger the bug, we take the lock in one mode and
release it in a different mode. The option detects the anomaly and prints a warning at runtime in dmesg (along with the call trace) as below:
DEBUG_LOCKS_WARN_ON (sem->owner != get_current())
WARNING: CPU: 0 PID: 6487 at kernel/locking/rwsem.c:134 up_write+0x75/0x80
Command to Run:
./xtrigger_bug 1
NOTE:
Reboot kernel before running this option.
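
The owner check quoted in the warning above can be pictured with a small user-space model. This is only a sketch: struct task, struct toy_rwsem and the toy_* functions are made-up stand-ins, they do no real locking, and they are not the kernel's rwsem implementation. The model only shows why releasing in write mode a lock that was taken in read mode fails a check of the form sem->owner != get_current().

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a kernel task; illustrative only. */
struct task { int pid; };

/* Toy semaphore that only tracks ownership, with no real locking. */
struct toy_rwsem {
    const struct task *owner;   /* writer holding the lock, else NULL */
    int readers;                /* number of read-side holders */
};

/* Read-side acquire: readers never become the owner. */
void toy_down_read(struct toy_rwsem *sem)
{
    sem->readers++;
}

/* Write-side acquire: record the calling task as owner. */
void toy_down_write(struct toy_rwsem *sem, const struct task *cur)
{
    sem->owner = cur;
}

/*
 * Write-side release. Returns false where a debug kernel would warn,
 * i.e. when the caller is not the recorded write owner.
 */
bool toy_up_write(struct toy_rwsem *sem, const struct task *cur)
{
    if (sem->owner != cur)
        return false;           /* models sem->owner != get_current() */
    sem->owner = NULL;
    return true;
}
```

In this model a toy_down_write() followed by toy_up_write() succeeds, while a toy_down_read() followed by toy_up_write() is rejected.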

2. BUG_SLEEP_INSIDE_ATOMIC_SECTION
Option Enabled:
Sleep inside atomic section checking (CONFIG_DEBUG_ATOMIC_SLEEP)
Implementation:
Spinlocks are fast and are meant to be held only for short critical (atomic) sections, so a process must not sleep inside an atomic section.
When enabled, this config detects such occurrences (a process sleeping inside an atomic section). In our implementation, the current
process holds a spinlock and, inside the atomic section, tries to allocate 10k bytes of kernel memory using kmalloc with 'GFP_KERNEL',
an allocation that is allowed to sleep. This triggers the bug, which the option reports in dmesg (along with the call trace) as:
BUG: sleeping function called from invalid context at mm/slab.h:421
[37956.023826] in_atomic()
Command to Run:
./xtrigger_bug 2

3. BUG_KERNEL_MEM_LEAK
Option Enabled:
Kernel memory leak detector (CONFIG_DEBUG_KMEMLEAK)
Implementation:
A memory leak occurs when kernel memory is allocated and never freed. When enabled, this option detects such leaks. We simply
allocate memory using kmalloc and do not free it.
A kernel thread scans memory every 10 minutes (by default) and prints the number of new unreferenced objects found. To display
the details of all possible memory leaks [2]:
# mount -t debugfs nodev /sys/kernel/debug/
# cat /sys/kernel/debug/kmemleak
To trigger an intermediate memory scan:
# echo scan > /sys/kernel/debug/kmemleak
The file /sys/kernel/debug/kmemleak lists the memleaks as follows:
unreferenced object 0xffff9e3037a00000 (size 1000000):
comm "xtrigger_bug", pid 4896, jiffies 4294941456 (age 208.064s)
hex dump (first 32 bytes):
f0 77 ff ff 05 00 00 00 00 00 00 00 00 00 00 00 .w..............
1c 00 00 00 74 05 00 00 e8 77 ff ff 32 00 00 00 ....t....w..2...
backtrace:
[<0000000003ce7d60>] trigger_kernel_mem_leak+0x14/0x30 [sys_trigger_bug]

Command to Run:
./xtrigger_bug 3 [Run this twice]
NOTE:
This won't run in conjunction with the debug slab option.

4. BUG_DEBUG_VM_PAGE
Options Enabled :
Debug VM (CONFIG_DEBUG_VM), Debug page-flags operations (CONFIG_DEBUG_VM_PGFLAGS)
Implementation:
This option enables extra validation of page flags operations. To trigger this bug, we poison the "flags" field in the page
structure. When we then try to free the allocated page, the check fires: it inspects the flags field and reports a BUG because the
field is corrupted. The bug is shown in dmesg (along with the call trace) as below:
BUG: Bad page state in process xtrigger_bug.
page ffffca88140... is uninitialized and poisoned
page dumped because PAGE_FLAGS_CHECK_AT_FREE is set
bad because of flags 0x30f231(locked|lru|active|slab|reserved|private|private_2|writeback|unevictable|mlocked)
Command to run :
./xtrigger_bug 4

5. BUG_DEADLOCK:
Option Enabled :
RT Mutex debugging, deadlock detection (CONFIG_DEBUG_RT_MUTEXES)
Implementation :
This option helps catch deadlock scenarios and prints information in dmesg about the cause of a deadlock when it is caused by
mutexes (a cyclic dependency). An obvious scenario is a race between two threads that try to grab two mutexes in opposite
orders. This leads to a deadlock when one thread takes lock A, the other thread takes lock B, and
each then waits forever for lock B and lock A respectively.
This scenario is reproduced in the code and demonstrated.
The option emits a useful description of the locks held and the possible cause of the deadlock.
The dmesg shows the following (along with the call trace):
WARNING: bad unlock balance detected!
thread1/4888 is trying to release lock (test_mutex_lock2) at:
[<ffffffffc026d27e>] t1_func+0x7e/0x110 [sys_trigger_bug]
but there are no more locks to release!
WARNING: possible circular locking dependency detected
Command to Run:
./xtrigger_bug 5
NOTE:
Reboot the kernel before running this option
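
One way to picture a "possible circular locking dependency" report is as a cycle in a graph of lock orderings. The user-space sketch below records which lock was taken while another was held and refuses an ordering that would close a cycle. It is an illustrative model only: struct lock_graph, record_lock_order() and the small fixed lock count are assumptions, it performs no bounds checking on lock indices, and it is not how the kernel implements its detection.

```c
#include <stdbool.h>

#define MAX_LOCKS 8   /* arbitrary size for this sketch */

/* order[a][b] is true once lock b was taken while lock a was held. */
struct lock_graph {
    bool order[MAX_LOCKS][MAX_LOCKS];
};

/* Depth-first search: can `to` be reached from `from` over recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int n = 0; n < MAX_LOCKS; n++)
        if (g->order[from][n] && !seen[n] && reachable(g, n, to, seen))
            return true;
    return false;
}

/*
 * Record that `wanted` is being acquired while `held` is held.
 * Returns false if this ordering would close a cycle (for example
 * A then B in one thread, B then A in another); the edge is then
 * not recorded.
 */
bool record_lock_order(struct lock_graph *g, int held, int wanted)
{
    bool seen[MAX_LOCKS] = { false };

    if (reachable(g, wanted, held, seen))
        return false;
    g->order[held][wanted] = true;
    return true;
}
```

With locks A = 0 and B = 1, recording "0 then 1" succeeds, and a later attempt to record "1 then 0" is rejected, which corresponds to the opposite-order scenario described above.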

6. BUG_LINKED_LIST_CORRUPTION
Option Enabled :
Debug linked list manipulation (CONFIG_DEBUG_LIST)
Implementation:
This option helps find and catch list manipulation bugs in the kernel at runtime. It is a very powerful feature
because it catches several types of manipulation bugs.
The bug demonstrated in our work is list corruption. We deliberately corrupt a pointer of a node
in a list; when we then try to add a new node, the extra checks enabled by this option detect
the pointer mismatch. The option can also catch adding the same node to a linked list twice (not demonstrated here).
The dmesg shows the following (along with the call trace):
list_add corruption. prev->next should be next (ffffa79b80543e90), but was 00000000deadbeef. (prev=ffff92c574c34d28).
WARNING: CPU: 0 PID: 4989 at lib/list_debug.c:28 __list_add_valid+0x6a/0x70
Command to Run:
./xtrigger_bug 6
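
The kind of consistency check behind the "list_add corruption" message can be sketched in user space as follows. The struct layout matches the usual circular doubly linked list, but list_add_valid() and checked_list_add_tail() are simplified stand-ins written for this README, and the choice of which pointer gets corrupted in the usage note is an assumption rather than a description of sys_trigger_bug.c.

```c
#include <stdbool.h>

/* Circular doubly linked list node, in the style of the kernel's lists. */
struct list_head {
    struct list_head *next, *prev;
};

/* An empty list is a head that points to itself in both directions. */
void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Sanity checks before inserting `entry` between `prev` and `next`. */
bool list_add_valid(const struct list_head *entry,
                    const struct list_head *prev,
                    const struct list_head *next)
{
    if (next->prev != prev)
        return false;           /* next->prev should be prev */
    if (prev->next != next)
        return false;           /* prev->next should be next */
    if (entry == prev || entry == next)
        return false;           /* same node added twice */
    return true;
}

/* Append `entry` at the tail; returns false instead of corrupting further. */
bool checked_list_add_tail(struct list_head *entry, struct list_head *head)
{
    struct list_head *prev = head->prev;
    struct list_head *next = head;

    if (!list_add_valid(entry, prev, next))
        return false;
    next->prev = entry;
    entry->next = next;
    entry->prev = prev;
    prev->next = entry;
    return true;
}
```

If the next pointer of the current tail node is overwritten with a bogus value such as 0xdeadbeef, the following checked_list_add_tail() call fails the "prev->next should be next" test without ever dereferencing the bogus pointer.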

7. BUG_SOFT_LOCKUP
Option Enabled :
Detect Soft Lockups (CONFIG_SOFTLOCKUP_DETECTOR)
Implementation:
This option helps find tasks that loop in kernel mode for more than 20 seconds. With this kernel hacking option,
we can find tasks that take up CPU time unnecessarily. To demonstrate this feature, we simply run a while loop for around 30 seconds.
This makes the thread occupy a CPU core for 30 seconds. With the option enabled, the kernel prints a message
naming the thread that caused the soft lockup. The thread causing the soft lockup exits on its own after 30 seconds, so that
the core does not stall permanently. The bug is shown in dmesg (along with the call trace) as below:
watchdog: BUG: soft lockup CPU#1 stuck for 23s![xtrigger_bug:4997]
Command to Run:
./xtrigger_bug 7

8. BUG_INVALID_NOTIFIER
Option Enabled :
Debug notifier call chains (CONFIG_DEBUG_NOTIFIERS)
Implementation: [3]
This adds sanity checks for notifier call chains. Notifier chains allow a device to announce an event or status change by calling the
functions that subscribers have registered. The publisher maintains a notifier chain, a list of notifier_block structures, one for each
subscriber. Upon an event, the publisher module traverses the chain and calls the event handler of each registered
notifier_block. If a registered event handler lies outside the kernel TEXT segment, calling it may taint the kernel. This debug option is therefore
helpful for device drivers, to check that their notifiers are valid and have correct handler methods registered. To create such a scenario, we
created a dummy publisher and a dummy subscriber that registers an event-handler function pointer with a value of 0, which is well outside the
kernel TEXT segment. We then trigger a dummy event from our subscriber, which catches the invalid notifier handler that was registered.
The dmesg shows the following (along with the call trace):
Invalid notifier called!
WARNING: CPU: 1 PID: 4951 at kernel/notifier.c:88 notifier_call_chain+0x86/0x90
Command to Run:
./xtrigger_bug 8

9. BUG_SCATTERLIST_CHAINED
Option Enabled :
Debug SG table operations (CONFIG_DEBUG_SG)
Implementation:
This option turns on checks on scatter/gather tables. A scatterlist lets us build large buffers out of pages scattered around physical
memory. Each scatterlist entry points to a page in memory. However, an entry can instead be chained so that it points to another scatterlist.
Whether an entry is chained is recorded by overloading the low bits of the page_link field of the last entry in a scatterlist array.
It does not make sense to assign a page to an entry that is already chained. Thus, this hacking option lets us catch cases
where we incorrectly try to assign a page to a chained scatterlist entry. The dmesg shows the following (along with the call trace):
kernel BUG at ./include/linux/scatterlist.h:97!
invalid opcode: 0000 [#1] SMP PTI
Command to Run:
./xtrigger_bug 9
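
The overloading of page_link can be illustrated with a small user-space model. The sketch assumes that the two lowest bits of an aligned pointer are free to carry flags; struct toy_sg, the TOY_SG_* values and the toy_* functions are hypothetical names for this example and are not the kernel's scatterlist API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits stored in the low bits of page_link (illustrative values). */
#define TOY_SG_CHAIN 0x01u   /* entry links to another array */
#define TOY_SG_END   0x02u   /* entry is the last one in the list */

struct toy_sg {
    uintptr_t page_link;     /* pointer bits plus the two flag bits */
};

bool toy_sg_is_chain(const struct toy_sg *sg)
{
    return (sg->page_link & TOY_SG_CHAIN) != 0;
}

/* Turn `link` into a chain entry pointing at the next array. */
void toy_sg_chain(struct toy_sg *link, struct toy_sg *next_array)
{
    link->page_link = ((uintptr_t)next_array | TOY_SG_CHAIN)
                      & ~(uintptr_t)TOY_SG_END;
}

/*
 * Assign a page to an entry. Returns false where a debug build would
 * stop: the pointer is not aligned enough to leave the flag bits free,
 * or the entry is already a chain link.
 */
bool toy_sg_assign_page(struct toy_sg *sg, void *page)
{
    uintptr_t flags = sg->page_link & (TOY_SG_CHAIN | TOY_SG_END);

    if ((uintptr_t)page & (TOY_SG_CHAIN | TOY_SG_END))
        return false;
    if (toy_sg_is_chain(sg))
        return false;
    sg->page_link = flags | (uintptr_t)page;
    return true;
}
```

Assigning a page to an ordinary entry succeeds in this model, while assigning one to an entry that was previously passed to toy_sg_chain() is rejected, which is the situation the bug above provokes.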

10. BUG_DMA_API
Option Enabled:
Enable debugging of DMA-API usage (CONFIG_DMA_API_DEBUG)
Implementation:
We create a dummy device with DMA capabilities and then allocate a DMA-coherent buffer using
void * dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp_t flag).
It returns a pointer to the allocated region (consistent memory), or NULL if the allocation fails.
We then free this consistent memory while passing NULL as the device parameter to
void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, dma_addr_t dma_handle). Here cpu_addr is the address of
the allocated DMA-coherent buffer [4].
This triggers the report below in dmesg (along with the call trace), in which a device driver tries to free memory it has not allocated:
NULL NULL: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x0000000138b51000] [size=100 bytes]
Command to Run:
./xtrigger_bug 10

Extra Credits :

11. SLAB_VALIDATOR
Option Enabled :
Debug slab memory allocations (CONFIG_DEBUG_SLAB)
Implementation:
This is a crucial option that enables various checks in the kernel memory allocation functions. Issues such as memory overruns and
missing initialization can be caught with this option. Use-after-free is a common issue in faulty kernel code,
and it is difficult to catch because it may not fail under normal circumstances, yet it can lead to kernel panics in
some cases. This makes the option a useful tool for debugging any kernel code. To demonstrate, we simply kmalloc'ed a char array
and then kfree'ed it. On a subsequent access to the same array, the kernel reported SLAB corruption because of this kernel hacking
option. The dmesg shows the following:
Slab corruption (Tainted: G D W OE ): kmalloc-32 start=ffff9de335ecb960, len=32
000: 61 62 63 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b abc.kkkkkkkkkkkk
Prev obj: start=ffff9de335ecb940, len=32
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
Next obj: start=ffff9de335ecb980, len=32
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
Command to Run:
./xtrigger_bug 11
NOTE:
This cannot be used in conjunction with the kmemleak option.
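
The dump above shows the poison pattern that makes the use-after-free visible: freed bytes read 6b, the final byte reads a5, and the first four bytes were overwritten with "abc" plus a terminating zero. A minimal user-space sketch of that idea follows. The byte values are taken from the dump; toy_poison_on_free() and toy_check_poison() are made-up helper names, and the sketch does not claim to reproduce the allocator's actual code.

```c
#include <stddef.h>
#include <string.h>

/* Pattern bytes as they appear in the dump above. */
#define TOY_POISON_FREE 0x6b   /* fills a freed object */
#define TOY_POISON_END  0xa5   /* marks its last byte */

/* Fill a freed object with the poison pattern. */
void toy_poison_on_free(unsigned char *obj, size_t len)
{
    if (len == 0)
        return;
    memset(obj, TOY_POISON_FREE, len - 1);
    obj[len - 1] = TOY_POISON_END;
}

/*
 * Verify the pattern. Returns the offset of the first byte that no
 * longer holds its poison value, or -1 if the object is intact.
 */
long toy_check_poison(const unsigned char *obj, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char expect =
            (i == len - 1) ? TOY_POISON_END : TOY_POISON_FREE;
        if (obj[i] != expect)
            return (long)i;
    }
    return -1;
}
```

Writing "abc" into a poisoned 32-byte buffer and then calling toy_check_poison() reports a mismatch at offset 0, which mirrors the first line of the dump.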

12. BUG_HUNG_TASK
Option Enabled:
Detect Hung Task (CONFIG_DETECT_HUNG_TASK), with the default timeout for hung task detection (in seconds) set to 30 for faster detection
in the demo. The default was previously 120 seconds.
Implementation:
Hung tasks are bugs that leave a task stuck in the uninterruptible "D" state indefinitely. To trigger this, we create two
threads, t1 and t2. Thread t1 acquires lock 1 and thread t2 acquires lock 2. t1 then tries to acquire lock 2 and t2 tries to acquire lock 1.
This is a deadlock, so both threads are stuck and the task hangs. After 30 seconds (configurable) we get the following
in dmesg, along with the call trace:
INFO: task thread1:8549 blocked for more than 30 seconds.
[43106.992007] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Command to run:
./xtrigger_bug 12
NOTE:
Reboot the kernel before running this option.

13. BUG_CRED_MANAGEMENT:
Option Enabled:
Debug credential management (CONFIG_DEBUG_CREDENTIALS)
Implementation:
This debug option helps catch bugs where we have grabbed a reference to a credential structure (i.e. one pertaining to tasks, files, etc.)
and then inadvertently mishandle freeing it. We demonstrate a simple scenario in which we put back the reference to
a cred more times than it was grabbed. The kernel catches this and reports an error. This is an extra feature, in case the professor decides to
count it.
Dmesg shows the following (along with the call trace):
kernel BUG at kernel/cred.c:769!
invalid opcode: 0000 [#1] SMP PTI
CPU: 0 PID: 4950 Comm: xtrigger_bug Tainted: G OE 4.20.6+ #26
RIP: 0010:__invalid_creds+0x47/0x50
Command to run:
./xtrigger_bug 13

References :
[1] https://lkml.org/lkml/2018/3/27/1192
[2] https://www.kernel.org/doc/html/v4.19-rc2/dev-tools/kmemleak.html
[3] https://opensourceforu.com/2009/01/the-crux-of-linux-notifier-chains/
[4] https://www.kernel.org/doc/Documentation/DMA-API.txt
[5] https://cateee.net/lkddb/web-lkddb/