
Debugging Linux kernel

From Wikibooks, open books for an open world


Performance


There are many factors that can affect the performance of the Linux kernel, including hardware configurations, software configurations, and workload characteristics.

Optimizing kernel performance means identifying and removing bottlenecks in the system. This can involve tuning kernel parameters, adjusting how system resources are allocated, and fixing bugs that degrade performance.

Because the kernel is complex and many factors interact, performance optimization is hard to do by guesswork. Profiling and tracing tools make it possible to measure where time and resources are spent and so to improve the performance and reliability of Linux-based systems.

Perf_events


Perf_events, short for performance events, is the kernel's performance monitoring interface. It is reached through the perf_event_open system call and the perf command-line tool, and it instruments CPU performance counters, tracepoints, kprobes, and uprobes. By analyzing the collected data, developers can identify bottlenecks and reduce resource utilization. Perf_events is designed to add little overhead to the system being measured.


🔧 TODO


⚲ Interfaces

man 1 perf – performance analysis tools
Basic commands:
man 1 perf-help – display help information about perf
man 1 perf-top – System profiling tool.
man 1 perf-record – Run a command and record its profile into perf.data
man 1 perf-report – Read perf.data (created by perf record) and display the profile
Other commands:
man 1 perf-annotate – Read perf.data (created by perf record) and display annotated code
man 1 perf-archive – Create archive with object files with build-ids found ...
man 1 perf-arm-spe – Support for Arm Statistical Profiling Extension within...
man 1 perf-bench – General framework for benchmark suites
man 1 perf-buildid-cache – Manage build-id cache.
man 1 perf-buildid-list – List the buildids in a perf.data file
man 1 perf-c2c – Shared Data C2C/HITM Analyzer.
man 1 perf-config – Get and set variables in a configuration file.
man 1 perf-daemon – Run record sessions on background
man 1 perf-data – Data file related processing
man 1 perf-diff – Read perf.data files and display the differential profile
man 1 perf-dlfilter – Filter sample events using a dynamically loaded shared...
man 1 perf-evlist – List the event names in a perf.data file
man 1 perf-ftrace – simple wrapper for kernel's ftrace functionality
man 1 perf-inject – Filter to augment the events stream with additional in...
man 1 perf-intel-pt – Support for Intel Processor Trace within perf tools
man 1 perf-iostat – Show I/O performance metrics
man 1 perf-kallsyms – Searches running kernel for symbols
man 1 perf-kmem – Tool to trace/measure kernel memory properties
man 1 perf-kvm – Tool to trace/measure kvm guest os
man 1 perf-kwork – Tool to trace/measure kernel work properties (latencies)
man 1 perf-list – List all symbolic event types
man 1 perf-lock – Analyze lock events
man 1 perf-mem – Profile memory accesses
man 1 perf-probe – Define new dynamic tracepoints
man 1 perf-sched – Tool to trace/measure scheduler properties (latencies)
man 1 perf-script – Read perf.data (created by perf record) and display tr...
man 1 perf-script-perl – Process trace data with a Perl script
man 1 perf-script-python – Process trace data with a Python script
man 1 perf-stat – Run a command and gather performance counter statistics
man 1 perf-test – Runs sanity tests.
man 1 perf-timechart – Tool to visualize total system behavior during a workload
man 1 perf-trace – strace inspired tool
man 1 perf-version – display the version of perf binary
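As a rough sketch of how the basic commands above fit together, the shell functions below wrap perf stat, perf record and perf report. The function names and the file path in the usage note are illustrative assumptions, not part of perf. Running perf typically needs root or a permissive /proc/sys/kernel/perf_event_paranoid setting.

```shell
#!/bin/sh
# Illustrative helpers; the function names are hypothetical, not part of perf.

# Count events while a command runs and print a summary at exit.
perf_count() {
    perf stat -- "$@"
}

# Sample a command with call graphs and store the profile in a file.
# $1 = output file, remaining arguments = command to profile.
perf_profile() {
    out=$1
    shift
    perf record -g -o "$out" -- "$@"
}

# Print a recorded profile as plain text (defaults to ./perf.data).
perf_show() {
    perf report -i "${1:-perf.data}" --stdio
}
```

A possible session is `perf_count sleep 1`, then `perf_profile /tmp/perf.data sleep 1`, then `perf_show /tmp/perf.data`.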


⚙️ Internals

man 2 perf_event_open – sets up performance monitoring
uapi/linux/perf_event.h inc
tools/perf src
linux/perf_event.h inc
kernel/events/core.c src
kernel/profile.c src – simple profiling


📖 References

perf – instruments CPU performance counters, tracepoints, kprobes, and uprobes
https://perf.wiki.kernel.org/


📚 Further reading

perf Examples
The Unofficial Linux Perf Events Web-Page


🛠️ Utilities

Performance Co-Pilot, https://pcp.io/
Prometheus, https://prometheus.io/
https://github.com/redhat-nfvpe/container-perf-tools
https://github.com/brendangregg/perf-tools – performance analysis tools based on Linux perf_events (aka perf) and ftrace
readprofile – a tool to read kernel profiling information


📚 Further reading

stress-ng – exercises various kernel interfaces
http://trac.gateworks.com/wiki/linux/profiling
Analyzing application performance in RHEL 9
Monitoring and managing system status and performance in RHEL 9
Real-time Linux

User space debug interfaces


⚲ Interfaces

man 1 dmesg – print or control the kernel ring buffer
man 2 syslog – system call used to read and control the kernel printk() buffer
man 1 strace – system calls and signals tracing tool
man 2 ptrace – process trace system call
man 3 klogctl
man 5 core
/sys/kernel/debug/ – debugfs
dmesg --console-level <level>
gdb /usr/src/linux/vmlinux /proc/kcore
/proc/self/stack
dynamic debug doc
⌨️ hands-on:
echo "module atkbd +pfl" | sudo tee /sys/kernel/debug/dynamic_debug/control
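The hands-on command above enables the pr_debug() call sites of the atkbd module (p) and prefixes each message with the function name (f) and line number (l). A minimal sketch of the same idea as reusable shell functions follows. The function names and the DYNDBG_CONTROL variable are assumptions made for illustration. Writing to the control file needs root and a kernel built with CONFIG_DYNAMIC_DEBUG.

```shell
#!/bin/sh
# Path to the dynamic debug control file; overridable for experiments.
DYNDBG_CONTROL=${DYNDBG_CONTROL:-/sys/kernel/debug/dynamic_debug/control}

# Build a query that matches all pr_debug() call sites of one module.
# $1 = module name, $2 = flag change such as +pfl or -pfl.
dyndbg_query() {
    printf 'module %s %s\n' "$1" "$2"
}

# Write the query to the control file.
dyndbg_apply() {
    dyndbg_query "$1" "$2" > "$DYNDBG_CONTROL"
}

# List the call sites of a module together with their current flags.
dyndbg_list() {
    grep -F "[$1]" "$DYNDBG_CONTROL"
}
```

For example, `dyndbg_apply atkbd +pfl` turns the messages on, `dyndbg_list atkbd` shows the affected call sites, and `dyndbg_apply atkbd -pfl` turns them off again.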


⚙️ Internals

handle_sysrq id


📚 References

Development tools for the kernel doc
DebugFS doc, samples/qmi/qmi_sample_client.c src
Kprobe-based Event Tracing doc
Dynamic debug doc
Linux Magic System Request Key Hacks doc
Magic SysRq key

Tracing and logging


⚲ API:

User-space interface:

man 1 dmesg – print or control the kernel ring buffer
man 2 syslog – system call used to read and control the kernel printk() buffer
/proc/kmsg
man 1 trace-cmd – interacts with Ftrace, the Linux kernel internal tracer, which is controlled through /sys/kernel/debug/tracing/
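Ftrace can also be driven without trace-cmd by writing to the files in its control directory. The sketch below assumes the standard current_tracer, tracing_on and trace files. The function names and the TRACING_DIR variable are illustrative assumptions, and the real directory is only writable by root.

```shell
#!/bin/sh
# Ftrace control directory; overridable, for example to /sys/kernel/tracing.
TRACING_DIR=${TRACING_DIR:-/sys/kernel/debug/tracing}

# Select a tracer listed in available_tracers, such as function or nop.
ftrace_set_tracer() {
    echo "$1" > "$TRACING_DIR/current_tracer"
}

# Switch recording into the ring buffer on (1) or off (0).
ftrace_switch() {
    echo "$1" > "$TRACING_DIR/tracing_on"
}

# Trace with tracer $1 for $2 seconds, print the first lines of the
# buffer, then fall back to the nop tracer.
ftrace_sample() {
    ftrace_set_tracer "$1"
    ftrace_switch 1
    sleep "$2"
    ftrace_switch 0
    head -n 20 "$TRACING_DIR/trace"
    ftrace_set_tracer nop
}
```

A call such as `ftrace_sample function 1` records one second of function calls.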

Most common functions

linux/printk.h inc
pr_devel id – conditional debug-level message
pr_debug id – conditional debug-level or dynamic debug doc message
⌨️ hands-on:
echo "module atkbd +pfl" | sudo tee /sys/kernel/debug/dynamic_debug/control
Log messages with other levels:
pr_info id, pr_notice id, pr_warn id, pr_err id, pr_crit id, pr_alert id, pr_emerg id
asm-generic/bug.h inc
WARN_ON id
WARN id
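Each pr_<level>() function above logs at one of the eight kernel log levels, numbered from 0 (emerg) to 7 (debug). As a small sketch, the functions below map a level name to its number and ask dmesg for messages of selected levels. The function names are assumptions for illustration, and the --level option is the one provided by the util-linux version of dmesg.

```shell
#!/bin/sh
# Map a pr_<level>() suffix to the numeric kernel log level (0 = most severe).
loglevel_number() {
    case $1 in
        emerg)  echo 0 ;;
        alert)  echo 1 ;;
        crit)   echo 2 ;;
        err)    echo 3 ;;
        warn)   echo 4 ;;
        notice) echo 5 ;;
        info)   echo 6 ;;
        debug)  echo 7 ;;
        *)      return 1 ;;
    esac
}

# Show only ring buffer messages of the given levels, for example "err,warn".
kmsg_by_level() {
    dmesg --level="$1"
}
```

For example, `kmsg_by_level err,warn` lists what pr_err and pr_warn have logged.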


⚙️ Internals

printk id
kernel/printk/printk.c src
arch/x86/kernel/traps.c src
lib/dump_stack.c src
kernel/trace src
scripts/tracing/draw_functrace.py src
logging ltp, tracing ltp
samples/ftrace src
samples/trace_events src
samples/trace_printk src
linux/instrumentation.h inc


📚 References:

Debugging by printing
Message logging with printk doc
SystemTap
man 1 stap – systemtap script translator/driver
strace
man 1 strace – trace system calls and signals
LTTng
ftrace
Linux Tracing Technologies doc
Tracepoint Analysis doc
Function Tracer doc – function, latency and event tracing
Event Tracing doc
Using ftrace to hook to functions doc
Fprobe - Function entry/exit probe doc
Kprobes doc
Kprobe-based Event Tracing doc
Uprobe-tracer: Uprobe-based Event Tracing doc
Using the Linux Kernel Tracepoints doc
Subsystem Trace Points: kmem doc
Subsystem Trace Points: power doc
NMI Trace Events doc
In-kernel memory-mapped I/O tracing doc
Event Histograms doc
Histogram Design Notes doc
Boot-time tracing doc
Hardware Latency Detector doc
Intel(R) Trace Hub (TH) doc
Lockless Ring Buffer Design doc
System Trace Module doc
CoreSight - ARM Hardware Trace doc

🔧 TODO. 🚀 advanced features

linux/kmemleak.h inc – memory leak detector
pr_cont id – continues a previous log message in the same line
print_hex_dump_bytes id
print_hex_dump_debug id
dump_stack id
CONFIG_PRINTK_CALLER id
CONFIG_DEBUG_KERNEL id
CONFIG_DEBUG_INFO id
https://git.kernel.org/pub/scm/libs/libtrace/

kgdb and kdb


⚲ Interfaces

linux/kgdb.h inc
linux/kdb.h inc


⚙️ Internals

kernel/debug src


📚 References

Using kgdb, kdb and the kernel debugger internals doc
kdump
kdump doc
man 8 crash – Analyze Linux crash dump data or a live system


⚲ API:

man 2 bpf
kernel/bpf/syscall.c src


📖 References

eBPF and BPF doc


📚 Further reading

man 7 bpf-helpers
Linux Extended BPF (eBPF) Tracing Tools
bpftrace – High-level tracing language for Linux eBPF
BCC – Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Example of trace.py
man 8 stapbpf
eBPF Programming for Linux Kernel Tracing
lockdep - Runtime locking correctness validator doc


Watchdogs


The Linux Kernel/Softdog Driver

dev_watchdog id – network device watchdog

The NMI watchdog lockup detectors:

⚲ API

/proc/sys/kernel/nmi_watchdog
/proc/sys/kernel/soft_watchdog
/proc/sys/kernel/watchdog
/proc/sys/kernel/watchdog_cpumask
/proc/sys/kernel/watchdog_thresh
/proc/sys/kernel/hardlockup_all_cpu_backtrace
/proc/sys/kernel/hardlockup_panic
/proc/sys/kernel/softlockup_all_cpu_backtrace
/proc/sys/kernel/softlockup_panic
linux/nmi.h inc


👁️ Example

lib/test_lockup.c src – test module to generate lockups

Provoke NMI watchdog without panic:

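# Run as root. Report a detected hard lockup without panicking:
echo 0 > /proc/sys/kernel/hardlockup_panic
# Stall for 13 seconds with interrupts disabled, longer than the default watchdog_thresh of 10 seconds:
insmod test_lockup.ko disable_irq=1 time_secs=13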
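Before provoking a lockup it can help to check the current detector settings. The sketch below reads some of the /proc/sys/kernel files listed above and compares a planned stall time with watchdog_thresh. The function names and the SYSCTL_KERNEL_DIR variable are assumptions for illustration. Per the kernel documentation, the hard lockup threshold is watchdog_thresh seconds and the soft lockup threshold is twice that.

```shell
#!/bin/sh
# Directory holding the kernel sysctl files; overridable for experiments.
SYSCTL_KERNEL_DIR=${SYSCTL_KERNEL_DIR:-/proc/sys/kernel}

# Print "name = value" for each lockup detector setting that is readable.
watchdog_settings() {
    for name in watchdog nmi_watchdog soft_watchdog watchdog_thresh \
                watchdog_cpumask hardlockup_panic softlockup_panic; do
        file=$SYSCTL_KERNEL_DIR/$name
        if [ -r "$file" ]; then
            printf '%s = %s\n' "$name" "$(cat "$file")"
        fi
    done
}

# Succeed if a stall of $1 seconds is longer than watchdog_thresh.
exceeds_watchdog_thresh() {
    [ "$1" -gt "$(cat "$SYSCTL_KERNEL_DIR/watchdog_thresh")" ]
}
```

For example, `exceeds_watchdog_thresh 13 && echo "hard lockup expected"` checks the time_secs value used above against the running system.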

⚙️ Internals

kernel/watchdog.c src – detects hard and soft lockups on a system
kernel/watchdog_perf.c src – detects hard lockups on a system using perf
kernel/watchdog_buddy.c src

📚 References

Documentation for /proc/sys/kernel/ doc
Softlockup detector and hardlockup detector (aka nmi_watchdog) doc
kernel parameters:
nmi_watchdog param
nowatchdog param
nosoftlockup param
softlockup_panic param

⚙️ Internals

arch/x86/kernel/traps.c src


📖 References for debugging

Ramoops oops/panic logger doc
pstore block oops/panic logger doc
Fault injection doc
Bisecting a bug doc
Development tools for the kernel doc
linux/tracepoint.h inc


📚 Further reading

https://drgn.readthedocs.io/ – programmable debugger
https://crash-utility.github.io/
https://wiki.ubuntu.com/Kernel/Debugging
Linux Applications Debugging Techniques