Writing Kernel Modules
2024-01-26
Kernel modification can seem like a herculean task to undertake, despite the bounty of resources available to the aspiring kernel developer. Working on kernel modules is perhaps one of the simplest ways to start working on the kernel. The kernel's modular nature also makes this quite practical, as many important kernel functions are simply kernel modules. Sadly, this task is still quite daunting for the uninitiated. Let me be thy Virgil, as we descend to the brim of the depths that is kernel modding.
You need a kernel source tree of the same version as your kernel.
Preferably one that has already been compiled.
Depending on your distro, you may have one already.
For example, on Gentoo, your kernel sources should be located at /usr/src/linux/.
See this article if you need more help in getting a kernel source tree.
Graciously, the benevolent designers of the kbuild system thought it wise to allow module developers to build modules separate from the kernel source tree itself. This means we can work in whichever directory we see fit.
Step one is to create a new folder for your module, and to populate it with 3 things:
A Makefile
A .c file for your module; in this tutorial we will call it lugmod.c
A .h file for userspace; we will call this lugmod.h
It may seem odd that you need a kernel source tree despite the fact you are only working on a module. However, doing so allows us to take advantage of the kernel's robust kbuild system. We will allow kbuild to do all the heavy lifting, whilst we ourselves only need to write a bare-bones makefile for our own module. It's an "it just works" sort of thing.
Here is what our Makefile will look like:
# This tells kbuild which .c's to compile for the module
obj-m := lugmod.o
It's alright to be amazed at its simplicity.
That is all that is needed for kbuild to take your source and turn it into a .ko kernel module.
This is a perfectly good Makefile for a simple kernel module, but they can get more complex.
More information on writing makefiles for external modules can be found here: https://www.kernel.org/doc/html/latest/kbuild/modules.html.
Now that our working environment is set up, we can create a basic kernel module.
There are really only 3 things necessary to get a module to compile and run: an init function, an exit function, and a specification of the module's license.
The init and exit functions of the module are executed when the module is loaded or unloaded, respectively.
Thus, these functions are where you will do the bulk of your setup (on load) and your cleaning up (on unload).
#include <linux/init.h>
#include <linux/module.h>
static int lugmod_init(void) {
    /* the trailing newline terminates the log record */
    printk("hewwo ^w^\n");
    return 0;
}
static void lugmod_exit(void) {
    printk("bye bye ;w;\n");
}
module_init(lugmod_init);
module_exit(lugmod_exit);
The module_init and module_exit macros take, as you may expect, your initialization and exit functions.
They simply inform the Linux kernel which function to execute when this module is loaded or unloaded.
If you attempt to compile this right away, you will get an error along the lines of:
ERROR: modpost: missing MODULE_LICENSE() in /home/ron/workspace/lugmod/lugmod.o
make[2]: *** [scripts/Makefile.modpost:145: /home/ron/workspace/lugmod/Module.symvers] Error 1
make[1]: *** [/usr/src/linux-6.6.13-gentoo/Makefile:1865: modpost] Error 2
make: *** [Makefile:234: __sub-make] Error 2
This is kbuild yelling at you for not including a license definition in your module.
To fix this, include the MODULE_LICENSE macro somewhere in your source code.
It accepts a small set of strings as valid input; the exact licenses supported can be found here: https://www.kernel.org/doc/html/latest/process/license-rules.html.
Thus, our basic kernel module, when all is said and done, will look like so:
#include <linux/init.h>
#include <linux/module.h>
MODULE_LICENSE("GPL");
static int lugmod_init(void) {
    /* the trailing newline terminates the log record */
    printk("hewwo ^w^\n");
    return 0;
}
static void lugmod_exit(void) {
    printk("bye bye ;w;\n");
}
module_init(lugmod_init);
module_exit(lugmod_exit);
Now that we have a basic kernel module and a Makefile, we can compile it using kbuild. To do this, run the following command:
make -C [PATH TO YOUR LINUX SOURCE TREE GOES HERE] M=$PWD
So, in my case, I will execute the command:
make -C /usr/src/linux M=$PWD
The -C flag tells make to change to the directory given as its argument before doing anything else.
Setting the M variable informs kbuild that you want to build an external module at the specified path (in this case, our working directory).
After executing this command, a .ko object with the same name as your .c file should have been built.
Congratulations, that's your very own kernel module.
You can load it using the following command:
sudo insmod ./lugmod.ko
At this point, if you check your kernel messages (via the dmesg command), you should see something like:
[19748.455458] lugmod: loading out-of-tree module taints kernel.
[19748.455719] hewwo ^w^
Beautiful. :3
Further, you can check that your module is running via the lsmod command, which lists all your currently loaded modules.
Module Size Used by
lugmod 12288 0
Since we've seen all there is that is exciting to see, we can go ahead and unload our module like so:
sudo rmmod lugmod
We should see the corresponding exit message in dmesg:
[19766.549173] bye bye ;w;
Poor guy.
Our module is a little boring, so let's try to spice it up a little. We will create a character device driver with a simple system call. You might've heard the oft-repeated mantra "Everything in Unix is a file!". Character devices (char devs) are perhaps the epitome of this. These are system functions you interact with just as you would with any normal file. You use all the normal file system calls on them. A prime example of this in the kernel is the KVM module, which is implemented as a char device. This makes these drivers quite simple to understand.
To create a character device we need only do a few things:
Create a dev_t for our device, and allocate a char device region for our device
Create a device class
Create the device itself
The first thing we need to do is create a dev_t for our device.
Think of this as a "region" where our actual devices will be held.
static dev_t lugmod_dev_t;
Every device in Linux has two numbers associated with it (both packed into a single dev_t value), a major and a minor number.
We can see all of the current device numbers being used via ls /dev/char.
10:126 10:239 116:6 13:75 180:97 202:13 203:13 244:1 254:0 4:21 4:36 4:50
...
Alternatively, we can view a list of the major device numbers being used by reading the /proc/devices file.
Character devices:
1 mem
4 /dev/vc/0
4 tty
4 ttyS
5 /dev/tty
5 /dev/console
...
Think of the major device number as an identifier for the device, with each type of device having its own unique ID. The minor device number represents an instance of that device -- that is what you're actually interacting with when you make a system call.
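To make the major/minor split a bit more concrete, here is a minimal userspace sketch that asks the filesystem for the device number behind a device file. The function name char_dev_numbers is made up for this illustration; stat, S_ISCHR, major, and minor are the usual glibc interfaces (the last two live in <sys/sysmacros.h>).

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>  /* major(), minor() on glibc */
#include <sys/types.h>

/*
 * Look up the device number behind a device file.
 * Returns 0 and fills *maj and *min on success, or -1 if the path
 * cannot be stat'ed or is not a character device.
 * (char_dev_numbers is a hypothetical helper, not a system API.)
 */
int char_dev_numbers(const char *path, unsigned int *maj, unsigned int *min)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    if (!S_ISCHR(st.st_mode))
        return -1;

    /* For device special files, st_rdev holds the device number. */
    *maj = major(st.st_rdev);
    *min = minor(st.st_rdev);
    return 0;
}
```

On a typical Linux system, calling this on /dev/null should report major 1 and minor 3, which lines up with the "1 mem" entry in /proc/devices.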
Back in þe olden days kernel developers had to choose these device numbers manually.
This, of course, resulted in a lot of overlap between devices using the same device number (not good!).
Luckily, we have moved on from those dark times, and the kernel provides us with a handy function that both allocates us a device number and reserves a character device region for that number.
This is the alloc_chrdev_region function.
We call it in our init function.
int ret;
ret = alloc_chrdev_region(&lugmod_dev_t, 0, LUGMOD_MINOR_COUNT, "lugmod_char");
if (ret != 0) {
    printk(KERN_ALERT "oopsie ;owo\n");
    return ret;
}
The first argument is a pointer to the aforedeclared dev_t that will store the device number the kernel allocates us.
The second argument is the first minor device number we wish to use (typically this is 0).
The third argument is the number of minor device numbers we wish to allocate.
I have this value stored as an enum, LUGMOD_MINOR_COUNT = 1, since we will only be worrying about 1 device in this blog post.
Lastly, the fourth argument is the name of the device, very self-explanatory.
We also do some error checking -- if this function returns non-zero, then it failed, and thus we also fail the init function itself, preventing the module from being loaded.
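If you are curious which major number the kernel handed out, the name passed to alloc_chrdev_region should appear in /proc/devices once the module is loaded. Below is a rough userspace sketch that scans text in that format for a name. The helper find_char_major is hypothetical, and the parsing assumes the "Character devices:" / "Block devices:" layout shown earlier.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan text in the format of /proc/devices for a character device name.
 * Returns its major number, or -1 if the name is not listed in the
 * character device section. (Hypothetical helper for illustration.)
 */
int find_char_major(FILE *fp, const char *name)
{
    char line[256];
    int in_char_section = 0;

    while (fgets(line, sizeof(line), fp)) {
        int major;
        char dev_name[128];

        /* Track which section of the file we are in. */
        if (strncmp(line, "Character devices:", 18) == 0) {
            in_char_section = 1;
            continue;
        }
        if (strncmp(line, "Block devices:", 14) == 0) {
            in_char_section = 0;
            continue;
        }

        /* Entries look like "<major> <name>"; blank lines fail to parse. */
        if (in_char_section &&
            sscanf(line, "%d %127s", &major, dev_name) == 2 &&
            strcmp(dev_name, name) == 0)
            return major;
    }
    return -1;
}
```

To use it against the live system, open /proc/devices with fopen and pass "lugmod_char" as the name. The number you get back will vary from machine to machine, since it is allocated dynamically.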
A device class groups together all of the devices we will create.
Most importantly, creating a device class for our module means that when we create a device, a file will be created in the /dev/ directory for us to interact with.
When we create our class, we should, in addition, be able to see all of our devices in the /sys/class/lugmod_char/ directory.
We first define a class in our module.
static struct class *lugmod_class = NULL;
Then, in our init function, we actually initialize it.
lugmod_class = class_create("lugmod_char");
Now that we have everything set up to allocate a device number, char dev region, and class, we can actually create our device. First, we will define a struct to hold the data our device may need to have access to whenever it is interacted with.
struct lugmod_dev_data {
    struct cdev cdev;
};
struct lugmod_dev_data dev;
The cdev member is the struct that actually holds all of the character device functionality.
Next, we need to define the operations that we may perform on the device.
To do this we create an instance of the file_operations struct, with pointers to implementations of operations we wish to perform.
const struct file_operations lugmod_fops = {
    .owner = THIS_MODULE,
    .open = lugmod_open,
    .release = lugmod_release
};
In this case, we will have the driver do something when the device file is opened or closed. We'll simply have it printk something.
static int lugmod_open(struct inode *inode, struct file *file) {
    printk("lugmod opened!\n");
    return 0;
}
static int lugmod_release(struct inode *inode, struct file *file) {
    printk("lugmod released!\n");
    return 0;
}
Now, in our init function, we can put this all together.
First, we initialize the character device, and store it to the cdev member of our device structure.
We also set the owner of the device to be our module.
cdev_init(&dev.cdev, &lugmod_fops);
dev.cdev.owner = THIS_MODULE;
Then, we create a device at minor device number zero -- this is what we will actually be working with.
The MKDEV macro lets us specify a device number manually, but we use the major number of our dynamically allocated device region (via the MAJOR macro).
This device will take up the zeroth slot of our region.
The third argument to cdev_add specifies a count for the number of devices we are adding, which in our case is only one.
cdev_add(&dev.cdev, MKDEV(MAJOR(lugmod_dev_t), 0), 1);
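For the curious, the MKDEV, MAJOR, and MINOR macros are just bit twiddling. The sketch below mirrors them in userspace, assuming the layout in the kernel's include/linux/kdev_t.h (20 bits of minor number, with the major number in the bits above). The LUG_ prefixed names and the nth_device helper are my own, chosen so they cannot clash with any system header.

```c
#include <stdint.h>

/* Userspace copies of the kernel's device-number macros (assumed layout:
 * the low 20 bits hold the minor number, the remaining bits the major). */
#define LUG_MINORBITS 20
#define LUG_MINORMASK ((1U << LUG_MINORBITS) - 1)
#define LUG_MAJOR(dev) ((unsigned int)((dev) >> LUG_MINORBITS))
#define LUG_MINOR(dev) ((unsigned int)((dev) & LUG_MINORMASK))
#define LUG_MKDEV(ma, mi) (((uint32_t)(ma) << LUG_MINORBITS) | (mi))

/*
 * Device number of the n-th slot in a region whose first device number
 * is `first`. (Hypothetical helper for illustration.)
 */
uint32_t nth_device(uint32_t first, unsigned int n)
{
    return LUG_MKDEV(LUG_MAJOR(first), LUG_MINOR(first) + n);
}
```

With this layout, slot zero of a region is simply the number the region starts at, which is why MKDEV(MAJOR(lugmod_dev_t), 0) and lugmod_dev_t name the same device when the first minor requested was 0.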
Finally, we create the device.
The first argument is our class.
The second is the parent device (we can leave this null for now).
The third is the same device number as before.
The fourth is driver data made available to the device's callbacks (we don't need to worry about this).
And the last is the device identifier (what it will be called in the /dev/ directory).
device_create(lugmod_class, NULL, MKDEV(MAJOR(lugmod_dev_t), 0), NULL, "lug");
There is one last thing we need to do before we can compile.
Clean up our mess.
It is very important that we release everything we set up when the module is unloaded.
As such, add the following to your exit function:
device_destroy(lugmod_class, MKDEV(MAJOR(lugmod_dev_t), 0));
cdev_del(&dev.cdev);
unregister_chrdev_region(lugmod_dev_t, LUGMOD_MINOR_COUNT);
/* class_destroy() unregisters the class itself, so a separate class_unregister() call is not needed */
class_destroy(lugmod_class);
This snippet destroys our device, deletes our cdev, releases our chardev region, and destroys our device class.
Now, we may finally recompile. To test your module, load it again and execute:
sudo cat /dev/lug
This command will open and close your file, so you should see your two printks show up in dmesg.
[29289.988805] lugmod opened!
[29289.988867] lugmod released!
Hurray.
Let's finish up our tour by doing something a little more interesting.
We will add an ioctl that reads data from userspace, does something, and sends data back to userspace.
An ioctl is a pretty common way of implementing system calls -- see the aforementioned KVM module.
It basically takes a file, a command, and parameters, and performs some action on that file.
First, we will define a few things in our header file we created earlier.
struct lug_add {
    int a;
    int b;
    int r;
};
#define LUG_ADD _IOWR('k', 1, struct lug_add)
We first define the lug_add structure.
This structure will be shared between the kernel and user space, for transmitting data between these realms.
Next, we define the ioctl command.
This warrants some explanation.
The _IOWR macro means we're defining this as an ioctl that both reads and writes to userspace memory.
We'll skip over the first argument for the moment.
The second argument is the command's id.
If we had another ioctl to add to this driver, we would make this 2, and so on and so forth.
Lastly, the third argument is the type of the parameter that will be passed to the ioctl.
The first argument of this macro is the magic ioctl number.
This should be coordinated so as not to overlap with any other device drivers, but in practice this is an unrealistic expectation.
I use 'k' as the number, as that seems to be a common magic number to use.
The magic number, command number, parameter size, and I/O direction are all used to generate the final ioctl number, which is a 32-bit integer.
Problems could arise if two active drivers share the same ioctl numbers.
See this article for more info on picking ioctl numbers properly: https://www.kernel.org/doc/html/latest/userspace-api/ioctl/ioctl-number.html.
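To see how those pieces end up packed into one integer, here is a small userspace sketch that takes an ioctl number apart again using the _IOC_DIR, _IOC_TYPE, _IOC_NR, and _IOC_SIZE macros from <linux/ioctl.h>. The ioctl_fields struct and decode_ioctl function are names I made up for this example; the lug_add struct and LUG_ADD are repeated from the header so the snippet stands on its own.

```c
#include <linux/ioctl.h>  /* _IOWR and the _IOC_* decoding macros */

struct lug_add {
    int a;
    int b;
    int r;
};

#define LUG_ADD _IOWR('k', 1, struct lug_add)

/* The fields packed into an ioctl number (hypothetical helper type). */
struct ioctl_fields {
    unsigned int dir;   /* _IOC_READ and/or _IOC_WRITE */
    unsigned int type;  /* the magic number */
    unsigned int nr;    /* the command id */
    unsigned int size;  /* size in bytes of the parameter type */
};

/* Split an ioctl number back into the fields it was built from. */
struct ioctl_fields decode_ioctl(unsigned int cmd)
{
    struct ioctl_fields f = {
        .dir  = _IOC_DIR(cmd),
        .type = _IOC_TYPE(cmd),
        .nr   = _IOC_NR(cmd),
        .size = _IOC_SIZE(cmd),
    };
    return f;
}
```

Decoding LUG_ADD this way should give back both direction bits, the character 'k', the id 1, and sizeof(struct lug_add). Changing any one of them produces a different ioctl number.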
Now, we can add behavior to our ioctl in the driver. First, add an ioctl handler to the file operations structure.
const struct file_operations lugmod_fops = {
    .owner = THIS_MODULE,
    .open = lugmod_open,
    .release = lugmod_release,
    .unlocked_ioctl = lugmod_ioctl
};
Now, we will write the handler function.
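/* copy_from_user() and copy_to_user() are declared in <linux/uaccess.h> */
static long lugmod_ioctl(struct file *file, unsigned int cmd, unsigned long arg) {
    struct lug_add add;
    switch (cmd) {
    case LUG_ADD:
        /* both copy helpers return the number of bytes they could not copy */
        if (copy_from_user(&add, (struct lug_add __user *)arg, sizeof(add)))
            return -EFAULT;
        add.r = add.a + add.b;
        if (copy_to_user((struct lug_add __user *)arg, &add, sizeof(add)))
            return -EFAULT;
        return 0;
    default:
        /* not one of our commands */
        return -ENOTTY;
    }
}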
This function, like the file operations defined above, takes as input a pointer to the file being worked on; this will be our character device.
The other parameters are simple.
arg is what was passed from userspace; this is usually cast to some type of pointer.
cmd is the ioctl number that was called.
The switch statement is a little degenerate here, given we only have one case.
However, it would be very useful if our driver handled multiple different types of ioctl syscalls.
In the case the ioctl passed is not our ioctl, we return a non-zero value (an error).
Otherwise, we do the action we want to happen when our ioctl is called.
In the case of LUG_ADD, we want it to add together the a and b members of the lug_add struct, and save the result to the r member.
copy_from_user copies data from userspace into the specified pointer; we use the arg parameter as our userspace pointer, and the local lug_add struct as our destination.
copy_to_user does the reverse, writing our result back through the same userspace pointer.
With this knowledge the rest should be run-of-the-mill C code.
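To show why the switch earns its keep once a driver grows, here is a userspace model of the handler with a second command added. LUG_SUB is purely hypothetical (lugmod does not define it), and lug_dispatch is a stand-in I made up: it takes the struct directly, where the real handler would first copy it in from userspace. Returning -ENOTTY for unknown commands is the usual kernel convention, though other choices exist.

```c
#include <errno.h>
#include <linux/ioctl.h>

struct lug_add {
    int a;
    int b;
    int r;
};

#define LUG_ADD _IOWR('k', 1, struct lug_add)
/* Hypothetical second command, not part of lugmod: same magic, next id. */
#define LUG_SUB _IOWR('k', 2, struct lug_add)

/*
 * Userspace model of the handler's switch. `p` stands in for the struct
 * the kernel would have filled in with copy_from_user().
 */
long lug_dispatch(unsigned int cmd, struct lug_add *p)
{
    switch (cmd) {
    case LUG_ADD:
        p->r = p->a + p->b;
        return 0;
    case LUG_SUB:
        p->r = p->a - p->b;
        return 0;
    default:
        /* Conventional "no such ioctl for this device" error. */
        return -ENOTTY;
    }
}
```

Because the command id is baked into the ioctl number, the two case labels are distinct integers, and each new command only needs its own case.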
Now we can recompile and test our changes! To test we can write a simple test program.
#include "lugmod/lugmod.h"
#include <linux/ioctl.h>
#include <sys/ioctl.h> /* declares ioctl() */
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
int main(void) {
    int ret;
    int lug;
    lug = open("/dev/lug", O_RDONLY);
    if (lug < 0) {
        printf(":(\n");
        return 1; /* no point calling ioctl on a bad descriptor */
    }
    struct lug_add a = {
        .a = 400,
        .b = 20,
    };
    ret = ioctl(lug, LUG_ADD, &a);
    if (ret != 0)
        printf("ioctl failed.\n");
    else
        printf("%d + %d = %d\n", a.a, a.b, a.r);
    close(lug);
    return 0;
}
Compile it and run it as root (your device file is by default owned by root, and cannot be interacted with by regular users).
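If you end up calling the ioctl from more than one place, it may be nicer to hide the struct shuffling behind a small wrapper. The following is only a sketch: lug_add_ints is a name I invented, and it repeats the lug_add struct and LUG_ADD definition from the header so that it is self-contained.

```c
#include <errno.h>
#include <linux/ioctl.h>
#include <sys/ioctl.h>

struct lug_add {
    int a;
    int b;
    int r;
};

#define LUG_ADD _IOWR('k', 1, struct lug_add)

/*
 * Ask the driver behind `fd` to add a and b.
 * Returns 0 and stores the result in *sum on success; returns -1 and
 * leaves *sum untouched on failure, with errno set by ioctl().
 * (Hypothetical wrapper for illustration.)
 */
int lug_add_ints(int fd, int a, int b, int *sum)
{
    struct lug_add req = { .a = a, .b = b };

    if (ioctl(fd, LUG_ADD, &req) != 0)
        return -1;

    *sum = req.r;
    return 0;
}
```

Since the wrapper leaves errno alone, a caller can use perror or strerror to report why a call failed, for example a bad file descriptor when /dev/lug could not be opened.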
File lugmod.c:
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kdev_t.h>
#include <linux/cdev.h>
#include <linux/device.h>
#include <linux/fs.h>
#include <linux/uaccess.h> /* copy_from_user(), copy_to_user() */
#include <asm/ioctl.h>
#include <linux/kernel.h>
#include "lugmod.h"
MODULE_LICENSE("GPL");
enum {
    LUGMOD_MINOR_COUNT = 1,
};
struct lugmod_dev_data {
    struct cdev cdev;
};
struct lugmod_dev_data dev;
static struct class *lugmod_class = NULL;
static int lugmod_open(struct inode *inode, struct file *file) {
    printk("lugmod opened!\n");
    return 0;
}
static int lugmod_release(struct inode *inode, struct file *file) {
    printk("lugmod released!\n");
    return 0;
}
static long lugmod_ioctl(struct file *file, unsigned int cmd, unsigned long arg) {
    struct lug_add add;
    switch (cmd) {
    case LUG_ADD:
        /* both copy helpers return the number of bytes they could not copy */
        if (copy_from_user(&add, (struct lug_add __user *)arg, sizeof(add)))
            return -EFAULT;
        add.r = add.a + add.b;
        if (copy_to_user((struct lug_add __user *)arg, &add, sizeof(add)))
            return -EFAULT;
        return 0;
    default:
        /* not one of our commands */
        return -ENOTTY;
    }
}
const struct file_operations lugmod_fops = {
    .owner = THIS_MODULE,
    .open = lugmod_open,
    .release = lugmod_release,
    .unlocked_ioctl = lugmod_ioctl
};
static dev_t lugmod_dev_t;
static int lugmod_init(void) {
    int ret;
    ret = alloc_chrdev_region(&lugmod_dev_t, 0, LUGMOD_MINOR_COUNT, "lugmod_char");
    if (ret != 0) {
        printk(KERN_ALERT "oopsie ;owo\n");
        return ret;
    }
    /* the single-argument class_create() needs kernel 6.4 or newer */
    lugmod_class = class_create("lugmod_char");
    cdev_init(&dev.cdev, &lugmod_fops);
    dev.cdev.owner = THIS_MODULE;
    cdev_add(&dev.cdev, MKDEV(MAJOR(lugmod_dev_t), 0), 1);
    device_create(lugmod_class, NULL, MKDEV(MAJOR(lugmod_dev_t), 0), NULL, "lug");
    printk("hewwo ^w^\n");
    return 0;
}
static void lugmod_exit(void) {
    device_destroy(lugmod_class, MKDEV(MAJOR(lugmod_dev_t), 0));
    cdev_del(&dev.cdev);
    unregister_chrdev_region(lugmod_dev_t, LUGMOD_MINOR_COUNT);
    /* class_destroy() unregisters the class itself */
    class_destroy(lugmod_class);
    printk("bye bye ;w;\n");
}
module_init(lugmod_init);
module_exit(lugmod_exit);
File lugmod.h:
struct lug_add {
    int a;
    int b;
    int r;
};
#define LUG_ADD _IOWR('k', 1, struct lug_add)
File Makefile:
obj-m := lugmod.o
Linux Device Drivers by Corbet, Rubini, and Kroah-Hartman is an indispensable resource in learning how to implement Linux device drivers. Due to its age, it relies upon kernel version 2.6, which is quite outdated at the time of writing. As such, this article by Linux Kernel Labs was also relied upon.
This webpage is a transcription of a talk I gave to the MTU Linux User Group in January 2024.