TimeLinux1

Sunday, March 17, 2013

Tainted Kernel - What to do?

Tainted Kernel:

In the Linux world, where open source drivers and modules rule the roost, you may receive a message in your system alerts saying that your kernel is 'Tainted'. Before we go further, it's useful to understand what that means.

Because of the open source model, Linux kernel modules and drivers are designed, developed and maintained by the Linux open source community. Some Linux distributions, such as Trisquel, are strict about Free and Open Source Software principles and therefore do not include any non-free drivers or modules. On the other hand, distributions like Ubuntu and Linux Mint are not so strict. They emphasise functionality and ease of use over the principles of the FSF, and their policy on including drivers and modules is more relaxed. That means these distributions often include proprietary drivers and modules to make something work. This is especially true of graphics, audio and wireless drivers. For instance, the proprietary graphics drivers from AMD (Radeon) could be included in a certain Linux distribution.
Now, whenever proprietary drivers are included, they come only as binary code, with no source code.
This means that if a problem or bug is detected in that (proprietary) software, the Linux community cannot access the source code and therefore cannot fix it. The fix can only come from the vendor who provided the proprietary driver or module. A proprietary driver or module whose source code is not visible in this way is said to have 'Tainted the Linux kernel'.

Great, now we know what it means. How do we confirm it? How do we diagnose and fix it?
Well, it varies, but in almost all cases you will know that your kernel is tainted because you receive a popup, warning or alert from the system. On Fedora, the distribution that I use all the time, the 'Tainted kernel' alert is easily visible in the system alerts in the bottom right corner of the GUI screen. From the command line, it's visible in the output of the kernel message utility dmesg, as follows:


[root@ms-vaio ~]# dmesg | grep -i taint
[83346.433966] Pid: 31775, comm: kworker/u:0 Tainted: G        W    3.7.6-102.fc17.x86_64 #1
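
If you only want the flag letters from a line like the one above, a small filter can pull them out. This is a minimal sketch: `taint_flags_from_log` is a helper name made up for illustration, and it assumes the log line has the same "Tainted: <letters> <kernel version>" layout as my output.

```shell
#!/bin/sh
# taint_flags_from_log (hypothetical helper): read kernel log lines on stdin
# and print the taint letters from every "Tainted:" line, with the padding
# spaces between the letters removed.
taint_flags_from_log() {
    sed -n 's/.*Tainted: \([A-Z ]*[A-Z]\) .*/\1/p' | tr -d ' '
}

# Sample line copied from the dmesg output above.
line='[83346.433966] Pid: 31775, comm: kworker/u:0 Tainted: G        W    3.7.6-102.fc17.x86_64 #1'
printf '%s\n' "$line" | taint_flags_from_log
```

On a live system the same function could be fed from dmesg, for example `dmesg | taint_flags_from_log`.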

This can also be confirmed by looking into the file /proc/sys/kernel/tainted:


[root@ms-vaio ~]# cat /proc/sys/kernel/tainted 
512

Typically (for an un-tainted kernel), this file contains 0. But as you can see, in my case it does not. And in this case the dmesg line points to process id 31775 as the one that tainted the kernel. To dig into what that process is, you can look into the process listing as follows:

[root@ms-vaio ~]# ps -efly | grep 31775
S root      4330  1539  0  80   0   872 27350 pipe_w 17:52 pts/0    00:00:00 grep --color=auto 31775
[root@ms-vaio ~]# 

As you can see, the process with pid=31775 does not exist in my case (the only match is the grep command itself), which means the process was short-lived: it caused a kernel warning and then it was terminated.
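
A slightly cleaner way to ask the same question, without grep matching its own command line, is to test for the PID directly. The sketch below assumes a Linux system with /proc mounted; `pid_exists` is a hypothetical helper name, not a standard command.

```shell
#!/bin/sh
# pid_exists (hypothetical helper): succeed if a process with the given PID
# is currently running. On Linux every live process has a /proc/<pid> directory.
pid_exists() {
    [ -d "/proc/$1" ]
}

# 31775 is the PID reported in the dmesg line earlier in this post.
if pid_exists 31775; then
    # -p selects exactly this PID, so no grep is needed.
    ps -p 31775 -o pid,comm,args
else
    echo "PID 31775 is no longer running"
fi
```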


For a detailed discussion on the meaning of each Taint Flag like 512, G, W etc, go here and search for 'tainted'.
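
The number in /proc/sys/kernel/tainted is a bitmask, and each bit corresponds to one of the letters printed after "Tainted:". As a rough sketch, the function below turns a value such as 512 into its letters. The name `decode_taint` is made up for this example, and the table reflects my reading of the kernel's documented taint flags (bits 0 to 17), so treat it as an assumption to check against the documentation for your kernel version.

```shell
#!/bin/sh
# decode_taint (hypothetical helper): print one line per set bit of a taint
# value, using the letters the kernel prints after "Tainted:".
decode_taint() {
    value=$1
    if [ "$value" -eq 0 ]; then
        echo "not tainted"
        return 0
    fi
    bit=0
    # One table row per bit, starting at bit 0 (assumed from the kernel docs).
    while IFS='|' read -r letter meaning; do
        if [ $(( ($value >> $bit) & 1 )) -eq 1 ]; then
            printf '%s: %s\n' "$letter" "$meaning"
        fi
        bit=$((bit + 1))
    done <<'EOF'
P|proprietary module was loaded
F|module was force loaded
S|system is out of specification (for example unsafe SMP)
R|module was force unloaded
M|processor reported a machine check exception
B|bad page referenced
U|taint requested by a userspace application
D|kernel died recently (oops or BUG)
A|ACPI table overridden by user
W|kernel issued a warning
C|staging driver was loaded
I|workaround for a firmware bug applied
O|out-of-tree module was loaded
E|unsigned module was loaded
L|soft lockup occurred
K|kernel has been live patched
X|auxiliary taint, defined by the distribution
T|kernel built with structure randomization
EOF
    # Newer kernels may define bits beyond this table.
    if [ $(( $value >> $bit )) -ne 0 ]; then
        echo "?: higher bits set that this table does not cover"
    fi
}

# The value from my system: only bit 9 is set.
decode_taint 512
```

For the value 512 this prints the W line only, which matches the W in the dmesg output. To decode the live value, pass the contents of /proc/sys/kernel/tainted as the argument instead of 512.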

Now what to do to fix it?

Typically, there is nothing to fix.

All it means is that your kernel has a device driver that is not open source. It may be a wireless or video driver (both are common), or you may have some slightly unusual hardware that doesn't have a kernel-supported driver.

The only time it causes a problem is if a kernel failure/panic occurs. The kernel developers cannot debug such a failure because they are not able to trace the entire problem. It may have been caused by a bug in the proprietary driver either directly (it happens to be in the traceback) or indirectly by modifying some kernel data that other drivers depended on.

So a reboot usually clears the warning and resets the value in /proc/sys/kernel/tainted to 0.


3 comments:

  1. "All it means is that your kernel has a device driver that is not open source."

    That is incorrect. Kernel taint is not exclusively caused by a non-GPL compatible module. It just happens to be the most common cause.

    You can have taint even when all your drivers are licensed GPL/GPL-compatible. One case: out-of-tree GPL modules will STILL taint the kernel. If you see a kernel oops/panic with a taint flag of "P", that means you have a proprietary module, and chances are no one will do anything about it unless you can reproduce the problem WITHOUT the cause of the taint.

    "G" means all the modules loaded in your kernel are GPL-compatible; an out-of-tree module is flagged separately, with "O".

    Basically, the taint flag is actually very useful not only for determining where a possible problem might be but also maybe WHERE to report a bug. If you can successfully reproduce the problem without your kernel being tainted, then you file a bug on the kernel. If the problem comes up only after a taint, look at the module instead. In-tree modules are "maintained" by the kernel team even if technically it's actually the module team.

    It's all about what the kernel developers can reasonably debug and actually fix. If you have a proprietary driver it could very well be a bug in that driver, which no one in the kernel team can do anything with. This is why it's important to try to reproduce a kernel oops without taint.

    You can still file the bug, but unless you can show the bug is not caused by the taint they'll probably mark it as invalid or wontfix.

    Do some research into what can cause kernel taint. It's usually proprietary/out-of-tree modules, but it can also be caused by force loading/unloading modules (in which case the bug is BKAC, and any problems could be caused by the fact that you forced a module in or out of the kernel), or by user-mode software explicitly tainting the kernel (in which case the bug is either BKAC or the program did something with the hardware directly). There's also taint for previous warnings (if your kernel oopsed earlier, taint will occur, meaning your subsequent oopses could have been a result of the first oops and are not technically bugs themselves), etc.

    In other words, taint flags signify that the kernel itself could have been compromised in some way by some event in your system (Forcing module behavior.) OR is running in a way that can't be supported (Out of tree/binary modules.). Filing bugs on an oops requires you make sure the oops is NOT the result of these compromises.

  2. NOTE: One of your blog ad networks is providing a sketchy "you have (1) prize" waiting ads. That same network (or another one) spawned a new tab with a NSFW ad.
