Re: What to do in response to a kernel warning


Lukas Bulwahn
 

On Tue, Nov 23, 2021 at 10:30 PM Shuah Khan <skhan@...> wrote:

> On 11/23/21 12:47 PM, Paul Sherwood wrote:
>> On 2021-11-23 18:41, elana.copperman@... wrote:
>>> Warnings MUST be managed. No argument about that.
>>> Panic is NOT feasible for every warning. That is the point.
>>
>> Sorry if I'm missing it, but I think maybe we're agreeing?
>>
>> Folks taking responsibility for safety could/should
>>
>> - review all warnings in the code, and take steps to manage them in their use case
>
> If anybody is curious about the scope of this work:
>
> git grep WARN_ON shows about 18486 usages (excluding the definition itself). This
> isn't surprising, as WARN_ON is used as a debug mechanism by some drivers. In
> general, WARN_ON use is discouraged and it is supposed to be used only in cases
> where there is no other choice but to panic. However, in reality there are
> several usages that are just for debug.
And a big thanks to Elana for starting this; there is no need to look at
all 18486 usages at first.

It would be a good start to just look at the 1614 WARN_ON* usages in the
kernel/ directory.
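
To see how the usages are distributed before picking a directory to start
with, something like the following could help. It is only a rough sketch, not
the command that produced the numbers above: the function name
count_warn_on_lines and the choice to scan only .c and .h files are my
assumptions, and the counts will differ with the kernel version and with how
strictly the pattern is written (the word boundary here skips wrapper macros
whose names merely end in WARN_ON).

```python
import re
from collections import Counter
from pathlib import Path

# WARN_ON and its variants (WARN_ON_ONCE, ...). The leading \b means that
# wrapper macros with a prefix before WARN_ON are deliberately not counted.
WARN_RE = re.compile(r"\bWARN_ON\w*")


def count_warn_on_lines(root, suffixes=(".c", ".h")):
    """Count source lines mentioning WARN_ON*, grouped by top-level directory.

    Hypothetical helper: it mirrors "git grep WARN_ON" only loosely, since it
    walks the working tree instead of asking git for the tracked files.
    """
    root = Path(root)
    counts = Counter()
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        rel = path.relative_to(root)
        # Files directly under root are grouped under ".".
        top = rel.parts[0] if len(rel.parts) > 1 else "."
        try:
            text = path.read_text(errors="replace")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        hits = sum(1 for line in text.splitlines() if WARN_RE.search(line))
        if hits:
            counts[top] += hits
    return counts
```

Calling count_warn_on_lines("/path/to/linux").most_common() then lists the
top-level directories ordered by number of matching lines, and the entry for
"kernel" is the one to compare against the figure quoted above.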

Please either eliminate them in the code by handling the condition in place
in that context, or describe a reliable reaction that can be executed in
user space and that brings the kernel back into a proper operating mode
despite the state it is in after the warning.
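
For the second option, the detection side in user space could look roughly
like the sketch below. It assumes the warning header has the form
"WARNING: CPU: <n> PID: <n> at <file>:<line> <caller>", which is what I would
expect to see in the kernel log, but please check it against your kernel
version. The names KernelWarning, parse_warning and dispatch are made up for
illustration, and so is keying the reactions by file and line, which will
break whenever the source lines move. The sketch only recognizes a warning
and looks up a reaction; whether that reaction really brings the kernel back
into a proper operating mode is the part that needs the actual analysis.

```python
import re
from typing import Callable, Dict, Iterable, List, NamedTuple, Optional, Tuple


class KernelWarning(NamedTuple):
    cpu: int
    pid: int
    file: str
    line: int
    caller: str


# Assumed header format of a kernel warning in the log; a timestamp prefix
# such as "[  123.456789] " in front of it is tolerated by using search().
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) "
    r"at (?P<file>\S+):(?P<line>\d+) (?P<caller>\S+)"
)

Reaction = Callable[[KernelWarning], None]


def parse_warning(log_line: str) -> Optional[KernelWarning]:
    """Return the parsed warning header, or None for any other log line."""
    m = WARNING_RE.search(log_line)
    if m is None:
        return None
    return KernelWarning(
        int(m["cpu"]), int(m["pid"]), m["file"], int(m["line"]), m["caller"]
    )


def dispatch(
    log_lines: Iterable[str],
    reactions: Dict[Tuple[str, int], Reaction],
    default: Reaction,
) -> List[KernelWarning]:
    """Run the reaction registered for each warning's (file, line).

    Warnings without a registered reaction go to `default`, so that no
    warning is silently ignored.
    """
    seen = []
    for log_line in log_lines:
        warning = parse_warning(log_line)
        if warning is None:
            continue
        reactions.get((warning.file, warning.line), default)(warning)
        seen.append(warning)
    return seen
```

The default reaction matters most here: any warning that nobody has analyzed
yet ends up there, which keeps the list of open items visible.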

Also, https://syzkaller.appspot.com/upstream will point you to
warnings that we encountered during our kernel testing campaigns.

Good luck.

Lukas

> thanks,
> -- Shuah
