Linux kernel oops

From HandWiki
Revision as of 04:35, 27 June 2023 by NBrushPhys (talk | contribs) (add)
Short description: Serious, non-fatal error in the Linux kernel
Linux kernel oops on SPARC
Linux kernel oops on PA-RISC with a dead ASCII cow

In computing, an oops is a serious but non-fatal error in the Linux kernel. An oops may precede a kernel panic, but the system may also continue operating with compromised reliability. The term is not an acronym; it simply alludes to the exclamation made after a mistake.

Functioning

When the kernel detects a problem, it kills any offending processes and prints an oops message, which Linux kernel engineers can use to debug the condition that caused the oops and fix the underlying programming error. After an oops, some internal resources may no longer be operational, so even if the system appears to work correctly, killing the active task may have had undesirable side effects. A kernel oops often leads to a kernel panic when the system later attempts to use resources that have been lost. Some kernels are configured to panic once many oopses (10,000 by default) have occurred.[1][2] The limit exists because an attacker could otherwise trigger an oops repeatedly, leaking a resource each time, until an integer eventually overflows and allows further exploitation.[3][4]
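The panic behaviour described above is controlled by kernel settings that can be inspected from user space. The following is a minimal Python sketch, assuming a kernel that exposes the panic_on_oops and oops_limit sysctls under /proc/sys/kernel and an oops counter at /sys/kernel/oops_count; older kernels lack the last two files, so missing entries are reported as None. The helper names read_int and oops_settings are illustrative and not part of any kernel tooling.

```python
from pathlib import Path


def read_int(path):
    """Return the integer stored in a one-line file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def oops_settings(proc_root="/proc", sys_root="/sys"):
    """Collect oops-related kernel settings (hypothetical helper).

    Assumed locations: panic_on_oops and oops_limit are sysctls under
    <proc_root>/sys/kernel; oops_count lives under <sys_root>/kernel.
    Any file that is absent or unreadable yields None.
    """
    kernel = Path(proc_root) / "sys" / "kernel"
    return {
        "panic_on_oops": read_int(kernel / "panic_on_oops"),
        "oops_limit": read_int(kernel / "oops_limit"),
        "oops_count": read_int(Path(sys_root) / "kernel" / "oops_count"),
    }
```

Calling print(oops_settings()) on a running system shows the current values. The root directories are parameters only so that the function can be exercised against a scratch directory instead of the live /proc and /sys trees.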

The official Linux kernel documentation regarding oops messages resides in the file Documentation/admin-guide/bug-hunting.rst[5] of the kernel sources. Some logger configurations may affect the ability to collect oops messages.[6] The kerneloops software can collect and submit kernel oopses to a repository such as the www.kerneloops.org website,[7] which provides statistics and public access to reported oopses.
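Collecting oops messages usually begins with finding them in the kernel log. The sketch below is one rough way to do that in Python, under the assumption that the header line of an oops report contains the word "Oops" followed by a bracketed occurrence counter such as "[#1]". The exact wording differs between architectures and kernel versions, so the pattern is a heuristic, and find_oopses is a hypothetical helper rather than part of kerneloops or any other tool.

```python
import re

# Assumption: an oops header contains "Oops" and, later on the same line,
# an occurrence counter of the form "[#N]".
OOPS_MARKER = re.compile(r"\bOops\b.*?\[#(\d+)\]")


def find_oopses(log_lines):
    """Yield (line_number, counter, text) for each line resembling an oops header."""
    for number, line in enumerate(log_lines, start=1):
        match = OOPS_MARKER.search(line)
        if match:
            yield number, int(match.group(1)), line.rstrip("\n")
```

The function accepts any iterable of text lines, so it can be fed an open log file or the captured output of a log reader. Where the kernel log is stored depends on the logger configuration.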

To someone unfamiliar with the technical details of computers and operating systems, an oops message may look confusing. Unlike other operating systems such as Windows or macOS, Linux presents details explaining the kernel crash rather than a simplified, user-friendly message such as the BSoD on Windows. A simplified crash screen has been proposed several times, but none is currently in development.[8]

See also

  • kdump (Linux) – Linux kernel's crash dump mechanism, which internally uses kexec
  • System.map – contains mappings between symbol names and their addresses in memory, used to interpret oopses
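To illustrate how System.map helps interpret an oops, the following Python sketch resolves a raw address to the nearest preceding symbol and an offset. It assumes the usual three-column layout of the file (hexadecimal address, symbol type, symbol name); parse_system_map and resolve are hypothetical helpers written for this example.

```python
import bisect


def parse_system_map(lines):
    """Parse 'address type name' lines into (address, name) pairs sorted by address."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        try:
            address = int(parts[0], 16)
        except ValueError:
            continue  # first column is not a hexadecimal address
        symbols.append((address, parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return (name, offset) for the closest symbol at or below address, else None."""
    addresses = [base for base, _ in symbols]  # rebuilt per call; fine for a sketch
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None
    base, name = symbols[index]
    return name, address - base
```

On kernels that randomize their load address, the addresses in an oops may be shifted relative to those in System.map, so this lookup is meaningful only when the two agree.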

References

Further reading

  • Linux Device Drivers, 3rd edition, Chapter 4.
  • John Bradford (2003-03-08). "Re: what's an OOPS". LKML (Mailing list). Archived from the original on 2007-03-10. Retrieved 2006-05-22.
  • Szakacsits Szabolcs (2003-03-08). "Re: what's an OOPS". LKML (Mailing list). Archived from the original on 2007-03-13. Retrieved 2006-05-22.
  • Al Viro (2008-01-14). "OOPS report analysis". LKML (Mailing list). Archived from the original on 2008-04-21. Retrieved 2008-01-14.
  • Kernel Oops Howto (the madwifi project): useful information on configuration files and tools to help display oops messages, along with many other links.