| .. _fatal: |
| |
| Fatal Errors |
| ############ |
| |
| Software Errors Triggered in Source Code |
| **************************************** |
| |
| Zephyr provides several methods for inducing fatal error conditions through |
| either build-time checks, conditionally compiled assertions, or deliberately |
| invoked panic or oops conditions. |
| |
| Runtime Assertions |
| ================== |
| |
| Zephyr provides some macros to perform runtime assertions which may be |
| conditionally compiled. Their definitions may be found in |
| :zephyr_file:`include/zephyr/sys/__assert.h`. |
| |
| Assertions are enabled by setting the ``__ASSERT_ON`` preprocessor symbol to a |
| non-zero value. There are two ways to do this: |
| |
| - Use the :kconfig:option:`CONFIG_ASSERT` and :kconfig:option:`CONFIG_ASSERT_LEVEL` kconfig |
| options. |
| - Add ``-D__ASSERT_ON=<level>`` to the project's CFLAGS, either on the |
| build command line or in a CMakeLists.txt. |
| |
| The ``__ASSERT_ON`` method takes precedence over the kconfig option if both are |
| used. |
| |
| Specifying an assertion level of 1 causes the compiler to issue warnings that |
| the kernel contains debug-type ``__ASSERT()`` statements; this reminder is |
| issued since assertion code is not normally present in a final product. |
| Specifying assertion level 2 suppresses these warnings. |
| |
| Assertions are enabled by default when running Zephyr test cases, as |
| configured by the :kconfig:option:`CONFIG_TEST` option. |
| |
| The policy for what to do when encountering a failed assertion is controlled |
| by the implementation of :c:func:`assert_post_action`. Zephyr provides |
| a default implementation with weak linkage which invokes a kernel oops if |
| the thread that failed the assertion was running in user mode, and a kernel |
| panic otherwise. |
| |
| __ASSERT() |
| ---------- |
| |
| The ``__ASSERT()`` macro can be used inside kernel and application code to |
| perform optional runtime checks which will induce a fatal error if the |
| check does not pass. The macro takes a string message which will be printed |
| to provide context to the assertion. In addition, the kernel will print |
| a text representation of the expression code that was evaluated, and the |
| file and line number where the assertion can be found. |
| |
| For example: |
| |
| .. code-block:: c |
| |
| __ASSERT(foo == 0xF0CACC1A, "Invalid value of foo, got 0x%x", foo); |
| |
| If at runtime ``foo`` had some unexpected value, the error produced may |
| look like the following: |
| |
| .. code-block:: none |
| |
| ASSERTION FAIL [foo == 0xF0CACC1A] @ ZEPHYR_BASE/tests/kernel/fatal/src/main.c:367 |
| Invalid value of foo, got 0xdeadbeef |
| [00:00:00.000,000] <err> os: r0/a1: 0x00000004 r1/a2: 0x0000016f r2/a3: 0x00000000 |
| [00:00:00.000,000] <err> os: r3/a4: 0x00000000 r12/ip: 0x00000000 r14/lr: 0x00000a6d |
| [00:00:00.000,000] <err> os: xpsr: 0x61000000 |
| [00:00:00.000,000] <err> os: Faulting instruction address (r15/pc): 0x00009fe4 |
| [00:00:00.000,000] <err> os: >>> ZEPHYR FATAL ERROR 4: Kernel panic |
| [00:00:00.000,000] <err> os: Current thread: 0x20000414 (main) |
| [00:00:00.000,000] <err> os: Halting system |
| |
| __ASSERT_EVAL() |
| --------------- |
| |
| The ``__ASSERT_EVAL()`` macro can also be used inside kernel and application |
| code, with special semantics for the evaluation of its arguments. |
| |
| It makes use of the ``__ASSERT()`` macro, but has some extra flexibility. It |
| allows the developer to specify different actions depending whether the |
| ``__ASSERT()`` macro is enabled or not. This can be particularly useful to |
| prevent the compiler from generating comments (errors, warnings or remarks) |
| about variables that are only used with ``__ASSERT()`` being assigned a value, |
| but otherwise unused when the ``__ASSERT()`` macro is disabled. |
| |
| Consider the following example: |
| |
| .. code-block:: c |
| |
| int x; |
| x = foo(); |
| __ASSERT(x != 0, "foo() returned zero!"); |
| |
| If ``__ASSERT()`` is disabled, then 'x' is assigned a value, but never used. |
| This type of situation can be resolved using the __ASSERT_EVAL() macro. |
| |
| .. code-block:: c |
| |
| __ASSERT_EVAL ((void) foo(), |
| int x = foo(), |
| x != 0, |
| "foo() returned zero!"); |
| |
| The first parameter tells ``__ASSERT_EVAL()`` what to do if ``__ASSERT()`` is |
| disabled. The second parameter tells ``__ASSERT_EVAL()`` what to do if |
| ``__ASSERT()`` is enabled. The third and fourth parameters are the parameters |
| it passes to ``__ASSERT()``. |
| |
| __ASSERT_NO_MSG() |
| ----------------- |
| |
| The ``__ASSERT_NO_MSG()`` macro can be used to perform an assertion that |
| reports the failed test and its location, but lacks additional debugging |
| information provided to assist the user in diagnosing the problem; its use is |
| discouraged. |
| |
| Build Assertions |
| ================ |
| |
| Zephyr provides two macros for performing build-time assertion checks. |
| These are evaluated completely at compile-time, and are always checked. |
| |
| BUILD_ASSERT() |
| -------------- |
| |
| This has the same semantics as C's ``_Static_assert`` or C++'s |
| ``static_assert``. If the evaluation fails, a build error will be generated by |
| the compiler. If the compiler supports it, the provided message will be printed |
| to provide further context. |
| |
| Unlike ``__ASSERT()``, the message must be a static string, without |
| :c:func:`printf()`-like format codes or extra arguments. |
| |
| For example, suppose this check fails: |
| |
| .. code-block:: c |
| |
| BUILD_ASSERT(FOO == 2000, "Invalid value of FOO"); |
| |
| With GCC, the output resembles: |
| |
| .. code-block:: none |
| |
| tests/kernel/fatal/src/main.c: In function 'test_main': |
| include/toolchain/gcc.h:28:37: error: static assertion failed: "Invalid value of FOO" |
| #define BUILD_ASSERT(EXPR, MSG) _Static_assert(EXPR, "" MSG) |
| ^~~~~~~~~~~~~~ |
| tests/kernel/fatal/src/main.c:370:2: note: in expansion of macro 'BUILD_ASSERT' |
| BUILD_ASSERT(FOO == 2000, |
| ^~~~~~~~~~~~~~~~ |
| |
| Kernel Oops |
| =========== |
| |
| A kernel oops is a software triggered fatal error invoked by |
| :c:func:`k_oops()`. This should be used to indicate an unrecoverable condition |
| in application logic. |
| |
| The fatal error reason code generated will be ``K_ERR_KERNEL_OOPS``. |
| |
| Kernel Panic |
| ============ |
| |
| A kernel error is a software triggered fatal error invoked by |
| :c:func:`k_panic()`. This should be used to indicate that the Zephyr kernel is |
| in an unrecoverable state. Implementations of |
| :c:func:`k_sys_fatal_error_handler()` should not return if the kernel |
| encounters a panic condition, as the entire system needs to be reset. |
| |
| Threads running in user mode are not permitted to invoke :c:func:`k_panic()`, |
| and doing so will generate a kernel oops instead. Otherwise, the fatal error |
| reason code generated will be ``K_ERR_KERNEL_PANIC``. |
| |
| Exceptions |
| ********** |
| |
| Spurious Interrupts |
| =================== |
| |
| If the CPU receives a hardware interrupt on an interrupt line that has not had |
| a handler installed with ``IRQ_CONNECT()`` or :c:func:`irq_connect_dynamic()`, |
| then the kernel will generate a fatal error with the reason code |
| ``K_ERR_SPURIOUS_IRQ()``. |
| |
| Stack Overflows |
| =============== |
| |
| In the event that a thread pushes more data onto its execution stack than its |
| stack buffer provides, the kernel may be able to detect this situation and |
| generate a fatal error with a reason code of ``K_ERR_STACK_CHK_FAIL``. |
| |
| If a thread is running in user mode, then stack overflows are always caught, |
| as the thread will simply not have permission to write to adjacent memory |
| addresses outside of the stack buffer. Because this is enforced by the |
| memory protection hardware, there is no risk of data corruption to memory |
| that the thread would not otherwise be able to write to. |
| |
| If a thread is running in supervisor mode, or if :kconfig:option:`CONFIG_USERSPACE` is |
| not enabled, depending on configuration stack overflows may or may not be |
| caught. :kconfig:option:`CONFIG_HW_STACK_PROTECTION` is supported on some |
| architectures and will catch stack overflows in supervisor mode, including |
| when handling a system call on behalf of a user thread. Typically this is |
| implemented via dedicated CPU features, or read-only MMU/MPU guard regions |
| placed immediately adjacent to the stack buffer. Stack overflows caught in this |
| way can detect the overflow, but cannot guarantee against data corruption and |
| should be treated as a very serious condition impacting the health of the |
| entire system. |
| |
| If a platform lacks memory management hardware support, |
| :kconfig:option:`CONFIG_STACK_SENTINEL` is a software-only stack overflow detection |
| feature which periodically checks if a sentinel value at the end of the stack |
| buffer has been corrupted. It does not require hardware support, but provides |
| no protection against data corruption. Since the checks are typically done at |
| interrupt exit, the overflow may be detected a nontrivial amount of time after |
| the stack actually overflowed. |
| |
| Finally, Zephyr supports GCC compiler stack canaries via |
| :kconfig:option:`CONFIG_STACK_CANARIES`. If enabled, the compiler will insert a canary |
| value randomly generated at boot into function stack frames, checking that the |
| canary has not been overwritten at function exit. If the check fails, the |
| compiler invokes :c:func:`__stack_chk_fail()`, whose Zephyr implementation |
| invokes a fatal stack overflow error. An error in this case does not indicate |
| that the entire stack buffer has overflowed, but instead that the current |
| function stack frame has been corrupted. See the compiler documentation for |
| more details. |
| |
| Other Exceptions |
| ================ |
| |
| Any other type of unhandled CPU exception will generate an error code of |
| ``K_ERR_CPU_EXCEPTION``. |
| |
| Fatal Error Handling |
| ******************** |
| |
| The policy for what to do when encountering a fatal error is determined by the |
| implementation of the :c:func:`k_sys_fatal_error_handler()` function. This |
| function has a default implementation with weak linkage that calls |
| ``LOG_PANIC()`` to dump all pending logging messages and then unconditionally |
| halts the system with :c:func:`k_fatal_halt()`. |
| |
| Applications are free to implement their own error handling policy by |
| overriding the implementation of :c:func:`k_sys_fatal_error_handler()`. |
| If the implementation returns, the faulting thread will be aborted and |
| the system will otherwise continue to function. See the documentation for |
| this function for additional details and constraints. |
| |
| API Reference |
| ************* |
| |
| .. doxygengroup:: fatal_apis |