| ZEPHYR TASK PROFILER |
| #################### |
| |
| This project permits to enable profiling in zephyr application |
| Profiling trace can be retrived over JTAG or UART |
| - JTAG: slow but no impact on CPU execution |
| - UART: much faster but impacting CPU |
| |
| On QEMU, profiler trace can be retrieved using gdb (following JTAG guidelines) |
| or on virtual serial port (when enabling UART flush) |
| |
| Profiler consists in: |
| 1) Enabling KERNEL_EVENT_LOGGER in zephyr |
| - Various config switches allows selecting events to capture |
| 2) Applying a set of patches on top of zephyr kernel (provided in this package) |
| 3) Extracting profiler events via JTAG (requires debugger) or UART (requires |
| profiler background task) |
| - for UART: profiler background task must be added to the application |
| 4) Post-processing tools for getting profiler binary data and get ASCII/graphic |
| output |
| |
| --- |
| NOTE: KERNEL_TASK_MONITOR logs have been merged in KERNEL_EVENT_LOGGER ring |
| buffer since Zephyr 1.1 |
| If using Zephyr version inferior to 1.1, monitor buffer can be dumped using |
| JTAG and post-processed using profile_monitor.py |
| The command to dump the buffer is: |
| (gdb) dump binary value <monitor_file> k_monitor_buff |
| Then post-processing: |
| ./profile_monitor.py (--qemu) <monitor_file> <zephyr_map_file> |
| |
| Project organisation |
| ==================== |
| |
| This project includes: |
| |
| |_ profiler: profiler code (kernel event logger flush) and post-processing |
| | | scripts |
| | |_ src: profiler code to integrate into zephyr application to enable |
| | | flushing of kernel events |
| | |_ scripts: post-processing scripts |
| | | |_ term: terminal to use to get zephyr console logs and retrieve profiler |
| | | data into a separate file for post-processing (linux only) |
| |_ sample_microkernel_app: sample code to exercise profiler in microkernel |
| | mode |
| |_ sample_nanokernel_app: sample code to exercise profiler in nanokernel mode |
| |
| Following guide explains how to enable profiling in your application and |
| how to use profiling data |
| |
| Requirements |
| ============ |
| |
| To be installed on host PC: |
| - python and matplotlib |
| |
| For JTAG only |
| - gdb |
| - For galileo |
| - JTAG dongle (e.g. OLIMEX ARM-USB-OCD-H) |
| - openocd |
| |
| How to enable profiler in your application |
| ========================================== |
| |
| Conventions: |
| $ZEPHYR_BASE = Zephyr base folder |
| $APP_BASE = Application base folder |
| $PROFILER_BASE = Profiler base folder i.e. $ZEPHYR_BASE/samples/task_profiler/profiler |
| prof.log = profiler output file generated from Zephyr target (binary) |
| |
| 1) Enable KERNEL_EVENT_LOGGER |
| ----------------------------- |
| |
| 1.1) Kernel event logger configuration |
| -------------------------------------- |
| |
| Edit project configuration file (e.g. $APP_BASE/prj.conf) and add the following |
| flags: |
| |
| <-- snippet |
| CONFIG_RING_BUFFER=y |
| CONFIG_KERNEL_EVENT_LOGGER=y |
| CONFIG_NANO_TIMEOUTS=y |
| CONFIG_KERNEL_EVENT_LOGGER_BUFFER_SIZE=<BUFFER_SIZE> |
| --> |
| |
| <BUFFER_SIZE> may depend on use case. Note that |
| - on JTAG, buffer is no more populated when full |
| - on UART, buffer is flushed at UART speed every 5s (can be modified in |
| src/profiler.c) |
| |
| Events have to be enabled on top of previous flags: |
| - Task/fiber switches |
| CONFIG_KERNEL_EVENT_LOGGER_CONTEXT_SWITCH=y |
| - Interrupts logging |
| CONFIG_KERNEL_EVENT_LOGGER_INTERRUPT=y |
| - Sleep logging |
| CONFIG_KERNEL_EVENT_LOGGER_SLEEP=y |
| - Task monitor (microkernel only) |
| CONFIG_TASK_MONITOR=y |
| CONFIG_TASK_MONITOR_MASK=<MASK> |
| |
| <MASK> bits are defined in kernel/microkernel/include/micro_private.h |
| For information: |
| - Task swap event: 1 |
| - Task state bit change: 2 |
| - Kserver commands: 4 |
| - Kevents: 8 |
| If wanting to understand microkernel object usage and task bit handling, |
| recommended value is 6 |
| Otherwise, task monitor can be disabled (to decrease logger ring buffer usage) |
| |
| 1.2) Kernel event logger timestamp |
| ---------------------------------- |
| |
| By default, the kernel event logger is using the system timer for timestamping |
| events. As system timer may not work properly depending on platform, this timer |
| can be set at runtime using following flag: |
| |
| $APP_BASE/prj.conf: |
| CONFIG_KERNEL_EVENT_LOGGER_CUSTOM_TIMESTAMP=y |
| |
| In that case, the profiler will set the RTC as timestamp so RTC must be enabled |
| $APP_BASE/prj.conf: |
| CONFIG_RTC=y |
| |
| In case the RTC cannot be used, profiler can support counters as well by |
| building application with the following way: |
| |
| $APP_BASE/prj.conf: |
| CONFIG_COUNTER=y |
| |
| Build command: |
| make BOARD=... PROFILER_USE_COUNTER=y |
| |
| 2) Enable UART flush [OPTIONAL] |
| ------------------------------- |
| |
| 2.1) Enable profiler UART flush function |
| ---------------------------------------- |
| |
| 2.1.1) MICROKERNEL mode |
| ----------------------- |
| |
| Add to $APP_BASE/prj.mdef |
| |
| % TASK NAME PRIO ENTRY STACK GROUPS |
| % ================================== |
| <--snippet |
| TASK PROF 10 prof 512 [EXE] |
| --> |
| |
| Note: |
| Task priority must be the lowest (depending on application code and zephyr |
| lowest supported priority). Anyway, an application never letting lower priority |
| task execute (so never going to background task) may avoid the profiler task |
| to execute. In this case, NANOKERNEL below method can be used (calling |
| prof_flush in the main loop, where processing will be the least impacted) |
| |
| 2.1.2) NANOKERNEL mode |
| ---------------------- |
| |
| prof_flush function must be called from main task before going in idle mode |
| |
| <--snippet |
| #include "profiler.h" |
| |
| ... |
| |
| void main(void) |
| { |
| ... |
| while(1) { |
| ... |
| prof_flush(); |
| // Going to idle e.g. calling nano_cpu_idle() |
| --> |
| |
| Note that profiler src folder must be added to the Makefile: |
| $APP_BASE/src/Makefile: |
| |
| <--snippet |
| ccflags-y += -I${ZEPHYR_BASE}/samples/task_profiler/profiler/src |
| --> |
| |
| 2.2) Add path to $PROFILER_BASE folder in application Makefile |
| -------------------------------------------------------------- |
| |
| For example in $APP_BASE/src/Makefile: |
| |
| <--snippet |
| obj-y += $PROFILER_BASE/ |
| --> |
| |
| Note: |
| - the final "/" after $PROFILER_BASE is important |
| - the path must be relative to application src folder |
| |
| 2.3) Increase console UART baud rate (depending on your board) |
| ------------------------------------ |
| |
| On Quark X1000, maximum standard baud rate is 921600 |
| This is done by adding following line in $APP_BASE/prj.conf: |
| |
| <--snippet |
| CONFIG_UART_NS16550_PORT_1_BAUD_RATE=921600 |
| --> |
| |
| Profiler has been tested at 2 Mbauds on Quark SE. |
| |
| 2.4) (Galileo only) disable UART 0 |
| ---------------------------------- |
| |
| Due to an issue with Quark X1000 UART_0 IRQ mapping, UART0 must be disabled in |
| $APP_BASE/prj.conf: |
| |
| <--snippet |
| CONFIG_UART_NS16550_PORT_0=n |
| --> |
| |
| 3) Enable profiler console for dynamic control |
| ---------------------------------------------- |
| |
| For having the capability to dynamically enable/disable/configure the profiler |
| at runtime, shell command is supported. |
| |
| For that purpose, following flags must be enabled in project configuration file |
| (e.g. $APP_BASE/prj.conf): |
| |
| <-- snippet |
| CONFIG_KERNEL_EVENT_LOGGER_DYNAMIC=y |
| CONFIG_CONSOLE_HANDLER=y |
| CONFIG_CONSOLE_HANDLER_SHELL=y |
| CONFIG_MINIMAL_LIBC_EXTENDED=y |
| --> |
| |
| CONFIG_MINIMAL_LIBC_EXTENDED is required for atoi() function used in profiler shell |
| command implementation. |
| |
| If done, the profiler will automatically enable a shell over UART allowing to |
| use commands to start/stop/configure the profiler: |
| |
| - Start profiling: |
| prof start |
| - Stop profiling: |
| prof stop |
| - Configure profiler: |
| prof cfg <cfg1> <cfg2> |
| This sets the kernel event logger configuration (cfg1) and kernel monitor |
| configuration (cfg2) |
| By default cfg1 is set to log all events enabled at build time and cfg2 is set |
| to CONFIG_TASK_MONITOR_MASK |
| The new configuration is applied for the following start of the profiler. |
| Any on-going profiling session isn't altered |
| |
| - Read profiler configuration |
| prof cfg |
| Get return like X(Y) A(B) |
| X being currently used kernel event logger configuration / Y being the next |
| value (Set by cfg command). |
| A being the currently used kernel monitor configuration / B being the next |
| value. |
| |
| WARNING: if the application has already defined its shell, then the profiler |
| command can be added to the list of application commands this way (apps_cmd |
| being the array of shell commands defined by the app) |
| |
| <-- snippet |
| #include "profiler.h" |
| |
| const struct shell_cmd apps_cmd[] = { |
| ... |
| <apps commands defined here> |
| ... |
| PROF_CMD, |
| { NULL, NULL} |
| } |
| --> |
| |
| Note that profiler src folder must be added to the Makefile: |
| $APP_BASE/src/Makefile: |
| |
| <--snippet |
| ccflags-y += -I${ZEPHYR_BASE}/samples/task_profiler/profiler/src |
| --> |
| |
| Additionally, the profiler must not register its console so |
| PROFILER_NO_SHELL_REGISTER must be set to 1 while building: |
| export PROFILER_NO_SHELL_REGISTER=1 |
| or |
| make BOARD=... PROFILER_NO_SHELL_REGISTER=1 |
| |
| 4) Build and flash project |
| -------------------------- |
| |
| Flags that can be used by the profiler |
| PROFILER_NO_SHELL_REGISTER: when profiler dynamic control is set, the profiler |
| won't set its own console handler. In that case the |
| profiler commands must be added to the application |
| console handler |
| |
| PROFILER_USECOUNTER: in case custom timestamp is set for the kernel event |
| logger, the profiler will use RTC. This flag allows to |
| use a counter instead |
| |
| 5) Get prepared for collecting profiler data |
| -------------------------------------------- |
| |
| 5.1) for using JTAG on galileo with OpenOCD |
| ------------------------------------------- |
| |
| Connect OpenOCD JTAG to PC |
| |
| <--snippet |
| cd $OPENOCD_FOLDER |
| openocd -f interface/ftdi/olimex-arm-usb-ocd-h.cfg -f board/quark_x10xx_board.cfg |
| cd $APP_BASE |
| gdb outdir/zephyr.elf |
| (gdb) target remote localhost:3333 |
| (gdb) c |
| --> |
| |
| 5.2) for using UART on any board |
| -------------------------------- |
| |
| - connect console UART to host PC |
| - start "profterm" |
| |
| <--snippet |
| cd $PROFILER_BASE/scripts/term/ |
| make |
| cd $PROFILER_BASE/scripts |
| ./profterm /dev/<TTY_USB> prof.log console.log |
| --> |
| |
| <TTY_USB> being the board UART console device |
| |
| Note: profterm UART rate can be set using -s option. Otherwise default speed is |
| 921600. |
| |
| 5.3) for using with QEMU (UART flushing mode) |
| --------------------------------------------- |
| |
| Two terminals are required |
| - one for running QEMU, basically redirecting serial to a virtual device |
| - one for running profterm (as it would be done with a board) |
| |
| <--snippet |
| cd $PROFILER_BASE/scripts/term/ |
| make |
| --> |
| |
| Term1: QEMU must be launched the following way |
| |
| <--snippet |
| cd $APP_BASE |
| make BOARD=... QEMU_PTY=1 qemu |
| --> |
| |
| QEMU will display virtually created TTY |
| "char device redicected to /dev/pts/XY" |
| |
| Term2: |
| <--snippet |
| ./profterm /dev/pts/XY prof.log console.log |
| --> |
| |
| 6) Run program |
| -------------- |
| |
| 7) Get log (JTAG only) |
| ---------------------- |
| |
| Break execution and dump buffer |
| <--snippet |
| (gdb) ** break <CTRL-C> ** |
| (gdb) dump binary value prof.log *(int[<BUFFER_SIZE>] *)sys_k_event_logger.ring_buf.buf |
| --> |
| |
| <BUFFER_SIZE> being the buffer size defined in $APP_BASE/prj.conf file |
| |
| 8) Post-processing |
| ------------------ |
| |
| WARNING: zephyr ELF file is required for post-processing. This file must be |
| kept together with dump files in order to be able to run the post-processing. |
| |
| profile script under $PROFILER_BASE/scripts must be called in its folder for the |
| moment |
| Run "./profile.sh -h" for usage |
| |
| Note that if using JTAG, HW cycles per sec is not part of the dump and so HW |
| cycles per sec must be provided to "profile" script using -c option (see |
| profile.sh help) |
| |
| Notes: |
| - python is required for the post processing. matplotlib shall be installed for |
| graphical output |
| |