Latency Measurements
####################

This benchmark measures the average latency of selected kernel capabilities,
including the following (a sketch of the measurement approach appears after
the list):

* Context switch time between preemptive threads using k_yield
* Context switch time between cooperative threads using k_yield
* Time to switch from ISR back to interrupted thread
* Time from ISR to executing a different thread (rescheduled)
* Time to signal a semaphore then test that semaphore
* Time to signal a semaphore then test that semaphore with a context switch
* Time to lock a mutex then unlock that mutex
* Time it takes to create a new thread (without starting it)
* Time it takes to start a newly created thread
* Time it takes to suspend a thread
* Time it takes to resume a suspended thread
* Time it takes to abort a thread
* Time it takes to add data to a FIFO/LIFO
* Time it takes to retrieve data from a FIFO/LIFO
* Time it takes to wait on a FIFO/LIFO (and context switch)
* Time it takes to wake and switch to a thread waiting on a FIFO/LIFO
* Time it takes to send and receive events
* Time it takes to wait for events (and context switch)
* Time it takes to wake and switch to a thread waiting for events
* Time it takes to push and pop to/from a k_stack
* Average time to allocate memory from the heap then free that memory
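
Each of these measurements follows the same basic pattern: read a cycle
counter, perform the operation (often many times), read the counter again, and
report the average in both cycles and nanoseconds. The following is a minimal
sketch of that idea using the generic kernel cycle counter; it is illustrative
only and is not taken from the benchmark sources::

    /* Minimal measurement sketch (illustrative only): time k_sem_give()
     * with no waiters and report the average cost per call.
     */
    #include <zephyr/kernel.h>
    #include <zephyr/sys/printk.h>

    #define ITERATIONS 1000

    K_SEM_DEFINE(bench_sem, 0, ITERATIONS);

    void measure_sem_give(void)
    {
        uint32_t start = k_cycle_get_32();

        for (int i = 0; i < ITERATIONS; i++) {
            k_sem_give(&bench_sem);              /* operation under test */
        }

        uint32_t end = k_cycle_get_32();
        uint32_t avg_cycles = (end - start) / ITERATIONS;

        printk("semaphore.give.immediate: %u cycles, %llu ns\n",
               avg_cycles, k_cyc_to_ns_floor64(avg_cycles));
    }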

When userspace is enabled, this benchmark will, where possible, also test the
above capabilities using various configurations involving user threads (see
the sketch after this list):

* Kernel thread to kernel thread
* Kernel thread to user thread
* User thread to kernel thread
* User thread to user thread
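
The user-thread cases require a thread created with the K_USER option and
explicitly granted access to the kernel objects it uses. The following is a
rough sketch of that setup; the names are invented for this example, and
CONFIG_USERSPACE=y (for example via prj.userspace.conf) is assumed::

    /* Sketch of setting up a user-mode partner thread (illustrative only;
     * these names are not taken from the benchmark sources).
     */
    #include <zephyr/kernel.h>

    #define USER_STACK_SIZE 1024

    K_THREAD_STACK_DEFINE(user_stack, USER_STACK_SIZE);
    static struct k_thread user_thread;
    K_SEM_DEFINE(sync_sem, 0, 1);

    static void user_entry(void *p1, void *p2, void *p3)
    {
        /* Runs in user mode: this call goes through the system call path. */
        k_sem_take(&sync_sem, K_FOREVER);
    }

    void start_user_partner(void)
    {
        k_tid_t tid = k_thread_create(&user_thread, user_stack,
                                      K_THREAD_STACK_SIZEOF(user_stack),
                                      user_entry, NULL, NULL, NULL,
                                      K_PRIO_PREEMPT(5), K_USER, K_FOREVER);

        /* Grant access to the semaphore before the user thread starts. */
        k_object_access_grant(&sync_sem, tid);
        k_thread_start(tid);
    }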

The default configuration builds only for the kernel. However, additional
configurations can be enabled via the use of EXTRA_CONF_FILE.

For example, the following will build this project with userspace support::

    EXTRA_CONF_FILE="prj.userspace.conf" west build -p -b <board> <path to project>

The following table summarizes the purposes of the different extra
configuration files that are available to be used with this benchmark.
A tester may mix and match them, allowing different scenarios to be easily
compared against the default. Benchmark output can be saved and subsequently
exported to third-party tools to compare and chart performance differences
both between configurations and across Zephyr versions.

+---------------------+------------------------------------+
| prj.canaries.conf   | Enable stack canaries              |
+---------------------+------------------------------------+
| prj.objcore.conf    | Enable object cores and statistics |
+---------------------+------------------------------------+
| prj.userspace.conf  | Enable userspace support           |
+---------------------+------------------------------------+

Sample output of the benchmark using the defaults::

    thread.yield.preemptive.ctx.k_to_k - Context switch via k_yield : 315 cycles , 2625 ns :
    thread.yield.cooperative.ctx.k_to_k - Context switch via k_yield : 315 cycles , 2625 ns :
    isr.resume.interrupted.thread.kernel - Return from ISR to interrupted thread : 289 cycles , 2416 ns :
    isr.resume.different.thread.kernel - Return from ISR to another thread : 374 cycles , 3124 ns :
    thread.create.kernel.from.kernel - Create thread : 382 cycles , 3191 ns :
    thread.start.kernel.from.kernel - Start thread : 394 cycles , 3291 ns :
    thread.suspend.kernel.from.kernel - Suspend thread : 289 cycles , 2416 ns :
    thread.resume.kernel.from.kernel - Resume thread : 339 cycles , 2833 ns :
    thread.abort.kernel.from.kernel - Abort thread : 339 cycles , 2833 ns :
    fifo.put.immediate.kernel - Add data to FIFO (no ctx switch) : 214 cycles , 1791 ns :
    fifo.get.immediate.kernel - Get data from FIFO (no ctx switch) : 134 cycles , 1124 ns :
    fifo.put.alloc.immediate.kernel - Allocate to add data to FIFO (no ctx switch) : 834 cycles , 6950 ns :
    fifo.get.free.immediate.kernel - Free when getting data from FIFO (no ctx switch) : 560 cycles , 4666 ns :
    fifo.get.blocking.k_to_k - Get data from FIFO (w/ ctx switch) : 510 cycles , 4257 ns :
    fifo.put.wake+ctx.k_to_k - Add data to FIFO (w/ ctx switch) : 590 cycles , 4923 ns :
    fifo.get.free.blocking.k_to_k - Free when getting data from FIFO (w/ ctx switch) : 510 cycles , 4250 ns :
    fifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to FIFO (w/ ctx switch) : 585 cycles , 4875 ns :
    lifo.put.immediate.kernel - Add data to LIFO (no ctx switch) : 214 cycles , 1791 ns :
    lifo.get.immediate.kernel - Get data from LIFO (no ctx switch) : 120 cycles , 1008 ns :
    lifo.put.alloc.immediate.kernel - Allocate to add data to LIFO (no ctx switch) : 831 cycles , 6925 ns :
    lifo.get.free.immediate.kernel - Free when getting data from LIFO (no ctx switch) : 555 cycles , 4625 ns :
    lifo.get.blocking.k_to_k - Get data from LIFO (w/ ctx switch) : 502 cycles , 4191 ns :
    lifo.put.wake+ctx.k_to_k - Add data to LIFO (w/ ctx switch) : 585 cycles , 4875 ns :
    lifo.get.free.blocking.k_to_k - Free when getting data from LIFO (w/ ctx switch) : 513 cycles , 4275 ns :
    lifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to LIFO (w/ ctx switch) : 585 cycles , 4881 ns :
    events.post.immediate.kernel - Post events (nothing wakes) : 225 cycles , 1875 ns :
    events.set.immediate.kernel - Set events (nothing wakes) : 230 cycles , 1923 ns :
    events.wait.immediate.kernel - Wait for any events (no ctx switch) : 120 cycles , 1000 ns :
    events.wait_all.immediate.kernel - Wait for all events (no ctx switch) : 110 cycles , 917 ns :
    events.wait.blocking.k_to_k - Wait for any events (w/ ctx switch) : 514 cycles , 4291 ns :
    events.set.wake+ctx.k_to_k - Set events (w/ ctx switch) : 754 cycles , 6291 ns :
    events.wait_all.blocking.k_to_k - Wait for all events (w/ ctx switch) : 528 cycles , 4400 ns :
    events.post.wake+ctx.k_to_k - Post events (w/ ctx switch) : 765 cycles , 6375 ns :
    semaphore.give.immediate.kernel - Give a semaphore (no waiters) : 59 cycles , 492 ns :
    semaphore.take.immediate.kernel - Take a semaphore (no blocking) : 69 cycles , 575 ns :
    semaphore.take.blocking.k_to_k - Take a semaphore (context switch) : 450 cycles , 3756 ns :
    semaphore.give.wake+ctx.k_to_k - Give a semaphore (context switch) : 509 cycles , 4249 ns :
    condvar.wait.blocking.k_to_k - Wait for a condvar (context switch) : 578 cycles , 4817 ns :
    condvar.signal.wake+ctx.k_to_k - Signal a condvar (context switch) : 630 cycles , 5250 ns :
    stack.push.immediate.kernel - Add data to k_stack (no ctx switch) : 107 cycles , 899 ns :
    stack.pop.immediate.kernel - Get data from k_stack (no ctx switch) : 80 cycles , 674 ns :
    stack.pop.blocking.k_to_k - Get data from k_stack (w/ ctx switch) : 467 cycles , 3899 ns :
    stack.push.wake+ctx.k_to_k - Add data to k_stack (w/ ctx switch) : 550 cycles , 4583 ns :
    mutex.lock.immediate.recursive.kernel - Lock a mutex : 83 cycles , 692 ns :
    mutex.unlock.immediate.recursive.kernel - Unlock a mutex : 44 cycles , 367 ns :
    heap.malloc.immediate - Average time for heap malloc : 610 cycles , 5083 ns :
    heap.free.immediate - Average time for heap free : 425 cycles , 3541 ns :
    ===================================================================
    PROJECT EXECUTION SUCCESSFUL


Sample output of the benchmark with stack canaries enabled::

    thread.yield.preemptive.ctx.k_to_k - Context switch via k_yield : 485 cycles , 4042 ns :
    thread.yield.cooperative.ctx.k_to_k - Context switch via k_yield : 485 cycles , 4042 ns :
    isr.resume.interrupted.thread.kernel - Return from ISR to interrupted thread : 545 cycles , 4549 ns :
    isr.resume.different.thread.kernel - Return from ISR to another thread : 590 cycles , 4924 ns :
    thread.create.kernel.from.kernel - Create thread : 585 cycles , 4883 ns :
    thread.start.kernel.from.kernel - Start thread : 685 cycles , 5716 ns :
    thread.suspend.kernel.from.kernel - Suspend thread : 490 cycles , 4091 ns :
    thread.resume.kernel.from.kernel - Resume thread : 569 cycles , 4749 ns :
    thread.abort.kernel.from.kernel - Abort thread : 629 cycles , 5249 ns :
    fifo.put.immediate.kernel - Add data to FIFO (no ctx switch) : 439 cycles , 3666 ns :
    fifo.get.immediate.kernel - Get data from FIFO (no ctx switch) : 320 cycles , 2674 ns :
    fifo.put.alloc.immediate.kernel - Allocate to add data to FIFO (no ctx switch) : 1499 cycles , 12491 ns :
    fifo.get.free.immediate.kernel - Free when getting data from FIFO (no ctx switch) : 1230 cycles , 10250 ns :
    fifo.get.blocking.k_to_k - Get data from FIFO (w/ ctx switch) : 868 cycles , 7241 ns :
    fifo.put.wake+ctx.k_to_k - Add data to FIFO (w/ ctx switch) : 991 cycles , 8259 ns :
    fifo.get.free.blocking.k_to_k - Free when getting data from FIFO (w/ ctx switch) : 879 cycles , 7325 ns :
    fifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to FIFO (w/ ctx switch) : 990 cycles , 8250 ns :
    lifo.put.immediate.kernel - Add data to LIFO (no ctx switch) : 429 cycles , 3582 ns :
    lifo.get.immediate.kernel - Get data from LIFO (no ctx switch) : 320 cycles , 2674 ns :
    lifo.put.alloc.immediate.kernel - Allocate to add data to LIFO (no ctx switch) : 1499 cycles , 12491 ns :
    lifo.get.free.immediate.kernel - Free when getting data from LIFO (no ctx switch) : 1220 cycles , 10166 ns :
    lifo.get.blocking.k_to_k - Get data from LIFO (w/ ctx switch) : 863 cycles , 7199 ns :
    lifo.put.wake+ctx.k_to_k - Add data to LIFO (w/ ctx switch) : 985 cycles , 8208 ns :
    lifo.get.free.blocking.k_to_k - Free when getting data from LIFO (w/ ctx switch) : 879 cycles , 7325 ns :
    lifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to LIFO (w/ ctx switch) : 985 cycles , 8208 ns :
    events.post.immediate.kernel - Post events (nothing wakes) : 420 cycles , 3501 ns :
    events.set.immediate.kernel - Set events (nothing wakes) : 420 cycles , 3501 ns :
    events.wait.immediate.kernel - Wait for any events (no ctx switch) : 280 cycles , 2334 ns :
    events.wait_all.immediate.kernel - Wait for all events (no ctx switch) : 270 cycles , 2251 ns :
    events.wait.blocking.k_to_k - Wait for any events (w/ ctx switch) : 919 cycles , 7665 ns :
    events.set.wake+ctx.k_to_k - Set events (w/ ctx switch) : 1310 cycles , 10924 ns :
    events.wait_all.blocking.k_to_k - Wait for all events (w/ ctx switch) : 954 cycles , 7950 ns :
    events.post.wake+ctx.k_to_k - Post events (w/ ctx switch) : 1340 cycles , 11166 ns :
    semaphore.give.immediate.kernel - Give a semaphore (no waiters) : 110 cycles , 917 ns :
    semaphore.take.immediate.kernel - Take a semaphore (no blocking) : 180 cycles , 1500 ns :
    semaphore.take.blocking.k_to_k - Take a semaphore (context switch) : 755 cycles , 6292 ns :
    semaphore.give.wake+ctx.k_to_k - Give a semaphore (context switch) : 812 cycles , 6767 ns :
    condvar.wait.blocking.k_to_k - Wait for a condvar (context switch) : 1027 cycles , 8558 ns :
    condvar.signal.wake+ctx.k_to_k - Signal a condvar (context switch) : 1040 cycles , 8666 ns :
    stack.push.immediate.kernel - Add data to k_stack (no ctx switch) : 220 cycles , 1841 ns :
    stack.pop.immediate.kernel - Get data from k_stack (no ctx switch) : 205 cycles , 1716 ns :
    stack.pop.blocking.k_to_k - Get data from k_stack (w/ ctx switch) : 791 cycles , 6599 ns :
    stack.push.wake+ctx.k_to_k - Add data to k_stack (w/ ctx switch) : 870 cycles , 7250 ns :
    mutex.lock.immediate.recursive.kernel - Lock a mutex : 175 cycles , 1459 ns :
    mutex.unlock.immediate.recursive.kernel - Unlock a mutex : 61 cycles , 510 ns :
    heap.malloc.immediate - Average time for heap malloc : 1060 cycles , 8833 ns :
    heap.free.immediate - Average time for heap free : 899 cycles , 7491 ns :
    ===================================================================
    PROJECT EXECUTION SUCCESSFUL

The sample output above (with stack canaries enabled) shows longer times than
for the default build. Not only does each stack frame in the call tree get its
own stack canary check, but enabling this feature also changes which routines
the compiler chooses to inline.
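
Conceptually, the added per-function cost looks like the sketch below. The
check itself is emitted by the compiler rather than written by hand;
__stack_chk_guard and __stack_chk_fail() are the standard GCC/Clang
stack-protector symbols, and the function shown is a made-up example, not code
from this benchmark::

    /* Conceptual illustration only: with stack canaries enabled, the compiler
     * effectively adds a check like this to every protected function, which is
     * why each operation in the benchmark gets slightly slower.
     */
    extern void *__stack_chk_guard;      /* canary value set up by the runtime */
    extern void __stack_chk_fail(void);  /* called when corruption is detected */

    int protected_function(void)
    {
        void *canary = __stack_chk_guard;    /* prologue: place canary on the stack */
        int result = 0;

        /* ... normal body of the function ... */

        if (canary != __stack_chk_guard) {   /* epilogue: verify the canary */
            __stack_chk_fail();
        }
        return result;
    }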

Sample output of the benchmark with object core enabled::

    thread.yield.preemptive.ctx.k_to_k - Context switch via k_yield : 740 cycles , 6167 ns :
    thread.yield.cooperative.ctx.k_to_k - Context switch via k_yield : 740 cycles , 6167 ns :
    isr.resume.interrupted.thread.kernel - Return from ISR to interrupted thread : 284 cycles , 2374 ns :
    isr.resume.different.thread.kernel - Return from ISR to another thread : 784 cycles , 6541 ns :
    thread.create.kernel.from.kernel - Create thread : 714 cycles , 5958 ns :
    thread.start.kernel.from.kernel - Start thread : 819 cycles , 6833 ns :
    thread.suspend.kernel.from.kernel - Suspend thread : 704 cycles , 5874 ns :
    thread.resume.kernel.from.kernel - Resume thread : 761 cycles , 6349 ns :
    thread.abort.kernel.from.kernel - Abort thread : 544 cycles , 4541 ns :
    fifo.put.immediate.kernel - Add data to FIFO (no ctx switch) : 211 cycles , 1766 ns :
    fifo.get.immediate.kernel - Get data from FIFO (no ctx switch) : 132 cycles , 1108 ns :
    fifo.put.alloc.immediate.kernel - Allocate to add data to FIFO (no ctx switch) : 850 cycles , 7091 ns :
    fifo.get.free.immediate.kernel - Free when getting data from FIFO (no ctx switch) : 565 cycles , 4708 ns :
    fifo.get.blocking.k_to_k - Get data from FIFO (w/ ctx switch) : 947 cycles , 7899 ns :
    fifo.put.wake+ctx.k_to_k - Add data to FIFO (w/ ctx switch) : 1015 cycles , 8458 ns :
    fifo.get.free.blocking.k_to_k - Free when getting data from FIFO (w/ ctx switch) : 950 cycles , 7923 ns :
    fifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to FIFO (w/ ctx switch) : 1010 cycles , 8416 ns :
    lifo.put.immediate.kernel - Add data to LIFO (no ctx switch) : 226 cycles , 1891 ns :
    lifo.get.immediate.kernel - Get data from LIFO (no ctx switch) : 123 cycles , 1033 ns :
    lifo.put.alloc.immediate.kernel - Allocate to add data to LIFO (no ctx switch) : 848 cycles , 7066 ns :
    lifo.get.free.immediate.kernel - Free when getting data from LIFO (no ctx switch) : 565 cycles , 4708 ns :
    lifo.get.blocking.k_to_k - Get data from LIFO (w/ ctx switch) : 951 cycles , 7932 ns :
    lifo.put.wake+ctx.k_to_k - Add data to LIFO (w/ ctx switch) : 1010 cycles , 8416 ns :
    lifo.get.free.blocking.k_to_k - Free when getting data from LIFO (w/ ctx switch) : 959 cycles , 7991 ns :
    lifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to LIFO (w/ ctx switch) : 1010 cycles , 8422 ns :
    events.post.immediate.kernel - Post events (nothing wakes) : 210 cycles , 1750 ns :
    events.set.immediate.kernel - Set events (nothing wakes) : 230 cycles , 1917 ns :
    events.wait.immediate.kernel - Wait for any events (no ctx switch) : 120 cycles , 1000 ns :
    events.wait_all.immediate.kernel - Wait for all events (no ctx switch) : 150 cycles , 1250 ns :
    events.wait.blocking.k_to_k - Wait for any events (w/ ctx switch) : 951 cycles , 7932 ns :
    events.set.wake+ctx.k_to_k - Set events (w/ ctx switch) : 1179 cycles , 9833 ns :
    events.wait_all.blocking.k_to_k - Wait for all events (w/ ctx switch) : 976 cycles , 8133 ns :
    events.post.wake+ctx.k_to_k - Post events (w/ ctx switch) : 1190 cycles , 9922 ns :
    semaphore.give.immediate.kernel - Give a semaphore (no waiters) : 59 cycles , 492 ns :
    semaphore.take.immediate.kernel - Take a semaphore (no blocking) : 69 cycles , 575 ns :
    semaphore.take.blocking.k_to_k - Take a semaphore (context switch) : 870 cycles , 7250 ns :
    semaphore.give.wake+ctx.k_to_k - Give a semaphore (context switch) : 929 cycles , 7749 ns :
    condvar.wait.blocking.k_to_k - Wait for a condvar (context switch) : 1010 cycles , 8417 ns :
    condvar.signal.wake+ctx.k_to_k - Signal a condvar (context switch) : 1060 cycles , 8833 ns :
    stack.push.immediate.kernel - Add data to k_stack (no ctx switch) : 90 cycles , 758 ns :
    stack.pop.immediate.kernel - Get data from k_stack (no ctx switch) : 86 cycles , 724 ns :
    stack.pop.blocking.k_to_k - Get data from k_stack (w/ ctx switch) : 910 cycles , 7589 ns :
    stack.push.wake+ctx.k_to_k - Add data to k_stack (w/ ctx switch) : 975 cycles , 8125 ns :
    mutex.lock.immediate.recursive.kernel - Lock a mutex : 105 cycles , 875 ns :
    mutex.unlock.immediate.recursive.kernel - Unlock a mutex : 44 cycles , 367 ns :
    heap.malloc.immediate - Average time for heap malloc : 621 cycles , 5183 ns :
    heap.free.immediate - Average time for heap free : 422 cycles , 3516 ns :
    ===================================================================
    PROJECT EXECUTION SUCCESSFUL

The sample output above (with object cores and statistics enabled) shows
longer times than for the default build when context switching is involved.
A blanket enabling of the object cores, as was done here, results in the
additional gathering of thread statistics whenever a thread is switched in or
out. The gathering of these statistics can be controlled both at project
configuration time and at runtime.
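
For example, statistics gathering could be stopped for a single thread at
runtime with something like the sketch below. It assumes the object core
statistics API (the k_obj_core_stats_disable() function and the K_OBJ_CORE()
accessor); verify the exact names against the current kernel headers::

    /* Sketch only: stop gathering statistics for one thread at runtime.
     * Assumes the k_obj_core_stats_*() API and the K_OBJ_CORE() accessor.
     */
    #include <zephyr/kernel.h>
    #include <zephyr/sys/printk.h>

    extern struct k_thread worker_thread;    /* some thread defined elsewhere */

    void stop_worker_stats(void)
    {
        int ret = k_obj_core_stats_disable(K_OBJ_CORE(&worker_thread));

        if (ret != 0) {
            printk("failed to disable thread statistics (%d)\n", ret);
        }
    }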

Sample output of the benchmark with userspace enabled::

    thread.yield.preemptive.ctx.k_to_k - Context switch via k_yield : 975 cycles , 8125 ns :
    thread.yield.preemptive.ctx.u_to_u - Context switch via k_yield : 1303 cycles , 10860 ns :
    thread.yield.preemptive.ctx.k_to_u - Context switch via k_yield : 1180 cycles , 9834 ns :
    thread.yield.preemptive.ctx.u_to_k - Context switch via k_yield : 1097 cycles , 9144 ns :
    thread.yield.cooperative.ctx.k_to_k - Context switch via k_yield : 975 cycles , 8125 ns :
    thread.yield.cooperative.ctx.u_to_u - Context switch via k_yield : 1302 cycles , 10854 ns :
    thread.yield.cooperative.ctx.k_to_u - Context switch via k_yield : 1180 cycles , 9834 ns :
    thread.yield.cooperative.ctx.u_to_k - Context switch via k_yield : 1097 cycles , 9144 ns :
    isr.resume.interrupted.thread.kernel - Return from ISR to interrupted thread : 329 cycles , 2749 ns :
    isr.resume.different.thread.kernel - Return from ISR to another thread : 1014 cycles , 8457 ns :
    isr.resume.different.thread.user - Return from ISR to another thread : 1223 cycles , 10192 ns :
    thread.create.kernel.from.kernel - Create thread : 970 cycles , 8089 ns :
    thread.start.kernel.from.kernel - Start thread : 1074 cycles , 8957 ns :
    thread.suspend.kernel.from.kernel - Suspend thread : 949 cycles , 7916 ns :
    thread.resume.kernel.from.kernel - Resume thread : 1004 cycles , 8374 ns :
    thread.abort.kernel.from.kernel - Abort thread : 2734 cycles , 22791 ns :
    thread.create.user.from.kernel - Create thread : 832 cycles , 6935 ns :
    thread.start.user.from.kernel - Start thread : 9023 cycles , 75192 ns :
    thread.suspend.user.from.kernel - Suspend thread : 1312 cycles , 10935 ns :
    thread.resume.user.from.kernel - Resume thread : 1187 cycles , 9894 ns :
    thread.abort.user.from.kernel - Abort thread : 2597 cycles , 21644 ns :
    thread.create.user.from.user - Create thread : 2144 cycles , 17872 ns :
    thread.start.user.from.user - Start thread : 9399 cycles , 78330 ns :
    thread.suspend.user.from.user - Suspend thread : 1504 cycles , 12539 ns :
    thread.resume.user.from.user - Resume thread : 1574 cycles , 13122 ns :
    thread.abort.user.from.user - Abort thread : 3237 cycles , 26981 ns :
    thread.start.kernel.from.user - Start thread : 1452 cycles , 12102 ns :
    thread.suspend.kernel.from.user - Suspend thread : 1143 cycles , 9525 ns :
    thread.resume.kernel.from.user - Resume thread : 1392 cycles , 11602 ns :
    thread.abort.kernel.from.user - Abort thread : 3372 cycles , 28102 ns :
    fifo.put.immediate.kernel - Add data to FIFO (no ctx switch) : 239 cycles , 1999 ns :
    fifo.get.immediate.kernel - Get data from FIFO (no ctx switch) : 184 cycles , 1541 ns :
    fifo.put.alloc.immediate.kernel - Allocate to add data to FIFO (no ctx switch) : 920 cycles , 7666 ns :
    fifo.get.free.immediate.kernel - Free when getting data from FIFO (no ctx switch) : 650 cycles , 5416 ns :
    fifo.put.alloc.immediate.user - Allocate to add data to FIFO (no ctx switch) : 1710 cycles , 14256 ns :
    fifo.get.free.immediate.user - Free when getting data from FIFO (no ctx switch) : 1440 cycles , 12000 ns :
    fifo.get.blocking.k_to_k - Get data from FIFO (w/ ctx switch) : 1209 cycles , 10082 ns :
    fifo.put.wake+ctx.k_to_k - Add data to FIFO (w/ ctx switch) : 1230 cycles , 10250 ns :
    fifo.get.free.blocking.k_to_k - Free when getting data from FIFO (w/ ctx switch) : 1210 cycles , 10083 ns :
    fifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to FIFO (w/ ctx switch) : 1260 cycles , 10500 ns :
    fifo.get.free.blocking.u_to_k - Free when getting data from FIFO (w/ ctx switch) : 1745 cycles , 14547 ns :
    fifo.put.alloc.wake+ctx.k_to_u - Allocate to add data to FIFO (w/ ctx switch) : 1600 cycles , 13333 ns :
    fifo.get.free.blocking.k_to_u - Free when getting data from FIFO (w/ ctx switch) : 1550 cycles , 12922 ns :
    fifo.put.alloc.wake+ctx.u_to_k - Allocate to add data to FIFO (w/ ctx switch) : 1795 cycles , 14958 ns :
    fifo.get.free.blocking.u_to_u - Free when getting data from FIFO (w/ ctx switch) : 2084 cycles , 17374 ns :
    fifo.put.alloc.wake+ctx.u_to_u - Allocate to add data to FIFO (w/ ctx switch) : 2135 cycles , 17791 ns :
    lifo.put.immediate.kernel - Add data to LIFO (no ctx switch) : 234 cycles , 1957 ns :
    lifo.get.immediate.kernel - Get data from LIFO (no ctx switch) : 189 cycles , 1582 ns :
    lifo.put.alloc.immediate.kernel - Allocate to add data to LIFO (no ctx switch) : 935 cycles , 7791 ns :
    lifo.get.free.immediate.kernel - Free when getting data from LIFO (no ctx switch) : 650 cycles , 5416 ns :
    lifo.put.alloc.immediate.user - Allocate to add data to LIFO (no ctx switch) : 1715 cycles , 14291 ns :
    lifo.get.free.immediate.user - Free when getting data from LIFO (no ctx switch) : 1445 cycles , 12041 ns :
    lifo.get.blocking.k_to_k - Get data from LIFO (w/ ctx switch) : 1219 cycles , 10166 ns :
    lifo.put.wake+ctx.k_to_k - Add data to LIFO (w/ ctx switch) : 1230 cycles , 10250 ns :
    lifo.get.free.blocking.k_to_k - Free when getting data from LIFO (w/ ctx switch) : 1210 cycles , 10083 ns :
    lifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to LIFO (w/ ctx switch) : 1260 cycles , 10500 ns :
    lifo.get.free.blocking.u_to_k - Free when getting data from LIFO (w/ ctx switch) : 1744 cycles , 14541 ns :
    lifo.put.alloc.wake+ctx.k_to_u - Allocate to add data to LIFO (w/ ctx switch) : 1595 cycles , 13291 ns :
    lifo.get.free.blocking.k_to_u - Free when getting data from LIFO (w/ ctx switch) : 1544 cycles , 12874 ns :
    lifo.put.alloc.wake+ctx.u_to_k - Allocate to add data to LIFO (w/ ctx switch) : 1795 cycles , 14958 ns :
    lifo.get.free.blocking.u_to_u - Free when getting data from LIFO (w/ ctx switch) : 2080 cycles , 17339 ns :
    lifo.put.alloc.wake+ctx.u_to_u - Allocate to add data to LIFO (w/ ctx switch) : 2130 cycles , 17750 ns :
    events.post.immediate.kernel - Post events (nothing wakes) : 285 cycles , 2375 ns :
    events.set.immediate.kernel - Set events (nothing wakes) : 290 cycles , 2417 ns :
    events.wait.immediate.kernel - Wait for any events (no ctx switch) : 235 cycles , 1958 ns :
    events.wait_all.immediate.kernel - Wait for all events (no ctx switch) : 245 cycles , 2042 ns :
    events.post.immediate.user - Post events (nothing wakes) : 800 cycles , 6669 ns :
    events.set.immediate.user - Set events (nothing wakes) : 811 cycles , 6759 ns :
    events.wait.immediate.user - Wait for any events (no ctx switch) : 780 cycles , 6502 ns :
    events.wait_all.immediate.user - Wait for all events (no ctx switch) : 770 cycles , 6419 ns :
    events.wait.blocking.k_to_k - Wait for any events (w/ ctx switch) : 1210 cycles , 10089 ns :
    events.set.wake+ctx.k_to_k - Set events (w/ ctx switch) : 1449 cycles , 12082 ns :
    events.wait_all.blocking.k_to_k - Wait for all events (w/ ctx switch) : 1250 cycles , 10416 ns :
    events.post.wake+ctx.k_to_k - Post events (w/ ctx switch) : 1475 cycles , 12291 ns :
    events.wait.blocking.u_to_k - Wait for any events (w/ ctx switch) : 1612 cycles , 13435 ns :
    events.set.wake+ctx.k_to_u - Set events (w/ ctx switch) : 1627 cycles , 13560 ns :
    events.wait_all.blocking.u_to_k - Wait for all events (w/ ctx switch) : 1785 cycles , 14875 ns :
    events.post.wake+ctx.k_to_u - Post events (w/ ctx switch) : 1790 cycles , 14923 ns :
    events.wait.blocking.k_to_u - Wait for any events (w/ ctx switch) : 1407 cycles , 11727 ns :
    events.set.wake+ctx.u_to_k - Set events (w/ ctx switch) : 1828 cycles , 15234 ns :
    events.wait_all.blocking.k_to_u - Wait for all events (w/ ctx switch) : 1585 cycles , 13208 ns :
    events.post.wake+ctx.u_to_k - Post events (w/ ctx switch) : 2000 cycles , 16666 ns :
    events.wait.blocking.u_to_u - Wait for any events (w/ ctx switch) : 1810 cycles , 15087 ns :
    events.set.wake+ctx.u_to_u - Set events (w/ ctx switch) : 2004 cycles , 16705 ns :
    events.wait_all.blocking.u_to_u - Wait for all events (w/ ctx switch) : 2120 cycles , 17666 ns :
    events.post.wake+ctx.u_to_u - Post events (w/ ctx switch) : 2315 cycles , 19291 ns :
    semaphore.give.immediate.kernel - Give a semaphore (no waiters) : 125 cycles , 1042 ns :
    semaphore.take.immediate.kernel - Take a semaphore (no blocking) : 125 cycles , 1042 ns :
    semaphore.give.immediate.user - Give a semaphore (no waiters) : 645 cycles , 5377 ns :
    semaphore.take.immediate.user - Take a semaphore (no blocking) : 680 cycles , 5669 ns :
    semaphore.take.blocking.k_to_k - Take a semaphore (context switch) : 1140 cycles , 9500 ns :
    semaphore.give.wake+ctx.k_to_k - Give a semaphore (context switch) : 1174 cycles , 9791 ns :
    semaphore.take.blocking.k_to_u - Take a semaphore (context switch) : 1350 cycles , 11251 ns :
    semaphore.give.wake+ctx.u_to_k - Give a semaphore (context switch) : 1542 cycles , 12852 ns :
    semaphore.take.blocking.u_to_k - Take a semaphore (context switch) : 1512 cycles , 12603 ns :
    semaphore.give.wake+ctx.k_to_u - Give a semaphore (context switch) : 1382 cycles , 11519 ns :
    semaphore.take.blocking.u_to_u - Take a semaphore (context switch) : 1723 cycles , 14360 ns :
    semaphore.give.wake+ctx.u_to_u - Give a semaphore (context switch) : 1749 cycles , 14580 ns :
    condvar.wait.blocking.k_to_k - Wait for a condvar (context switch) : 1285 cycles , 10708 ns :
    condvar.signal.wake+ctx.k_to_k - Signal a condvar (context switch) : 1315 cycles , 10964 ns :
    condvar.wait.blocking.k_to_u - Wait for a condvar (context switch) : 1547 cycles , 12898 ns :
    condvar.signal.wake+ctx.u_to_k - Signal a condvar (context switch) : 1855 cycles , 15458 ns :
    condvar.wait.blocking.u_to_k - Wait for a condvar (context switch) : 1990 cycles , 16583 ns :
    condvar.signal.wake+ctx.k_to_u - Signal a condvar (context switch) : 1640 cycles , 13666 ns :
    condvar.wait.blocking.u_to_u - Wait for a condvar (context switch) : 2313 cycles , 19280 ns :
    condvar.signal.wake+ctx.u_to_u - Signal a condvar (context switch) : 2170 cycles , 18083 ns :
    stack.push.immediate.kernel - Add data to k_stack (no ctx switch) : 189 cycles , 1582 ns :
    stack.pop.immediate.kernel - Get data from k_stack (no ctx switch) : 194 cycles , 1624 ns :
    stack.push.immediate.user - Add data to k_stack (no ctx switch) : 679 cycles , 5664 ns :
    stack.pop.immediate.user - Get data from k_stack (no ctx switch) : 1014 cycles , 8455 ns :
    stack.pop.blocking.k_to_k - Get data from k_stack (w/ ctx switch) : 1209 cycles , 10083 ns :
    stack.push.wake+ctx.k_to_k - Add data to k_stack (w/ ctx switch) : 1235 cycles , 10291 ns :
    stack.pop.blocking.u_to_k - Get data from k_stack (w/ ctx switch) : 2050 cycles , 17089 ns :
    stack.push.wake+ctx.k_to_u - Add data to k_stack (w/ ctx switch) : 1575 cycles , 13125 ns :
    stack.pop.blocking.k_to_u - Get data from k_stack (w/ ctx switch) : 1549 cycles , 12916 ns :
    stack.push.wake+ctx.u_to_k - Add data to k_stack (w/ ctx switch) : 1755 cycles , 14625 ns :
    stack.pop.blocking.u_to_u - Get data from k_stack (w/ ctx switch) : 2389 cycles , 19916 ns :
    stack.push.wake+ctx.u_to_u - Add data to k_stack (w/ ctx switch) : 2095 cycles , 17458 ns :
    mutex.lock.immediate.recursive.kernel - Lock a mutex : 165 cycles , 1375 ns :
    mutex.unlock.immediate.recursive.kernel - Unlock a mutex : 80 cycles , 668 ns :
    mutex.lock.immediate.recursive.user - Lock a mutex : 685 cycles , 5711 ns :
    mutex.unlock.immediate.recursive.user - Unlock a mutex : 615 cycles , 5128 ns :
    heap.malloc.immediate - Average time for heap malloc : 626 cycles , 5224 ns :
    heap.free.immediate - Average time for heap free : 427 cycles , 3558 ns :
    ===================================================================
    PROJECT EXECUTION SUCCESSFUL

The sample output above (with userspace enabled) shows longer times than for
the default build. Enabling userspace adds runtime overhead to each call on a
kernel object: the kernel must determine whether the caller is in user or
kernel space and, consequently, whether a system call is needed.
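
The sketch below illustrates that difference by timing the same operation
first from kernel mode and then again after the thread drops to user mode,
which is essentially what the kernel-mode and user-mode rows above compare.
It is illustrative only, not taken from the benchmark sources, and assumes
CONFIG_USERSPACE=y::

    /* Illustrative sketch: one k_sem_give() timed from kernel mode, then the
     * thread enters user mode and times the same call, now via a system call.
     */
    #include <zephyr/kernel.h>
    #include <zephyr/sys/printk.h>

    K_SEM_DEFINE(demo_sem, 0, 10);

    static void user_phase(void *p1, void *p2, void *p3)
    {
        /* Now in user mode: this call goes through the system call path. */
        uint32_t start = k_cycle_get_32();

        k_sem_give(&demo_sem);
        printk("user mode:   %u cycles\n", k_cycle_get_32() - start);
    }

    void compare_modes(void)
    {
        /* Kernel mode: the call goes straight into the kernel. */
        uint32_t start = k_cycle_get_32();

        k_sem_give(&demo_sem);
        printk("kernel mode: %u cycles\n", k_cycle_get_32() - start);

        /* Keep access to the semaphore after dropping privileges, then switch
         * this thread to user mode (there is no way back to kernel mode).
         */
        k_object_access_grant(&demo_sem, k_current_get());
        k_thread_user_mode_enter(user_phase, NULL, NULL, NULL);
    }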