Change std::map to std::unordered_map

This provides a speedup to the Parse() function. It turns out that iteration stability was not required, only pointer stability was. std::unordered_map (unlike absl::flat_hash_map) guarantees pointer stability on rehash (even when the corresponding iterators are invalidated).

Benchmarks taken on my workstation (Intel(R) Xeon(R) Gold 6154 CPU @ 3.00 GHz).

Before this change:
```
------------------------------------------------------------------
Benchmark                        Time             CPU   Iterations
------------------------------------------------------------------
BM_KallSymsLoad/4/4       99121604 ns     99088253 ns            7
BM_KallSymsLoad/8/4       98921494 ns     98890095 ns            7
BM_KallSymsLoad/16/4      98740662 ns     98706812 ns            7
BM_KallSymsLoad/32/4      98859902 ns     98833852 ns            7
BM_KallSymsLoad/64/4      98863256 ns     98835927 ns            7
BM_KallSymsLoad/128/4     99153072 ns     99124082 ns            7
BM_KallSymsLoad/256/4     98922027 ns     98893640 ns            7
BM_KallSymsLoad/512/4     98935214 ns     98907115 ns            7
BM_KallSymsLoad/4/8       98728370 ns     98695819 ns            7
BM_KallSymsLoad/8/8       98621335 ns     98585123 ns            7
BM_KallSymsLoad/16/8      98607226 ns     98576839 ns            7
BM_KallSymsLoad/32/8      98697781 ns     98663957 ns            7
BM_KallSymsLoad/64/8      98560117 ns     98531298 ns            7
BM_KallSymsLoad/128/8     98737596 ns     98703986 ns            7
BM_KallSymsLoad/256/8     98629217 ns     98597508 ns            7
BM_KallSymsLoad/512/8     98615277 ns     98584443 ns            7
BM_KallSymsLoad/4/16      98850964 ns     98808132 ns            7
BM_KallSymsLoad/8/16      98695239 ns     98659430 ns            7
BM_KallSymsLoad/16/16     98599332 ns     98554617 ns            7
BM_KallSymsLoad/32/16     98756136 ns     98720076 ns            7
BM_KallSymsLoad/64/16     98466355 ns     98436129 ns            7
BM_KallSymsLoad/128/16    98523223 ns     98487406 ns            7
BM_KallSymsLoad/256/16    98544608 ns     98499732 ns            7
BM_KallSymsLoad/512/16    98625144 ns     98588725 ns            7
BM_KallSymsLoad/4/32      98502094 ns     98468959 ns            7
BM_KallSymsLoad/8/32      98501834 ns     98467530 ns            7
BM_KallSymsLoad/16/32     98434360 ns     98395627 ns            7
BM_KallSymsLoad/32/32     98368997 ns     98330952 ns            7
BM_KallSymsLoad/64/32     98433614 ns     98390613 ns            7
BM_KallSymsLoad/128/32    98415589 ns     98380287 ns            7
BM_KallSymsLoad/256/32    98502306 ns     98471922 ns            7
BM_KallSymsLoad/512/32    98459373 ns     98412036 ns            7
BM_KallSymsLoad/4/64      98430941 ns     98396895 ns            7
BM_KallSymsLoad/8/64      98414775 ns     98378596 ns            7
BM_KallSymsLoad/16/64     98374289 ns     98338748 ns            7
BM_KallSymsLoad/32/64     98393516 ns     98359190 ns            7
BM_KallSymsLoad/64/64     98261359 ns     98220605 ns            7
BM_KallSymsLoad/128/64    98393945 ns     98364689 ns            7
BM_KallSymsLoad/256/64    98462455 ns     98424850 ns            7
BM_KallSymsLoad/512/64    98457300 ns     98426069 ns            7
BM_KallSymsLoad/4/128     98501957 ns     98463384 ns            7
BM_KallSymsLoad/8/128     98674380 ns     98634825 ns            7
BM_KallSymsLoad/16/128    98324142 ns     98293977 ns            7
BM_KallSymsLoad/32/128    98302021 ns     98259279 ns            7
BM_KallSymsLoad/64/128    98365625 ns     98333438 ns            7
BM_KallSymsLoad/128/128   98385330 ns     98349551 ns            7
BM_KallSymsLoad/256/128   98473149 ns     98433697 ns            7
BM_KallSymsLoad/512/128   98348658 ns     98318590 ns            7
BM_KallSymsLoad/4/256     98312813 ns     98274486 ns            7
BM_KallSymsLoad/8/256     98364746 ns     98334950 ns            7
BM_KallSymsLoad/16/256    98394537 ns     98368012 ns            7
BM_KallSymsLoad/32/256    98315049 ns     98286329 ns            7
BM_KallSymsLoad/64/256    98298456 ns     98273549 ns            7
BM_KallSymsLoad/128/256   98326489 ns     98294106 ns            7
BM_KallSymsLoad/256/256   98251913 ns     98222781 ns            7
BM_KallSymsLoad/512/256   98256001 ns     98229861 ns            7
BM_KallSymsLoad/4/512     98279660 ns     98245921 ns            7
BM_KallSymsLoad/8/512     98216327 ns     98183392 ns            7
BM_KallSymsLoad/16/512    98311585 ns     98273809 ns            7
BM_KallSymsLoad/32/512    98262419 ns     98236094 ns            7
BM_KallSymsLoad/64/512    98305662 ns     98275588 ns            7
BM_KallSymsLoad/128/512   98261368 ns     98226209 ns            7
BM_KallSymsLoad/256/512   98204740 ns     98177210 ns            7
BM_KallSymsLoad/512/512   98239170 ns     98208909 ns            7
```

After the change:
```
------------------------------------------------------------------
Benchmark                        Time             CPU   Iterations
------------------------------------------------------------------
BM_KallSymsLoad/4/4       64868469 ns     64838461 ns           10
BM_KallSymsLoad/8/4       64200499 ns     64178356 ns           11
BM_KallSymsLoad/16/4      64222060 ns     64191195 ns           11
BM_KallSymsLoad/32/4      64125385 ns     64096874 ns           11
BM_KallSymsLoad/64/4      64127636 ns     64098626 ns           11
BM_KallSymsLoad/128/4     65087918 ns     65055339 ns           11
BM_KallSymsLoad/256/4     64165957 ns     64138396 ns           11
BM_KallSymsLoad/512/4     64163988 ns     64137760 ns           11
BM_KallSymsLoad/4/8       63941703 ns     63909630 ns           11
BM_KallSymsLoad/8/8       63890683 ns     63863018 ns           11
BM_KallSymsLoad/16/8      63900996 ns     63873838 ns           11
BM_KallSymsLoad/32/8      63866876 ns     63834743 ns           11
BM_KallSymsLoad/64/8      64177354 ns     64140083 ns           10
BM_KallSymsLoad/128/8     63870833 ns     63838597 ns           10
BM_KallSymsLoad/256/8     63888823 ns     63860425 ns           11
BM_KallSymsLoad/512/8     63898532 ns     63868791 ns           11
BM_KallSymsLoad/4/16      63778783 ns     63747664 ns           11
BM_KallSymsLoad/8/16      63746139 ns     63713986 ns           11
BM_KallSymsLoad/16/16     63738234 ns     63711361 ns           11
BM_KallSymsLoad/32/16     63722081 ns     63692323 ns           11
BM_KallSymsLoad/64/16     63796452 ns     63773792 ns           11
BM_KallSymsLoad/128/16    63774938 ns     63747945 ns           11
BM_KallSymsLoad/256/16    63773658 ns     63747640 ns           11
BM_KallSymsLoad/512/16    63728806 ns     63698138 ns           11
BM_KallSymsLoad/4/32      63750977 ns     63720920 ns           11
BM_KallSymsLoad/8/32      63784311 ns     63756742 ns           11
BM_KallSymsLoad/16/32     63674778 ns     63646585 ns           11
BM_KallSymsLoad/32/32     63631719 ns     63610206 ns           11
BM_KallSymsLoad/64/32     63663878 ns     63635998 ns           11
BM_KallSymsLoad/128/32    63648718 ns     63621974 ns           11
BM_KallSymsLoad/256/32    63616506 ns     63591153 ns           11
BM_KallSymsLoad/512/32    64305002 ns     64276171 ns           11
BM_KallSymsLoad/4/64      63718325 ns     63691790 ns           11
BM_KallSymsLoad/8/64      63712721 ns     63687291 ns           11
BM_KallSymsLoad/16/64     63628396 ns     63601652 ns           11
BM_KallSymsLoad/32/64     63637803 ns     63612854 ns           11
BM_KallSymsLoad/64/64     63695238 ns     63666736 ns           11
BM_KallSymsLoad/128/64    63598349 ns     63570673 ns           11
BM_KallSymsLoad/256/64    64078859 ns     64049537 ns           11
BM_KallSymsLoad/512/64    63589260 ns     63559930 ns           11
BM_KallSymsLoad/4/128     63953082 ns     63921907 ns           11
BM_KallSymsLoad/8/128     63644521 ns     63624484 ns           11
BM_KallSymsLoad/16/128    64232853 ns     64210635 ns           11
BM_KallSymsLoad/32/128    63747808 ns     63725410 ns           11
BM_KallSymsLoad/64/128    63769731 ns     63744970 ns           11
BM_KallSymsLoad/128/128   64006306 ns     63981250 ns           11
BM_KallSymsLoad/256/128   63846712 ns     63824817 ns           11
BM_KallSymsLoad/512/128   63621425 ns     63594064 ns           11
BM_KallSymsLoad/4/256     64087718 ns     64064789 ns           11
BM_KallSymsLoad/8/256     63668691 ns     63636015 ns           11
BM_KallSymsLoad/16/256    63632569 ns     63610564 ns           11
BM_KallSymsLoad/32/256    63605429 ns     63583740 ns           11
BM_KallSymsLoad/64/256    63640899 ns     63616038 ns           11
BM_KallSymsLoad/128/256   63603973 ns     63582177 ns           11
BM_KallSymsLoad/256/256   63578361 ns     63556459 ns           11
BM_KallSymsLoad/512/256   63596795 ns     63574374 ns           11
BM_KallSymsLoad/4/512     63709812 ns     63682670 ns           11
BM_KallSymsLoad/8/512     63672293 ns     63647508 ns           11
BM_KallSymsLoad/16/512    63568765 ns     63549006 ns           11
BM_KallSymsLoad/32/512    63682813 ns     63654980 ns           11
BM_KallSymsLoad/64/512    63732016 ns     63705784 ns           11
BM_KallSymsLoad/128/512   63711329 ns     63684863 ns           11
BM_KallSymsLoad/256/512   63609111 ns     63585198 ns           11
BM_KallSymsLoad/512/512   63601920 ns     63577604 ns           11
```

I manually checked this on a Pixel 3a, and the runtime (with real /proc/kallsyms) decreases from ~430ms to ~350ms.

Bug: 180093708
Change-Id: I6de2d9717a08ea72ca9e4c35841b1658acd49ad2
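
The pointer-stability property this change relies on can be illustrated with a minimal sketch. It is not the actual Parse() code: `SymbolMap` and `PointerSurvivesRehash` are hypothetical names chosen for illustration, and the sketch assumes the implementation's initial bucket count is well below 100000 (true of common standard libraries, but implementation-defined).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an address -> symbol name table.
// This is an assumption for illustration, not Perfetto's real data structure.
using SymbolMap = std::unordered_map<uint64_t, std::string>;

// Inserts one entry, keeps a raw pointer to its value, then inserts enough
// further entries to force the table to rehash. Returns true if the table
// grew and the saved pointer still refers to the same live element.
bool PointerSurvivesRehash() {
  SymbolMap symbols;
  auto it = symbols.emplace(0x1000, "first_symbol").first;
  const std::string* stable = &it->second;
  const std::size_t buckets_before = symbols.bucket_count();

  for (uint64_t i = 1; i <= 100000; ++i) {
    symbols.emplace(0x1000 + i, "sym");
  }

  // A larger bucket count means at least one rehash happened.
  const bool rehashed = symbols.bucket_count() > buckets_before;

  // Iterators obtained before a rehash are invalidated, so `it` is not
  // reused here; the element is looked up again instead.
  const bool same_address = (stable == &symbols.at(0x1000));

  return rehashed && same_address && *stable == "first_symbol";
}
```

Calling `PointerSurvivesRehash()` should return true with std::unordered_map, because the standard keeps elements at fixed addresses across rehashes. The same pattern with absl::flat_hash_map would be unsafe, since that container may move its elements when it grows.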
Perfetto is a production-grade open-source stack for performance instrumentation and trace analysis. It offers services and libraries for recording system-level and app-level traces, native + Java heap profiling, a library for analyzing traces using SQL, and a web-based UI to visualize and explore multi-GB traces.
See https://perfetto.dev/docs or the /docs/ directory for documentation.