pw_tokenizer: Specify UTF-8 encoding when reading databases

The default encoding for opening a file is platform-dependent.
pw_tokenizer expects UTF-8, but UTF-8 was only specified when opening
CSV databases, not directory databases. This causes problems on Windows
when opening databases that contain non-ASCII characters.

Consolidate code paths for opening CSV and directory databases so that
UTF-8 is always used. Also, specify UTF-8 when opening JSON databases.
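
The consolidation described above can be sketched roughly as follows. This
is an illustrative sketch, not the code from this change: the helper names
(read_csv_rows, read_directory_rows, read_json_strings), the shared
_DATABASE_ENCODING constant, and the '*.csv' glob are all hypothetical. The
point it shows is that every read path goes through an open() call with an
explicit encoding='utf-8' rather than the platform default.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical shared constant so every reader uses the same encoding
# instead of the platform-dependent default.
_DATABASE_ENCODING = 'utf-8'


def read_csv_rows(path: Path) -> list:
    """Reads all rows of one CSV file, always decoding it as UTF-8."""
    # newline='' is what the csv module recommends for files it reads.
    with path.open('r', newline='', encoding=_DATABASE_ENCODING) as fd:
        return list(csv.reader(fd))


def read_directory_rows(directory: Path) -> list:
    """Reads rows from each CSV file in a directory via the same helper.

    The '*.csv' pattern is an assumption made for this sketch.
    """
    rows = []
    for csv_path in sorted(directory.glob('*.csv')):
        rows.extend(read_csv_rows(csv_path))
    return rows


def read_json_strings(path: Path) -> list:
    """Loads a JSON file, also with an explicit UTF-8 encoding."""
    with path.open('r', encoding=_DATABASE_ENCODING) as fd:
        return json.load(fd)


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        db = Path(tmp) / 'tokens.csv'
        # Write raw UTF-8 bytes containing non-ASCII characters.
        db.write_bytes(
            '0000abcd,,"Temp\u00e9rature: %d\u00b0C"\n'.encode('utf-8')
        )
        print(read_directory_rows(Path(tmp)))
```

Because the directory reader reuses the single-file helper, there is only
one open() call for CSV data, so the encoding cannot be forgotten on one
path and not the other.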

Change-Id: I56ac7e6722b5eda1cd3351798aec4e952d852b56
Reviewed-on: https://pigweed-review.googlesource.com/c/pigweed/pigweed/+/130473
Commit-Queue: Auto-Submit <auto-submit@pigweed.google.com.iam.gserviceaccount.com>
Reviewed-by: Keir Mierle <keir@google.com>
Pigweed-Auto-Submit: Wyatt Hepler <hepler@google.com>
Reviewed-by: William Abajian <williamabajian@google.com>
2 files changed
tree: f52673ba4293c8e1b8ee014e0c87082e77c56e88
README.md

Pigweed

Pigweed is an open source collection of embedded-targeted libraries, or as we like to call them, modules. These modules are building blocks and infrastructure that enable faster and more reliable development on small-footprint MMU-less 32-bit microcontrollers like the STMicroelectronics STM32L452 or the Nordic nRF52832.

For more information please see our website: https://pigweed.dev/.

Links