# Precompiling

Precompiling is compiling Python source files (`.py` files) into byte code
(`.pyc` files) at build time instead of runtime. Doing it at build time can
improve performance by skipping that work at runtime.

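To make the `.py` to `.pyc` step concrete, here is a minimal sketch using the standard library's `py_compile` module. It only illustrates what byte-compiling a file means; it is not how the rules_python precompiler is implemented, and the `precompile` helper name is made up for this example.

```python
import pathlib
import py_compile
import tempfile


def precompile(source: pathlib.Path) -> pathlib.Path:
    """Compile one .py file ahead of time and return the path of the .pyc written."""
    # doraise=True turns compile errors into exceptions instead of printed messages.
    return pathlib.Path(py_compile.compile(str(source), doraise=True))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = pathlib.Path(tmp) / "hello.py"
        src.write_text("GREETING = 'hello'\n")
        # The exact file name depends on the interpreter version that runs this.
        print(precompile(src).name)
```

When the interpreter later imports the module and finds a valid `.pyc`, it can skip the compile step, which is the work precompiling moves to build time.
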
Precompiling is disabled by default, so you must enable it using flags or
attributes to use it.

## Overhead of precompiling

While precompiling helps runtime performance, it has two main costs:
1. Increasing the size (count and disk usage) of runfiles. It approximately
   doubles the count of the runfiles because for every `.py` file, there is also
   a `.pyc` file. Compiled files are generally around the same size as the
   source files, so it approximately doubles the disk usage.
2. Precompiling requires running an extra action at build time. While
   compiling itself isn't that expensive, the overhead can become noticeable
   as more files need to be compiled.

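To estimate the first cost for your own code, one rough approach is to byte-compile a source tree and compare file counts and sizes. The sketch below uses the standard library's `compileall` module as a stand-in for the build-time action; `measure_overhead` is a hypothetical helper, and the numbers it reports depend on the sources and the interpreter version.

```python
import compileall
import pathlib
import tempfile


def measure_overhead(root: pathlib.Path) -> dict:
    """Compile every .py under root, then count and size sources vs. pyc files."""
    # quiet=1 suppresses the per-file listing and prints only errors.
    compileall.compile_dir(str(root), quiet=1)
    sources = list(root.rglob("*.py"))
    compiled = list(root.rglob("*.pyc"))
    return {
        "py_count": len(sources),
        "pyc_count": len(compiled),
        "py_bytes": sum(p.stat().st_size for p in sources),
        "pyc_bytes": sum(p.stat().st_size for p in compiled),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        for name in ("a.py", "b.py", "c.py"):
            (root / name).write_text("def f(x):\n    return x + 1\n")
        print(measure_overhead(root))
```
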
## Binary-level opt-in

Binary-level opt-in allows enabling precompiling on a per-target basis. This is
useful for situations such as:

* Globally enabling precompiling in your `.bazelrc` isn't feasible. This may
  be because some targets don't work with precompiling, e.g. because they're too
  big.
* Enabling precompiling for build tools (exec config targets) separately from
  target-config programs.

To use this approach, set the {bzl:attr}`pyc_collection` attribute on the
binaries/tests that should or should not use precompiling. Then change the
{bzl:flag}`--precompile` default.

The default for the {bzl:attr}`pyc_collection` attribute is controlled by the flag
{bzl:obj}`--@rules_python//python/config_settings:precompile`, so you
can use an opt-in or opt-out approach by setting its value:
* targets must opt-out: `--@rules_python//python/config_settings:precompile=enabled`
* targets must opt-in: `--@rules_python//python/config_settings:precompile=disabled`

## Advanced precompiler customization

The default implementation of the precompiler is a persistent, multiplexed,
sandbox-aware, cancellation-enabled, json-protocol worker that uses the same
interpreter as the target toolchain. This works well for local builds, but may
not work as well for remote execution builds. To customize the precompiler, two
mechanisms are available:

* The exec tools toolchain allows customizing the precompiler binary used with
  the {bzl:attr}`precompiler` attribute. Arbitrary binaries are supported.
* The execution requirements can be customized using
  `--@rules_python//tools/precompiler:execution_requirements`. This is a list
  flag that can be repeated. Each entry is a key=value that is added to the
  execution requirements of the `PyCompile` action. Note that this flag
  is specific to the rules_python precompiler. If a custom binary is used,
  this flag will have to be propagated from the custom binary using the
  `testing.ExecutionInfo` provider; refer to the `py_interpreter_program` an

The default precompiler implementation is an asynchronous/concurrent
implementation. If you find it has bugs or hangs, please report them. In the
meantime, the flag `--worker_extra_flag=PyCompile=--worker_impl=serial` can
be used to switch to a synchronous/serial implementation that may not perform
as well, but is less likely to have issues.

The `execution_requirements` keys of most relevance are:
* `supports-workers`: 1 or 0, to indicate if a regular persistent worker is
  desired.
* `supports-multiplex-workers`: 1 or 0, to indicate if a multiplexed persistent
  worker is desired.
* `requires-worker-protocol`: json or proto; the rules_python precompiler
  currently only supports json.
* `supports-multiplex-sandboxing`: 1 or 0, to indicate if sandboxing of the
  worker is supported.
* `supports-worker-cancellation`: 1 or 0, to indicate if requests to the worker
  can be cancelled.

Note that any execution requirements values can be specified in the flag.

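As a mental model for the repeated key=value entries, the sketch below folds a list of entries into a single mapping. It is purely illustrative: `parse_execution_requirements` is a hypothetical helper, and the "later entries win" and error-handling behavior are assumptions of this example, not a description of what rules_python actually does with the flag.

```python
def parse_execution_requirements(entries):
    """Fold repeated key=value entries into a dict (assumption: later entries win)."""
    requirements = {}
    for entry in entries:
        # partition splits on the first "=", so values may themselves contain "=".
        key, sep, value = entry.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {entry!r}")
        requirements[key] = value
    return requirements


if __name__ == "__main__":
    # Each entry corresponds to one repetition of the flag on the command line.
    entries = [
        "supports-workers=1",
        "supports-multiplex-workers=0",
        "requires-worker-protocol=json",
    ]
    print(parse_execution_requirements(entries))
```
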
## Known issues, caveats, and idiosyncrasies

* Precompiling requires Bazel 7+ with the Pystar rule implementation enabled.
* Mixing rules_python PyInfo with Bazel builtin PyInfo will result in pyc files
  being dropped.
* Precompiled files may not be used in certain cases prior to Python 3.11. This
  occurs due to Python adding the directory of the binary's main `.py` file to
  the front of `sys.path`, which causes the module to be found in the workspace
  source directory instead of within the binary's runfiles directory (where the
  pyc files are). This can usually be worked around by removing `sys.path[0]`
  (or otherwise ensuring the runfiles directory comes before the repo's source
  directory in `sys.path`).
* The pyc filename does not include the optimization level (e.g.
  `foo.cpython-39.opt-2.pyc`). This works fine (it's all byte code), but also
  means the interpreter `-O` argument can't be used; doing so will cause the
  interpreter to look for the non-existent `opt-N` named files.
* Targets with the same source files and different exec properties will result
  in action conflicts. This most commonly occurs when a `py_binary` and
  `py_library` have the same source files. To fix, modify both targets so
  they have the same exec properties. If this is difficult because unsupported
  exec groups end up being passed to the Python rules, please file an issue
  to have those exec groups added to the Python rules.
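
Two of the caveats above can be illustrated with the standard library. The sketch below first uses `importlib.util.cache_from_source` to show the cache file names the interpreter computes with and without an optimization level, and then shows one possible way to reorder a path list so runfiles entries come first. `prefer_runfiles` and the paths in the usage section are hypothetical names for this example; how you locate the runfiles root depends on your setup.

```python
import importlib.util

# Without an optimization level, the cache name carries only the interpreter tag.
plain = importlib.util.cache_from_source("foo.py", optimization="")
# At optimization level 2 (the -OO flag), the interpreter looks for an "opt-2" name.
opt2 = importlib.util.cache_from_source("foo.py", optimization=2)


def prefer_runfiles(path_entries, runfiles_root):
    """Return a copy of path_entries with entries under runfiles_root moved first.

    Relative order within each group is preserved.
    """
    in_runfiles = [p for p in path_entries if p.startswith(runfiles_root)]
    others = [p for p in path_entries if not p.startswith(runfiles_root)]
    return in_runfiles + others


if __name__ == "__main__":
    print(plain)
    print(opt2)
    # Hypothetical paths, for illustration only.
    example = ["/src/repo/app", "/out/app.runfiles/repo", "/usr/lib/python3"]
    print(prefer_runfiles(example, "/out/app.runfiles"))
```

In a real program the reordering would be applied in place, for example `sys.path[:] = prefer_runfiles(sys.path, runfiles_root)`, before importing the affected modules.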