# Precompiling

Precompiling is compiling Python source files (`.py` files) into byte code
(`.pyc` files) at build time instead of runtime. Doing it at build time can
improve performance by skipping that work at runtime.

Precompiling is disabled by default, so you must enable it using flags or
attributes to use it.

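To make the idea concrete, here is a minimal sketch using the standard library's `py_compile` module. It only illustrates what turning a `.py` file into a `.pyc` file means. It is not the rules_python precompiler, and the `precompile` helper and file names are hypothetical.

```python
import os
import py_compile
import tempfile


def precompile(source_path, pyc_path):
    """Compile one .py file into a .pyc file and return the output path."""
    # doraise=True turns compile errors into exceptions instead of
    # printing them, which is what a build step would want.
    return py_compile.compile(source_path, cfile=pyc_path, doraise=True)


# Illustrative run in a scratch directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "hello.py")
    with open(src, "w") as f:
        f.write("print('hello')\n")
    out = precompile(src, os.path.join(tmp, "hello.pyc"))
    pyc_exists = os.path.exists(out)
```

Doing this once at build time means the interpreter does not have to compile the source the first time the module is imported.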
## Overhead of precompiling

While precompiling helps runtime performance, it has two main costs:
1. Increasing the size (count and disk usage) of runfiles. It approximately
   doubles the count of the runfiles because for every `.py` file, there is also
   a `.pyc` file. Compiled files are generally around the same size as the
   source files, so it approximately doubles the disk usage.
2. Precompiling requires running an extra action at build time. While
   compiling itself isn't that expensive, the overhead can become noticeable
   as more files need to be compiled.

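One rough way to gauge the first cost for a given source tree is to count its `.py` files and their total size. Under the approximation above, precompiling adds about that many `.pyc` files and about that many bytes. The helper below is a hypothetical sketch for a plain directory walk. It does not inspect actual Bazel runfiles.

```python
import os


def count_python_sources(root):
    """Return (file_count, total_bytes) for all .py files under root."""
    count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py"):
                count += 1
                total_bytes += os.path.getsize(os.path.join(dirpath, name))
    # Per the approximation above, expect roughly `count` extra .pyc
    # files and roughly `total_bytes` extra bytes when precompiling.
    return count, total_bytes
```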
## Binary-level opt-in

Because of the costs of precompiling, it may not be feasible to enable it
globally for everything in your repo. For example, some binaries may be
particularly large, and doubling the number of runfiles isn't doable.

If this is the case, there's an alternative way to more selectively and
incrementally control precompiling on a per-binary basis.

To use this approach, the two basic steps are:
1. Disable pyc files from being automatically added to runfiles:
   `--@rules_python//python/config_settings:precompile_add_to_runfiles=decided_elsewhere`
2. Set the `pyc_collection` attribute on the binaries/tests that should or should
   not use precompiling.

The default for the `pyc_collection` attribute is controlled by a flag, so you
can use an opt-in or opt-out approach by setting the flag:
* targets must opt-out: `--@rules_python//python/config_settings:pyc_collection=include_pyc`
* targets must opt-in: `--@rules_python//python/config_settings:pyc_collection=disabled`

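As an illustration of the second step, a `BUILD.bazel` fragment might look like the sketch below. It is written in Starlark, Bazel's Python dialect, and is a config fragment rather than a runnable script. The target and file names are hypothetical, and the attribute values are assumed to mirror the flag values listed above.

```python
# BUILD.bazel (Starlark). Target and source names are hypothetical.
load("@rules_python//python:py_binary.bzl", "py_binary")

# Opts in: pyc files are collected into this binary's runfiles.
py_binary(
    name = "server",
    srcs = ["server.py"],
    pyc_collection = "include_pyc",
)

# Opts out: assumed too large to double its runfiles count.
py_binary(
    name = "huge_tool",
    srcs = ["huge_tool.py"],
    pyc_collection = "disabled",
)
```

Targets that leave `pyc_collection` unset follow whatever default the `pyc_collection` flag selects.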
## Advanced precompiler customization

The default implementation of the precompiler is a persistent, multiplexed,
sandbox-aware, cancellation-enabled, json-protocol worker that uses the same
interpreter as the target toolchain. This works well for local builds, but may
not work as well for remote execution builds. To customize the precompiler, two
mechanisms are available:

* The exec tools toolchain allows customizing the precompiler binary used with
  the `precompiler` attribute. Arbitrary binaries are supported.
* The execution requirements can be customized using
  `--@rules_python//tools/precompiler:execution_requirements`. This is a list
  flag that can be repeated. Each entry is a key=value that is added to the
  execution requirements of the `PyPrecompile` action. Note that this flag
  is specific to the rules_python precompiler. If a custom binary is used,
  this flag will have to be propagated from the custom binary using the
  `testing.ExecutionInfo` provider; refer to the `py_interpreter_program` an

The default precompiler implementation is an asynchronous/concurrent
implementation. If you find it has bugs or hangs, please report them. In the
meantime, the flag `--worker_extra_flag=PyPrecompile=--worker_impl=serial` can
be used to switch to a synchronous/serial implementation that may not perform
as well, but is less likely to have issues.

The `execution_requirements` keys of most relevance are:
* `supports-workers`: 1 or 0, to indicate if a regular persistent worker is
  desired.
* `supports-multiplex-workers`: 1 or 0, to indicate if a multiplexed persistent
  worker is desired.
* `requires-worker-protocol`: json or proto; the rules_python precompiler
  currently only supports json.
* `supports-multiplex-sandboxing`: 1 or 0, to indicate if sandboxing of the
  worker is supported.
* `supports-worker-cancellation`: 1 or 0, to indicate if requests to the worker
  can be cancelled.

Note that any execution requirements value can be specified in the flag.

## Known issues, caveats, and idiosyncrasies

* Precompiling requires Bazel 7+ with the Pystar rule implementation enabled.
* Mixing rules_python PyInfo with Bazel builtin PyInfo will result in pyc files
  being dropped.
* Precompiled files may not be used in certain cases prior to Python 3.11. This
  occurs due to Python adding the directory of the binary's main `.py` file to
  the front of `sys.path`, which causes the module to be found in the workspace
  source directory instead of within the binary's runfiles directory (where the
  pyc files are). This can usually be worked around by removing `sys.path[0]`
  (or otherwise ensuring the runfiles directory comes before the repo's source
  directory in `sys.path`).
* The pyc filename does not include the optimization level (e.g.
  `foo.cpython-39.opt-2.pyc`). This works fine (it's all byte code), but also
  means the interpreter `-O` argument can't be used; doing so will cause the
  interpreter to look for the non-existent `opt-N` named files.
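The `sys.path[0]` workaround mentioned above could be sketched as follows. The `without_script_dir` helper is hypothetical and not part of rules_python. It assumes the first `sys.path` entry is the directory of the main script, and it leaves the path unchanged otherwise.

```python
import os


def without_script_dir(path, script_file):
    """Return a copy of path with a leading script-directory entry removed."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    if path and os.path.abspath(path[0]) == script_dir:
        # Drop only the leading entry; later entries are kept in order.
        return list(path[1:])
    return list(path)
```

A binary's main module might apply it early, before other imports, with something like `sys.path[:] = without_script_dir(sys.path, __file__)`.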