| # Python Gazelle plugin |
| |
| [Gazelle](https://github.com/bazelbuild/bazel-gazelle) |
| is a build file generator for Bazel projects. It can create new BUILD.bazel files for a project that follows language conventions, and it can update existing build files to include new sources, dependencies, and options. |
| |
| Gazelle may be run by Bazel using the gazelle rule, or it may be installed and run as a command line tool. |
| |
| This directory contains a plugin for |
| [Gazelle](https://github.com/bazelbuild/bazel-gazelle) |
| that generates BUILD file content for Python code. When Gazelle is run as a command line tool with this plugin, it embeds a Python interpreter resolved during the plugin build. |
| The behavior of the plugin differs slightly between interpreter versions, because the Python `stdlib` changes with every minor version release. |
| Distributors of Gazelle binaries should, therefore, build a Gazelle binary for each OS+CPU architecture+Minor Python version combination they are targeting. |
| |
| The following instructions are for when you use [bzlmod](https://docs.bazel.build/versions/5.0.0/bzlmod.html). |
| Please refer to older documentation that includes instructions on how to use Gazelle |
| without using bzlmod as your dependency manager. |
| |
| ## Example |
| |
| We have an example of using Gazelle with Python located [here](https://github.com/bazelbuild/rules_python/tree/main/examples/bzlmod). |
| A fully-working example without using bzlmod is in [`examples/build_file_generation`](../examples/build_file_generation). |
| |
| The following documentation covers using bzlmod. |
| |
| ## Adding Gazelle to your project |
| |
| First, you'll need to add Gazelle to your `MODULE.bazel` file. |
| Get the current version of Gazelle from the releases page: https://github.com/bazelbuild/bazel-gazelle/releases/. |
| |
| To configure rules_python, see the installation `MODULE.bazel` snippet on its Releases page: |
| https://github.com/bazelbuild/rules_python/releases. |
| |
| You will also need to add a `bazel_dep` for `rules_python_gazelle_plugin`. |
| |
| Here is a snippet of a `MODULE.bazel` file. |
| |
| ```starlark |
| # The following stanza defines the dependency rules_python. |
| bazel_dep(name = "rules_python", version = "0.22.0") |
| |
| # The following stanza defines the dependency rules_python_gazelle_plugin. |
| # For typical setups you set the version. |
| bazel_dep(name = "rules_python_gazelle_plugin", version = "0.22.0") |
| |
| # The following stanza defines the dependency gazelle. |
| bazel_dep(name = "gazelle", version = "0.31.0", repo_name = "bazel_gazelle") |
| |
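| # Import the python repositories generated by the given module extension into the scope of the current module. |
| # `python` must already be bound to the rules_python "python" module extension via `use_extension`, |
| # with a 3.9 toolchain configured, as in the rules_python installation snippet. |
| use_repo(python, "python3_9") |
| use_repo(python, "python3_9_toolchains") |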
| |
| # Register an already-defined toolchain so that Bazel can use it during toolchain resolution. |
| register_toolchains( |
| "@python3_9_toolchains//:all", |
| ) |
| |
| # Use the pip extension |
| pip = use_extension("@rules_python//python:extensions.bzl", "pip") |
| |
| # Use the extension to call the `pip_repository` rule that invokes `pip`, with `incremental` set. |
| # Accepts a locked/compiled requirements file and installs the dependencies listed within. |
| # Those dependencies become available in a generated `requirements.bzl` file. |
| # You can instead check this `requirements.bzl` file into your repo. |
| # Because this project has different requirements for windows vs other |
| # operating systems, we have requirements for each. |
| pip.parse( |
| name = "pip", |
| requirements_lock = "//:requirements_lock.txt", |
| requirements_windows = "//:requirements_windows.txt", |
| ) |
| |
| # Imports the pip toolchain generated by the given module extension into the scope of the current module. |
| use_repo(pip, "pip") |
| ``` |
| |
| Next, we'll fetch metadata about your Python dependencies, so that gazelle can |
| determine which package a given import statement comes from. This is provided |
| by the `modules_mapping` rule. We'll make a target for consuming this |
| `modules_mapping`, and writing it as a manifest file for Gazelle to read. |
| This is checked into the repo for speed, as it takes some time to calculate |
| in a large monorepo. |
| |
| Gazelle will walk up the filesystem from a Python file to find this metadata, |
| looking for a file called `gazelle_python.yaml` in an ancestor folder of the Python code. |
| Create an empty file with this name. It might be next to your `requirements.txt` file. |
| (You can just use `touch` at this point; the file only needs to exist.) |
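The lookup described above can be pictured with a short Python sketch. This is only an illustration of the "walk up the ancestor folders" idea, not the plugin's code (the plugin itself is written in Go); the helper name `find_manifest` is hypothetical.

```python
from pathlib import Path
from typing import Optional

# Default manifest name; the python_manifest_file_name directive can change it.
MANIFEST_NAME = "gazelle_python.yaml"


def find_manifest(python_file: Path, name: str = MANIFEST_NAME) -> Optional[Path]:
    """Return the nearest manifest in an ancestor folder, or None if absent."""
    # `parents` starts with the folder containing the file and ends at the
    # filesystem root, so the closest manifest wins.
    for folder in python_file.resolve().parents:
        candidate = folder / name
        if candidate.is_file():
            return candidate
    return None
```

Under this sketch, a manifest placed at the repository root is found for every Python file beneath it, which is why a single empty `gazelle_python.yaml` next to your requirements file is enough to get started.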
| |
| To keep the metadata updated, put this in your `BUILD.bazel` file next to `gazelle_python.yaml`: |
| |
| ```starlark |
| load("@pip//:requirements.bzl", "all_whl_requirements") |
| load("@rules_python_gazelle_plugin//manifest:defs.bzl", "gazelle_python_manifest") |
| load("@rules_python_gazelle_plugin//modules_mapping:def.bzl", "modules_mapping") |
| |
| # This rule fetches the metadata for python packages we depend on. That data is |
| # required for the gazelle_python_manifest rule to update our manifest file. |
| modules_mapping( |
| name = "modules_map", |
| wheels = all_whl_requirements, |
| ) |
| |
| # Gazelle python extension needs a manifest file mapping from |
| # an import to the installed package that provides it. |
| # This macro produces two targets: |
| # - //:gazelle_python_manifest.update can be used with `bazel run` |
| # to recalculate the manifest |
| # - //:gazelle_python_manifest.test is a test target ensuring that |
| # the manifest doesn't need to be updated |
| gazelle_python_manifest( |
| name = "gazelle_python_manifest", |
| modules_mapping = ":modules_map", |
| # This is the name we gave to `pip.parse` in MODULE.bazel, where third-party |
| # python libraries are loaded in BUILD files. |
| pip_repository_name = "pip", |
| # This should point to wherever we declare our python dependencies |
| # (the same file we passed as `requirements_lock` to `pip.parse` in MODULE.bazel). |
| # This argument is optional. If provided, the `.test` target is very |
| # fast because it just has to check an integrity field. If not provided, |
| # the integrity field is not added to the manifest which can help avoid |
| # merge conflicts in large repos. |
| requirements = "//:requirements_lock.txt", |
| ) |
| ``` |
| |
| Finally, you create a target that you'll invoke to run the Gazelle tool |
| with the rules_python extension included. This typically goes in your root |
| `/BUILD.bazel` file: |
| |
| ```starlark |
| load("@bazel_gazelle//:def.bzl", "gazelle") |
| |
| # Our gazelle target points to the python gazelle binary. |
| # This is the simple case where we only need one language supported. |
| # If you also had proto, go, or other gazelle-supported languages, |
| # you would also need a gazelle_binary rule. |
| # See https://github.com/bazelbuild/bazel-gazelle/blob/master/extend.rst#example |
| gazelle( |
| name = "gazelle", |
| gazelle = "@rules_python_gazelle_plugin//python:gazelle_binary", |
| ) |
| ``` |
| |
| That's it. You can now run `bazel run //:gazelle` anytime |
| you edit Python code, and it should update your `BUILD` files correctly. |
| |
| ## Usage |
| |
| Gazelle is non-destructive. |
| It will try to leave your edits to BUILD files alone, only making updates to `py_*` targets. |
| However, it will remove dependencies that appear to be unused, so it's a |
| good idea to check in your work before running Gazelle so you can easily |
| revert any changes it made. |
| |
| The rules_python extension assumes some conventions about your Python code. |
| These are noted below, and might require changes to your existing code. |
| |
| Note that the `gazelle` program has multiple commands. At present, only the `update` command (the default) does anything for Python code. |
| |
| ### Directives |
| |
| You can configure the extension using directives, just like for other |
| languages. These are just comments in the `BUILD.bazel` file which |
| govern behavior of the extension when processing files under that |
| folder. |
| |
| See https://github.com/bazelbuild/bazel-gazelle#directives |
| for some general directives that may be useful. |
| In particular, the `resolve` directive is language-specific |
| and can be used with Python. |
| Examples of these directives in use can be found in the |
| /gazelle/testdata folder in the rules_python repo. |
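To make the "directives are just comments" idea concrete, here is a minimal Python sketch that pulls directive names and values out of BUILD file text. The regular expression and the `parse_directives` helper are assumptions chosen for illustration; Gazelle's own parser is written in Go and may differ in detail.

```python
import re

# Assumed shape for illustration: "# gazelle:<name> <value>", value optional.
DIRECTIVE_RE = re.compile(r"^#\s*gazelle:(\w+)\s*(.*?)\s*$")


def parse_directives(build_file_text):
    """Return (name, value) pairs for each directive comment, in file order."""
    directives = []
    for line in build_file_text.splitlines():
        match = DIRECTIVE_RE.match(line.strip())
        if match:
            directives.append((match.group(1), match.group(2)))
    return directives
```

A directive without arguments, such as `# gazelle:python_root`, yields an empty value in this sketch, while ordinary comments and rule definitions are skipped.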
| |
| Python-specific directives are as follows: |
| |
| | **Directive** | **Default value** | |
| |--------------------------------------|-------------------| |
| | `# gazelle:python_extension` | `enabled` | |
| | Controls whether the Python extension is enabled or not. Sub-packages inherit this value. Can be either "enabled" or "disabled". | | |
| | [`# gazelle:python_root`](#directive-python_root) | n/a | |
| | Sets a Bazel package as a Python root. This is used on monorepos with multiple Python projects that don't share the top-level of the workspace as the root. See [Directive: `python_root`](#directive-python_root) below. | | |
| | `# gazelle:python_manifest_file_name`| `gazelle_python.yaml` | |
| | Overrides the default manifest file name. | | |
| | `# gazelle:python_ignore_files` | n/a | |
| | Controls the files which are ignored from the generated targets. | | |
| | `# gazelle:python_ignore_dependencies`| n/a | |
| | Controls the ignored dependencies from the generated targets. | | |
| | `# gazelle:python_validate_import_statements`| `true` | |
| | Controls whether the Python import statements should be validated. Can be "true" or "false" | | |
| | `# gazelle:python_generation_mode`| `package` | |
| | Controls the target generation mode. Can be "file", "package", or "project" | | |
| | `# gazelle:python_generation_mode_per_file_include_init`| `false` | |
| | Controls whether `__init__.py` files are included as srcs in each generated target when target generation mode is "file". Can be "true", or "false" | | |
| | `# gazelle:python_library_naming_convention`| `$package_name$` | |
| | Controls the `py_library` naming convention. It interpolates `$package_name$` with the Bazel package name. E.g. if the Bazel package name is `foo`, setting this to `$package_name$_my_lib` would result in a generated target named `foo_my_lib`. | | |
| | `# gazelle:python_binary_naming_convention` | `$package_name$_bin` | |
| | Controls the `py_binary` naming convention. Follows the same interpolation rules as `python_library_naming_convention`. | | |
| | `# gazelle:python_test_naming_convention` | `$package_name$_test` | |
| | Controls the `py_test` naming convention. Follows the same interpolation rules as `python_library_naming_convention`. | | |
| | `# gazelle:resolve py ...` | n/a | |
| | Instructs the plugin what target to add as a dependency to satisfy a given import statement. The syntax is `# gazelle:resolve py import-string label` where `import-string` is the symbol in the python `import` statement, and `label` is the Bazel label that Gazelle should write in `deps`. | | |
| [`# gazelle:python_default_visibility labels`](#directive-python_default_visibility) | `//$python_root$:__subpackages__` |
| Instructs gazelle to use these visibility labels on all python targets. `labels` is a comma-separated list of labels (without spaces). | | |
| [`# gazelle:python_visibility label`](#directive-python_visibility) | n/a |
| Appends additional visibility labels to each generated target. This directive can be set multiple times. | | |
| | [`# gazelle:python_test_file_pattern`](#directive-python_test_file_pattern) | `*_test.py,test_*.py` | |
| | Filenames matching these comma-separated `glob`s will be mapped to `py_test` targets. | |
| | `# gazelle:python_label_convention` | `$distribution_name$` | |
| | Defines the format of the distribution name in labels to third-party deps. Useful for using Gazelle plugin with other rules with different repository conventions (e.g. `rules_pycross`). Full label is always prepended with (pip) repository name, e.g. `@pip//numpy`. | |
| | `# gazelle:python_label_normalization` | `snake_case` | |
| | Controls how distribution names in labels to third-party deps are normalized. Useful for using Gazelle plugin with other rules with different label conventions (e.g. `rules_pycross` uses PEP-503). Can be "snake_case", "none", or "pep503". | |
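Two of the string transformations in the table can be sketched in a few lines of Python. This is a hedged illustration, not the plugin's Go implementation: the helper names are hypothetical, the `pep503` branch follows the normalization rule from PEP 503, and the `snake_case` branch is an assumption about what that mode does.

```python
import re


def apply_naming_convention(convention, package_name):
    """Replace the $package_name$ placeholder with the Bazel package name."""
    return convention.replace("$package_name$", package_name)


def normalize_distribution(name, mode="snake_case"):
    """Normalize a distribution name for use in a third-party label."""
    if mode == "none":
        return name
    if mode == "pep503":
        # PEP 503: lowercase, runs of "-", "_" and "." become a single "-".
        return re.sub(r"[-_.]+", "-", name).lower()
    if mode == "snake_case":
        # Assumption: lowercase, with "-" and "." replaced by "_".
        return re.sub(r"[-.]", "_", name).lower()
    raise ValueError(f"unknown normalization mode: {mode}")
```

For example, `apply_naming_convention("$package_name$_my_lib", "foo")` gives `foo_my_lib`, matching the example in the `python_library_naming_convention` row.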
| |
| #### Directive: `python_root` |
| |
| Set this directive within the Bazel package that you want to use as the Python root. |
| For example, if using a `src` dir (as recommended by the [Python Packaging User |
| Guide][python-packaging-user-guide]), then set this directive in `src/BUILD.bazel`: |
| |
| ```starlark |
| # ./src/BUILD.bazel |
| # Tell gazelle that our python root is the same dir as this Bazel package. |
| # gazelle:python_root |
| ``` |
| |
| Note that the directive does not have any arguments. |
| |
| Gazelle will then add the necessary `imports` attribute to all targets that it |
| generates: |
| |
| ```starlark |
| # in ./src/foo/BUILD.bazel |
| py_library( |
| ... |
| imports = [".."], # Gazelle adds this |
| ... |
| ) |
| |
| # in ./src/foo/bar/BUILD.bazel |
| py_library( |
| ... |
| imports = ["../.."], # Gazelle adds this |
| ... |
| ) |
| ``` |
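The `imports` values in the example above are the relative path from each Bazel package back to the Python root. A minimal Python sketch of that relationship follows; the `imports_for` helper is hypothetical and is not how the Go plugin is implemented.

```python
import posixpath


def imports_for(package, python_root):
    """Relative path from a Bazel package to the Python root, POSIX style."""
    return posixpath.relpath(python_root, start=package)
```

With the root set in `src`, `imports_for("src/foo", "src")` gives `".."` and `imports_for("src/foo/bar", "src")` gives `"../.."`, the same values shown in the generated targets.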
| |
| [python-packaging-user-guide]: https://github.com/pypa/packaging.python.org/blob/4c86169a/source/tutorials/packaging-projects.rst |
| |
| |
| #### Directive: `python_default_visibility` |
| |
| Instructs gazelle to use these visibility labels on all _python_ targets |
| (typically `py_*`, but can be modified via the `map_kind` directive). The arg |
| to this directive is a comma-separated list (without spaces) of labels. |
| |
| For example: |
| |
| ```starlark |
| # gazelle:python_default_visibility //:__subpackages__,//tests:__subpackages__ |
| ``` |
| |
| produces the following visibility attribute: |
| |
| ```starlark |
| py_library( |
| ..., |
| visibility = [ |
| "//:__subpackages__", |
| "//tests:__subpackages__", |
| ], |
| ..., |
| ) |
| ``` |
| |
| You can also inject the `python_root` value by using the exact string |
| `$python_root$`. All instances of this string will be replaced by the `python_root` |
| value. |
| |
| ```starlark |
| # gazelle:python_default_visibility //$python_root$:__pkg__,//foo/$python_root$/tests:__subpackages__ |
| |
| # Assuming the "# gazelle:python_root" directive is set in ./py/src/BUILD.bazel, |
| # the results will be: |
| py_library( |
| ..., |
| visibility = [ |
| "//foo/py/src/tests:__subpackages__", # sorted alphabetically |
| "//py/src:__pkg__", |
| ], |
| ..., |
| ) |
| ``` |
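The placeholder substitution and alphabetical ordering shown above can be sketched as follows. This Python snippet only mirrors the behavior described in this section; the `default_visibility` helper is hypothetical.

```python
def default_visibility(directive_value, python_root):
    """Expand $python_root$ in each label and return the labels sorted."""
    labels = directive_value.split(",")  # comma-separated, without spaces
    return sorted(label.replace("$python_root$", python_root) for label in labels)
```

When the Python root is the repository root (an empty string in this sketch), the default value `//$python_root$:__subpackages__` expands to `//:__subpackages__`.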
| |
| Two special values are also accepted as an argument to the directive: |
| |
| + `NONE`: This removes all default visibility. Labels added by the |
| `python_visibility` directive are still included. |
| + `DEFAULT`: This resets the default visibility. |
| |
| For example: |
| |
| ```starlark |
| # gazelle:python_default_visibility NONE |
| |
| py_library( |
| name = "...", |
| srcs = [...], |
| ) |
| ``` |
| |
| ```starlark |
| # gazelle:python_default_visibility //foo:bar |
| # gazelle:python_default_visibility DEFAULT |
| |
| py_library( |
| ..., |
| visibility = ["//:__subpackages__"], |
| ..., |
| ) |
| ``` |
| |
| These special values can be useful for sub-packages. |
| |
| |
| #### Directive: `python_visibility` |
| |
| Appends additional `visibility` labels to each generated target. |
| |
| This directive can be set multiple times. The generated `visibility` attribute |
| will include the default visibility and all labels defined by this directive. |
| All labels will be ordered alphabetically. |
| |
| ```starlark |
| # ./BUILD.bazel |
| # gazelle:python_visibility //tests:__pkg__ |
| # gazelle:python_visibility //bar:baz |
| |
| py_library( |
| ... |
| visibility = [ |
| "//:__subpackages__", # default visibility |
| "//bar:baz", |
| "//tests:__pkg__", |
| ], |
| ... |
| ) |
| ``` |
| |
| Child Bazel packages inherit values from parents: |
| |
| ```starlark |
| # ./bar/BUILD.bazel |
| # gazelle:python_visibility //tests:__subpackages__ |
| |
| py_library( |
| ... |
| visibility = [ |
| "//:__subpackages__", # default visibility |
| "//bar:baz", # defined in ../BUILD.bazel |
| "//tests:__pkg__", # defined in ../BUILD.bazel |
| "//tests:__subpackages__", # defined in this ./BUILD.bazel |
| ], |
| ... |
| ) |
| |
| ``` |
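The inheritance rule above amounts to a set union followed by a sort. A minimal Python sketch, assuming each package level contributes a list of labels (the `merged_visibility` helper is hypothetical):

```python
def merged_visibility(default_labels, *directive_levels):
    """Combine default visibility with labels from parent and child packages."""
    labels = set(default_labels)
    for level in directive_levels:  # ordered from the root package downward
        labels.update(level)
    return sorted(labels)
```

Calling it with the default visibility, the two labels from `./BUILD.bazel`, and the one label from `./bar/BUILD.bazel` reproduces the four-entry list in the example above.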
| |
| This directive also supports the `$python_root$` placeholder that |
| `# gazelle:python_default_visibility` supports. |
| |
| ```starlark |
| # gazelle:python_visibility //$python_root$/foo:bar |
| |
| py_library( |
| ... |
| visibility = ["//this_is_my_python_root/foo:bar"], |
| ... |
| ) |
| ``` |
| |
| |
| #### Directive: `python_test_file_pattern` |
| |
| This directive adjusts which python files will be mapped to the `py_test` rule. |
| |
| + The default is `*_test.py,test_*.py`: both `test_*.py` and `*_test.py` files |
| will generate `py_test` targets. |
| + This directive must have a value. If no value is given, an error will be raised. |
| + It is recommended, though not necessary, to include the `.py` extension in |
| the `glob`s: `foo*.py,?at.py`. |
| + Like most directives, it applies to the current Bazel package and all subpackages |
| until the directive is set again. |
| + This directive accepts multiple `glob` patterns, separated by commas without spaces: |
| |
| ```starlark |
| # gazelle:python_test_file_pattern foo*.py,?at |
| |
| py_library( |
| name = "mylib", |
| srcs = ["mylib.py"], |
| ) |
| |
| py_test( |
| name = "foo_bar", |
| srcs = ["foo_bar.py"], |
| ) |
| |
| py_test( |
| name = "cat", |
| srcs = ["cat.py"], |
| ) |
| |
| py_test( |
| name = "hat", |
| srcs = ["hat.py"], |
| ) |
| ``` |
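To check which filenames a set of patterns would select, the matching can be approximated with Python's `fnmatch` module. This is a sketch under the assumption that the plugin's glob semantics agree with `fnmatch` for simple patterns like these; the `is_test_file` helper is hypothetical.

```python
from fnmatch import fnmatchcase

# The documented default value of python_test_file_pattern.
DEFAULT_PATTERNS = "*_test.py,test_*.py"


def is_test_file(filename, patterns=DEFAULT_PATTERNS):
    """True if the filename matches any of the comma-separated glob patterns."""
    return any(fnmatchcase(filename, pattern) for pattern in patterns.split(","))
```

With the patterns `foo*.py,?at.py`, this sketch selects `foo_bar.py`, `cat.py`, and `hat.py` but not `mylib.py`, in line with the targets shown above.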
| |
| |
| ##### Notes |
| |
| Resetting to the default value (such as in a subpackage) is manual. Set: |
| |
| ```starlark |
| # gazelle:python_test_file_pattern *_test.py,test_*.py |
| ``` |
| |
| There currently is no way to tell gazelle that _no_ files in a package should |
| be mapped to `py_test` targets (see [Issue #1826][issue-1826]). The workaround |
| is to set this directive to a pattern that will never match a `.py` file, such |
| as `foo.bar`: |
| |
| ```starlark |
| # No files in this package should be mapped to py_test targets. |
| # gazelle:python_test_file_pattern foo.bar |
| |
| py_library( |
| name = "my_test", |
| srcs = ["my_test.py"], |
| ) |
| ``` |
| |
| [issue-1826]: https://github.com/bazelbuild/rules_python/issues/1826 |
| |
| |
| ### Annotations |
| |
| *Annotations* refer to comments found _within Python files_ that configure how |
| Gazelle acts for that particular file. |
| |
| Annotations have the form: |
| |
| ```python |
| # gazelle:annotation_name value |
| ``` |
| |
| and can reside anywhere within a Python file where comments are valid. For example: |
| |
| ```python |
| import foo |
| # gazelle:annotation_name value |
| |
| def bar(): # gazelle:annotation_name value |
| pass |
| ``` |
| |
| The annotations are: |
| |
| | **Annotation** | **Default value** | |
| |---------------------------------------------------------------|-------------------| |
| | [`# gazelle:ignore imports`](#annotation-ignore) | N/A | |
| | Tells Gazelle to ignore import statements. `imports` is a comma-separated list of imports to ignore. | | |
| | [`# gazelle:include_dep targets`](#annotation-include_dep) | N/A | |
| | Tells Gazelle to include a set of dependencies, even if they are not imported in a Python module. `targets` is a comma-separated list of target names to include as dependencies. | | |
| |
| |
| #### Annotation: `ignore` |
| |
| This annotation accepts a comma-separated string of values. Values are names of Python |
| imports that Gazelle should _not_ include in target dependencies. |
| |
| The annotation can be added multiple times, and all values are combined and |
| de-duplicated. |
| |
| For `python_generation_mode = "package"`, the `ignore` annotations |
| found across all files included in the generated target are removed from `deps`. |
| |
| Example: |
| |
| ```python |
| import numpy # a pypi package |
| |
| # gazelle:ignore bar.baz.hello,foo |
| import bar.baz.hello |
| import foo |
| |
| # Ignore this import because _reasons_ |
| import baz # gazelle:ignore baz |
| ``` |
| |
| will cause Gazelle to generate: |
| |
| ```starlark |
| deps = ["@pypi//numpy"], |
| ``` |
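The effect of the `ignore` annotation can be sketched with the standard library: collect the annotated names from comments, collect the imported modules, and drop the ignored ones. This is an illustrative Python sketch, not the plugin's parser; the helper names are hypothetical.

```python
import ast
import io
import tokenize

PREFIX = "# gazelle:ignore "


def ignored_imports(source):
    """Collect names listed in all `# gazelle:ignore` comments, de-duplicated."""
    ignored = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and tok.string.startswith(PREFIX):
            values = tok.string[len(PREFIX):].split(",")
            ignored.update(v.strip() for v in values if v.strip())
    return ignored


def imported_modules(source):
    """List module names from `import x` and `from x import y` statements."""
    modules = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.append(node.module)
    return modules


def imports_to_resolve(source):
    """Imports that remain after removing the ignored ones, sorted."""
    ignored = ignored_imports(source)
    return sorted(m for m in imported_modules(source) if m not in ignored)
```

Run on the example file above, only `numpy` remains, which is why the generated `deps` contains a single entry.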
| |
| |
| #### Annotation: `include_dep` |
| |
| This annotation accepts a comma-separated string of values. Values _must_ |
| be Python targets, but _no validation is done_. If a value is not a Python |
| target, building will result in an error saying: |
| |
| ``` |
| <target> does not have mandatory providers: 'PyInfo' or 'CcInfo' or 'PyInfo'. |
| ``` |
| |
| Adding non-Python targets to the generated target is a feature request being |
| tracked in [Issue #1865](https://github.com/bazelbuild/rules_python/issues/1865). |
| |
| The annotation can be added multiple times, and all values are combined |
| and de-duplicated. |
| |
| For `python_generation_mode = "package"`, the `include_dep` annotations |
| found across all files included in the generated target are included in `deps`. |
| |
| Example: |
| |
| ```python |
| # gazelle:include_dep //foo:bar,:hello_world,//:abc |
| # gazelle:include_dep //:def,//foo:bar |
| import numpy # a pypi package |
| ``` |
| |
| will cause Gazelle to generate: |
| |
| ```starlark |
| deps = [ |
| ":hello_world", |
| "//:abc", |
| "//:def", |
| "//foo:bar", |
| "@pypi//numpy", |
| ] |
| ``` |
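The "combined and de-duplicated" behavior can be sketched in Python as below. The helpers are hypothetical, and the sort key is an assumption that imitates the ordering in the example output (same-package labels, then `//` labels, then external `@` labels).

```python
PREFIX = "# gazelle:include_dep "


def label_sort_key(label):
    # Assumption: same-package labels first, then "//" labels, then "@" labels.
    if label.startswith(":"):
        rank = 0
    elif label.startswith("//"):
        rank = 1
    else:
        rank = 2
    return (rank, label)


def included_deps(source):
    """Collect labels from all include_dep annotations, de-duplicated and sorted."""
    deps = set()
    for line in source.splitlines():
        _, found, values = line.partition(PREFIX)
        if found:
            deps.update(v.strip() for v in values.split(",") if v.strip())
    return sorted(deps, key=label_sort_key)
```

For the example file above, this yields the four annotated labels with `//foo:bar` listed once; the `@pypi//numpy` entry in the generated `deps` comes from the `import numpy` statement, not from the annotation.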
| |
| |
| ### Libraries |
| |
| Python source files are those ending in `.py` that are not mapped to `py_test` targets (by default, files that do not match `*_test.py` or `test_*.py`). |
| |
| First, we look for the nearest ancestor BUILD file starting from the folder |
| containing the Python source file. |
| |
| In package generation mode, if there is no `py_library` in this BUILD file, one |
| is created using the package name as the target's name. This makes it the |
| default target in the package. Next, all source files are collected into the |
| `srcs` of the `py_library`. |
| |
| In project generation mode, all source files in subdirectories (that don't have |
| BUILD files) are also collected. |
| |
| In file generation mode, each file is given its own target. |
| |
| Finally, the `import` statements in the source files are parsed, and |
| dependencies are added to the `deps` attribute. |
| |
| ### Unit Tests |
| |
| A `py_test` target is added to the BUILD file when gazelle encounters |
| a file named `__test__.py` or a file that matches the `python_test_file_pattern` |
| directive (by default `*_test.py` and `test_*.py`). |
| For example, a package folder named `foo` could contain a Python file named `foo_test.py`, |
| and gazelle would create a `py_test` target for that file. |
| |
| The following is an example of a `py_test` target that gazelle would add when |
| it encounters a file named `__test__.py`. |
| |
| ```starlark |
| py_test( |
| name = "build_file_generation_test", |
| srcs = ["__test__.py"], |
| main = "__test__.py", |
| deps = [":build_file_generation"], |
| ) |
| ``` |
| |
| You can control the naming convention for test targets by adding a gazelle directive named |
| `# gazelle:python_test_naming_convention`. See the instructions in the section above that |
| covers directives. |
| |
| ### Binaries |
| |
| When a `__main__.py` file is encountered, this indicates the entry point |
| of a Python program. A `py_binary` target will be created, named `[package]_bin`. |
| |
| When no such entry point exists, Gazelle will look for a line like this at the top level of every module: |
| |
| ```python |
| if __name__ == "__main__": |
| ``` |
| |
| Gazelle will create a `py_binary` target for every module with such a line, with |
| the target name the same as the module name. |
| |
| If `python_generation_mode` is set to `file`, then instead of one `py_binary` |
| target per module, Gazelle will create one `py_binary` target for each file with |
| such a line, and the name of the target will match the name of the script. |
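To check whether one of your own modules would count as having such an entry point, a top-level scan with the `ast` module is a reasonable approximation. This Python sketch is an assumption about the shape of the check, not the plugin's actual detection logic; the `has_main_guard` helper is hypothetical.

```python
import ast


def has_main_guard(source):
    """True if the module has a top-level `if __name__ == "__main__":` block."""
    for node in ast.parse(source).body:  # top-level statements only
        if not isinstance(node, ast.If):
            continue
        test = node.test
        if (
            isinstance(test, ast.Compare)
            and isinstance(test.left, ast.Name)
            and test.left.id == "__name__"
            and len(test.ops) == 1
            and isinstance(test.ops[0], ast.Eq)
            and len(test.comparators) == 1
            and isinstance(test.comparators[0], ast.Constant)
            and test.comparators[0].value == "__main__"
        ):
            return True
    return False
```

In this sketch, a guard nested inside a function does not count, because only the module's top-level statements are inspected.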
| |
| Note that it's possible for another script to depend on a `py_binary` target and |
| import from the `py_binary`'s scripts. This can have possible negative effects on |
| Bazel analysis time and runfiles size compared to depending on a `py_library` |
| target. The simplest way to avoid these negative effects is to extract library |
| code into a separate script without a `main` line. Gazelle will then create a |
| `py_library` target for that library code, and other scripts can depend on that |
| `py_library` target. |
| |
| ## Developer Notes |
| |
| Gazelle extensions are written in Go. This gazelle plugin is a hybrid, as it uses Go to execute a |
| Python interpreter as a subprocess to parse Python source files. |
| See the gazelle documentation https://github.com/bazelbuild/bazel-gazelle/blob/master/extend.md |
| for more information on extending Gazelle. |
| |
| If you add new Go dependencies to the plugin source code, you need to "tidy" the go.mod file. |
| After changing that file, run `go mod tidy` or `bazel run @go_sdk//:bin/go -- mod tidy` |
| to update the go.mod and go.sum files. Then run `bazel run //:gazelle_update_repos` to have gazelle |
| add the new dependencies to the deps.bzl file. Our /WORKSPACE loads deps.bzl |
| to declare the external repos from which Bazel fetches Go dependencies. |
| |
| Then after editing Go code, run `bazel run //:gazelle` to generate/update the rules in the |
| BUILD.bazel files in our repo. |