| # Python Gazelle plugin |
| |
| [Gazelle](https://github.com/bazelbuild/bazel-gazelle) |
| is a build file generator for Bazel projects. It can create new BUILD.bazel files for a project that follows language conventions, and it can update existing build files to include new sources, dependencies, and options. |
| |
| Gazelle may be run by Bazel using the gazelle rule, or it may be installed and run as a command line tool. |
| |
| This directory contains a plugin for |
| [Gazelle](https://github.com/bazelbuild/bazel-gazelle) |
| that generates BUILD file content for Python code. When Gazelle is run as a command line tool with this plugin, it embeds a Python interpreter resolved during the plugin build. |
| The behavior of the plugin differs slightly between interpreter versions, because the Python `stdlib` changes with every minor version release. |
| Distributors of Gazelle binaries should therefore build a Gazelle binary for each combination of OS, CPU architecture, and minor Python version they are targeting. |
| |
| The following instructions assume you use [bzlmod](https://docs.bazel.build/versions/5.0.0/bzlmod.html). |
| If you do not use bzlmod as your dependency manager, please refer to older documentation |
| for instructions on how to use Gazelle. |
| |
| ## Example |
| |
| We have an example of using Gazelle with Python located [here](https://github.com/bazelbuild/rules_python/tree/main/examples/bzlmod). |
| A fully-working example without using bzlmod is in [`examples/build_file_generation`](../examples/build_file_generation). |
| |
| The following documentation covers using bzlmod. |
| |
| ## Adding Gazelle to your project |
| |
| First, you'll need to add Gazelle to your `MODULE.bazel` file. |
| Get the current version of Gazelle from its releases page: https://github.com/bazelbuild/bazel-gazelle/releases/. |
|  |
| To configure rules_python, see the installation `MODULE.bazel` snippet on the rules_python |
| Releases page (https://github.com/bazelbuild/rules_python/releases). |
|  |
| You will also need to add a `bazel_dep` for `rules_python_gazelle_plugin`. |
| |
| Here is a snippet of a `MODULE.bazel` file. |
| |
| ```starlark |
| # The following stanza defines the dependency rules_python. |
| bazel_dep(name = "rules_python", version = "0.22.0") |
| |
| # The following stanza defines the dependency rules_python_gazelle_plugin. |
| # For typical setups you set the version. |
| bazel_dep(name = "rules_python_gazelle_plugin", version = "0.22.0") |
| |
| # The following stanza defines the dependency gazelle. |
| bazel_dep(name = "gazelle", version = "0.31.0", repo_name = "bazel_gazelle") |
| |
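| # Assumption: the `python` extension has to be loaded and a 3.9 toolchain declared |
| # before `use_repo` can refer to it. Attribute names may differ between rules_python versions. |
| python = use_extension("@rules_python//python:extensions.bzl", "python") |
| python.toolchain( |
|     name = "python3_9", |
|     python_version = "3.9", |
| ) |
|  |
| # Import the python repositories generated by the given module extension into the scope of the current module. |
| use_repo(python, "python3_9") |
| use_repo(python, "python3_9_toolchains") |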
| |
| # Register an already-defined toolchain so that Bazel can use it during toolchain resolution. |
| register_toolchains( |
|     "@python3_9_toolchains//:all", |
| ) |
| |
| # Use the pip extension |
| pip = use_extension("@rules_python//python:extensions.bzl", "pip") |
| |
| # Use the extension to call the `pip_repository` rule that invokes `pip`, with `incremental` set. |
| # Accepts a locked/compiled requirements file and installs the dependencies listed within. |
| # Those dependencies become available in a generated `requirements.bzl` file. |
| # You can instead check this `requirements.bzl` file into your repo. |
| # Because this project has different requirements for windows vs other |
| # operating systems, we have requirements for each. |
| pip.parse( |
|     name = "pip", |
|     requirements_lock = "//:requirements_lock.txt", |
|     requirements_windows = "//:requirements_windows.txt", |
| ) |
| |
| # Imports the pip toolchain generated by the given module extension into the scope of the current module. |
| use_repo(pip, "pip") |
| ``` |
|  |
| Next, we'll fetch metadata about your Python dependencies, so that Gazelle can |
| determine which package a given import statement comes from. This is provided |
| by the `modules_mapping` rule. We'll make a target that consumes this |
| `modules_mapping` and writes it as a manifest file for Gazelle to read. |
| The manifest is checked into the repo for speed, as it takes some time to calculate |
| in a large monorepo. |
| |
| Gazelle will walk up the filesystem from a Python file to find this metadata, |
| looking for a file called `gazelle_python.yaml` in an ancestor folder of the Python code. |
| Create an empty file with this name. It might be next to your `requirements.txt` file. |
| (You can just use `touch` at this point; the file only needs to exist.) |
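
The upward search described above can be pictured with a small Python sketch. This is only an illustration of the lookup idea, not the plugin's actual (Go) implementation: the function name `find_manifest` is hypothetical, and the sketch walks all the way to the filesystem root, whereas Gazelle itself works within the Bazel workspace.

```python
from pathlib import Path
from typing import Optional


def find_manifest(source_file: Path, manifest_name: str = "gazelle_python.yaml") -> Optional[Path]:
    """Return the nearest manifest in the source file's folder or an ancestor folder.

    Hypothetical helper for illustration only; the real plugin is written in Go.
    """
    start = source_file.resolve().parent
    # Check the file's own folder first, then each ancestor in turn.
    for folder in (start, *start.parents):
        candidate = folder / manifest_name
        if candidate.is_file():
            return candidate
    return None
```

For a file such as `a/b/mod.py`, the sketch checks `a/b/`, then `a/`, and so on, and returns the first manifest it finds, or `None` if there is none.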
| |
| To keep the metadata updated, put this in your `BUILD.bazel` file next to `gazelle_python.yaml`: |
| |
| ```starlark |
| load("@pip//:requirements.bzl", "all_whl_requirements") |
| load("@rules_python_gazelle_plugin//manifest:defs.bzl", "gazelle_python_manifest") |
| load("@rules_python_gazelle_plugin//modules_mapping:def.bzl", "modules_mapping") |
| |
| # This rule fetches the metadata for python packages we depend on. That data is |
| # required for the gazelle_python_manifest rule to update our manifest file. |
| modules_mapping( |
|     name = "modules_map", |
|     wheels = all_whl_requirements, |
| ) |
| |
| # Gazelle python extension needs a manifest file mapping from |
| # an import to the installed package that provides it. |
| # This macro produces two targets: |
| # - //:gazelle_python_manifest.update can be used with `bazel run` |
| # to recalculate the manifest |
| # - //:gazelle_python_manifest.test is a test target ensuring that |
| # the manifest doesn't need to be updated |
| gazelle_python_manifest( |
|     name = "gazelle_python_manifest", |
|     modules_mapping = ":modules_map", |
|     # This is the name we gave to `pip.parse` in MODULE.bazel, the repository |
|     # from which third-party python libraries are loaded in BUILD files. |
|     pip_repository_name = "pip", |
|     # This should point to wherever we declare our python dependencies |
|     # (the same file we passed as `requirements_lock` to `pip.parse` in MODULE.bazel) |
|     requirements = "//:requirements_lock.txt", |
| ) |
| ``` |
| |
| Finally, create a target that you'll invoke to run the Gazelle tool |
| with the rules_python extension included. This typically goes in your root |
| `/BUILD.bazel` file: |
| |
| ```starlark |
| load("@bazel_gazelle//:def.bzl", "gazelle") |
| |
| # Our gazelle target points to the python gazelle binary. |
| # This is the simple case where we only need one language supported. |
| # If you also had proto, go, or other gazelle-supported languages, |
| # you would also need a gazelle_binary rule. |
| # See https://github.com/bazelbuild/bazel-gazelle/blob/master/extend.rst#example |
| gazelle( |
|     name = "gazelle", |
|     gazelle = "@rules_python_gazelle_plugin//python:gazelle_binary", |
| ) |
| ``` |
| |
| That's it. You can now run `bazel run //:gazelle` whenever |
| you edit Python code, and it should update your `BUILD` files correctly. |
| |
| ## Usage |
| |
| Gazelle is non-destructive. |
| It will try to leave your edits to BUILD files alone, only making updates to `py_*` targets. |
| However, it will remove dependencies that appear to be unused, so it's a |
| good idea to check in your work before running Gazelle so that you can easily |
| revert any changes it makes. |
| |
| The rules_python extension assumes some conventions about your Python code. |
| These are noted below, and might require changes to your existing code. |
| |
| Note that the `gazelle` program has multiple commands. At present, only the `update` command (the default) does anything for Python code. |
| |
| ### Directives |
| |
| You can configure the extension using directives, just like for other |
| languages. These are just comments in the `BUILD.bazel` file which |
| govern behavior of the extension when processing files under that |
| folder. |
| |
| See https://github.com/bazelbuild/bazel-gazelle#directives |
| for some general directives that may be useful. |
| In particular, the `resolve` directive is language-specific |
| and can be used with Python. |
| Examples of these directives in use can be found in the |
| /gazelle/testdata folder in the rules_python repo. |
| |
| Python-specific directives are as follows: |
| |
| | **Directive** | **Default value** | |
| |--------------------------------------|-------------------| |
| | `# gazelle:python_extension` | `enabled` | |
| | Controls whether the Python extension is enabled or not. Sub-packages inherit this value. Can be either "enabled" or "disabled". | | |
| | `# gazelle:python_root` | n/a | |
| | Sets a Bazel package as a Python root. This is used on monorepos with multiple Python projects that don't share the top-level of the workspace as the root. | | |
| | `# gazelle:python_manifest_file_name`| `gazelle_python.yaml` | |
| | Overrides the default manifest file name. | | |
| | `# gazelle:python_ignore_files` | n/a | |
| | Controls the files which are ignored from the generated targets. | | |
| | `# gazelle:python_ignore_dependencies`| n/a | |
| | Controls the ignored dependencies from the generated targets. | | |
| | `# gazelle:python_validate_import_statements`| `true` | |
| | Controls whether the Python import statements should be validated. Can be "true" or "false". | | |
| | `# gazelle:python_generation_mode`| `package` | |
| | Controls the target generation mode. Can be "file", "package", or "project". | | |
| | `# gazelle:python_generation_mode_per_file_include_init`| `false` | |
| | Controls whether `__init__.py` files are included as srcs in each generated target when target generation mode is "file". Can be "true" or "false". | | |
| | `# gazelle:python_library_naming_convention`| `$package_name$` | |
| | Controls the `py_library` naming convention. It interpolates \$package_name\$ with the Bazel package name. E.g. if the Bazel package name is `foo`, setting this to `$package_name$_my_lib` would result in a generated target named `foo_my_lib`. | | |
| | `# gazelle:python_binary_naming_convention` | `$package_name$_bin` | |
| | Controls the `py_binary` naming convention. Follows the same interpolation rules as `python_library_naming_convention`. | | |
| | `# gazelle:python_test_naming_convention` | `$package_name$_test` | |
| | Controls the `py_test` naming convention. Follows the same interpolation rules as `python_library_naming_convention`. | | |
| | `# gazelle:resolve py ...` | n/a | |
| | Instructs the plugin what target to add as a dependency to satisfy a given import statement. The syntax is `# gazelle:resolve py import-string label` where `import-string` is the symbol in the python `import` statement, and `label` is the Bazel label that Gazelle should write in `deps`. | | |
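
To make the `$package_name$` interpolation used by the naming-convention directives concrete, here is a minimal Python sketch. It assumes the interpolation is a plain textual substitution, as the table describes; `render_target_name` and `DEFAULT_CONVENTIONS` are hypothetical names for illustration and are not part of the plugin.

```python
PACKAGE_NAME_PLACEHOLDER = "$package_name$"

# Default conventions as listed in the directive table above.
DEFAULT_CONVENTIONS = {
    "py_library": "$package_name$",
    "py_binary": "$package_name$_bin",
    "py_test": "$package_name$_test",
}


def render_target_name(convention: str, package_name: str) -> str:
    """Substitute the Bazel package name into a naming convention string."""
    return convention.replace(PACKAGE_NAME_PLACEHOLDER, package_name)


if __name__ == "__main__":
    # Mirrors the example from the table: package `foo` with a custom convention.
    print(render_target_name("$package_name$_my_lib", "foo"))
```

With the defaults, a package named `foo` would get targets named `foo`, `foo_bin`, and `foo_test`.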
| |
| ### Libraries |
| |
| Python source files are those ending in `.py` but not ending in `_test.py`. |
| |
| First, we look for the nearest ancestor BUILD file starting from the folder |
| containing the Python source file. |
| |
| In package generation mode, if there is no `py_library` in this BUILD file, one |
| is created using the package name as the target's name. This makes it the |
| default target in the package. Next, all source files are collected into the |
| `srcs` of the `py_library`. |
| |
| In project generation mode, all source files in subdirectories (that don't have |
| BUILD files) are also collected. |
| |
| In file generation mode, each file is given its own target. |
| |
| Finally, the `import` statements in the source files are parsed, and |
| dependencies are added to the `deps` attribute. |
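
The import-parsing step can be sketched in Python with the standard `ast` module. This is an illustration under assumptions, not the plugin's own parser: `imported_modules` is a hypothetical helper, relative imports are simply skipped here, and the later step of mapping each module name to a Bazel label (via the manifest or `resolve` directives) is not shown.

```python
import ast
from typing import Set


def imported_modules(source: str) -> Set[str]:
    """Collect the absolute module names imported anywhere in a Python source string."""
    modules: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import a.b as c` contributes "a.b".
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # `from a.b import c` contributes "a.b"; relative imports are skipped.
            modules.add(node.module)
    return modules
```

Each collected name would then need to be matched either to a first-party target or to a third-party package listed in `gazelle_python.yaml`.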
| |
| ### Unit Tests |
| |
| A `py_test` target is added to the BUILD file when Gazelle encounters |
| a file named `__test__.py`. |
| Often, Python unit test files are named with the suffix `_test`. |
| For example, in a folder that is a package named "foo", we could have a Python file named `foo_test.py`, |
| and Gazelle would create a `py_test` target for that file. |
|  |
| The following is an example of a `py_test` target that Gazelle would add when |
| it encounters a file named `__test__.py`. |
| |
| ```starlark |
| py_test( |
|     name = "build_file_generation_test", |
|     srcs = ["__test__.py"], |
|     main = "__test__.py", |
|     deps = [":build_file_generation"], |
| ) |
| ``` |
| |
| You can control the naming convention for test targets by adding the Gazelle directive |
| `# gazelle:python_test_naming_convention`. See the Directives section above for details. |
| |
| ### Binaries |
| |
| When a `__main__.py` file is encountered, this indicates the entry point |
| of a Python program. A `py_binary` target will be created, named `[package]_bin`. |
| |
| When no such entry point exists, Gazelle will look for a line like this at the top level of every module: |
| |
| ```python |
| if __name__ == "__main__": |
| ``` |
| |
| Gazelle will create a `py_binary` target for every module with such a line, with |
| the target name the same as the module name. |
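
As a rough illustration of what detecting such a line at the top level involves, the following Python sketch uses the standard `ast` module. It is an assumption-laden sketch rather than the plugin's actual detection logic: `has_main_guard` is a hypothetical name, and only the exact form `__name__ == "__main__"` is recognized.

```python
import ast


def has_main_guard(source: str) -> bool:
    """Return True if the module has a top-level `if __name__ == "__main__":` statement."""
    # Only inspect statements directly in the module body, not nested ones.
    for node in ast.parse(source).body:
        if not isinstance(node, ast.If):
            continue
        test = node.test
        if (
            isinstance(test, ast.Compare)
            and isinstance(test.left, ast.Name)
            and test.left.id == "__name__"
            and len(test.ops) == 1
            and isinstance(test.ops[0], ast.Eq)
            and len(test.comparators) == 1
            and isinstance(test.comparators[0], ast.Constant)
            and test.comparators[0].value == "__main__"
        ):
            return True
    return False
```

A guard nested inside a function or class is deliberately ignored, since only the top level of the module is considered.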
| |
| If `python_generation_mode` is set to `file`, then instead of one `py_binary` |
| target per module, Gazelle will create one `py_binary` target for each file with |
| such a line, and the name of the target will match the name of the script. |
| |
| Note that it's possible for another script to depend on a `py_binary` target and |
| import from the `py_binary`'s scripts. Compared to depending on a `py_library` |
| target, this can increase Bazel analysis time and runfiles size. |
| The simplest way to avoid these effects is to extract library |
| code into a separate script without a `main` line. Gazelle will then create a |
| `py_library` target for that library code, and other scripts can depend on that |
| `py_library` target. |
| |
| ## Developer Notes |
| |
| Gazelle extensions are written in Go. This Gazelle plugin is a hybrid, as it uses Go to execute a |
| Python interpreter as a subprocess to parse Python source files. |
| See the Gazelle documentation at https://github.com/bazelbuild/bazel-gazelle/blob/master/extend.md |
| for more information on extending Gazelle. |
| |
| If you add new Go dependencies to the plugin source code, you need to "tidy" the go.mod file. |
| After adding the dependency, run `go mod tidy` or `bazel run @go_sdk//:bin/go -- mod tidy` |
| to update the go.mod and go.sum files. Then run `bazel run //:update_go_deps` to have Gazelle |
| add the new dependencies to the deps.bzl file. Our /WORKSPACE uses deps.bzl |
| to declare the external repos from which Bazel loads Go dependencies. |
| |
| After editing Go code, run `bazel run //:gazelle` to generate or update the rules in the |
| BUILD.bazel files in our repo. |