| # Design: Heuristic Board Discovery and Filtering |
| |
| ## Background |
| |
| ### Desired User Experience |
| The goal of the Zephyr-Bazel integration is to provide a seamless developer |
| experience that mirrors the flexibility of the native Zephyr build system while |
| leveraging Bazel's correctness and speed. Ideally, a user should be able to add |
| a new application or a new board to the workspace, and Bazel should |
| automatically discover the new combinations without requiring manual |
| registration in `MODULE.bazel` or other configuration files. Developers expect |
| to build any app for any board using a simple command like `bazel build |
| //apps/blinky --platforms=//boards/my_board`. |
| |
| ### Limitations of Bazel and Bzlmod |
| While the current design achieves this automatic discovery by eagerly declaring |
| repositories for the Cartesian product of all apps and boards in a module |
| extension, it hits a hard scaling limit due to Bazel's Bzlmod lockfile |
| implementation: |
| |
| 1. **Lockfile Bloat**: Module extensions must declare all repositories they |
| produce during the loading phase. |
| For $N$ apps and $M$ boards, this results in $N \times M$ repository |
| declarations. Bazel records all these declarations in the `MODULE.bazel.lock` |
| file. |
| 2. **Performance Degradation**: With a large number of apps and boards, the |
| lockfile grows to tens of megabytes. |
| Reading, parsing, and updating this massive JSON file during the analysis phase |
| causes severe performance degradation, even when the lockfile is only being |
| checked and not updated. |
| 3. **Lazy Execution vs. Eager Declaration**: Bazel's repository rules are |
| executed lazily (only when needed), |
| but they are *declared* eagerly by the extension. The performance penalty is |
| paid in the loading/analysis phase simply by having the repositories registered |
| in the lockfile, regardless of whether they are part of the current build graph. |
| |
| To overcome these limitations, we need a mechanism to prune the number of |
| declared repositories during the loading phase while maintaining as much of the |
| automatic discovery experience as possible. |
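
To make the scaling concrete, the sketch below counts the declarations an eager extension would emit. The app and board counts are hypothetical, chosen only for illustration, and `eager_declarations` is not a function in the codebase.

```python
from itertools import product


def eager_declarations(apps, boards):
    """Return every (app, board) pair an eager extension would declare."""
    return list(product(apps, boards))


# Hypothetical workspace sizes, not measurements from a real tree.
apps = [f"app_{i}" for i in range(50)]
boards = [f"board_{j}" for j in range(600)]

pairs = eager_declarations(apps, boards)
print(len(pairs))  # N * M = 50 * 600 = 30000
```

Every one of these pairs would be recorded in `MODULE.bazel.lock`, whether or not a build ever touches it.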
| |
| ## Repository Structure Alternatives |
| |
| ### One Repository per Application |
| We considered moving from a repository per board-app combination to a repository |
| per app model. Two variants were explored: |
| |
| 1. **Eager Evaluation per App**: The repository rule runs Zephyr scripts for |
| all boards supported by the app. |
| This was not chosen because the time to parse Kconfig for hundreds or thousands |
| of in-tree boards sequentially during the fetch phase is not viable and would |
| cause unacceptable delays. |
| 2. **Lazy Evaluation via Actions**: The repository rule generates targets that |
| use standard Bazel actions to parse |
| configurations lazily. This was not chosen because it causes the loss of |
| `select()` behavior on Kconfig symbols. Losing `select()` is not acceptable |
| because it is the core mechanism for correctly generating the build for all |
| the Zephyr kernel, driver, and subsystem code based on the configuration. |
| |
| ### One Repository for All Applications |
| This option is even worse than the current design. It would require a single |
| repository to handle all applications and boards. This would either force eager |
| evaluation of the entire Cartesian product of apps and boards (which is |
| impossible at scale) or require complex custom rules that still cannot overcome |
| Bazel's limitations regarding dynamic target generation. |
| |
| ### Conclusion on Structure |
| Because we cannot sacrifice lazy evaluation of the matrix, and we cannot lose |
| the ability to use Kconfig symbols in Bazel `select()` statements, the existing |
| system of **one repository per combination** is required. |
| |
| ## Board Down-Selection Alternatives |
| |
| Since the repository-per-combination structure is required, the only way to |
| reduce the lockfile size is to limit the number of combinations declared in the |
| loading phase. We reviewed several methods for down-selecting boards: |
| |
| ### 1. Explicit Allowlist in MODULE.bazel |
| * **Description**: Users manually list active boards in `MODULE.bazel`. |
| * **Pros**: Keeps the lockfile small and predictable. |
| * **Cons**: Violates the "Automatic Discovery" goal and requires manual |
| maintenance. |
| |
| ### 2. Environment Variables |
| * **Description**: The module extension reads an environment variable (e.g., |
| `ZEPHYR_BOARDS`) to filter boards. |
| * **Pros**: Dynamic, good for local development switching. |
| * **Cons**: Changing the variable invalidates the extension and forces |
| lockfile updates. |
| |
| ### 3. Smart Heuristics based on App Structure |
| * **Description**: The extension scans the app's `boards/` directory and only |
| generates repositories for boards |
| that have specific configuration files or overlays. |
| * **Pros**: Maintains automatic discovery while drastically reducing the |
| matrix size. |
| * **Cons**: Skips boards that might work with defaults (requires a fallback |
| mechanism). |
| |
| ## Proposed Solution: Stratified Heuristic Discovery |
| |
| ### Overview |
| The proposed solution adopts a stratified approach to reducing the number of |
| declared repositories. It applies different heuristic rules based on the origin |
| of the assets (In-Tree vs. Out-of-Tree) and the presence of Zephyr metadata |
| files, while maintaining a fallback mechanism. |
| |
| Instead of generating repositories for the full Cartesian product, the module |
| extension applies the following priority-based rules for each `(app, board)` |
| combination: |
| |
| 1. **Out-of-Tree Origin Exemption**: If **both** the application and the board |
| are Out-of-Tree (OOT), the extension |
| generates the repository eagerly. This assumes that the number of custom OOT |
| apps and boards is small, avoiding unnecessary pruning overhead. If either the |
| application or the board is an in-tree asset (mixed cases), the combination |
| follows the heuristic rules below. |
| 2. **Metadata-Driven Filtering (Tests/Samples)**: If the application directory |
| contains a `testcase.yaml` or |
| `sample.yaml` file, the extension parses it. To keep the lockfile from growing |
| with the full Cartesian product, it **only** generates repositories for boards |
| that are explicitly listed in `platform_allow`. If a test relies on broad |
| dynamic constraints (like `arch_allow`, `filter`, `min_flash`) or exclusion |
| (`platform_exclude`), the extension skips Rule 2 and drops down to Rule 3 |
| (Filesystem heuristics). |
| 3. **Filesystem-Driven Filtering (Heuristics)**: For all other combinations |
| (including mixed cases involving in-tree |
| boards or in-tree apps), the extension scans the application's `boards/` |
| directory. It **only** generates repositories for boards that have specific |
| config files (e.g., `<board>.conf` or `<board>.overlay`) or specific revision |
| overrides in that folder. |
| 4. **Fallback Filter**: To support building apps for boards that rely on |
| default configurations without specific |
| files, the extension accepts an explicit list of manual boards via |
| `MODULE.bazel` or an environment variable. Repositories are generated for these |
| boards for all apps. |
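
The priority order above can be sketched as a single decision function. This is an illustrative Python model, not the Starlark in `setup.bzl`: the `Asset` type, the function name, and its parameters are hypothetical, and Rule 4 is modeled as an unconditional union with the discovered boards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """An app or a board together with its origin (hypothetical type)."""
    name: str
    out_of_tree: bool


def should_declare(app, board, platform_allow, override_boards, manual_boards):
    """Decide whether one (app, board) pair gets a repository declaration."""
    # Rule 1: both assets are out-of-tree, so declare eagerly.
    if app.out_of_tree and board.out_of_tree:
        return True
    # Rule 4: manually listed boards are declared for every app.
    if board.name in manual_boards:
        return True
    # Rule 2: a non-empty platform_allow set is authoritative.
    if platform_allow:
        return board.name in platform_allow
    # Rule 3: otherwise require a board-specific file in the app's boards/ dir.
    return board.name in override_boards
```

An empty `platform_allow` set stands for both "no metadata file" and "metadata uses dynamic constraints", so either case falls through to Rule 3.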
| |
| ### Pros |
| * **Precise Lockfile Reduction**: The number of declared repositories scales |
| with the number of supported combinations. |
| * **High Fidelity for Tests**: Using `testcase.yaml` prevents generating |
| thousands of useless test/board combinations. |
| * **Maintains Automatic Discovery**: Adding a board file or updating a YAML |
| file automatically triggers discovery. |
| * **Preserves `select()` and Lazy Execution**: Keeps the repository-per- |
| combination model. |
| |
| ### Cons |
| * **Increased Discovery Logic Complexity**: The extension must handle YAML |
| parsing and directory scanning. |
| |
| ## Detailed Design |
| |
| ### 1. Discovery Strategy Implementation |
| |
| The discovery pruning logic will be implemented inside `_zephyr_setup_core_impl` |
| in `setup.bzl`. This function is responsible for scanning application |
| directories, applying the heuristic filtering rules, and building the index of |
| combinations that survive pruning. |
| |
| #### Origin and Metadata Checks |
| * **Rule 1: OOT Origin Exemption**: |
| For each `(app, board)` pair, if the app is located in an out-of-tree directory, |
| and the board is also an OOT board, the extension will unconditionally generate |
| the configuration repository. In mixed cases (where one of the two is an in-tree |
| asset), the combination is subject to the metadata and filesystem filtering |
| rules (Rules 2 and 3). |
| |
| An app is considered Out-of-Tree if it originates from a directory explicitly |
| listed in `apps_dirs`. A board is considered Out-of-Tree if it originates from a |
| directory explicitly listed in `boards_dirs`. The in-tree boards folder inside |
| `zephyr_root` is automatically scanned and considered in-tree. |
| * **Rule 2: Metadata-driven Filtering (YAML)**: |
| Before directory scanning, the extension will look for `testcase.yaml` or |
| `sample.yaml` in the app's root. To parse these files, it will invoke a small |
| Python helper script. The script is run from inside the `zephyr_setup` module |
| extension, where the path to `@zephyr` has already been resolved and supplied |
| in `@zephyr_state//:state.json`. |
| |
| To safely resolve the path to the helper script inside the module extension in |
| `setup.bzl`, the extension should use |
| `script_path = mctx.path(Label("@zephyr-bazel//scripts/build:parse_test_metadata.py"))`. |
| |
| > [!IMPORTANT] |
| > To avoid work that scales with the Cartesian product, and the resulting |
| > loading-phase performance degradation, the helper script must process all |
| > applications at once. It receives a JSON mapping of application package names |
| > to their absolute paths via a `--apps-json` argument and returns a resolved |
| > JSON mapping of valid board combinations. |
| |
| The Python helper script adds `@zephyr//scripts/pylib/twister` to `sys.path` at |
| runtime so that it can import Twister's modules. If Twister's dependencies (like |
| `ruamel.yaml`) are missing from the Python environment, the script should fall |
| back to a lightweight internal parser for `platform_allow` entries so that it |
| stays robust during the Bazel loading phase. |
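
A minimal sketch of the path setup and the lightweight fallback follows. The function names are hypothetical, and the fallback is deliberately not a YAML parser: it assumes `platform_allow` appears either as an inline value or as a block list, and it ignores block scalars and quoting.

```python
import os
import re
import sys

_KEY = re.compile(r"^(\s*)platform_allow:\s*(.*)$")
_ITEM = re.compile(r"^(\s*)-\s+([^\s#]+)")


def add_twister_to_path(zephyr_root):
    """Put Twister's Python library at the front of sys.path."""
    twister_dir = os.path.join(zephyr_root, "scripts", "pylib", "twister")
    if twister_dir not in sys.path:
        sys.path.insert(0, twister_dir)
    return twister_dir


def fallback_platform_allow(text):
    """Collect platform_allow entries without any YAML library."""
    boards = set()
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        key = _KEY.match(lines[i])
        i += 1
        if key is None:
            continue
        indent = len(key.group(1))
        inline = key.group(2).split("#", 1)[0].strip()
        if inline and inline[0] not in ">|":
            # Inline form: "a b" or "[a, b]".
            boards.update(inline.strip("[]").replace(",", " ").split())
            continue
        # Block-list form: consume the "- item" lines that follow the key.
        while i < len(lines):
            item = _ITEM.match(lines[i])
            if item is None or len(item.group(1)) < indent:
                break
            boards.add(item.group(2))
            i += 1
    return boards
```

A caller would typically try the Twister-based path first and use `fallback_platform_allow` only when the import raises `ImportError`.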
| |
| The script evaluates the YAML file and returns the union of all boards |
| explicitly allowed under `platform_allow`. If the file contains broad dynamic |
| constraints (like `arch_allow`), the script returns an empty set, directing the |
| extension to skip Rule 2 and drop down to Rule 3 (boards/ scan heuristics). |
| |
| > [!NOTE] |
| > Returning the full list of matching boards for dynamic-constraint scenarios |
| > (such as all `arch_allow: arm` boards) would trigger lockfile explosion again, |
| > which violates the optimization goal of discovery pruning. Dropping down to |
| > Rule 3 acts as the fallback. |
| |
| The output is a JSON mapping, which Starlark decodes using `json.decode()`. |
| |
| The script must write only the final JSON mapping to `stdout`. All other logging, |
| warnings, or debugging print statements must be routed to `sys.stderr` to ensure |
| that Starlark's `json.decode(res.stdout)` behaves robustly. |
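
A sketch of the helper's entry point, showing the stdout/stderr discipline. It assumes `--apps-json` carries the JSON text inline (it could equally be a file path; the design does not say), and `resolve_boards` is a hypothetical placeholder for the per-app metadata lookup.

```python
import argparse
import json
import sys


def resolve_boards(app_path):
    """Hypothetical placeholder for the per-app metadata lookup."""
    return []


def build_mapping(apps, resolve=resolve_boards):
    """Resolve every app in one batch; diagnostics go to stderr only."""
    result = {}
    for pkg, path in sorted(apps.items()):
        try:
            result[pkg] = sorted(resolve(path))
        except Exception as exc:  # keep the batch alive if one app fails
            print(f"warning: {pkg}: {exc}", file=sys.stderr)
            result[pkg] = []
    return result


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--apps-json", required=True,
                        help="JSON object mapping app package to absolute path")
    parser.add_argument("--zephyr-root", required=True)
    args = parser.parse_args(argv)

    mapping = build_mapping(json.loads(args.apps_json))
    # The JSON document is the only thing ever written to stdout.
    json.dump(mapping, sys.stdout, sort_keys=True)
    sys.stdout.write("\n")
    return 0
```

Keeping `build_mapping` separate from `main` lets the batch logic be tested without capturing process output.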
| |
| #### Filesystem Heuristics and Fallback |
| * **Rule 3: Filesystem Pruning (boards/ scan)**: |
| For combinations involving in-tree boards without matching YAML metadata, the |
| extension will use the shared filesystem heuristics module |
| `scripts/build/discovery_utils.py` to search for overrides. A repository is |
| declared if `<board>.conf`, `<board>.overlay`, or specific revision overrides |
| exist in the app's `boards/` directory. |
| |
| If a board has a qualified name (e.g. `board/qualifiers`), candidate files should |
| follow the format `<board_name>_<qualifiers>.conf`. This logic should be |
| extracted from `kconfig_gen_values.py:268-274` into the shared module. |
| * **Rule 4: Manual Boards Fallback**: |
| The global manual list defined in `MODULE.bazel` (e.g., `manual_boards = |
| ["nrf52840dk_nrf52840"]`) will be unconditionally combined with any discovered |
| lists, acting as a fallback. |
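
Rules 3 and 4 can be sketched together as follows. The function names are hypothetical and do not reflect the contents of `discovery_utils.py`. The sketch assumes every `/` in a qualified board name becomes `_` in the file stem, and it omits revision overrides because their naming is not specified here.

```python
import os


def has_board_override(app_dir, board):
    """True if the app's boards/ dir holds a .conf or .overlay for the board."""
    stem = board.replace("/", "_")  # board/qualifiers -> board_qualifiers
    boards_dir = os.path.join(app_dir, "boards")
    return any(
        os.path.isfile(os.path.join(boards_dir, stem + ext))
        for ext in (".conf", ".overlay")
    )


def boards_to_declare(app_dir, known_boards, manual_boards):
    """Rule 3 discoveries, unconditionally combined with the Rule 4 list."""
    discovered = [b for b in known_boards if has_board_override(app_dir, b)]
    # dict.fromkeys drops duplicates while preserving first-seen order.
    return list(dict.fromkeys(discovered + list(manual_boards)))
```

For an app whose `boards/` directory contains only `nrf5340dk_nrf5340_cpuapp.conf`, the qualified board `nrf5340dk/nrf5340/cpuapp` is discovered and any manual boards are appended after it.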
| |
| ### 2. Manual Boards Syntax & Implementation |
| |
| To support apps that rely on the default configuration without custom overlays, |
| users can explicitly allowlist boards using the `manual_boards` parameter in |
| their `MODULE.bazel`. |
| |
| #### Configuration Syntax in `MODULE.bazel` |
| ```python |
| zephyr_setup.env( |
|     apps_dirs = ["//apps"], |
|     boards_dirs = ["//boards"], |
|     manual_boards = [ |
|         "nrf52840dk_nrf52840", |
|         "same70q21b", |
|     ], |
| ) |
| ``` |
| |
| #### Modifying `setup.bzl` |
| * **Tag Class**: Add `manual_boards = attr.string_list()` to the `_env` tag |
| class attributes in `setup.bzl`. |
| * **Extension Logic**: |
| * In `_zephyr_setup_core_impl`, iterate over the tags and extend a |
| workspace-wide `manual_boards` list. |
| * Add `manual_boards` to the JSON written to `@zephyr_state//:state.json`. |
| * **Pruning logic**: In `_zephyr_setup_apps_impl`, the extension will pull in |
| `manual_boards` from the state and |
| ensure repositories are always generated for combinations involving these boards |
| for all apps in the discovery loop. |
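
The tag aggregation can be modeled in Python as below. In the real extension this would be Starlark operating on tag objects; here each tag is represented as a plain dict, and the function name and the de-duplication step are assumptions.

```python
import json


def aggregate_manual_boards(tags):
    """Merge manual_boards across tags, keeping first-seen order."""
    merged = {}
    for tag in tags:
        for board in tag.get("manual_boards", []):
            merged.setdefault(board, None)
    return list(merged)


# Two hypothetical env tags, e.g. from the root module and a dependency.
tags = [
    {"manual_boards": ["nrf52840dk_nrf52840", "same70q21b"]},
    {"manual_boards": ["same70q21b"]},
    {},
]
state_data = {"manual_boards": aggregate_manual_boards(tags)}
print(json.dumps(state_data))
```

A deterministic order matters here because the serialized state feeds later extension steps, and unstable ordering would cause needless invalidation.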
| |
| ## Implementation Plan |
| |
| The implementation will proceed in the following stages: |
| |
| ### Phase 1: Manual Boards Integration |
| * **File**: `setup.bzl` |
| * **Changes**: |
| * Line 665: Add `manual_boards = attr.string_list()` to the `_env` tag |
| class attributes. |
| * Line 574: In `_zephyr_setup_core_impl`, aggregate the manual board lists, |
| add the result to the `state_data` dictionary, and write it to |
| `@zephyr_state//:state.json`. |
| |
| ### Phase 2: Heuristic Logic and Pruning |
| * **Files**: `setup.bzl`, `scripts/build/parse_test_metadata.py` [NEW], |
| `scripts/build/discovery_utils.py` [NEW] |
| * **Twister Metadata Parsing Python script**: |
| * Accepts `--apps-json` and `--zephyr-root`. |
| * Inserts dynamic runtime paths: `sys.path.insert(0, args.zephyr_root + |
| "/scripts/pylib/twister")`. |
| * Processes multiple applications in a single batch, returning a JSON |
| mapping from each app to its list of valid boards. |
| * **Shared filesystem discovery module**: |
| * Extract the overlapping candidate-file search logic from |
| `kconfig_gen_values.py:268-274`. |
| * Expose the candidate-file checks as shared utilities. |
| * **Heuristic pruning in the extension**: |
| * Line 572 in `_zephyr_setup_core_impl`: Apply the pruning heuristics after |
| scanning apps and boards. |
| * Invoke the Python helper script once from the core implementation to parse |
| test metadata YAML for all apps. |
| |
| ### Phase 3: Index Building and Validation |
| * **File**: `setup.bzl` |
| * **Changes**: |
| * `_zephyr_setup_core_impl`: Filter surviving combinations and store valid |
| combinations in `state_data` as a dictionary mapping `norm_app_pkg: |
| [supported_board_names]`. |
| * `_zephyr_setup_apps_impl`: Ensure the extension relies on this |
| dictionary inside `@zephyr_state` and only generates configuration |
| repositories for combinations that survived pruning. |
| * `_zephyr_setup_core_impl:594`: Update the index builder for `@zephyr_index` |
| so that it only generates index entries for combinations that survived |
| pruning. |
| * Validate the changes with local builds. |
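
The Phase 3 data flow can be sketched as two small functions: one builds the `norm_app_pkg: [supported_board_names]` mapping, the other flattens it into the pairs for which repositories are declared. The names and the `survives` callback are hypothetical. Apps with no surviving boards are kept with an empty list, which is an assumption.

```python
def build_survivor_index(apps, boards, survives):
    """Map each app package to the boards that survived pruning."""
    return {
        app: [board for board in boards if survives(app, board)]
        for app in apps
    }


def declared_pairs(index):
    """Flatten the index into the (app, board) pairs to declare."""
    return [(app, board) for app, kept in index.items() for board in kept]


# Hypothetical inputs: only pairs in this allow-set survive.
allowed = {("apps/blinky", "native_sim"), ("apps/shell", "qemu_x86")}
index = build_survivor_index(
    ["apps/blinky", "apps/shell", "apps/empty"],
    ["native_sim", "qemu_x86"],
    lambda app, board: (app, board) in allowed,
)
print(declared_pairs(index))
```

Because both the apps implementation and the `@zephyr_index` builder would read the same mapping, the set of declared repositories and the index cannot drift apart.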
| |
| |