Design: Zephyr Sysbuild Support in Zephyr-Bazel

Goal

Implement support for Zephyr's Sysbuild (System build) feature in zephyr-bazel. This will allow developers to define multi-image, multi-core, and multi-SoC targets in Bazel, coordinating their configuration, compilation, and flashing while maintaining Kconfig/DTS correctness and Bazel's lazy evaluation.


Background

Current Multi-Image Limitation in zephyr-bazel

Currently, zephyr-bazel excels at building single-image Zephyr applications (e.g., bazel build //apps/blinky). However, modern embedded systems often require building and packaging multiple related images for a single target board. Examples include:

  • An application running alongside a bootloader (e.g., MCUboot).
  • A multi-core SoC (like nRF5340) where the main application runs on the application core and a companion image (network or audio coprocessor) runs on a secondary core.

zephyr-bazel has no native way to define and coordinate these multi-image builds. Developers must build each image separately and stitch the results together by hand, which breaks Bazel's unified build graph and caching guarantees.


Conflict with Zephyr CMake Behavior (Sysbuild)

In standard Zephyr, multi-image builds are handled by Sysbuild. Sysbuild is a meta-build system that:

  1. Takes a main application as the entry point.
  2. Dynamically adds helper applications (domains) based on Kconfig (e.g., SB_CONFIG_BOOTLOADER_MCUBOOT=y) or CMake (ExternalZephyrProject_Add).
  3. Builds all images for the same board or related cores/SoCs on the same physical board.
  4. Allows the main app to provide configuration overlays for helpers (e.g., applying sysbuild/mcuboot.conf in the main app's directory as an overlay to the MCUboot helper).
  5. Can generate a merged binary (e.g., merged.hex) for production.
  6. Coordinates flashing and debugging.

To maintain compatibility with standard Zephyr projects and support realistic production hardware workloads, zephyr-bazel must implement a comparable meta-build capability.


Alternatives Considered

We evaluated several approaches to support multi-image builds in Bazel, balancing Starlark complexity, lockfile performance, and developer usability.


Alternative A: Automatic CMake/Kconfig Parsing (Auto-Discovery)

In this approach, the Bzlmod module extension would eagerly parse Zephyr's sysbuild.conf or sysbuild.cmake files to automatically discover what helper images are enabled and generate the Bazel build graph dynamically.

  • Pros:
    • Zero Duplication: Mirrors Zephyr's configuration files directly without requiring the user to write Bazel-specific graphs.
  • Cons:
    • High Complexity & Fragility: CMake is a Turing-complete language. Parsing sysbuild.cmake in Python/Starlark is extremely fragile and error-prone.
    • Bzlmod Lockfile Bloat: Auto-discovering hundreds of potential combinations would eagerly declare repositories in the loading phase, causing severe lockfile bloat and analysis slowdowns.

Alternative B: Union Registration + select() in BUILD.bazel

Here, the user explicitly registers the union of all possible helper images in MODULE.bazel, and then uses standard Bazel select() statements in BUILD.bazel to conditionally wire the build graph based on target board constraints.

  • Pros:
    • Standard Bazel Idioms: Relies on native select() statements.
    • Correct Pruning: Only the selected helpers are built.
  • Cons:
    • High Boilerplate: Users must write complex, verbose select() statements in every BUILD.bazel for multi-core/multi-image targets, creating a high barrier to entry and risking dual-source-of-truth discrepancies.

Alternative C: Declarative JSON Files

Similar to the proposed solution, but using JSON files (sysbuild.json) next to the source code to define the hardware graph.

  • Pros:
    • No AST/Regex Parsing: Python has native JSON support.
  • Cons:
    • Inconsistency: Zephyr configures everything via YAML (module.yml, board.yml, testcase.yaml). Using JSON would introduce a different syntax style into the codebase.

Proposed Solution: Declarative Hierarchical YAML Sysbuild

We propose adopting a Declarative YAML-based Hierarchical design.

Instead of writing complex CMake or verbose Bazel select() code, developers describe the hardware graph declaratively in simple YAML files (sysbuild.yml) next to their source code. Bazel's module extension recursively parses these files during the loading phase to generate the complete multi-image build graph automatically.

graph TD
    Platform["User Platform<br>(//boards/my_board)"]
    Sysbuild["zephyr_sysbuild Target<br>(my_firmware)"]
    Main["Main App<br>(my_app on cpuapp)"]
    Mcuboot["Mcuboot Helper<br>(on cpuapp)"]
    NetApp["Net App Helper<br>(on cpunet)"]
    Netboot["Netboot Nested Helper<br>(on cpunet)"]

    Platform --> Sysbuild
    Sysbuild -->|Transition| Main
    Sysbuild -->|Transition| Mcuboot
    Sysbuild -->|Transition| NetApp
    NetApp -->|Recursive Transition| Netboot

How It Works

  1. YAML Graph Definition: The user defines helpers and platform overrides in sysbuild.yml and boards/<board>.sysbuild.yml.
  2. Recursive Discovery: The module extension scans these files recursively to resolve nested helper chains.
  3. Path-Based Namespacing with Smart Sharing: To support standard Zephyr behavior where a top-level app can configure nested helpers, helper repositories are resolved by their full dependency path. To prevent lockfile bloat, if no configuration overrides are detected along the path, the helper shares a single “default” repository for that board.
  4. Zero-Boilerplate Target: The user declares a simple zephyr_sysbuild target in BUILD.bazel that automatically resolves the entire multi-core, multi-SoC graph at build time using transitions.

Detailed Design

This section describes the implementation in detail.


1. Declarative YAML Files (sysbuild.yml)

Instead of writing complex Bazel code, the hardware graph is described declaratively in YAML files next to the source code, matching Zephyr's configuration style.

Global Helpers (sysbuild.yml)

Defined in the app's root directory. These helpers are built for all boards, inheriting the active platform.

# //apps/my_app/sysbuild.yml
helpers:
  mcuboot:
    target: "@mcuboot//boot/zephyr:zephyr_project"

Board-Specific, Multi-Core & External Helpers

Defined in the app's boards/<board>.sysbuild.yml file. This allows adding companion images running on different cores/SoCs, or board-specific external (non-Zephyr) helper targets.

# //apps/my_app/boards/nrf5340dk_nrf5340_cpuapp.sysbuild.yml
helpers:
  net_app:
    target: "//apps/net_companion"
    # Explicitly transition this helper to the companion network core SoC/board:
    platform: "//boards/nordic/nrf5340dk:nrf5340_cpunet"

  custom_coprocessor_firmware:
    target: "//third_party/custom_firmware"
    # Mark as external to bypass Zephyr contextual repository parsing:
    type: "external"

  • Multi-Core / Multi-SoC Companion Support: Handled via the platform attribute, which explicitly redirects the helper to compile for a companion core or SoC using its specific platform target.
  • Non-Zephyr External Projects: Declared using type: "external". This allows including custom bootloaders or companion firmware built with other rules, which will only be compiled when building for boards that list them in their <board>.sysbuild.yml.
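
As an illustration of how the two files could combine, the following Python sketch merges already-parsed YAML mappings into one normalized helper table. The function name merge_helpers, the rule that board-level entries override global entries of the same name, the default type value "zephyr", and the cpuapp platform label are assumptions made for this example; the design above does not specify them.

```python
def merge_helpers(global_cfg, board_cfg, active_platform):
    """Merge global and board-specific helper maps into normalized entries."""
    merged = {}
    # Board entries are applied second, so they override same-named global
    # entries (an assumed precedence, not stated in the design).
    for cfg in (global_cfg, board_cfg):
        for name, spec in (cfg or {}).get("helpers", {}).items():
            merged[name] = dict(spec)

    normalized = {}
    for name, spec in merged.items():
        if "target" not in spec:
            raise ValueError("helper %r is missing 'target'" % name)
        normalized[name] = {
            "target": spec["target"],
            # Helpers without an explicit platform inherit the active one.
            "platform": spec.get("platform", active_platform),
            # "zephyr" as the default type is a hypothetical value.
            "type": spec.get("type", "zephyr"),
        }
    return normalized


global_cfg = {
    "helpers": {"mcuboot": {"target": "@mcuboot//boot/zephyr:zephyr_project"}}
}
board_cfg = {
    "helpers": {
        "net_app": {
            "target": "//apps/net_companion",
            "platform": "//boards/nordic/nrf5340dk:nrf5340_cpunet",
        },
        "custom_coprocessor_firmware": {
            "target": "//third_party/custom_firmware",
            "type": "external",
        },
    }
}
# The cpuapp platform label below is assumed for the example.
helpers = merge_helpers(
    global_cfg, board_cfg, "//boards/nordic/nrf5340dk:nrf5340_cpuapp"
)
```

Here mcuboot ends up on the active platform, net_app keeps its explicit companion platform, and the external helper is only present because the board file lists it.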

2. Recursive Discovery & Path-Based Namespacing with Smart Sharing

During the loading phase, the zephyr_setup module extension recursively scans these YAML files starting from the main application:

  1. Main App on nrf5340_cpuapp: Resolves mcuboot (inherits nrf5340_cpuapp), net_app (targets nrf5340_cpunet), and custom_coprocessor_firmware (external target).
  2. Recursive Scan: Since net_companion is transitioned to nrf5340_cpunet, the extension scans //apps/net_companion/sysbuild.yml in the context of nrf5340_cpunet, resolving netboot (inherits nrf5340_cpunet).
  3. External Target Exemption: Helpers declared with type: "external" bypass the generation of namespaced contextual repositories (@zc_...). Their dependency wiring is managed directly in the transition.
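
The three steps above can be sketched in Python as a recursive walk over an in-memory stand-in for the sysbuild.yml files. This is a minimal sketch under assumptions: resolve_graph, the manifests mapping, the cycle check, the short platform names, and the //apps/netboot label are all hypothetical and only illustrate the traversal order.

```python
def resolve_graph(app, platform, manifests, _path=()):
    """Return a flat list of (dependency_path, target, platform, type) tuples.

    `manifests` maps an app label to its helper map; it stands in for the
    sysbuild.yml files on disk.
    """
    if app in _path:
        raise ValueError("helper cycle: " + " -> ".join(_path + (app,)))
    path = _path + (app,)
    resolved = []
    for name, spec in manifests.get(app, {}).items():
        helper_platform = spec.get("platform", platform)
        helper_type = spec.get("type", "zephyr")
        resolved.append(
            (path + (name,), spec["target"], helper_platform, helper_type)
        )
        if helper_type != "external":
            # Scan the helper's own manifest in the context of its platform.
            resolved.extend(
                resolve_graph(spec["target"], helper_platform, manifests, path)
            )
    return resolved


manifests = {
    "//apps/my_app": {
        "mcuboot": {"target": "@mcuboot//boot/zephyr:zephyr_project"},
        "net_app": {"target": "//apps/net_companion", "platform": "nrf5340_cpunet"},
        "custom_coprocessor_firmware": {
            "target": "//third_party/custom_firmware",
            "type": "external",
        },
    },
    # Hypothetical nested helper owned by the companion app.
    "//apps/net_companion": {"netboot": {"target": "//apps/netboot"}},
}
graph = resolve_graph("//apps/my_app", "nrf5340_cpuapp", manifests)
```

Each entry carries its full dependency path, which is the information the path-based namespacing described next relies on.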

Optimization: Smart Sharing of Default Configurations

In standard Zephyr, a top-level application can override configurations for any helper in the tree (e.g., my_app providing sysbuild/netboot.conf). This means a helper's configuration is technically unique to its full dependency path (e.g., my_app -> net_companion -> netboot).

To prevent lockfile Cartesian product explosion, the module extension applies a Smart Sharing optimization during the loading phase:

  1. Trace Overlays: For every helper in the resolved graph, the extension checks every ancestor application in its dependency path for a matching configuration overlay.
    • For netboot at my_app -> net_companion -> netboot, it checks for my_app/sysbuild/netboot.conf and net_companion/sysbuild/netboot.conf.
  2. Case A: No Overlays Detected (Default Config):
    • If no overlay files are found anywhere in the path, the helper is considered to be using the default configuration for that board.
    • We do not declare a unique repository.
    • The transition index (@zephyr_index//:sysbuild_index.bzl) is configured to point this helper to the shared default repository: @zc_<helper_hash>_<board>.
  3. Case B: Overlays Detected (Custom Config):
    • If one or more overlay files are found, this helper requires a custom configuration.
    • We declare a unique path-specific repository using a hash of the full dependency path: @zc_<path_hash>_<board>.
    • The extension registers this repository eagerly, ensuring Bazel can resolve its unique Kconfig/DTS targets.

This preserves Zephyr's hierarchical configuration semantics while keeping the lockfile small in the common case where nested helpers are not customized.
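
The Case A / Case B decision can be sketched as a small Python function. The name repo_name, the use of the first eight hex digits of SHA-256, and the set-of-existing-paths input are assumptions for illustration; the design only fixes the @zc_<hash>_<board> shape of the result.

```python
import hashlib


def repo_name(helper, board, ancestors, existing_files):
    """Pick the contextual repository name for one helper.

    `ancestors` lists the apps above the helper in its dependency path and
    `existing_files` is the set of overlay paths found on disk.
    """
    overlays = [
        "%s/sysbuild/%s.conf" % (app, helper)
        for app in ancestors
        if "%s/sysbuild/%s.conf" % (app, helper) in existing_files
    ]
    # Case A hashes only the helper; Case B hashes the full dependency path.
    key = "/".join(ancestors + [helper]) if overlays else helper
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:8]
    return "zc_%s_%s" % (digest, board), overlays


# No overlays anywhere: both paths resolve to the same shared repository.
shared_a, _ = repo_name("netboot", "nrf5340_cpunet", ["my_app", "net_companion"], set())
shared_b, _ = repo_name("netboot", "nrf5340_cpunet", ["other_app"], set())

# An overlay in my_app forces a path-specific repository.
custom, found = repo_name(
    "netboot",
    "nrf5340_cpunet",
    ["my_app", "net_companion"],
    {"my_app/sysbuild/netboot.conf"},
)
```

Two unrelated applications that use netboot without overlays share one repository, while the customized path gets its own.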

The complete resolved graph mapping is stored in @zephyr_index//:sysbuild_index.bzl.
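
One way the Python side might render that index is to serialize the resolved graphs as JSON, since JSON objects, arrays, and strings are also valid Starlark literals. This is a sketch, not the definitive generator: render_index is a hypothetical name, and the approach assumes the data contains only ASCII strings, lists, and dicts (JSON's true, false, and null are not Starlark).

```python
import json


def render_index(graphs):
    """Render the resolved graphs as the text of sysbuild_index.bzl."""
    # sort_keys keeps the output stable so unchanged graphs produce an
    # identical file.
    body = json.dumps(graphs, indent=4, sort_keys=True)
    return "# Generated file. Do not edit.\nSYSBUILD_GRAPHS = %s\n" % body


graphs = {
    "//apps/my_app": {
        # Short platform names are placeholders for real platform labels.
        "main_platform": "nrf5340_cpuapp",
        "helpers": [
            {
                "target": "@mcuboot//boot/zephyr:zephyr_project",
                "resolved_platform": "nrf5340_cpuapp",
            },
            {
                "target": "//apps/net_companion",
                "resolved_platform": "nrf5340_cpunet",
            },
        ],
    }
}
index_text = render_index(graphs)
```

Stable, sorted output matters here because any byte-level change to the generated file invalidates everything that loads it.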


3. Sysbuild Configuration Parsing & Namespaced Kconfig (sysbuild.conf)

While sysbuild.yml defines the hardware graph (which images and platforms are built), standard Zephyr projects use sysbuild.conf (a Kconfig file) to define sysbuild-specific configurations and pass namespaced variables to specific helper images.

To support this, the module extension executes a Python translation helper during the discovery phase to parse sysbuild.conf and propagate settings to the respective images, replicating Zephyr's CMake namespacing and translation logic.

Namespaced Variables Support

Zephyr Sysbuild allows directing Kconfig settings to specific helper images using the domain name as a prefix: -D<domain_name>_CONFIG_<OPTION>=<value> (on command line) or <domain_name>_CONFIG_<OPTION>=<value> in sysbuild.conf.

For example, to configure the MCUboot helper specifically, a user writes in sysbuild.conf:

mcuboot_CONFIG_BOOT_SIGNATURE_TYPE_RSA=y

Translation and Parsing Mechanism

The Python translation helper script parses sysbuild.conf (and any board-specific sysbuild_<board>.conf if present) during the loading phase:

  1. Extract Namespaced Settings: It scans the file for lines matching the namespaced pattern: ^([a-zA-Z0-9_]+)_(CONFIG_[a-zA-Z0-9_]+)=(.*)$
    • If a match is found (e.g., mcuboot_CONFIG_BOOT_SIGNATURE_TYPE_RSA=y), it extracts the target domain (mcuboot) and the Kconfig setting (CONFIG_BOOT_SIGNATURE_TYPE_RSA=y).
    • It validates that the target domain is a registered helper in the graph.
  2. Translate Global SB_CONFIG_ Symbols: It parses standard SB_CONFIG_ symbols and translates them to target CONFIG_ values based on Zephyr's standard mapping rules (defined in Zephyr's share/sysbuild/image_configurations/).
    • Main App Translations: Maps SB_CONFIG_BOOTLOADER_MCUBOOT to CONFIG_BOOTLOADER_MCUBOOT, SB_CONFIG_MCUBOOT_MODE_* to CONFIG_MCUBOOT_BOOTLOADER_MODE_*, etc.
    • Bootloader (MCUboot) Translations: Maps SB_CONFIG_MCUBOOT_MODE_* to CONFIG_BOOT_UPGRADE_ONLY or similar, SB_CONFIG_BOOT_SIGNATURE_* to CONFIG_BOOT_SIGNATURE_*, etc.
  3. Generate Kconfig Fragments: It groups all extracted and translated configurations by target image and writes them to temporary Kconfig fragments (e.g., sysbuild_generated.conf) for each affected image.
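
Step 1 can be sketched in Python using the pattern quoted above. One detail worth making explicit: a global line such as SB_CONFIG_BOOTLOADER_MCUBOOT=y also matches the namespaced pattern, with "SB" as the apparent domain, so the sketch handles the SB_CONFIG_ prefix first. The function name split_sysbuild_conf and the error handling are assumptions for illustration.

```python
import re

NAMESPACED = re.compile(r"^([a-zA-Z0-9_]+)_(CONFIG_[a-zA-Z0-9_]+)=(.*)$")


def split_sysbuild_conf(text, helpers):
    """Split sysbuild.conf text into per-image settings and SB_CONFIG_ symbols."""
    per_image = {}
    sb_symbols = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("SB_CONFIG_"):
            # Checked first: "SB_CONFIG_X=y" also matches NAMESPACED with
            # the bogus domain "SB".
            name, _, value = line.partition("=")
            sb_symbols[name] = value
            continue
        match = NAMESPACED.match(line)
        if match is None:
            raise ValueError("unrecognized line: %r" % line)
        domain, symbol, value = match.groups()
        if domain not in helpers:
            raise ValueError("unknown sysbuild domain: %r" % domain)
        per_image.setdefault(domain, []).append("%s=%s" % (symbol, value))
    return per_image, sb_symbols


conf_text = """\
# Enable the bootloader and select its signature type.
SB_CONFIG_BOOTLOADER_MCUBOOT=y
mcuboot_CONFIG_BOOT_SIGNATURE_TYPE_RSA=y
"""
per_image, sb_symbols = split_sysbuild_conf(conf_text, {"mcuboot"})
```

The per-image lists feed the fragment generation in step 3, and the SB_CONFIG_ dictionary feeds the translation in step 2.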

Propagation to Contextual Repositories

The generated Kconfig fragments are passed to the respective contextual repositories (@zc_...) via the zephyr_state or as explicit attributes in gen_zephyr_config.

During repository rule execution, gen_zephyr_config includes these generated fragments in the Kconfig merge list (as an extra overlay), ensuring that the final compiled binaries are correctly configured according to the sysbuild.conf settings.

This preserves Zephyr's Kconfig propagation and namespacing behavior for the supported symbols without requiring the execution of CMake.
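
To make the SB_CONFIG_ translation and fragment generation concrete, here is a minimal Python sketch restricted to the two main-app mappings named in the text. The tables MAIN_APP_EXACT and MAIN_APP_PREFIXES and both function names are hypothetical; the real rule set in Zephyr's image_configurations is larger and is not reproduced here.

```python
# Rules taken from the main-app examples in the text; illustrative only.
MAIN_APP_EXACT = {
    "SB_CONFIG_BOOTLOADER_MCUBOOT": "CONFIG_BOOTLOADER_MCUBOOT",
}
MAIN_APP_PREFIXES = [
    ("SB_CONFIG_MCUBOOT_MODE_", "CONFIG_MCUBOOT_BOOTLOADER_MODE_"),
]


def translate_main_app(sb_symbols):
    """Translate SB_CONFIG_ symbols into main-app CONFIG_ assignments."""
    lines = []
    for name, value in sorted(sb_symbols.items()):
        if name in MAIN_APP_EXACT:
            lines.append("%s=%s" % (MAIN_APP_EXACT[name], value))
            continue
        for sb_prefix, config_prefix in MAIN_APP_PREFIXES:
            if name.startswith(sb_prefix):
                suffix = name[len(sb_prefix):]
                lines.append("%s%s=%s" % (config_prefix, suffix, value))
                break
    return lines


def render_fragment(lines):
    """Render one image's sysbuild_generated.conf content."""
    return "".join(line + "\n" for line in lines)


translated = translate_main_app(
    {
        "SB_CONFIG_BOOTLOADER_MCUBOOT": "y",
        "SB_CONFIG_MCUBOOT_MODE_SWAP_WITHOUT_SCRATCH": "y",
        "SB_CONFIG_UNRELATED": "y",
    }
)
fragment = render_fragment(translated)
```

Symbols with no matching rule are dropped silently in this sketch; a real implementation would need to decide whether that should be an error.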


4. Contextual Repository Generation (gen_zephyr_config)

For custom helper configurations (Case B above), the gen_zephyr_config repository rule is instantiated with the full list of resolved configuration overlays propagated by the module extension.

The repository rule uses these explicit paths to resolve Kconfig and Devicetree overlays during execution, ensuring that:

  • All parent overlays (owner_app/sysbuild/helper.conf) are correctly applied.
  • Top-level overlays (top_app/sysbuild/helper.conf) take precedence.

This replicates Zephyr's standard hierarchical configuration overlay behavior hermetically inside the Bazel repository rule execution phase.
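
The precedence rule can be sketched in Python by ordering overlays so that the top-level application's file is merged last. This assumes that later fragments override earlier ones on conflicting symbols; ordered_overlays and merge_fragments are hypothetical names, and the merge ignores Kconfig's "# CONFIG_X is not set" form for brevity.

```python
def ordered_overlays(ancestors, helper, existing_files):
    """Return overlay paths in merge order, lowest precedence first."""
    ordered = []
    # `ancestors` is top-level first; walk it in reverse so the top-level
    # overlay is merged last and wins on conflicting symbols.
    for app in reversed(ancestors):
        candidate = "%s/sysbuild/%s.conf" % (app, helper)
        if candidate in existing_files:
            ordered.append(candidate)
    return ordered


def merge_fragments(fragments):
    """Merge KEY=VALUE fragments; later fragments override earlier ones."""
    merged = {}
    for text in fragments:
        for line in text.splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                key, _, value = line.partition("=")
                merged[key] = value
    return merged


files = {
    "my_app/sysbuild/netboot.conf",
    "net_companion/sysbuild/netboot.conf",
}
order = ordered_overlays(["my_app", "net_companion"], "netboot", files)
# Owner overlay first, top-level overlay second (illustrative contents).
merged = merge_fragments(["CONFIG_LOG=y\nCONFIG_SIZE=1", "CONFIG_SIZE=2"])
```

In the merged result the owner's CONFIG_LOG setting survives while the top-level value of CONFIG_SIZE replaces the owner's.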


5. The zephyr_sysbuild Starlark Target (Macro + Rule)

To provide a zero-boilerplate user experience, the public API zephyr_sysbuild is implemented as a Starlark Macro that wraps an underlying private rule _zephyr_sysbuild_rule.

# In //apps/my_app/BUILD.bazel
load("@zephyr-bazel//bazel:sysbuild.bzl", "zephyr_sysbuild")

zephyr_sysbuild(
    name = "my_firmware",
    main = "//apps/my_app",
)

The Macro Dependency Injection

Because Bazel rules cannot dynamically discover dependencies during the Analysis Phase, the macro resolves them during the Loading Phase by loading the index generated by the module extension:

# //bazel:sysbuild.bzl (Starlark Macro API)
load("@zephyr_index//:sysbuild_index.bzl", "SYSBUILD_GRAPHS")

def zephyr_sysbuild(name, main, **kwargs):
    # 1. Resolve helpers at loading time using the generated index
    graph = SYSBUILD_GRAPHS.get(main, {})
    helpers = graph.get("helpers", [])

    # 2. Instantiate the underlying rule with resolved dependencies
    _zephyr_sysbuild_rule(
        name = name,
        main = main,
        helpers = helpers,
        **kwargs
    )

Platform Transition Rules (The Core Mechanism)

To support heterogeneous multi-core and multi-SoC setups where different helpers require different target platforms, we use intermediate Platform Transition Targets generated dynamically by the zephyr_sysbuild macro.

This avoids the Bazel limitation where a single transition on a label_list attribute must transition all targets to the same platform.

1. The Transition Rule

A generic Starlark rule transitioned_dep is defined to transition a single dependency to a dynamically specified platform:

# //bazel:private/transition_rule.bzl

def _transition_impl(settings, attr):
    return {"//command_line_option:platforms": [attr.platform]}

platform_transition = transition(
    implementation = _transition_impl,
    inputs = [],
    outputs = ["//command_line_option:platforms"],
)

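def _transitioned_dep_impl(ctx):
    # An attribute with a Starlark transition attached is always exposed as a
    # list, even for attr.label, so index the single configured target.
    dep = ctx.attr.dep[0]
    # Forward providers from the actual target
    return [dep[DefaultInfo]]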

transitioned_dep = rule(
    implementation = _transitioned_dep_impl,
    attrs = {
        "dep": attr.label(cfg = platform_transition),
        "platform": attr.string(mandatory = True),
    },
)

2. Macro-Level Target Generation

The zephyr_sysbuild macro loads the resolved graph from the index and generates transitioned_dep targets for the main application and each helper. It resolves the correct platform (shared default or custom path-specific) at loading time:

# //bazel:sysbuild.bzl (Starlark Macro API)
load("@zephyr_index//:sysbuild_index.bzl", "SYSBUILD_GRAPHS")
load("//bazel:private/transition_rule.bzl", "transitioned_dep")

def zephyr_sysbuild(name, main, **kwargs):
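    # Fail early with a clear message instead of passing a None platform on.
    graph = SYSBUILD_GRAPHS.get(main)
    if graph == None:
        fail("No sysbuild graph registered for %s" % main)
    helpers = graph.get("helpers", [])
    main_platform = graph["main_platform"]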

    transitioned_helpers = []
    for i, helper in enumerate(helpers):
        helper_target = helper["target"]
        helper_platform = helper["resolved_platform"]

        helper_name = "%s_helper_%d" % (name, i)
        transitioned_dep(
            name = helper_name,
            dep = helper_target,
            platform = helper_platform,
        )
        transitioned_helpers.append(":" + helper_name)

    # Transition the main application
    main_name = "%s_main" % name
    transitioned_dep(
        name = main_name,
        dep = main,
        platform = main_platform,
    )

    # Instantiate the underlying sysbuild rule with transitioned targets
    _zephyr_sysbuild_rule(
        name = name,
        main = ":" + main_name,
        helpers = transitioned_helpers,
        **kwargs
    )

This keeps the transition “pure” (only changing --platforms), preventing configuration fan-out, while supporting heterogeneous multi-SoC and multi-core hardware.


6. Invalidation & File Watching

To ensure build correctness while maintaining loading-phase performance, the system must implement a precise invalidation strategy.

Loading-Phase Invalidation (Module Extension)

The zephyr_setup module extension must watch all configuration files that define the sysbuild graph. To avoid redundant watches and performance degradation:

  • Local Assets: If the main app or a helper resides within a directory already watched by the global apps_dirs or boards_dirs discovery (which uses mctx.watch_tree), the extension should not add duplicate watches for files inside these directories.
  • External Assets: If a helper resides in an external repository (e.g., @mcuboot//boot/zephyr), the extension must explicitly watch its sysbuild.yml and any candidate overlay files using mctx.watch(). This ensures that changes to external bootloaders or companion firmware configurations correctly invalidate the Bazel lockfile and re-generate the graph.
  • Overlay Detection: For the “Smart Sharing” optimization, the extension must watch the candidate overlay paths (e.g., sysbuild/<helper>.conf). If the file does not exist, watching the path ensures that creating the file triggers invalidation to transition the helper to a custom configuration.
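
The deduplication rule in the first two bullets can be sketched as a pure Python function that filters candidate paths against the directories already covered by a tree watch. The function name paths_needing_watch and the example paths, including the external/mcuboot location, are assumptions for illustration.

```python
from pathlib import PurePosixPath


def paths_needing_watch(candidates, watched_trees):
    """Return candidate paths not already covered by a watched directory tree."""
    trees = [PurePosixPath(tree) for tree in watched_trees]
    needed = []
    for candidate in candidates:
        path = PurePosixPath(candidate)
        covered = any(tree == path or tree in path.parents for tree in trees)
        # Skip covered paths and avoid registering the same path twice.
        if not covered and candidate not in needed:
            needed.append(candidate)
    return needed


needed = paths_needing_watch(
    [
        "apps/my_app/sysbuild.yml",
        "apps/my_app/sysbuild/netboot.conf",
        "external/mcuboot/boot/zephyr/sysbuild.yml",
        "external/mcuboot/boot/zephyr/sysbuild.yml",
    ],
    ["apps", "boards"],
)
```

Only the external file remains, once, because both local paths sit under a directory that is already watched as a tree.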

Analysis/Execution-Phase Invalidation (Repository Rules)

  • Each generated @zc_... repository must call rctx.watch() on every resolved overlay file passed to it by the extension.
  • It must also watch the app's sysbuild/ directory to detect configuration changes.

User Documentation Required

The user documentation must be updated with a user-facing description of the supported Sysbuild features and how to use them. It should enumerate the differences between Bazel Sysbuild and CMake Sysbuild. It must also include this warning about toolchains:

[!IMPORTANT] Heterogeneous Toolchains: Building for multi-SoC setups (e.g., ARM main core + RISC-V companion) requires that the workspace contains registered toolchains for both architectures. Bazel will automatically select the correct toolchain for each core based on the platform constraints, but both toolchains must be configured in the workspace.


Verification Plan

Automated Tests

We will verify this implementation by adding integration tests in the examples/ folder:

  • Example App: Add a new example app named sysbuild_app in the examples/ directory.

  • Multi-Board Configuration: Configure sysbuild_app with two different boards (e.g., board_single_core and board_multi_core) that define different child helper configurations in their <board>.sysbuild.yml files.

    • board_single_core will only enable mcuboot.
    • board_multi_core will enable mcuboot + a companion net_app (which recursively enables netboot on the companion core).
  • CI/CD Integration: Add the build commands for both board configurations to examples/workflows.json under the "builds" block. This ensures that running ./pw default will build both multi-image graphs automatically during local testing and CI.

  • Output Verification: Verify that:

    • All target binaries (sysbuild_app, mcuboot, net_app, netboot) are compiled successfully for their respective platforms.
    • Kconfig overlays from the owner apps are correctly applied.
    • A merged binary (merged.hex) is generated for both board configurations.
  • Invalidation & Watching Tests: We will add automated unit tests to verify the invalidation logic:

    1. Local Change Invalidation: Verify that modifying a local sysbuild.yml or adding a sysbuild/<helper>.conf file triggers a re-run of the module extension and updates the build graph.
    2. External Change Invalidation: Verify that modifying a sysbuild.yml in an external repository (mocked in tests) correctly invalidates the extension.
    3. No-Op Stability: Verify that building when no files have changed does not re-run the extension or the repository rules (no overhead on no-op builds).

Implementation Plan

This plan breaks down the implementation into progressive phases, each with clear verification steps to ensure correctness before proceeding.

Phase 1: YAML Discovery & Graph Resolution (Loading Phase)

  • Goal: Implement sysbuild.yml recursive scanning in Bzlmod extension.
  • Tasks:
    • Update setup.bzl (zephyr_setup extension) to scan sysbuild.yml and boards/<board>.sysbuild.yml using PyYAML.
    • Implement topological sorting of helpers.
    • Implement the “Smart Sharing” optimization logic (sharing default repos vs. declaring custom path-specific repos).
    • Generate SYSBUILD_GRAPHS index in sysbuild_index.bzl.
  • Verification / Tests:
    • Unit Tests: Add Python unit tests for YAML parsing and graph resolution logic (mocking the filesystem).
    • Integration Test: Verify that running bazel query on a test app with nested and companion helpers generates the expected graph in sysbuild_index.bzl.
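
For the topological sorting task, the Python standard library's graphlib module (Python 3.9+) may be sufficient. The sketch below assumes the desired order places each helper before the image that owns it; that direction, the build_order name, and the short image names are assumptions, since the plan does not state them.

```python
from graphlib import TopologicalSorter


def build_order(dependencies):
    """Return images ordered so each appears after the images it depends on."""
    # TopologicalSorter takes a mapping of node -> predecessors and raises
    # graphlib.CycleError if the helper graph contains a cycle.
    return list(TopologicalSorter(dependencies).static_order())


deps = {
    "my_app": {"mcuboot", "net_app"},
    "net_app": {"netboot"},
}
order = build_order(deps)
```

The relative order of independent helpers such as mcuboot and netboot is not fixed by this call, so tests should compare positions rather than exact lists.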

Phase 2: sysbuild.conf Parsing & Kconfig Translation

  • Goal: Parse sysbuild.conf and translate SB_CONFIG_ to CONFIG_.
  • Tasks:
    • Implement sysbuild.conf parsing using kconfiglib in the extension.
    • Implement Python translation script that maps SB_CONFIG_ to CONFIG_ for the main app and mcuboot (replicating Zephyr's CMake logic).
    • Generate translated Kconfig fragments (sysbuild_generated.conf).
    • Update gen_zephyr_config repository rule to accept these fragments and append them as Kconfig overlays.
  • Verification / Tests:
    • Unit Tests: Add Python unit tests for the translation logic (verify SB_CONFIG_BOOTLOADER_MCUBOOT=y translates correctly for both main app and bootloader).
    • Integration Test: Verify that the generated autoconf.h in the @zc_ repository contains the translated CONFIG_ symbols.

Phase 3: Starlark Rules and Transitions

  • Goal: Implement transitioned dependencies and zephyr_sysbuild target.
  • Tasks:
    • Implement transitioned_dep rule in bazel/private/transition_rule.bzl.
    • Implement zephyr_sysbuild macro in bazel/sysbuild.bzl to generate transitioned_dep targets and instantiate _zephyr_sysbuild_rule.
    • Implement _zephyr_sysbuild_rule to depend on transitioned targets and coordinate packaging (generating merged binaries if needed).
  • Verification / Tests:
    • Integration Test: Create a test zephyr_sysbuild target and verify that building it triggers compilation of both main app and helpers for their respective platforms. Use bazel cquery to inspect the analyzed configuration.

Phase 4: Invalidation and Watching

  • Goal: Implement robust invalidation for configuration changes.
  • Tasks:
    • Implement directory-level watches (mctx.watch_tree / mctx.watch) on the sysbuild/ directory in the Bzlmod extension.
  • Verification / Tests:
    • Manual / Automated Invalidation Test:
      1. Build a sysbuild target.
      2. Modify sysbuild.yml (e.g., add a helper) -> Verify Bazel detects the change and compiles the new helper.
      3. Modify sysbuild.conf (e.g., change MCUboot mode) -> Verify Bazel recompiles the helper with the new configuration.

Phase 5: User Documentation Update

  • Goal: Update the User Guide to document Sysbuild support.
  • Tasks:
    • Update user_guide.md to document Sysbuild features.
    • Explain sysbuild.yml and sysbuild.conf usage.
    • Enumerate differences between Bazel Sysbuild and CMake Sysbuild.
    • Include the Heterogeneous Toolchains warning.
  • Verification / Tests:
    • Verify that the document renders correctly and adheres to the 80-char line limit.