refactor: optimize venv creation for nvidia and pkgutil style namespace packages (#3460)

When pkgutil style namespace packages are used, multiple distributions
provide
the same venv path (e.g. `foo/__init__.py`). The venv symlink logic then
tries to
symlink the `foo/` directory as it looks like the highest linkable
directory. When
conflict merging logic runs later, it then has to flatten a depset with
all the files
in the conflicting distributions.

To fix, have whl_library() try to guess when a file is a pkgutil
namespace package.
These are then pass onto py_library's venv building logic so it can
treat the
directories as not directly linkable. A conflict still occurs, but it
only
contains the single `__init__.py` file.

Along the way, special case the "nvidia" package name and always treat
it as a namespace
package. This is because nvidia packages aren't strictly correct: each
has a blank
`__init__.py` file (which marks it as a regular package, not namespace
package). Special
casing like this is undesirable, but it greatly reduces the number of
conflicts if
e.g. torch is installed, and I couldn't find any other metadata to
indicate it's a
namespace package.

Along the way, add some hints to AGENTS.md so they understand repository
rules better.

Fixes https://github.com/bazel-contrib/rules_python/issues/3401

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
diff --git a/.bazelrc b/.bazelrc
index 6473f10..718f830 100644
--- a/.bazelrc
+++ b/.bazelrc
@@ -18,6 +18,8 @@
 
 # Ensure ongoing compatibility with this flag.
 common --incompatible_disallow_struct_provider_syntax
+# Makes Bazel 7 act more like Bazel 8
+common --incompatible_use_plus_in_repo_names
 
 # Windows makes use of runfiles for some rules
 build --enable_runfiles
diff --git a/AGENTS.md b/AGENTS.md
index 671b85c..c1c9f79 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -27,6 +27,68 @@
 For doc strings, using triple quoted strings when the doc string is more than
 three lines. Do not use a trailing backslack (`\`) for the opening triple-quote.
 
+### Starlark Code
+
+Starlark does not support recursion. Use iterative algorithms instead.
+
+Starlark does not support `while` loops. Use `for` loop with an appropriately
+sized iterable instead.
+
+#### Starlark testing
+
+For Starlark tests:
+
+* Use `rules_testing`, not `bazel_skylib`.
+* See https://rules-testing.readthedocs.io/en/latest/analysis_tests.html for
+  examples on using rules_testing.
+* See `tests/builders/builders_tests.bzl` for an example of using it in
+  this project.
+
+A test is defined in two parts:
+  * A setup function, e.g. `def _test_foo(name)`. This defines targets
+    and calls `analysis_test`.
+  * An implementation function, e.g. `def _test_foo_impl(env, target)`. This
+    contains asserts.
+
+Example:
+
+```
+# File: foo_tests.bzl
+
+load("@rules_testing//lib:analysis_test.bzl", "analysis_test")
+load("@rules_testing//lib:test_suite.bzl", "test_suite")
+
+_tests = []
+
+def _test_foo(name):
+    foo_library(
+        name = name + "_subject",
+    )
+    analysis_test(
+        name = name,
+        impl = _test_foo_impl,
+        target = name + "_subject",
+    )
+_tests.append(_test_foo)
+
+def _test_foo_impl(env, target):
+    env.expect.that_whatever(target[SomeInfo].whatever).equals(expected)
+
+def foo_test_suite(name):
+    test_suite(name=name, tests=_tests)
+```
+
+
+#### Repository rules
+
+The function argument `rctx` is a hint that the function is a repository rule,
+or used by a repository rule.
+
+The function argument `mrctx` is a hint that the function can be used by a
+repository rule or module extension.
+
+The `repository_ctx` API docs are at: https://bazel.build/rules/lib/builtins/repository_ctx
+
 ### bzl_library targets for bzl source files
 
 * A `bzl_library` target should be defined for every `.bzl` file outside
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 87ae3f5..69e159c 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -118,6 +118,8 @@
   `RULES_PYTHON_ENABLE_PIPSTAR=1` by default. Users of `experimental_index_url` that perform
   cross-builds should add {obj}`target_platforms` to their `pip.parse` invocations, which will
   become mandatory if any cross-builds are required from the next release.
+* (py_library) Attribute {obj}`namespace_package_files` added. It is a hint for
+  optimizing venv creation.
 
 [20251031]: https://github.com/astral-sh/python-build-standalone/releases/tag/20251031
 [20251202]: https://github.com/astral-sh/python-build-standalone/releases/tag/20251202
diff --git a/MODULE.bazel b/MODULE.bazel
index ef7499e..80c7ab1 100644
--- a/MODULE.bazel
+++ b/MODULE.bazel
@@ -238,6 +238,8 @@
     "buildkite_config",
     "implicit_namespace_ns_sub1",
     "implicit_namespace_ns_sub2",
+    "pkgutil_nspkg1",
+    "pkgutil_nspkg2",
     "rules_python_runtime_env_tc_info",
     "somepkg_with_build_files",
     "whl_with_build_files",
diff --git a/python/private/BUILD.bazel b/python/private/BUILD.bazel
index 4404857..e92c45d 100644
--- a/python/private/BUILD.bazel
+++ b/python/private/BUILD.bazel
@@ -675,6 +675,7 @@
         "//:__subpackages__",
     ],
     deps = [
+        ":py_internal_bzl",
         "@bazel_skylib//lib:types",
     ],
 )
diff --git a/python/private/internal_dev_deps.bzl b/python/private/internal_dev_deps.bzl
index d621a5d..0e21d8b 100644
--- a/python/private/internal_dev_deps.bzl
+++ b/python/private/internal_dev_deps.bzl
@@ -67,6 +67,30 @@
         enable_implicit_namespace_pkgs = False,
     )
 
+    whl_from_dir_repo(
+        name = "pkgutil_nspkg1_whl",
+        root = "//tests/repos/pkgutil_nspkg1:BUILD.bazel",
+        output = "pkgutil_nspkg1-1.0-any-none-any.whl",
+    )
+    whl_library(
+        name = "pkgutil_nspkg1",
+        whl_file = "@pkgutil_nspkg1_whl//:pkgutil_nspkg1-1.0-any-none-any.whl",
+        requirement = "pkgutil_nspkg1",
+        enable_implicit_namespace_pkgs = False,
+    )
+
+    whl_from_dir_repo(
+        name = "pkgutil_nspkg2_whl",
+        root = "//tests/repos/pkgutil_nspkg2:BUILD.bazel",
+        output = "pkgutil_nspkg2-1.0-any-none-any.whl",
+    )
+    whl_library(
+        name = "pkgutil_nspkg2",
+        whl_file = "@pkgutil_nspkg2_whl//:pkgutil_nspkg2-1.0-any-none-any.whl",
+        requirement = "pkgutil_nspkg2",
+        enable_implicit_namespace_pkgs = False,
+    )
+
 internal_dev_deps = module_extension(
     implementation = _internal_dev_deps_impl,
     doc = "This extension creates internal rules_python dev dependencies.",
diff --git a/python/private/py_library.bzl b/python/private/py_library.bzl
index b2a9fdd..7e9a59f 100644
--- a/python/private/py_library.bzl
+++ b/python/private/py_library.bzl
@@ -101,6 +101,20 @@
 :::
 """,
         ),
+        "namespace_package_files": lambda: attrb.LabelList(
+            allow_empty = True,
+            allow_files = True,
+            doc = """
+Files whose directories are namespace packages.
+
+When {obj}`--venv_site_packages=yes` is set, this helps inform which directories should be
+treated as namespace packages and expect files from other targets to be contributed.
+This allows optimizing the generation of symlinks to be cheaper at analysis time.
+
+:::{versionadded} VERSION_NEXT_FEATURE
+:::
+""",
+        ),
         "_add_srcs_to_runfiles_flag": lambda: attrb.Label(
             default = labels.ADD_SRCS_TO_RUNFILES,
         ),
@@ -251,6 +265,7 @@
             package,
             version_str,
             site_packages_root = imports[0],
+            namespace_package_files = ctx.files.namespace_package_files,
         )
     else:
         imports = collect_imports(ctx, semantics)
diff --git a/python/private/pypi/BUILD.bazel b/python/private/pypi/BUILD.bazel
index dd86dcf..18caa53 100644
--- a/python/private/pypi/BUILD.bazel
+++ b/python/private/pypi/BUILD.bazel
@@ -431,16 +431,16 @@
         ":attrs_bzl",
         ":deps_bzl",
         ":generate_whl_library_build_bazel_bzl",
+        ":parse_whl_name_bzl",
         ":patch_whl_bzl",
-        ":pep508_requirement_bzl",
         ":pypi_repo_utils_bzl",
         ":whl_metadata_bzl",
         ":whl_target_platforms_bzl",
         "//python/private:auth_bzl",
-        "//python/private:bzlmod_enabled_bzl",
         "//python/private:envsubst_bzl",
         "//python/private:is_standalone_interpreter_bzl",
         "//python/private:repo_utils_bzl",
+        "//python/private:util_bzl",
         "@rules_python_internal//:rules_python_config_bzl",
     ],
 )
diff --git a/python/private/pypi/pypi_repo_utils.bzl b/python/private/pypi/pypi_repo_utils.bzl
index 04c9b5d..d8e3200 100644
--- a/python/private/pypi/pypi_repo_utils.bzl
+++ b/python/private/pypi/pypi_repo_utils.bzl
@@ -16,6 +16,7 @@
 
 load("@bazel_skylib//lib:types.bzl", "types")
 load("//python/private:repo_utils.bzl", "repo_utils")
+load("//python/private:util.bzl", "is_importable_name")
 
 def _get_python_interpreter_attr(mrctx, *, python_interpreter = None):
     """A helper function for getting the `python_interpreter` attribute or it's default
@@ -161,9 +162,44 @@
         **_execute_prep(mrctx, python = python, srcs = srcs, **kwargs)
     )
 
+def _find_namespace_package_files(rctx, install_dir):
+    """Finds all `__init__.py` files that belong to namespace packages.
+
+    A `__init__.py` file belongs to a namespace package if it contains `__path__ =`,
+    `pkgutil`, and `extend_path(`.
+
+    Args:
+        rctx (repository_ctx): The repository context.
+        install_dir (path): The path to the install directory.
+
+    Returns:
+        list[str]: A list of relative paths to `__init__.py` files that belong
+            to namespace packages.
+    """
+
+    repo_root = str(rctx.path(".")) + "/"
+    namespace_package_files = []
+    for top_level_dir in install_dir.readdir():
+        if not is_importable_name(top_level_dir.basename):
+            continue
+        init_py = top_level_dir.get_child("__init__.py")
+        if not init_py.exists:
+            continue
+        content = rctx.read(init_py)
+
+        # Look for code resembling the pkgutil namespace setup code:
+        # __path__ = __import__("pkgutil").extend_path(__path__, __name__)
+        if ("__path__ =" in content and
+            "pkgutil" in content and
+            "extend_path(" in content):
+            namespace_package_files.append(str(init_py).removeprefix(repo_root))
+
+    return namespace_package_files
+
 pypi_repo_utils = struct(
     construct_pythonpath = _construct_pypath,
     execute_checked = _execute_checked,
     execute_checked_stdout = _execute_checked_stdout,
+    find_namespace_package_files = _find_namespace_package_files,
     resolve_python_interpreter = _resolve_python_interpreter,
 )
diff --git a/python/private/pypi/whl_library.bzl b/python/private/pypi/whl_library.bzl
index fdb3f93..201ef59 100644
--- a/python/private/pypi/whl_library.bzl
+++ b/python/private/pypi/whl_library.bzl
@@ -384,11 +384,13 @@
             supports_whl_extraction = rp_config.supports_whl_extraction,
         )
 
+        install_dir_path = whl_path.dirname.get_child("site-packages")
         metadata = whl_metadata(
-            install_dir = whl_path.dirname.get_child("site-packages"),
+            install_dir = install_dir_path,
             read_fn = rctx.read,
             logger = logger,
         )
+        namespace_package_files = pypi_repo_utils.find_namespace_package_files(rctx, install_dir_path)
 
         # NOTE @aignas 2024-06-22: this has to live on until we stop supporting
         # passing `twine` as a `:pkg` library via the `WORKSPACE` builds.
@@ -432,6 +434,7 @@
             data_exclude = rctx.attr.pip_data_exclude,
             group_deps = rctx.attr.group_deps,
             group_name = rctx.attr.group_name,
+            namespace_package_files = namespace_package_files,
         )
     else:
         target_platforms = rctx.attr.experimental_target_platforms or []
@@ -491,6 +494,8 @@
             )
             entry_points[entry_point_without_py] = entry_point_script_name
 
+        namespace_package_files = pypi_repo_utils.find_namespace_package_files(rctx, rctx.path("site-packages"))
+
         build_file_contents = generate_whl_library_build_bazel(
             name = whl_path.basename,
             sdist_filename = sdist_filename,
@@ -509,6 +514,7 @@
                 "pypi_name={}".format(metadata["name"]),
                 "pypi_version={}".format(metadata["version"]),
             ],
+            namespace_package_files = namespace_package_files,
         )
 
     # Delete these in case the wheel had them. They generally don't cause
diff --git a/python/private/pypi/whl_library_targets.bzl b/python/private/pypi/whl_library_targets.bzl
index 39b1ccc..0fe2c52 100644
--- a/python/private/pypi/whl_library_targets.bzl
+++ b/python/private/pypi/whl_library_targets.bzl
@@ -123,6 +123,7 @@
         entry_points = {},
         native = native,
         enable_implicit_namespace_pkgs = False,
+        namespace_package_files = [],
         rules = struct(
             copy_file = copy_file,
             py_binary = py_binary,
@@ -169,6 +170,8 @@
         enable_implicit_namespace_pkgs: {type}`boolean` generate __init__.py
             files for namespace pkgs.
         native: {type}`native` The native struct for overriding in tests.
+        namespace_package_files: {type}`list[str]` A list of labels of files whose
+            directories are namespace packages.
         rules: {type}`struct` A struct with references to rules for creating targets.
     """
     dependencies = sorted([normalize_name(d) for d in dependencies])
@@ -365,7 +368,7 @@
         )
 
         if not enable_implicit_namespace_pkgs:
-            srcs = srcs + select({
+            generated_namespace_package_files = select({
                 Label("//python/config_settings:is_venvs_site_packages"): [],
                 "//conditions:default": rules.create_inits(
                     srcs = srcs + data + pyi_srcs,
@@ -373,6 +376,8 @@
                     root = "site-packages",
                 ),
             })
+            namespace_package_files += generated_namespace_package_files
+            srcs = srcs + generated_namespace_package_files
 
         rules.py_library(
             name = py_library_label,
@@ -391,6 +396,7 @@
             tags = tags,
             visibility = impl_vis,
             experimental_venvs_site_packages = Label("@rules_python//python/config_settings:venvs_site_packages"),
+            namespace_package_files = namespace_package_files,
         )
 
 def _config_settings(dependencies_by_platform, dependencies_with_markers, rules, native = native, **kwargs):
diff --git a/python/private/util.bzl b/python/private/util.bzl
index d3053fe..31f317f 100644
--- a/python/private/util.bzl
+++ b/python/private/util.bzl
@@ -15,6 +15,7 @@
 """Functionality shared by multiple pieces of code."""
 
 load("@bazel_skylib//lib:types.bzl", "types")
+load("//python/private:py_internal.bzl", "py_internal")
 
 def copy_propagating_kwargs(from_kwargs, into_kwargs = None):
     """Copies args that must be compatible between two targets with a dependency relationship.
@@ -69,3 +70,18 @@
             attrs["tags"] = tags + [tag]
     else:
         attrs["tags"] = [tag]
+
+def is_importable_name(name):
+    # Requires Bazel 8+
+    if hasattr(py_internal, "regex_match"):
+        # ?U means activates unicode matching (Python allows most unicode
+        # in module names / identifiers).
+        # \w matches alphanumeric and underscore.
+        # NOTE: regex_match has an implicit ^ and $
+        return py_internal.regex_match(name, "(?U)\\w+")
+    else:
+        # Otherwise, use a rough hueristic that should catch most cases.
+        return (
+            "." not in name and
+            "-" not in name
+        )
diff --git a/python/private/venv_runfiles.bzl b/python/private/venv_runfiles.bzl
index 7ff5c85..851d701 100644
--- a/python/private/venv_runfiles.bzl
+++ b/python/private/venv_runfiles.bzl
@@ -13,7 +13,15 @@
     "VenvSymlinkEntry",
     "VenvSymlinkKind",
 )
-load(":py_internal.bzl", "py_internal")
+load(":util.bzl", "is_importable_name")
+
+# List of top-level package names that are known to be namespace
+# packages, but cannot be detected as such automatically.
+_WELL_KNOWN_NAMESPACE_PACKAGES = [
+    # nvidia wheels incorrectly use an empty `__init__.py` file, even
+    # though multiple distributions install into the directory.
+    "nvidia",
+]
 
 def create_venv_app_files(ctx, deps, venv_dir_map):
     """Creates the tree of app-specific files for a venv for a binary.
@@ -232,22 +240,34 @@
                 if venv_path not in keep_map:
                     keep_map[venv_path] = file
 
-def _is_importable_name(name):
-    # Requires Bazel 8+
-    if hasattr(py_internal, "regex_match"):
-        # ?U means activates unicode matching (Python allows most unicode
-        # in module names / identifiers).
-        # \w matches alphanumeric and underscore.
-        # NOTE: regex_match has an implicit ^ and $
-        return py_internal.regex_match(name, "(?U)\\w+")
-    else:
-        # Otherwise, use a rough hueristic that should catch most cases.
-        return (
-            "." not in name and
-            "-" not in name
-        )
+def _get_file_venv_path(ctx, f, site_packages_root):
+    """Computes a file's venv_path if it's under the site_packages_root.
 
-def get_venv_symlinks(ctx, files, package, version_str, site_packages_root):
+    Args:
+        ctx: The current ctx.
+        f: The file to compute the venv_path for.
+        site_packages_root: The site packages root path.
+
+    Returns:
+        A tuple `(venv_path, rf_root_path)` if the file is under
+        `site_packages_root`, otherwise `(None, None)`.
+    """
+    rf_root_path = runfiles_root_path(ctx, f.short_path)
+    _, _, repo_rel_path = rf_root_path.partition("/")
+    head, found_sp_root, venv_path = repo_rel_path.partition(site_packages_root)
+    if head or not found_sp_root:
+        # If head is set, then the path didn't start with site_packages_root
+        # if found_sp_root is empty, then it means it wasn't found at all.
+        return (None, None)
+    return (venv_path, rf_root_path)
+
+def get_venv_symlinks(
+        ctx,
+        files,
+        package,
+        version_str,
+        site_packages_root,
+        namespace_package_files = []):
     """Compute the VenvSymlinkEntry objects for a library.
 
     Args:
@@ -259,6 +279,9 @@
         version_str: {type}`str` the distribution's version.
         site_packages_root: {type}`str` prefix under which files are
             considered to be part of the installed files.
+        namespace_package_files: {type}`list[File]` a list of files
+            that are pkgutil-style namespace packages and cannot be
+            directly linked.
 
     Returns:
         {type}`list[VenvSymlinkEntry]` the entries that describe how
@@ -276,8 +299,24 @@
 
     all_files = sorted(files, key = lambda f: f.short_path)
 
+    # dict[str venv-relative dirname, bool is_namespace_package]
+    namespace_package_dirs = {
+        ns: True
+        for ns in _WELL_KNOWN_NAMESPACE_PACKAGES
+    }
+
     # venv paths that cannot be directly linked. Dict acting as set.
-    cannot_be_linked_directly = {}
+    cannot_be_linked_directly = {
+        dirname: True
+        for dirname in namespace_package_dirs.keys()
+    }
+    for f in namespace_package_files:
+        venv_path, _ = _get_file_venv_path(ctx, f, site_packages_root)
+        if venv_path == None:
+            continue
+        ns_dir = paths.dirname(venv_path)
+        namespace_package_dirs[ns_dir] = True
+        cannot_be_linked_directly[ns_dir] = True
 
     # dict[str path, VenvSymlinkEntry]
     # Where path is the venv path (i.e. relative to site_packages_prefix)
@@ -286,9 +325,6 @@
     # List of (File, str venv_path) tuples
     files_left_to_link = []
 
-    # dict[str dirname, bool is_namespace_package]
-    namespace_package_dirs = {}
-
     # We want to minimize the number of files symlinked. Ideally, only the
     # top-level directories are symlinked. Unfortunately, shared libraries
     # complicate matters: if a shared library's directory is linked, then the
@@ -298,12 +334,8 @@
     # all the parent directories of the shared library can't be linked
     # directly.
     for src in all_files:
-        rf_root_path = runfiles_root_path(ctx, src.short_path)
-        _, _, repo_rel_path = rf_root_path.partition("/")
-        head, found_sp_root, venv_path = repo_rel_path.partition(site_packages_root)
-        if head or not found_sp_root:
-            # If head is set, then the path didn't start with site_packages_root
-            # if found_sp_root is empty, then it means it wasn't found at all.
+        venv_path, rf_root_path = _get_file_venv_path(ctx, src, site_packages_root)
+        if venv_path == None:
             continue
 
         filename = paths.basename(venv_path)
@@ -336,7 +368,7 @@
             # If its already known to be non-implicit namespace, then skip
             namespace_package_dirs.get(top_level_dirname, True) and
             # It must be an importable name to be an implicit namespace package
-            _is_importable_name(top_level_dirname)
+            is_importable_name(top_level_dirname)
         ):
             namespace_package_dirs.setdefault(top_level_dirname, True)
 
@@ -350,7 +382,10 @@
     # to avoid conflict merging later.
     for dirname, is_namespace_package in namespace_package_dirs.items():
         if is_namespace_package:
-            cannot_be_linked_directly[dirname] = True
+            # If it's already in cannot_be_linked_directly due to pkgutil_namespace_packages
+            # then we should not unset it.
+            if not cannot_be_linked_directly.get(dirname, False):
+                cannot_be_linked_directly[dirname] = True
 
     # At this point, venv_symlinks has entries for the shared libraries
     # and cannot_be_linked_directly has the directories that cannot be
diff --git a/tests/pypi/whl_library_targets/whl_library_targets_tests.bzl b/tests/pypi/whl_library_targets/whl_library_targets_tests.bzl
index 1d80340..9b57403 100644
--- a/tests/pypi/whl_library_targets/whl_library_targets_tests.bzl
+++ b/tests/pypi/whl_library_targets/whl_library_targets_tests.bzl
@@ -257,6 +257,10 @@
         "tags": ["pypi_name=Foo", "pypi_version=0"],
         "visibility": ["//visibility:public"],
         "experimental_venvs_site_packages": Label("//python/config_settings:venvs_site_packages"),
+        "namespace_package_files": [] + select({
+            Label("//python/config_settings:is_venvs_site_packages"): [],
+            "//conditions:default": ["_create_inits_target"],
+        }),
     })  # buildifier: @unsorted-dict-items
 
     env.expect.that_collection(mock_glob.calls).contains_exactly([
@@ -380,6 +384,10 @@
         "tags": ["tag1", "tag2"],
         "visibility": ["//visibility:public"],
         "experimental_venvs_site_packages": Label("//python/config_settings:venvs_site_packages"),
+        "namespace_package_files": [] + select({
+            Label("//python/config_settings:is_venvs_site_packages"): [],
+            "//conditions:default": ["_create_inits_target"],
+        }),
     })  # buildifier: @unsorted-dict-items
 
 _tests.append(_test_whl_and_library_deps)
@@ -449,6 +457,10 @@
         "tags": [],
         "visibility": ["@pypi__config//_groups:__pkg__"],
         "experimental_venvs_site_packages": Label("//python/config_settings:venvs_site_packages"),
+        "namespace_package_files": [] + select({
+            Label("//python/config_settings:is_venvs_site_packages"): [],
+            "//conditions:default": ["_create_inits_target"],
+        }),
     })  # buildifier: @unsorted-dict-items
 
     env.expect.that_collection(mock_glob.calls, expr = "glob calls").contains_exactly([
diff --git a/tests/repos/BUILD.bazel b/tests/repos/BUILD.bazel
new file mode 100644
index 0000000..e69de29
--- /dev/null
+++ b/tests/repos/BUILD.bazel
diff --git a/tests/repos/pkgutil_nspkg1/BUILD.bazel b/tests/repos/pkgutil_nspkg1/BUILD.bazel
new file mode 100644
index 0000000..e69de29
--- /dev/null
+++ b/tests/repos/pkgutil_nspkg1/BUILD.bazel
diff --git a/tests/repos/pkgutil_nspkg1/nspkg/__init__.py b/tests/repos/pkgutil_nspkg1/nspkg/__init__.py
new file mode 100644
index 0000000..c4da2cf
--- /dev/null
+++ b/tests/repos/pkgutil_nspkg1/nspkg/__init__.py
@@ -0,0 +1,2 @@
+# __init__.py
+__path__ = __import__("pkgutil").extend_path(__path__, __name__)
diff --git a/tests/repos/pkgutil_nspkg1/nspkg/one/a.txt b/tests/repos/pkgutil_nspkg1/nspkg/one/a.txt
new file mode 100644
index 0000000..f4dbe63
--- /dev/null
+++ b/tests/repos/pkgutil_nspkg1/nspkg/one/a.txt
@@ -0,0 +1 @@
+dummy content
\ No newline at end of file
diff --git a/tests/repos/pkgutil_nspkg1/pkgutil_nspkg1-1.0.dist-info/METADATA b/tests/repos/pkgutil_nspkg1/pkgutil_nspkg1-1.0.dist-info/METADATA
new file mode 100644
index 0000000..7ffb60b
--- /dev/null
+++ b/tests/repos/pkgutil_nspkg1/pkgutil_nspkg1-1.0.dist-info/METADATA
@@ -0,0 +1,3 @@
+Metadata-Version: 2.1
+Name: pkgutil-nspkg1
+Version: 1.0
diff --git a/tests/repos/pkgutil_nspkg1/pkgutil_nspkg1-1.0.dist-info/RECORD b/tests/repos/pkgutil_nspkg1/pkgutil_nspkg1-1.0.dist-info/RECORD
new file mode 100644
index 0000000..e039fee
--- /dev/null
+++ b/tests/repos/pkgutil_nspkg1/pkgutil_nspkg1-1.0.dist-info/RECORD
@@ -0,0 +1,5 @@
+nspkg/__init__.py,sha256=d10f14d9ce938ae14416c3c1f4b516c4f40dfe0c9bf973a833e4ef40517ba7d0,81
+nspkg/one/a.txt,sha256=bf0ecbdb9b814248d086c9b69cf26182d9d4138f2ad3d0637c4555fc8cbf68e5,13
+pkgutil_nspkg1-1.0.dist-info/METADATA,sha256=49525c3e6f1fc8f46d9c92c996e35f81327f9fe417df2d9d10784fa4e9cfe84b,59
+pkgutil_nspkg1-1.0.dist-info/WHEEL,sha256=d652ec50af6f144788dc1ffef052e8833b6704e98818e6cf78d80a625ad498fb,105
+pkgutil_nspkg1-1.0.dist-info/RECORD,,
diff --git a/tests/repos/pkgutil_nspkg1/pkgutil_nspkg1-1.0.dist-info/WHEEL b/tests/repos/pkgutil_nspkg1/pkgutil_nspkg1-1.0.dist-info/WHEEL
new file mode 100644
index 0000000..dad2b72
--- /dev/null
+++ b/tests/repos/pkgutil_nspkg1/pkgutil_nspkg1-1.0.dist-info/WHEEL
@@ -0,0 +1,4 @@
+Wheel-Version: 1.0
+Generator: rules_python_whl_from_dir_repo
+Root-Is-Purelib: true
+Tag: py3-none-any
diff --git a/tests/repos/pkgutil_nspkg2/BUILD.bazel b/tests/repos/pkgutil_nspkg2/BUILD.bazel
new file mode 100644
index 0000000..e69de29
--- /dev/null
+++ b/tests/repos/pkgutil_nspkg2/BUILD.bazel
diff --git a/tests/repos/pkgutil_nspkg2/nspkg/__init__.py b/tests/repos/pkgutil_nspkg2/nspkg/__init__.py
new file mode 100644
index 0000000..c4da2cf
--- /dev/null
+++ b/tests/repos/pkgutil_nspkg2/nspkg/__init__.py
@@ -0,0 +1,2 @@
+# __init__.py
+__path__ = __import__("pkgutil").extend_path(__path__, __name__)
diff --git a/tests/repos/pkgutil_nspkg2/nspkg/two/b.txt b/tests/repos/pkgutil_nspkg2/nspkg/two/b.txt
new file mode 100644
index 0000000..f4dbe63
--- /dev/null
+++ b/tests/repos/pkgutil_nspkg2/nspkg/two/b.txt
@@ -0,0 +1 @@
+dummy content
\ No newline at end of file
diff --git a/tests/repos/pkgutil_nspkg2/pkgutil_nspkg2-1.0.dist-info/METADATA b/tests/repos/pkgutil_nspkg2/pkgutil_nspkg2-1.0.dist-info/METADATA
new file mode 100644
index 0000000..368e64c
--- /dev/null
+++ b/tests/repos/pkgutil_nspkg2/pkgutil_nspkg2-1.0.dist-info/METADATA
@@ -0,0 +1,3 @@
+Metadata-Version: 2.1
+Name: pkgutil-nspkg2
+Version: 1.0
diff --git a/tests/repos/pkgutil_nspkg2/pkgutil_nspkg2-1.0.dist-info/RECORD b/tests/repos/pkgutil_nspkg2/pkgutil_nspkg2-1.0.dist-info/RECORD
new file mode 100644
index 0000000..c93970c
--- /dev/null
+++ b/tests/repos/pkgutil_nspkg2/pkgutil_nspkg2-1.0.dist-info/RECORD
@@ -0,0 +1,5 @@
+nspkg/__init__.py,sha256=d10f14d9ce938ae14416c3c1f4b516c4f40dfe0c9bf973a833e4ef40517ba7d0,81
+nspkg/two/b.txt,sha256=bf0ecbdb9b814248d086c9b69cf26182d9d4138f2ad3d0637c4555fc8cbf68e5,13
+pkgutil_nspkg2-1.0.dist-info/METADATA,sha256=9a72654b480f17c55df07fe27d026c4ea461c493babcc675f26fbcfce40828b6,59
+pkgutil_nspkg2-1.0.dist-info/WHEEL,sha256=d652ec50af6f144788dc1ffef052e8833b6704e98818e6cf78d80a625ad498fb,105
+pkgutil_nspkg2-1.0.dist-info/RECORD,,
diff --git a/tests/repos/pkgutil_nspkg2/pkgutil_nspkg2-1.0.dist-info/WHEEL b/tests/repos/pkgutil_nspkg2/pkgutil_nspkg2-1.0.dist-info/WHEEL
new file mode 100644
index 0000000..dad2b72
--- /dev/null
+++ b/tests/repos/pkgutil_nspkg2/pkgutil_nspkg2-1.0.dist-info/WHEEL
@@ -0,0 +1,4 @@
+Wheel-Version: 1.0
+Generator: rules_python_whl_from_dir_repo
+Root-Is-Purelib: true
+Tag: py3-none-any
diff --git a/tests/venv_site_packages_libs/app_files_building/app_files_building_tests.bzl b/tests/venv_site_packages_libs/app_files_building/app_files_building_tests.bzl
index e92c0aa..f85508d 100644
--- a/tests/venv_site_packages_libs/app_files_building/app_files_building_tests.bzl
+++ b/tests/venv_site_packages_libs/app_files_building/app_files_building_tests.bzl
@@ -2,8 +2,12 @@
 
 load("@rules_testing//lib:analysis_test.bzl", "analysis_test")
 load("@rules_testing//lib:test_suite.bzl", "test_suite")
+load("//python:py_info.bzl", "PyInfo")
+load("//python:py_library.bzl", "py_library")
+load("//python/private:common_labels.bzl", "labels")  # buildifier: disable=bzl-visibility
 load("//python/private:py_info.bzl", "VenvSymlinkEntry", "VenvSymlinkKind")  # buildifier: disable=bzl-visibility
 load("//python/private:venv_runfiles.bzl", "build_link_map", "get_venv_symlinks")  # buildifier: disable=bzl-visibility
+load("//tests/support:support.bzl", "SUPPORTS_BZLMOD_UNIXY")
 
 def _empty_files_impl(ctx):
     files = []
@@ -33,6 +37,7 @@
 def _ctx(workspace_name = "_main"):
     return struct(
         workspace_name = workspace_name,
+        label = Label("@@FAKE-CTX//:fake_ctx"),
     )
 
 def _file(short_path):
@@ -330,6 +335,178 @@
     # The point of the optimization is to avoid having to merge conflicts.
     env.expect.that_collection(conflicts).contains_exactly([])
 
+def _test_optimized_grouping_pkgutil_namespace_packages(name):
+    empty_files(
+        name = name + "_files",
+        paths = [
+            "site-packages/pkgutilns/__init__.py",
+            "site-packages/pkgutilns/foo.py",
+            # Special cases: These dirnames under site-packages are always
+            # treated as namespace packages
+            "site-packages/nvidia/whatever/w.py",
+        ],
+    )
+    analysis_test(
+        name = name,
+        impl = _test_optimized_grouping_pkgutil_namespace_packages_impl,
+        target = name + "_files",
+    )
+
+_tests.append(_test_optimized_grouping_pkgutil_namespace_packages)
+
+def _test_optimized_grouping_pkgutil_namespace_packages_impl(env, target):
+    test_ctx = _ctx(workspace_name = env.ctx.workspace_name)
+    files = target.files.to_list()
+    ns_inits = [f for f in files if f.basename == "__init__.py"]
+
+    entries = get_venv_symlinks(
+        test_ctx,
+        files,
+        package = "pkgutilns",
+        version_str = "1.0",
+        site_packages_root = env.ctx.label.package + "/site-packages",
+        namespace_package_files = ns_inits,
+    )
+    actual = _venv_symlinks_from_entries(entries)
+
+    rr = "{}/{}/site-packages/".format(test_ctx.workspace_name, env.ctx.label.package)
+    expected = [
+        _venv_symlink(
+            "pkgutilns/__init__.py",
+            link_to_path = rr + "pkgutilns/__init__.py",
+            files = [
+                "tests/venv_site_packages_libs/app_files_building/site-packages/pkgutilns/__init__.py",
+            ],
+        ),
+        _venv_symlink(
+            "pkgutilns/foo.py",
+            link_to_path = rr + "pkgutilns/foo.py",
+            files = [
+                "tests/venv_site_packages_libs/app_files_building/site-packages/pkgutilns/foo.py",
+            ],
+        ),
+        _venv_symlink(
+            "nvidia/whatever",
+            link_to_path = rr + "nvidia/whatever",
+            files = [
+                "tests/venv_site_packages_libs/app_files_building/site-packages/nvidia/whatever/w.py",
+            ],
+        ),
+    ]
+    expected = sorted(expected, key = lambda e: (e.link_to_path, e.venv_path))
+    env.expect.that_collection(
+        actual,
+    ).contains_exactly(expected)
+
+    _, conflicts = build_link_map(test_ctx, entries, return_conflicts = True)
+
+    # The point of the optimization is to avoid having to merge conflicts.
+    env.expect.that_collection(conflicts).contains_exactly([])
+
+def _test_optimized_grouping_pkgutil_whls(name):
+    """Verify that the whl_library pkgutli style detection logic works."""
+    py_library(
+        name = name + "_lib",
+        deps = [
+            "@pkgutil_nspkg1//:pkg",
+            "@pkgutil_nspkg2//:pkg",
+        ],
+        target_compatible_with = SUPPORTS_BZLMOD_UNIXY,
+    )
+    analysis_test(
+        name = name,
+        impl = _test_optimized_grouping_pkgutil_whls_impl,
+        target = name + "_lib",
+        config_settings = {
+            labels.VENVS_SITE_PACKAGES: "yes",
+        },
+        attr_values = dict(
+            target_compatible_with = SUPPORTS_BZLMOD_UNIXY,
+        ),
+    )
+
+_tests.append(_test_optimized_grouping_pkgutil_whls)
+
+def _test_optimized_grouping_pkgutil_whls_impl(env, target):
+    test_ctx = _ctx(workspace_name = env.ctx.workspace_name)
+    actual_raw_entries = target[PyInfo].venv_symlinks.to_list()
+
+    actual = _venv_symlinks_from_entries(actual_raw_entries)
+
+    # The important condition is that the top-level 'nspkg' directory
+    # is NOT linked because it's a pkgutil namespace package.
+    env.expect.that_collection(actual).contains_exactly([
+        # Entries from pkgutil_ns1
+        _venv_symlink(
+            "nspkg/__init__.py",
+            link_to_path = "+internal_dev_deps+pkgutil_nspkg1/site-packages/nspkg/__init__.py",
+            files = [
+                "../+internal_dev_deps+pkgutil_nspkg1/site-packages/nspkg/__init__.py",
+            ],
+        ),
+        _venv_symlink(
+            "nspkg/one",
+            link_to_path = "+internal_dev_deps+pkgutil_nspkg1/site-packages/nspkg/one",
+            files = [
+                "../+internal_dev_deps+pkgutil_nspkg1/site-packages/nspkg/one/a.txt",
+            ],
+        ),
+        _venv_symlink(
+            "pkgutil_nspkg1-1.0.dist-info",
+            link_to_path = "+internal_dev_deps+pkgutil_nspkg1/site-packages/pkgutil_nspkg1-1.0.dist-info",
+            files = [
+                "../+internal_dev_deps+pkgutil_nspkg1/site-packages/pkgutil_nspkg1-1.0.dist-info/INSTALLER",
+                "../+internal_dev_deps+pkgutil_nspkg1/site-packages/pkgutil_nspkg1-1.0.dist-info/METADATA",
+                "../+internal_dev_deps+pkgutil_nspkg1/site-packages/pkgutil_nspkg1-1.0.dist-info/WHEEL",
+            ],
+        ),
+        # Entries from pkgutil_ns2
+        _venv_symlink(
+            "nspkg/__init__.py",
+            link_to_path = "+internal_dev_deps+pkgutil_nspkg2/site-packages/nspkg/__init__.py",
+            files = [
+                "../+internal_dev_deps+pkgutil_nspkg2/site-packages/nspkg/__init__.py",
+            ],
+        ),
+        _venv_symlink(
+            "nspkg/two",
+            link_to_path = "+internal_dev_deps+pkgutil_nspkg2/site-packages/nspkg/two",
+            files = [
+                "../+internal_dev_deps+pkgutil_nspkg2/site-packages/nspkg/two/b.txt",
+            ],
+        ),
+        _venv_symlink(
+            "pkgutil_nspkg2-1.0.dist-info",
+            link_to_path = "+internal_dev_deps+pkgutil_nspkg2/site-packages/pkgutil_nspkg2-1.0.dist-info",
+            files = [
+                "../+internal_dev_deps+pkgutil_nspkg2/site-packages/pkgutil_nspkg2-1.0.dist-info/INSTALLER",
+                "../+internal_dev_deps+pkgutil_nspkg2/site-packages/pkgutil_nspkg2-1.0.dist-info/METADATA",
+                "../+internal_dev_deps+pkgutil_nspkg2/site-packages/pkgutil_nspkg2-1.0.dist-info/WHEEL",
+            ],
+        ),
+    ])
+
+    # Verifying that the expected VenvSymlink structure is processed with minimal number
+    # of conflicts (Just the single pkgutil style __init__.py file)
+    _, conflicts = build_link_map(test_ctx, actual_raw_entries, return_conflicts = True)
+    env.expect.that_collection(_venv_symlinks_from_entries(conflicts[0])).contains_exactly([
+        _venv_symlink(
+            "nspkg/__init__.py",
+            link_to_path = "+internal_dev_deps+pkgutil_nspkg1/site-packages/nspkg/__init__.py",
+            files = [
+                "../+internal_dev_deps+pkgutil_nspkg1/site-packages/nspkg/__init__.py",
+            ],
+        ),
+        _venv_symlink(
+            "nspkg/__init__.py",
+            link_to_path = "+internal_dev_deps+pkgutil_nspkg2/site-packages/nspkg/__init__.py",
+            files = [
+                "../+internal_dev_deps+pkgutil_nspkg2/site-packages/nspkg/__init__.py",
+            ],
+        ),
+    ])
+    env.expect.that_collection(conflicts).has_size(1)
+
 def _test_package_version_filtering(name):
     analysis_test(
         name = name,