
I'm trying to migrate my company's project from a legacy build tool to Bazel. I've run into the following problem and have searched a lot, but so far I haven't found a solution.

Here's the thing:

To comply with open-source audits, we must provide a list of the open-source software built into our binary. Since external dependencies are introduced by repository rules, my intuition is to query these rules and get their URLs. However, as far as I know, the query/cquery subcommands don't provide this yet: they can print rules, targets, and BUILD files, but not repository rules or their attributes.

Is there a way to gather this information from the repository rules in WORKSPACE? Doing it manually isn't viable, since my company has thousands of projects and the dependencies change frequently.

For example, a workspace rule:

http_archive(
    name = "testrunner",
    urls = ["https://github.com/testrunner/v2.zip"],
    sha256 = "..."
)

This dependency is used by a rule named "my_target", so what I'd expect is to be able to query the dependency like this:

> bazel queryExtDep my_target
External Dependency of my_target: name->testrunner, urls = "https://github.com/testrunner/v2.zip"
Chen Zhuo

1 Answer


--experimental_repository_resolved_file will give you all that information in a single Starlark file, which you can easily process with Starlark, Python, etc. to extract the information you're looking for.

The resolved file looks something like this:

resolved = [
    ...,
    {
        "original_rule_class": "@bazel_tools//tools/build_defs/repo:git.bzl%git_repository",
        "original_attributes": {
            "name": "com_google_protobuf",
            "remote": "https://github.com/google/protobuf",
            "branch": "master"
        },
        "repositories": [
            {
                "rule_class": "@bazel_tools//tools/build_defs/repo:git.bzl%git_repository",
                "attributes": {
                    "remote": "https://github.com/google/protobuf",
                    "commit": "78ba021b846e060d5b8f3424259d30a1f3ae4eef",
                    "shallow_since": "2018-02-07",
                    "init_submodules": False,
                    "verbose": False,
                    "strip_prefix": "",
                    "patches": [],
                    "patch_tool": "patch",
                    "patch_args": [
                        "-p0"
                    ],
                    "patch_cmds": [],
                    "name": "com_google_protobuf"
                }
            }
        ]
    }
]

This includes the original attributes, which is where the URL you're looking for is. It also includes any additional information returned by the repository rule (e.g. for git_repository, the actual commit that a given ref resolves to).
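As a rough illustration of the processing step, here is a minimal Python sketch that pulls repository names and source locations out of a resolved file. It rests on several assumptions that are mine rather than anything Bazel guarantees: that the file is a single top-level `resolved = [...]` assignment made up only of literals (so Python's `ast` module can read it), and that the interesting attributes are named `url`, `urls`, or `remote`. The helper names `load_resolved` and `external_sources` are hypothetical.

```python
import ast

# Attribute names assumed to hold source locations (an assumption of this
# sketch; extend the tuple for repository rules that use other names).
URL_KEYS = ("url", "urls", "remote")


def load_resolved(text):
    """Return the list assigned to `resolved` in the text of a resolved file.

    Assumes the file holds only literals (dicts, lists, strings, booleans),
    which also happen to be valid Python literals.
    """
    tree = ast.parse(text)
    for node in tree.body:
        if isinstance(node, ast.Assign) and any(
            isinstance(target, ast.Name) and target.id == "resolved"
            for target in node.targets
        ):
            return ast.literal_eval(node.value)
    raise ValueError("no top-level `resolved = [...]` assignment found")


def external_sources(resolved):
    """Map each repository name to the URLs/remotes in its original attributes."""
    sources = {}
    for entry in resolved:
        if not isinstance(entry, dict):
            continue  # skip anything that is not a repository record
        attrs = entry.get("original_attributes", {})
        name = attrs.get("name")
        if name is None:
            continue
        found = []
        for key in URL_KEYS:
            value = attrs.get(key)
            if isinstance(value, str):
                found.append(value)
            elif isinstance(value, (list, tuple)):
                found.extend(value)
        if found:
            sources[name] = found
    return sources
```

You would call it along the lines of `external_sources(load_resolved(open("resolved.bzl").read()))`, where `resolved.bzl` is a placeholder for whatever path you passed to the flag. Note that this lists every repository recorded in the file, not just those used by one particular target, so narrowing it down to `my_target` would need an extra step.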

I got that example from the blog post introducing that flag, which also has some more background.

Brian Silverman
  • A very good point, it sounds like what I need. Let me run some experiments and then confirm. By the way, does it also apply to the new bzlmod feature? Also, the official Bazel documentation is really not good... – Chen Zhuo Apr 03 '22 at 14:30
  • I noticed that the documentation states: "If non-empty, write a Starlark value with the resolved information of all Starlark repository rules that were executed." This may imply that an external dependency doesn't show up in the output if it is cached. If that's true, it is not sufficient: I need to list precisely all the open-source/third-party software built into my binary. – Chen Zhuo Apr 03 '22 at 15:04
  • I think `bazel build --nobuild --experimental_repository_resolved_file=X //...` will include everything. Executing a rule usually means the thing that happens every time you reference the package after restarting the bazel server, regardless of whether that runs any commands vs produces cache hits. I would guess that it will include the repositories generated by bzlmod modules, but I have not tried that. – Brian Silverman Apr 03 '22 at 23:15
  • I tried it, but unfortunately it didn't work. The first time I ran `bazel build --nobuild --experimental_repository_resolved_file=X //...`, the output was just as we expected. The second time (when the cache is hit), however, there was nothing in the output... – Chen Zhuo Apr 06 '22 at 02:15
  • Did you try `bazel shutdown` in between the first and second time you run it? – Brian Silverman Apr 06 '22 at 18:41
  • Thanks for reminding me, but that didn't work either. After a `bazel shutdown`, everything in @bazel_tools is reloaded, but no external dependencies are. In fact, I tried `bazel clean` and the result was similar. The external dependencies are reloaded only after a `bazel clean --expunge`... It seems the cached external dependencies expire only when a deep clean is executed. – Chen Zhuo Apr 07 '22 at 02:56