This document details the steps to migrate a package to build with Bazel. These steps are easiest to understand with a working example, so this doc references `tfjs-core`'s setup as much as possible. Since this migration is still in progress, the steps and processes listed here may change as we improve on the process, add features to each package's build, and create tfjs-specific build functions.
Migrating a package to Bazel involves adding Bazel targets that build the package, run its tests, and pack it for publishing to npm. To ease the transition, we're incrementally moving packages to build with Bazel, starting with root packages (tfjs-core, tfjs-backend-cpu) and gradually expanding to leaf packages. This differs from our original approach of maintaining our current build and a new Bazel build in parallel, which did not work because Bazel required some changes to the TypeScript sources.
- Bazel will only make a dependency or file available to the build if you explicitly declare it as a dependency / input to the rule you're using.
- All Bazel builds use the root package.json, so you may have to add packages to it. As long as you use `@npm//dependency-name` in BUILD files to add dependencies, you won't need to worry about the build accidentally seeing the package's `node_modules` directory instead of the root `node_modules`. Bazel will only make the root `node_modules` directory visible to the build.
  - Even though the build doesn't use the package's `node_modules`, you may have to run `yarn` within the package to get code completion to work correctly. We're looking into why this is the case.
- There may be issues with depending on explicitly pinned versions of `@tensorflow` scoped packages, which might affect some demos if they're migrated to use Bazel. We might just want to leave demos out of Bazel so they're easier to understand.
These steps are general guidelines for how to build a package with Bazel. They should work for most packages, but there may be some exceptions (e.g. wasm, react native).
A package's dependencies must be migrated before it can be migrated. Take a look at the package's issue, which can be found by checking #5287, to find its dependencies.
Bazel (through `rules_nodejs`) uses a single root `package.json` for its npm dependencies. When converting a package to build with Bazel, dependencies in the package's `package.json` will need to be added to the root `package.json` as well.
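To see which dependencies still need to be added to the root `package.json`, the comparison can be sketched as below. This is a minimal illustration and not a script from the repository: `missingFromRoot` and the example version strings are hypothetical, and skipping `link:` dependencies is an assumption (sibling monorepo packages are expected to become Bazel targets instead of npm dependencies).

```typescript
// Hypothetical helper, not part of the tfjs repo.
type DependencyMap = {[name: string]: string};

interface PackageJson {
  dependencies?: DependencyMap;
  devDependencies?: DependencyMap;
}

// Merges `dependencies` and `devDependencies` into one map.
function allDeps(json: PackageJson): DependencyMap {
  const merged: DependencyMap = {};
  for (const group of [json.dependencies, json.devDependencies]) {
    if (!group) {
      continue;
    }
    for (const name of Object.keys(group)) {
      merged[name] = group[name];
    }
  }
  return merged;
}

// Returns the sorted names of dependencies that `pkg` declares but `root`
// does not. Assumption: `link:` dependencies are skipped.
function missingFromRoot(pkg: PackageJson, root: PackageJson): string[] {
  const wanted = allDeps(pkg);
  const available = allDeps(root);
  return Object.keys(wanted)
      .filter(name => wanted[name].indexOf('link:') !== 0)
      .filter(name => !Object.prototype.hasOwnProperty.call(available, name))
      .sort();
}

// Illustrative inputs; the version strings are made up.
const examplePackage: PackageJson = {
  dependencies: {'seedrandom': '^3.0.5'},
  devDependencies: {
    '@tensorflow/tfjs-core': 'link:../tfjs-core',
    'jasmine-core': '~3.1.0',
  },
};
const exampleRoot: PackageJson = {
  devDependencies: {'jasmine-core': '~3.1.0'},
};
const missing = missingFromRoot(examplePackage, exampleRoot);
// missing is ['seedrandom']
```

The sketch only compares names. Version mismatches between the package and the root would need a separate check.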
Bazel looks for targets to run in `BUILD` and `BUILD.bazel` files. Use the `.bazel` extension since blaze uses `BUILD`. You may want to install an extension for your editor to get syntax highlighting. Here's the vscode extension.
This BUILD file will handle package-wide rules like bundling for npm.
This BUILD file will compile the source files of the package using `ts_library` and may also define test bundles.
In the `src` BUILD.bazel file, we use `ts_library` to compile the package's typescript files. ts_library is a rule provided by rules_nodejs. We wrap `ts_library` in a macro that sets some project-specific settings.
Here's an example of how `tfjs-core` uses `ts_library` to build.
`tfjs-core/src/BUILD.bazel`
load("//tools:defaults.bzl", "ts_library")
TEST_SRCS = [
    "**/*_test.ts",
    "image_test_util.ts",
]
# Compiles the majority of tfjs-core using the `@tensorflow/tfjs-core/dist`
# module name.
ts_library(
    name = "tfjs-core_src_lib",
    srcs = glob(
        ["**/*.ts"],
        exclude = TEST_SRCS + ["index.ts"],
    ),
    module_name = "@tensorflow/tfjs-core/dist",
    deps = [
        "@npm//@types",
        "@npm//jasmine-core",
        "@npm//seedrandom",
    ],
)
# Compiles the `index.ts` entrypoint of tfjs-core separately from the rest of
# the sources in order to use the `@tensorflow/tfjs-core` module name instead
# of `@tensorflow/tfjs-core/dist`.
ts_library(
    name = "tfjs-core_lib",
    srcs = ["index.ts"],
    module_name = "@tensorflow/tfjs-core",
    deps = [
        ":tfjs-core_src_lib",
    ],
)
`ts_library` is used twice in order to have the correct `module_name` for the output files. Most files are imported relative to `@tensorflow/tfjs-core/dist/` (the `module_name` of the first rule), but `index.ts`, the entrypoint of `tfjs-core`, should be importable as `@tensorflow/tfjs-core`.
If your package imports from `dist` (e.g. `import {} from '@tensorflow/tfjs-core/dist/ops/ops_for_converter'`), that import likely corresponds to a rule in that package's `src/BUILD.bazel` file. Look for a rule that includes the file you're importing and has `module_name` set correctly for that import.
This step involves bundling the compiled files from the compilation step into a single file, and the rules are added to the package's root BUILD file (instead of `src/BUILD.bazel`). In order to support different execution environments, TFJS generates several bundles for each package. We provide a `tfjs_bundle` macro to generate these bundles.
`tfjs-core/BUILD.bazel`
load("//tools:tfjs_bundle.bzl", "tfjs_bundle")
tfjs_bundle(
    name = "tf-core",
    entry_point = "//tfjs-core/src:index.ts",
    external = [
        "node-fetch",
        "util",
    ],
    umd_name = "tf",
    deps = [
        "//tfjs-core/src:tfjs-core_lib",
        "//tfjs-core/src:tfjs-core_src_lib",
    ],
)
The `tfjs_bundle` macro generates several different bundles which are published in the package publishing step.
In the `src/BUILD.bazel` file, we compile the tests with `ts_library`. In the case of `tfjs-core`, we actually publish the test files, since other packages use them in their tests. Therefore, it's important that we set the `module_name` to `@tensorflow/tfjs-core/dist`. If a package's tests are not published, the `module_name` can probably be omitted. In a future major version of tfjs, we may stop publishing the tests to npm.
`tfjs-core/src/BUILD.bazel`
load("//tools:defaults.bzl", "ts_library")
ts_library(
    name = "tfjs-core_test_lib",
    srcs = glob(TEST_SRCS),
    # TODO(msoulanille): Mark this as testonly once it's no longer needed in the
    # npm package (for other downstream packages' tests).
    module_name = "@tensorflow/tfjs-core/dist",
    deps = [
        ":tfjs-core_lib",
        ":tfjs-core_src_lib",
    ],
)
Many packages have a `src/run_tests.ts` file (or similar) that they use for selecting which tests to run. That file defines the paths to the test files that Jasmine uses. Since Bazel outputs appear in a different location, the paths to the test files must be updated. As an example, the following paths
const coreTests = 'node_modules/@tensorflow/tfjs-core/src/tests.ts';
const unitTests = 'src/**/*_test.ts';
would need to be updated to
const coreTests = 'tfjs-core/src/tests.js';
const unitTests = 'the-package-name/src/**/*_test.js';
Note that `.ts` has been changed to `.js`. This is because we're no longer running node tests with `ts-node`, so the input test files are now `.js` outputs created by the `ts_library` rule that compiled the tests.
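The path change above follows a simple pattern, sketched here as a function. `toBazelTestPath` is a hypothetical helper written for illustration, not something the repository provides, and it assumes that a package published as `@tensorflow/<name>` lives in the `<name>` directory of the monorepo.

```typescript
// Hypothetical helper that mirrors the manual path update described above.
function toBazelTestPath(oldPath: string, packageDir: string): string {
  const scopedPrefix = 'node_modules/@tensorflow/';
  // Paths into another @tensorflow package now resolve from the workspace
  // root. Paths local to this package get the package directory as a prefix.
  const relocated = oldPath.indexOf(scopedPrefix) === 0 ?
      oldPath.slice(scopedPrefix.length) :
      packageDir + '/' + oldPath;
  // Tests run from the compiled .js outputs instead of the .ts sources.
  return relocated.replace(/\.ts$/, '.js');
}

const coreTestsPath = toBazelTestPath(
    'node_modules/@tensorflow/tfjs-core/src/tests.ts', 'tfjs-backend-cpu');
// coreTestsPath is 'tfjs-core/src/tests.js'

const unitTestsPath = toBazelTestPath('src/**/*_test.ts', 'tfjs-backend-cpu');
// unitTestsPath is 'tfjs-backend-cpu/src/**/*_test.js'
```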
It's also important to make sure the `nodejs_test` rule that runs the test has `link_workspace_root = True`. Otherwise, the test files will not be accessible at runtime.
Our test setup allows fine-tuning of exactly what tests are run via `setTestEnvs` and `setupTestFilters` in `jasmine_util.ts`, which are used in a custom Jasmine entrypoint file `setup_test.ts`. This setup does not work well with jasmine_node_test, which provides its own entrypoint for starting Jasmine. Instead, we use the nodejs_test rule.
`tfjs-core/BUILD.bazel`
load("@build_bazel_rules_nodejs//:index.bzl", "js_library", "nodejs_test")
# This is necessary for tests to have access to
# the package.json so src/version_test.ts can 'require()' it.
js_library(
    name = "package_json",
    srcs = [
        ":package.json",
    ],
)
nodejs_test(
    name = "tfjs-core_node_test",
    data = [
        ":package_json",
        "//tfjs-backend-cpu/src:tfjs-backend-cpu_lib",
        "//tfjs-core/src:tfjs-core_lib",
        "//tfjs-core/src:tfjs-core_src_lib",
        "//tfjs-core/src:tfjs-core_test_lib",
    ],
    entry_point = "//tfjs-core/src:test_node.ts",
    link_workspace_root = True,
    tags = ["ci"],
)
It's important to tag tests with `ci` if you would like them to run in continuous integration.
We use `esbuild` to bundle the tests into a single file.
`tfjs-core/src/BUILD.bazel`
load("//tools:defaults.bzl", "esbuild")
esbuild(
    name = "tfjs-core_test_bundle",
    testonly = True,
    entry_point = "setup_test.ts",
    external = [
        # webworker tests call 'require('@tensorflow/tfjs')', which
        # is external to the test bundle.
        # Note: This is not a bazel target. It's just a string.
        "@tensorflow/tfjs",
        "worker_threads",
        "util",
    ],
    sources_content = True,
    deps = [
        ":tfjs-core_lib",
        ":tfjs-core_test_lib",
        "//tfjs-backend-cpu/src:tfjs-backend-cpu_lib",
        "//tfjs-core:package_json",
    ],
)
The esbuild bundle is then used in the tfjs_web_test macro, which uses karma_web_test to serve it to a browser to be run. Different browserstack browsers can be enabled or disabled in the `browsers` argument, and the full list of browsers is located in `tools/karma_template.conf.js`. Browserstack browser tests are automatically tagged with `ci`.
`tfjs-core/BUILD.bazel`
load("//tools:tfjs_web_test.bzl", "tfjs_web_test")
tfjs_web_test(
    name = "tfjs-core_test",
    srcs = [
        "//tfjs-core/src:tfjs-core_test_bundle",
    ],
    browsers = [
        "bs_chrome_mac",
        "bs_firefox_mac",
        "bs_safari_mac",
        "bs_ios_12",
        "bs_android_9",
        "win_10_chrome",
    ],
    static_files = [
        # Listed here so sourcemaps are served
        "//tfjs-core/src:tfjs-core_test_bundle",
        # For the webworker
        ":tf-core.min.js",
        ":tf-core.min.js.map",
        "//tfjs-backend-cpu:tf-backend-cpu.min.js",
        "//tfjs-backend-cpu:tf-backend-cpu.min.js.map",
    ],
)
Whereas before, tests were included based on the `karma.conf.js` file, now, tests must be included in the test bundle to be run. Make sure to `import` each test file in the test bundle's entrypoint. To help with this, we provide an `enumerate_tests` Bazel rule to generate a `tests.ts` file with the required imports.
load("//tools:enumerate_tests.bzl", "enumerate_tests")
# Generates the 'tests.ts' file that imports all test entrypoints.
enumerate_tests(
    name = "tests",
    srcs = [":all_test_entrypoints"],  # all_test_entrypoints is a filegroup
    root_path = "tfjs-core/src",
)
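To make the purpose of the generated `tests.ts` file concrete, the sketch below renders one side-effect import per test file, relative to `root_path`. It is an illustration of the idea only: `renderTestsFile` is hypothetical, the file names are examples, and the exact contents that `enumerate_tests` writes may differ.

```typescript
// Hypothetical sketch of the kind of file enumerate_tests generates.
// Assumption: one side-effect import per test file, relative to rootPath.
function renderTestsFile(testFiles: string[], rootPath: string): string {
  const prefix = rootPath + '/';
  const imports = testFiles
      // Keep only test files.
      .filter(file => /_test\.ts$/.test(file))
      // Make each path relative to the root path.
      .map(file => file.indexOf(prefix) === 0 ?
               file.slice(prefix.length) : file)
      // Drop the extension so the import resolves to the compiled output.
      .map(file => file.replace(/\.ts$/, ''))
      .sort()
      .map(file => `import './${file}';`);
  return imports.join('\n') + '\n';
}

const rendered = renderTestsFile(
    [
      'tfjs-core/src/ops/add_test.ts',
      'tfjs-core/src/engine_test.ts',
      'tfjs-core/src/engine.ts',  // Not a test, so it is skipped.
    ],
    'tfjs-core/src');
// rendered contains two lines:
//   import './engine_test';
//   import './ops/add_test';
```

Importing a test file for its side effects is what registers its Jasmine specs, which is why a test missing from the bundle entrypoint silently does not run.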
- Verify that the entrypoints of the package.json match the outputs generated by `tfjs_bundle` and `ts_library`. tfjs-core/package.json is an example.
  - The main entrypoint should point to the node bundle, `dist/tf-package-name.node.js`.
  - `jsnext:main` and `module` should point to the ESModule output `dist/index.js` created by `copy_ts_library_to_dist`.
- If the package has browser tests, update the `sideEffects` field to include `.mjs` files generated by the `ts_library` under `./src` (e.g. `src/foo.mjs`). Bazel outputs directly to `src`, and although we copy those outputs to `dist` with another Bazel rule, the browser test bundles still import from `src`, so we need to mark them as sideEffects.
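The entrypoint rules in the list above can be checked mechanically. The following is a minimal sketch, assuming the node bundle is named `dist/<bundle-name>.node.js` where `<bundle-name>` is the `name` passed to `tfjs_bundle` (for example `tf-core`). `checkEntryPoints` is a hypothetical helper, not a tool in the repository.

```typescript
// Hypothetical checker for the package.json entrypoint fields.
type PackageFields = {[field: string]: string | undefined};

// Returns one message per entrypoint field that does not match the
// expected Bazel output. An empty array means all fields match.
function checkEntryPoints(pkg: PackageFields, bundleName: string): string[] {
  const expected: {[field: string]: string} = {
    'main': `dist/${bundleName}.node.js`,
    'jsnext:main': 'dist/index.js',
    'module': 'dist/index.js',
  };
  const problems: string[] = [];
  for (const field of Object.keys(expected)) {
    // Treat './dist/index.js' and 'dist/index.js' as the same path.
    const actual = (pkg[field] || '').replace(/^\.\//, '');
    if (actual !== expected[field]) {
      problems.push(
          `${field}: expected "${expected[field]}", found "${pkg[field]}"`);
    }
  }
  return problems;
}

const entryPointProblems = checkEntryPoints(
    {
      'main': 'dist/tf-core.node.js',
      'jsnext:main': 'dist/index.js',
      'module': 'dist/index.js',
    },
    'tf-core');
// entryPointProblems is []
```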
We use the `pkg_npm` rule to create and publish the package to npm. However, there are a few steps needed before we can declare the package. For most packages, we distribute all our compiled outputs in the `dist` directory. However, due to how `ts_library` works, it creates outputs in the same directory as the source files were compiled from (except they show up in Bazel's `dist/bin` output dir). We need to copy these from `src` to `dist` while making sure Bazel is aware of this copy (so we can still use `pkg_npm`).
We also need to copy several other files to `dist`, such as the bundles created by `tfjs_bundle`, and we need to create miniprogram files for WeChat.
To copy files, we usually use the `copy_to_dist` rule. This rule creates symlinks to all the files in `srcs` and places them in a filetree with the same structure in `dest_dir` (which defaults to `dist`).
However, we can't just copy the output of a `ts_library`, since its default output is the `.d.ts` declaration files. We need to extract the desired ES Module `.mjs` outputs of the rule and rename them to have the `.js` extension. The `copy_ts_library_to_dist` rule does this rename, and it also copies the files to `dist` (including the `.d.ts` declaration files).
load("//tools:copy_to_dist.bzl", "copy_ts_library_to_dist")
copy_ts_library_to_dist(
    name = "copy_src_to_dist",
    srcs = [
        "//tfjs-core/src:tfjs-core_lib",
        "//tfjs-core/src:tfjs-core_src_lib",
        "//tfjs-core/src:tfjs-core_test_lib",
    ],
    root = "src",  # Consider 'src' to be the root directory of the copy
                   # (i.e. create 'dist/index.js' instead of 'dist/src/index.js')
    dest_dir = "dist",  # Where to copy the files to. Defaults to 'dist', so it can
                        # actually be omitted in this case.
)
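The effect of `root` and `dest_dir` on individual files can be pictured with the sketch below. It models only the path mapping described above and is not the rule's implementation: `distPathFor` is hypothetical, and treating every output other than `.mjs` and `.d.ts` as "not copied" is an assumption.

```typescript
// Hypothetical model of where copy_ts_library_to_dist places one output.
// Returns null for files the sketch assumes are not copied.
function distPathFor(
    output: string, root: string = 'src',
    destDir: string = 'dist'): string | null {
  const prefix = root + '/';
  if (output.indexOf(prefix) !== 0) {
    return null;  // Outside the root directory of the copy.
  }
  const relative = output.slice(prefix.length);
  if (/\.d\.ts$/.test(relative)) {
    return destDir + '/' + relative;  // Declarations keep their name.
  }
  if (/\.mjs$/.test(relative)) {
    // ES Module outputs are renamed from .mjs to .js.
    return destDir + '/' + relative.replace(/\.mjs$/, '.js');
  }
  return null;
}

const copiedModule = distPathFor('src/index.mjs');
// copiedModule is 'dist/index.js'
const copiedTypes = distPathFor('src/ops/add.d.ts');
// copiedTypes is 'dist/ops/add.d.ts'
```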
We can also copy the bundles output from `tfjs_bundle`:
copy_to_dist(
    name = "copy_bundles",
    srcs = [
        ":tf-core",
        ":tf-core.node",
        ":tf-core.es2017",
        ":tf-core.es2017.min",
        ":tf-core.fesm",
        ":tf-core.fesm.min",
        ":tf-core.min",
    ],
)
We copy the miniprogram files as well, this time using the `copy_file` rule, which copies a single file to a destination.
load("@bazel_skylib//rules:copy_file.bzl", "copy_file")
copy_file(
    name = "copy_miniprogram",
    src = ":tf-core.min.js",
    out = "dist/miniprogram/index.js",
)
copy_file(
    name = "copy_miniprogram_map",
    src = ":tf-core.min.js.map",
    out = "dist/miniprogram/index.js.map",
)
Now that all the files are copied, we can declare a `pkg_npm`:
load("@build_bazel_rules_nodejs//:index.bzl", "pkg_npm")
pkg_npm(
    name = "tfjs-core_pkg",
    package_name = "@tensorflow/tfjs-core",
    srcs = [
        # Add any static files the package should include here
        "package.json",
        "README.md",
    ],
    tags = ["ci"],
    deps = [
        ":copy_bundles",
        ":copy_miniprogram",
        ":copy_miniprogram_map",
        ":copy_src_to_dist",
        ":copy_test_snippets",  # <- This is only in core, so I've omitted its
                                # definition in these docs.
    ],
)
Now the package can be published to npm with `bazel run //tfjs-core:tfjs-core_pkg.publish`.
With a `pkg_npm` rule defined, we add a script to `package.json` to run it. This script will be used by the main script that publishes the monorepo.
"scripts": {
  "publish-npm": "bazel run :tfjs-core_pkg.publish"
}
Since we now use the `publish-npm` script to publish this package instead of `npm publish`, we need to make sure the release tests and release script know how to publish it.
- In `scripts/publish-npm.ts`, add your package's name to the `BAZEL_PACKAGES` set.
- In `e2e/scripts/publish-tfjs-ci.sh`, add your package's name to the `BAZEL_PACKAGES` list.
You should also add a script to build the package itself without publishing (used for the `link-package`).
"build": "bazel build :tfjs-core_pkg",
If no packages depend on your package (i.e. no `package.json` file includes your package via a `link` dependency), then you can skip this section.
As a core feature of its design, Bazel places outputs in a different directory than sources. Outputs are symlinked to `dist/bin/[package-name]/...` instead of appearing in `[package-name]/dist`. Due to the different location, all downstream packages' `package.json` files need to be updated to point to the new outputs. However, due to some details of how Bazel and the Node module resolution algorithm work, we can't directly `link:` to Bazel's output.
Instead, we maintain a `link-package` pseudopackage where we copy the Bazel outputs. This package allows for correct Node module resolution between Bazel outputs because it has its own `node_modules` folder. This package will never be published and will be removed once the migration is complete.
Add your package to the `PACKAGES` list in the `build_deps.ts` script in `link-package`. For a package with npm name `@tensorflow/tfjs-foo`, the package's directory in the monorepo and the value to add to `PACKAGES` should both be `tfjs-foo`. The name of the package's `pkg_npm` target should be `tfjs-foo_pkg`.
const PACKAGES: ReadonlySet<string> = new Set([
  ..., 'tfjs-foo',
]);
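The naming convention above (npm name, monorepo directory, and `pkg_npm` target) can be written down as a small function. This is a sketch for illustration: `bazelNamesFor` and the `BazelPackageNames` shape are hypothetical and do not exist in `build_deps.ts`.

```typescript
// Hypothetical description of the names derived from an npm package name.
interface BazelPackageNames {
  npmName: string;
  directory: string;     // Also the value to add to PACKAGES.
  pkgNpmTarget: string;  // Label of the package's pkg_npm target.
}

function bazelNamesFor(npmName: string): BazelPackageNames {
  const scope = '@tensorflow/';
  if (npmName.indexOf(scope) !== 0) {
    throw new Error(`Expected a ${scope} package but got ${npmName}`);
  }
  const directory = npmName.slice(scope.length);
  return {
    npmName,
    directory,
    pkgNpmTarget: `//${directory}:${directory}_pkg`,
  };
}

const fooNames = bazelNamesFor('@tensorflow/tfjs-foo');
// fooNames.directory is 'tfjs-foo'
// fooNames.pkgNpmTarget is '//tfjs-foo:tfjs-foo_pkg'
```

For `@tensorflow/tfjs-core` this yields `//tfjs-core:tfjs-core_pkg`, which matches the `pkg_npm` target declared earlier in this document.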
Update all downstream dependencies that depend on the package to point to its location in the `link-package`.
"devDependencies": {
  "@tensorflow/tfjs-core": "link:../link-package/node_modules/@tensorflow/tfjs-core",
  "@tensorflow/tfjs-foo": "link:../link-package/node_modules/@tensorflow/tfjs-foo",
},
To find downstream packages, run `grep -r --exclude=yarn.lock --exclude-dir=node_modules "link:.*tfjs-foo" .` in the root of the repository.
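The edit to each downstream `package.json` is mechanical, as the sketch below shows. `relinkToLinkPackage` is a hypothetical helper, and it assumes the downstream package sits one directory below the repository root, so that `../link-package` is the correct relative path. The `seedrandom` version is illustrative.

```typescript
type Deps = {[name: string]: string};

// Hypothetical helper: points a `link:` dependency on the migrated package
// at its copy inside link-package and leaves every other entry unchanged.
function relinkToLinkPackage(deps: Deps, migratedDir: string): Deps {
  const npmName = '@tensorflow/' + migratedDir;
  const result: Deps = {};
  for (const name of Object.keys(deps)) {
    const isLocalLink =
        name === npmName && deps[name].indexOf('link:') === 0;
    result[name] = isLocalLink ?
        'link:../link-package/node_modules/' + npmName :
        deps[name];
  }
  return result;
}

const relinked = relinkToLinkPackage(
    {
      '@tensorflow/tfjs-foo': 'link:../tfjs-foo',
      'seedrandom': '^3.0.5',
    },
    'tfjs-foo');
// relinked['@tensorflow/tfjs-foo'] is
//   'link:../link-package/node_modules/@tensorflow/tfjs-foo'
// relinked['seedrandom'] is unchanged.
```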
Make sure to list the new package in the call to the `yarn build-deps-for ...` script for downstream packages. This includes packages that have `tfjs-foo` as a transitive dependency.
"scripts": {
  "build-link-package": "cd ../link-package && yarn build-deps-for tfjs-foo tfjs-some-other-dependency",
  "build-tfjs-foo": "remove this script", // <-- Don't forget to remove this earlier build script from downstream packages.
}
Add the new Bazel package to the repo-wide tslint tsconfig:
Add the path mapping:
"paths": {
  ...,
  "@tensorflow/the-new-package": ["the-new-package/src/index.ts"],
  "@tensorflow/the-new-package/dist/*": ["the-new-package/src/*"]
}
Also, remove the package from the `exclude` list.
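The two path-mapping entries follow directly from the package's directory name. A minimal sketch, assuming the npm name is `@tensorflow/<directory>`. `tsconfigPathsFor` is a hypothetical helper, not part of the repository.

```typescript
// Hypothetical helper that builds the two "paths" entries for a package.
function tsconfigPathsFor(packageDir: string): {[alias: string]: string[]} {
  const paths: {[alias: string]: string[]} = {};
  // The package entrypoint.
  paths[`@tensorflow/${packageDir}`] = [`${packageDir}/src/index.ts`];
  // Deep imports from dist map back to the sources under src.
  paths[`@tensorflow/${packageDir}/dist/*`] = [`${packageDir}/src/*`];
  return paths;
}

const fooPaths = tsconfigPathsFor('tfjs-foo');
// fooPaths is:
// {
//   '@tensorflow/tfjs-foo': ['tfjs-foo/src/index.ts'],
//   '@tensorflow/tfjs-foo/dist/*': ['tfjs-foo/src/*'],
// }
```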
It's a good idea to test that linting is working on the package. Create a lint error in one of its files, e.g. `const x = "Hello, world!"` (note the double quotes), and then run `yarn lint` in the root of the repository.
Remove the `package.json` `lint` script, the `tslint.json` file, and the cloudbuild `lint` step from the package's `cloudbuild.yml` file. Remove `tslint`-related dependencies from the package's `package.json` and run `yarn` to regenerate the `yarn.lock` file.
Update the `cloudbuild.yml` to remove any steps that are now built with Bazel. These will be run by the `bazel-tests` step, which runs before other packages' steps. Any Bazel rule tagged as `ci` will be tested / built in CI.
Note that the output paths of Bazel-created outputs will be different, so any remaining steps that now rely on Bazel outputs may need to be updated. Bazel outputs are located in `tfjs/dist/bin/...`.
If all steps of the `cloudbuild.yml` file are handled by Bazel, it can be deleted. Do not remove the package from `tfjs/scripts/package_dependencies.json`.
Rebuild the cloudbuild golden files by running `yarn update-cloudbuild-tests` in the root of the repository.
Before pushing to Git, run the Bazel linter by running `yarn bazel:format` and `yarn bazel:lint-fix` in the root of the repo. We run the linter in CI, so if your build is failing in CI only, incorrectly formatted files may be the reason.
🎉🎉🎉
- Make sure the package is added to `BAZEL_PACKAGES` in e2e/scripts/publish-tfjs-ci.sh
- Make sure the package is added to `BAZEL_PACKAGES` in scripts/publish-npm.ts
- Make sure the package generated by `pkg_npm` has all the files it needs, e.g. the README.
- Make sure the package is added to the link-package's package.json and that downstream packages are updated to point to the link package's copy instead of the package's directory.
- For browser tests, it may be worth checking that all desired browser configurations will run in nightly CI.
- Make sure browser tests include all required tests. The `enumerate_tests` rule is usually necessary to make the browser actually run tests.
- Make sure as many cloudbuild steps as possible are converted to Bazel, and that those steps are removed from the cloudbuild file.
- If the build and tests are fully handled by Bazel and don't need any other cloudbuild steps, make sure the package's `cloudbuild.yml` file is removed. Do not remove the package from scripts/package_dependencies.json.
- Make sure tests are tagged with `nightly` or `ci` (`tfjs_web_test` automatically tags tests with `nightly` and `ci`).
- Make sure the main `pkg_npm` rule is tagged with `ci` or `nightly` so all parts of the build are tested.
- Make sure the `package.json` scripts are updated and that the package.json includes `@bazel/bazelisk` as a dev dependency.
- Make sure the package has a `build-npm` script and a `publish-npm` script. These are used by the release script.
- Check the generated bundle sizes and make sure they don't include any unexpected files. Check the `_stats` files for info on this.
- Make sure the package is added to the repo-wide tslint tsconfig and that its original lint scripts are removed.
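Some of the checklist items concern only the package's `package.json` and can be verified with a few lines of code. The sketch below covers just those items (the `build-npm` and `publish-npm` scripts and the `@bazel/bazelisk` dev dependency). `releaseChecklistProblems` is a hypothetical helper, and the version string in the example is illustrative.

```typescript
interface Manifest {
  scripts?: {[name: string]: string};
  devDependencies?: {[name: string]: string};
}

// Hypothetical checker for the package.json items of the checklist.
// Returns one message per missing item. An empty array means none missing.
function releaseChecklistProblems(manifest: Manifest): string[] {
  const problems: string[] = [];
  const scripts = manifest.scripts || {};
  for (const name of ['build-npm', 'publish-npm']) {
    if (!scripts[name]) {
      problems.push(`missing "${name}" script`);
    }
  }
  const devDeps = manifest.devDependencies || {};
  if (!devDeps['@bazel/bazelisk']) {
    problems.push('missing "@bazel/bazelisk" dev dependency');
  }
  return problems;
}

const checklistProblems = releaseChecklistProblems({
  scripts: {
    'build-npm': 'bazel build :tfjs-core_pkg',
    'publish-npm': 'bazel run :tfjs-core_pkg.publish',
  },
  devDependencies: {'@bazel/bazelisk': '^1.12.0'},  // Illustrative version.
});
// checklistProblems is []
```

The remaining checklist items (tags, browser configurations, bundle contents) depend on BUILD files and CI output, so they still need a manual review.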