Toolchain Notes

by Christophe Rosset

My original goal was to better understand the internals of the oxc ecosystem, which, as of today, may well be on track to become the new standard for JavaScript tooling.

This involves subjects that I really enjoy: some I know well, others I wish to get better at. This repository contains the notes I've taken while diving into the source code of rolldown, oxc and its ecosystem.

The goal isn't to explain each line of code (they will eventually evolve over time), but to understand how the blocks fit together and how the technologies are used.

If you read this book, you will:

  • Get into the internals of rolldown, oxc (the oxc part is still wip)
  • Discover how technologies such as Wasm and Napi are used inside those projects

Disclaimer: This work is based on my own reading of the source code across multiple repositories.

  • I might get some things wrong
  • Some of my findings may end up out of date somehow

Don't hesitate to give a ⭐️ to the project on github.

Contributing

Did I write something incorrect? You can contribute a fix.

Click on the link at the top right of the book. It will prompt you to fork the project, and you will be able to make the modification directly in the GitHub UI.

You can also propose a PR by hand.

The Author

Christophe Rosset

github - twitter - linkedin

About the tools

Vite

Vite is a modern build tool for the web. For the moment, it relies on the following bundlers:

  • esbuild (in go) for development
    • Very fast, serves native esm (no bundling overhead)
  • rollup for production
    • Better plugin api / ecosystem than esbuild

Having two different bundlers isn't optimal. This is where Rolldown comes in:

Rolldown

Rolldown is a JavaScript bundler written in Rust intended to serve as the future bundler used in Vite. It provides Rollup-compatible APIs and plugin interface, but will be more similar to esbuild in scope.

Oxc

Oxc (The JavaScript Oxidation Compiler) is a collection of high-performance tools for the JavaScript language written in Rust.

The goal of the project is to provide low-level building blocks for the next generation of JavaScript tooling, which every tool can share so that:

  • we share one kind of parser
  • we share the same AST
  • ...

rolldown / Introduction

https://github.com/rolldown/rolldown

https://rolldown.rs/contrib-guide

rolldown / Shared Crates

https://github.com/rolldown/rolldown

rolldown_binding

Contains the bindings to wasm and napi for the objects used on the JavaScript side, thanks to the #[napi] macro.

This code is used in packages/rolldown to generate bindings.

Difference between napi and napi-rs.

📄

rolldown_common

rolldown_common::file_emitter

The FileEmitter is instantiated and passed as an Arc<FileEmitter> to the rolldown_plugin::PluginDriver.
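To illustrate what sharing through an Arc means here, the following is a minimal sketch using only the standard library. The fields and methods of FileEmitter and PluginDriver below are hypothetical stand-ins, not rolldown's actual definitions.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical, simplified stand-in for rolldown's FileEmitter.
#[derive(Default)]
struct FileEmitter {
    // Interior mutability lets every holder of the Arc record files.
    files: Mutex<HashMap<String, String>>,
}

impl FileEmitter {
    fn emit_file(&self, name: &str, source: &str) {
        self.files
            .lock()
            .unwrap()
            .insert(name.to_string(), source.to_string());
    }

    fn file_count(&self) -> usize {
        self.files.lock().unwrap().len()
    }
}

/// Hypothetical stand-in for the plugin driver: it only keeps a shared handle.
struct PluginDriver {
    file_emitter: Arc<FileEmitter>,
}

fn main() {
    let emitter = Arc::new(FileEmitter::default());
    let driver = PluginDriver {
        file_emitter: Arc::clone(&emitter),
    };

    // A file emitted through the driver...
    driver.file_emitter.emit_file("asset.txt", "hello");

    // ...is visible through the original handle: both point to the same emitter.
    assert_eq!(emitter.file_count(), 1);
    assert_eq!(Arc::strong_count(&emitter), 2);
}
```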

rolldown_css

Tiny wrapper around lightningcss

An extremely fast CSS parser, transformer, bundler, and minifier (built in rust)

📄

rolldown_ecmascript

📄

rolldown_fs

Thin abstraction over the traits oxc_resolver::{FileMetadata, FileSystem}.

For the implementation, see oxc-project/oxc-resolver, a Rust version of webpack/enhanced-resolve.

📄

rolldown_loader_utils

📄

rolldown_plugin

  • trait Plugin contains default implementation for the interfaces it declares:
    • build_start, resolve_id, load, transform, transform_ast and many more
  • trait Pluginable
    • exposes interfaces without implementations for call_load, call_transform, call_transform_ast and similar methods
    • should not be used directly; it is recommended to use the Plugin trait instead. Comment from the source code:
      • "The main reason we don't expose this trait is that it used async_trait, which make it rust-analyzer can't provide a good auto-completion experience."
  • impl<T: Plugin> Pluginable for T block creates implementations for the methods call_* based on Plugin trait implementation
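The relation between the two traits can be sketched with a synchronous, reduced version. This is only an illustration of the blanket-implementation pattern: the real traits are async and have many more hooks, and the Banner plugin and the exact method signatures below are assumptions made for the example.

```rust
/// Simplified stand-in for `Plugin`: every hook has a default
/// implementation, so a plugin only overrides what it needs.
trait Plugin {
    fn name(&self) -> String;

    fn load(&self, _id: &str) -> Option<String> {
        None
    }

    fn transform(&self, code: String) -> String {
        code
    }
}

/// Simplified stand-in for `Pluginable`: the interface a driver would call.
trait Pluginable {
    fn call_name(&self) -> String;
    fn call_load(&self, id: &str) -> Option<String>;
    fn call_transform(&self, code: String) -> String;
}

/// Blanket implementation: anything implementing `Plugin` is `Pluginable`.
impl<T: Plugin> Pluginable for T {
    fn call_name(&self) -> String {
        Plugin::name(self)
    }

    fn call_load(&self, id: &str) -> Option<String> {
        Plugin::load(self, id)
    }

    fn call_transform(&self, code: String) -> String {
        Plugin::transform(self, code)
    }
}

/// Hypothetical plugin that only overrides `transform`.
struct Banner;

impl Plugin for Banner {
    fn name(&self) -> String {
        "banner".to_string()
    }

    fn transform(&self, code: String) -> String {
        format!("/* banner */\n{}", code)
    }
}

fn main() {
    // The driver side only ever sees `dyn Pluginable`.
    let plugins: Vec<Box<dyn Pluginable>> = vec![Box::new(Banner)];
    for plugin in &plugins {
        // `load` was not overridden, so the default implementation answers.
        assert_eq!(plugin.call_load("a.js"), None);
        println!(
            "{}:\n{}",
            plugin.call_name(),
            plugin.call_transform("let a = 1;".to_string())
        );
    }
}
```

Plugin authors only deal with the ergonomic trait, while the driver depends on the second one; the blanket impl is the glue between them.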

📄

rolldown_plugin::PluginDriver

PluginDriver

rolldown_resolver

This plugin relies on the traits rolldown_fs::{FileSystem, OsFileSystem} from rolldown_fs which relies on oxc_resolver::{FileMetadata, FileSystem}.

This is the plugin in charge of resolving the paths of the imports, which is a very tricky thing in JavaScript. The resolving part is handled by the oxc_resolver crate (so that it can be shared with other tools).

  1. rolldown_resolver::Resolver::new creates an instance of the resolver
    • the call site of rolldown_resolver::Resolver::new is in BundlerBuilder, while creating PluginDriver
  2. rolldown_resolver::Resolver::resolve is exposed
  • it accepts:
    • importer: Option<&Path> - the path from where the module to be imported is to be resolved
    • specifier: &str - the "name" of the module to resolve
    • import_kind: rolldown_common::ImportKind whether it is an import, a dynamic import, a require, an AtImport (css)
  • it resolves the directory of the importer from importer
  • calls the adequate resolver (from oxc_resolver) based on import_kind with (importer, specifier)
  • retrieves the package.json related to the module being resolved, since it can affect how we should resolve it
  • caches the package.json
  • calculates the following for the return:
    • module_type: rolldown_common::ModuleDefFormat:
      • ending with .mjs or .cjs is easy
      • however, the type field of the package.json may affect the resolution (module, commonjs)
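The last step (deducing the module type) can be sketched as follows. This is a simplified illustration, not rolldown's code: the ModuleFormat enum and the module_format function are hypothetical, and the real rolldown_common::ModuleDefFormat may have different variants.

```rust
use std::path::Path;

/// Hypothetical, reduced module format enum.
#[derive(Debug, PartialEq)]
enum ModuleFormat {
    Esm,
    Cjs,
    Unknown,
}

/// `package_type` is the value of the `type` field of the related
/// package.json, if there is one.
fn module_format(path: &Path, package_type: Option<&str>) -> ModuleFormat {
    match path.extension().and_then(|ext| ext.to_str()) {
        // The extension wins: .mjs and .cjs are unambiguous.
        Some("mjs") => ModuleFormat::Esm,
        Some("cjs") => ModuleFormat::Cjs,
        // Otherwise, fall back to the package.json `type` field.
        _ => match package_type {
            Some("module") => ModuleFormat::Esm,
            Some("commonjs") => ModuleFormat::Cjs,
            _ => ModuleFormat::Unknown,
        },
    }
}

fn main() {
    assert_eq!(module_format(Path::new("a.mjs"), Some("commonjs")), ModuleFormat::Esm);
    assert_eq!(module_format(Path::new("a.cjs"), Some("module")), ModuleFormat::Cjs);
    assert_eq!(module_format(Path::new("a.js"), Some("module")), ModuleFormat::Esm);
    assert_eq!(module_format(Path::new("a.js"), None), ModuleFormat::Unknown);
}
```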

📄

rolldown_rstr

Exposes rolldown_rstr::Rstr which is a thin wrapper over oxc::span::CompactStr, which is a wrapper over the compact_str crate.

A memory efficient string type that can store up to 24* bytes on the stack.

A CompactString is a more memory efficient string type, that can store smaller strings on the stack, and transparently stores longer strings on the heap (aka a small string optimization). It can mostly be used as a drop in replacement for String and are particularly useful in parsing, deserializing, or any other application where you may have smaller strings.
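To make the small string optimization concrete, here is a toy version of the idea. It is deliberately naive and is not how compact_str lays out its memory (the SmallStr type and its 23-byte inline capacity are assumptions made for this sketch); it only shows the stack-or-heap decision.

```rust
/// Inline capacity chosen arbitrarily for this toy example.
const INLINE_CAP: usize = 23;

/// Toy small-string type: short strings live inline, long ones on the heap.
enum SmallStr {
    Inline { len: u8, buf: [u8; INLINE_CAP] },
    Heap(String),
}

impl SmallStr {
    fn new(s: &str) -> Self {
        if s.len() <= INLINE_CAP {
            // Copy the bytes into the inline buffer: no heap allocation.
            let mut buf = [0u8; INLINE_CAP];
            buf[..s.len()].copy_from_slice(s.as_bytes());
            SmallStr::Inline { len: s.len() as u8, buf }
        } else {
            SmallStr::Heap(s.to_string())
        }
    }

    fn as_str(&self) -> &str {
        match self {
            // The bytes were copied from a valid &str, so this cannot fail.
            SmallStr::Inline { len, buf } => {
                std::str::from_utf8(&buf[..*len as usize]).unwrap()
            }
            SmallStr::Heap(s) => s.as_str(),
        }
    }

    fn is_inline(&self) -> bool {
        matches!(self, SmallStr::Inline { .. })
    }
}

fn main() {
    // Identifiers in source code are usually short, so they stay inline.
    let short = SmallStr::new("useState");
    let long = SmallStr::new("a_very_long_identifier_that_does_not_fit_inline");

    assert!(short.is_inline());
    assert!(!long.is_inline());
    assert_eq!(short.as_str(), "useState");
}
```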

rolldown_rstr::Rstr is used in many places in the project.

📄

rolldown_sourcemap

Exposes collapse_sourcemaps(mut sourcemap_chain: Vec<&SourceMap>) -> SourceMap.

It collapses multiple sourcemaps generated by calls to oxc::oxc_codegen::CodeGen::new().enable_source_map(&filename, &source_text).build() into one giant sourcemap.

Relies on oxc::sourcemap::*.
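The idea of collapsing a chain of sourcemaps can be sketched with a toy model. Everything below is an assumption made for illustration (the ToyMap type, exact-position lookups); a real implementation works on sourcemap tokens and looks up the nearest token rather than requiring an exact match.

```rust
use std::collections::HashMap;

/// (line, column) position in a file.
type Pos = (u32, u32);

/// Toy sourcemap: maps a position in the generated file to a position in
/// the file it was generated from.
type ToyMap = HashMap<Pos, Pos>;

/// Collapse a chain ordered from the first transform to the last one into
/// a single map going from the final output to the original source.
fn collapse(chain: &[ToyMap]) -> ToyMap {
    let mut out = ToyMap::new();
    let (last, earlier) = match chain.split_last() {
        Some(parts) => parts,
        None => return out,
    };
    for (&generated, &source) in last {
        // Walk backwards through the earlier maps; a token that cannot be
        // traced all the way back is dropped.
        let mut pos = Some(source);
        for map in earlier.iter().rev() {
            pos = pos.and_then(|p| map.get(&p).copied());
        }
        if let Some(original) = pos {
            out.insert(generated, original);
        }
    }
    out
}

fn main() {
    // First transform: original (2, 0) ends up at (0, 4).
    let mut first = ToyMap::new();
    first.insert((0, 4), (2, 0));
    // Second transform: (0, 4) ends up at (0, 1).
    let mut second = ToyMap::new();
    second.insert((0, 1), (0, 4));

    let collapsed = collapse(&[first, second]);
    assert_eq!(collapsed.get(&(0, 1)), Some(&(2, 0)));
}
```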

📄

rolldown_testing

Utilities used for benchmarks.

📄

rolldown_tracing

See contribution guide chapter about tracing/logging.

This crate exposes try_init_tracing which is called when building the bundler and correctly initializes tracing according to env vars.

rolldown_binding::bundler::Bundler calls try_init_custom_trace_subscriber, which does the same as try_init_tracing but also makes sure to call napi_env.add_env_cleanup_hook and to manually flush and drop tracing_chrome::FlushGuard.

This crate relies on:

📄

rolldown / Builtin Plugins

https://github.com/rolldown/rolldown

rolldown_plugin_transform

  1. Figure out the type of the source using oxc::oxc_span::source_type
  2. Parse the source code with rolldown_ecmascript::EcmaCompiler
  3. Keep track of the comments (extracted by the parser)
  4. Extract symbols and scopes using oxc::oxc_semantic::SemanticBuilder
  5. Pass the AST to oxc::oxc_codegen::CodeGenerator, which generates the code + sourcemap

📄

rolldown / rolldown (rust)

https://github.com/rolldown/rolldown

Notes about the rolldown crate, which contains the public api on the rust side.

rolldown / rolldown (js) - bundling

https://github.com/rolldown/rolldown

Notes about the rolldown package, which is meant to be consumed on the JavaScript side (runtimes like Node.js).

Generate bindings

The bindings between rust and wasm / napi are declared in the rolldown_binding crate.

cd packages/rolldown
pnpm run build-binding

This will generate:

  • packages/rolldown/src/binding.d.ts: the TypeScript definition corresponding to the rolldown_binding crate
  • packages/rolldown/src/binding.js: the glue code that loads the compiled binary (corresponding to your cpu/arch) and exposes the bound methods
  • packages/rolldown/src/browser.js
  • packages/rolldown/src/rolldown-binding.{platform}-{arch}.node: the compiled binary from rust that is executable by node (example on macOS: rolldown-binding.darwin-x64.node)
  • packages/rolldown/src/rolldown-binding.wasi-browser.js: the glue code that loads the wasm version
  • packages/rolldown/src/rolldown-binding.wasi.cjs: same glue code in common-js

To generate WASI (WebAssembly System Interface) bindings:

# The compilation target wasm32-wasip1-threads may not be installed
rustup target add wasm32-wasip1-threads
cd packages/rolldown
pnpm run build-binding:wasi:release # or debug

https://doc.rust-lang.org/rustc/platform-support/wasm32-wasip1-threads.html

This will generate the same files as the previous step and:

  • packages/rolldown/src/rolldown-binding.debug.wasm32-wasi.wasm
  • packages/rolldown/src/rolldown-binding.wasm32-wasi.wasm
  • packages/rolldown/src/wasi-worker-browser.mjs
  • packages/rolldown/src/wasi-worker.mjs

We now understand where the *.wasm files come from.

They are required when process.env.NAPI_RS_FORCE_WASI is truthy, which will trigger require('./rolldown-binding.wasi.cjs') in packages/rolldown/src/binding.js.

Difference between napi and napi-rs.

Bundling rolldown with rolldown

A packages/rolldown/rolldown.config.mjs file exists which is used by a local previous version of rolldown (aliased as npm-rolldown, see in packages/rolldown/package.json).

That way, rolldown can be bundled with rolldown 🎉.

Most of those files previously generated in packages/rolldown/src will be moved to packages/rolldown/dist/shared when running pnpm run build-node which calls rolldown with the config: node ../../node_modules/npm-rolldown/bin/cli.js -c ./rolldown.config.mjs.

You have tasks available that will build bindings AND bundle:

  • pnpm run build-native:release (or debug)
  • pnpm run build-wasi:release (or debug)

The building section of the contrib-guide tells us:

rolldown

To build the rolldown package, there are two commands:

  • just build / just build native
  • just build native release (important if running benchmarks)

They will automatically build the Rust crates and the Node.js package. So no matter what changes you made, you can always run these commands to build the latest rolldown package.

WASI

Rolldown supports WASI by considering it as a special platform. So we still use the rolldown package to distribute the WASI version of Rolldown.

To build the WASI version, you can run the following command:

  • just build wasi
  • just build wasi release (important if running benchmarks)

Building the WASI version will remove the native version of Rolldown. We purposely designed the local build process so that you build either the native version or the WASI version. You can't mix them together, though NAPI-RS supports it.

If we look at the justfile, we can see that everything above gets in place:

build target="native" mode="debug":
    pnpm run --filter rolldown build-{{ target }}:{{ mode }}

Use rolldown local version

You can now use the local version you have built (either native or wasi):

node ./packages/rolldown/bin/cli.js

The rolldown package is linked to node_modules via pnpm workspace automatically, so you can also do:

pnpm rolldown

https://rolldown.rs/contrib-guide/building-and-running#running

rolldown / Bindings (rust/js)

Before reading this chapter, take a look at:

The topic of this chapter is how the napi bindings declared in the rolldown_binding crate (and the code generated from them) are finally consumed.

createBundler

packages/rolldown/src/utils/create-bundler.ts

options

normalize

bindingify

Once normalized, those options are turned into objects that follow the interfaces declared in packages/rolldown/src/binding.d.ts, which was generated from rolldown_binding.

  • There are multiple bindingify-*.ts files exporting different bindingify*() functions
  • Those functions accept the normalized TS types and return objects that are able to interact with napi

result

It returns a Bundler as described in packages/rolldown/src/binding.d.ts:

export declare class Bundler {
  constructor(inputOptions: BindingInputOptions, outputOptions: BindingOutputOptions, parallelPluginsRegistry?: ParallelJsPluginRegistry | undefined | null)
  write(): Promise<BindingOutputs>
  generate(): Promise<BindingOutputs>
  scan(): Promise<void>
  close(): Promise<void>
  watch(): Promise<BindingWatcher>
}

This means the calls to the methods write, generate, scan, close, watch on the Node.js side will be routed via napi to rolldown_binding::bundler, where we can see the following implementations in Rust:

impl Bundler {
  #[napi(constructor)]
  #[cfg_attr(target_family = "wasm", allow(unused))]
  pub fn new(
    env: Env,
    mut input_options: BindingInputOptions,
    output_options: BindingOutputOptions,
    parallel_plugins_registry: Option<ParallelJsPluginRegistry>,
  ) -> napi::Result<Self> {
    // ...
  }

  #[napi]
  #[tracing::instrument(level = "debug", skip_all)]
  pub async fn write(&self) -> napi::Result<BindingOutputs> {
    self.write_impl().await
  }

  #[napi]
  #[tracing::instrument(level = "debug", skip_all)]
  pub async fn generate(&self) -> napi::Result<BindingOutputs> {
    self.generate_impl().await
  }

  #[napi]
  #[tracing::instrument(level = "debug", skip_all)]
  pub async fn scan(&self) -> napi::Result<()> {
    self.scan_impl().await
  }

  #[napi]
  #[tracing::instrument(level = "debug", skip_all)]
  pub async fn close(&self) -> napi::Result<()> {
    self.close_impl().await
  }

  #[napi]
  #[tracing::instrument(level = "debug", skip_all)]
  pub async fn watch(&self) -> napi::Result<BindingWatcher> {
    self.watch_impl().await
  }
}

rolldown / rolldown (js) - explore

https://github.com/rolldown/rolldown

rolldown / Build

https://rolldown.rs/contrib-guide/building-and-running

To understand what's happening under the hood, read rolldown / rolldown (js) - bundling.

Setup

just setup && just roll

Build

just build

Use rolldown local version

pnpm rolldown
## or
just run

Check the justfile.

rolldown / bench

just setup-bench is already called with just setup.

There are benchmarks for:

oxc / Introduction

https://github.com/oxc-project/oxc

oxc / Explore crates

https://github.com/oxc-project/oxc

oxc_ast

Special section for oxc_ast.

Printing an AST

For this one, you will need to read oxc_codegen first.

oxc_codegen

oxc::oxc_codegen::Codegen is the struct that holds everything needed to transform an oxc::oxc_ast::ast::Program into a CodegenReturn thanks to its build method.

The CodegenReturn contains:

  • code: String (the code generated from the ast)
  • map: Option<oxc_sourcemap::SourceMap> (the sourcemap if activated)

Codegen::build

  • prepares a buffer for the code that will be generated - self.code.reserve(program.source_text.len())
  • creates a HashMap of the comments contained in the AST (if the comments are to be printed - like not in minified code)
  • creates a oxc::oxc_codegen::SourcemapBuilder (if sourcemaps are active)

Finally, it calls the print method on the AST, passing itself (&mut Codegen) and a default oxc::oxc_codegen::Context, which is passed on to the print call of each AST node.

Codegen::gen

Each kind of AST node needs to implement one of the following traits (according to its behavior):

/// Generate source code for an AST node.
pub trait Gen: GetSpan {
    /// Generate code for an AST node.
    fn gen(&self, p: &mut Codegen, ctx: Context);

    /// Generate code for an AST node. Alias for `gen`.
    fn print(&self, p: &mut Codegen, ctx: Context) {
        self.gen(p, ctx);
    }
}

/// Generate source code for an expression.
pub trait GenExpr: GetSpan {
    /// Generate code for an expression, respecting operator precedence.
    fn gen_expr(&self, p: &mut Codegen, precedence: Precedence, ctx: Context);

    /// Generate code for an expression, respecting operator precedence. Alias for `gen_expr`.
    fn print_expr(&self, p: &mut Codegen, precedence: Precedence, ctx: Context) {
        self.gen_expr(p, precedence, ctx);
    }
}

See the follow-up in oxc_ast.
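To see how such traits drive code generation, here is a toy printer over a three-node AST. It mirrors the shape of the design described in this section (a Codegen that owns the output buffer, nodes that print themselves with a precedence), but the Expr enum, the numeric precedence levels and the method names are assumptions made for the sketch, not oxc's definitions.

```rust
/// Toy printer: accumulates the generated source text.
#[derive(Default)]
struct Codegen {
    code: String,
}

impl Codegen {
    fn print_str(&mut self, s: &str) {
        self.code.push_str(s);
    }

    /// Print a whole expression tree and return the generated code.
    fn build(mut self, program: &Expr) -> String {
        program.print_expr(&mut self, 0);
        self.code
    }
}

/// Toy AST with only three kinds of nodes.
enum Expr {
    Number(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Higher number means tighter binding.
    fn precedence(&self) -> u8 {
        match self {
            Expr::Add(..) => 1,
            Expr::Mul(..) => 2,
            Expr::Number(_) => 3,
        }
    }
}

/// Toy counterpart of a precedence-aware printing trait.
trait GenExpr {
    fn print_expr(&self, p: &mut Codegen, parent_precedence: u8);
}

impl GenExpr for Expr {
    fn print_expr(&self, p: &mut Codegen, parent_precedence: u8) {
        let own = self.precedence();
        // Parentheses are needed when this node binds less tightly than its parent.
        let wrap = own < parent_precedence;
        if wrap {
            p.print_str("(");
        }
        match self {
            Expr::Number(n) => p.print_str(&n.to_string()),
            Expr::Add(left, right) => {
                left.print_expr(p, own);
                p.print_str(" + ");
                right.print_expr(p, own);
            }
            Expr::Mul(left, right) => {
                left.print_expr(p, own);
                p.print_str(" * ");
                right.print_expr(p, own);
            }
        }
        if wrap {
            p.print_str(")");
        }
    }
}

fn main() {
    // AST for (1 + 2) * 3
    let sum = Expr::Add(Box::new(Expr::Number(1)), Box::new(Expr::Number(2)));
    let product = Expr::Mul(Box::new(sum), Box::new(Expr::Number(3)));

    let code = Codegen::default().build(&product);
    assert_eq!(code, "(1 + 2) * 3");
}
```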

See more about sourcemaps on oxc::oxc_sourcemap.

📄

oxc_sourcemap

The sourcemap implementation is ported from rust-sourcemap, but differs from it in a few ways:

  • It encodes the sourcemap in parallel, including quoting sourceContent and encoding tokens to VLQ mappings.
  • It avoids the overhead of some SourceMap methods, like SourceMap::tokens().
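For readers unfamiliar with VLQ mappings, the sketch below shows the base64 VLQ encoding used by the "mappings" field of source maps (version 3). It is a standalone illustration with a hypothetical encode_vlq function, not the code of oxc_sourcemap.

```rust
/// Base64 alphabet used by the source map format.
const BASE64: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Append the base64 VLQ encoding of `value` to `out`.
fn encode_vlq(value: i32, out: &mut String) {
    let value = i64::from(value);
    // The sign is stored in the lowest bit, the magnitude in the others.
    let mut rest: u64 = if value < 0 {
        (((-value) as u64) << 1) | 1
    } else {
        (value as u64) << 1
    };
    loop {
        // Each base64 digit carries 5 bits of data...
        let mut digit = (rest & 0b1_1111) as usize;
        rest >>= 5;
        // ...and a sixth "continuation" bit when more digits follow.
        if rest != 0 {
            digit |= 0b10_0000;
        }
        out.push(BASE64[digit] as char);
        if rest == 0 {
            break;
        }
    }
}

/// Encode one segment (a list of relative fields) of a mappings string.
fn encode_segment(fields: &[i32]) -> String {
    let mut out = String::new();
    for &field in fields {
        encode_vlq(field, &mut out);
    }
    out
}

fn main() {
    assert_eq!(encode_segment(&[0]), "A");
    assert_eq!(encode_segment(&[1]), "C");
    assert_eq!(encode_segment(&[-1]), "D");
    assert_eq!(encode_segment(&[16]), "gB");
    assert_eq!(encode_segment(&[0, 0, 0, 0]), "AAAA");
}
```

Because every segment stores small deltas relative to the previous one, most fields fit in a single character, which keeps the mappings string compact.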

The main interface for creating sourcemaps from existing files seems to be oxc::oxc_codegen::CodeGen::new().enable_source_map(&filename, &source_text).build() (or any other options allowed by the builder pattern).

Understand the relation about SourceMap between oxc_codegen and oxc_sourcemap

See more in oxc::oxc_codegen.

📄

oxc_span

oxc::oxc_span::span::types

A range in text, represented by a zero-indexed start and end offset.

use oxc_span::Span;
let text = "foo bar baz";
let span = Span::new(4, 7);
assert_eq!(&text[span], "bar");
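Indexing a string with a span, as in the example above, can be reproduced with the standard library alone. The following is a reduced sketch with a hypothetical local Span type, meant to show the mechanism (an Index implementation on str) rather than the actual oxc_span code.

```rust
use std::ops::Index;

/// Toy span: byte offsets into the source text, end exclusive.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Span {
    start: u32,
    end: u32,
}

impl Span {
    fn new(start: u32, end: u32) -> Self {
        Span { start, end }
    }

    /// Number of bytes covered by the span.
    fn size(self) -> u32 {
        self.end - self.start
    }
}

/// Lets a `&str` be sliced directly with a `Span`.
impl Index<Span> for str {
    type Output = str;

    fn index(&self, span: Span) -> &str {
        &self[span.start as usize..span.end as usize]
    }
}

fn main() {
    let text = "foo bar baz";
    let span = Span::new(4, 7);

    assert_eq!(&text[span], "bar");
    assert_eq!(span.size(), 3);
}
```

Storing two u32 offsets instead of string slices keeps every AST node small, and the text can always be recovered from the original source.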

📄

oxc / oxc_ast

This crate deserves a chapter of its own.

It is split in 3 parts:

  1. The structs that describe each node of the AST.
  2. A few implementations for those structs.
  3. Generated implementations under oxc_ast/src/generated generated by scripts in tasks/ast_tools/src, based on decorators like #[ast(...)]

A few concepts/keywords to know:

  • AST: Abstract Syntax Tree
  • estree: one of the standards for representing an AST for JavaScript programs
    • There are other kinds of ASTs

The Oxc AST differs slightly from the estree AST by removing ambiguous nodes and introducing distinct types. For example, instead of using a generic estree Identifier, the Oxc AST provides specific types such as BindingIdentifier, IdentifierReference, and IdentifierName. This clear distinction greatly enhances the development experience by aligning more closely with the ECMAScript specification.
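The benefit of distinct identifier types can be shown with a reduced sketch. The three structs below only keep a name (the real ones carry a span, lifetimes and more), and the is_resolved function is a hypothetical helper written for this example.

```rust
/// `let x = 1` declares a binding named x.
struct BindingIdentifier {
    name: String,
}

/// `x + 1` refers to an existing binding.
struct IdentifierReference {
    name: String,
}

/// `obj.y` merely names a property; it never refers to a binding.
struct IdentifierName {
    name: String,
}

/// Only references can be resolved against bindings: the signature makes
/// it impossible to pass a property name by mistake.
fn is_resolved(reference: &IdentifierReference, bindings: &[BindingIdentifier]) -> bool {
    bindings.iter().any(|binding| binding.name == reference.name)
}

fn main() {
    // Models: let x = 1; obj.y; x + z;
    let bindings = vec![BindingIdentifier { name: "x".to_string() }];
    let property = IdentifierName { name: "y".to_string() };
    let x_ref = IdentifierReference { name: "x".to_string() };
    let z_ref = IdentifierReference { name: "z".to_string() };

    assert!(is_resolved(&x_ref, &bindings));
    assert!(!is_resolved(&z_ref, &bindings));

    // is_resolved(&property, &bindings); // would not compile: wrong node type
    println!("property name: {}", property.name);
}
```

With a single generic Identifier node, the same check would need a runtime inspection of the parent node to know which role the identifier plays.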

Structs

The oxc_ast module exposes multiple structs such as Program, IdentifierName, ObjectProperty ...

They represent each node of the AST of a modern JavaScript program. Since oxc supports jsx and TypeScript by default, those nodes specific to their syntax are also present, like: JSXElement, JSXFragment, JSXExpression, TSEnumDeclaration, TSUnionType ...

https://github.com/oxc-project/oxc/tree/main/crates/oxc_ast/src/ast

Implementations

A few implementations specific to each of these structs are handcoded in crates/oxc_ast/src/ast_impl.

Generated implementations

You can read this comment on top of the structs

// NB: `#[span]`, `#[scope(...)]`,`#[visit(...)]` and `#[generate_derive(...)]` do NOT do anything to the code.
// They are purely markers for codegen used in `tasks/ast_tools` and `crates/oxc_traverse/scripts`. See docs in those crates.
// Read [`macro@oxc_ast_macros::ast`] for more information.

Here is an example of structs where you can see some macros applied:

#[ast(visit)]
#[derive(Debug, Clone)]
#[generate_derive(CloneIn, GetSpan, GetSpanMut, ContentEq, ContentHash, ESTree)]
#[estree(type = "Identifier")]
pub struct BindingIdentifier<'a> {
    pub span: Span,
    pub name: Atom<'a>,
    #[estree(skip)]
    #[clone_in(default)]
    pub symbol_id: Cell<Option<SymbolId>>,
}

As the comment above says, those macros don't really do anything by themselves (like regular macros would). They are markers for the internal tool tasks/ast_tools, which:

  • goes through the structs using the macros above, in different crates like oxc_ast, oxc_regular_expression, oxc_span, oxc_syntax ...
  • generates the implementations for those structs in a ./generated folder

The generators are located in tasks/ast_tools/src.

Why was it done like that, instead of using regular macros?

For performance and maintenance reasons. See the following links for more information:

oxc / Bundling (js)

First, read napi / Build.

You can install the following packages from npm, each of which includes a pre-built binary. Let's talk about how they are bundled.

Bundling

oxlint

It doesn't need bundling, the binary is directly used.

oxlint does not require Node.js, the binaries can be downloaded from the latest GitHub releases.

https://www.npmjs.com/package/oxlint

https://oxc.rs/docs/guide/usage/linter.html#installation

oxc-parser

cd napi/parser
pnpm build

Will generate:

  • napi/parser/index.d.ts: TypeScript definition of the objects exposed by Rust
  • napi/parser/index.js: glue code that handles requiring the binary
  • napi/parser/parser.darwin-x64.node (according to your os/arch)

https://www.npmjs.com/package/oxc-parser

https://oxc.rs/docs/guide/usage/parser.html#installation

oxc-transform

cd napi/transform
pnpm build

Will generate:

  • napi/transform/index.d.ts: TypeScript definition of the objects exposed by Rust
  • napi/transform/index.js: glue code that handles requiring the binary
  • napi/transform/transform.darwin-x64.node (according to your os/arch)

https://www.npmjs.com/package/oxc-transform

https://oxc.rs/docs/guide/usage/transformer.html#installation

Prepare publishing

Each of the folders in npm/{oxlint,oxc-parser,oxc-transform} contains a scripts/generate-packages.mjs file that will copy the binaries.

oxc / Build

Setup

https://oxc.rs/docs/contribute/development.html

just init
just ready

oxc-resolver / Introduction

https://github.com/oxc-project/oxc-resolver

Rust version of webpack/enhanced-resolve.

The tests are ported from:

There is a Node.js API which allows resolving requests according to the Node.js resolving rules (esm / cjs).

Those resolving rules are applied, taking into account:

  • module type:
    • esm: { "conditionNames": ["node", "import"] }
    • cjs: { "conditionNames": ["node", "require"] }
  • browserField: package.json#browser
  • mainFields:
    • package.json#main
    • package.json#module

And a lot of other things that can alter the resolution algorithm.
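To give an idea of how condition names influence the result, here is a heavily simplified sketch of picking a target from the conditional exports of a package. The resolve_conditions function and the flat list of (condition, target) pairs are assumptions made for this example; the real algorithm also handles nested conditions, subpaths, patterns and more.

```rust
/// Pick the first export target whose condition is enabled.
/// `exports` keeps the order in which the conditions are declared in
/// package.json, because the first match wins.
fn resolve_conditions<'a>(
    exports: &[(&'a str, &'a str)],
    condition_names: &[&str],
) -> Option<&'a str> {
    for &(condition, target) in exports {
        // "default" always matches, so it is typically declared last.
        if condition == "default" || condition_names.contains(&condition) {
            return Some(target);
        }
    }
    None
}

fn main() {
    // Models: "exports": { "import": "./esm.mjs", "require": "./cjs.cjs", "default": "./fallback.js" }
    let exports = [
        ("import", "./esm.mjs"),
        ("require", "./cjs.cjs"),
        ("default", "./fallback.js"),
    ];

    // The same package resolves differently depending on the module type.
    assert_eq!(resolve_conditions(&exports, &["node", "import"]), Some("./esm.mjs"));
    assert_eq!(resolve_conditions(&exports, &["node", "require"]), Some("./cjs.cjs"));
    assert_eq!(resolve_conditions(&exports, &["browser"]), Some("./fallback.js"));
}
```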

This crate is widely used by rolldown_resolver.

napi / Introduction

https://napi.rs

From the node docs: https://nodejs.org/api/n-api.html

Node-API (formerly N-API) is an API for building native Addons. It is independent from the underlying JavaScript runtime (for example, V8) and is maintained as part of Node.js itself. This API will be Application Binary Interface (ABI) stable across versions of Node.js. It is intended to insulate addons from changes in the underlying JavaScript engine and allow modules compiled for one major version to run on later major versions of Node.js without recompilation. The ABI Stability guide provides a more in-depth explanation.

Difference between napi and napi-rs

  • napi is the api exposed by node
  • napi-rs is the rust framework over it that lets you build pre-compiled NodeJS addons in Rust

napi / Build

The napi-build crate is used for the build part.

A build.rs file must be present in the crate/package where you want to generate bindings from src/lib.rs.

The build.rs will only contain this kind of code:

extern crate napi_build;

fn main() {
    napi_build::setup();
}

Then, you will call something like:

napi build --platform --release

Which will generate:

  • the binaries for the targeted platforms
  • the bindings:
    • .d.ts TypeScript declaration files
    • .js glue code to load and bind the binary to JavaScript runtime

When running it from a task in a package.json, a configuration can be passed through a napi field.

https://napi.rs/docs/introduction/simple-package

napi / Class

Class as argument

https://napi.rs/docs/concepts/class#class-as-argument

  • Class is different from Object. Class can have Rust methods and associated functions on it. Every field in Class can be mutated in JavaScript.
  • So the ownership of the Class is actually transferred to the JavaScript side while you are creating it. It is managed by the JavaScript GC, and you can only pass it back by passing its reference.

Custom Finalize logic

https://napi.rs/docs/concepts/class#custom-finalize-logic

NAPI-RS will drop the Rust struct wrapped in the JavaScript object when the JavaScript object is garbage collected. You can also specify a custom finalize logic for the Rust struct.

env.adjust_external_memory

This function gives V8 an indication of the amount of externally allocated memory that is kept alive by JavaScript objects (i.e. a JavaScript object that points to its own memory allocated by a native addon). Registering externally allocated memory will trigger global garbage collections more often than it would otherwise.

Arguments types

https://napi.rs/docs/concepts/function

ThreadsafeFunction

https://napi.rs/docs/concepts/threadsafe-function

https://github.com/rolldown/rolldown/blob/main/crates/rolldown_binding/src/types/js_callback.rs

JavaScript functions can normally only be called from a native addon's main thread. If an addon creates additional threads, then N-API functions that require a Env, JsValue, or Ref must not be called from those threads. When an addon has additional threads and JavaScript functions need to be invoked based on the processing completed by those threads, those threads must communicate with the addon's main thread so that the main thread can invoke the JavaScript function on their behalf. The thread-safe function APIs provide an easy way to do this.
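The pattern described above (worker threads hand their results back to the one thread allowed to invoke the callback) can be illustrated with the standard library only. This is an analogy built on std::sync::mpsc, not the napi-rs ThreadsafeFunction API; the run_workers function is hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Run one worker thread per input and invoke `callback` with each result,
/// always on the calling thread (which plays the role of the main thread).
fn run_workers(inputs: Vec<u32>, mut callback: impl FnMut(u32)) {
    let (sender, receiver) = mpsc::channel::<u32>();

    let handles: Vec<_> = inputs
        .into_iter()
        .map(|n| {
            let sender = sender.clone();
            thread::spawn(move || {
                // The heavy work happens off the main thread...
                let result = n * n;
                // ...but the callback is not invoked here: the result is
                // sent back to the thread that owns the callback.
                sender.send(result).expect("receiver is still alive");
            })
        })
        .collect();

    // Drop the original sender so the loop below ends once all workers are done.
    drop(sender);

    // Only this thread ever calls the callback.
    for result in receiver {
        callback(result);
    }

    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
}

fn main() {
    let mut seen = Vec::new();
    run_workers(vec![1, 2, 3], |result| seen.push(result));

    // The completion order is not deterministic, so sort before comparing.
    seen.sort();
    assert_eq!(seen, vec![1, 4, 9]);
}
```

Note that the callback is a plain FnMut that is neither Send nor Sync, just like a JavaScript function that may only be called from the thread running the event loop.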

rust / crates

Some rust specifics