Rust is a young language. Version 1.0 shipped in 2015, which leaves its ecosystem genuinely immature next to long-running incumbents like Python. On top of that, Rust is a difficult systems language in a sea of developers looking for the simplest way to solve a task. It is no wonder that Rust is not the first choice for browser automation.
But in my opinion, Rust is the *best* language for browser automation. Here is why.
Browser automation is chaotic. You can argue that most programming work is, but automation lives at a particularly vicious intersection: a real browser, a real network, real timing, real DOM mutations, and a script that has to coordinate all of it without breaking. When you depend on this many external moving parts behaving in symphony, you need something stable for the orchestrator. Otherwise things *will* break, and you won’t have the slightest clue as to why.
This is where my distrust of dynamically typed languages for this particular job comes from. Things can be subtly wrong under the hood, and not because of the developer: somewhere in the contract between two systems a value was described one way and implemented another, which at times produces bugs like the classic heisenbug.
A real example: WebDriver BiDi’s `message_id` is documented as an integer, but on the wire it is actually a float. In Python or TypeScript that passes silently for weeks, then blows up one Tuesday at 3 a.m. after a PR that performs arithmetic or some other integer-only operation on it and raises a `ValueError`. Rust would have caught it the first time the program ran, the moment the float failed to deserialize into an integer field.
That is the kind of guarantee you want when you are running thousands of automated browsers in parallel and a single silent type mismatch costs you real money.
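To make that failure mode concrete, here is a minimal standard-library sketch. It is not Rustenium's decoding path: `parse_message_id` is a hypothetical helper, and it assumes the ID arrives as a bare numeric token. The point is only that a float token is rejected at the boundary, as soon as it is read into a `u64`, rather than travelling further into the program.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: read a numeric wire token into an integer ID.
/// Anything that is not a plain non-negative integer is an error here,
/// at the edge of the program, not three call sites later.
fn parse_message_id(raw: &str) -> Result<u64, ParseIntError> {
    raw.trim().parse::<u64>()
}

/// Downstream code only ever sees a real `u64`, so arithmetic is safe.
fn next_message_id(current: u64) -> u64 {
    current.wrapping_add(1)
}
```

Feeding `"42"` through `parse_message_id` yields `Ok(42)`, while `"42.0"` yields an `Err` that the caller has to handle before `next_message_id` can be reached.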
Introducing Rustenium
Rustenium is the first Rust automation library that supports **WebDriver BiDi** *and* the **Chrome DevTools Protocol (CDP)**, with built-in stealth capabilities. Think Puppeteer or Playwright, but for Rust — and dual-protocol from day one.
It is built around four ideas:
1. **Type-safe protocol bindings.** The BiDi and CDP types are generated directly from the official specifications. If a field is required by the protocol, it is required at compile time. No “did I forget a key in this JSON object” debugging at 2 a.m.
2. **Both protocols, optional and composable.** Use BiDi only, CDP only, or both. Connect either of them at any point in the browser’s lifetime.
3. **First-class Chrome *and* Firefox.** Firefox speaks BiDi natively (no driver). Chrome speaks both BiDi (via chromedriver) and CDP (directly). Rustenium auto-downloads the right binaries the first time you run.
4. **Realistic input.** A `BidiMouse` for precise robotic movements, a `HumanMouse` that draws Bezier curves with Gaussian distortion so cursor traces look like a human’s, full keyboard support with modifier keys, and a multi-touch `Touchscreen`.
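The first idea, required protocol fields being required at compile time, can be sketched in plain Rust. The types below are hypothetical stand-ins, not Rustenium's generated bindings; they loosely mirror the `context`, `url`, and optional `wait` parameters of BiDi's `browsingContext.navigate`, with only a subset of readiness states.

```rust
/// Hypothetical stand-in for a generated protocol enum (subset of states).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ReadinessState {
    Interactive,
    Complete,
}

/// Hypothetical stand-in for a generated command-parameter struct.
#[derive(Debug, Clone, PartialEq)]
struct NavigateParams {
    context: String,              // required by the protocol
    url: String,                  // required by the protocol
    wait: Option<ReadinessState>, // optional by the protocol
}

impl NavigateParams {
    /// The constructor takes every required field, so leaving one out
    /// is a compile error rather than a missing JSON key at runtime.
    fn new(context: impl Into<String>, url: impl Into<String>) -> Self {
        Self {
            context: context.into(),
            url: url.into(),
            wait: None,
        }
    }

    /// Optional fields are opt-in.
    fn with_wait(mut self, state: ReadinessState) -> Self {
        self.wait = Some(state);
        self
    }
}
```

Writing `NavigateParams::new("ctx-1")` without a URL simply does not compile, which is the property the list above describes.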
The rest of this article is a tour. Each section is a runnable snippet.
GitHub: https://github.com/dashn9/rustenium
Crates: https://crates.io/crates/rustenium
Setting Up
Add Rustenium and Tokio to your `Cargo.toml`. The `macros` feature enables the `css!()` and `xpath!()` selector macros.
```toml
[dependencies]
rustenium = { version = "1.0.1", features = ["macros"] }
tokio = { version = "1", features = ["full"] }
```

Rustenium is async end-to-end, so every example runs under `#[tokio::main]`.
Spawning a Chrome or Firefox Browser
The shortest possible script: launch Chrome, navigate, screenshot, close. Rustenium downloads Chrome and chromedriver automatically the first time, so this just works on a clean machine.
```rust
use rustenium::browsers::{chrome, BidiBrowser};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut browser = chrome(None).await;
    browser.navigate("https://example.com").await?;
    browser.screenshot().await?;
    browser.close().await?;
    Ok(())
}
```

`chrome(None)` uses defaults: BiDi enabled, CDP off, headful, a fresh profile, normal window size. When you want to actually configure things, pass a `ChromeConfig`. Browser flags (the things you would put on the command line) and capabilities (the things WebDriver cares about) are deliberately separate:
```rust
use rustenium::browsers::{chrome, BidiBrowser, ChromeConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut config = ChromeConfig::default();

    // Command-line flags for the browser process itself.
    config.browser_flags = Some(vec![
        "--window-size=1280,720".to_string(),
        "--disable-gpu".to_string(),
        "--headless=new".to_string(),
    ]);

    // WebDriver capabilities.
    config.capabilities
        .add_arg("--disable-blink-features=AutomationControlled")
        .accept_insecure_certs(true);

    let mut browser = chrome(Some(config)).await;
    browser.navigate("https://example.com").await?;
    browser.close().await?;
    Ok(())
}
```

Firefox is even simpler. It ships BiDi natively, so there is no separate driver process and no extra round trip. Rustenium speaks straight to Firefox’s BiDi WebSocket.
```rust
use rustenium::browsers::{firefox, BidiBrowser};
use rustenium_macros::css;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut browser = firefox(None).await;
    browser.navigate("https://example.com").await?;
    let headings = browser.find_nodes(css!("h1")).await?;
    println!("Found {} headings", headings.len());
    browser.close().await?;
    Ok(())
}
```

The same `BidiBrowser` trait is implemented by both `ChromeBrowser` and `FirefoxBrowser`, so any code that talks to one can talk to the other unchanged. That portability is one of the quiet wins of building on the BiDi standard rather than a vendor protocol.
Protocol Selection
This is where Rustenium starts to look different from Puppeteer or Playwright. On Chrome you can run BiDi only, CDP only, or both at the same time. CDP exposes Chrome-only features that BiDi does not yet cover — fine-grained device emulation, accessibility-tree access, target manipulation — and BiDi gives you a stable, cross-browser surface. Rustenium lets you pick per task.
```rust
use rustenium::browsers::ChromeConfig;

fn bidi_only() -> ChromeConfig {
    ChromeConfig::default()
}

fn cdp_only() -> ChromeConfig {
    ChromeConfig {
        enable_bidi: false,
        enable_cdp: true,
        ..Default::default()
    }
}

fn both() -> ChromeConfig {
    ChromeConfig {
        enable_bidi: true,
        enable_cdp: true,
        ..Default::default()
    }
}
```

You can also start with one and attach the other at runtime. A common pattern: use CDP for the noisy startup work (device metrics, network conditions, target setup) where it is faster and more capable, then attach BiDi only when you need the cross-browser primitives like locators, preload scripts, or network interception with auth.
```rust
use rustenium::browsers::{BidiBrowser, ChromeBrowser, ChromeConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut browser = ChromeBrowser::new(ChromeConfig {
        enable_bidi: false,
        enable_cdp: true,
        ..Default::default()
    }).await;

    // ... do CDP-only work first ...

    browser.connect_bidi().await; // attach BiDi without disturbing CDP

    // ... now both protocols are available ...

    browser.close().await?;
    Ok(())
}
```

One caveat from experience: CDP behaves a little differently when a BiDi session is already attached (BiDi installs its own internal targets that CDP can see). If you rely on CDP target enumeration, attach CDP first, then BiDi: connecting BiDi *after* CDP is the safe order.
Finding Elements
Rustenium ships with `css!()` and `xpath!()` macros so selectors are validated at compile time as much as possible. Both macros build a `Locator` value that the BiDi engine consumes directly — there is no string concatenation under the hood.
```rust
use rustenium::browsers::{chrome, BidiBrowser};
use rustenium_macros::{css, xpath};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut browser = chrome(None).await;
    browser.navigate("https://example.com").await?;

    // Single match
    let heading = browser.find_node(css!("h1")).await?;

    // Many matches
    let buttons = browser.find_nodes(css!(".btn-primary")).await?;

    // XPath when CSS is not enough
    let titled_h1 = browser.find_nodes(xpath!("//h1[@class='title']")).await?;

    // Wait until a dynamic element appears (default 4s timeout)
    let dynamic = browser.wait_for_node(css!(".dynamic-content")).await?;

    browser.close().await?;
    Ok(())
}
```

`find_node` returns `Option<Node>`, which is `None` if nothing matched. `wait_for_node` polls until the timeout (default four seconds, configurable via `wait_for_node_with_options`) and is the version you should reach for any time the element might not exist *yet*, especially in single-page apps.
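The polling idea behind a wait-for call can be sketched with the standard library alone. This is a synchronous illustration under my own assumptions, not Rustenium's async implementation: `poll_until` is a hypothetical helper that retries a probe until it yields `Some` or a deadline passes.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: call `probe` until it returns `Some`, sleeping
/// `interval` between attempts, and give up once `timeout` has elapsed.
fn poll_until<T>(
    timeout: Duration,
    interval: Duration,
    mut probe: impl FnMut() -> Option<T>,
) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        // Probe first, so an element that already exists returns immediately.
        if let Some(found) = probe() {
            return Some(found);
        }
        if Instant::now() >= deadline {
            return None;
        }
        sleep(interval);
    }
}
```

The return type keeps the "never appeared" case visible: the caller gets an `Option` and has to decide what a timeout means for the flow.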
Mouse Input — Precise vs. Human-like
Rustenium gives you two mouse implementations, both behind the same `Mouse` trait. `BidiMouse` produces direct, instant, machine-perfect movements — useful when you actually want a robot. The path is a straight line, optionally stepped, and clicks happen with no jitter.
```rust
use rustenium::browsers::{chrome, BidiBrowser};
use rustenium::input::{Mouse, MouseClickOptions, MouseMoveOptions, Point};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut browser = chrome(None).await;
    let context_id = browser.get_active_context_id()?;

    // Straight-line move to (100, 200), split into five steps.
    browser
        .mouse()
        .move_to(
            Point { x: 100.0, y: 200.0 },
            &context_id,
            MouseMoveOptions { steps: Some(5), ..Default::default() },
        )
        .await?;

    // Click at the current cursor position.
    browser.mouse().click(None, &context_id, MouseClickOptions::default()).await?;

    browser.close().await?;
    Ok(())
}
```

`HumanMouse` does the opposite. It generates a Bezier curve between source and destination, applies Gaussian distortion to make the path look organic, and inserts variable per-step delays sampled from a distribution that matches how humans actually move a cursor.
```rust
use rustenium::browsers::{chrome, BidiBrowser};
use rustenium::input::{Mouse, MouseClickOptions, MouseMoveOptions, Point};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut browser = chrome(None).await;
    let context_id = browser.get_active_context_id()?;

    browser
        .human_mouse()
        .move_to(Point { x: 320.0, y: 480.0 }, &context_id, MouseMoveOptions::default())
        .await?;
    browser.human_mouse().click(None, &context_id, MouseClickOptions::default()).await?;

    browser.close().await?;
    Ok(())
}
```

In practice, use `HumanMouse` when something on the other side cares whether the cursor came from a person: anti-bot checks, fraud heuristics, behavioural analytics. Use `BidiMouse` when nobody is watching and you want speed.
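For a feel of what a Bezier curve with Gaussian distortion means geometrically, here is a dependency-free sketch of the general technique. It is not Rustenium's `HumanMouse` code: `human_path`, its `Point`, the control-point spread of 20% of the travel distance, and the tiny xorshift generator are all assumptions made for illustration, and per-step delays are left out.

```rust
/// Tiny deterministic PRNG (xorshift64) so the sketch needs no crates.
struct XorShift64(u64);

impl XorShift64 {
    /// Uniform sample in [0, 1).
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }

    /// Standard normal sample via the Box-Muller transform.
    fn next_gaussian(&mut self) -> f64 {
        let u1 = self.next_f64().max(f64::MIN_POSITIVE); // avoid ln(0)
        let u2 = self.next_f64();
        (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

/// Sample a cubic Bezier from `from` to `to` whose control points are
/// randomly displaced, then add Gaussian jitter to the interior samples.
/// The first and last points are left exact so the cursor lands on target.
fn human_path(from: Point, to: Point, steps: usize, jitter_px: f64, seed: u64) -> Vec<Point> {
    let steps = steps.max(1);
    let mut rng = XorShift64(seed.max(1)); // xorshift state must be non-zero
    let (dx, dy) = (to.x - from.x, to.y - from.y);
    let spread = 0.2 * dx.hypot(dy); // assumed: bend scales with distance

    let c1 = Point {
        x: from.x + dx / 3.0 + spread * rng.next_gaussian(),
        y: from.y + dy / 3.0 + spread * rng.next_gaussian(),
    };
    let c2 = Point {
        x: from.x + 2.0 * dx / 3.0 + spread * rng.next_gaussian(),
        y: from.y + 2.0 * dy / 3.0 + spread * rng.next_gaussian(),
    };

    (0..=steps)
        .map(|i| {
            let t = i as f64 / steps as f64;
            let u = 1.0 - t;
            // Cubic Bernstein weights.
            let (b0, b1, b2, b3) = (u * u * u, 3.0 * u * u * t, 3.0 * u * t * t, t * t * t);
            let mut p = Point {
                x: b0 * from.x + b1 * c1.x + b2 * c2.x + b3 * to.x,
                y: b0 * from.y + b1 * c1.y + b2 * c2.y + b3 * to.y,
            };
            if i != 0 && i != steps {
                p.x += jitter_px * rng.next_gaussian();
                p.y += jitter_px * rng.next_gaussian();
            }
            p
        })
        .collect()
}
```

A fixed seed makes the path reproducible, which is useful when debugging a flow; a real driver would seed from entropy so no two traces match.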
Keyboard Input
The keyboard surfaces three primitives: `down`, `up`, and `press` (a held-down chord), plus `type_text` for entire strings. `KeyboardTypeOptions::delay` takes a `DelayRange`, and per character Rustenium picks a random hold time inside it. The defaults documented in the source map to plausible human latencies — fast typist `(30, 80)`, average `(60, 140)`, slow and careful `(100, 250)`.
```rust
use rustenium::browsers::{chrome, BidiBrowser};
use rustenium::input::{DelayRange, KeyboardTypeOptions};
use rustenium_macros::css;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut browser = chrome(None).await;
    let context_id = browser.get_active_context_id()?;
    browser.navigate("https://example.com/login").await?;
    let _input = browser.wait_for_node(css!("input[name='email']")).await?;

    // Type with a randomised 60–140 ms hold per key (average typist).
    browser
        .keyboard()
        .type_text(
            "user@example.com",
            &context_id,
            Some(KeyboardTypeOptions {
                delay: DelayRange::new(60, 140),
                gap_multiplier: 1.2,
            }),
        )
        .await?;

    // Ctrl+A to select all
    browser.keyboard().down("Control", &context_id).await?;
    browser.keyboard().press("a", &context_id, None).await?;
    browser.keyboard().up("Control", &context_id).await?;

    browser.close().await?;
    Ok(())
}
```

Note that `down` and `up` are explicitly paired, which means modifier-key combinations are easy and unambiguous: hold Control, press a, release Control. No string parsing of `"Control+a"` to go wrong.
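Picking a random hold time per character from a range is simple enough to sketch without any crates. The names here (`HoldRange`, `hold_times`) are hypothetical and say nothing about how Rustenium's `DelayRange` is implemented; the generator is a basic xorshift, and the modulo step introduces a slight bias that does not matter for an illustration.

```rust
/// Hypothetical inclusive range of key hold times, in milliseconds.
struct HoldRange {
    min_ms: u64,
    max_ms: u64,
}

/// Produce one hold time per character of `text`, each inside `range`.
fn hold_times(text: &str, range: &HoldRange, seed: u64) -> Vec<u64> {
    assert!(range.min_ms <= range.max_ms, "range must not be empty");
    let span = range.max_ms - range.min_ms + 1;
    let mut state = seed.max(1); // xorshift state must be non-zero

    text.chars()
        .map(|_| {
            // One xorshift64 step per keystroke.
            state ^= state << 13;
            state ^= state >> 7;
            state ^= state << 17;
            range.min_ms + state % span
        })
        .collect()
}
```

Calling `hold_times("user@example.com", &HoldRange { min_ms: 60, max_ms: 140 }, seed)` gives sixteen values, one per character, all between 60 and 140 inclusive.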
Network Interception
This is one of BiDi’s killer features and one of the strongest reasons to choose Rustenium over a CDP-only library. You register a single async closure and it gets called for every request, with full ability to abort, continue, modify, or authenticate.
```rust
use rustenium::browsers::{chrome, BidiBrowser};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut browser = chrome(None).await;

    // Called for every request the browser makes.
    browser
        .on_request(|request| async move {
            if request.url().contains("ads.example.com") {
                let _ = request.abort().await;
                return;
            }
            let _ = request.continue_().await;
        })
        .await?;

    // Credentials used to answer HTTP auth challenges.
    browser.authenticate("username", "password").await?;

    browser.navigate("https://example.com").await?;
    browser.close().await?;
    Ok(())
}
```

Real-world uses: drop tracking and ad domains to make your scrapes faster, intercept and rewrite analytics calls so they do not pollute someone else’s dashboard during testing, or feed every observed request into a tracing layer so you have a full audit log of what each automation step actually did.
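One refinement worth considering for the blocking rule: a substring check on the whole URL also matches when the blocked name only appears in a query string. The sketch below matches on the host instead. It is standalone and hypothetical (`Decision`, `host_of`, and `decide` are my names, not part of Rustenium), and the host extraction is deliberately naive: no IPv6 literals, no case folding.

```rust
#[derive(Debug, PartialEq)]
enum Decision {
    Abort,
    Continue,
}

/// Naive host extraction: the text between "://" and the next '/', '?' or
/// '#', with any "user@" prefix and ":port" suffix removed.
fn host_of(url: &str) -> Option<&str> {
    let rest = url.split_once("://")?.1;
    let authority = rest.split(|c: char| c == '/' || c == '?' || c == '#').next()?;
    let host_port = authority.rsplit_once('@').map_or(authority, |(_, h)| h);
    let host = host_port.split(':').next()?;
    if host.is_empty() {
        None
    } else {
        Some(host)
    }
}

/// Abort when the host equals a blocked domain or is a subdomain of one.
fn decide(url: &str, blocked: &[&str]) -> Decision {
    match host_of(url) {
        Some(host)
            if blocked
                .iter()
                .any(|b| host == *b || host.ends_with(&format!(".{}", b))) =>
        {
            Decision::Abort
        }
        _ => Decision::Continue,
    }
}
```

Inside a request handler, the result of `decide(url, &blocklist)` would pick between aborting and continuing the request; a production version would likely use a proper URL parser rather than string splitting.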
JavaScript Evaluation and Preload Scripts
`evaluate_script` runs a JavaScript expression in the active browsing context. `add_preload_script` registers a function that runs on every fresh document load, before any of the page’s own scripts. This is how you patch `navigator.webdriver`, install custom globals for your tests, or seed state into a SPA before it boots.
The second argument to `evaluate_script` is `await_promise`. Pass `true` and Rustenium awaits the returned promise before resolving the call, which is handy for any script that does asynchronous work.
Timezone Emulation
For testing locale-sensitive logic — date formatting, scheduling, anything that crosses a midnight boundary — you can spoof the browser’s timezone without touching the host system.
```rust
use rustenium::browsers::{chrome, BidiBrowser};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut browser = chrome(None).await;
    browser.emulate_timezone(Some("Asia/Tokyo".to_string())).await?;
    browser.navigate("https://browserleaks.com/timezone").await?;
    browser.close().await?;
    Ok(())
}
```

Pass `None` to clear the override.
CDP — Device Emulation and Tabs
When you need Chrome-specific superpowers, drop into CDP. Device-metric emulation simulates a phone — viewport size, device pixel ratio, mobile flag — well enough that responsive sites behave the same as they do on a real handset. Tab management via CDP is also more flexible than the BiDi `create_context` route.
```rust
use rustenium::browsers::cdp_browser::CdpBrowser;
use rustenium::browsers::{BidiBrowser, ChromeBrowser, ChromeConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut browser = ChromeBrowser::new(ChromeConfig {
        enable_bidi: false,
        enable_cdp: true,
        ..Default::default()
    }).await;

    // iPhone-X-ish viewport
    <ChromeBrowser as CdpBrowser>::emulate_device_metrics(
        &mut browser, 375, 812, 3.0, true,
    ).await?;

    <ChromeBrowser as CdpBrowser>::navigate(
        &mut browser, "https://example.com",
    ).await?;

    <ChromeBrowser as CdpBrowser>::create_tab(
        &mut browser, "https://example.org",
    ).await?;

    browser.close().await?;
    Ok(())
}
```

The fully qualified `<ChromeBrowser as CdpBrowser>::method(…)` syntax disambiguates between the BiDi and CDP `navigate` methods. Both are implemented on `ChromeBrowser`, both are correct, and the compiler refuses to guess for you. This is one of those small examples where Rust’s type system saves you from a silent foot-gun: in a dynamically typed library, calling `browser.navigate(…)` could resolve to either protocol depending on import order, and you would not know which one until something broke.
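The disambiguation rule is plain Rust and can be reproduced without a browser. In this sketch `Bidi`, `Cdp`, and `Browser` are hypothetical stand-ins for the real traits and type; the only thing it demonstrates is how two traits with a same-named method coexist on one type.

```rust
/// Hypothetical stand-ins for two protocol traits that share a method name.
trait Bidi {
    fn navigate(&self, url: &str) -> String;
}

trait Cdp {
    fn navigate(&self, url: &str) -> String;
}

struct Browser;

impl Bidi for Browser {
    fn navigate(&self, url: &str) -> String {
        format!("bidi -> {}", url)
    }
}

impl Cdp for Browser {
    fn navigate(&self, url: &str) -> String {
        format!("cdp -> {}", url)
    }
}

/// With both traits in scope, `browser.navigate(url)` is rejected with
/// error E0034 (multiple applicable items in scope), so each call has to
/// name the trait it means.
fn open_with_both(browser: &Browser, url: &str) -> (String, String) {
    (
        <Browser as Bidi>::navigate(browser, url),
        <Browser as Cdp>::navigate(browser, url),
    )
}
```

If only one of the two traits is imported in a module, the short `browser.navigate(url)` form compiles again, because there is nothing left to be ambiguous about.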
Putting It Together — A Real Search
Here is everything from above woven into a single end-to-end script: launch, wait for the search box to render, click it like a person would, type a query at a human cadence, press enter, wait for results, take a screenshot, exit cleanly.
```rust
use rustenium::browsers::{chrome, BidiBrowser};
use rustenium::input::{DelayRange, KeyboardTypeOptions};
use rustenium::nodes::Node;
use rustenium_macros::css;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut browser = chrome(None).await;
    let context_id = browser.get_active_context_id()?;
    browser.navigate("https://duckduckgo.com").await?;

    // Wait for the search box, then click it.
    let mut search = browser
        .wait_for_node(css!("input[name='q']"))
        .await?
        .expect("search box not found");
    search.mouse_click().await?;

    // Type the query at an average typist's cadence.
    browser
        .keyboard()
        .type_text(
            "rust browser automation",
            &context_id,
            Some(KeyboardTypeOptions {
                delay: DelayRange::new(60, 140),
                gap_multiplier: 1.2,
            }),
        )
        .await?;
    browser.keyboard().press("Enter", &context_id, None).await?;

    // Wait for the first result before capturing the page.
    let _results = browser.wait_for_node(css!("[data-testid='result']")).await?;
    browser.screenshot().await?;

    browser.close().await?;
    Ok(())
}
```

Two things to notice here. First, `search.mouse_click()` is a method on the node itself: Rustenium computes the element’s center on the page and routes through the active mouse implementation, so if you swap to `human_mouse()` everywhere, every click in your script becomes humanlike with no other code change. Second, the whole thing is statically checked. If you typo a CSS selector, the macro flags it. If you misuse a builder, the type system flags it. If you forget to await something, the compiler warns about the unused future. The script either compiles and runs, or it does not exist.
Why This Matters in Production
I started this article complaining about heisenbugs in dynamically typed automation. It is worth saying *why* that matters in practice for a workload that, on the surface, sounds simple.
A typical scraping or testing pipeline runs hundreds or thousands of browser sessions a day. Each one walks a flow that depends on a dozen network calls completing, a handful of DOM mutations happening in the right order, and the wider page settling into a state your code expects. The failure modes are notoriously asymmetric: a working session looks identical from the outside to a session that took a wrong turn three steps in and is now solving a CAPTCHA into the void. Logs help, but only if the language has not silently coerced something on the way in.
Rust’s contribution here is unglamorous but huge. The BiDi message is parsed once, into a real type, and every downstream call gets that type. You cannot accidentally pass `Option<Node>` where `Node` is expected. You cannot forget to handle the empty case of a selector that matched nothing. You cannot silently convert between `f64` and `i64` and corrupt a session ID, because every numeric conversion has to be spelled out. The cost is a steeper initial slope; the payoff is a system that, once it compiles, tends to actually work.
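Two of those guarantees fit in a few lines of standard Rust. The `Node` struct and helper names below are hypothetical, not Rustenium's types; the second function shows what an explicit round trip through `f64` does to a large integer ID, which is the kind of damage an implicit conversion would hide.

```rust
/// Hypothetical stand-in for a matched DOM node.
#[derive(Debug, PartialEq)]
struct Node {
    id: u64,
}

fn describe(node: &Node) -> String {
    format!("node #{}", node.id)
}

/// `describe(&found)` would not compile here: the function expects `&Node`
/// and `found` is an `Option<Node>`, so the empty case must be handled.
fn describe_match(found: Option<Node>) -> String {
    match found {
        Some(node) => describe(&node),
        None => "no match".to_string(),
    }
}

/// Integers above 2^53 are not all representable in an `f64`. Rust makes
/// both casts explicit, so the lossy step is visible in the source.
fn round_trip_through_f64(id: u64) -> u64 {
    (id as f64) as u64
}
```

Passing `9_007_199_254_740_993` (2^53 + 1) to `round_trip_through_f64` returns `9_007_199_254_740_992`: the ID comes back changed by one, with no error anywhere.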
Next Ideas: Affordable Headful Browser APIs
Headless browsers are great for unit tests and scrapes that nobody is watching, but more and more anti-bot systems flag headless fingerprints on sight. Real headful Chromium running on a real desktop session is the gold standard — and also expensive, because every session needs a display, a GPU surface, and a real OS user.
The next thing I want to build on top of Rustenium is a thin orchestration layer for *affordable* headful browser APIs: pools of cheap headful sessions on shared hardware with proper isolation, with Rustenium as the per-session driver and a Rust-level scheduler doling out browsers to workers. The same type-safe story all the way down, with the option to swap in `HumanMouse` and `add_preload_script` for stealth without rewriting any of the scrape code.
That is a topic for the next article. For now, `cargo add rustenium`, run the snippets above, and see how it feels to write automation in a language that refuses to lie to you.
Appendix — Crate Map
A quick reference for the workspace layout, in case you want to dig into the source:
| Crate | Description |
| --- | --- |
| rustenium | Main library: browser implementations, input devices, node interactions |
| rustenium-core | Protocol transport, sessions, connections, event system |
| rustenium-bidi-definitions | WebDriver BiDi protocol type definitions |
| rustenium-cdp-definitions | Chrome DevTools Protocol type definitions |
| rustenium-macros | Procedural macros (`css!`, `xpath!`) |
| rustenium-generator | Code generator that turns the official protocol specs into Rust types |
The generator is what makes the type-safe story actually work: every time a new BiDi or CDP revision lands, the bindings get regenerated from spec, and any breaking change shows up as a compile error in the higher-level crates. Free regression tests, courtesy of the type system.
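The mechanism behind that claim is exhaustive matching. As a hedged illustration, assume the generator emits an enum like the hypothetical `PromptType` below (modelled on BiDi's user prompt types) and that higher-level code matches on it without a wildcard arm.

```rust
/// Hypothetical generated enum; regenerated whenever the spec changes.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PromptType {
    Alert,
    BeforeUnload,
    Confirm,
    Prompt,
}

/// No `_ =>` arm on purpose: if a regenerated `PromptType` gains a variant,
/// this match stops compiling (error E0004, non-exhaustive patterns) until
/// someone decides how the new case should be handled.
fn wire_name(kind: PromptType) -> &'static str {
    match kind {
        PromptType::Alert => "alert",
        PromptType::BeforeUnload => "beforeunload",
        PromptType::Confirm => "confirm",
        PromptType::Prompt => "prompt",
    }
}
```

The trade-off is deliberate: a wildcard arm would keep old code compiling after a spec change, but it would also let a new protocol case slip through unhandled.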