lightpanda-browser

Architecture

Lightpanda is a headless browser written in Zig, designed for high-performance web automation, scraping, and AI agent workflows. This page provides an overview of its major architectural components and how they fit together. For in-depth coverage, see the dedicated pages linked in each section.

Prerequisites

Before reading this document, you should be familiar with:

High-Level Design

Lightpanda follows a layered architecture with three main subsystems:

  1. Browser Engine — DOM management, HTML parsing, JavaScript execution, and page lifecycle
  2. Network Layer — HTTP client, WebSocket support, robots.txt handling, and proxy configuration
  3. CDP Protocol — Chrome DevTools Protocol server that exposes browser functionality to automation clients

These layers are coordinated by two central components: the App struct (application-level state) and the Server struct (connection management).

Application Lifecycle

The entry point (main.zig) initializes the allocator, parses command-line arguments, and creates the App instance. The application supports three operational modes:

// Simplified startup flow
var app = try App.init(allocator, &args);
defer app.deinit();

switch (args.mode) {
    .serve => { /* start CDP server */ },
    .fetch  => { /* single-page fetch */ },
    .mcp    => { /* MCP server */ },
}
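As a rough illustration of how a command word could be mapped onto the three modes before the switch runs, here is a minimal sketch. The `Mode` enum, `parseMode`, and `describe` are hypothetical names chosen for this example; Lightpanda's real argument parser is not shown in this page and likely differs.

```zig
const std = @import("std");

// Hypothetical enum mirroring the three modes named above.
const Mode = enum { serve, fetch, mcp };

// Maps a command word to a Mode; returns null for unknown input.
fn parseMode(word: []const u8) ?Mode {
    return std.meta.stringToEnum(Mode, word);
}

// Returns a short description of what each mode would start.
fn describe(mode: Mode) []const u8 {
    return switch (mode) {
        .serve => "start CDP server",
        .fetch => "single-page fetch",
        .mcp => "MCP server",
    };
}

test "unknown command words are rejected" {
    try std.testing.expect(parseMode("bogus") == null);
    try std.testing.expectEqualStrings("single-page fetch", describe(parseMode("fetch").?));
}
```

Using an exhaustive `switch` over an enum means the compiler flags any mode that is added later without a matching branch.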

The App Struct

App is the central coordinator that owns all major subsystems:

Field        Purpose
network      The I/O event loop (epoll/kqueue runtime)
platform     JavaScript engine platform (V8)
snapshot     V8 startup snapshot for fast JS context creation
telemetry    Anonymous usage telemetry
arena_pool   Pooled arena allocators for per-request memory
config       Parsed command-line configuration

This design keeps global state minimal and makes the dependency graph explicit.

Browser Engine

The browser engine is responsible for loading web pages, parsing HTML, building the DOM tree, and executing JavaScript. It is the largest subsystem in the codebase.

Key components include:

// Creating a browser and loading a page
var browser = try Browser.init(app, .{ .http_client = http_client });
defer browser.deinit();

var session = try browser.newSession(notification);
try session.navigate(url);

The Page module implements a comprehensive set of Web APIs including Window, Document, Element, Event, MutationObserver, IntersectionObserver, Location, Performance, and more. This allows Lightpanda to faithfully execute client-side JavaScript that interacts with the DOM.

For full details on DOM implementation, JavaScript integration, HTML parsing, and the page lifecycle, see Browser Engine.

Network Layer

The network layer provides the I/O foundation that all HTTP requests and WebSocket connections run on. It uses a platform-native event loop (epoll on Linux, kqueue on macOS) for efficient non-blocking I/O.

Key components:

The HttpClient sits between the browser engine and the raw network layer, handling cookies, redirects, and content-type detection.

// Network initialization is handled by App
app.network = try Network.init(allocator, config);
// The event loop is started when serving
app.network.run();

For details on HTTP internals, WebSocket handling, robots.txt support, and proxy configuration, see Network Layer.

CDP Protocol

Lightpanda exposes a Chrome DevTools Protocol server over WebSocket, allowing standard automation tools like Puppeteer, Playwright, and chromedp to control the browser.

The CDP implementation includes:

// Server initialization binds to address and starts accepting connections
var server = try lp.Server.init(app, address); // binding can fail, so propagate the error
defer server.deinit();

The server handles the WebSocket upgrade handshake, parses CDP messages, and dispatches them to domain-specific handlers. Each client connection gets its own thread with a dedicated Browser instance, ensuring isolation between concurrent automation sessions.
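The parse-then-dispatch step can be sketched as follows. CDP requests are JSON objects with an `id`, a `method` of the form `Domain.command`, and optional `params`. The `Request` struct, the `Domain` enum, and `domainOf` below are assumptions made for illustration only; they are not Lightpanda's actual types or its supported domain list.

```zig
const std = @import("std");

// Minimal view of a CDP request; other fields such as "params" are ignored.
const Request = struct {
    id: u64,
    method: []const u8,
};

// Hypothetical domain set for illustration only.
const Domain = enum { Page, Runtime, DOM };

// Splits "Domain.command" at the first dot and resolves the domain.
// Returns null when there is no dot or the domain is not recognized.
fn domainOf(method: []const u8) ?Domain {
    const dot = std.mem.indexOfScalar(u8, method, '.') orelse return null;
    return std.meta.stringToEnum(Domain, method[0..dot]);
}

test "route a CDP message by domain" {
    const raw =
        \\{"id":1,"method":"Page.navigate","params":{"url":"https://example.com"}}
    ;
    const parsed = try std.json.parseFromSlice(Request, std.testing.allocator, raw, .{
        .ignore_unknown_fields = true,
    });
    defer parsed.deinit();

    try std.testing.expectEqual(@as(u64, 1), parsed.value.id);
    try std.testing.expectEqual(Domain.Page, domainOf(parsed.value.method).?);
}
```

A real handler would go on to switch on the resolved domain and pass the command name and `params` to that domain's code.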

For the full list of supported CDP domains and implementation details, see CDP Protocol.

Memory Management

Lightpanda uses Zig’s explicit memory management model with several strategies:

This approach minimizes allocation overhead in hot paths while maintaining safety during development.
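One way to picture the per-request arenas mentioned for `arena_pool` is the standard library's `std.heap.ArenaAllocator`. The sketch below is not Lightpanda's pool implementation; `handleRequest` is a hypothetical function showing only the general pattern of allocating freely during a request and releasing everything in a single call.

```zig
const std = @import("std");

// Builds a greeting inside a request-scoped arena and returns its length.
// Individual allocations are never freed one by one.
fn handleRequest(backing: std.mem.Allocator, name: []const u8) !usize {
    var arena = std.heap.ArenaAllocator.init(backing);
    defer arena.deinit(); // releases every allocation made below at once
    const alloc = arena.allocator();

    const greeting = try std.fmt.allocPrint(alloc, "hello, {s}", .{name});
    const copy = try alloc.dupe(u8, greeting);
    return copy.len;
}

test "arena releases per-request memory in bulk" {
    // std.testing.allocator fails the test if anything leaks.
    const len = try handleRequest(std.testing.allocator, "panda");
    try std.testing.expectEqual(@as(usize, 12), len);
}
```

Running this under `std.testing.allocator` reflects the "safety during development" point: a missing `deinit` would show up as a leak in tests.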

Concurrency Model

Lightpanda uses a hybrid concurrency model:

// CAS loop for thread-safe connection counting
var current = self.active_threads.load(.monotonic);
while (current < max_connections) {
    current = self.active_threads.cmpxchgWeak(
        current, current + 1, .monotonic, .monotonic
    ) orelse break;
}
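The loop above does not show how the caller learns whether a slot was actually reserved. A self-contained sketch of the same compare-and-swap pattern, wrapped in a hypothetical `Limiter` type that reports success or failure, might look like this. The type and method names are assumptions, and the `.monotonic` orderings simply follow the snippet above.

```zig
const std = @import("std");

// Hypothetical connection limiter built around the CAS loop shown above.
const Limiter = struct {
    active: std.atomic.Value(u32) = std.atomic.Value(u32).init(0),
    max: u32,

    // Tries to reserve a slot; returns false once the limit is reached.
    fn tryAcquire(self: *Limiter) bool {
        var current = self.active.load(.monotonic);
        while (current < self.max) {
            // cmpxchgWeak returns null on success, or the observed value on
            // failure (including spurious failure), in which case we retry.
            current = self.active.cmpxchgWeak(
                current,
                current + 1,
                .monotonic,
                .monotonic,
            ) orelse return true;
        }
        return false;
    }

    // Gives a previously reserved slot back.
    fn release(self: *Limiter) void {
        _ = self.active.fetchSub(1, .monotonic);
    }
};

test "limiter refuses connections past the maximum" {
    var limiter = Limiter{ .max = 2 };
    try std.testing.expect(limiter.tryAcquire());
    try std.testing.expect(limiter.tryAcquire());
    try std.testing.expect(!limiter.tryAcquire());
}
```

Re-checking `current < max` on every iteration keeps the counter from overshooting the limit even when several threads race for the last slot.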

Next Steps

Dive deeper into each subsystem: