March 13, 2024

Diagram of playwright-core initialization process

Peering into Playwright's import process

Whenever you import playwright, there's a lot of code being executed before you can actually execute anything with its APIs. In the following short snippet

import { chromium } from '@playwright/test'

async function main() {
  const browser = await chromium.launch()
  const page = await browser.newPage()
  await page.goto("https://qacomet.com")
}

Playwright is initializing a server controlling the browser instance, building a connection between its underlying client and server library, and using a factory method to create a client-facing API for manipulating the browser. This long, seemingly roundabout process is actually Playwright's secret sauce for much of its functionality. Because of this architectural decision, it is possible to create client APIs for Playwright in multiple languages, all of which use the same underlying software architecture. This gives a consistent developer experience across projects spanning multiple languages, such as JavaScript, Python, Java, and .NET. In addition, it is easier to build community-supported clients in other languages, such as Ruby, because Playwright's core interfaces can be replicated.

To see how the project is structured, and how its design patterns can be replicated across clients, we trace through Playwright's initialization process in its core TypeScript project, on which every other client library depends. Following this process step by step gives a clear picture of Playwright's internal architecture.

Merging `playwright` and `playwright-core`

Internally, when you import from @playwright/test, there is an underlying call to import from two merged packages, packages/playwright-core and packages/playwright. This merge happens in packages/playwright/test, since @playwright/test is just an outward-facing export of require('playwright/test'), an alias for packages/playwright/test. This is where the test execution APIs from packages/playwright are combined with the browser automation APIs from packages/playwright-core.

We will focus on the import from playwright-core, since it provides the core browser automation and is the code every other client library wraps. Looking at its index.js file

packages/playwright-core/index.js

module.exports = require("./lib/inprocess");

gives an export from its lib/inprocess.js file, which is compiled from src/inprocess.ts (the src directory is compiled to the lib directory when Playwright runs its build script, and lib is what's found in node_modules/playwright-core, hence the require imports from ./lib/ and not ./src/). The inprocess.ts file just imports from the adjacent inProcessFactory.ts and executes the function defined there, called createInProcessPlaywright. We include the source below in its own section, but note that createInProcessPlaywright dynamically initializes the interface between the client library, defined in src/client, and the server library, defined in src/server. The server library is responsible for dispatching browser automation actions over a browser automation protocol, such as the Chrome DevTools Protocol (CDP), while the client library gives Playwright users a public API for interacting with the server library.

This separation of logic between client and server is what makes it possible to implement the client library in multiple languages. For example, if you look in the playwright-python source code, you will find the same design patterns and classes that are defined in playwright-core's src/client library.

Rapid overview of the client and server libraries

In playwright-core the main functionality can be found in the src/client and src/server folders. The client folder contains classes, many of which are subclasses of the ChannelOwner class. Each ChannelOwner is a client-side representation of a corresponding server-side Dispatcher in the src/server/dispatchers folder. These dispatcher classes manage communication from the server library to the browser being automated. So for a ChannelOwner subclass called Page, containing the client-side APIs (such as page.goto), there is a corresponding Dispatcher subclass called PageDispatcher. When we call page.goto, a message is sent over the unique client Connection instance to the unique DispatcherConnection instance, which calls the corresponding PageDispatcher, which then marshals the automation command to the running browser process. This is a repeated pattern for many of the client-facing APIs you use while writing browser automation scripts with Playwright.
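
To make the pairing concrete, here is a minimal, self-contained sketch of the pattern. It is not Playwright's code: ToyPage, ToyPageDispatcher, ToyDispatcherConnection, and the message shape are all hypothetical names chosen for illustration, and the "browser" is just a log.

```typescript
// Toy model of the ChannelOwner <-> Dispatcher pairing. All names here are
// hypothetical; Playwright's real classes carry far more state.
type Message = { guid: string; method: string; params: Record<string, unknown> };

// Server side: stands in for PageDispatcher, which would drive a real browser.
class ToyPageDispatcher {
  readonly guid: string;
  readonly log: string[] = [];
  constructor(guid: string) {
    this.guid = guid;
  }
  handle(method: string, params: Record<string, unknown>): string {
    const entry = `${method}(${JSON.stringify(params)})`;
    this.log.push(entry);
    return entry;
  }
}

// Server side: finds the dispatcher registered under the message's guid.
class ToyDispatcherConnection {
  private dispatchers = new Map<string, ToyPageDispatcher>();
  register(dispatcher: ToyPageDispatcher): void {
    this.dispatchers.set(dispatcher.guid, dispatcher);
  }
  dispatch(message: Message): string {
    const dispatcher = this.dispatchers.get(message.guid);
    if (!dispatcher) throw new Error(`No dispatcher for guid ${message.guid}`);
    return dispatcher.handle(message.method, message.params);
  }
}

// Client side: stands in for Page, a ChannelOwner that only forwards messages.
class ToyPage {
  readonly guid: string;
  private send: (message: Message) => string;
  constructor(guid: string, send: (message: Message) => string) {
    this.guid = guid;
    this.send = send;
  }
  goto(url: string): string {
    return this.send({ guid: this.guid, method: "goto", params: { url } });
  }
}

const server = new ToyDispatcherConnection();
const pageDispatcher = new ToyPageDispatcher("page@1");
server.register(pageDispatcher);

// The client object and the dispatcher share a guid, which is how they pair up.
const page = new ToyPage("page@1", (message) => server.dispatch(message));
const result = page.goto("https://qacomet.com");
```

The client object holds no browser logic at all; it only knows its guid and how to send a message, which is the property that lets the same server back clients written in other languages.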

Understanding the inner workings of these abstractions is essential for grokking the architecture of Playwright, which we touch on soon. But first, let's go back to tracing the import process from Playwright and see how that works.

Playwright's core API factory, createInProcessPlaywright

This file, inProcessFactory.ts, contains only one function called createInProcessPlaywright. We include it here as a reference:

packages/playwright-core/src/inProcessFactory.ts

export function createInProcessPlaywright(): PlaywrightAPI {
  const playwright = createPlaywright({
    sdkLanguage:
      (process.env.PW_LANG_NAME as Language | undefined) || "javascript",
  });

  const clientConnection = new Connection(undefined, undefined);
  clientConnection.useRawBuffers();
  const dispatcherConnection = new DispatcherConnection(true /* local */);

  // Dispatch synchronously at first.
  dispatcherConnection.onmessage = (message) =>
    clientConnection.dispatch(message);
  clientConnection.onmessage = (message) =>
    dispatcherConnection.dispatch(message);

  const rootScope = new RootDispatcher(dispatcherConnection);

  // Initialize Playwright channel.
  new PlaywrightDispatcher(rootScope, playwright);
  const playwrightAPI = clientConnection.getObjectWithKnownName(
    "Playwright"
  ) as PlaywrightAPI;
  playwrightAPI.chromium._serverLauncher = new BrowserServerLauncherImpl(
    "chromium"
  );
  playwrightAPI.firefox._serverLauncher = new BrowserServerLauncherImpl(
    "firefox"
  );
  playwrightAPI.webkit._serverLauncher = new BrowserServerLauncherImpl(
    "webkit"
  );
  playwrightAPI._android._serverLauncher = new AndroidServerLauncherImpl();

  // Switch to async dispatch after we got Playwright object.
  dispatcherConnection.onmessage = (message) =>
    setImmediate(() => clientConnection.dispatch(message));
  clientConnection.onmessage = (message) =>
    setImmediate(() => dispatcherConnection.dispatch(message));

  clientConnection.toImpl = (x: any) =>
    x
      ? dispatcherConnection._dispatchers.get(x._guid)!._object
      : dispatcherConnection._dispatchers.get("");
  (playwrightAPI as any)._toImpl = clientConnection.toImpl;
  return playwrightAPI;
}

If you scan through the script and look at the return value, you'll see Playwright is returning the PlaywrightAPI instance playwrightAPI, which is defined somewhere within the client library playwright-core/src/client/ (which we find from our type hints). So when we write a script using playwright-core

my-automation-script.ts

import { chromium } from "playwright-core";

really we are making an import of the playwrightAPI object and accessing its chromium property, so our script is secretly

my-automation-script.ts

import playwrightAPI from "playwright-core";
const { chromium } = playwrightAPI;

Tracing where this object is instantiated, we find the call

packages/playwright-core/src/inProcessFactory.ts

const playwrightAPI = clientConnection.getObjectWithKnownName(
  "Playwright"
) as PlaywrightAPI;

which is not very descriptive. Looking in the Connection class definition won't get you very far either. You'll just see

packages/playwright-core/src/client/connection.ts

export class Connection extends EventEmitter {
  readonly _objects = new Map<string, ChannelOwner>();
  // ...
  getObjectWithKnownName(guid: string): any {
    return this._objects.get(guid)!;
  }
}

which is even more opaque. From this all we know is the Connection class keeps an _objects map which at some point contains an instance of PlaywrightAPI, so tracing the calls directly is not the most helpful choice for understanding Playwright's initialization. Instead, going through the logic within inProcessFactory will give us a clear picture of how this library is wrapped together.

Cross communication between client and server

If you look through the script's imports, anything with Dispatcher in its name, along with createPlaywright, comes from the server directory. The createPlaywright function creates an instance of the Playwright class defined in packages/playwright-core/src/server/playwright.ts.

Similarly, the imports for Connection and PlaywrightAPI come from the src/client directory. The first lines of inProcessFactory instantiate the server Playwright class, followed by instances of the Connection and DispatcherConnection classes

packages/playwright-core/src/inProcessFactory.ts

const playwright = createPlaywright({
  sdkLanguage:
    (process.env.PW_LANG_NAME as Language | undefined) || "javascript",
});

const clientConnection = new Connection(undefined, undefined);
clientConnection.useRawBuffers();
const dispatcherConnection = new DispatcherConnection(true /* local */);

The next lines of code are the bridge between the internal client and server libraries

packages/playwright-core/src/inProcessFactory.ts

dispatcherConnection.onmessage = (message) =>
  clientConnection.dispatch(message);
clientConnection.onmessage = (message) =>
  dispatcherConnection.dispatch(message);

given by setting the onmessage properties for each of these objects. Notice each onmessage calls the other's dispatch method; i.e., dispatcherConnection calls clientConnection.dispatch through onmessage and vice versa. This hints that each connection invokes its own onmessage somewhere in the codebase whenever it has a message to deliver to the other side.
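
The cross-wiring above can be reproduced in a few lines. The following is a sketch under simplifying assumptions: ToyEndpoint is a hypothetical class that stands in for both Connection and DispatcherConnection, and its send method stands in for whatever internal code ends up invoking onmessage.

```typescript
// Toy endpoints wired the same way as in createInProcessPlaywright.
// ToyEndpoint is hypothetical; it only records what it receives.
type Message = { method: string };

class ToyEndpoint {
  readonly received: string[] = [];
  // Assigned at runtime by whoever wires the two endpoints together.
  onmessage: (message: Message) => void = () => {};

  // "Sending" just hands the message to the onmessage callback.
  send(method: string): void {
    this.onmessage({ method });
  }
  // "Dispatching" is what happens when a message arrives from the other side.
  dispatch(message: Message): void {
    this.received.push(message.method);
  }
}

const client = new ToyEndpoint();
const server = new ToyEndpoint();

// Each side's onmessage calls the other side's dispatch.
server.onmessage = (message) => client.dispatch(message);
client.onmessage = (message) => server.dispatch(message);

client.send("goto");       // arrives at the server
server.send("__create__"); // arrives at the client
```

Because onmessage is just a replaceable property, the same endpoints could be wired over a pipe or a socket instead of a direct function call without changing either class.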

Server-side createPlaywright

Before we continue down the logic within createInProcessPlaywright, let's go over what the createPlaywright function actually does. It is a simple function which instantiates the Playwright class defined in packages/playwright-core/src/server/playwright.ts. Below is a simplified version of that class

packages/playwright-core/src/server/playwright.ts

import { Chromium } from "./chromium/chromium";
import { Firefox } from "./firefox/firefox";
import { Selectors } from "./selectors";
import { WebKit } from "./webkit/webkit";
import { createInstrumentation, SdkObject } from "./instrumentation";

export class Playwright extends SdkObject {
  readonly selectors: Selectors;
  readonly chromium: Chromium;
  readonly android: Android;
  readonly electron: Electron;
  readonly firefox: Firefox;
  readonly webkit: WebKit;
  readonly options: PlaywrightOptions;
  readonly debugController: DebugController;

  constructor(options: PlaywrightOptions) {
    super(
      { attribution: {}, instrumentation: createInstrumentation() } as any,
      undefined,
      "Playwright"
    );
    this.options = options;
    this.chromium = new Chromium(this);
    this.firefox = new Firefox(this);
    this.webkit = new WebKit(this);
    this.selectors = new Selectors();
    this.debugController = new DebugController(this);
    // android and electron are also initialized here (omitted for brevity)
  }
}

This class contains all of the core browser automation functionality for Playwright. Diving deeper, if you look into packages/playwright-core/src/server/chromium/ you will see all the functionality for automating a Chromium browser.

RootDispatcher and server-side Playwright

Continuing down createInProcessPlaywright, we see there's the construction of a RootDispatcher instance, and a PlaywrightDispatcher instance.

packages/playwright-core/src/inProcessFactory.ts

const rootScope = new RootDispatcher(dispatcherConnection);

// Initialize Playwright channel.
new PlaywrightDispatcher(rootScope, playwright);

Note the rootScope object is referenced throughout the dispatcher classes, and acts as a wrapper around the dispatcherConnection object instantiated above, so that each of the child Dispatcher classes will have access to the dispatcherConnection through the rootScope object. This wrapping functionality will become clearer when we dive into PlaywrightDispatcher's constructor, where it passes the rootScope to all of the child Dispatcher classes.

Here's a simplified version of what's happening in the constructor for RootDispatcher, the class of our rootScope object:

packages/playwright-core/src/server/dispatchers/dispatcher.ts

// Generic type parameters (Type, ParentScopeType) are omitted for brevity.
class Dispatcher extends EventEmitter {
  _connection: DispatcherConnection;
  _parent: ParentScopeType | undefined;
  _guid: string;
  _type: string;
  _object: Type;

  constructor(
    parent: ParentScopeType | DispatcherConnection,
    object: Type,
    type: string,
    initializer: channels.InitializerTraits<Type>
  ) {
    super();

    // Take the connection directly, or inherit it from the parent dispatcher.
    this._connection =
      parent instanceof DispatcherConnection ? parent : parent._connection;
    this._parent = parent instanceof DispatcherConnection ? undefined : parent;

    this._guid = object.guid;
    this._type = type;
    this._object = object;

    this._connection.registerDispatcher(this);

    // Tell the client to create the matching ChannelOwner (skipped for the root).
    if (this._parent)
      this._connection.sendCreate(
        this._parent,
        type,
        this._guid,
        initializer,
        this._parent._object
      );
  }
}

// Declared after Dispatcher, since a class must exist before it is extended.
class RootDispatcher extends Dispatcher {
  constructor(connection: DispatcherConnection) {
    super(connection, { guid: "" }, "Root", {});
  }
}

In the Dispatcher constructor we see the connection always comes from either the parent Dispatcher or the DispatcherConnection passed in as the parent parameter. This, coupled with rootScope having the type name Root, tells us that every dispatcher in the server library has access to the same dispatcherConnection instance (from createInProcessPlaywright). Furthermore, the dispatcherConnection registers every instance of a Dispatcher subclass, hinting that it can route messages to each of the Dispatcher objects. These points will become clearer after looking at the PlaywrightDispatcher's constructor.
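
The propagation of the connection through parent scopes can be checked with a small stand-alone model. This is a sketch, not the real implementation: ToyDispatcher and ToyDispatcherConnection are hypothetical names, and the guids are made up.

```typescript
// Hypothetical stand-ins for DispatcherConnection and Dispatcher.
class ToyDispatcherConnection {
  readonly dispatchers = new Map<string, ToyDispatcher>();
  registerDispatcher(dispatcher: ToyDispatcher): void {
    this.dispatchers.set(dispatcher.guid, dispatcher);
  }
}

class ToyDispatcher {
  readonly connection: ToyDispatcherConnection;
  readonly parent: ToyDispatcher | undefined;
  readonly guid: string;

  constructor(parent: ToyDispatcher | ToyDispatcherConnection, guid: string) {
    // A root dispatcher receives the connection itself; every other
    // dispatcher borrows the connection from its parent.
    this.connection =
      parent instanceof ToyDispatcherConnection ? parent : parent.connection;
    this.parent = parent instanceof ToyDispatcherConnection ? undefined : parent;
    this.guid = guid;
    this.connection.registerDispatcher(this);
  }
}

const connection = new ToyDispatcherConnection();
const root = new ToyDispatcher(connection, "");
const playwright = new ToyDispatcher(root, "Playwright");
const chromium = new ToyDispatcher(root, "browser-type@chromium");

// Every dispatcher ends up holding the one shared connection...
const sharesConnection =
  playwright.connection === connection && chromium.connection === connection;
// ...and the connection knows about every dispatcher by guid.
const registeredGuids = Array.from(connection.dispatchers.keys());
```

Only the root is ever handed the connection explicitly; everything below it inherits the same instance, so a single registry can route any incoming message by guid.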

PlaywrightDispatcher and its constructor

Let's look closer at a slightly simplified version of PlaywrightDispatcher's constructor

packages/playwright-core/src/server/dispatchers/playwrightDispatcher.ts

import type { Playwright } from "playwright-core/src/server/playwright";

class PlaywrightDispatcher extends Dispatcher {
  constructor(scope: RootDispatcher, playwright: Playwright) {
    super(scope, playwright, "Playwright", {
      chromium: new BrowserTypeDispatcher(scope, playwright.chromium),
      firefox: new BrowserTypeDispatcher(scope, playwright.firefox),
      webkit: new BrowserTypeDispatcher(scope, playwright.webkit),
      // ...
    });
  }
}

Notice the super call has an entry for each of the browser types (chromium, firefox, etc.) in its initializer object, and each of those values has the rootScope from createInProcessPlaywright passed into it as the first parameter. Every call to the server runs through the dispatcherConnection in the rootScope, and from there can be traced through calls in each of the dispatcher classes. BrowserTypeDispatcher will launch a BrowserDispatcher, which is responsible for creating a BrowserContextDispatcher, which can create a PageDispatcher, and so on. The whole hierarchy of dispatchers directly interacting with the automated browser grows out of these few dispatcher initializations. So now we can trace all calls back to the original dispatcherConnection defined in the factory method with confidence!

Tying back to the onmessage communication

Now that we have these server-side dispatchers constructed, let's look back at how the client-side and server-side APIs are connected via the dispatcherConnection.onmessage and clientConnection.onmessage functions.

Within dispatcherConnection, the dispatcherConnection.onmessage function is called from two main methods: dispatch and _sendMessageToClient. The first, dispatch, is called from within clientConnection.onmessage, so the dispatcherConnection.onmessage call in that case acts as a response callback. The second, _sendMessageToClient, is called from the sendEvent, sendCreate, sendAdopt, and sendDestroy methods defined on the dispatcherConnection. Calls to these methods are spread throughout the Dispatcher subclasses, which reach them through their internal _connection variable. These messages tell the client to construct, update, and destroy the corresponding client-side ChannelOwner objects.

Separately, on the client side, clientConnection.onmessage is called from sendMessageToServer whenever you use a client-side API. This sends a message over to the dispatcherConnection, which finds the corresponding Dispatcher instance, which in turn issues the corresponding command to the browser being automated. The result travels back to the client through dispatcherConnection.onmessage, acting as the response callback mentioned before.
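
A request and its response can be matched up as in the following sketch. It assumes each request carries a numeric id that the response echoes back; the class names, the callback map, and the message shapes are hypothetical simplifications (the sketch uses plain callbacks to stay synchronous, where a real client would more likely hand back promises).

```typescript
// Hypothetical message shapes: the response echoes the id of its request.
type Request = { id: number; guid: string; method: string };
type Response = { id: number; result: string };

class ToyClientConnection {
  onmessage: (request: Request) => void = () => {};
  private lastId = 0;
  private callbacks = new Map<number, (result: string) => void>();

  // Called by client-side APIs: remember the callback, then send the request.
  sendMessageToServer(guid: string, method: string, callback: (result: string) => void): void {
    const id = ++this.lastId;
    this.callbacks.set(id, callback);
    this.onmessage({ id, guid, method });
  }

  // Called when a response arrives: find the waiting callback by id.
  dispatch(response: Response): void {
    const callback = this.callbacks.get(response.id);
    if (!callback) throw new Error(`Unknown response id ${response.id}`);
    this.callbacks.delete(response.id);
    callback(response.result);
  }
}

class ToyServerConnection {
  onmessage: (response: Response) => void = () => {};

  dispatch(request: Request): void {
    // A real dispatcher would look up the object by guid and drive the browser.
    this.onmessage({ id: request.id, result: `${request.guid}.${request.method} done` });
  }
}

const client = new ToyClientConnection();
const server = new ToyServerConnection();
client.onmessage = (request) => server.dispatch(request);
server.onmessage = (response) => client.dispatch(response);

const results: string[] = [];
client.sendMessageToServer("page@1", "goto", (result) => results.push(result));
client.sendMessageToServer("page@1", "reload", (result) => results.push(result));
```

Keying the pending callbacks by id means responses do not have to arrive in the order the requests were sent, which matters once dispatch becomes asynchronous.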

Synchronizing objects between the Dispatchers and client-side ChannelOwners

The sendCreate function is special because it is only called from within the Dispatcher constructor, and hence from the constructor of each of its subclasses. This method tells the client connection to create a corresponding client-side object which handles messaging to this dispatcher. Looking into the sendCreate implementation, internally it calls _sendMessageToClient with __create__ as the method name:

packages/playwright-core/src/server/dispatchers/dispatcher.ts

sendCreate(parent: DispatcherScope, type: string, guid: string, initializer: any, sdkObject?: SdkObject) {
  const validator = findValidator(type, '', 'Initializer');
  initializer = validator(initializer, '', { tChannelImpl: this._tChannelImplToWire.bind(this), binary: this._isLocal ? 'buffer' : 'toBase64' });
  this._sendMessageToClient(parent._guid, type, '__create__', { type, initializer, guid }, sdkObject);
}

so if we search through the Connection class on the client side, sure enough in its dispatch function it has a call to _createRemoteObject for the associated method __create__. This _createRemoteObject is what initializes the client-side Playwright instance, and is the reason why we call

const playwrightAPI = clientConnection.getObjectWithKnownName("Playwright");

to access the playwright API. But let's dive a little deeper as to what's happening with the message from the dispatcherConnection over to the clientConnection. For the Playwright create message, the message looks something like

{
  guid: '',
  method: '__create__',
  params: {
    type: 'Playwright',
    initializer: {
      chromium: { guid: 'browser-type@024d5a494527ece580841844a9a933a6' },
      firefox: { guid: 'browser-type@fae8f48651c02682ad3b276f0a046d63' },
      webkit: { guid: 'browser-type@ed1c30ab794ec863fe5b9b208c3635e1' },
      android: { guid: 'android@832582c466c24c6933d3a5587059e1be' },
      electron: { guid: 'electron@3829a7608477101154e15c1e25bca9ca' },
    },
    guid: 'Playwright'
  }
}

Note that before this sendCreate message is sent to create the Playwright API on the client side, sendCreate messages were already sent for each of the browser types, meaning there already exists a client-side BrowserType for chromium, firefox, etc. before the __create__ message for Playwright is sent.
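
One way to see why the children are announced first, assuming the simplified PlaywrightDispatcher constructor shown earlier, is that the initializer object is an argument to super(), so its values are constructed before the parent constructor body runs. The sketch below uses hypothetical ToyDispatcher classes and records the order in which each __create__ message would be sent.

```typescript
// Records the order in which each dispatcher would send its __create__ message.
const createOrder: string[] = [];

class ToyDispatcher {
  readonly guid: string;
  readonly initializer: Record<string, { guid: string }>;
  constructor(guid: string, initializer: Record<string, { guid: string }>) {
    this.guid = guid;
    this.initializer = initializer;
    // Stands in for sendCreate, which runs inside the Dispatcher constructor.
    createOrder.push(guid);
  }
}

class ToyPlaywrightDispatcher extends ToyDispatcher {
  constructor() {
    // The arguments to super() are evaluated before the parent constructor
    // body runs, so the child dispatchers announce themselves first.
    super("Playwright", {
      chromium: new ToyDispatcher("browser-type@chromium", {}),
      firefox: new ToyDispatcher("browser-type@firefox", {}),
      webkit: new ToyDispatcher("browser-type@webkit", {}),
    });
  }
}

new ToyPlaywrightDispatcher();
// createOrder now lists the three browser types first and "Playwright" last.
```

This ordering is what allows the Playwright initializer to refer to its browser types purely by guid: by the time the client sees the reference, the referenced object already exists.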

The clientConnection constructs objects on the client side corresponding to dispatcher objects on the server side. These client-side APIs are provided by subclasses of the ChannelOwner class, which is a concept for a later section. For now, let's trace what happens in the clientConnection for the type: 'Playwright' message.

The sendCreate function calls dispatcherConnection.onmessage, which calls the clientConnection.dispatch function, which then calls the clientConnection._createRemoteObject function with the following parameters

packages/playwright-core/src/client/connection.ts

this._createRemoteObject(
  "", // parentGuid - corresponds to Root, the root ChannelOwner
  "Playwright", // type
  "Playwright", // guid
  {
    // initializer
    chromium: { guid: "browser-type@024d5a494527ece580841844a9a933a6" },
    firefox: { guid: "browser-type@fae8f48651c02682ad3b276f0a046d63" },
    webkit: { guid: "browser-type@ed1c30ab794ec863fe5b9b208c3635e1" },
    android: { guid: "android@832582c466c24c6933d3a5587059e1be" },
    electron: { guid: "electron@3829a7608477101154e15c1e25bca9ca" },
  }
);

In the clientConnection._createRemoteObject function there is a transformation of the data and then a large switch statement instantiating the corresponding client class.

packages/playwright-core/src/client/connection.ts

_createRemoteObject(parentGuid: string, type: string, guid: string, initializer: any) {
  const parent = this._objects.get(parentGuid); // here the parentGuid is ''
  if (!parent)
    throw new Error(`Cannot find parent object ${parentGuid} to create ${guid}`);
  let result: ChannelOwner<any>;
  const validator = findValidator(type, '', 'Initializer');
  initializer = validator(
    initializer,
    '',
    {
      tChannelImpl: this._tChannelImplFromWire.bind(this),
      binary: this._rawBuffers ? 'buffer' : 'fromBase64'
    }
  );
  switch (type) {
    // ...
    case 'Playwright':
      result = new Playwright(parent, type, guid, initializer);
      break;
    // ...
  }
  return result;
}

The tChannelImpl: this._tChannelImplFromWire option passed to the validator is responsible for taking the initializer object above, which contains a guid for each browser type, and converting each guid into the corresponding object stored in the _objects map of the Connection class. If you look at the type definition in Connection, the _objects variable is a map from guid to one of the ChannelOwner subclasses. So for case 'Playwright' above, the result is a Playwright instance, a subclass of ChannelOwner defined in src/client/playwright.ts.
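
The guid-to-object conversion can be illustrated with the following sketch. It is a simplified model under stated assumptions: ToyConnection, ToyChannelOwner, and the resolve helper are hypothetical, and the real validator handles many more value types than plain guid references.

```typescript
// Hypothetical client-side object: just a type, a guid, and resolved children.
class ToyChannelOwner {
  readonly type: string;
  readonly guid: string;
  readonly initializer: Record<string, ToyChannelOwner>;
  constructor(type: string, guid: string, initializer: Record<string, ToyChannelOwner>) {
    this.type = type;
    this.guid = guid;
    this.initializer = initializer;
  }
}

class ToyConnection {
  readonly objects = new Map<string, ToyChannelOwner>();

  // Replace every { guid } reference with the object already stored under it.
  private resolve(initializer: Record<string, { guid: string }>): Record<string, ToyChannelOwner> {
    const resolved: Record<string, ToyChannelOwner> = {};
    for (const key of Object.keys(initializer)) {
      const guid = initializer[key].guid;
      const object = this.objects.get(guid);
      if (!object) throw new Error(`Object with guid ${guid} was not created yet`);
      resolved[key] = object;
    }
    return resolved;
  }

  createRemoteObject(type: string, guid: string, initializer: Record<string, { guid: string }>): ToyChannelOwner {
    const owner = new ToyChannelOwner(type, guid, this.resolve(initializer));
    this.objects.set(guid, owner);
    return owner;
  }

  getObjectWithKnownName(guid: string): ToyChannelOwner | undefined {
    return this.objects.get(guid);
  }
}

const connection = new ToyConnection();
// The browser type must be created first, mirroring the message order above.
const chromium = connection.createRemoteObject("BrowserType", "browser-type@chromium", {});
connection.createRemoteObject("Playwright", "Playwright", {
  chromium: { guid: "browser-type@chromium" },
});

const playwright = connection.getObjectWithKnownName("Playwright");
const sameObject = playwright !== undefined && playwright.initializer["chromium"] === chromium;
```

In this model the lookup by the well-known guid "Playwright" returns an object whose chromium entry is the very same instance that was created earlier, which is the behavior the article attributes to getObjectWithKnownName and the wire conversion.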

Launching the browser instance

The final bit of code in createInProcessPlaywright to consider is the _serverLauncher property being set on each of the browser types. This is written as

packages/playwright-core/src/inProcessFactory.ts

playwrightAPI.chromium._serverLauncher = new BrowserServerLauncherImpl(
  "chromium"
);
playwrightAPI.firefox._serverLauncher = new BrowserServerLauncherImpl(
  "firefox"
);
playwrightAPI.webkit._serverLauncher = new BrowserServerLauncherImpl("webkit");

Note the BrowserServerLauncherImpl class is defined next to the inProcessFactory.ts file in browserServerLauncherImpl.ts. The main functionality in this class lies in its launchServer method, which is only called when you use the

my-script.ts

import playwright from "playwright-core";

playwright.chromium.launchServer(serverOptions);

function. This launches a server which exposes a WebSocket for other programs to interact with the Playwright API. It is not used if you are just writing a Node script which accesses the 'playwright-core' library, something like

my-script.ts

import playwright from "playwright-core";

(async function () {
  const browser = await playwright.chromium.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto("https://playwright.dev");
  // ... automate page interactions here
})();

so for now we skip giving an overview of the BrowserServerLauncherImpl code and defer it to a later post.

Recap

Whew! That was quite the journey, so let's recap the main points we covered in this post:

@playwright/test is a wrapper around the test runner package packages/playwright and the browser automation package packages/playwright-core.
The API object you import from playwright-core is constructed dynamically at import time by createInProcessPlaywright in packages/playwright-core/src/inProcessFactory.ts.
In playwright-core there are two main libraries, the src/client and src/server libraries.
This separation exists so other programming languages can easily build a client library which communicates with the src/server library.
The client-side API and server-side API communicate with each other over a client-side Connection object and a server-side DispatcherConnection object. These pass messages with one another through their runtime-defined onmessage callback.
Connection is responsible for constructing the client-side APIs, the API objects you import from playwright-core, and DispatcherConnection is responsible for keeping track of the dispatcher objects on the server side.
The instances of the Dispatcher subclasses are responsible for communicating with the browser. They will send automation commands for their specific scope of functionality.
Connection is wrapped by ChannelOwner subclasses, which all point to the same instance of Connection and are responsible for the client-side APIs. Each one corresponds to a specific component of the browser, e.g. Page. Similarly, DispatcherConnection is wrapped by Dispatcher subclasses, each responsible for a part of the automated browser.
Both the Connection and DispatcherConnection classes track the same logical object on either side of the client/server divide through a shared, unique GUID.

Once you have parsed the functionality of Connection and DispatcherConnection, and their wrapper APIs, you are in an excellent spot for understanding the whole architecture of Playwright. These core components provide both the client-facing interface used by test engineers and the facade exposed by the server, giving a unified API for automating browser actions.