Theseus Development

Tom Lieber edited this page Feb 17, 2014 · 1 revision

Theseus is split up into 3 (soon to be 5) projects:

  • Theseus: the Brackets extension (everything you see in Brackets when you have Theseus installed)
  • fondue: the JavaScript instrumentation library (adds hooks so that Theseus can tell what code does when it is executed)
  • node-theseus: the node wrapper for executing Node.js programs with the Theseus debugger attached

For what I think is a great overview of the system design, check out chapter 4 of my master's thesis starting on page 27. I promise it's not awful, though I used a bad font for the code examples. :(

1 of 3: fondue

fondue takes JavaScript source and adds log statements to every single function. That way, when the code runs, fondue can read the resulting log to know which functions ran, the value of every variable, etc. I've described the process in detail on my web site.
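To make the idea concrete, here is a hand-written approximation of what "adding log statements to a function" can look like. It is only a sketch: toyTracer and its enter()/exit() methods are hypothetical names invented for this example, and fondue's real output is more involved.

```javascript
// Toy stand-in for the tracer object. These method names are hypothetical,
// not fondue's real API.
const toyTracer = {
  log: [],
  enter(name, args) {
    this.log.push({ type: "enter", name, args: Array.from(args) });
  },
  exit(name, returnValue) {
    this.log.push({ type: "exit", name, returnValue });
    return returnValue; // pass the value through so the wrapper can return it
  },
};

// Original function, as the developer wrote it.
function add(a, b) {
  return a + b;
}

// Hand-written approximation of an instrumented version of add().
function addInstrumented(a, b) {
  toyTracer.enter("add", arguments);
  return toyTracer.exit("add", a + b);
}
```

Calling addInstrumented(2, 3) still returns the same value as add(2, 3), but toyTracer.log now holds one "enter" record with the arguments and one "exit" record with the return value.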

Warning: many of the comments in fondue are out of date. Trust the code. I'm sorry. :(

index.js

There are three functions to care about:

  • instrumentationPrefix() reads tracer.js, performs some simple string substitutions for customization, and returns it.
  • instrument() accepts JavaScript source, prepends the output of instrumentationPrefix(), then rewrites all the functions as described earlier, and returns the result.
  • traceFilter() is used by instrument() to do the actual rewriting. It uses the falafel library, which uses esprima.

tracer.js

This almost-JavaScript file gets injected into code processed by fondue. It defines a global object (called __tracer by default) that collects the program trace information, and executes queries over that information.

The file is split into two parts by the comment // remote prebuggin' (from Brackets). The first half handles trace collection and all of those functions are called as the program runs to add information to the trace. The second half handles trace queries. The functions in the second half are used by Agents in Theseus (see below) to retrieve the information that appears in the user interface. See fondue's README for some examples of how that API works.

The querying API is meant to be polled, for flexibility and portability. A polling API allows the UI to stop polling when it is busy and catch up in one call. In addition, calls to a polling API are easy to pass through connections like Chrome's Remote Debugging API where callbacks are (were?) inconvenient. Most query functions return a handle (a cursor) that represents the query; passing that handle to the corresponding polling function later fetches whatever data is new since the last call.
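The handle-and-poll pattern can be sketched in a few lines. Everything below (createToyTrace, record, track, delta, untrack) is a made-up name for illustration and should not be read as fondue's actual API; see fondue's README for the real functions.

```javascript
// Toy trace store with a handle-based polling API. All names here are
// illustrative assumptions, not fondue's actual functions.
function createToyTrace() {
  const entries = [];        // every recorded event, in order
  const cursors = new Map(); // handle -> index of the next unread entry
  let nextHandle = 1;

  return {
    // Called by the running program to add to the trace.
    record(entry) {
      entries.push(entry);
    },
    // Start a query and return a handle for it.
    track() {
      const handle = nextHandle++;
      cursors.set(handle, 0);
      return handle;
    },
    // Return everything recorded since the last call with this handle.
    delta(handle) {
      if (!cursors.has(handle)) {
        throw new Error("unknown handle: " + handle);
      }
      const start = cursors.get(handle);
      cursors.set(handle, entries.length);
      return entries.slice(start);
    },
    // Forget a query that is no longer needed.
    untrack(handle) {
      cursors.delete(handle);
    },
  };
}
```

Because the handle is a plain number and every call is a simple request with a simple reply, the caller can skip polling while busy and pick up everything it missed in the next delta() call. Plain values like these are also easy to serialize across a remote connection.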

2 of 3: Theseus

Theseus is a Brackets extension with some Node.js code. Most of the code is in src/ except for what is required to be in the root directory (package.json, README.md, etc). This is done primarily so that the README will be as close to the top of the page as possible on GitHub. -_-

The Node.js part is a small proxy server in src/proxy/. Everything else is executed in the Chrome part of Brackets where the UI lives.

main.js

main.js is the entry point of the extension. It handles a few global concerns:

  • Sets up the menus
  • Coordinates initialization of the rest of the extension
  • Contains the code for Debug > Debug Brackets with Theseus (should probably be moved)

src/UI.js

Theseus is a research project. It has been rewritten several times and will be rewritten again. The parts of Theseus which have proven reusable across rewrites have been extracted to other files. The rest is in UI.js. This file is horrible because it's meant to be thrown away periodically.

UI.js handles:

  • Listening for editor events and injecting the call counts into the gutter (including periodically polling all connected debuggers for new information, though the actual connections are maintained by the Agent modules described later)
  • The active query (which functions' call counts have been clicked, what events have been clicked in the epoch panel, etc)
  • The log (referred to as the "variables panel")

src/Agent* and src/fsm.js

Every connection to a JavaScript program being debugged with Theseus has an Agent.

There used to be one agent and it lived in Agent.js. That agent worked with Brackets' Live Development connection to Chrome. Then Node.js support arrived and what used to be in Agent.js moved to Agent-chrome.js, and the Node.js agent was put in Agent-node.js. Agent.js multiplexes the Chrome and Node.js agents to make it look like there is only one so that the UI code is simpler.
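A rough sketch of the multiplexing idea follows. The isReady() and poll() method names and the fake agents are assumptions made up for this example; they are not taken from Agent.js.

```javascript
// Combines several agents behind one object. The isReady()/poll() method
// names are hypothetical; they are not taken from Agent.js.
function createMultiplexedAgent(agents) {
  return {
    // Ready as soon as any underlying agent is connected.
    isReady() {
      return agents.some((agent) => agent.isReady());
    },
    // Gather new entries from every connected agent into one flat list.
    poll() {
      return agents
        .filter((agent) => agent.isReady())
        .flatMap((agent) => agent.poll());
    },
  };
}

// Fake agent used only to exercise the sketch.
function createFakeAgent(ready, entries) {
  return {
    isReady: () => ready,
    poll: () => entries.splice(0), // hand over pending entries, leaving none behind
  };
}
```

With this shape, UI code talks to a single object and never needs to know how many connections exist or which kind each one is.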

The Agent API is meant to be polled, for flexibility and portability. A polling API allows the UI to stop polling when it is busy and catch up in one call. Calls to a polling API are easy to pass through connections like Chrome's Remote Debugging API where callbacks are (were?) inconvenient.

In my experience so far, the most difficult part of adding a new type of agent is mapping file paths on disk to the file paths which are available in JavaScript. See possibleRemotePathsForLocalPath and couldBeRemotePath.

Connecting to Chrome (Agent-chrome.js)

Connecting to fondue on a web page in Chrome over Brackets' Live Development connection is as complicated as it sounds. So Agent-chrome.js tracks it with a finite state machine (fsm.js) with these states:

  • waitingForApp: waiting for the app to finish launching
  • disconnected: the app has launched, but nothing's happened
  • waitingForPage: Live Development has started, but no page has loaded
  • initializingTracer: we're connected to a page, but we haven't figured out if a fondue __tracer object is on it yet
  • initializingHits and initializingExceptions: we found fondue and now we're getting handles for the things we want to poll
  • connected: everything is set, all conditions nominal
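The states above could be wired together roughly as follows. The state names come from the list; the event names, the transitions between states, the ordering of initializingHits before initializingExceptions, and the createStateMachine helper are all assumptions for illustration, not what fsm.js or Agent-chrome.js actually does.

```javascript
// Minimal table-driven state machine. This helper is a sketch, not fsm.js.
function createStateMachine(initial, transitions) {
  let state = initial;
  return {
    get state() {
      return state;
    },
    // Apply an event; returns false (and stays put) if the event
    // is not meaningful in the current state.
    fire(event) {
      const next = (transitions[state] || {})[event];
      if (next === undefined) {
        return false;
      }
      state = next;
      return true;
    },
  };
}

// State names are from the list above; event names and edges are guesses.
const toyTransitions = {
  waitingForApp: { appReady: "disconnected" },
  disconnected: { liveDevelopmentStarted: "waitingForPage" },
  waitingForPage: { pageLoaded: "initializingTracer" },
  initializingTracer: { tracerFound: "initializingHits", tracerMissing: "waitingForPage" },
  initializingHits: { hitsHandleReady: "initializingExceptions" },
  initializingExceptions: { exceptionsHandleReady: "connected" },
  connected: { pageClosed: "waitingForPage", liveDevelopmentStopped: "disconnected" },
};
```

Keeping the transitions in a table means an event that arrives at the wrong time is simply ignored instead of leaving the connection half-initialized.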

Connecting to Node.js (Agent-node.js)

Theseus periodically attempts to connect to node-theseus via WebSocket. Always and forever. When it connects, it basically sends the names of __tracer functions and the arguments as JSON, and receives the responses as JSON. It's a simple idea; there are just many edge cases to cover.
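The receiving side of that exchange might look something like the sketch below. The message shape (id, name, args, data, error) and the handleRequest function are invented for this example; the actual wire format used by Theseus and node-theseus is not described on this page.

```javascript
// Handles one JSON request of the assumed form
//   {"id": 1, "name": "someTracerFunction", "args": [...]}
// and returns a JSON reply string. The field names are assumptions.
function handleRequest(tracer, rawMessage) {
  let request;
  try {
    request = JSON.parse(rawMessage);
  } catch (e) {
    return JSON.stringify({ id: null, error: "invalid JSON" });
  }
  if (request === null || typeof request !== "object") {
    return JSON.stringify({ id: null, error: "request must be an object" });
  }

  const id = request.id;
  const name = request.name;
  const args = Array.isArray(request.args) ? request.args : [];

  // Only call functions defined directly on the tracer object, so that
  // inherited properties such as "constructor" cannot be invoked remotely.
  const isOwnFunction =
    Object.prototype.hasOwnProperty.call(tracer, name) &&
    typeof tracer[name] === "function";
  if (!isOwnFunction) {
    return JSON.stringify({ id, error: "unknown function: " + name });
  }

  try {
    return JSON.stringify({ id, data: tracer[name](...args) });
  } catch (e) {
    return JSON.stringify({ id, error: String(e && e.message) });
  }
}
```

Malformed input, unknown function names, and functions that throw each get their own reply here, which hints at where the edge cases come from even in a protocol this small.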

src/ProxyProvider.js and src/proxy/

Brackets has an API for extensions to say, "if you're starting Live Development, I have a proxy server I would like you to use." ProxyProvider.js implements that API with a high-priority proxy that processes all JavaScript with fondue. The code for the proxy is in src/proxy/.

Brackets refers to a bundle of Node.js code it runs on behalf of an extension as a Domain. src/proxy/ProxyDomain.js is the domain for Theseus's proxy server. The proxy server uses src/proxy/middleware-proxy.js or src/proxy/middleware-static.js, depending on whether Theseus is in Proxy or Static mode.

src/Panel.js

This file isn't really necessary any more now that Brackets provides PanelManager.createBottomPanel, but it sets a few Theseus-specific CSS classes.

src/EditorInterface.js

Brackets' Editor.js has a high-level interface for interacting with the editor. Theseus's EditorInterface.js has even higher-level commands for things like jumping to a function definition.

src/EpochPanel.js

(TODO: insert a screenshot here)

The epoch panel is a one-line panel that appears at the bottom of the screen when Theseus recognizes events being emitted in an application. It shows a list of event names and the number of times they have been emitted. Log statements and exceptions also show up in the epoch panel.

src/FileCallGraph.js

This is a ridiculous experiment that you can ignore. You might find the APIs it uses helpful for well-thought-out experiments of your own.

3 of 3: node-theseus

This is an npm package you install to get the node-theseus command, which you use instead of node when you want to debug the program with Theseus.

bin/node-theseus

This is the wrapper script that does all the argument parsing. It has an important comment at the top about argument parsing.

node-theseus.js

Implements 3 important functions:

  • beginInstrumentation() overrides Node.js's require() to process all code with fondue. (This is why node-theseus is a wrapper: the first file executed by Node.js won't be require()'d.)
  • listen() starts the WebSocket connection used by Theseus.
  • launch() starts the script you specify on the command line by require()ing it. It also overrides some annoying functions like process.exit() and child_process.spawn() to disable them and/or print messages saying that they break Theseus.
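As an illustration of the first bullet, one common way to run every required file through a source transform is to wrap Node.js's internal Module.prototype._compile. This is a sketch of that general technique, not a copy of what beginInstrumentation() does; installSourceHook and the transform callback are hypothetical names, and _compile is an undocumented internal that may change between Node.js versions.

```javascript
const Module = require("module");

// Passes the source of every CommonJS module through `transform` before
// Node.js compiles it. Returns a function that removes the hook again.
// NOTE: Module.prototype._compile is an internal, undocumented method.
function installSourceHook(transform) {
  const originalCompile = Module.prototype._compile;

  Module.prototype._compile = function (content, filename, ...rest) {
    // Rewrite the source, then hand it to the original compiler unchanged
    // in every other respect.
    const rewritten = transform(content, filename);
    return originalCompile.call(this, rewritten, filename, ...rest);
  };

  return function uninstall() {
    Module.prototype._compile = originalCompile;
  };
}
```

A hook like this only sees files loaded through require() after it is installed, which matches the reason given above for launching the target script from a wrapper. In a real tool the transform callback would be the instrumenter; here it can be any function from source text to source text.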