How does Etherpad go from a Pad to other formats and vice versa?

John McLear edited this page Apr 15, 2020 · 1 revision

This page explains how Etherpad's JSON content is created when content is imported, and how HTML and other formats are produced on export.

Let's start with an example pad whose contents are:

Hello
World

I will assume you know how we define a line. Lines have Attributes; an example of an attribute would be "heading1". Attributes are stored in the Attribute Pool. Let's dive in to see what this looks like...

The Attribute Pool

{
  1: "heading1",
  2: "number",
  3: "bullet"
}

You will notice there are things in the Attribute Pool that we have never used. That's because the Attribute Pool's job is just to store the available/usable attributes. The Attribute Pool isn't where we record which attribute is applied where. The Attribute Pool changes throughout the history of a pad.

A Line Attribute is stored as a key/value pair, for example:
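To make the idea of a pool concrete, here is a minimal sketch of a number-to-name registry. The class `SimpleAttributePool` and its `put`/`get` methods are hypothetical names chosen for illustration; Etherpad's real pool is richer than this, and the numbering here starts at 1 only to match the example above.

```javascript
// Hypothetical, simplified pool: maps a number to an attribute name,
// mirroring the { 1: "heading1", ... } shape shown above.
class SimpleAttributePool {
  constructor() {
    this.numToAttrib = {};        // number -> attribute name
    this.attribToNum = new Map(); // attribute name -> number
    this.nextNum = 1;             // starts at 1 to match the example above
  }

  // Register an attribute if it is new and return its number either way.
  put(name) {
    if (this.attribToNum.has(name)) return this.attribToNum.get(name);
    const num = this.nextNum++;
    this.numToAttrib[num] = name;
    this.attribToNum.set(name, num);
    return num;
  }

  // Look up the attribute name for a number (undefined if unknown).
  get(num) {
    return this.numToAttrib[num];
  }
}

const pool = new SimpleAttributePool();
pool.put('heading1'); // 1
pool.put('number');   // 2
pool.put('bullet');   // 3
const again = pool.put('number'); // still 2, nothing new is added
```

The pool only grows when a new attribute is first seen, which is one way to picture why it changes over the history of a pad while never recording where an attribute is used.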

{
  1: true
}

But then it's stored as an Attribute String, more on this later!

Collecting content

Line Attributes are processed initially by contentcollector.js, both on import and during edits of the pad. The job of contentcollector.js is to collect the content that is pasted, typed or imported into a pad, analyze the HTML (DOM nodes) and work out which attributes should be applied.

Storing Attributes in the Attribute Pool

contentcollector.js also applies the discovered attributes and creates any required additions to the Attribute Pool.

Attribute Strings

contentcollector.js creates the Attribute Strings. For our "Heading 2" line attribute state of {2:true}, the line Attributes would be represented as "*0*1*2+1+1".
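The following sketch shows one way to read such a string. It assumes the encoding used by Etherpad changeset operations: zero or more `*<num>` attribute references, an optional `|<num>` newline count, an opcode (`+`, `-` or `=`) and a character count, with all numbers in base 36. The function name `parseAttributeString` is hypothetical.

```javascript
// Split an attribute string such as "*0*1*2+1+1" into operations.
// Assumption: numbers are base 36 and each op has the shape
//   (*<num>)* [ |<num> ] <opcode> <num>
function parseAttributeString(str) {
  const opRegex = /((?:\*[0-9a-z]+)*)(?:\|([0-9a-z]+))?([-+=])([0-9a-z]+)/g;
  const ops = [];
  for (const m of str.matchAll(opRegex)) {
    ops.push({
      // "*0*1*2" -> [0, 1, 2]; an empty match -> []
      attribs: m[1].split('*').filter(Boolean).map((n) => parseInt(n, 36)),
      lines: m[2] ? parseInt(m[2], 36) : 0, // newlines covered by this op
      opcode: m[3],
      chars: parseInt(m[4], 36),            // characters covered by this op
    });
  }
  return ops;
}

const ops = parseAttributeString('*0*1*2+1+1');
// ops[0]: one character carrying pool attributes 0, 1 and 2
// ops[1]: one character with no attributes
```

Under that reading, the numbers after each `*` are lookups into the Attribute Pool, so the string itself never needs to spell out attribute names.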

The Line Marker

Where a line attribute has been defined, contentcollector.js will add a "line marker" attribute, or lmkr. This is prefixed to the text and represented as a * symbol. So if the text before adding a line attribute was hello, then after contentcollector.js has done its work the new text will be *hello. More research is required to understand why this is required.
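As a rough illustration of the effect on the text, the sketch below prefixes the marker only when a line has at least one line attribute. The helper `withLineMarker` is a hypothetical name and not part of contentcollector.js.

```javascript
// Hypothetical helper: prefix a line's text with the "*" line marker
// when the line carries at least one line attribute.
function withLineMarker(text, lineAttributes) {
  const hasLineAttribute = Object.keys(lineAttributes).length > 0;
  return hasLineAttribute ? `*${text}` : text;
}

const marked = withLineMarker('hello', {1: true}); // "*hello"
const plain = withLineMarker('hello', {});         // "hello"
```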

Server side

Content flows like this: contentcollector.js > linestylefilter.js > ExportHTML.js/ExportText.js (ExportHelper.js) > ExportHandler.js > api or /export/*

Content collector (Server and Client)

The Content Collector is responsible for collecting content from PadManager and applying attributes to lines, then passing the result off to the Line Style Filter.

Line Style Filter (Server and Client)

Line Style Filter goes through each line, looks at the attribs, applies the classes and rewrites the DOM content.

Export Helper (Server only)

The Export Helper applies additional logic for list handling that should not be present in the client. This logic is used only in Export.

An example is how lists are formatted. In Etherpad we cannot represent <ul><li>1</li><li>2</li></ul>; we have to draw that to the DOM / Editor as <div><!-- line 1 in Etherpad--><ul><li>1</li></ul></div><div><!-- line 2 in Etherpad--><ul><li>2</li></ul></div>. Obviously this would suck as an export. We want it to be properly formatted, so we use this Helper to handle that.
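The merging step can be sketched as follows. This is not the Export Helper's actual code: the function `mergeListLines` and the `{list, text}` line shape are assumptions made for illustration, and nested lists are deliberately ignored.

```javascript
// Hypothetical input: one entry per Etherpad line. `list` is "ul", "ol"
// or null for a plain line; `text` is assumed to be HTML-escaped already.
function mergeListLines(lines) {
  const out = [];
  let openList = null; // tag name of the list currently open, if any

  for (const {list, text} of lines) {
    if (openList && openList !== list) {
      out.push(`</${openList}>`); // the run of list lines has ended
      openList = null;
    }
    if (list && !openList) {
      out.push(`<${list}>`); // start a new list for this run
      openList = list;
    }
    out.push(list ? `<li>${text}</li>` : `<div>${text}</div>`);
  }
  if (openList) out.push(`</${openList}>`); // close a list left open at the end
  return out.join('');
}

const html = mergeListLines([
  {list: 'ul', text: '1'},
  {list: 'ul', text: '2'},
]);
// html is "<ul><li>1</li><li>2</li></ul>"
```

Consecutive lines of the same list type share one wrapper, which is the difference between the per-line editor markup and the exported markup described above.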

Export HTML / Text / Etherpad (Server only)

These should be self-explanatory. They make sure the available content and atext are properly drawn as HTML. A lot of heavy lifting is done here.

Export Handler (Server only)

Export Handler takes the export request and decides, based on the type of request, which method to use. For example, we have to process .doc export requests differently to .txt.
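A dispatch of this kind can be sketched with a lookup table keyed by export type. Everything here is hypothetical (the `exporters` map, the `handleExport` function and the `pad` object shape); the real handler does considerably more work per format.

```javascript
// Hypothetical exporters keyed by export type.
const exporters = new Map([
  ['txt', (pad) => pad.text],
  ['html', (pad) => `<!DOCTYPE html><html><body>${pad.html}</body></html>`],
]);

// Pick the exporter for the requested type, or fail loudly.
function handleExport(pad, type) {
  const exporter = exporters.get(type);
  if (!exporter) throw new Error(`Unsupported export type: ${type}`);
  return exporter(pad);
}

const examplePad = {text: 'Hello\nWorld', html: '<div>Hello</div><div>World</div>'};
const asText = handleExport(examplePad, 'txt');
```

A format such as .doc would get its own entry in the table, since it needs different processing from .txt.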

Client Side

Client edits are then passed to the documentAttributeManager, which applies the changed content as Changesets. I won't go into the details here; see the Changeset docs.

Turning Attributes to Classes

In the editor, attributes are translated to classes. So in our example, class="heading2" would be applied, giving content like <div class="heading2">some text</div>.
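The translation can be pictured with a small sketch. The helper `renderLine` is a hypothetical name, and it assumes the attribute names have already been resolved from the pool and that the text is already HTML-escaped.

```javascript
// Hypothetical helper: turn a line's resolved attribute names into a
// class attribute on the line's <div>.
function renderLine(text, attributeNames) {
  const classAttr = attributeNames.length
    ? ` class="${attributeNames.join(' ')}"`
    : '';
  return `<div${classAttr}>${text}</div>`;
}

const line = renderLine('some text', ['heading2']);
// line is '<div class="heading2">some text</div>'
```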

Processing Attribute Classes to Node modifications / DOM content / HTML

If required, Domline.js (specifically createDomLine) then finally handles adding the classes as HTML.
