Idea: API should return structured article json #38

Open
deoxykev opened this issue Nov 15, 2023 · 27 comments

Comments

@deoxykev
Contributor

Using some “reading mode” algorithm (such as DOM-distiller) I think the API could return a json blob representing just the source URL, title, author, date and text content of the article, without the extra HTML.

This would make web scraping feasible for sites that are not JS-heavy.

In addition, this would open up the possibility of an endpoint that returns the cleaned content of the site, much like the old outline.com.

@mms-gianni
Contributor

Sounds good to me. But how to identify the DOM position of title, author and date?

@joncrangle
Contributor

I found a Golang port of DOM-distiller, go-domdistiller, that also incorporates some improvements. I haven't played around with it, but I think this is a cool idea.

If the API returns a filtered JSON blob, I'm wondering if it makes sense to replace form.html with a lightweight frontend (perhaps an Astro or SolidJS site). That would allow the user to choose whether they would like the original site returned as it is now, or the outline.com clone version with the user's visual styling preferences (e.g. font size, text/bg color, serif or sans-serif font).

@deoxykev
Contributor Author

@joncrangle I could work on the backend API if you want to work on the frontend display with nice typography.

I'm thinking a GET to /api/https://example.com should return:

{
    "url": "https://example.com",
    "title": "Example Domain",
    "author": "",
    "date": "2023-11-15",
    "text": "This domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission."
}

Then a GET to /outline/https://example.com would return the formatted content with nice typography / css. I think client-side rendering is fine, no need for any SSR.
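A minimal sketch of the proposed /api/ response in Go, assuming the target URL is everything after the route prefix. The `Article` type and the `targetFromPath` helper are hypothetical names used for illustration; how a given router exposes the raw path (some collapse the `//` in `https://`) is not covered here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Article mirrors the JSON fields proposed above (hypothetical type name).
type Article struct {
	URL    string `json:"url"`
	Title  string `json:"title"`
	Author string `json:"author"`
	Date   string `json:"date"`
	Text   string `json:"text"`
}

// targetFromPath strips a route prefix such as "/api/" and returns the
// remainder, which is expected to be the full target URL. It reports false
// when the prefix is missing or nothing follows it.
func targetFromPath(path, prefix string) (string, bool) {
	if !strings.HasPrefix(path, prefix) {
		return "", false
	}
	rest := strings.TrimPrefix(path, prefix)
	if rest == "" {
		return "", false
	}
	return rest, true
}

func main() {
	target, ok := targetFromPath("/api/https://example.com", "/api/")
	if !ok {
		fmt.Println("no target URL in path")
		return
	}
	a := Article{URL: target, Title: "Example Domain", Date: "2023-11-15"}
	out, err := json.MarshalIndent(a, "", "    ")
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(out))
}
```

The same helper would serve the /outline/ route by passing a different prefix.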

@mms-gianni
Contributor

I think an additional route to display a simplified article is a very good idea. It could become a big help for people with visual impairments.

I'm not sure yet about adding a framework to display a simple article. This might be solvable with Tailwind. I'd rather add a separate button to the default form that opens the page as an outline.

@joncrangle
Contributor

joncrangle commented Nov 16, 2023

I started work in a branch on the frontend piece. Rather than use a framework, I created a Get handler for the content/ route and used Go Fiber's Template package with the html engine to populate a basic html page with the returned json from the api/ route.

The data is mocked at the moment.


If this is moving in the right direction, I plan to add a dropdown menu in the top right so the user can select visual preferences regarding font family (serif / sans serif) and increase or decrease the font size. Perhaps even a light/dark mode switch as well.

I also made some improvements to the form.html page (client-side input validation and error handling, escape key to clear input when focused, styling improvements). I put two buttons to navigate to either the /content/:url or /:url route.


@mms-gianni
Contributor

I like the simplicity. Maybe add an image to the article since the library is able to extract it.

But feel free to change everything.

@joncrangle
Contributor

@deoxykev How do you envision the API returning longer content / images, even if only a cover image?

@deoxykev
Contributor Author

deoxykev commented Nov 16, 2023

@joncrangle I like the frontend.

To handle images, I'm thinking we could return them like this:

GET /api/content/https://example.com

{
  "success": true,
  "error": {
      "message": "This is an example message. If success is true, this shouldn't be here.",
      "type": "example_error",
      "cause": { ... recurse for nested errors ... }
  },
  "data": {
    "url": "https://example.com",
    "title": "Example Domain",
    "author": "",
    "date": "2023-11-15",
    "content": [
        {
            "type": "h1",
            "data": "This domain is for use in illustrative examples in documents."
        },
        {
            "type": "img",
            "url": "/https://example.com/header-image.jpg",
            "alt": "header image alt text",
            "position": "header"
        },
        {
            "type": "p",
            "data": "You may use this domain in literature without prior coordination or asking for permission."
        },
        {
            "type": "img",
            "url": "/https://example.com/inline-image.jpg",
            "alt": "inline image alt text",
            "position": "inline"
        }
    ]
  }
}

Note the image url is a relative path starting with /, such that the image request would go to `http://localhost:8080/https://example.com/inline-image.jpg`.
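A minimal sketch of how such proxy-relative image paths could be produced, assuming each image `src` is first resolved against the article's page URL. The function name `proxyImagePath` is hypothetical; only the standard `net/url` package is used.

```go
package main

import (
	"fmt"
	"net/url"
)

// proxyImagePath resolves src against the article's page URL and prefixes the
// absolute result with "/", so the browser requests the image from the proxy
// host instead of the origin.
func proxyImagePath(pageURL, src string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(src)
	if err != nil {
		return "", err
	}
	return "/" + base.ResolveReference(ref).String(), nil
}

func main() {
	p, err := proxyImagePath("https://example.com/article", "/inline-image.jpg")
	if err != nil {
		fmt.Println("bad URL:", err)
		return
	}
	fmt.Println(p)
}
```

Resolving first means a site-relative `src` and an absolute `src` on another host both end up in the same `/https://host/path` form.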


Then maybe the API could look more like this:

endpoint        usage
/api/raw        returns raw HTML text
/api/text       returns only plaintext
/api/content    used for outline frontend

All APIs should contain the top level objects:

{
    "success": bool,
    "error": {},
    "data": {},
}
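A minimal sketch of that shared envelope in Go. The type and function names (`Envelope`, `APIError`, `OK`, `Fail`, `encode`) are hypothetical, and omitting `error` on success is an assumption based on the note in the earlier example that the error object should not be present when `success` is true.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// APIError is a hypothetical shape for the "error" object; Cause nests
// recursively to mirror wrapped Go errors.
type APIError struct {
	Message string    `json:"message"`
	Type    string    `json:"type"`
	Cause   *APIError `json:"cause,omitempty"`
}

// Envelope is the top-level object shared by every endpoint.
type Envelope struct {
	Success bool        `json:"success"`
	Error   *APIError   `json:"error,omitempty"`
	Data    interface{} `json:"data,omitempty"`
}

// fromError converts an error chain into nested APIError values.
func fromError(err error, typ string) *APIError {
	if err == nil {
		return nil
	}
	return &APIError{
		Message: err.Error(),
		Type:    typ,
		Cause:   fromError(errors.Unwrap(err), typ),
	}
}

// OK wraps a successful payload.
func OK(data interface{}) Envelope { return Envelope{Success: true, Data: data} }

// Fail wraps an error (and its wrapped causes) in a failed envelope.
func Fail(err error, typ string) Envelope {
	return Envelope{Success: false, Error: fromError(err, typ)}
}

// encode marshals an envelope, falling back to a bare failure on error.
func encode(e Envelope) string {
	b, err := json.Marshal(e)
	if err != nil {
		return `{"success":false}`
	}
	return string(b)
}

func main() {
	fmt.Println(encode(OK(map[string]string{"url": "https://example.com"})))
	wrapped := fmt.Errorf("fetch failed: %w", errors.New("timeout"))
	fmt.Println(encode(Fail(wrapped, "example_error")))
}
```

If clients should always see all three keys, dropping `omitempty` would emit `null` for the unused ones instead.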

@joncrangle
Contributor

I've used the following mock json response to make some pretty good progress on the frontend:

{
  "success": true,
  "error": {
      "message": "This is an example message. If success is true, this shouldn't be here.",
      "type": "example_error",
      "cause": "recurse - for nested error - string for testing"
  },
  "data": {
    "url": "https://example.com",
    "title": "Example Domain",
    "author": "John Doe",
    "date": "2023-11-15",
    "content": [
        {
            "type": "h1",
            "data": "This domain is for use in illustrative examples in documents."
        },
        {
            "type": "img",
            "url": "/https://source.unsplash.com/random/900x700/?city,night",
            "alt": "header image alt text",
            "caption": "This is the image caption"
        },
        {
            "type": "h2",
            "data": "This is an h2."
        },
        {
            "type": "p",
            "data": "&lt;a&gt; tag: <a href=\"/https://example.com\">This is an example link</a>. This is <em>emphasized</em> text. This is <strong>bold</strong> text. These are example &lt;kbd&gt; tags <kbd>Ctrl</kbd> + <kbd>Shift</kbd>"
        },
        {
            "type": "p",
            "data": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum." 
        },
        {
            "type": "p",
            "data": "Pulvinar etiam non quam lacus suspendisse faucibus. Et pharetra pharetra massa massa ultricies mi. Rhoncus dolor purus non enim praesent elementum facilisis leo vel. Phasellus vestibulum lorem sed risus ultricies tristique nulla. Duis tristique sollicitudin nibh sit amet commodo nulla. Eget aliquet nibh praesent tristique magna sit amet purus gravida. Sem fringilla ut morbi tincidunt augue interdum velit euismod. Amet consectetur adipiscing elit duis tristique sollicitudin nibh sit. Lobortis scelerisque fermentum dui faucibus in ornare quam viverra. Nunc sed blandit libero volutpat sed cras ornare. Sit amet purus gravida quis. Duis ut diam quam nulla porttitor massa."
        },
        {
            "type": "p",
            "data": "Integer malesuada nunc vel risus. Lobortis feugiat vivamus at augue eget arcu dictum varius. Pulvinar sapien et ligula ullamcorper malesuada. Vel quam elementum pulvinar etiam non quam. Magnis dis parturient montes nascetur ridiculus mus mauris vitae. Odio eu feugiat pretium nibh. Pretium nibh ipsum consequat nisl vel pretium lectus. Elementum curabitur vitae nunc sed velit dignissim sodales. Mauris sit amet massa vitae tortor condimentum lacinia quis. Orci porta non pulvinar neque laoreet suspendisse. Enim eu turpis egestas pretium aenean pharetra magna ac placerat."
        },
        {
            "type": "h3",
            "data": "This is an h3."
        },
        {
            "type": "blockquote",
            "data": "This is a blockquote."
        },
        {
            "type": "h4",
            "data": "This is an h4. Here comes a list:"
        },
        {
            "type": "ul",
            "data": "<li>Item 1</li><li>Item 2</li><li>Item 3</li>"
        },
        {
            "type": "hr",
            "data": ""
        },
        {
            "type": "ol",
            "data": "<li>Item 1</li><li>Item 2</li><li>Item 3</li>"
        },
        {
            "type": "img",
            "url": "/https://source.unsplash.com/random/900x700/?cat,dog",
            "alt": "image alt text",
            "caption": ""
        },
        {
            "type": "table",
            "data": "<tr><th>Person 1</th><th>Person 2</th><th>Person 3</th></tr><tr><td>Emil</td><td>Tobias</td><td>Linus</td></tr><tr><td>16</td><td>14</td><td>10</td></tr>"
        },
        {
            "type": "code",
            "data": "func main() { fmt.Println(\"Hello, World!\")}"
        },
        {
            "type": "made-up",
            "data": "You may use this domain in literature without prior coordination or asking for permission."
        }
    ]
  }
}

I'm handling a lot of different tags we might encounter. At the moment, I'm unescaping p, table, ul and ol tags in order to render the elements they contain and keep it simple.
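A minimal sketch of that selective unescaping, assuming a flat `content` array of type/data blocks. `Block`, `rawTypes`, `knownTags` and `renderBlock` are hypothetical names, images and `hr` are left out for brevity, and passing data through unescaped is only safe if the extractor's output has already been sanitized.

```go
package main

import (
	"fmt"
	"html"
	"html/template"
)

// Block is one entry of the flat content array.
type Block struct {
	Type string `json:"type"`
	Data string `json:"data"`
}

// rawTypes lists the block types whose data may carry inline HTML and is
// therefore passed through unescaped.
var rawTypes = map[string]bool{"p": true, "table": true, "ul": true, "ol": true}

// knownTags lists the block types rendered with their own tag; anything else
// (for example "made-up") falls back to a paragraph.
var knownTags = map[string]bool{
	"h1": true, "h2": true, "h3": true, "h4": true, "p": true,
	"blockquote": true, "ul": true, "ol": true, "table": true, "code": true,
}

// renderBlock wraps the block data in its tag, escaping the data unless the
// block type is in rawTypes.
func renderBlock(b Block) template.HTML {
	tag := b.Type
	if !knownTags[tag] {
		tag = "p"
	}
	body := b.Data
	if !rawTypes[b.Type] {
		body = html.EscapeString(body)
	}
	return template.HTML("<" + tag + ">" + body + "</" + tag + ">")
}

func main() {
	fmt.Println(renderBlock(Block{Type: "p", Data: "This is <em>emphasized</em> text."}))
	fmt.Println(renderBlock(Block{Type: "made-up", Data: "a < b"}))
}
```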


I've also started designing a dropdown for user preferences. The JavaScript to handle this remains to be written. I'm thinking I'll probably save the visual preferences to local storage so they persist.


@deoxykev
Contributor Author

This looks fantastic. I’m excited.

I'm working on a refactor of the core proxy logic, but the API should be ready soon.

@mms-gianni
Contributor

I can only agree with that. I really like where this is going.

@joncrangle
Contributor

I've made further progress in feat/outline. I'm still just rendering mock data from a mock.json file. I haven't directed my attention to calling the api route for the data since the types won't match at the moment.

The dropdown and its functionality work without adding any new dependencies. I've changed the route to /outline for the outline page. It will also render the errors array on the outline page if "success": false.

When the major refactor work is complete and we can direct our attention to the API, I don't think it will be too much work to plug in this frontend piece. I've used switch/case statements in handlers/outline.go to make it easy to add tags and other content types as the API evolves. At the moment, it's set up for an API that returns the content array one level deep and unescapes tags that would be nested, so the styles/input.css file is used to globally apply styles to these types of tags. In the future, we may end up with an API that returns more of a nested DOM tree, so I wanted to make sure this would be easy to maintain as the API evolves and becomes more complex.

I've noticed that since I'm storing the font, font size and theme preferences in local storage, any rulesets that clear local storage clear those as well. Instead of clearing local storage wholesale, those rules will need to be crafted more like this so that the user preference values are kept:

const keysToKeep = ["font", "fontsize", "theme"];
Object.keys(localStorage).filter(key => !keysToKeep.includes(key)).forEach(key => localStorage.removeItem(key));

@dxbednarczyk
Contributor

dxbednarczyk commented Nov 24, 2023

> Sounds good to me. But how to identify the DOM position of title, author and date?

My two cents (if they're not irrelevant already thanks to the work other people have put into this issue) is that we can use goquery to parse articles for information like this.

Potentially, different selectors for each site can be provided in ruleset.yaml. Parsing an example article (run to see result, see code by forking). This example only extracts the headline, author and date, but can be expanded to almost anything, including cover photos.

@ndom91

ndom91 commented Nov 28, 2023

Came across this issue and although y'all seem pretty far along already, I wanted to share this web metadata scraping pkg / service I've used successfully before. It can return a bunch of structured metadata from websites.

https://metascraper.js.org/#/

The maintainer, kikobeats, has a few other similar services and runs some as a SaaS as well (https://microlink.io/meta)

@deoxykev
Contributor Author

I'm still stuck on a refactor, so no real work has been done on it yet. Metascraper seems interesting, thanks for sharing. It seems to rely on a headless browser, with a latency of around 2 seconds. It does seem to return neat metadata such as background colors, which might be interesting for CSS.

Almost there.... #50

@joncrangle
Contributor

In addition to go-domdistiller, there is also go-trafilatura, which seems to use go-domdistiller and go-readability as fallback extractors. I haven't tried these, but go-trafilatura seems to perform well in their benchmarks as a reading mode algorithm that can extract both the metadata and the content.

@dxbednarczyk
Contributor

> In addition to go-domdistiller, there is also go-trafilatura that seems to have fallback extractor functions to go-domdistiller and go-readability. I haven't tried these, but it seems to perform well in their benchmarks as a reading mode algorithm that can extract the metadata, and content.

The libraries you mentioned suffer from not having 100% accuracy. This would potentially make the API hard to test, and sometimes return incorrect data. Maybe we can use these libraries for the majority of the API response data, and "fill in the blanks" with hard-coded rules I mentioned in my previous comment.
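A minimal sketch of that "fill in the blanks" idea, assuming both the extraction library and the per-site rules produce the same small metadata record. The `Metadata` type and `fillBlanks` function are hypothetical; the extractor's values win, and the rule-based values are used only where the extractor returned nothing.

```go
package main

import "fmt"

// Metadata holds the fields discussed in this thread (hypothetical type).
type Metadata struct {
	Title  string
	Author string
	Date   string
}

// fillBlanks returns extracted with each empty field replaced by the
// corresponding value from fallback (for example, values obtained from
// per-site selectors in the ruleset).
func fillBlanks(extracted, fallback Metadata) Metadata {
	if extracted.Title == "" {
		extracted.Title = fallback.Title
	}
	if extracted.Author == "" {
		extracted.Author = fallback.Author
	}
	if extracted.Date == "" {
		extracted.Date = fallback.Date
	}
	return extracted
}

func main() {
	fromLibrary := Metadata{Title: "Example Domain"}
	fromRules := Metadata{Title: "Rule Title", Author: "John Doe", Date: "2023-11-15"}
	fmt.Printf("%+v\n", fillBlanks(fromLibrary, fromRules))
}
```

Because the merge is a pure function, it can be unit-tested with fixed inputs even though the extractor itself is not fully accurate.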

@deoxykev
Contributor Author

deoxykev commented Nov 29, 2023

I just tried go-trafilatura in b7a012d with pretty decent results. Here are two samples: 1, 2.

The library returns everything as a DOM node, which can be rendered to HTML. Most articles have opengraph tags, so getting metadata such as title, description, tags, etc should be trivial.


Anyway, sometimes the article has some JS that makes the content disappear from view, but it's still in the DOM. In which case, this does a great job of extracting the content and thus "bypassing" the paywall. If there was a CSS selector in the ruleset, it could definitely pick the content out more accurately, using goquery or similar.

@joncrangle
Contributor

This seems very promising. While exploring the go-trafilatura package, I saw that there was an output.go with a jsonExtractResult function, as well as helper.go that contains the CreateReadableDocument function. I sort of mashed them together below in a first attempt proof of concept to return json similar to the API described above.

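// Import paths are assumed from the libraries named in this thread.
import (
	"github.com/go-shiori/dom"
	"github.com/markusmobius/go-trafilatura"
	"golang.org/x/net/html"
)

type ImageContent struct {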
	Type    string `json:"type"`
	URL     string `json:"url"`
	Alt     string `json:"alt"`
	Caption string `json:"caption"`
}

type LinkContent struct {
	Type string `json:"type"`
	Href string `json:"href"`
	Data string `json:"data"`
}

type TextContent struct {
	Type string `json:"type"`
	Data string `json:"data"`
}

type JSONDocument struct {
	Success bool `json:"success"`
	Error   struct {
		Message string `json:"message"`
		Type    string `json:"type"`
		Cause   string `json:"cause"`
	} `json:"error"`
	Metadata struct {
		Title       string   `json:"title"`
		Author      string   `json:"author"`
		URL         string   `json:"url"`
		Hostname    string   `json:"hostname"`
		Description string   `json:"description"`
		Sitename    string   `json:"sitename"`
		Date        string   `json:"date"`
		Categories  []string `json:"categories"`
		Tags        []string `json:"tags"`
		License     string   `json:"license"`
	} `json:"metadata"`
	Content  []interface{} `json:"content"`
	Comments string        `json:"comments"`
}

func CreateJSONDocument(extract *trafilatura.ExtractResult) *JSONDocument {
	jsonDoc := &JSONDocument{}

	// Populate success
	jsonDoc.Success = true

	// Populate metadata
	jsonDoc.Metadata.Title = extract.Metadata.Title
	jsonDoc.Metadata.Author = extract.Metadata.Author
	jsonDoc.Metadata.URL = extract.Metadata.URL
	jsonDoc.Metadata.Hostname = extract.Metadata.Hostname
	jsonDoc.Metadata.Description = extract.Metadata.Description
	jsonDoc.Metadata.Sitename = extract.Metadata.Sitename
	jsonDoc.Metadata.Date = extract.Metadata.Date.Format("2006-01-02")
	jsonDoc.Metadata.Categories = extract.Metadata.Categories
	jsonDoc.Metadata.Tags = extract.Metadata.Tags
	jsonDoc.Metadata.License = extract.Metadata.License

	// Populate content
	if extract.ContentNode != nil {
		jsonDoc.Content = parseContent(extract.ContentNode)
	}

	// Populate comments
	if extract.CommentsNode != nil {
		jsonDoc.Comments = dom.OuterHTML(extract.CommentsNode)
	}

	return jsonDoc
}

func parseContent(node *html.Node) []interface{} {
	var content []interface{}

	for child := node.FirstChild; child != nil; child = child.NextSibling {
		switch child.Data {
		case "img":
			image := ImageContent{
				Type:    "img",
				URL:     dom.GetAttribute(child, "src"),
				Alt:     dom.GetAttribute(child, "alt"),
				Caption: dom.GetAttribute(child, "caption"),
			}
			content = append(content, image)

		case "a":
			link := LinkContent{
				Type: "a",
				Href: dom.GetAttribute(child, "href"),
				Data: dom.InnerText(child),
			}
			content = append(content, link)

		case "h1", "h2", "h3":
			// Headings share one shape; the tag name doubles as the type.
			text := TextContent{
				Type: child.Data,
				Data: dom.InnerText(child),
			}
			content = append(content, text)

		// continue with other tags

		default:
			text := TextContent{
				Type: "p",
				Data: dom.InnerText(child),
			}
			content = append(content, text)
		}
	}

	return content
}

@deoxykev
Contributor Author

deoxykev commented Nov 29, 2023

Sweet, thanks for that. Here are the preliminary API results: 1 2.

I'll migrate this over to the /api/content/<url> endpoint tomorrow. In my branch, you should be able to go run cmd/main.go and test it out.

@deoxykev
Contributor Author

The API is now ready for testing!

FYI I changed the path from /api/content to /api/outline.

Usage is like: curl http://localhost:8080/api/outline/https://www.newyorker.com/magazine/2023/12/04/how-jensen-huangs-nvidia-is-powering-the-ai-revolution

@joncrangle can you test your frontend in feat/outline with the origin/proxy_v2 refactor branch?

There's tons of changes, so if you submit a PR to my branch I can sort out the conflicts.

@joncrangle
Contributor

> The API is now ready for testing!
>
> FYI I changed the path from /api/content to /api/outline.
>
> Usage is like: curl http://localhost:8080/api/outline/https://www.newyorker.com/magazine/2023/12/04/how-jensen-huangs-nvidia-is-powering-the-ai-revolution
>
> @joncrangle can you test your frontend in feat/outline with the origin/proxy_v2 refactor branch?
>
> There's tons of changes, so if you submit a PR to my branch I can sort out the conflicts.

I've been tweaking the functions above to fix some issues. Right now, tags aren't escaped so we end up losing a bunch of content (basically any tags nested within another text tag are just treated like text). As I've been experimenting, I've been testing a recursive approach to generate json that is more like a nested DOM. The following is still buggy but sharing so you can see what I've been experimenting with:

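// Import paths are assumed from the libraries named in this thread.
import (
	"strings"

	"github.com/go-shiori/dom"
	"github.com/markusmobius/go-trafilatura"
	"golang.org/x/net/html"
)

type ImageContent struct {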
	Type    string `json:"type"`
	URL     string `json:"url"`
	Alt     string `json:"alt"`
	Caption string `json:"caption"`
}

type LinkContent struct {
	Type string `json:"type"`
	Href string `json:"href"`
	Data string `json:"data"`
}

type TextContent struct {
	Type string `json:"type"`
	Data string `json:"data"`
}

type JSONDocument struct {
	Success bool `json:"success"`
	Error   struct {
		Message string `json:"message"`
		Type    string `json:"type"`
		Cause   string `json:"cause"`
	} `json:"error"`
	Metadata struct {
		Title       string   `json:"title"`
		Author      string   `json:"author"`
		URL         string   `json:"url"`
		Hostname    string   `json:"hostname"`
		Description string   `json:"description"`
		Sitename    string   `json:"sitename"`
		Date        string   `json:"date"`
		Categories  []string `json:"categories"`
		Tags        []string `json:"tags"`
		License     string   `json:"license"`
	} `json:"metadata"`
	Content  Content `json:"content"`
	Comments Content `json:"comments"`
}

type Content struct {
	Type     string    `json:"type"`
	Data     string    `json:"data,omitempty"`
	URL      string    `json:"url,omitempty"`
	Alt      string    `json:"alt,omitempty"`
	Caption  string    `json:"caption,omitempty"`
	Href     string    `json:"href,omitempty"`
	Children []Content `json:"children,omitempty"`
}

func createJSONDocument(extract *trafilatura.ExtractResult) *JSONDocument {
	jsonDoc := &JSONDocument{}

	// Populate success
	jsonDoc.Success = true

	// Populate metadata
	jsonDoc.Metadata.Title = extract.Metadata.Title
	jsonDoc.Metadata.Author = extract.Metadata.Author
	jsonDoc.Metadata.URL = extract.Metadata.URL
	jsonDoc.Metadata.Hostname = extract.Metadata.Hostname
	jsonDoc.Metadata.Description = extract.Metadata.Description
	jsonDoc.Metadata.Sitename = extract.Metadata.Sitename
	jsonDoc.Metadata.Date = extract.Metadata.Date.Format("2006-01-02")
	jsonDoc.Metadata.Categories = extract.Metadata.Categories
	jsonDoc.Metadata.Tags = extract.Metadata.Tags
	jsonDoc.Metadata.License = extract.Metadata.License

	// Populate content
	if extract.ContentNode != nil {
		jsonDoc.Content = parseContent(extract.ContentNode)
	}

	// Populate comments
	if extract.CommentsNode != nil {
		jsonDoc.Comments = parseContent(extract.CommentsNode)
	}

	return jsonDoc
}

func parseContent(node *html.Node) Content {
	var content Content

	switch node.Type {
	case html.ElementNode:
		switch node.Data {
		case "img":
			content = Content{
				Type:    "img",
				URL:     dom.GetAttribute(node, "src"),
				Alt:     dom.GetAttribute(node, "alt"),
				Caption: dom.GetAttribute(node, "caption"),
			}

		case "a":
			content = Content{
				Type: "a",
				Href: dom.GetAttribute(node, "href"),
				Data: dom.InnerText(node),
			}

		case "h1", "h2", "h3", "h4", "h5", "h6", "blockquote", "code", "pre", "kbd":
			content = Content{
				Type:     node.Data,
				Data:     dom.InnerText(node),
				Children: parseChildren(node),
			}

		//TODO additional tags

		default:
			// For other HTML tags, recursively call parseContent
			content = Content{
				Type:     "parent", // Use a default type for other HTML tags
				Children: parseChildren(node),
			}
		}

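	case html.TextNode:
		// Handle text nodes only if they contain non-whitespace characters.
		// The type is "text" so that parseChildren can recognize and merge
		// text nodes; whitespace-only nodes yield an empty Content.
		text := strings.TrimSpace(dom.InnerText(node))
		if text != "" {
			content = Content{
				Type: "text",
				Data: text,
			}
		}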
	}

	return content
}

func parseChildren(node *html.Node) []Content {
	var children []Content
	var currentText string

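	for child := node.FirstChild; child != nil; child = child.NextSibling {
		childContent := parseContent(child)

		// Skip nodes that produced no content (e.g. whitespace-only text or comments)
		if childContent.Type == "" {
			continue
		}

		if childContent.Type == "text" {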
			// If the current child is a text node, concatenate its data with previous text nodes
			currentText += childContent.Data
		} else {
			// If the current child is not a text node, append the previous text nodes as a single text node
			if currentText != "" {
				textNode := Content{
					Type: "text",
					Data: currentText,
				}
				children = append(children, textNode)
				currentText = ""
			}

			// Append the current child content
			children = append(children, childContent)
		}
	}

	// If there are remaining text nodes after the loop, append them as a single text node
	if currentText != "" {
		textNode := Content{
			Type: "text",
			Data: currentText,
		}
		children = append(children, textNode)
	}

	return children
}

This provided the following output. There are still lots of mistakes (an a tag was missed, presumably because it had a nested em tag), and I've been trying to handle p tags before looking at other tags, so this still needs quite a bit of additional work.

@deoxykev
Contributor Author

deoxykev commented Nov 29, 2023

I think we should reconsider the nested DOM structure approach. There are actually two primary goals here:

  • API Endpoint: structured article content suitable for scraping and automated extraction.
  • User-Friendly Endpoint: Provide a clean, standardized article format akin to Outline.

Separation of Concerns

The current approach, where the user-facing endpoint depends on the API structure, might be too complicated. Mimicking HTML in JSON introduces unnecessary complexity and edge case handling. Because the DOM distillation algorithms tend to return an HTML DOM structure anyway, it doesn't make sense to translate the structure into JSON, only to have frontend JavaScript translate that back into HTML. Let's just return the DOM structure directly.

Design proposal

To that end, I think we should have two endpoints:

  • API Endpoint: Returns JSON data, focusing on simplified structure and content.
    • /api/content/*
  • User Interface Endpoint: Server-side rendered outline version, leveraging Tailwind CSS for styling.
    • /outline/*
    • A user could request a different CSS style by hitting different subdirectories:
      • /outline/dark/*
      • /outline/light/*
      • /outline/monokai/*

A split approach here would simplify the architecture and reduce the need for display logic in two places, in two different languages, making the code easier to maintain.

Additionally, a server-side rendered approach enables us to keep the page navigable via href links.
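A minimal sketch of how the themed /outline/ routes could be split into a theme and a target URL. The `themes` set, the `defaultTheme` value and the `parseOutlinePath` function are hypothetical; the default of "light" in particular is an assumption, since the proposal does not say which style an unthemed request should get.

```go
package main

import (
	"fmt"
	"strings"
)

// themes is a hypothetical set of supported style names.
var themes = map[string]bool{"dark": true, "light": true, "monokai": true}

// defaultTheme is an assumption: the style used when no theme segment is given.
const defaultTheme = "light"

// parseOutlinePath splits "/outline/<theme>/<url>" or "/outline/<url>" into
// the theme name and the target URL. It reports false for any other path.
func parseOutlinePath(path string) (theme, target string, ok bool) {
	const prefix = "/outline/"
	if !strings.HasPrefix(path, prefix) {
		return "", "", false
	}
	rest := strings.TrimPrefix(path, prefix)
	// The first segment counts as a theme only if it is a known theme name;
	// otherwise the whole remainder is treated as the target URL.
	if i := strings.Index(rest, "/"); i > 0 && themes[rest[:i]] {
		theme, rest = rest[:i], rest[i+1:]
	} else {
		theme = defaultTheme
	}
	if rest == "" {
		return "", "", false
	}
	return theme, rest, true
}

func main() {
	for _, p := range []string{
		"/outline/dark/https://example.com",
		"/outline/https://example.com",
	} {
		theme, target, ok := parseOutlinePath(p)
		fmt.Println(theme, target, ok)
	}
}
```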

@joncrangle
Contributor

I agree that this nested DOM approach is getting too complex, and I ran into a number of footguns just experimenting. If nested tags can be escaped and unescaped so that the result mirrors the following, it keeps things simple and we probably wouldn't need two endpoints, just the /api/content/* endpoint.

{
"type": "p",
"data": "&lt;a&gt; tag: <a href=\"/https://example.com\">This is an example link</a>. This is <em>emphasized</em> text. This is <strong>bold</strong> text. These are example &lt;kbd&gt; tags <kbd>Ctrl</kbd> + <kbd>Shift</kbd>"
},

@deoxykev
Contributor Author

What if it just sent the rendered HTML directly?

So imagine this:

You have the “outlined” and distilled HTML, with nested markup, etc. Basic html element tags only, no classes.

Then you have a shell, with a skeleton div space for the outlined HTML to be injected into via template rendering.

This shell contains the menu from your frontend where you can tweak the global css styles, as well as the ladder logo header, and a print option.

When the site is requested, the “outlined” html is injected into the shell template, and the whole blob is sent to the client. Perhaps images could be rendered inline so that users could easily save the entire document to their computer.
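A minimal sketch of that injection step using Go's standard `html/template`, assuming the distilled HTML has already been sanitized (marking it as `template.HTML` disables escaping). The shell here is a two-line stand-in for the real outline.html, and `Page`, `shellTmpl` and `renderOutline` are hypothetical names.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// shell is a stand-in for the real template; the actual shell would also
// carry the preferences menu, footer and scripts.
const shell = "<header><h1>{{.Title}}</h1></header>\n<main>{{.Body}}</main>\n"

var shellTmpl = template.Must(template.New("shell").Parse(shell))

// Page is the data passed to the shell template.
type Page struct {
	Title string        // escaped by the template engine
	Body  template.HTML // injected as-is; must be sanitized beforehand
}

// renderOutline injects the distilled article HTML into the shell and returns
// the complete document as a string.
func renderOutline(title, distilledHTML string) (string, error) {
	var sb strings.Builder
	err := shellTmpl.Execute(&sb, Page{Title: title, Body: template.HTML(distilledHTML)})
	if err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderOutline("Example Domain", "<p>This domain is for use in illustrative examples.</p>")
	if err != nil {
		fmt.Println("render error:", err)
		return
	}
	fmt.Print(out)
}
```

With this shape the styling stays entirely in the shell's CSS, since the injected markup carries only basic element tags.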

@joncrangle
Contributor

That approach would probably be pretty quick to implement. If we update the templating to inject the outlined HTML content (potentially the metadata for headings) or any API errors into the main tag of outline.html, the styling for all the text elements can all be moved to input.css to apply global styles. The outline.html shell already has a ladder header, footer, dropdown and JavaScript for controlling user preferences from script.js.

@deoxykev
Contributor Author

Alright, I’ll try integrating that when I have time later today. Will you join us on the discord @joncrangle? A few of us are already there.

https://discord.gg/DkUDeD7Z
