XML parser student project
Rohan Chakravarthy edited this page Jan 10, 2015 · 4 revisions
Background information: An important part of loading web pages is the process of turning HTML source into a DOM. We have parsers that do that in Servo already, but the ability to turn XHTML (based on XML) into a DOM is missing. We need a parser that can read XML and build a tree of DOM nodes out of it in the same manner that the current HTML parser does; using it as a model for the new parser is encouraged.
Initial step: Build Servo, create a new Parser trait in hubbub_html_parser.rs, create a new HTMLParser struct that contains a hubbub::Parser<'a> member, and implement the Parser trait for HTMLParser. Make sure it builds and pages still load correctly.
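As a rough illustration of the initial step, the sketch below shows one possible shape for the trait and the struct. It is written against current stable Rust rather than the 2015 toolchain, and everything except the names Parser, HTMLParser, and parse_chunk is an assumption: the bytes_seen counter is a hypothetical stand-in for the hubbub::Parser<'a> member, and feed_all is an illustrative helper, not existing Servo code.

```rust
/// Anything that can consume a document incrementally, one chunk at a time.
pub trait Parser {
    /// Feed the next chunk of raw input to the parser.
    fn parse_chunk(&mut self, input: &[u8]);
}

/// Stand-in for the struct that would own the hubbub parser.
/// The real struct would hold a `hubbub::Parser<'a>` member; a byte counter
/// is used here (an assumption) so the sketch compiles on its own.
pub struct HTMLParser {
    bytes_seen: usize,
}

impl HTMLParser {
    pub fn new() -> HTMLParser {
        HTMLParser { bytes_seen: 0 }
    }

    /// Total number of bytes fed to this parser so far.
    pub fn bytes_seen(&self) -> usize {
        self.bytes_seen
    }
}

impl Parser for HTMLParser {
    fn parse_chunk(&mut self, input: &[u8]) {
        // A real implementation would forward `input` to the hubbub parser.
        self.bytes_seen += input.len();
    }
}

/// Hypothetical helper: drive any `Parser` over a sequence of chunks.
/// Taking a trait object means the caller does not need to know which
/// concrete parser (HTML or, later, XML) it is feeding.
pub fn feed_all(parser: &mut dyn Parser, chunks: &[&[u8]]) {
    for chunk in chunks {
        parser.parse_chunk(chunk);
    }
}
```

Keeping the loading code behind a trait object like this is what would later allow an XMLParser to be swapped in without changing the code that delivers network data.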
- Create a Parser trait that has a parse_chunk method
- Create an HTMLParser struct that contains the hubbub-related data in parse_html (such as the TreeHandler), and have it impl the Parser trait
- Make RustyXML a dependency of the script crate. Learn more about Cargo, the dependency manager and build system that we use.
- Create an XMLParser struct that contains the data necessary for using RustyXML and make it impl the Parser trait
- Rewrite parse_html to create the right Parser based on the HTTP Content-Type header (use application/xhtml+xml for the XML parser; see an example)
- Expand on the XML parser to use the events (https://github.com/Florob/RustyXML/blob/master/src/xml/lib.rs#L135) to perform the same actions as the HTML parser (see the callbacks in TreeHandler). Start with creating elements and setting attributes.
- Support executing scripts by checking the elements being added and using the same "discovery" mechanism the HTML parser does (see js_script_listener)
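The later steps (choosing a parser from the Content-Type header, turning XML events into elements with attributes, and noticing script elements) can be sketched roughly as follows. This is a self-contained illustration in current stable Rust, not Servo or RustyXML code: XmlEvent is a simplified stand-in for the events an XML parser reports, Node is a stand-in for DOM nodes, and parser_for_content_type and build_tree are hypothetical names. Script "discovery" is reduced to counting script elements; the real mechanism goes through js_script_listener.

```rust
/// Which parser a response should be handed to (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub enum ParserKind {
    Html,
    Xml,
}

/// Choose a parser from a Content-Type header value, ignoring parameters
/// such as "; charset=utf-8". Anything that is not XHTML falls back to HTML.
pub fn parser_for_content_type(content_type: &str) -> ParserKind {
    let mime = content_type.split(';').next().unwrap_or("").trim();
    if mime.eq_ignore_ascii_case("application/xhtml+xml") {
        ParserKind::Xml
    } else {
        ParserKind::Html
    }
}

/// Simplified stand-in for XML parser events (not RustyXML's actual types).
pub enum XmlEvent {
    ElementStart { name: String, attributes: Vec<(String, String)> },
    ElementEnd,
    Characters(String),
}

/// Minimal stand-in for a DOM node.
#[derive(Debug, PartialEq)]
pub enum Node {
    Element { name: String, attributes: Vec<(String, String)>, children: Vec<Node> },
    Text(String),
}

/// Build a tree from a stream of events. Returns the top-level nodes and
/// the number of script elements encountered along the way.
pub fn build_tree(events: Vec<XmlEvent>) -> (Vec<Node>, usize) {
    // children_stack[0] collects top-level nodes; each open element pushes
    // one more list that collects its own children.
    let mut children_stack: Vec<Vec<Node>> = vec![Vec::new()];
    let mut open: Vec<(String, Vec<(String, String)>)> = Vec::new();
    let mut scripts = 0;

    for event in events {
        match event {
            XmlEvent::ElementStart { name, attributes } => {
                // XHTML element names are case-sensitive and lowercase.
                if name == "script" {
                    scripts += 1;
                }
                open.push((name, attributes));
                children_stack.push(Vec::new());
            }
            XmlEvent::ElementEnd => {
                // An end event with no open element is ignored.
                if let Some((name, attributes)) = open.pop() {
                    let children = children_stack.pop().unwrap_or_default();
                    let node = Node::Element { name, attributes, children };
                    if let Some(parent) = children_stack.last_mut() {
                        parent.push(node);
                    }
                }
            }
            XmlEvent::Characters(text) => {
                if let Some(parent) = children_stack.last_mut() {
                    parent.push(Node::Text(text));
                }
            }
        }
    }

    // Elements still open at the end of input are dropped in this sketch.
    (children_stack.swap_remove(0), scripts)
}
```

The explicit stack mirrors how a callback-driven tree builder tracks the current insertion point: a start event opens a new parent, and the matching end event attaches the finished element to whatever is below it on the stack.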