Removing MathJax from the HTML file

Davide P. Cervone edited this page May 28, 2013 · 2 revisions

From https://groups.google.com/d/msg/mathjax-users/H6QOeQS0yos/4Wbu2HOyZEoJ


HTML:
<body id="UniqueBody">
<p>I like \(\rm\TeX\) code</p>
</body>

JS:
alert( window.document.getElementById('UniqueBody').innerHTML );

This prints so many additional span tags. How do I get just <p>I like \(\rm\TeX\) code</p>? Is there any way I could strip the MathJax additions in JavaScript so that I can save the HTML data? Users should still see the MathJax equations.

Regards


On further study, these two links helped me solve part of my requirement:

  1. Removing typesetting - opposite of MathJax.Hub.Typeset()
    http://groups.google.com/group/mathjax-users/browse_thread/thread/fcddf2e1176e3fb1

  2. how to get the original Tex code
    http://groups.google.com/group/mathjax-users/browse_thread/thread/41519b69408fe9f6

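I am able to get the original TeX source using

// Collect every element jax whose input format is TeX
var mathelms = MathJax.Hub.getJaxByInputType("math/tex");
for (var i = 0; i < mathelms.length; ++i) {
        // SourceElement() is the <script> tag that holds the original TeX
        alert( mathelms[i].SourceElement().text );
}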

But how do I replace the whole lot of MathJax span elements with my source TeX to get the original HTML?

Is there an easy way of removing the MathJax content from a JavaScript variable rather than from the document?

Regards


Hi Dominic,

I'm not sure there is an easy way to do this. You can call mathelms[i].Remove() to remove the HTML output and then use normal DOM methods to remove the <script> tag returned by mathelms[i].SourceElement(). However, that will remove the MathJax output from the HTML document. Maybe you can create a copy of the body with visibility: hidden and do the operation on this copy. I'm not sure this is really efficient, though.


Hi Frédéric.

Thank you for the tip on the Remove() function, which removes the major part of the HTML.

Now I can remove the script element, as I know its content and its ID (mathelms[i].inputID).

I can also remove the span with the MathJax_Preview class. The DIV with id MathJax_Message can be removed too.

Still, I have a few elements left.

<div style="visibility: hidden; overflow: hidden; position: absolute;
top: 0pt; height: 1px; width: auto; padding: 0pt; border: 0pt none;
margin: 0pt; text-align: left; text-indent: 0pt; text-transform: none;
line-height: normal; letter-spacing: normal; word-spacing: normal;">
        <div id="MathJax_Hidden"></div>
</div>

This is a little complicated, but it can be removed.

There should be a better way to remove MathJax entirely from a document.

Regards


MathJax isn't designed to be removed, and it does include a number of elements in the page (as you have found). These include stylesheets in the <head> as well as a number of hidden elements in the <body>.

Can you be more specific about what you are really trying to do? Why can't you just make a copy of the innerHTML before MathJax runs and inserts its content? That would be a lot easier than trying to unravel the changes that it makes. Also, if you want to avoid those extra elements, you could use

<body>
<div id="main-content">
  ... your content here ...
</div>
</body>

for the main structure of your page, and then use

 document.getElementById("main-content").innerHTML

to get the content (after you have removed the typesetting). MathJax inserts its extra elements at the beginning or end of the <body>, so they should be outside of the main-content div.
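Putting the pieces from this thread together, here is a rough sketch of "remove the typesetting, then read innerHTML". The function name getSourceHtml is made up for illustration. It assumes MathJax v2, that the math was originally written with \(...\) and \[...\] delimiters (the original delimiters are not recorded, so adjust if you use $...$), and that the preview span sits directly before the script tag once the output is removed. Note that Remove() changes the live page, so the container would need to be typeset again afterwards, or the operation done on a copy.

```javascript
// Hypothetical helper, not part of MathJax.
// container: the element whose source HTML is wanted (e.g. the main-content div)
// jaxList:   element jax inside it, e.g. MathJax.Hub.getAllJax(container)
function getSourceHtml(container, jaxList) {
  var doc = container.ownerDocument;
  for (var i = 0; i < jaxList.length; i++) {
    var jax = jaxList[i];
    var script = jax.SourceElement();          // <script type="math/tex...">
    // Display math is marked as type="math/tex; mode=display"
    var isDisplay = /mode\s*=\s*display/.test(script.type);
    // Assumption: re-wrap the TeX in \(...\) or \[...\] delimiters
    var tex = isDisplay ? "\\[" + script.text + "\\]"
                        : "\\(" + script.text + "\\)";

    jax.Remove();                              // drop the typeset output

    // Assumption: the leftover preview span is now the previous sibling
    var preview = script.previousSibling;
    if (preview && preview.className === "MathJax_Preview") {
      preview.parentNode.removeChild(preview);
    }

    // Put the TeX back where the script tag was
    script.parentNode.replaceChild(doc.createTextNode(tex), script);
  }
  return container.innerHTML;
}
```

In a page this might be called as getSourceHtml(el, MathJax.Hub.getAllJax(el)) with el = document.getElementById("main-content"), followed by MathJax.Hub.Queue(["Typeset", MathJax.Hub, el]) to restore the display.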


Let me try the main-content div.

I am actually building a simple HTML editor for our internal use. Most of my content is derived from an XML file. This HTML has a few math equations in TeX format. Earlier I did not provide math editing capability in my editor, and math was displayed as images. After editing the content, the user presses a “SAVE” button that saves the HTML. The saved HTML will be similar to my source HTML except that it will have the additional changes made by the user.

Recently I tried to add math editing capability, and I am using MathJax for viewing the TeX equations. Once I include the MathJax JavaScript, it adds lots of content, so on saving I get that additional content. Hence my original request on how to remove it. I don’t mind it as long as it stays in the browser, but I need the original HTML for saving and further processing.

Somehow I have managed to remove the additional content using jQuery and DOM. I was just wondering if there is any “easy” way of removing the MathJax content.

Regards


Once the user edits the data, you must be calling MathJax's Typeset function in order to typeset any mathematics that the user typed. My suggestion is that you cache a copy of the HTML that is in place BEFORE you run MathJax on it. Then your SAVE button can save the cached copy, not the current one. Don't you need to do that anyway so that if they start editing again they edit the original TeX code, not the typeset math? It just seems that you are working a lot harder than you need to.
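The caching idea could look something like the sketch below. The name createSourceCache and its methods are invented for illustration, and the typeset step is passed in as a function so the sketch does not depend on how MathJax is loaded. It assumes the cache is created before MathJax's first typesetting pass touches the element, since otherwise the initial snapshot would already contain MathJax's output.

```javascript
// Hypothetical helper, not part of MathJax: keeps the untypeset HTML
// next to the live element so SAVE never has to undo MathJax's changes.
function createSourceCache(element, typeset) {
  var cachedHtml = element.innerHTML;   // snapshot taken before typesetting
  return {
    // Call whenever the user has edited the raw (TeX) content.
    update: function (newHtml) {
      cachedHtml = newHtml;             // remember the clean source first
      element.innerHTML = newHtml;      // show the new content
      typeset(element);                 // then let MathJax rewrite the live DOM
    },
    // What the SAVE button should store.
    htmlForSave: function () {
      return cachedHtml;
    }
  };
}
```

With MathJax v2 the typeset argument would typically be function (el) { MathJax.Hub.Queue(["Typeset", MathJax.Hub, el]); }. The same cached string is also what the editor would load when the user starts editing again, so they see the TeX code rather than the typeset math.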
