Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added calls to replace_wl_with_plain_text in boxes_to_text and boxes_to_xml for String and Symbol #1136

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

GarkGarcia
Copy link
Contributor

This is a follow up to https://github.com/orgs/mathics/teams/maintainers/discussions/20/comments/32. This PR makes it so that named characters are only replaced selectively (inside of strings or symbols). This makes calling replace_wl_with_plain_text from the clients unnecessary (we only need to set use_unicode field of Evaluation appropriately).

After this is merged, PRs to the clients will follow.

@@ -233,7 +233,8 @@ def __init__(
definitions=None,
output=None,
format="text",
catch_interrupt=True
catch_interrupt=True,
use_unicode=True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is worng. This is a property of formatting, not evaluation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rocky I agree, but I don't know where to place it.

@GarkGarcia
Copy link
Contributor Author

I am concerned about the implications of this PR. Maybe it's better to leave the conversion to the clients, even if we have to traverse the AST twice.

@GarkGarcia GarkGarcia marked this pull request as draft January 28, 2021 20:53
@GarkGarcia
Copy link
Contributor Author

I am concerned about the implications of this PR. Maybe it's better to leave the conversion to the clients, even if we have to traverse the AST twice.

@rocky @mmatera Opinions?

@mmatera
Copy link
Contributor

mmatera commented Jan 28, 2021

I found similar problems working with the graph engine. Later I send you a resume of how I think we can face this

return name

def boxes_to_xml(self, **options) -> str:
return replace_wl_with_plain_text(str(self.name))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this depend on the option passed to the Evaluation object?

@@ -2543,7 +2556,7 @@ def boxes_to_xml(self, show_string_characters=False, **options) -> str:
text = self.value

def render(format, string):
encoded_text = encode_mathml(string)
encoded_text = encode_mathml(replace_wl_with_plain_text(string))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same that in Symbol

@mmatera
Copy link
Contributor

mmatera commented Mar 8, 2021

Well, this took to me more time than I had expected. After working on #1140, and having a closer look at how to implement the proposal in this PR, I think that @GarkGarcia is right: the best choice seems to add a parameter in Evaluation.
Let me explain why, and what could be the alternatives.
In #1140 I implemented a way to overwrite the default behavior of MakeBoxes, in a way that a client can redefine them. This mechanism could be used also to deal with Strings: In the client, we could define a custom MakeBoxes for String expressions, that converts or not between the different string representations (wolfram-like utf8, standard utf8, ASCII named characters). However, this solution is difficult to implement for general Symbol and operators.

A second possibility would be to hack the output through the $PrePrint symbol. This is what I did in IWolfram. I used this approach in IWolfram because I needed to make it work also with a WMA kernel, and there I do not have access to the internals. So, if a user tries to overwrite $PrePrint, the interface would get screwed.

On the other hand, the Evaluation class, already controls the formatting step. So it would be natural to introduce a property that controls how the final string should be printed. There would be two different ways to control from the front-end the behavior of Evaluation. One is what @GarkGarcia did: just add a string parameter. We could also allow for this parameter a custom call-back function, that takes the standard (WL)UTF8 and process it in the most convenient way for the front-end.

@rocky
Copy link
Member

rocky commented Mar 8, 2021

Well, this took to me more time than I had expected. After working on #1140, and having a closer look at how to implement the proposal in this PR, I think that @GarkGarcia is right: the best choice seems to add a parameter in Evaluation.
Let me explain why, and what could be the alternatives.
In #1140 I implemented a way to overwrite the default behavior of MakeBoxes, in a way that a client can redefine them. This mechanism could be used also to deal with Strings: In the client, we could define a custom MakeBoxes for String expressions, that converts or not between the different string representations (wolfram-like utf8, standard utf8, ASCII named characters). However, this solution is difficult to implement for general Symbol and operators.

A second possibility would be to hack the output through the $PrePrint symbol. This is what I did in IWolfram. I used this approach in IWolfram because I needed to make it work also with a WMA kernel, and there I do not have access to the internals. So, if a user tries to overwrite $PrePrint, the interface would get screwed.

On the other hand, the Evaluation class, already controls the formatting step. So it would be natural to introduce a property that controls how the final string should be printed. There would be two different ways to control from the front-end the behavior of Evaluation. One is what @GarkGarcia did: just add a string parameter. We could also allow for this parameter a custom call-back function, that takes the standard (WL)UTF8 and process it in the most convenient way for the front-end.

replace_wl_with_plain_text() is not something we want in the long run because it can't see the structure inside and that is important. Down the line, it would needs to be integrated better and driven by the format routine.

@mmatera
Copy link
Contributor

mmatera commented Mar 8, 2021

I am working on a better proposition. Once I finish with the checks, I am going to do a PR to this branch.

@rocky rocky force-pushed the master branch 2 times, most recently from 96ec58b to e8a5440 Compare March 14, 2021 13:55
@rocky rocky force-pushed the master branch 2 times, most recently from b007e8c to ef16eb6 Compare May 19, 2021 22:40
@rocky rocky force-pushed the master branch 5 times, most recently from 8367d69 to 83bb068 Compare June 7, 2021 21:01
@rocky rocky force-pushed the master branch 2 times, most recently from 9570fdd to 499f1bf Compare June 26, 2021 14:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants