Cannot include # character in a "raw string" #234

Open
aramallo opened this issue Feb 8, 2024 · 7 comments

@aramallo

aramallo commented Feb 8, 2024

The presence of a single # character in any raw string will cause the query parser to fail.

The following example fails with the reason "The query parser has encountered unexpected input / end of input at 17..17":

?[data] <- [[ ___"#"___]]

Just removing the # char makes it work.

I am using Cozo Rust library version 0.7.5

@aramallo
Author

aramallo commented Feb 8, 2024

While doing more tests I found that adding a newline after the hash character avoids the failure, which suggests that the parser is mistaking the # and everything after it for a LINE_COMMENT.

So, this fails:

?[data] <- [[ ___"#"___]]

But this doesn't. It does return an empty string as the result, though, which supports my assumption:

?[data] <- [[ ___"#\n"___]]
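Both observations can be reproduced with a small hand-written model. This is only a sketch: it is not Cozo's parser or the code pest generates, `skip_trivia` and `parse_raw_string` are hypothetical names, and it assumes that whitespace and `#` line comments are skipped implicitly between the pieces of the raw string rule.

```rust
/// Skip whitespace and `#` line comments starting at byte offset `pos`.
fn skip_trivia(input: &str, mut pos: usize) -> usize {
    loop {
        let rest = &input[pos..];
        if let Some(c) = rest.chars().next() {
            if c.is_whitespace() {
                pos += c.len_utf8();
                continue;
            }
            if c == '#' {
                // A line comment runs up to, but not including, the newline.
                pos += rest.find('\n').unwrap_or(rest.len());
                continue;
            }
        }
        return pos;
    }
}

/// Hypothetical model of a raw string rule of the form `___"..."___` in
/// which trivia is skipped implicitly before every piece of the rule.
fn parse_raw_string(input: &str) -> Option<String> {
    let underscores = input.chars().take_while(|&c| c == '_').count();
    let mut pos = underscores; // '_' is a single byte
    if !input[pos..].starts_with('"') {
        return None;
    }
    pos += 1;
    let terminator = format!("\"{}", "_".repeat(underscores));
    let mut content = String::new();
    loop {
        // The assumed implicit skipping: this is where `#` gets eaten.
        pos = skip_trivia(input, pos);
        let rest = &input[pos..];
        if rest.starts_with(terminator.as_str()) {
            return Some(content);
        }
        // Running out of input before the closing delimiter is a failure.
        let c = rest.chars().next()?;
        content.push(c);
        pos += c.len_utf8();
    }
}

fn main() {
    // `#` starts a "comment" that swallows the closing quote: no match.
    println!("{:?}", parse_raw_string("___\"#\"___")); // prints None
    // The newline ends the "comment", so the rule matches with empty content.
    println!("{:?}", parse_raw_string("___\"#\n\"___")); // prints Some("")
    println!("{:?}", parse_raw_string("___\"abc\"___")); // prints Some("abc")
}
```

In this model the first query fails at the end of the input, and the second succeeds with nothing left between the quotes, which matches the two results reported above.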

@aramallo
Author

aramallo commented Feb 8, 2024

Using https://pest.rs I tried to validate my assumption, but according to the latest pest file even the version with the added newline should fail.

The case of an empty raw string

image

Now when adding #

image

Adding the newline does not change the tool result

image

Hope this helps. Unfortunately I am not very good with Rust yet, and I am not familiar enough with pest to contribute a fix.

@aramallo
Author

aramallo commented Feb 9, 2024

So adding SOI to the LINE_COMMENT rule solves the problem (but breaks line comments), which means we are on the right track.

LINE_COMMENT = _{ SOI ~ "#" ~ (!"\n" ~ ANY)* }

image

aramallo added a commit to aramallo/cozo that referenced this issue Feb 9, 2024
LINE_COMMENT rule was changes to match the SOI first, otherwise any raw string containing `#` will be considered a LINE_COMMENT
@aramallo
Author

aramallo commented Feb 9, 2024

I've extracted the related rules into a fiddle that shows how this fails.

@aramallo
Author

So I managed to fix the issue at the PEG level. The change consists of making the raw_string_inner pest rule atomic, so that LINE_COMMENT no longer takes precedence over raw_string when # is present.

A fiddle here showing that it works.
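As a rough illustration of what an atomic rule buys here, the sketch below models a raw string matcher that does no whitespace or comment skipping inside the rule. It is a hand-written stand-in under that assumption, not the pest grammar from the fork, and `parse_raw_string_atomic` is a hypothetical name.

```rust
/// Hypothetical model of an atomic raw string rule of the form `___"..."___`:
/// nothing is skipped inside it, so every character up to the closing
/// delimiter is kept verbatim.
fn parse_raw_string_atomic(input: &str) -> Option<&str> {
    let underscores = input.chars().take_while(|&c| c == '_').count();
    // '_' is a single byte, so the count is also a valid byte offset.
    let body = input[underscores..].strip_prefix('"')?;
    // The closing delimiter is a quote followed by the same underscores.
    let terminator = format!("\"{}", "_".repeat(underscores));
    let end = body.find(terminator.as_str())?;
    Some(&body[..end])
}

fn main() {
    // `#` is ordinary content now, with or without a trailing newline.
    println!("{:?}", parse_raw_string_atomic("___\"#\"___")); // prints Some("#")
    println!("{:?}", parse_raw_string_atomic("___\"#\n\"___")); // prints Some("#\n")
    // A missing closing delimiter is still rejected.
    println!("{:?}", parse_raw_string_atomic("___\"#")); // prints None
}
```

Comments outside the raw string would still be handled by the surrounding rules, which this sketch leaves out.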

I made the change in my fork. However, when I pull it into another project (my cozo binding for Erlang), I still get the same error when running ?[data] <- [[___"#"___]]

I checked that Rust is compiling my fork at the latest commit, as shown below:

Updating git repository `https://github.com/aramallo/cozo.git`
 Updating git submodule `https://github.com/facebook/rocksdb.git`
 Compiling cozorocks v0.1.7 (https://github.com/aramallo/cozo.git?branch=main#5d252699) <<<<<<<<
 Compiling cozo v0.7.6 (https://github.com/aramallo/cozo.git?branch=main#5d252699) <<<<<<<<

Could it be that the change to the pest file has not produced any change in the parser? I am new to Rust and pest, so I am not sure whether I need to run something to generate the Rust parser and then commit that file, or whether pest does this automatically when compiling.

@zh217 Any ideas here?

@andrewbaxter

Sorry, I'm not entirely sure, but that's included here:

#[derive(pest_derive::Parser)]
#[grammar = "cozoscript.pest"]
pub(crate) struct CozoScriptParser;

It's a derive macro, which runs automatically during normal compilation. It looks like pest_derive also accounts for external files changing (per pest-parser/pest#789). So basically there should be no extra work required aside from changing that file.

And that log looks pretty clear, but you might be able to use cargo tree -i to confirm which version of cozo is being pulled into the dependent project.

(And thanks to this issue for teaching me that cozo supports comments! It didn't appear to be documented when I looked.)

@creatorrr
Contributor

The only solution in the meantime is to pass the values separately and not interpolate anything. But still, this needs fixing.
