Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exponential grow for case insensitive input. #37

Open
ArthurSonzogni opened this issue Nov 6, 2020 · 3 comments
Open

Exponential grow for case insensitive input. #37

ArthurSonzogni opened this issue Nov 6, 2020 · 3 comments

Comments

@ArthurSonzogni
Copy link
Contributor

Hi,
Thanks for this tool. This is awesome!

I noticed some generator produce all the possibles combinaisons for case incentive input. It means this generate a list of item growing exponentially relative to the input length.

For instance if you use: (input: abnf)

test = "test"

Then you get your get. (output: "rrta")

add('test', Diagram(
  Choice(0,
    Terminal("test"),
    Terminal("Test"),
    Terminal("tEst"),
    Terminal("TEst"),
    Terminal("teSt"),
    Terminal("TeSt"),
    Terminal("tESt"),
    Terminal("TESt"),
    Terminal("tesT"),
    Terminal("TesT"),
    Terminal("tEsT"),
    Terminal("TEsT"),
    Terminal("teST"),
    Terminal("TeST"),
    Terminal("tEST"),
    Terminal("TEST"))));

I was curious to know if this is something you want to keep or fix later.
If you want to fix it later, would you prefer I try to submit PR?

I tried all the generator online using this tool https://arthursonzogni.com/Diagon/#code_area

exponential output:

  • rrparcon
  • rrl
  • rrta

non exponential output:

  • rrutf8
  • rrtext
  • svg
  • html5
  • xhtml5
  • ebnfhtml5
  • ebnfxhtml5
  • blab
  • abnf
  • dot
  • rdot
  • rrdump
  • rrtdump

Can't say (reached unimplemented)

  • bnf
  • wsn
  • iso-ebnf
  • rbnf
  • sid
@ArthurSonzogni ArthurSonzogni changed the title Expone Exponential grows for case insensitive input. Nov 6, 2020
@ArthurSonzogni ArthurSonzogni changed the title Exponential grows for case insensitive input. Exponential grow for case insensitive input. Nov 6, 2020
@katef
Copy link
Owner

katef commented Nov 26, 2020

This is pretty terrible, I agree. I was trying to think about better ways to handle this, when I wrote it. Do you have any suggestions please?

The problem is that those dialects don't support case insensitive terminals, so we have to expand out every spelling.

@ArthurSonzogni
Copy link
Contributor Author

I was thinking: instead of outputting every combination, to rewrite it using a sequence of alternatives:

Input:

http

output (current)

| http
| Http
| hTtp
| HTtp
| htTp
| HtTp
| hTTp
| HTTp
| httP
| HttP
| hTtP
| HTtP
| htTP
| HtTP
| hTTP
| HTTP

output (proposed;)

 (h | H) (t | T) (t | T) (p | P)

Obviously, this is a very naive idea from me. I guess there are reasons this can't be done. I just found out this is implemented by permute_case. I will take a look.

@katef
Copy link
Owner

katef commented Dec 8, 2020

Yes, unfortunately that has a different meaning. For a grammar where there could be whitespace between tokens (depending on how lexing is defined), here you're introducing the possibility for space between each letter, and to produce multiple tokens for output. The original grammar there has only one token for the entire string.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants