Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proof of concept: simulating sticky mode in pre-ES6 environemnts... #152

Closed
pygy opened this issue Feb 5, 2017 · 9 comments
Closed

Proof of concept: simulating sticky mode in pre-ES6 environemnts... #152

pygy opened this issue Feb 5, 2017 · 9 comments

Comments

@pygy
Copy link
Contributor

pygy commented Feb 5, 2017

It is possible, with some hand holding, to simulate sticky matching with normal RegExps.

function stickify(original) {
  var matcher = new RegExp(original.source+'|()', 'g'+original.flags) //imagine we are deduping `g`s here
  return {
    exec: function(str) {
      matcher.lastIndex = this.lastIndex;
      var res = matcher.exec(str)
      if (res && res.pop() == null) return res // setting `this.lastIndex`  elided for the demo
      else return null
    },
    lastIndex: 0
  }
}

By appending |() to the source, you ensure that the match will succeed no matter what at the given index. If original doesn't match, the last capture will be '', null otherwise.

The trouble now is to convince UAs to treat the resulting object as a RegExp in String.prototype.* context... I suppose that in ES6 you could extend native RegExps, but the polyfill is not needed there since ES6 introduces the y flag...

Edit: now the code actualy works :)

@slevithan
Copy link
Owner

XRegExp already supports sticky mode for ES3+ using XRegExp.exec, XRegExp.test, and XRegExp.matchRecursive. I've long wanted to extend this to more methods, but have avoided this pending a broader API change (see #109).

@pygy
Copy link
Contributor Author

pygy commented Feb 6, 2017

Ooh, that's nice, I had missed the sticky mode... IIUC with the current implementation, in sticky mode, the regexp still searches till the end of the subject and XRegExp.exec fails if the match succeeds at a position different from lastIndex...

The snippet above doesn't search beyond the current index, but you have to create an extra RegExp object... It may be cached in the REGEX_DATA field though so it would be a one time cost.

Interrested in a PR?

@slevithan
Copy link
Owner

slevithan commented Feb 6, 2017

Sure--I'm interested. :) Your implementation is clever and I'm not seeing obvious problems. Also XRegExp.exec already copies/caches regexes when using sticky mode, so that shouldn't add overhead (but note that the cached copies are shared by other XRegExp methods, which could be a problem given that you're adding a capturing group). Some things to keep in mind: 1. make sure to regenerate <xregexp-all.js> and check that it passes <tests/index.html> (might need new tests as well), 2. you can compare performance to the current implementation via <tests/perf/index.html> (make sure to test in browsers with and without native /y), 3. make sure there are no compromises/loose ends. E.g., XRegExp.exec should support sticky mode with both native and XRegExp regexes, result arrays should include named backreferences and not include an extra backreference, and it should work in IE < 9 where backreferences to non-participating capturing groups give the empty string rather than undefined (you should get this for free by calling fixed.exec).

If there are any loose ends, this approach probably wouldn't be worth it given that XRegExp already defers to native /y when available, which should give the best performance in ES6 environments (plus Firefox since forever ago since Firefox introduced /y).

@pygy
Copy link
Contributor Author

pygy commented Feb 6, 2017

I think I've got something that match your criteria... Benchmarks ongoing in a pre ES6 Chromium derivative.

Do you want me to include the generated xregexp-all.js in the PR?

BTW, these lines should also exclude sticky matches whose lastIndex is updated on success.

@slevithan
Copy link
Owner

Do you want me to include the generated xregexp-all.js in the PR?

Yes please, but right now it's out of date. :-( I'll regenerate it ASAP (just need to run npm run build) so that your PR doesn't need to include unrelated stuff. If you're submitting without an already up-to-date xregexp-all.js, feel free to leave it out.

@slevithan
Copy link
Owner

PR: #153

@pygy
Copy link
Contributor Author

pygy commented Feb 7, 2017

@slevithan Ok, don't worry about the out of date bundle, I'll add an intermediate commit with a build before the sticky-related changes and another one with the full build. I usually avoid adding build artifacts to PRs because it causes merge conflicts when other PRs are merged or other code is pushed in the mean time, but given the current activity on XRegExp we're safe (I hadn't looked at the GH pulse yet).

@slevithan
Copy link
Owner

BTW, these lines should also exclude sticky matches whose lastIndex is updated on success.

Would you mind submitting a PR for this?

@slevithan
Copy link
Owner

The last remaining piece for this issue is the updates for lastIndex handling with sticky mode. Since there's a pull request covering that (#161), will close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants