Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider switching to iterative parser #263

Closed
timrwood opened this issue Apr 6, 2012 · 7 comments
Closed

Consider switching to iterative parser #263

timrwood opened this issue Apr 6, 2012 · 7 comments

Comments

@timrwood
Copy link
Member

timrwood commented Apr 6, 2012

Currently, moment parses by converting strings to arrays based on regexes.

"2001-10-5 4:50 AM" = [2001, 10, 5, 4, 50, "AM"]
"YYYY-MM-DD HH:mm a" = ["YYYY", "MM", "DD", "HH", "mm", "a"]

Then, the parser loops through each item in the array and converts it to the correct argument for parsing an array to a date.

Instead, perhaps the parser should only chuck up the tokens, then loop through them, tearing chunks of the input string apart.

"2001-10-5 4:50 AM"
"YYYY-MM-DD HH:mm a" = ["YYYY", "MM", "DD", "HH", "mm", "a"]
getNextPart("YYYY", "2001-10-5 4:50 AM");
getNextPart("MM", "-10-5 4:50 AM");
getNextPart("DD", "-5 4:50 AM");
getNextPart("HH", " 4:50 AM");
getNextPart("mm", ":50 AM");
getNextPart("a", " AM");

This could solve both the ISO8601 "T" problem/CJK number/month name problem, and the "YYYYMMDD" problem as well.

"20011005"
"YYYYMMDD" = ["YYYY", "MM", "DD"]
getNextPart("YYYY", "20011005");
getNextPart("MM", "1005");
getNextPart("DD", "05");
@rockymeza
Copy link
Contributor

While we are discussing a new parser, I feel that I should point out Jison. It's a parser generator for JavaScript that powers CoffeeScript and some other pretty big JavaScript libraries. With Jison you can generate the parser and the generated code has no dependencies.

It might not be the right thing for our use case, and it might not support switching out the tokens because of internationalization, but I feel that it would be wrong not to bring it up at least.

@rockymeza
Copy link
Contributor

Now that I've at least brought Jison up, I can actually comment on this parser idea.

I think that it provides a couple of advantages that we cannot get from the regex parsing method:

A couple questions:

  1. How will it know how to chunk up the tokens?
  2. This does not seem like it could possibly be as fast as the regex parsing. Are we willing to sacrifice some performance for a more robust parser?

@timrwood
Copy link
Member Author

timrwood commented Apr 7, 2012

I'll look into Jison more in depth, thanks for mentioning it.

re:questions

  1. We can use the regex chunker we currently use for this. This regex used with string.match will output an array of all the matches, eg ["YYYY", "MM", "DD", "HH", "mm", "a"].
  2. We should build it in a separate branch and then run jsperf tests on it to see how much slower it will be. I'm guessing it will be a little slower, but I think the added features will outweigh the speed decrease.

@timrwood
Copy link
Member Author

timrwood commented Apr 8, 2012

I've started working on this in the feature/parser branch.

https://github.com/timrwood/moment/tree/feature/parser

@ghost ghost assigned timrwood Apr 9, 2012
@timrwood
Copy link
Member Author

timrwood commented Apr 9, 2012

Well, I've got all the old unit tests up and passing for this, plus some new tests added here.

Now I'll make a jsperf test and see how it compares...

@timrwood
Copy link
Member Author

timrwood commented Apr 9, 2012

Results are in: http://jsperf.com/moment-iterative-parser

~25% slower on Chrome, ~50% slower on Firefox.

chrome (ops/second)
==================== 1.5.0
===============      new

firefox (ops/second)
==================== 1.5.0
==========           new

It's not that bad, considering other parsers like DateJS are 95% slower than moment. http://jsperf.com/underscore-date-vs-datejs/2

chrome  (ops/second)
==================== moment
=                    DateJS

@timrwood
Copy link
Member Author

This has been merged into the develop branch and will go out in the 1.6.0 release

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants