parsing can be extremely slow on macOS #317

Open
zackschuster opened this issue May 7, 2022 · 2 comments

@zackschuster

the message parsing section of the emailjs test suite runs very slowly on macOS, even on an M1 Pro; a single test can take 2+ seconds to complete depending on the size of the payload (up to 5.1 MB for a text file). this frequently causes test failures in CI, as parsing can take 10+ seconds to complete in GitHub's environment.

according to profiling of the message parsing tests, the likely cause is this call (note the percentages):

 [C++]:
   ticks  total  nonlib   name
   1654   45.9%   47.2%  T _posix_spawnattr_setflags

obviously this is a very esoteric reaction, and i have no idea how or why such a low-level function is hammering / getting hammered by us; the code has been very stable for years & this slowness is not evident on ubuntu or windows. but, likewise, i've observed this slowness for years as well. i'm hopeful maybe there's just some weird one-line performance cliff? but i would have no idea how to start looking for such a thing.

@andris9
Member

andris9 commented May 7, 2022

Can you isolate this by creating a test case for mailparser’s own test suite where parsing an email would take so long? From your example it’s hard to understand what’s going on exactly.
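
One way to isolate this is a small timing harness that feeds generated plain-text messages of increasing size to a parser and records how long each parse takes. The sketch below is only an illustration, not code from either project: `buildMessage`, `timeParse` and `measure` are hypothetical helper names, the addresses and headers are placeholders, and the parser is passed in as an argument so the harness runs without mailparser installed.

```javascript
'use strict';
const { performance } = require('perf_hooks');

// Build a plain-text message whose body is at least `bytes` bytes long.
// Headers and addresses are placeholders chosen for this sketch.
function buildMessage(bytes) {
  const headers = [
    'From: sender@example.com',
    'To: receiver@example.com',
    'Subject: large payload',
    'Content-Type: text/plain; charset=utf-8',
  ].join('\r\n');
  const line = 'x'.repeat(76) + '\r\n'; // 78 bytes per body line
  const body = line.repeat(Math.ceil(bytes / line.length));
  return `${headers}\r\n\r\n${body}`;
}

// Time a single call of an async parse function, in milliseconds.
async function timeParse(parse, source) {
  const start = performance.now();
  await parse(source);
  return performance.now() - start;
}

// Run the parser once per payload size and collect the timings.
async function measure(parse, sizes) {
  const results = [];
  for (const bytes of sizes) {
    const ms = await timeParse(parse, buildMessage(bytes));
    results.push({ bytes, ms });
  }
  return results;
}

module.exports = { buildMessage, timeParse, measure };
```

With mailparser installed, the parser argument could be `(source) => simpleParser(source)`, and calling `measure` with sizes such as `[100000, 1000000, 5000000]` on macOS and on Linux would show whether the slowdown scales with payload size on one platform only.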

@zackschuster
Author

zackschuster commented Aug 3, 2022

profiling test/mail-parser-test.js produces this:

 [C++]:
   ticks  total  nonlib   name
  27835   92.1%   92.5%  T _posix_spawnattr_setflags
   1459    4.8%    4.8%  t __ZN2v88internalL53Builtin_Impl_RelativeTimeFormatPrototypeFormatToPartsENS0_16BuiltinArgumentsEPNS0_7IsolateE
     92    0.3%    0.3%  T __ZN4node10contextify17ContextifyContext15CompileFunctionERKN2v820FunctionCallbackInfoINS2_5ValueEEE

about half of those ticks can be traced to the "Out of memory error" test:

 [C++]:
   ticks  total  nonlib   name
  13519   90.8%   91.3%  T _posix_spawnattr_setflags
    669    4.5%    4.5%  t __ZN2v88internalL53Builtin_Impl_RelativeTimeFormatPrototypeFormatToPartsENS0_16BuiltinArgumentsEPNS0_7IsolateE
    136    0.9%    0.9%  t __ZN4node2fsL4ReadERKN2v820FunctionCallbackInfoINS1_5ValueEEE

now, we're using simpleParser with html parsing disabled, so it's not quite an even comparison, but the behavior is the same so i'm assuming it's a valid clue. also, i have to ask, is a 1mb file size really the soft cap?
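
the "about half" figure can be checked directly from the two tables. this is a minimal sketch, assuming the tables are `node --prof-process` summary rows in the `ticks total nonlib name` layout shown above; `parseProfileRows` and `tickShare` are hypothetical helper names made up for this example.

```javascript
'use strict';

// Parse summary rows of the form "ticks  total%  nonlib%  name".
// Lines that do not match (section titles, column headers) are skipped.
function parseProfileRows(text) {
  const rows = [];
  for (const line of text.split('\n')) {
    const m = line.match(/^\s*(\d+)\s+([\d.]+)%\s+([\d.]+)%\s+(.+)$/);
    if (m) {
      rows.push({
        ticks: Number(m[1]),
        total: Number(m[2]),
        nonlib: Number(m[3]),
        name: m[4].trim(),
      });
    }
  }
  return rows;
}

// Ticks for a symbol in one profile as a fraction of its ticks in another.
function tickShare(partRows, wholeRows, symbol) {
  const find = (rows) => rows.find((r) => r.name.includes(symbol));
  const part = find(partRows);
  const whole = find(wholeRows);
  if (!part || !whole) return NaN;
  return part.ticks / whole.ticks;
}

// Rows copied from the two profiles in this comment.
const fullSuite = parseProfileRows(`
 [C++]:
   ticks  total  nonlib   name
  27835   92.1%   92.5%  T _posix_spawnattr_setflags
`);
const singleTest = parseProfileRows(`
 [C++]:
   ticks  total  nonlib   name
  13519   90.8%   91.3%  T _posix_spawnattr_setflags
`);

const share = tickShare(singleTest, fullSuite, '_posix_spawnattr_setflags');
console.log(`${(share * 100).toFixed(1)}% of the suite's ticks for this symbol`);

module.exports = { parseProfileRows, tickShare };
```

13519 of 27835 ticks is a little under half, which matches the estimate above.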


parenthetically, i'm hopeful that nodejs/node#32226 might be the source of this headache. it makes as much sense to me as any other explanation, at least...
