Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault on Node.js 22 #58369

Closed
remcohaszing opened this issue Apr 30, 2024 · 27 comments
Closed

Segmentation fault on Node.js 22 #58369

remcohaszing opened this issue Apr 30, 2024 · 27 comments
Labels
External Relates to another program, environment, or user action which we cannot control.

Comments

@remcohaszing
Copy link

remcohaszing commented Apr 30, 2024

πŸ”Ž Search Terms

"segmentation fault"

πŸ•— Version & Regression Information

  • This is a crash

⏯ Playground Link

https://github.com/remarkjs/remark/tree/main/packages/remark-parse

πŸ’» Code

The code is not actually relevant. It still happens if you remove the content of packages/remark-parse/lib/index.js. The existence of the file is important.

πŸ™ Actual behavior

Using Node.js 22, running tsc --build from the remark-parse workspace, yields:

$ tsc -b     
[1]    57135 segmentation fault (core dumped)  tsc -b

πŸ™‚ Expected behavior

It generates type definitions

Additional information about the issue

It works in Node.js 20, but not Node.js 22.

If the project is built with Node.js 20, then an incremental build with Node.js 22 won’t crash TypeScript.

Neither tsc nor tsc --showConfig not tsc --build --clean causes a crash, only tsc --build.

See remarkjs/remark#1291 (comment) for some more info where this was first discovered.

@wooorm
Copy link

wooorm commented Apr 30, 2024

Also repro’d at https://github.com/syntax-tree/mdast-util-to-string/actions/runs/8894134799, https://github.com/syntax-tree/unist-util-visit-parents/actions/runs/8893972759, and https://github.com/syntax-tree/unist-util-visit/actions/runs/8894107813/job/24421655968, which are simpler.

The segment fault goes away by removing declaration: true in tsconfig.

Particularly mdast-util-to-string is a simple repo that I recommend looking at.

What these repos have in common is that they use .d.ts files. (edit: mdast-util-to-string doesn’t)

@wooorm
Copy link

wooorm commented Apr 30, 2024

it doesn’t happen in https://github.com/syntax-tree/nlcst-to-string, which is similar, but revolved around a different ecosystem.

I am now guessing that there something in the types in node_modules of these packages. Likely around micromark/mdast (aka markdown).

@remcohaszing
Copy link
Author

remcohaszing commented Apr 30, 2024

I narrowed it down by trimming mdast-util-to-string to a repo with the following files:

// package.json
{
  "dependencies": {
    "@types/mdast": "^4.0.0",
    "typescript": "^5.0.0"
  }
}
// tsconfig.json
{
  "compilerOptions": {
    // This option doesn’t seem to matter
    // "noEmit": true
  }
}
// index.ts
import 'mdast'

This repro also causes the crash when running tsc, not just tsc --build.

@remcohaszing
Copy link
Author

I narrowed it down even further by removing parts from @types/mdast. I trimmed it down to this:

/**
 * How phrasing content is aligned
 * ({@link https://drafts.csswg.org/css-text/ | [CSSTEXT]}).
 *
 * * `'left'`: See the
 *   {@link https://drafts.csswg.org/css-text/#valdef-text-align-left | left}
 *   value of the `text-align` CSS property
 * * `'right'`: See the
 *   {@link https://drafts.csswg.org/css-text/#valdef-text-align-right | right}
 *   value of the `text-align` CSS property
 * * `'center'`: See the
 *   {@link https://drafts.csswg.org/css-text/#valdef-text-align-center | center}
 *   value of the `text-align` CSS property
 * * `null`: phrasing content is aligned as defined by the host environment
 *
 * Used in GFM tables.
 */
export type AlignType = "center" | "left" | "right" | null;


/**
 * Internal relation from one node to another.
 *
 * Whether the value of `identifier` is expected to be a unique identifier or
 * not depends on the type of node including the Association.
 * An example of this is that they should be unique on {@link Definition},
 * whereas multiple {@link LinkReference}s can be non-unique to be associated
 * with one definition.
 */
export interface Association {
    identifier: string;

    /**
     * Relation of association, in parsed form.
     *
     * `label` is a `string` value: it works just like `title` on {@link Link}
     * or a `lang` on {@link Code}: character escapes and character references
     * are parsed.
     *
     * It can match another node.
     */
    label?: string | null | undefined;
}

export interface Reference {
}

/**
 * Registry of all mdast nodes that can occur as children of {@link Root}.
 *
 * > πŸ‘‰ **Note**: {@link Root} does not need to be an entire document.
 * > it can also be a fragment.
 *
 * This interface can be augmented to register custom node types:
 *
 *
 * For a union of all {@link Root} children, see {@link RootContent}.
 */
export interface RootContentMap {
}

This still causes a crash, but at this point, even removing random chunks from comments resolves the issue.

@wooorm
Copy link

wooorm commented Apr 30, 2024

random ideas:

  • could it be the emoji?
  • or the [ and ] in the link text ([CSSTEXT])?

@wooorm
Copy link

wooorm commented Apr 30, 2024

Yep, only removing the emoji on L312 from node_modules/@types/mdast/index.d.ts solves this for me in mdast-util-to-string.

The emoji is U+1f449, as in consisting of the surrogates 0xd83d and 0xdc49

@remcohaszing
Copy link
Author

I tried removing the emoji. That did make a difference, but so did keeping the emoji and removing another random piece of comment instead.

@wooorm
Copy link

wooorm commented Apr 30, 2024

Which other part?

@remcohaszing
Copy link
Author

remcohaszing commented Apr 30, 2024

Any piece really. I.e. the line below the emoji and the non-empty line below that.

Now tsc is crashing randomly for me. I seem to have reached some treshold around this amount of content that’s allowed in the mdast types. The emoji doesn’t cause the problem, but it counts as a big factor towards reaching this treshold.

@wooorm
Copy link

wooorm commented Apr 30, 2024

Removing the line below the emoji does not change anything for me in mdast-util-to-string. Nor does the non empty line. Nor both together. For me the emoji is consistent

@fatcerberus
Copy link

fatcerberus commented Apr 30, 2024

The TypeScript compiler is JavaScript code, which should never be able to cause a segfault (as that would likely be a security issue in e.g. browsers). This sound more like a Node and/or V8 bug to me.

@wooorm
Copy link

wooorm commented Apr 30, 2024

The crash does not happen in TS 5.1.0-dev.20230307. But does occur with TS 5.1.0-dev.20230308.

Here’s the diff, which is too big to put in markdown. And I’m not allowed to use .diff as an extension here.

diff.txt

Interesting from what I’m reading: looks like an introduction of scanJSDocCommentTextToken, and new code relating to positional info?
The rest of the diff looks like the renumbering of the enums. And renaming functions from getStartPos -> getTokenFullStart; setTextPos -> resetTokenState.
some more interesting code at the end, around removeTrailingWhitespace and pushComment in I believe the tokenizer.

Idea: could Debug.assert cause a crash on Node 22?

@fatcerberus
Copy link

My point stands. tsc is pure JS code; if any version of the compiler can cause a segmentation fault that crashes the Node.js process, that's a potential security issue and should be brought up to the Node.js team.

@jakebailey
Copy link
Member

Using Node 22 and the linked repo, I can't reproduce this segfault.

If you have this narrowed down to two specific versions, https://www.npmjs.com/package/every-ts would let you go further without knowing how to build TS itself (not that checking out and running TS is that hard). That diff is just too large to guess at, but an actual bisect would reveal things.

@remcohaszing
Copy link
Author

remcohaszing commented Apr 30, 2024

Since @wooorm and I have slightly different results, I suspect the hardware plays a role and you have a very beefy development machine.

I use Linux by the way. That may be relevant.

Based on a new project with only the files as described in #58369 (comment):

remco@vali:~/Downloads/diff $ npm i typescript @types/mdast

added 3 packages in 747ms
remco@vali:~/Downloads/diff $ tsc
[1]    95375 segmentation fault (core dumped)  tsc
Exit code: 139                                                                                                                                                                                                                                                                    
remco@vali:~/Downloads/diff $ npm uninstall typescript      

removed 1 package in 223ms
remco@vali:~/Downloads/diff $ npm i every-ts               

added 28 packages in 2s
remco@vali:~/Downloads/diff $ every-ts bisect start
Cloning TypeScript; this may take a bit...
Cloning into '/home/remco/Downloads/diff/node_modules/every-ts/.data/TypeScript'...
remote: Enumerating objects: 284974, done.
remote: Counting objects: 100% (313/313), done.
remote: Compressing objects: 100% (213/213), done.
remote: Total 284974 (delta 165), reused 202 (delta 80), pack-reused 284661
Receiving objects: 100% (284974/284974), 1.00 GiB | 11.74 MiB/s, done.
Resolving deltas: 100% (170765/170765), done.
remote: Enumerating objects: 66225, done.
remote: Counting objects: 100% (40065/40065), done.
remote: Compressing objects: 100% (30058/30058), done.
remote: Total 66225 (delta 10944), reused 10037 (delta 10007), pack-reused 26160
Receiving objects: 100% (66225/66225), 29.73 MiB | 7.86 MiB/s, done.
Resolving deltas: 100% (16170/16170), done.
Updating files: 100% (71112/71112), done.
status: waiting for both good and bad commits
Building TypeScript...
Downloading fnm...
fnm installed
TypeScript built successfully!
remco@vali:~/Downloads/diff $ every-ts bisect new 5.1.0-dev.20230308
Resolved 5.1.0-dev.20230308 to 746a6feb2e7ba6987b6c72db538dd498b35cd461
status: waiting for good commit(s), bad commit known
remco@vali:~/Downloads/diff $ every-ts bisect old 5.1.0-dev.20230307
Resolved 5.1.0-dev.20230307 to df1ddb79642212c7ef9f2f6ec03ac95e610d4280
Bisecting: 10 revisions left to test after this (roughly 4 steps)
remote: Enumerating objects: 42710, done.
remote: Counting objects: 100% (5147/5147), done.
remote: Compressing objects: 100% (2943/2943), done.
remote: Total 42710 (delta 2637), reused 2204 (delta 2204), pack-reused 37563
Receiving objects: 100% (42710/42710), 24.43 MiB | 6.27 MiB/s, done.
Resolving deltas: 100% (10862/10862), done.
[a6be79d535475c0062f3028afa51cabdb83d02ff] Remove old test262 and dt runner infra (#53125)
Building TypeScript...
TypeScript built successfully!
remco@vali:~/Downloads/diff $ every-ts bisect run tsc               
Building TypeScript...
TypeScript built successfully!
git bisect old
Bisecting: 5 revisions left to test after this (roughly 3 steps)
remote: Enumerating objects: 865, done.
remote: Counting objects: 100% (812/812), done.
remote: Compressing objects: 100% (709/709), done.
remote: Total 865 (delta 105), reused 103 (delta 103), pack-reused 53
Receiving objects: 100% (865/865), 2.30 MiB | 4.25 MiB/s, done.
Resolving deltas: 100% (110/110), done.
[88adf8014b617fb8492587941ffcf4f75986e9cc] Make special intersections order-independent (#52782)
Building TypeScript...
TypeScript built successfully!
git bisect old
Bisecting: 2 revisions left to test after this (roughly 2 steps)
remote: Enumerating objects: 66, done.
remote: Counting objects: 100% (56/56), done.
remote: Compressing objects: 100% (29/29), done.
remote: Total 66 (delta 27), reused 27 (delta 27), pack-reused 10
Receiving objects: 100% (66/66), 916.98 KiB | 3.60 MiB/s, done.
Resolving deltas: 100% (31/31), done.
[137c461bd096d0c0a948fda299e56316798df8eb] Scan bigger/fewer jsdoc tokens (#53081)
Building TypeScript...
TypeScript built successfully!
Internal Error: Command was killed with SIGSEGV (Segmentation fault): tsc
    at makeError (file:///home/remco/Downloads/diff/node_modules/execa/lib/error.js:60:11)
    at handlePromise (file:///home/remco/Downloads/diff/node_modules/execa/index.js:124:26)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async BisectRun.executeOrThrow (file:///home/remco/Downloads/diff/node_modules/every-ts/dist/git.js:98:28)
    at async BisectRun.execute (file:///home/remco/Downloads/diff/node_modules/every-ts/dist/common.js:50:20)
    at async BisectRun.validateAndExecute (/home/remco/Downloads/diff/node_modules/clipanion/lib/advanced/Command.js:73:26)
    at async Cli.run (/home/remco/Downloads/diff/node_modules/clipanion/lib/advanced/Cli.js:229:24)
    at async Cli.runExit (/home/remco/Downloads/diff/node_modules/clipanion/lib/advanced/Cli.js:238:28)
Exit code: 1                  

So if I understand correctly the culprit is #53081.

@jakebailey
Copy link
Member

Thanks for narrowing that down. Realistically, nothing in that PR should have been able to segfault node (or anything in JS at all, period), so if this is new in Node 22, it's definitely a Node bug if anything.

I wasn't sure how to test with your repro, though; if you can make a branch or new repo with that content, that'd be helpful for experimentation and reporting upstream.

@remcohaszing
Copy link
Author

Fair. I pushed the reproduction repo to https://github.com/remcohaszing/typescript-bug-58369.

@RyanCavanaugh
Copy link
Member

RyanCavanaugh commented Apr 30, 2024

Not reproing for me on Windows x64 Node 22.0.0 tsc 5.5.0-dev.20240430, potato laptop

@remcohaszing
Copy link
Author

After setting up everything again to make a nice reproduction, I see the behaviour @wooorm described: the πŸ‘‰ emoji causes the crash, it’s not random anymore.

I have also setup a test matrix in GitHub actions which shows that this errors occurs in Linux. It can’t be reproduced on Windows, because npm shipped with Node.js 22 on Windows appears to be broken.

@jakebailey
Copy link
Member

This is actually "fixed" on TS main by #58339; that PR prevents out of bounds accesses in more places, among other tweaks, so it sounds like Node 22 has a string handling bug.

@remcohaszing
Copy link
Author

I just confirmed using typescript@next fixed it for me.

@RyanCavanaugh
Copy link
Member

Do we have a standalone (not all of tsc) repro we can hand off to Node/V8?

@jakebailey
Copy link
Member

I have nothing yet. At 3358157, the segfault happens in scanJSDocCommentTextToken, but if you add console log inside of the for loop, it stops segfaulting.

@fatcerberus
Copy link

Sounds like some kind of JIT bug in V8; those are no fun. The console.log probably causes a deopt that sidesteps it.

@jakebailey
Copy link
Member

Even extracting the code out, I can't reproduce this with the same inputs to that function. At this point, I don't think there's any mimization that can be done and the only next steps are to bisect Node to see when it broke between v20 and v22 (and then when it's inevitably just a "pull in v8" change, bisect that too).

@jakebailey
Copy link
Member

In any case, I don't think this is a problem that TS can handle; if the bug is "fixed" by adding an innocuous console log, that's a runtime problem.

@RyanCavanaugh RyanCavanaugh added the External Relates to another program, environment, or user action which we cannot control. label Apr 30, 2024
@typescript-bot
Copy link
Collaborator

This issue has been marked as "External" and has seen no recent activity. It has been automatically closed for house-keeping purposes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
External Relates to another program, environment, or user action which we cannot control.
Projects
None yet
Development

No branches or pull requests

6 participants