Sentences with an empty parse tree #7

Open
jonathon-read opened this issue Mar 16, 2016 · 2 comments

@jonathon-read

I'd just like to share that we've noticed a few sentences in the data have empty parse trees (i.e. parsetree: "(())"). Some of the sentences do look hard to parse, but here is the list in case you think there's an underlying issue:

Subset DocId SentenceOffset String
en.dev wsj_2256 2904 Markets --
en.train wsj_2093 2948 Markets --
en.train wsj_0239 1067 In France??!!"
en.train wsj_0586 5241 Share prices also closed lower in Sydney, Hong Kong, Singapore, Taipei, Manila, Wellington and Seoul.
en.train wsj_1781 5726 Shares closed lower in other major Asian and Pacific markets, including Sydney, Hong Kong, Singapore, Taipei, Wellington, Seoul and Manila.
en.train wsj_1130 277 The firms are Morgan Stanley & Co., Salomon Brothers Inc., County Natwest Government Securities Inc., Greenwich Capital Markets Inc. and Goldman, Sachs & Co.
en.train wsj_1057 15074 When Mr. Pilson is asked directly -- can you make money on all this? -- he doesn't exactly say yes.
en.train wsj_1686 9 Just because Stamford, Conn., High School did nothing when its valuable 1930s mural was thrown in the trash doesn't mean the city no longer owns the work of art, a federal judge ruled.
en.train wsj_1065 1141 Lawyers at such firms as Sullivan & Cromwell; Willkie Farr & Gallagher; Wachtell, Lipton, Rosen & Katz; and Fried, Frank, Harris, Shriver & Jacobson all say they, too, have experienced a significant slowdown, particularly during the past few weeks.
en.train wsj_1065 4950 During the trial, Mr. Lang asked Mr. Lorin whether he had been so upset "that you considered killing Mr. Laff? . . .
en.train wsj_1217 2788 Markets --
en.train wsj_1866 1495 "I can just tell the questions are right back where they were: "What's going on?,'' "Can't anything be done about program trading?," "Doesn't the exchange understand?," "Where is the SEC on this?" "
en.train wsj_1861 1401 In addition, earnings were reduced by rate reductions in Florida, Kentucky, Alabama, Tennessee and Louisiana.
en.train wsj_1270 2449 In 1990, the issue is expected to be especially close in Alaska, California, Michigan, New York, Pennsylvania and Illinois.
en.train wsj_1493 5375 : American Suzuki Motor Corp., Brea, Calif., awarded its estimated $10 million to $30 million account to Asher/Gould, Los Angeles.
en.train wsj_1647 5686 So, is the tax code now open game again?
en.train wsj_1022 1545 Stocks involved in the shareholder suits include Union Carbide, RJR Nabisco, American Natural Resources, Boise Cascade Corp., General Foods Corp., Houston Natural Gas and FMC Corp.
en.train wsj_0675 2771 Markets --
en.train wsj_1655 3083 "How interesting."
en.train wsj_1655 3452 What's so discouraging is . . ."
en.train wsj_1723 5968 Prices also closed higher in Singapore, Sydney, Taipei, Wellington, Hong Kong and Manila but were lower in Seoul.
en.train wsj_1728 2841 Markets --
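
For anyone who wants to reproduce a list like this, here is a minimal sketch of the check. It assumes the parses are loaded as a dict mapping document IDs to `{"sentences": [...]}`, where each sentence has a `parsetree` string and a `words` list of `[token, info]` pairs; that layout is an assumption about the data files, and the function name and sample document are hypothetical. It reports the sentence index within the document, not the offset shown in the table above.

```python
EMPTY_TREE = "(())"


def find_empty_parses(parses):
    """Return (doc_id, sentence_index, text) for every sentence with an empty tree.

    `parses` is assumed to map doc IDs to {"sentences": [...]}, each sentence
    holding a "parsetree" string and a "words" list of [token, info] pairs.
    """
    empty = []
    for doc_id, doc in sorted(parses.items()):
        for idx, sentence in enumerate(doc.get("sentences", [])):
            # Drop all whitespace so "( () )" and "(())\n" are treated alike.
            tree = "".join(sentence.get("parsetree", "").split())
            if tree == EMPTY_TREE:
                text = " ".join(word[0] for word in sentence.get("words", []))
                empty.append((doc_id, idx, text))
    return empty


# Hypothetical in-memory stand-in for the real data; the non-empty tree is
# hand-written for illustration and is not actual parser output.
sample = {
    "doc_a": {
        "sentences": [
            {"parsetree": "( (INTJ (UH Hello) (. .)) )\n",
             "words": [["Hello", {}], [".", {}]]},
            {"parsetree": "(())\n",
             "words": [["Markets", {}], ["--", {}]]},
        ]
    }
}

for doc_id, idx, text in find_empty_parses(sample):
    print(doc_id, idx, text)
```

On the real data, the dict would come from something like `json.load(open(path))` on the parses file of each subset, with `path` pointing at wherever that file lives locally.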
@attapol
Owner

attapol commented Mar 18, 2016

The parser choked on these for some reason. I was thinking about patching them outside of the script, but that would make the data less reproducible... So we leave it to the participants to reparse those sentences as necessary.
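
One way a participant could keep such a patch reproducible is to store the reparsed trees in a separate mapping and apply them in a single, checked step. The sketch below is only an illustration under the same assumed dict layout (doc ID to `{"sentences": [...]}` with a `parsetree` string per sentence); the function name, the `(doc_id, sentence_index)` key format, and the sample tree are all hypothetical, and it says nothing about how the trees themselves are produced.

```python
import copy

EMPTY_TREE = "(())"


def apply_parse_patches(parses, patches):
    """Return a copy of `parses` with empty trees replaced from `patches`.

    `patches` maps (doc_id, sentence_index) to a replacement tree string.
    Refuses to overwrite a sentence that already has a non-empty tree, so
    the patch cannot silently change parses that were distributed intact.
    """
    patched = copy.deepcopy(parses)  # leave the caller's data untouched
    for (doc_id, idx), tree in patches.items():
        sentence = patched[doc_id]["sentences"][idx]
        current = "".join(sentence["parsetree"].split())
        if current != EMPTY_TREE:
            raise ValueError(f"{doc_id}[{idx}] already has a parse tree")
        sentence["parsetree"] = tree
    return patched


# Hypothetical data; the replacement tree is a hand-written placeholder,
# not the output of any particular parser.
original = {"doc_a": {"sentences": [{"parsetree": "(())\n", "words": []}]}}
patches = {("doc_a", 0): "( (INTJ (UH Hello) (. .)) )\n"}

fixed = apply_parse_patches(original, patches)
print(fixed["doc_a"]["sentences"][0]["parsetree"].strip())
```

Keeping the patch mapping in its own file alongside the parser settings used to produce it would let others regenerate the same patched data.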

@ghost

ghost commented Aug 13, 2019

Hi @attapol, do you still have the settings for the Berkeley Parser used for the English task? I am working on the ST data and would like to get the missing parses as close as possible to the rest of the parses provided. Thanks!
