v4.5.5: further Ssurgeon upgrades, SceneGraph server module, security bugfix

@AngledLuffa AngledLuffa released this 06 Sep 20:46
· 104 commits to main since this release

Ssurgeon updates beyond the capabilities listed in the GURT paper

  • MergeNodes operation: combine two words in a graph into one word. One of the words must be a leaf headed by the other for this to work 0660fa9
  • CombineMWT operation: mark two or more words as a multi-word token (MWT). Stanza will treat these as a single Token 010a955
  • DeleteLeaf operation: remove a leaf and renumber the subsequent words 429f61a
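
To illustrate what "remove a leaf, renumber the subsequent words" means for DeleteLeaf, here is a minimal sketch on a toy dependency structure. It is not CoreNLP's implementation and does not use the Ssurgeon API: the `DeleteLeafSketch` and `ToyGraph` names are hypothetical, words are identified by 1-based position, and each word stores the index of its head (0 for the root).

```java
import java.util.ArrayList;
import java.util.List;

public class DeleteLeafSketch {
  /** Hypothetical toy sentence: words.get(i) has index i + 1; heads.get(i) is its head's index, 0 for root. */
  static final class ToyGraph {
    final List<String> words;
    final List<Integer> heads;

    ToyGraph(List<String> words, List<Integer> heads) {
      this.words = new ArrayList<>(words);
      this.heads = new ArrayList<>(heads);
    }
  }

  /** Returns a copy of g without the word at the given 1-based index, which must be a leaf. */
  static ToyGraph deleteLeaf(ToyGraph g, int index) {
    if (index < 1 || index > g.words.size()) {
      throw new IllegalArgumentException("no word at index " + index);
    }
    for (int head : g.heads) {
      if (head == index) {
        throw new IllegalArgumentException("word " + index + " has dependents, so it is not a leaf");
      }
    }
    List<String> words = new ArrayList<>();
    List<Integer> heads = new ArrayList<>();
    for (int i = 0; i < g.words.size(); i++) {
      if (i + 1 == index) {
        continue; // drop the leaf itself
      }
      words.add(g.words.get(i));
      int head = g.heads.get(i);
      // Heads that pointed past the deleted word shift down by one.
      heads.add(head > index ? head - 1 : head);
    }
    return new ToyGraph(words, heads);
  }

  public static void main(String[] args) {
    // the(1)->dog, very(2)->big, big(3)->dog, dog(4)->barked, barked(5)->root
    List<String> words = new ArrayList<>(List.of("the", "very", "big", "dog", "barked"));
    List<Integer> heads = new ArrayList<>(List.of(4, 3, 4, 5, 0));
    ToyGraph result = deleteLeaf(new ToyGraph(words, heads), 2);
    System.out.println(result.words + " " + result.heads);
  }
}
```

In this sketch, word positions are implicit in the list order, so removing the leaf renumbers the later words automatically and only the head pointers need adjusting.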

Bugfixes

  • fix graph serialization for sentences longer than 128 words (IdentityHashSet doesn't work for integers beyond 128) d8d9d9f
  • fix valueOf for SemanticGraph if a word is just a dash 203eb06
  • fix memory usage when evaluating a PCFG model, which would run out of memory because it was saving all of the charts during evaluation b2e67b0
  • Tregex patterns would not display correctly when using optional patterns a9965b2 8659653
  • Tregex would loop forever on certain optional patterns which were theoretically legal cc7983e
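
The graph serialization bug above comes down to a general Java pitfall: boxed `Integer` values are only guaranteed to be shared objects in the range -128 to 127, so a set that compares by identity can miss larger values. The sketch below shows the effect using only JDK classes. It uses `Collections.newSetFromMap(new IdentityHashMap<>())` as a stand-in for an identity-based set and is not the CoreNLP code; the class and method names are hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Set;

public class IntegerIdentityDemo {
  /** Looks up a freshly boxed copy of value in a set that compares members with ==. */
  static boolean foundByIdentity(int value) {
    Set<Integer> seen = Collections.newSetFromMap(new IdentityHashMap<>());
    seen.add(Integer.valueOf(value));
    // Second boxing of the same int: found only if both boxings are the same object.
    return seen.contains(Integer.valueOf(value));
  }

  /** Same lookup in a regular HashSet, which compares members with equals(). */
  static boolean foundByEquals(int value) {
    Set<Integer> seen = new HashSet<>();
    seen.add(Integer.valueOf(value));
    return seen.contains(Integer.valueOf(value));
  }

  public static void main(String[] args) {
    // Values in -128..127 are cached by Integer.valueOf, so identity lookup works.
    System.out.println("identity, 100:  " + foundByIdentity(100));
    // Larger values are normally boxed into distinct objects (unless the JVM's
    // Integer cache has been enlarged), so identity lookup typically fails.
    System.out.println("identity, 1000: " + foundByIdentity(1000));
    // An equals-based set is unaffected by the cache.
    System.out.println("equals, 1000:   " + foundByEquals(1000));
  }
}
```

Under this assumption, any structure keyed on word indices behaves correctly for short sentences and silently breaks for long ones, which matches the symptom described in the bugfix.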

Security fixes

English dependency converter fixes

  • addressing issue #1363
  • fix (QP up to ...) 8c46648 9a86ece
  • fix "up to 1700 kilograms" if misparsed in a predictable manner 6e14527
  • better LST coverage 5745de5
  • vmod/acl when the parser misinterprets NP vs NML ad4556d
  • treat lists of NML as repeated modifiers of a noun, instead of as a list, as that is the likely meaning of NML. Example from PTB: "a 72-game, three-month season" 61ef545 5e748dc

Server features

  • SceneGraph endpoint 8b40947 #1346
  • remove one json library to reduce number of json libraries we depend on 357b1bb

Small changes

  • allow fourty as a number in SUTime 7fbb7b8
  • capture forty (40) days as a duration in SUTime b3c47a0
  • feature to print out the feature index of an NER model as a text file f636673
  • clarify the INTJ rule for the ChineseHeadFinder 56cd6bb
  • consider { } as punctuation when scoring English constituency treebanks a606afa
  • fix error in test case, from @tanloong #1373 #1372
  • dead code cleanup 86b6a03