Mozlando Servo SMStrings

Lars Bergstrom edited this page Dec 11, 2015 · 2 revisions

JS / Servo Strings discussion

  • Gecko uses external strings in the JS engine for the strings it passes in
  • experiment: try assuming all Latin-1 strings are ASCII, compare performance with/without memcpy or zero-copy optimizations
    • Make sure test input really is ASCII, otherwise putting it in a Rust str can cause UB
    • How expensive is checking for ASCII if optimized to use SIMD?
  • autostablestring (might be hard to codegen for Rust?)
  • Are we using the same code to copy strings as Gecko?
  • strings might be getting linearized from ropes
  • GC interaction
    • If we share the backing store, what happens if we allow JSStrings to move?
    • A string that is still in the live set won't get mutated, but it could potentially move.
    • Store a handle? Adds cost of double indirection.
    • Tracing can detect this. Tracing both marks things and causes them to move. So it can inform you when it happens.
      • Knowing the JS string has moved doesn't tell us what its corresponding Servo object is.
      • It's the trace method of the Servo object that causes it to happen.
      • So if the string is live in two places, one in Servo and one in SpiderMonkey...
      • When you call SpiderMonkey's trace method, you give it the address of your reference, and it updates that reference.
  • Can we ask whether a string was an "external" string created by Servo?
    • Look at the finalizer pointer on the JS string object, and compare it to Servo's.
  • Caching
    • Gecko has an optimization where they cache the last string to go from Gecko to JS (?)
      • just a single-element cache
  • Testing moves
    • Currently hard to detect potential problems through testing.
    • Would be useful to have a version of SpiderMonkey that randomly moved things
  • safety
    • Need to use Cell / UnsafeCell
  • Optimizations from Gecko
    • Talk to bz
    • We do already have (most of?) the LICM / DCE optimizations
  • Long-term can we eliminate all copies?
    • Get SpiderMonkey to use WTF-8 internally
      • But that makes things like charAt slow
      • Could have WTF-8 be one of the available representations (like UCS-2 and Latin1 currently)
        • Lazily convert when needed
      • WTF-8 <-> UCS-2 conversion is slower than Latin1 <-> UCS-2 conversion
  • Does this also affect bholley's servo_style in Gecko work?
    • Lots of things (property names, keywords, numbers) are guaranteed to be ASCII, atoms.
    • What string representation does Gecko CSS use?
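
As a rough illustration of the Latin-1/ASCII experiment noted above, the following Rust sketch borrows the bytes as a `str` without copying when they are verified to be ASCII, and falls back to a transcoding copy otherwise. The function name `latin1_to_str` is hypothetical and not part of Servo or SpiderMonkey; the sketch assumes the Latin-1 data is already available as a byte slice and says nothing about how the GC might move or free the backing store.

```rust
use std::borrow::Cow;

/// Hypothetical helper: view Latin-1 bytes as a Rust string.
/// Pure-ASCII input is borrowed (zero-copy); anything else is transcoded.
fn latin1_to_str(bytes: &[u8]) -> Cow<'_, str> {
    if bytes.is_ascii() {
        // ASCII is a subset of UTF-8, so this checked conversion cannot fail.
        // Skipping the check on non-ASCII data would be undefined behaviour,
        // which is why the input is verified rather than assumed.
        Cow::Borrowed(std::str::from_utf8(bytes).expect("ASCII is valid UTF-8"))
    } else {
        // Each Latin-1 byte is the Unicode code point with the same value,
        // so `u8 as char` performs the transcoding.
        Cow::Owned(bytes.iter().map(|&b| b as char).collect())
    }
}
```

The checked `from_utf8` call scans the input a second time; a benchmark along the lines discussed above could compare that against an unchecked conversion to see how much the ASCII check itself costs.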