IRSLOAD_READONLY of FORL_STEP slot can leak to other slots #1123

Open
corsix opened this issue Nov 17, 2023 · 1 comment
corsix commented Nov 17, 2023

Occurrences of IRSLOAD_READONLY are limited to slots that are not visible to the Lua-level programmer:

  • The closure object and ftsz in frames.
  • The FORL_STOP and FORL_STEP variables.

Of these cases, the load of a FORL_STEP variable can be plucked out of the IR via fold rules and thereby made visible to the Lua-level programmer. The value can then be copied to another slot at a lower slot number than the original FORL_STEP variable. This causes no problems for snapshot_slots, as it only respects IRSLOAD_READONLY when ir->op1 == s. It does cause problems for lj_snap_replay, though: when a side trace is recorded, it ends up forming an SLOAD of the copy (rather than the original) with IRSLOAD_PARENT|IRSLOAD_READONLY, and that in turn causes problems for snapshot_slots, as it then marks the copy (rather than the original) with SNAP_NORESTORE.

An example of this is:

local bit = require"bit"
local function f(start, step)
  local result = "foo"
  local prev_i
  local b = false
  for i = start, 80, step do    
    if b then
      result = bit.tobit(i - prev_i) -- folds to SLOAD of FORL_STEP
      if i >= 60 then end -- required to get a side trace
    end
    b = not b
    prev_i = i
  end
  return result
end
assert(f( 1, 1) == 1)
assert(f(61, 1) == 1) -- fails (f returns "foo")
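
For reference, a minimal sketch (plain Lua, no JIT involved, my illustration rather than part of the report) of the invariant the fold rules exploit: once prev_i has been assigned, it always holds i - step, so i - prev_i is exactly the loop step, i.e. the value held in the FORL_STEP slot.

local start, stop, step = 1, 80, 1
local prev_i
for i = start, stop, step do
  -- prev_i was assigned on the previous iteration, so it equals i - step
  -- and the subtraction reduces to the loop step itself.
  if prev_i then assert(i - prev_i == step) end
  prev_i = i
end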

Possibly there's a solution in the area of performing the IRSLOAD_READONLY SLOAD against the slot number which had SNAP_NORESTORE:

diff --git a/src/lj_snap.c b/src/lj_snap.c
index 68de208f44..ffb3b66920 100644
--- a/src/lj_snap.c
+++ b/src/lj_snap.c
@@ -421,6 +421,7 @@ IRIns *lj_snap_regspmap(jit_State *J, GCtrace *T, SnapNo snapno, IRIns *ir)
 	lj_assertJ(n < snap->nent, "slot %d not found in snapshot", ir->op1);
 	if (snap_slot(map[n]) == ir->op1) {
 	  ref = snap_ref(map[n++]);
+	  if (ir[1].o == IR_SLOAD && ir[1].op1 < ir->op1) n = 0;
 	  break;
 	}
       }
@@ -520,8 +521,15 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
     IRIns *ir = &T->ir[ref];
     TRef tr;
     /* The bloom filter avoids O(nent^2) overhead for de-duping slots. */
-    if (bloomtest(seen, ref) && (tr = snap_dedup(J, map, n, ref)) != 0)
+    if (bloomtest(seen, ref) && (tr = snap_dedup(J, map, n, ref)) != 0) {
+      if (LJ_UNLIKELY(sn & SNAP_NORESTORE) && !tref_isk(tr)) {
+        IRIns *sload = &J->cur.ir[tref_ref(tr)];
+        lj_assertJ(sload->o == IR_SLOAD, "unexpected op %d", sload->o);
+        sload->op1 = s;
+        sload->op2 |= IRSLOAD_READONLY;
+      }
       goto setslot;
+    }
     bloomset(seen, ref);
     if (irref_isk(ref)) {
       /* See special treatment of LJ_FR2 slot 1 in snapshot_slots() above. */
@@ -537,7 +545,7 @@ void lj_snap_replay(jit_State *J, GCtrace *T)
       IRType t = irt_type(ir->t);
       uint32_t mode = IRSLOAD_INHERIT|IRSLOAD_PARENT;
       if (LJ_SOFTFP32 && (sn & SNAP_SOFTFPNUM)) t = IRT_NUM;
-      if (ir->o == IR_SLOAD) mode |= (ir->op2 & IRSLOAD_READONLY);
+      if ((sn & SNAP_NORESTORE)) mode |= IRSLOAD_READONLY;
       if ((sn & SNAP_KEYINDEX)) mode |= IRSLOAD_KEYINDEX;
       tr = emitir_raw(IRT(IR_SLOAD, t), s, mode);
     }

(The assumption in lj_snap_regspmap that SLOADs occur in order of increasing slot number is mildly unpleasant. There's a major shakeup that would remove the need for it: change the SLOAD fields such that the low 8 bits of op1 are the slot number, the high 7 bits of op1 are flags, and for parent loads op2 contains the ref in the parent. One could then also eliminate IR_PVAL and instead represent it as a SLOAD from slot 255.)
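
To make that proposed encoding concrete, here is an illustrative sketch (hypothetical layout, not current LuaJIT code) of packing the slot number into the low 8 bits of op1 and the flags into the 7 bits above it, with slot 255 reserved to stand in for IR_PVAL:

local bit = require"bit"
local PVAL_SLOT = 255  -- hypothetical: slot number reserved to replace IR_PVAL

local function sload_op1(slot, flags)
  -- Pack: slot in bits 0..7, flags in bits 8..14.
  assert(slot >= 0 and slot <= 255)
  assert(flags >= 0 and flags <= 0x7f)
  return bit.bor(slot, bit.lshift(flags, 8))
end
local function sload_slot(op1)  return bit.band(op1, 0xff) end
local function sload_flags(op1) return bit.rshift(op1, 8) end

local op1 = sload_op1(3, 0x5)
assert(sload_slot(op1) == 3 and sload_flags(op1) == 0x5)
assert(sload_slot(sload_op1(PVAL_SLOT, 0)) == PVAL_SLOT)

With op2 free to hold the parent ref directly, parent loads would no longer need the slot-ordered walk of the parent snapshot in lj_snap_regspmap.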


corsix commented Nov 19, 2023

If IRSLOAD_READONLY loads can be referenced from "non-home" positions, then canonicalize_slots also needs to be slightly cleverer; it can't rely just on the presence of IRSLOAD_READONLY. One option is to replicate the logic from within snapshot_slots; another would be to rearrange things so that canonicalize_slots runs after lj_snap_add and uses the results of that logic:

diff --git a/src/lj_record.c b/src/lj_record.c
index b7af589697..5d152544a8 100644
--- a/src/lj_record.c
+++ b/src/lj_record.c
@@ -280,19 +280,25 @@ typedef enum {
   LOOPEV_ENTER		/* Loop is entered. */
 } LoopEvent;
 
-/* Canonicalize slots: convert integers to numbers. */
+/* Canonicalize slots: convert integers to numbers in latest snapshot. */
 static void canonicalize_slots(jit_State *J)
 {
-  BCReg s;
-  if (LJ_DUALNUM) return;
-  for (s = J->baseslot+J->maxslot-1; s >= 1; s--) {
-    TRef tr = J->slot[s];
-    if (tref_isinteger(tr) && !(tr & TREF_KEYINDEX)) {
-      IRIns *ir = IR(tref_ref(tr));
-      if (!(ir->o == IR_SLOAD && (ir->op2 & (IRSLOAD_READONLY))))
-	J->slot[s] = emitir(IRTN(IR_CONV), tr, IRCONV_NUM_INT);
+  SnapShot *snap = &J->cur.snap[J->cur.nsnap] - 1;
+  SnapEntry *map = &J->cur.snapmap[snap->mapofs];
+  MSize n, nent = snap->nent;
+  for (n = 0; n < nent; n++) {
+    SnapEntry sn = map[n];
+    if (!(sn & (SNAP_NORESTORE | SNAP_SOFTFPNUM | SNAP_KEYINDEX))) {
+      if (tref_isinteger(J->slot[snap_slot(sn)])) {
+	TRef tr = emitir(IRTN(IR_CONV), snap_ref(sn), IRCONV_NUM_INT);
+	sn = SNAP_TR(snap_slot(sn), tr);
+	if (LJ_SOFTFP32)
+	  sn |= SNAP_SOFTFPNUM;
+	map[n] = sn;
+      }
     }
   }
+  snap->ref = (IRRef1)J->cur.nins;  /* To include any new IR_CONVs. */
 }
 
 /* Stop recording. */
@@ -305,19 +311,19 @@ void lj_record_stop(jit_State *J, TraceLink linktype, TraceNo lnk)
   lj_trace_end(J);
   J->cur.linktype = (uint8_t)linktype;
   J->cur.link = (uint16_t)lnk;
+  /* Note: all loop ops must set J->pc to the following instruction! */
+  lj_snap_add(J);  /* Add loop snapshot. */
+  J->needsnap = 0;
+  J->mergesnap = 1;  /* In case recording continues. */
   /* Looping back at the same stack level? */
-  if (lnk == J->cur.traceno && J->framedepth + J->retdepth == 0) {
+  if (J->cur.link == J->cur.traceno && J->framedepth + J->retdepth == 0) {
     if ((J->flags & JIT_F_OPT_LOOP))  /* Shall we try to create a loop? */
-      goto nocanon;  /* Do not canonicalize or we lose the narrowing. */
+      return;  /* Do not canonicalize or we lose the narrowing. */
     if (J->cur.root)  /* Otherwise ensure we always link to the root trace. */
       J->cur.link = J->cur.root;
   }
-  canonicalize_slots(J);
-nocanon:
-  /* Note: all loop ops must set J->pc to the following instruction! */
-  lj_snap_add(J);  /* Add loop snapshot. */
-  J->needsnap = 0;
-  J->mergesnap = 1;  /* In case recording continues. */
+  if (!LJ_DUALNUM)
+    canonicalize_slots(J);
 }
 
 /* Search bytecode backwards for a int/num constant slot initializer. */

(At this point, IRSLOAD_READONLY becomes a no-op unless IRSLOAD_PARENT is also present)
