This repository has been archived by the owner on Sep 2, 2023. It is now read-only.

Add LDM/STM (LoaD Multiple/STore Multiple) #141

Open
mbitsnbites opened this issue Jun 28, 2022 · 0 comments

mbitsnbites commented Jun 28, 2022

Purpose

The primary goal is to improve the code density and efficiency of function prologues and epilogues. Other load/store sequences can also be optimized (e.g. loading N variables from memory).

Also consider adding support for pre-decrement and post-increment addressing, and for optionally returning (J LR) after a post-increment load, which effectively bakes the entire epilogue into a single instruction.

Encoding

We could repurpose the currently unused format A zero opcode to load/store a range of registers (specified as first and last register) from/to a memory address held in any register (typically SP in function prologues/epilogues).

                                  V             T       OP
+-----------+---------+---------+---+---------+---+-------------+
|0 0 0 0 0 0| REGa    | REGb    |e f| REGc    |g h|0 0 0 0 0 0 0|
+-----------+---------+---------+---+---------+---+-------------+
  • REGa - First register to be loaded/stored.
  • REGc - Last register to be loaded/stored.
  • REGb - Register holding the memory address (e.g. SP).
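The field layout in the diagram can be sketched as a small encoder/decoder. This is only an illustration: the bit positions are read off the diagram under the assumption of a 32-bit word with the leftmost field in the most significant bits, and the function names are hypothetical.

```python
# Sketch only: bit positions are read off the diagram, assuming a 32-bit
# word with the leftmost field in the most significant bits.

def encode_ldm_stm(reg_a, reg_b, reg_c, e, f, g, h):
    """Pack the proposed LDM/STM fields into a 32-bit instruction word."""
    for name, reg in (("REGa", reg_a), ("REGb", reg_b), ("REGc", reg_c)):
        if not 0 <= reg <= 31:
            raise ValueError(f"{name} must be in 0..31, got {reg}")
    for name, bit in (("e", e), ("f", f), ("g", g), ("h", h)):
        if bit not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1, got {bit}")
    # Bits 31..26 and 6..0 stay zero (the unused format A zero opcode).
    return (
        (reg_a << 21) | (reg_b << 16) | (e << 15) | (f << 14)
        | (reg_c << 9) | (g << 8) | (h << 7)
    )


def decode_ldm_stm(word):
    """Return the fields as a dict, or None if the word is not this format."""
    if (word >> 26) != 0 or (word & 0x7F) != 0:
        return None
    return {
        "reg_a": (word >> 21) & 0x1F,  # first register in the range
        "reg_b": (word >> 16) & 0x1F,  # address register
        "reg_c": (word >> 9) & 0x1F,   # last register in the range
        "e": (word >> 15) & 1,
        "f": (word >> 14) & 1,
        "g": (word >> 8) & 1,
        "h": (word >> 7) & 1,
    }


# Example: registers R16-R20 with address register R29 (R29 as SP is an
# assumption made for this example; the issue does not state SP's number).
word = encode_ldm_stm(16, 29, 20, 0, 0, 1, 1)
fields = decode_ldm_stm(word)
```

Keeping e/f in the V field position and g/h in the T field position means a decoder can reuse the existing format A field extraction unchanged.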

We have four bits (e, f, g and h) that can encode the operation. Here is an example:

e f g h Operation
0 0 0 0 LDM {REGa-REGc}, [REGb]
0 0 0 1 LDM {REGa-REGc}, [REGb]+
0 0 1 0 STM {REGa-REGc}, [REGb]
0 0 1 1 STM {REGa-REGc}, -[REGb]
0 1 0 0 TBD
0 1 0 1 LDM {REGa-REGc}, [REGb]+, RET (return after load and increment)
0 1 1 0 TBD
0 1 1 1 TBD
1 0 0 0 TBD (vector?)
1 0 0 1 TBD (vector?)
1 0 1 0 TBD (vector?)
1 0 1 1 TBD (vector?)
1 1 0 0 TBD (vector?)
1 1 0 1 TBD (vector?)
1 1 1 0 TBD (vector?)
1 1 1 1 TBD (vector?)

Alternatively we could use the vacant format A & C opcodes 4 & 12 (load & store, respectively), though that is likely to waste encoding space and possibly confuse the interpretation of some bits in the instruction word.

Open questions

Handling LR (and VL)

Since STM/LDM are intended for function prologues/epilogues, the LR register is likely to be pushed/popped. However, LR is currently R30, while the register allocator typically selects R16, R17, ... for callee-saved registers. This makes it impossible to form a compact register range that includes R30 (the range would have to cover all of R16-R29 as well). There are two simple solutions:

  • Move LR to a lower register number, e.g. R16.
  • Use one of the bits in the instruction word to indicate that LR needs to be stored/loaded too.

There is a similar problem with VL (functions that use vector operations need to push VL). This speaks in favor of re-arranging the register numbers so that VL = R16, LR = R17 (for instance).

Another consideration: In a function epilogue we want to load LR as early as possible so that it is available when the RET instruction executes. This also speaks in favor of moving LR to a low register number (assuming that LDM loads the registers from memory in the order that they are listed).
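The register-range problem described in this section can be illustrated with a few lines of arithmetic. The sketch below assumes that a single LDM/STM must cover one contiguous range, that the allocator hands out callee-saved registers upwards from a fixed starting register, and, for the rearranged case, a hypothetical numbering with LR = R17 and callee-saved registers starting at R18.

```python
def range_size(needed):
    """Number of registers a single contiguous range must cover."""
    return max(needed) - min(needed) + 1


def needed_regs(num_callee_saved, first_callee_saved, lr):
    """Registers a prologue must save: LR plus the callee-saved ones in use."""
    used = set(range(first_callee_saved, first_callee_saved + num_callee_saved))
    used.add(lr)
    return used


k = 3  # the function uses three callee-saved registers

# Current numbering: callee-saved from R16, LR = R30.
current = range_size(needed_regs(k, first_callee_saved=16, lr=30))

# Hypothetical rearrangement: LR = R17, callee-saved from R18.
rearranged = range_size(needed_regs(k, first_callee_saved=18, lr=17))

print(f"current numbering:  {current} registers transferred")
print(f"rearranged (R17):   {rearranged} registers transferred")
```

With the current numbering the range always spans R16-R30 no matter how few callee-saved registers are in use, whereas with a low-numbered LR the range grows only with the number of registers actually needed.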

Vector registers

Should we allow loading/storing multiple vector registers? At the very least we should reserve room in the encoding to allow for it in a later ISA change, preferably in a way that maps well to how the V field in the instruction word (bits 14 & 15) is interpreted by other vector load/store instructions.

It would be very valuable to have some efficient way to store vector registers along with their vector length, with auto increment/decrement, which is very similar to what the LDM/STM instructions do for scalar registers.

Costs

  • Another sequencer is required, similar to the vector register sequencer. The simplest design would add a new pipeline stage before ID (just pumping out regular instructions), but a more advanced solution would embed the sequencer as part of the ID stage.
  • Implementations that support memory exceptions need to support resuming the load/store instruction mid-way.
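The two costs above can be sketched together: a sequencer that expands one LDM/STM into regular single-register operations, and a start index that lets the expansion resume after a memory exception. This is a software illustration only, not a hardware design; the `MicroOp` type, the `expand` function, the 4-byte word size, and the ascending register/address order are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroOp:
    kind: str     # "load" or "store"
    reg: int      # register to transfer
    base: int     # address register
    offset: int   # byte offset from the address register


def expand(kind, first, last, base, start=0, word=4):
    """Expand an LDM/STM into single-register operations.

    `start` is the index of the first element still to be transferred, so
    a handler that records how far the instruction got before a memory
    exception can resume without repeating completed transfers.
    """
    count = last - first + 1
    return [MicroOp(kind, first + i, base, i * word) for i in range(start, count)]


# Full expansion of LDM {R16-R18}, [R29].
all_ops = expand("load", 16, 18, 29)

# Resume after an exception on the third element (index 2).
remaining = expand("load", 16, 18, 29, start=2)
```

Resuming by index requires that the address register is not modified until the last element completes, which is one reason a model might defer the pre-decrement/post-increment write-back to the end.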