Rework VM dispatch #4967

Open: @rerobika wants to merge 1 commit into master

@rerobika rerobika commented Jan 19, 2022

Overview:

  • The VM_OC_* opcodes are removed; dispatch is now based directly on the CBC/CBC_EXT opcode.
  • The common argument decoding is removed; each opcode now resolves its own arguments using the VM_DECODE_{EXT}_... macros. Whenever an argument is decoded, the dispatcher continues with the same opcode increased by VM_OPCODE_DECODED_ARG.
  • E.g., an opcode with 2 literal arguments is dispatched as follows:
    • step_1: case opcode: goto resolve_first_literal, increment the opcode by VM_OPCODE_DECODED_ARG, dispatch again (hidden by a macro)
    • step_2: case VM_OPCODE_ONE_LITERAL(opcode): goto resolve_second_literal, increment the opcode by VM_OPCODE_DECODED_ARG, dispatch again (hidden by a macro)
    • step_3: case VM_OPCODE_TWO_LITERALS (opcode): the opcode handler implementation goes here.
    • The opcode handler implementation looks like this:
  case VM_OPCODE_TWO_LITERALS (opcode):
  {
    VM_DECODE_LITERAL_LITERAL (opcode);

    /* use left and right value as before */
  }
  • The put-result block is optimized: each assignment now knows whether an identifier or a property reference should be resolved.
  • The free_left_value and free_both_values labels are removed, because they caused extra long jumps just to execute 1 or 2 simple calls.
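The staged dispatch described in the overview can be sketched with a toy interpreter. This is only an illustration under stated assumptions: the `TOY_*` names, the one-byte operand encoding, and the use of shared `case` labels instead of `goto` plus macros are all hypothetical and are not taken from the JerryScript sources. It shows the core idea only: every decoded argument bumps the dispatch value by a fixed offset, and the handler sits under the fully decoded case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy opcodes; these are NOT the real CBC opcodes. */
enum
{
  TOY_OP_ADD = 0, /* takes two literal arguments */
  TOY_OP_NEG = 1, /* takes one literal argument */
  TOY_OP_COUNT = 2
};

/* Each decoded argument moves the dispatch value by this offset
   (stands in for VM_OPCODE_DECODED_ARG in this sketch). */
#define TOY_DECODED_ARG TOY_OP_COUNT
#define TOY_ONE_LITERAL(op) ((op) + TOY_DECODED_ARG)
#define TOY_TWO_LITERALS(op) ((op) + 2 * TOY_DECODED_ARG)

/* Executes a single instruction and returns its result. */
static int
toy_run_one (const uint8_t *code, const int *literals)
{
  int dispatch = *code++;
  int left = 0;
  int right = 0;

  for (;;)
  {
    switch (dispatch)
    {
      case TOY_OP_ADD: /* step 1: nothing decoded yet */
      case TOY_OP_NEG:
      {
        left = literals[*code++];
        dispatch += TOY_DECODED_ARG;
        continue; /* dispatch again with the bumped value */
      }
      case TOY_ONE_LITERAL (TOY_OP_ADD): /* step 2: one literal decoded */
      {
        right = literals[*code++];
        dispatch += TOY_DECODED_ARG;
        continue;
      }
      case TOY_ONE_LITERAL (TOY_OP_NEG): /* handler: all arguments ready */
      {
        return -left;
      }
      case TOY_TWO_LITERALS (TOY_OP_ADD): /* step 3: handler */
      {
        return left + right;
      }
      default:
      {
        assert (0 && "unknown dispatch value");
        return 0;
      }
    }
  }
}
```

With literals `{ 40, 2 }`, the byte sequence `{ TOY_OP_ADD, 0, 1 }` passes through the dispatch values 0, 2 and 4 before the addition handler runs. A one-argument opcode such as `TOY_OP_NEG` reaches its handler after a single re-dispatch.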

JerryScript-DCO-1.0-Signed-off-by: Robert Fancsik robert.fancsik@h-lab.eu

@rerobika added the performance, interpreter, and stack usage labels on Jan 19, 2022
@rerobika (Member, Author) commented:

| Benchmark | Heap (bytes) | Stack (bytes) | Perf (sec) |
| --- | --- | --- | --- |
| 3d-cube.js | 10992 -> 10992 : 0.000% | 14280 -> 14272 : +0.056% | 1.090 -> 0.940 : +13.730% |
| 3d-morph.js | 289248 -> 289248 : 0.000% | 1664 -> 1664 : 0.000% | 1.321 -> 1.305 : +1.202% |
| 3d-raytrace.js | 119456 -> 119456 : 0.000% | 2572 -> 2564 : +0.311% | 1.287 -> 1.210 : +5.962% |
| access-binary-trees.js | 27120 -> 27120 : 0.000% | 3548 -> 3540 : +0.225% | 0.770 -> 0.751 : +2.449% |
| access-fannkuch.js | 1816 -> 1816 : 0.000% | 1736 -> 1736 : 0.000% | 2.571 -> 2.374 : +7.661% |
| access-nbody.js | 4592 -> 4592 : 0.000% | 1824 -> 1824 : 0.000% | 1.462 -> 1.357 : +7.211% |
| bitops-3bit-bits-in-byte.js | 1048 -> 1048 : 0.000% | 1680 -> 1680 : 0.000% | 0.688 -> 0.622 : +9.500% |
| bitops-bits-in-byte.js | 1008 -> 1008 : 0.000% | 1680 -> 1680 : 0.000% | 0.997 -> 0.907 : +9.050% |
| bitops-bitwise-and.js | 784 -> 784 : 0.000% | 1376 -> 1376 : 0.000% | 1.259 -> 1.266 : -0.565% |
| bitops-nsieve-bits.js | 61208 -> 61208 : 0.000% | 1680 -> 1680 : 0.000% | 1.492 -> 1.312 : +12.036% |
| controlflow-recursive.js | 1288 -> 1288 : 0.000% | 61584 -> 61576 : +0.013% | 0.527 -> 0.503 : +4.432% |
| crypto-aes.js | 28144 -> 28144 : 0.000% | 2308 -> 2308 : 0.000% | 0.884 -> 0.816 : +7.666% |
| crypto-md5.js | 70896 -> 70896 : 0.000% | 1764 -> 1764 : 0.000% | 0.729 -> 0.717 : +1.597% |
| crypto-sha1.js | 46240 -> 46240 : 0.000% | 1808 -> 1808 : 0.000% | 0.750 -> 0.719 : +4.190% |
| crypto.js | 50112 -> 50112 : 0.000% | 3200 -> 3192 : +0.250% | 8.457 -> 7.814 : +7.603% |
| date-format-tofte.js | 8840 -> 8840 : 0.000% | 2136 -> 2136 : 0.000% | 0.989 -> 0.953 : +3.629% |
| date-format-xparb.js | 15200 -> 15200 : 0.000% | 2664 -> 2664 : 0.000% | 0.722 -> 0.726 : -0.582% |
| deltablue.js | 456312 -> 456312 : 0.000% | 3592 -> 3584 : +0.223% | 5.278 -> 5.173 : +1.991% |
| math-cordic.js | 2064 -> 2064 : 0.000% | 1680 -> 1680 : 0.000% | 1.668 -> 1.497 : +10.268% |
| math-partial-sums.js | 1528 -> 1528 : 0.000% | 1656 -> 1656 : 0.000% | 1.188 -> 1.092 : +8.071% |
| math-spectral-norm.js | 3248 -> 3248 : 0.000% | 1680 -> 1680 : 0.000% | 0.762 -> 0.707 : +7.189% |
| raytrace.js | 22008 -> 22008 : 0.000% | 5168 -> 5160 : +0.155% | 2.654 -> 2.608 : +1.713% |
| richards.js | 8472 -> 8472 : 0.000% | 2156 -> 2148 : +0.371% | 0.297 -> 0.288 : +2.974% |
| string-base64.js | 90120 -> 90120 : 0.000% | 1728 -> 1728 : 0.000% | 1.423 -> 1.405 : +1.303% |
| string-fasta.js | 4400 -> 4400 : 0.000% | 1664 -> 1664 : 0.000% | 2.837 -> 2.723 : +3.997% |
| Geometric mean: | 12112.27 -> 12112.27 : 0.000% | 2581.934 -> 2580.275 : +0.064% | 1.251 -> 1.183 : +5.436% |

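The percentages in the benchmark results appear to follow a "positive means improvement" convention: a smaller value after the patch yields a positive number. The sketch below is an assumption about how such a column can be computed, not the actual benchmark script, and `relative_gain_percent` is a hypothetical helper name. Recomputing from the rounded values in the results gives figures close to, but not always identical with, the reported ones, presumably because the reported percentages come from unrounded measurements.

```c
#include <assert.h>

/* Relative gain in percent: positive when the value after the patch is
   smaller than before (less stack, less time), negative on a regression. */
static double
relative_gain_percent (double before, double after)
{
  assert (before > 0.0);
  return (before - after) / before * 100.0;
}
```

For the geometric-mean runtime, `relative_gain_percent (1.251, 1.183)` evaluates to roughly 5.44, in line with the reported +5.436%. For `bitops-bitwise-and.js` the result is negative, matching the small slowdown reported for that benchmark.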
| Binary (bytes) | master(79fd540) | patch(352bd15) | Diff |
| --- | --- | --- | --- |
| size | 239228 | 243324 | +4096 bytes |
| .rodata | 20950 | 20170 | -780 bytes |
| .dynstr | 384 | 384 | 0 bytes |
| .rel.plt | 384 | 384 | 0 bytes |
| .interp | 25 | 25 | 0 bytes |
| .dynsym | 864 | 864 | 0 bytes |
| .gnu.hash | 428 | 428 | 0 bytes |
| .text | 208744 | 214208 | +5464 bytes |
| .comment | 85 | 85 | 0 bytes |
| .shstrtab | 226 | 226 | 0 bytes |
| .data | 8 | 8 | 0 bytes |
| .ARM.exidx | 8 | 8 | 0 bytes |
| .rel.dyn | 16 | 16 | 0 bytes |
| .init | 12 | 12 | 0 bytes |
| .got | 208 | 208 | 0 bytes |
| .plt | 608 | 608 | 0 bytes |
| .note.ABI-tag | 32 | 32 | 0 bytes |
| .gnu.version_r | 64 | 64 | 0 bytes |
| .bss | 1575136 | 1575136 | 0 bytes |
| .fini | 8 | 8 | 0 bytes |
| .hash | 372 | 372 | 0 bytes |
| .gnu.version | 108 | 108 | 0 bytes |
| .fini_array | 4 | 4 | 0 bytes |
| .init_array | 4 | 4 | 0 bytes |
| .dynamic | 248 | 248 | 0 bytes |
| .eh_frame | 4 | 4 | 0 bytes |
| .ARM.attributes | 53 | 53 | 0 bytes |

@rerobika force-pushed the rework_vm_dispatch_2 branch 2 times, most recently from a50bb28 to 03b8e31 (January 19, 2022 11:28)
jerry-core/parser/js/byte-code.h: 3 review threads (outdated, resolved)
@rerobika force-pushed the rework_vm_dispatch_2 branch 2 times, most recently from 12af4cc to b9e60a5 (January 19, 2022 12:25)