RISC-V: Add WCH/QingKe XW extension #6390

Open, wants to merge 3 commits into base: master

ArcaneNibble

This PR adds support for the WCH/QingKe "XW" extension, closing issue #6385.

This only includes support for the "XW" extension and not any of the other QingKe-specific functionality (i.e. the ARM Cortex-inspired SysTick and interrupt controller).

As part of this, I had to duplicate the existing .cspec file, because the existing one seems to cause problems when the core supports single-precision but not double-precision floating point. I don't know whether this new .cspec is actually correct, especially for the QingKe-V2 core, which only implements RV32ECXW.

I don't know how to make Ghidra pick the correct variant based on .riscv.attributes.
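To make the variant question concrete, here is a minimal sketch of the decision that would be needed once the extension letters are known. The file names come from the manifest in this PR, but the mapping itself (F and D to riscv32-fp.cspec, F only to riscv32-fp32.cspec, neither to riscv32.cspec) is an assumption, and `pick_cspec` is a hypothetical helper, not anything Ghidra provides.

```python
from typing import Iterable


def pick_cspec(extensions: Iterable[str]) -> str:
    """Pick a 32-bit compiler spec file name from single-letter extensions.

    Assumed mapping (not confirmed anywhere in this PR):
      F and D -> riscv32-fp.cspec
      F only  -> riscv32-fp32.cspec
      neither -> riscv32.cspec
    """
    exts = {e.lower() for e in extensions}
    # "g" is shorthand for a set that includes both F and D; D implies F.
    has_d = bool(exts & {"d", "g"})
    has_f = has_d or "f" in exts
    if has_d:
        return "riscv32-fp.cspec"
    if has_f:
        return "riscv32-fp32.cspec"
    return "riscv32.cspec"
```

Under these assumptions, `pick_cspec("imafc")` would select the single-precision spec and `pick_cspec("ec")` the integer-only one. How Ghidra would actually be told to use the result is the open question.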

The XW opcode disassembly was tested by feeding all possible opcodes (as far as I can tell) through the vendor toolchain, opening the resulting .o file in Ghidra, and then visually scanning the results.
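A rough sketch of how such an exhaustive test input could be generated is below. It is not the script that was actually used. The mnemonic passed in, the `offset(base)` operand syntax, and the offset range are all placeholders; the only fixed piece is that the 3-bit register fields of compressed encodings address x8 through x15 (s0, s1, a0 to a5).

```python
from itertools import product

# Registers reachable from the 3-bit register fields of compressed
# encodings (x8..x15), written with their ABI names.
COMPRESSED_REGS = ["s0", "s1", "a0", "a1", "a2", "a3", "a4", "a5"]


def gen_load_store_cases(mnemonic, offsets):
    """Yield one assembly line per (register, base, offset) combination."""
    for reg, base, off in product(COMPRESSED_REGS, COMPRESSED_REGS, offsets):
        yield f"    {mnemonic} {reg}, {off}({base})"


def build_source(mnemonic, offsets):
    """Wrap the generated cases in a minimal assembly file body."""
    lines = ["    .text", "    .globl test", "test:"]
    lines.extend(gen_load_store_cases(mnemonic, offsets))
    return "\n".join(lines) + "\n"
```

Writing `build_source(<mnemonic>, range(<n>))` to a .s file and assembling it with the vendor toolchain would give an object file with one instruction per operand combination, which can then be opened in Ghidra and compared against the source line by line.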

@jnk0le

jnk0le commented Apr 7, 2024

One thing that sticks out is that the insn names conflict with the ratified Zcb instructions.

Naming them like xw.c.lbu would give them a higher chance of getting upstreamed to compilers (and reduce the overall mess), though they won't compile back through the vendor's proprietary gcc.

@ryanmkurtz added the "Feature: Processor/RISC-V" and "Status: Triage" (Information is being gathered) labels on Apr 8, 2024
@ryanmkurtz linked an issue on Apr 8, 2024 that may be closed by this pull request
@jobermayr
Contributor

To fix build error:

diff --git a/Ghidra/Processors/RISCV/certification.manifest b/Ghidra/Processors/RISCV/certification.manifest
index 07cddb0454..9e40a0da46 100644
--- a/Ghidra/Processors/RISCV/certification.manifest
+++ b/Ghidra/Processors/RISCV/certification.manifest
@@ -5,6 +5,7 @@ data/languages/RV32G.pspec||GHIDRA||||END|
 data/languages/RV32GC.pspec||GHIDRA||||END|
 data/languages/RV32I.pspec||GHIDRA||||END|
 data/languages/RV32IC.pspec||GHIDRA||||END|
+data/languages/RV32IMACFX.pspec||GHIDRA||||END|
 data/languages/RV32IMC.pspec||GHIDRA||||END|
 data/languages/RV64G.pspec||GHIDRA||||END|
 data/languages/RV64GC.pspec||GHIDRA||||END|
@@ -16,6 +17,7 @@ data/languages/riscv.ilp32d.slaspec||GHIDRA||||END|
 data/languages/riscv.ilp32d_thead.slaspec||GHIDRA||||END|
 data/languages/riscv.instr.sinc||GHIDRA||||END|
 data/languages/riscv.ldefs||GHIDRA||||END|
+data/languages/riscv.lp32qingke.slaspec||GHIDRA||||END|
 data/languages/riscv.lp64d.slaspec||GHIDRA||||END|
 data/languages/riscv.lp64d_thead.slaspec||GHIDRA||||END|
 data/languages/riscv.opinion||GHIDRA||||END|
@@ -30,6 +32,7 @@ data/languages/riscv.rv32k.sinc||GHIDRA||||END|
 data/languages/riscv.rv32m.sinc||GHIDRA||||END|
 data/languages/riscv.rv32p.sinc||GHIDRA||||END|
 data/languages/riscv.rv32q.sinc||GHIDRA||||END|
+data/languages/riscv.rv32xw.sinc||GHIDRA||||END|
 data/languages/riscv.rv64a.sinc||GHIDRA||||END|
 data/languages/riscv.rv64b.sinc||GHIDRA||||END|
 data/languages/riscv.rv64d.sinc||GHIDRA||||END|
@@ -56,6 +59,7 @@ data/languages/riscv.zvbc.sinc||GHIDRA||||END|
 data/languages/riscv.zvkng.sinc||GHIDRA||||END|
 data/languages/riscv.zvksg.sinc||GHIDRA||||END|
 data/languages/riscv32-fp.cspec||GHIDRA||||END|
+data/languages/riscv32-fp32.cspec||GHIDRA||||END|
 data/languages/riscv32.cspec||GHIDRA||||END|
 data/languages/riscv32.dwarf||GHIDRA||||END|
 data/languages/riscv64-fp.cspec||GHIDRA||||END|

@thixotropist
Contributor

If you can provide a binary exemplar or two including the new instructions, I'd like to add it to my RISCV collection. Metadata describing this ISA extension would be nice too, for instance:

  • what version of gcc or llvm was used as the pre-patched base for compiling the source code?
  • has the vendor taken steps to upstream support for this version of the extension in binutils, gcc, llvm, or the linux kernel?
  • does this vendor extension have a standard multi-letter name (like _xtheadbb), or just the older single-letter name?
  • has the vendor released other versions of this ISA extension, or subsequently announced similar chip families using the frozen Zcb extension?

Mostly I'm looking for hints on whether the vendor considers this extension tracked for standardization, deprecated in favor of newer extensions, or proprietary.

@ArcaneNibble
Author

re #6390 (comment): to be honest I have no idea what this actually does (running sleigh manually works without doing this), but I've added it as a separate commit

re #6390 (comment):

i am attaching the files that I used to visually smoke-test this PR (as well as the quick hack script that generated them): wch-xw.zip

the opcodes are definitely found in the bootroms of the ch32v003/ch32v203/ch32v208 chips, but i do not want to be responsible for publicly distributing those. i've also been able to make vendor gcc emit them for ad-hoc specially-contrived test functions. the disassembly makes sense when skimming these cases.

as to your questions:

  • the vendor gcc's version command reports riscv-none-elf-gcc (xPack GNU RISC-V Embedded GCC x86_64) 12.2.0
  • no idea, i am not affiliated with the vendor, and I haven't found anything from google searching the english-language internet
  • all vendor documentation (including a quick skim of the chinese-language documentation) seems to just call it XW
  • i am only aware of a single version of this extension (specifying rv32ecxw (for QingKe V2) vs rv32ima{f}cxw (for QingKe V4) doesn't change the opcodes). I don't know anything about future vendor cores.

@thixotropist
Contributor

Thanks for the exemplars and the contextual metadata. I'll see about getting them into my exemplars repo.

The certification.manifest file @jobermayr added is needed for building and distributing Ghidra, not for using sleigh to locally support a new processor spec or instruction set extension. It may amount to a public assertion that the contents of your new file are free of other proprietary or licensing claims and can be freely redistributed within Ghidra's Apache license.
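For anyone hitting the same build error, a small sketch of a check for new files that are not yet listed in the manifest follows. It assumes only what the diff above shows: one entry per line, with the file path being everything before the first `||`. Skipping lines that start with `#` is an extra assumption, and both helper names are hypothetical.

```python
def manifest_paths(manifest_text):
    """Return the set of file paths listed in a certification.manifest body."""
    paths = set()
    for line in manifest_text.splitlines():
        line = line.strip()
        # Assumption: blank lines and lines starting with "#" are not entries.
        if not line or line.startswith("#"):
            continue
        # Assumption: the path is everything before the first "||".
        paths.add(line.split("||", 1)[0])
    return paths


def missing_entries(manifest_text, files_on_disk):
    """Return the files that exist on disk but have no manifest entry, sorted."""
    listed = manifest_paths(manifest_text)
    return sorted(f for f in files_on_disk if f not in listed)
```

Feeding it the manifest text and the relative paths of the files in the module would list entries such as data/languages/riscv.rv32xw.sinc before the patch above is applied.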

If anyone knows how to triage intellectual property rights and licensing from reverse-engineered binaries we could have a fun discussion here. I've only been subpoenaed for a deposition on that topic once, which was enough for me.

@jnk0le

jnk0le commented Apr 9, 2024

> If anyone knows how to triage intellectual property rights and licensing from reverse-engineered binaries we could have a fun discussion here.

GCC is GPL licensed

@thixotropist
Contributor

@jnk0le: gcc and binutils are both safe. The thead, ventana, and sifive vendor extensions contributed to binutils are probably equally safe. I'm unsure about the others.

@ArcaneNibble
Author

As someone who has contributed to other projects that involve reverse engineering, I can assure you that the reverse-engineering process used to create this specific PR is, to the best of my knowledge (IANAL), completely safe. As you can see from the script included in the zip file posted earlier, I performed the entire process by running controlled inputs through the vendor toolchain and observing the bytes that come out. In no case did I ever load the vendor tools themselves into Ghidra or any other disassembler. The bit patterns and pcode in this PR describe only facts and information (and are my own expression of such, not the vendor's), so I do not believe the vendor can make any copyright claims to it. I am less familiar with patent law and did not read any patents while doing this work, but, as this PR doesn't involve implementing a RISC-V core itself, I find it unlikely that there is a legitimate patent claim to... decoding some opcodes for a disassembler.

In any case, I believe this discussion is far off-topic and out of scope when it comes to discussing this PR in question.

@thixotropist
Contributor

Does anyone have suggestions on how to name this extension and the processor/language variant within the Ghidra RISCV directory? The wch toolchain provided by @ArcaneNibble names it xw2p2. That implies version 2.2 of the w vendor-specific (x) extension set. The pull request calls the processor/language variant RV32_QingKe - but not all QingKe processors support this versioned extension.
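To illustrate the naming, here is a minimal sketch of splitting an architecture string into extension names and versions, so that xw2p2 reads as extension "xw", version 2.2. It is a simplification and not how binutils or Ghidra parse these strings: it assumes that a name starting with x, z, or s is multi-letter and runs to the next underscore, that names contain no digits, and that versions always have the form NpM. `parse_arch` is a hypothetical helper.

```python
import re

_SINGLE = re.compile(r"([a-z])(?:(\d+)p(\d+))?")
_MULTI = re.compile(r"([a-z]+)(?:(\d+)p(\d+))?$")


def parse_arch(arch):
    """Split e.g. 'rv32i2p0_c2p0_xw2p2' into (xlen, {name: version or None})."""
    arch = arch.lower()
    base = re.match(r"rv(32|64|128)", arch)
    if base is None:
        raise ValueError(f"not a RISC-V arch string: {arch!r}")
    exts = {}
    for chunk in arch[base.end():].split("_"):
        pos = 0
        while pos < len(chunk):
            # Assumption: x/z/s start a multi-letter name that fills the chunk.
            pattern = _MULTI if chunk[pos] in "xzs" else _SINGLE
            tok = pattern.match(chunk, pos)
            if tok is None:
                raise ValueError(f"cannot parse {chunk[pos:]!r}")
            name, major, minor = tok.groups()
            exts[name] = (int(major), int(minor)) if major else None
            pos = tok.end()
    return int(base.group(1)), exts
```

With these assumptions, both the versioned form and the short forms mentioned earlier in this thread (rv32ecxw, rv32imafcxw) yield an "xw" entry, which is the piece a language variant would have to key on.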

The Ghidra design questions here are well over my head. They get worse when Ghidra opens a linux-next RISCV-64 kernel, which determines the supported extensions at load time then patches its own executable code to optimize things like bit manipulation and encrypted file system support.

Successfully merging this pull request may close these issues.

RISC-V: Support for WCH/QingKe XW extension