This toolkit automates an end-to-end workflow for discovering and validating vulnerabilities in re-hosted IoT firmware:
- Fuzzing-Driven Information Collection
  - Extract all target URIs and input parameters.
  - Locate sink functions and compute distances from basic blocks to sinks.
  - Produce mutation dictionaries for the fuzzer.
- Instrumented QEMU binaries collect control-flow data during fuzzing.
  - Optional: Build emulation images directly with a helper script.
- Generic, device-aware API fuzzer with configurable Host header.
  - Uses the pre-fuzzing artifacts to generate requests and mutations.
- Taint-to-PoC Agent
  - LLM-assisted taint analysis to find vulnerabilities and produce PoCs.
- Python: 3.8+. Install Python dependencies:
  pip install -r requirements.txt
- Emulation: Greenhouse (details at https://github.com/sefcom/greenhouse.git)
  - We ship instrumented binaries under FirmAgent/FuzzingRecord/gh3fuzz/fuzz_bins/qemu/.
  - Control-flow collector: libibresolver.so.
- Target firmware: Obtain and re-host the firmware image.
- Set environment variables (no spaces around `=`):
export idat64=path/to/idat64
export OPENAI_API_KEY={OPENAI_API_KEY}
or
export Deepseek_API_KEY={Deepseek_API_KEY}
- Run pre-fuzzing in IDA (extract inputs and sinks):
The pre-fuzzing scripts generate:
Pre_fuzzing.json (API endpoints + request templates)
- Build an emulated image (optional, automated wiring):
python build_fuzz_img.py
This integrates the instrumentation and supporting binaries into the emulation image.
- Start fuzzing (from the host):
python fuzzer.py \
--json-file Pre_fuzzing.json \
--delay 0.5 \
--host {target_ip_or_domain}
- Run taint analysis to generate PoCs:
python LLMATaint.py \
-b {path_to_binary} \
-p {True|False} \
-t {vuln_type} \
-o {path_to_resultfolder} \
-m {model}
During fuzzing, a source address list and an indirect call JSON are produced. Place these files in the same folder as the target binary before running LLMATaint.py.
Goal: Collect everything needed to drive effective mutations and triage.
- URIs & Parameters: get_args.py extracts all API endpoints and "best-effort" parameter templates.
- Sink Functions: Get_sinkFunc.py identifies sink function address ranges (e.g., dangerous APIs, command execs).
- Distances: Distance.py computes the distance from each basic block to the sink locations. These distances help prioritize inputs that are "closer" to sensitive sinks.
Outputs:
Pre_fuzzing.json (API and parameter set for fuzzing)
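As a rough illustration of the distance step, the sketch below runs a multi-source breadth-first search over a reversed control-flow graph, so that each basic block gets the fewest edges needed to reach any sink. The graph format, the block names, and the function name `distances_to_sinks` are assumptions made for this example; Distance.py may use a different metric or representation.

```python
from collections import deque

def distances_to_sinks(cfg, sinks):
    """Return {block: fewest edges needed to reach any sink block}.

    cfg   -- hypothetical adjacency map: block -> list of successor blocks
    sinks -- iterable of blocks that contain a sink call
    Blocks that cannot reach a sink are omitted from the result.
    """
    # Reverse the edges so the search can walk backwards from the sinks.
    reverse = {}
    for src, dsts in cfg.items():
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)

    dist = {sink: 0 for sink in sinks}
    queue = deque(dist)
    while queue:
        block = queue.popleft()
        for pred in reverse.get(block, []):
            if pred not in dist:  # first visit is the shortest path in BFS
                dist[pred] = dist[block] + 1
                queue.append(pred)
    return dist

# Toy CFG (made up): entry -> parse -> (exec_cmd | log); exec_cmd holds the sink.
toy_cfg = {
    "entry": ["parse"],
    "parse": ["exec_cmd", "log"],
    "exec_cmd": [],
    "log": [],
}
result = distances_to_sinks(toy_cfg, ["exec_cmd"])
```

In this toy graph, `log` never reaches the sink and is therefore left out of `result`; a scheduler could treat missing blocks as infinitely far away.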
We provide an instrumented QEMU stack and a control-flow recording library:
- Location: FirmAgent/FuzzingRecord/gh3fuzz/fuzz_bins/qemu/
  - libibresolver.so collects control-flow events.
- Recommended path:
  - Use build_fuzz_img.py to assemble the emulation image.
  - Instrumentation and collectors are automatically integrated.
- Customization: You can extend qemuafl to log additional runtime signals. We provide our Fuzzing-SA source for reference (see repo).
Run the generic fuzzer against your re-hosted device:
python fuzzer.py \
--json-file Pre_fuzzing.json \
--delay 0.5 \
--host {target_ip_or_domain}
- --json-file: Pre-fuzzing output containing API definitions.
- --delay: Inter-request sleep to avoid rate limits / WAFs.
- --host: Value for the Host header (supports SNI / vhosts).
What happens:
- The fuzzer parses Pre_fuzzing.json and parameter templates.
- For each endpoint & parameter, it generates mutations (e.g., command-injection, XSS, traversal).
- Requests and responses are logged to fuzzing_results.log; structured details to detailed_results.json.
- Potential vulnerabilities are flagged based on error codes, timing, and content indicators.
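The mutation step can be pictured as substituting each payload into one `<String_value>` placeholder at a time while the other placeholders receive a benign value. The following is a minimal sketch under that assumption, not the code in fuzzer.py; `mutate_template`, the `benign` default, and the payload name `command_injection_semicolon` are hypothetical.

```python
import json
import re

PLACEHOLDER = "<String_value>"
# Matches `"field": <String_value>` and captures the field name.
FIELD_RE = re.compile(r'"([^"]+)"\s*:\s*' + re.escape(PLACEHOLDER))

def mutate_template(template, payloads, benign="test"):
    """Yield (field, payload_name, body) with exactly one field mutated."""
    fields = FIELD_RE.findall(template)
    for target in fields:
        for name, payload in payloads.items():
            def fill(match, target=target, payload=payload):
                value = payload if match.group(1) == target else benign
                # json.dumps quotes and escapes the value for the JSON body.
                return '"' + match.group(1) + '": ' + json.dumps(value)
            yield target, name, FIELD_RE.sub(fill, template)

template = '{ "username": <String_value>, "cmd": <String_value>, "enabled": true }'
payloads = {
    "command_injection_semicolon": "; id",
    "path_traversal": "../../../etc/passwd",
}
cases = list(mutate_template(template, payloads))
```

Mutating one field per request keeps each response attributable to a single parameter, which makes later triage simpler than mutating every field at once.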
Once fuzzing surfaces suspicious behavior, validate and transform it into concrete PoCs:
python LLMATaint.py \
-b {path_to_binary} \
-p {True|False} \
-t {vuln_type} \
-o {path_to_resultfolder} \
-m {model}
Important:
Place the Source addresses file and the indirect call JSON produced during fuzzing into the same directory as the target binary before running LLMATaint.py. This lets the agent align runtime signals with code structures for precise taint propagation and PoC crafting.
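A small helper can make the placement step harder to forget. The sketch below copies the fuzzing artifacts next to the target binary and fails early if one is missing; `stage_artifacts` is a hypothetical helper, not part of the toolkit, and the file names in the usage note are taken from the outputs described in this README.

```python
import shutil
from pathlib import Path

def stage_artifacts(binary_path, artifact_paths):
    """Copy fuzzing artifacts into the binary's directory; return the new paths."""
    target_dir = Path(binary_path).resolve().parent
    staged = []
    for artifact in map(Path, artifact_paths):
        if not artifact.is_file():
            raise FileNotFoundError(f"missing fuzzing artifact: {artifact}")
        dest = target_dir / artifact.name
        if artifact.resolve() != dest:  # skip files that are already in place
            shutil.copy2(artifact, dest)
        staged.append(dest)
    return staged
```

For example, `stage_artifacts("./bin/httpd", ["Source.json", "Indirect_call.json"])` would leave both files in `./bin/` before LLMATaint.py is started with `-b ./bin/httpd`.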
An array of API objects. Minimal fields required by fuzzer.py:
[
{
"api_url": "/nitro/v1/config/example_endpoint",
"http_method": "POST",
"request_payload": "{ \"username\": <String_value>, \"cmd\": <String_value>, \"enabled\": true }"
}
]
- api_url: Relative path joined with --base-url.
- http_method: GET, POST, PUT, DELETE, …
- request_payload: A JSON string containing placeholders:
  - <String_value> indicates a fuzzable string parameter.
  - Boolean literals true/false are supported and extracted, but (by default) only string-type placeholders are fuzzed.
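Before a long run, it can help to check that every entry has the three fields listed above and to see which parameters would be fuzzed. This is a sketch under the schema described here; `summarize_apis` is a hypothetical helper and does not reproduce the parsing or repair logic inside fuzzer.py.

```python
import json
import re

REQUIRED_KEYS = ("api_url", "http_method", "request_payload")
# Field names whose value is the fuzzable string placeholder.
FUZZABLE_RE = re.compile(r'"([^"]+)"\s*:\s*<String_value>')

def summarize_apis(json_text):
    """Parse Pre_fuzzing.json text; return [(method, url, fuzzable_fields)]."""
    summary = []
    for index, api in enumerate(json.loads(json_text)):
        missing = [key for key in REQUIRED_KEYS if key not in api]
        if missing:
            raise ValueError(f"entry {index} is missing {missing}")
        fields = FUZZABLE_RE.findall(api["request_payload"])
        summary.append((api["http_method"].upper(), api["api_url"], fields))
    return summary

sample = json.dumps([{
    "api_url": "/nitro/v1/config/example_endpoint",
    "http_method": "POST",
    "request_payload": '{ "username": <String_value>, "cmd": <String_value>, "enabled": true }',
}])
summary = summarize_apis(sample)
```

An entry whose field list comes back empty has no `<String_value>` placeholder and would give the fuzzer nothing to mutate.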
- fuzzing_results.log: Human-readable progress and warnings.
- detailed_results.json: Line-delimited JSON of each attempt.
- Source.json: source address
- Indirect_call.json: (caller, callee)
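Because detailed_results.json is described as line-delimited JSON, it has to be read one line at a time rather than with a single `json.load` call. The reader below is a minimal sketch; the record keys in the sample (`endpoint`, `status`) are made up for illustration and are not the fuzzer's documented schema.

```python
import json

def iter_records(lines):
    """Yield one dict per non-empty line, skipping lines that do not parse."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a partially written final line
        if isinstance(record, dict):
            yield record

# Hypothetical sample lines; the third one is truncated on purpose.
sample_lines = [
    '{"endpoint": "/a", "status": 500}',
    "",
    '{"endpoint": "/b"',
    '{"endpoint": "/c", "status": 200}',
]
records = list(iter_records(sample_lines))
```

With a real run, the same function accepts an open file object, for example `with open("detailed_results.json") as fh: records = list(iter_records(fh))`.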
fuzzer.py CLI:
--json-file (str, required) Path to Pre_fuzzing.json
--delay (float, default 0.5) Seconds between requests
--host (str, required) Host header value (IP or DNS name)
Headers (inside send_request):
- Host is set from --host.
- Content-Type: application/json by default.
- Adjust or extend headers as needed for auth, cookies, etc.
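To show how the default headers and extra auth headers combine, here is a sketch that builds (but does not send) a request. It uses the standard library's `urllib.request` instead of whatever HTTP client `send_request` uses; `build_request`, the `router.local` host name, and the cookie value are assumptions for the example.

```python
import json
import urllib.request

def build_request(base_url, api_url, host, body, method="POST", extra_headers=None):
    """Build a JSON request with an explicit Host header plus optional extras."""
    headers = {"Host": host, "Content-Type": "application/json"}
    headers.update(extra_headers or {})  # e.g. Cookie or Authorization
    return urllib.request.Request(
        base_url.rstrip("/") + api_url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method=method,
    )

req = build_request(
    "http://192.168.0.1",
    "/nitro/v1/config/example_endpoint",
    "router.local",
    {"cmd": "test"},
    extra_headers={"Cookie": "session=PLACEHOLDER"},
)
```

Keeping the extra headers in a separate argument makes it easy to run the same endpoint list once unauthenticated and once with a session.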
Payload dictionary (default built-ins):
- buffer_overflow: long string, 'A' * 60
- command_injection_*: |, ;, &, backticks, $() forms
- xss_payload: basic <script>alert(1)</script>
- path_traversal: ../../../etc/passwd
Extend self.payloads in APIFuzzer.__init__ to add more patterns.
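As one possible extension, the sketch below defines two SQL-injection payloads and a standalone check that combines error signatures, status codes, and timing. Everything here is hypothetical: the `sqli_*` names, the signature list, the thresholds, and the `sqli_indicators` function are illustrative and are not taken from fuzzer.py.

```python
# Hypothetical payload entries, keyed like the built-in ones.
SQLI_PAYLOADS = {
    "sqli_quote": "'",
    "sqli_boolean": "' OR '1'='1",
}

# Lower-case substrings that often appear in database error pages.
SQL_ERROR_SIGNATURES = (
    "sql syntax",
    "sqlite error",
    "unrecognized token",
    "unterminated quoted string",
)

def sqli_indicators(status_code, body, elapsed, baseline=0.5):
    """Return the reasons a response looks suspicious (empty list if none)."""
    reasons = []
    lowered = body.lower()
    for signature in SQL_ERROR_SIGNATURES:
        if signature in lowered:
            reasons.append(f"error signature: {signature}")
    if status_code >= 500:
        reasons.append(f"server error: {status_code}")
    if elapsed > 5 * baseline:  # arbitrary threshold for this sketch
        reasons.append(f"slow response: {elapsed:.1f}s")
    return reasons

hits = sqli_indicators(500, "SQLite error: unrecognized token", 0.1)
```

Assuming `self.payloads` is a plain name-to-string dictionary, the new entries could be merged with `self.payloads.update(SQLI_PAYLOADS)`.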
- Rate limiting / IDS: Tune --delay; random jitter is already built in between endpoints.
- HTTPS: If you hit certificate issues, you can allow verify=False (already used for non-GET). Prefer installing proper CA bundles in production assessments.
- Virtual hosts / SNI: Always pass --host when the server expects a specific Host header different from the IP.
- Schema hygiene: Keep Pre_fuzzing.json valid. The fuzzer does some lightweight repair, but broken JSON templates will reduce coverage.
- Narrow scope rapidly: Use sink distance data to prioritize endpoints closer to danger zones.
- Artifact hygiene: Keep fuzzing artifacts (source addresses, indirect call JSON) side-by-side with binaries before launching LLMATaint.py.
- No API definitions found: Ensure Pre_fuzzing.json exists and is valid JSON. Run the pre-fuzzing scripts again.
- No parameters found for an endpoint: Make sure your request_payload string includes <String_value> placeholders for fuzzable fields.
- Request error: ... / timeouts:
  - Verify network reachability from the host to the emulated firmware.
  - Increase timeouts in send_request if needed.
  - Check that the target service actually listens on the base URL.
- Target requires auth: Add cookies/headers/tokens in send_request() or pre-set them in para_results.json. You can also add an Authorization header.
- Instrumentation not recording: Prefer using build_fuzz_img.py. If doing it manually, confirm your loader paths and (if applicable) LD_PRELOAD/QEMU plugins are active. Ensure libibresolver.so is discoverable by the emulated environment.
- LLMATaint cannot find source/indirect call files: Place both artifacts in the same directory as the target binary passed via -b.
Q: How do I add new payloads (e.g., SQLi)?
A: Extend self.payloads in APIFuzzer.__init__. Add detection logic (error signatures, timing, reflections) in analyze_response().
Q: Can I parallelize fuzzing?
A: The provided script is single-process to preserve ordering and throttling. You can shard Pre_fuzzing.json by endpoints across multiple processes/containers.
Q: Where do logs go?
A: Human logs → fuzzing_results.log; structured artifacts → detailed_results.json.
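For the sharding approach mentioned in the parallelization answer above, a round-robin split keeps the shards roughly equal in size. This is a sketch only; `shard_apis`, `write_shards`, and the `Pre_fuzzing.shard<n>.json` naming are assumptions, and each shard would still be passed to its own fuzzer.py process via --json-file.

```python
import json

def shard_apis(apis, shard_count):
    """Split the API list round-robin into shard_count lists."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    return [apis[i::shard_count] for i in range(shard_count)]

def write_shards(apis, shard_count, prefix="Pre_fuzzing.shard"):
    """Write each shard to <prefix><n>.json and return the file names."""
    names = []
    for n, shard in enumerate(shard_apis(apis, shard_count)):
        name = f"{prefix}{n}.json"
        with open(name, "w", encoding="utf-8") as fh:
            json.dump(shard, fh, indent=2)
        names.append(name)
    return names

shards = shard_apis(["/a", "/b", "/c", "/d", "/e"], 2)
```

Since every process writes its own logs, running each shard in a separate working directory avoids interleaved fuzzing_results.log files.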
Build emulation image (optional):
python build_fuzz_img.py
Run fuzzing (host):
python fuzzer.py \
--json-file Pre_fuzzing.json \
--delay 0.5 \
--host 192.168.0.1
Run taint-to-PoC agent:
python LLMATaint.py \
-b ./bin/httpd \
-p True \
-t ci \
-o ./results/ASUS \
-m R1