Skip to content

Checks your files for existence of Unicode BIDI characters which can be misused for supply chain attacks. See CVE-2021-42574

License

Notifications You must be signed in to change notification settings

maweil/bidi_char_detector

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

29 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

BIDI Character Detector

This tool checks your files for existence of Unicode BIDI characters which can be misused for supply chain attacks to mitigate CVE-2021-42574. This tool was written in Rust and is distributed as a small (< 3MB) docker compatible container to allow fast and easy usage.

For an explanation of the attack, have a look at GitHub's blog entry or the original paper where the attack was published.

Installation

This package is mostly intended to be used via it's docker container. But local installation is possible of course. Compilation requires at least Rust 1.56.0 because it uses the Rust 2021 Edition. Clone this repository and run cargo install --path . to install the binary for your current user. After that, you can invoke the bidi_detector command.

Usage

Running the tool via it's official docker container is probably the easiest way to get started. To run it via docker, the following command should work to scan all files inside your current working directory:

docker run --rm -it -v $(pwd):/data ghcr.io/maweil/bidi_char_detector:latest

Depending on your system you may have to adapt this command slightly. If you use e.g. podman and have SELinux enabled, try the following command instead:

podman run --rm -it -v $(pwd):/data:Z ghcr.io/maweil/bidi_char_detector:latest

Configuration

By default, all files will be checked. If you have binary files inside the current directory, the command will fail because it can't decode a non-UTF8 encoded file. To adapt the command to your needs, place a file called bidi_config.toml inside the root of your project. You can find an example for it in this repository, see an example below. The options will be described in more detail below the example:

[general]
includes = [ 
    "src/**/*",
    "**/*.patch",
    "**/*.json",
    "**/Dockerfile",
    "test/*.js"
]
excludes = [
    ".git/*",
    "target/*"
]

[display]
show_details = true

General Settings

This section includes two arrays (includes and excludes) where you can specify patterns of files to be scanned (or to be excluded from the scan). Please make sure your patterns actually match the files inside the directory, not the directory name itself, otherwise your files will not be scanned. If you want to scan all files and only exclude e.g. your .git directory, the following configuration would do the trick:

[general]
includes = [ 
    "**/*"
]
excludes = [
    ".git/*",
]

[display]
show_details = true

If you want to intead explicitly define which folder contains your source files, the following configuration example would scan all files in the src directory (without ignoring anything):

[general]
includes = [ 
    "src/**/*"
]
excludes = [
]

[display]
show_details = true

Display Settings

The following settings are available in this section:

Setting Description Optional Default Introduced in
show_details Decides whether to print out line/pos and type of the detected BIDI character if found (see examples below) No - v0.1.1
ignore_invalid_data Whether to print an error message when binary/non-UTF8 files are detected Yes true v0.1.2
verbose Whether to print the filename even if no suspicious characters were found Yes true v0.1.2

Example: show_details = true, verbose = true

src/lib.rs - 0 BIDI characters
src/main.rs - 0 BIDI characters
test/example-commenting-out.js - 6 BIDI characters
Found character RLO (Right-to-Left Override), test/example-commenting-out.js:4:3
Found character LRI (Left-to-Right Isolate), test/example-commenting-out.js:4:7
Found character PDI (Pop Directional Isolate), test/example-commenting-out.js:4:20
Found character LRI (Left-to-Right Isolate), test/example-commenting-out.js:4:22
Found character RLO (Right-to-Left Override), test/example-commenting-out.js:6:20
Found character LRI (Left-to-Right Isolate), test/example-commenting-out.js:6:24
Found 6 potentially dangerous Unicode BIDI characters!

Example: show_details = false, verbose = true

src/lib.rs - 0 BIDI characters
src/main.rs - 0 BIDI characters
test/example-commenting-out.js - 6 BIDI characters
Found 6 potentially dangerous Unicode BIDI characters!

Example: show_details = false, verbose = false

test/example-commenting-out.js - 6 BIDI characters
Found 6 potentially dangerous Unicode BIDI characters!

Credits

All credits for detecting the attack including the list of relevant BIDI characters go to the original authors of the corresponding paper. Please cite their original paper when building on their work.

The file test/example-commenting-out.js in this repository is a copy of commenting-out.js in their original repository. It's licensing follows the original repository (MIT License) It is used for test purposes only here.

@article{boucher_trojansource_2021,
    title = {Trojan {Source}: {Invisible} {Vulnerabilities}},
    author = {Nicholas Boucher and Ross Anderson},
    year = {2021},
    journal = {Preprint},
    eprint = {2111.00169},
    archivePrefix = {arXiv},
    primaryClass = {cs.CR},
    url = {https://arxiv.org/abs/2111.00169}
}