- Introduction
- Features
- Motivation
- Basic PDF Syntax
- Attack
- Defense
- Usage
- References & Related News
- Contributions
PDF Shield is a Python-based tool designed to detect and mitigate potential Denial of Service (DoS) attacks and embedded JavaScript threats within PDF files. By analyzing PDF structures, it helps users identify malicious content that could compromise system security.
- Automated PDF Monitoring: Real-time scan of downloaded PDFs for potential DoS or malicious JavaScript.
- Drag & Drop: Standalone executable supports drag-and-drop scanning on Windows.
- Customizable Alerts: Pop-up notifications inform you of embedded JS, infinite loops, and deflate bombs.
- Extensible: Easily add new detection rules via a modular plugin architecture.
- User-Friendly Interface: Simple command-line and GUI interfaces for ease of use.
- Cross-Browser Defense: Focused on PDF engines in Chrome, Edge, and Brave (PDFium-based). However, our detection methods also cover the most common risks in PDF.js (Firefox).
PDF attacks, particularly as a vector for social engineering, have become significantly more common. Attackers exploit the flexibility of the PDF format, often using embedded JavaScript to target unsuspecting users. Despite attempts to fix individual vulnerabilities, the built-in PDF reader engines in modern browsers remain vulnerable. To mitigate these risks, PDF Shield aims to reduce the number of victims by alerting users to potential DoS attack methods found within PDF files.
A good introduction to PDF syntax is the video [TROOPERS15] Ange Albertini, Kurt Pfeifle - Advanced PDF Tricks.
The key points from the video that help explain what this project is about are:
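- A PDF's body section consists of objects, each beginning with `<number> <generation> obj` and ending with `endobj`.
- Objects refer to one another by indirect reference, written `<number> <generation> R`. For example, `29 0 R` points to the object declared as `29 0 obj`.

Note: Name objects begin with a forward slash (`/`), and any character in the name can be written in hexadecimal notation (`#` followed by two hex digits), so `/J#61vaScript` is the same name as `/JavaScript`.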
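
The object syntax and hex-escaped names described above can be explored with a few regular expressions. The following is only a rough sketch: the helper names (`decode_name`, `object_ids`, `references`) are made up for illustration and are not part of PDF Shield, and scanning raw bytes this way will also match text that happens to appear inside strings or streams.

```python
import re

# "#xx" inside a name stands for the byte with that hexadecimal value.
_HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")
# "<number> <generation> obj" opens an object.
_OBJ_HEADER = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")
# "<number> <generation> R" is an indirect reference to an object.
_REFERENCE = re.compile(rb"(\d+)\s+(\d+)\s+R\b")


def decode_name(raw_name: bytes) -> bytes:
    """Replace every #xx escape in a PDF name with the byte it encodes."""
    return _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw_name)


def object_ids(data: bytes) -> list:
    """Return (number, generation) for every object header found in data."""
    return [(int(num), int(gen)) for num, gen in _OBJ_HEADER.findall(data)]


def references(data: bytes) -> list:
    """Return (number, generation) for every indirect reference found in data."""
    return [(int(num), int(gen)) for num, gen in _REFERENCE.findall(data)]
```

For example, `decode_name(b"/J#61vaScript")` yields `b"/JavaScript"`, which is why a scanner should normalize names before comparing them.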
The information above may not be enough to make sense of the `embedded.pdf` file (generated by following the steps in the "Basic Attack Method" section) once you open it in a text editor such as VSCode.

You will probably see two occurrences of `/JavaScript` in the PDF document:

- The first occurrence sits inside an object like the one below and determines when the `/JavaScript` object will be run.

  ```
  ...
  3 0 obj
  << /Type /Catalog /Pages 1 0 R /Names << /JavaScript << /Names [ (41d4efc4\055d000\05546e4\055a973\0556f92c8bbd0f7) 29 0 R ] >> >> >>
  endobj
  ...
  ```

- The second occurrence appears just before the `xref` section and instructs the PDF to execute the subsequent text as JavaScript code. Here's a more detailed explanation of what it accomplishes.

  ```
  ...
  29 0 obj
  << /Type /Action /S /JavaScript /JS (\012\040\040\040\040app\056alert\050\042Hello\054\040World\041\042\051\073\012) >>
  endobj
  xref <-- This is the beginning of xref part
  ...
  ```
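
The `/JS` string in the second object is easier to read once its backslash escapes are decoded. The sketch below is a minimal illustration under stated assumptions: `decode_literal_string` is a hypothetical helper, not part of PDF Shield, and it handles only octal (`\ddd`) and single-character escapes, not hex strings or line continuations.

```python
import re

# A backslash followed by one to three octal digits, or by a single escape character.
_ESCAPE = re.compile(rb"\\([0-7]{1,3}|[nrtbf()\\])")
_SIMPLE = {
    b"n": b"\n", b"r": b"\r", b"t": b"\t", b"b": b"\b", b"f": b"\f",
    b"(": b"(", b")": b")", b"\\": b"\\",
}


def decode_literal_string(raw: bytes) -> str:
    """Decode the escapes inside a PDF literal string (the text between the parentheses)."""
    def replace(match):
        token = match.group(1)
        if token in _SIMPLE:
            return _SIMPLE[token]
        # Octal escape: keep only the low byte, since a string element is one byte.
        return bytes([int(token, 8) & 0xFF])

    return _ESCAPE.sub(replace, raw).decode("latin-1")


raw_js = rb"\012\040\040\040\040app\056alert\050\042Hello\054\040World\041\042\051\073\012"
print(decode_literal_string(raw_js))
```

Run on the `/JS` value shown above, this prints the `app.alert("Hello, World!");` call surrounded by the original whitespace.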
- Victim opens malicious PDF document
- Bad things happen (attack-dependent)
- No user interaction required
Take an embedded-JavaScript attack as an example:
- Run `pip install PyPDF2` in the terminal.
- Next, use the `.add_js()` method of the `PyPDF2` library to create a Python script:

  ```python
  import PyPDF2


  def embed_javascript(pdf_file, js_code):
      """Copy every page of pdf_file into a new PDF and attach js_code to it."""
      pdf_reader = PyPDF2.PdfReader(pdf_file)
      pdf_writer = PyPDF2.PdfWriter()

      for page in pdf_reader.pages:
          pdf_writer.add_page(page)

      # The script runs automatically when the document is opened.
      pdf_writer.add_js(js_code)

      with open('embedded.pdf', "wb") as f:
          pdf_writer.write(f)


  # An endless loop of alert windows: a simple DoS payload.
  javascript_code = '''
  while(1){
      app.alert("Hello, World!");
  }
  '''

  pdf_file_path = 'blank.pdf'  # Input PDF; change this to your own file.
  with open(pdf_file_path, 'rb') as pdf_file:
      embed_javascript(pdf_file, javascript_code)
  ```

- Run the Python script you just created. Don't forget to update the input file name (`pdf_file_path`) accordingly!
- Open the `embedded.pdf` file in the listed web browsers to verify that they show an alert window, which confirms that the JavaScript embedded in the PDF was executed.
- The user downloads a potentially malicious PDF.
- The tool conducts an automated scan on the downloaded PDF, presenting the results through a user-friendly pop-up window.
- The user is empowered to make informed decisions, with options to either eliminate identified vulnerabilities within the PDF or proceed with opening it.
- Note: The following tables list CVE information related to PDFium, because our project focuses on building a defense tool for current web browsers that use PDFium, such as Chrome, Brave, and Edge. Similar issues may also affect other PDF engines. Examples include CVE-2023-41257 (Foxit Reader 12.1.2.15356), CVE-2023-38573 (Foxit Reader 12.1.2.15356), and CVE-2022-39016 (PDFtron in M-Files Hubshare before 3.3.10.9).
Description | Defense Method | Related CVEs or Papers |
---|---|---|
JS runs stored XSS payload | Notify the user that JS is embedded in the PDF | CVE-2023-45207 |
Remote attackers use JS to cause DoS | Notify the user that JS is embedded in the PDF | CVE-2012-2844 |
Execute arbitrary JavaScript code with chrome privileges | Notify the user that JS is embedded in the PDF | CVE-2013-5598 |
XSS created by injected JS | Notify the user that JS is embedded in the PDF | CVE-2007-0045 |
Infinite loops caused by JavaScript | Notify the user that JS is embedded in the PDF | CVE-2007-0104 |
Sharing of objects over calls into JavaScript runtime | Notify the user that JS is embedded in the PDF | CVE-2019-5772 |
Form modification caused by JavaScript | Notify the user that JS is embedded in the PDF | Shadow Attacks: Hiding and Replacing Content in Signed PDFs |
- This project alerts users whenever it finds JavaScript code, for two reasons. First, many attacks rely on JavaScript, according to Spider Experts. Second, a responsive PDF does not need JavaScript: built-in named objects already support interactive actions. JavaScript is only required for features that cannot be built any other way, such as detecting keystrokes or playing videos without using YouTube or other online services.
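
As a rough illustration of the "alert on any JavaScript" rule, the sketch below looks for the `/JS` and `/JavaScript` names in the raw bytes of a file and normalizes `#xx` hex escapes first, so that `/J#61vaScript` is still caught. The function name `javascript_markers` is an assumption made for this example and is not PDF Shield's actual API. Names hidden inside compressed object streams would not be found this way.

```python
import re

# A name is "/" followed by characters that are neither whitespace nor delimiters.
_NAME = re.compile(rb"/[^\x00\t\n\f\r /<>\[\](){}%]+")
_HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")
_JS_NAMES = {b"/JS", b"/JavaScript"}


def javascript_markers(data: bytes) -> set:
    """Return the JavaScript-related names that occur in the raw PDF bytes."""
    found = set()
    for raw_name in _NAME.findall(data):
        # Undo #xx escapes so obfuscated spellings compare equal to the plain name.
        name = _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw_name)
        if name in _JS_NAMES:
            found.add(name.decode("ascii"))
    return found
```

A caller could read a downloaded file with `open(path, "rb").read()` and raise an alert whenever the returned set is not empty.
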
Description | Defense Method | Related CVEs |
---|---|---|
Loop caused by the named object "/Kids" | Notify the user that the PDF contains an infinite loop | CVE-2007-0104 |
Action loop caused by "/Next" | Notify the user that the PDF contains an infinite loop | CVE-2007-0104 |
Object streams may extend other "/ObjStms" | Notify the user that the PDF contains an infinite loop | CVE-2007-0104 |
Outline entries ("/Outlines") can refer to each other | Notify the user that the PDF contains an infinite loop | CVE-2007-0104 |
Incorrect object lifecycle | Notify the user that the PDF contains an infinite loop | CVE-2018-18336 |
Incorrect object lifecycle | Notify the user that the PDF contains an infinite loop | CVE-2018-17481 |
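
The loops in this table all come down to objects that reference each other in a cycle. One way to check for that, assuming the references reachable through keys such as `/Kids` or `/Next` have already been collected into a dictionary from object number to referenced object numbers, is a depth-first search. The `graph` structure and the function name are hypothetical and only serve this sketch.

```python
from typing import Optional


def find_reference_cycle(graph: dict) -> Optional[list]:
    """Return one cycle as a list of object numbers, or None if there is none."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    for start in graph:
        if state.get(start, UNSEEN) != UNSEEN:
            continue
        state[start] = IN_PROGRESS
        path = [start]  # objects on the current search path
        stack = [(start, iter(graph.get(start, [])))]
        while stack:
            node, children = stack[-1]
            for child in children:
                child_state = state.get(child, UNSEEN)
                if child_state == IN_PROGRESS:
                    # The child is already on the path, so the path loops back to it.
                    return path[path.index(child):] + [child]
                if child_state == UNSEEN:
                    state[child] = IN_PROGRESS
                    path.append(child)
                    stack.append((child, iter(graph.get(child, []))))
                    break
            else:
                # Every child has been explored without finding a cycle.
                state[node] = DONE
                stack.pop()
                path.pop()
    return None
```

The search is iterative rather than recursive so that a deliberately deep object chain cannot exhaust Python's recursion limit.
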
Description | Defense Method | Related CVEs |
---|---|---|
Heap buffer overflow | Notify the user that the PDF may contain a deflate bomb | CVE-2020-6513 |
PDFium does not properly handle certain out-of-memory conditions | Notify the user that the PDF may contain a deflate bomb | CVE-2015-1271 |
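
A deflate bomb is a small compressed stream that inflates to an enormous size. A scanner can test for one without ever holding the full output in memory by decompressing in bounded chunks and stopping once a size limit is crossed. The sketch below uses Python's standard `zlib` module. The function name and the `limit` threshold are assumptions for illustration, and malformed input will raise `zlib.error`, which a caller would need to handle.

```python
import zlib


def inflates_beyond(compressed: bytes, limit: int, chunk_size: int = 64 * 1024) -> bool:
    """Return True if the zlib stream inflates to more than `limit` bytes."""
    decompressor = zlib.decompressobj()
    produced = 0
    pending = compressed
    while not decompressor.eof:
        # max_length caps the output of each call at chunk_size bytes.
        chunk = decompressor.decompress(pending, chunk_size)
        produced += len(chunk)
        if produced > limit:
            return True
        # Input that was not processed because of the cap is kept here.
        pending = decompressor.unconsumed_tail
        if not chunk and not pending:
            break  # truncated stream: nothing left to inflate
    return False
```

For instance, `inflates_beyond(zlib.compress(b"\0" * 10_000_000), 1_000_000)` returns `True` after inflating only about one megabyte of the ten.
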
- `git clone` this repository, and don't forget to run `pip install -r requirements.txt`.
- Execute the `main.py` file.
- Now download a PDF file.
- Sit back, relax, and wait for the scanning process to complete.
- Download the `PDF Shield` zipped file located in the `output` directory.
- Unzip it on your device.
- Locate `PDF Shield.exe` in the unzipped folder and right-click it to `Create a Shortcut` on your Desktop.
- Drag and drop the PDF you want to scan onto the icon.
- Sit back, relax, and wait for the scanning process to complete.
- Download the `PDF Shield` zipped file located in the `output` directory.
- Unzip it on your device.
- Double-click `PDF Shield.exe` in the unzipped folder to start the scanning program.
- Now download a PDF file.
- Sit back, relax, and wait for the scanning process to complete.
- PDF101
- [TROOPERS15] Ange Albertini, Kurt Pfeifle - Advanced PDF Tricks
- Artifacts for "Portable Document Flaws 101" at Black Hat USA 2020
- CVE searching results for "PDF"
- Malicious PDFs | Revealing the Techniques Behind the Attacks
- Common Tactics Used by Threat Actors to Weaponize PDFs
- How can I extract a JavaScript from a PDF file with a command line tool?
- Threat-Loaded: Malicious PDFs Never Go Out of Style
- One of the easiest and most powerful ways to customize PDF files is by using JavaScript.
- Adobe Reader and Acrobat JavaScript Vulnerabilities
- Hackers Use Weaponized PDF Files to Attack Manufacturing, and Healthcare Organizations
- 66% of malware delivered via PDF files in malicious emails: Report
- How to protect yourself from the Adobe Reader PDF JavaScript Vulnerability
- PDFium
- Portable-Document-Flaws-101
Contributions to PDF Shield are welcome. Whether it's bug fixes, feature enhancements, or other improvements, feel free to contribute to make the tool more effective in protecting users from PDF-based DoS attacks.
Stay secure, and happy browsing!