mbeardwell/arm-fp-emu

Download the report 📄

Faster Dynamically Instrumented Programs

A look at floating-point emulation in ARM Linux

Overview

This is the final submission of my third-year university project at King's College London.

Abstract

This project borrows existing dynamic program instrumentation techniques to propose a method of emulating floating-point instructions on Unix-like operating systems that is faster than the emulation provided by the kernel. The proposed method replaces floating-point instructions with branches that lead, indirectly, to emulation code resident in the same process's memory. This avoids some switches into kernel code to run the kernel's floating-point instruction emulator, which in theory reduces the overhead of every instruction emulated.
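
To make the idea of "replacing an instruction with a branch" concrete, here is a minimal sketch of how the 32-bit word for an unconditional branch could be computed. It is not taken from the project's source: it assumes A32 (ARM-state) code rather than Thumb, and the function name `encode_arm_branch` is hypothetical. The project itself may produce these words differently, for example by assembling them with Keystone.

```python
def encode_arm_branch(from_addr: int, to_addr: int, cond: int = 0xE) -> int:
    """Return the A32 encoding of a B instruction at from_addr targeting to_addr.

    cond=0xE is the "always" condition. Illustrative only: assumes ARM state.
    """
    if from_addr % 4 or to_addr % 4:
        raise ValueError("A32 instructions are 4-byte aligned")
    # In ARM state, the PC reads as the address of the current instruction + 8.
    offset = to_addr - (from_addr + 8)
    # imm24 holds a signed word offset, so one branch reaches roughly +/-32 MiB.
    if not -(1 << 25) <= offset < (1 << 25):
        raise ValueError("target out of range for a single B instruction")
    imm24 = (offset >> 2) & 0xFFFFFF
    # Layout: cond[31:28] | 101[27:25] | L=0[24] | imm24[23:0]
    return (cond << 28) | (0b101 << 25) | imm24
```

A branch can only reach about 32 MiB in either direction, which is one reason patched instructions commonly jump to a nearby trampoline instead of straight to the emulation code.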

Introduction

This tool is designed to be built and run on the Debian armel port on an ARMv7(-A) CPU. It requires the 'librunt' and 'keystone' Git repositories, as well as other dependencies.

Installation

To set up the project, clone the repository:

git clone https://github.com/mbeardwell/arm-fp-emu.git
cd arm-fp-emu

Build instructions

  1. Install dependencies
git submodule update --init --recursive
bash install-dependencies.sh

Note: The script automates installation but does not guarantee robustness or safety.

  2. Build
make librunt
make keystone
make arm-fp-emu
make build-tests

The full build can take hours, which is why each target is built separately: if one of them fails, you can restart from that target and lose less progress.

Testing

To run tests:

make test

This runs a command of the form:

LD_LIBRARY_PATH=[...]/lib LD_PRELOAD=./build/arm-fp-emu.so ./tests/build/vadd10 10

LD_PRELOAD loads the emulator library into the test program's process. LD_LIBRARY_PATH is needed because of the way I'm using Keystone; Keystone could be integrated in a way that makes this environment variable unnecessary.

Video Walkthrough

This video uses the GNU Debugger (GDB) to give a step-by-step demonstration of how my floating-point emulation program works. It walks through a test binary, showing how floating-point instructions are replaced with branches to trampoline code, which redirects execution to an emulation routine.
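
As a rough illustration of what an emulation routine has to do once execution reaches it, the sketch below models a single-precision add (the operation performed by `VADD.F32`) on a software copy of the S registers. It is a simplified model, not the project's code: the names `emulate_vadd_f32`, `bits_to_float` and `float_to_bits` are hypothetical, and it ignores FPSCR flags, rounding modes other than round-to-nearest, flush-to-zero and NaN payload rules.

```python
import math
import struct

def bits_to_float(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]

def float_to_bits(value: float) -> int:
    """Round a Python float to single precision and return its bit pattern."""
    try:
        packed = struct.pack("<f", value)
    except OverflowError:
        # Too large for single precision: round-to-nearest yields infinity.
        packed = struct.pack("<f", math.copysign(math.inf, value))
    return struct.unpack("<I", packed)[0]

def emulate_vadd_f32(regs: list, d: int, n: int, m: int) -> None:
    """Model Sd = Sn + Sm, where regs holds 32 single-precision bit patterns."""
    # The sum is formed in double precision and rounded once to single
    # precision when it is written back to the register file.
    total = bits_to_float(regs[n]) + bits_to_float(regs[m])
    regs[d] = float_to_bits(total)
```

For example, with `regs[1]` holding 1.0 (`0x3F800000`) and `regs[2]` holding 2.0 (`0x40000000`), calling `emulate_vadd_f32(regs, 0, 1, 2)` leaves 3.0 (`0x40400000`) in `regs[0]`.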

What’s Covered in the Video?

  • Debugging the program using GDB.
  • How floating-point instructions are replaced with branch instructions.
  • The role of trampoline code in redirecting execution.
  • How the emulation routine processes floating-point operations.
  • Why this method is eventually faster than kernel-based instruction emulation.
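
One way to read "eventually faster" in the last point is as a break-even argument: patching the program has a one-time cost, after which every emulated instruction is cheaper than a trap into the kernel. The sketch below expresses that reading as a small cost model. The model and the name `break_even_count` are assumptions made for illustration; they are not taken from the report, and no measured costs are implied.

```python
def break_even_count(setup_cost: float, kernel_cost: float, user_cost: float) -> int:
    """Smallest number of emulated instructions for which the in-process
    approach is strictly cheaper in total than kernel emulation.

    Assumed model (illustrative only):
      kernel total     = n * kernel_cost
      in-process total = setup_cost + n * user_cost
    All costs are in the same arbitrary unit, such as cycles.
    """
    gain = kernel_cost - user_cost
    if gain <= 0:
        raise ValueError("in-process emulation must be cheaper per instruction")
    # We need n * gain > setup_cost, so n is the first integer past the ratio.
    return int(setup_cost // gain) + 1
```

Under this model, a program that emulates only a handful of floating-point instructions may not recover the patching cost, while a floating-point-heavy loop recovers it quickly.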

Watch the project demonstration here:
Watch on YouTube

About

Final-year BSc project on CPU performance
