This repository has been archived by the owner on Nov 5, 2020. It is now read-only.

Roadmap

Robin Sommer edited this page Feb 23, 2017 · 14 revisions

HILTI/Spicy Development Roadmap

The following collection contains a development roadmap for work on the HILTI/Spicy toolchain. There are a number of items that we will need to finish for an initial production release. In addition, there are many more in the “nice-to-have” category, including many potential performance enhancements. This is a “living document”; it will be updated as work progresses and new items show up.

Given the limited resources currently available for moving HILTI/Spicy forward, any help or support is appreciated. If you would like to work on any of this, please make yourself known on the HILTI mailing list. The mailing list is also generally the best place to ask any questions you may have about design & implementation.

Note: The list is not comprehensive; it’s compiled from memory and older notes. We will very likely identify more pieces required for an initial production release than are currently listed.

Legend:

  • (!) Blocker for initial production release of Spicy.
  • (WIP) Work on this has started. Ask the mailing list to learn more.

The items are not further sorted.

Installation, Toolchain, and Platform Support

  • (!) Add make install support for installing the right pieces system-wide, including header files for compiling against the libraries
    • Ideally, this includes cleaning up header files first; see the task in "Code Cleanup" below
  • (!) Port to FreeBSD, and generally also 32-bit platforms.
    • Right now only 64-bit Linux and Darwin are supported
  • Potentially port to other platforms as well.
    • I’ve heard about an ARM port already …
  • Investigate alternative libraries for generating ABI-specific function calls.
    • Currently we abuse libffi, and support only exactly the cases we need.
    • llvm-abi looks promising.

Overall Code Cleanup

  • Some parts of the code base aren’t well documented; extend the documentation there.
  • In hilti/*, as<>/tryCast<> should be checkedCast<>
  • Clean up the include files. Right now it's quite a mess what's internal vs. public, and what host applications need to include for (1) linking against libhilti and (2) JITing code.

Stability & Robustness

  • (!) Validate the generated LLVM/machine code for irregular behavior.
    • Manual inspection is probably infeasible in general, but we can run the generated code through static and dynamic validation tools.
  • (!) Review HILTI and Spicy runtimes for obvious coding errors.
  • (!) Add comprehensive & consistent error checking to runtime library functions.
  • (!) Evaluate Spicy performance on live traffic.
    • Neither HILTI nor Spicy has been exercised with live traffic yet.
    • Needs particular attention to potential processing spikes that can lead to packet drops.
  • (!) Develop process/support for profiling execution times consistently.
    • Necessary for pretty much any further development to understand impact of improvements and catch performance regressions.
  • (!) Set up continuous integration testing and ensure all tests pass on all supported platforms.
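
As a rough illustration of what consistent profiling support could look like, here is a minimal Python sketch. It is not part of HILTI or Spicy; the helper names `measure` and `is_regression` and the 10% tolerance are assumptions made for this example. The idea is to time the same workload several times, summarize the samples, and compare against a stored baseline.

```python
import statistics
import time
from typing import Callable, Dict


def measure(func: Callable[[], object], runs: int = 5) -> Dict[str, float]:
    """Time func() several times and summarize the wall-clock durations."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "median_s": statistics.median(samples),
        "min_s": min(samples),
    }


def is_regression(current_s: float, baseline_s: float, tolerance: float = 0.10) -> bool:
    """Flag a run that is more than `tolerance` slower than the baseline."""
    return current_s > baseline_s * (1.0 + tolerance)


# Hypothetical workload standing in for "parse this trace with Spicy".
result = measure(lambda: sum(range(100_000)), runs=5)
```

Using the median rather than a single run reduces the influence of one-off spikes, which matters when the goal is to catch regressions rather than noise.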

HILTI

Compiler

  • Revisit the AST design.
    • It’s overall rather messy, and also seems generally slow to operate on.
  • Improve performance of the various passes; compilation seems really slow currently.

Capabilities

  • Add support for “watch points” that trigger code when a variable changes its value.
  • Add lambda functions that capture the current stack.
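
To make the watch-point idea concrete, here is a small Python sketch of the intended behavior. This is not HILTI code; the `Watched` class and its callback signature are hypothetical and only illustrate "run code when a variable changes its value".

```python
from typing import Any, Callable, List, Tuple


class Watched:
    """Holds a value and runs registered callbacks whenever it changes."""

    def __init__(self, value: Any) -> None:
        self._value = value
        self._callbacks: List[Callable[[Any, Any], None]] = []

    def watch(self, callback: Callable[[Any, Any], None]) -> None:
        """Register a callback receiving (old_value, new_value)."""
        self._callbacks.append(callback)

    @property
    def value(self) -> Any:
        return self._value

    @value.setter
    def value(self, new: Any) -> None:
        old = self._value
        self._value = new
        # Only fire when the value actually changed, not on every store.
        if new != old:
            for callback in self._callbacks:
                callback(old, new)


changes: List[Tuple[Any, Any]] = []
counter = Watched(0)
counter.watch(lambda old, new: changes.append((old, new)))
counter.value = 1  # fires the watch point
counter.value = 1  # same value, does not fire
counter.value = 5  # fires again
```

After these assignments, `changes` holds `[(0, 1), (1, 5)]`. Whether a store of an unchanged value should trigger is a design choice; the sketch assumes it should not.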

Code Generation

  • (!) Emit debug information for gdb/lldb.
  • Emit lifetime markers for stack objects; that should help the LLVM optimizer quite a bit.
    • In fact, this means we also should not try to compress allocas in the HILTI code generator itself; it seems that LLVM can actually perform better if unrelated allocas remain separate (e.g., for alias analysis).
  • Improve reference counting performance through LLVM stack maps.
  • Emit alias information for LLVM.
    • See LLVM Alias Analysis Infrastructure.
    • This will probably mean annotating each HILTI instruction, and all functions, with alias information on parameters and return values.
  • Generally annotate LLVM bitcode with attributes that LLVM’s optimizer can exploit.
  • (!) Fix caching of compiled code to speed up startup times. Some functionality is there but, iirc, disabled right now because it doesn’t work quite right. There’s a pull request reenabling it, but I believe more needs to be done.
  • Explore newer LLVM functionality to see what we can leverage.
    • The code remains mostly at the state of LLVM 3.4.

HILTI-level Optimizations

  • (!) Standard code optimization & analysis passes:
    • Constant folding & propagation
    • Reuse of locals
    • Reuse of constants
    • Dead code removal
    • Common subexpression removal
    • Note: I’m marking these as (!) because without such code cleanup, autogenerated HILTI code quickly becomes pretty bad and unreadable. Some of this might be taken care of at the LLVM level already, but that’s (1) unclear and hard to predict, (2) less powerful because it lacks semantics, and (3) doesn’t help with readability of generated code.
  • Alias & escape analysis for further optimizations
    • Will probably need SSA transformation.
  • Hoist heap types to the stack automatically where possible.
  • Hoist heap types into structs where possible.
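
Two of the standard passes listed above, constant folding & propagation and dead code removal, can be sketched on a toy straight-line IR. The sketch is in Python and makes several assumptions: the tuple-based instruction format, the `const`/`add`/`mul` operations, and the function names are invented for illustration and do not reflect HILTI's actual AST or instruction set.

```python
from typing import Dict, Iterable, List, Tuple, Union

Operand = Union[int, str]                  # integer constant or name of a local
Instr = Tuple[str, str, Operand, Operand]  # (dest, op, lhs, rhs)

OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def fold_constants(code: List[Instr]) -> List[Instr]:
    """Propagate known constants and fold operations on two constants."""
    known: Dict[str, int] = {}
    out: List[Instr] = []
    for dest, op, lhs, rhs in code:
        # Propagation: replace locals whose constant value is known.
        if isinstance(lhs, str):
            lhs = known.get(lhs, lhs)
        if isinstance(rhs, str):
            rhs = known.get(rhs, rhs)
        # Folding: evaluate the operation at compile time.
        if op in OPS and isinstance(lhs, int) and isinstance(rhs, int):
            op, lhs, rhs = "const", OPS[op](lhs, rhs), 0
        if op == "const":
            known[dest] = lhs
        else:
            known.pop(dest, None)  # dest no longer holds a known constant
        out.append((dest, op, lhs, rhs))
    return out


def remove_dead(code: List[Instr], live_out: Iterable[str]) -> List[Instr]:
    """Drop instructions whose result is never used (backward liveness scan)."""
    live = set(live_out)
    kept: List[Instr] = []
    for dest, op, lhs, rhs in reversed(code):
        if dest in live:
            live.discard(dest)
            live.update(x for x in (lhs, rhs) if isinstance(x, str))
            kept.append((dest, op, lhs, rhs))
    kept.reverse()
    return kept


program: List[Instr] = [
    ("a", "const", 2, 0),
    ("b", "const", 3, 0),
    ("c", "mul", "a", "b"),
    ("d", "add", "c", "n"),  # "n" is an input that is unknown at compile time
    ("e", "add", "a", 1),    # result is never used
]
optimized = remove_dead(fold_constants(program), live_out=["d"])
```

For this input, `optimized` is the single instruction `("d", "add", 6, "n")`, which also shows the readability argument: the generated code shrinks from five instructions to one.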

Linker

  • Clean up the linker code.

    • The linker code is generally pretty messy, and it’s doing a lot of the linking work “manually” through llvm::Linker. Investigate porting the linker to RuntimeDyld or the new lld (I’ve lost track of what the right component for dynamic/custom linking is). Ideally we’ll be left with just our custom logic implementing HILTI’s cross-module semantics on top of LLVM’s standard linker.
  • Link-time optimizations:

    • Identify and remove unused struct fields.
    • Identify and remove unused globals.
    • Identify and replace constant function arguments.
    • Support HILTI-level optimizations by aggregating meta information across modules.
  • Infrastructure:

    • Identify all used symbols.
    • Identify all exported symbols, and their types, for optimizations.
    • Create a meta-data framework for communicating knowledge between HILTI linker and compiler.
    • Engage the code generator to emit meta data about usage that the linker can then use.
  • Notes:

    • Could LLVM's new ThinLTO help us collect cross-module information through its "summary information"?
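
The "identify all used symbols" and "remove unused globals" items amount to a reachability computation over cross-module references. A minimal Python sketch follows, assuming the linker already has a map from each symbol to the symbols it references; the map, the symbol names, and the function names are all hypothetical.

```python
from typing import Dict, Iterable, Set


def reachable_symbols(references: Dict[str, Set[str]], exported: Iterable[str]) -> Set[str]:
    """Return every symbol reachable from the exported entry points."""
    seen: Set[str] = set()
    stack = list(exported)
    while stack:
        symbol = stack.pop()
        if symbol in seen:
            continue
        seen.add(symbol)
        stack.extend(references.get(symbol, ()))
    return seen


def unused_globals(references: Dict[str, Set[str]], exported: Iterable[str]) -> Set[str]:
    """Symbols that are defined but never reached from an exported symbol."""
    return set(references) - reachable_symbols(references, exported)


# Hypothetical usage information aggregated across two modules.
references = {
    "Main::run": {"Util::parse", "Main::counter"},
    "Main::counter": set(),
    "Util::parse": {"Util::buffer"},
    "Util::buffer": set(),
    "Util::debug_dump": {"Util::buffer"},
    "Util::scratch": set(),
}
removable = unused_globals(references, exported=["Main::run"])
```

Here `removable` contains `Util::debug_dump` and `Util::scratch`. Note that `Util::debug_dump` is removable even though it references a live symbol, because nothing reachable refers to it.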

JIT

  • (!) Fix caching of compiled code to speed up startup times (see above)

Libhilti

  • Investigate using LLVM’s proposed co-routine support for implementing fibers, once that lands (see this thread).

Spicy

Language

  • Add automatic expiration to containers.
  • Add timers for scheduling code.
  • Extend bytes types with indexing/slicing operators for byte-level manipulations. Consider Python-style semantics. (Example use case: calculating checksums)
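
For reference, this is what Python-style indexing and slicing look like in the checksum use case. The example is plain Python, not Spicy; it computes the 16-bit ones' complement Internet checksum (RFC 1071) over a sample IPv4 header, and the function name and sample bytes are chosen for this illustration only.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        # Slicing yields a 2-byte bytes object; read it as big-endian.
        total += int.from_bytes(data[i:i + 2], "big")
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Sample IPv4 header with the checksum field (bytes 10..11) set to zero.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")

version = header[0] >> 4  # indexing a bytes value yields an integer
checksum = internet_checksum(header)

# Splice the checksum into the header using slices.
filled = header[:10] + checksum.to_bytes(2, "big") + header[12:]
```

With this header, `checksum` is `0xB861`, and running `internet_checksum(filled)` over the completed header yields 0, which is how a receiver verifies it.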

Compiler

  • (!) Add comprehensive error checking for input files.
    • Currently, the Spicy compiler accepts a lot of malformed input, and then bails out only much later with obscure errors/crashes
  • Revisit the AST design.
    • It’s overall rather messy, and also seems generally slow to operate on.

Optimization

  • The compiler generally generates quite inefficient code. In particular, it often includes code for functionality that a parser doesn’t actually need. That code should be skipped, ideally through HILTI-level optimizations, potentially with annotations from the Spicy compiler.
    • Actually, there's a manual version of this right now, which is also a limitation we need to remove: some advanced unit features, including filters and sinks, are available only if the type is exported. Availability of features shouldn't be linked with exporting. (And if one forgets the export for such a type, the error is pretty cryptic.)

Bro interface

  • The script compiler does not yet support the when statement.
  • Generally, reconsider how to integrate compiled scripts into Bro.

justrx (HILTI’s custom regex library)

  • (!) Assess alternatives that we could swap in for the custom library.
    • At the time this was written, there was no existing library with the right feature set. That may or may not have changed now. (Note that most likely we will want to stay with a DFA-based library.)
  • If we stay with justrx:
    • Clean up the code base (API, documentation, missing features).
    • Add JIT compilation
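
One general property of DFA-based matching is that each input byte is processed exactly once and only the current state has to be kept between chunks of input. The Python sketch below illustrates that with a hand-written DFA for the pattern `ab*c`. It does not use or mirror the justrx API; the transition table and the `DfaMatcher` class are assumptions for this example.

```python
from typing import Dict, Optional, Tuple

# Hand-written DFA for the pattern  ab*c  (state 0 is the start state).
TRANSITIONS: Dict[Tuple[int, int], int] = {
    (0, ord("a")): 1,
    (1, ord("b")): 1,
    (1, ord("c")): 2,
}
ACCEPTING = {2}


class DfaMatcher:
    """Matches the whole input against the DFA, fed one chunk at a time."""

    def __init__(self) -> None:
        self.state: Optional[int] = 0  # None means "dead": no match possible

    def feed(self, chunk: bytes) -> None:
        for byte in chunk:
            if self.state is None:
                return
            # A missing table entry moves the matcher into the dead state.
            self.state = TRANSITIONS.get((self.state, byte))

    def matched(self) -> bool:
        return self.state in ACCEPTING


# Input arriving in two chunks, as it might from a network stream.
matcher = DfaMatcher()
matcher.feed(b"ab")
matcher.feed(b"bbc")
ok = matcher.matched()

other = DfaMatcher()
other.feed(b"abx")
failed = not other.matched()
```

Both `ok` and `failed` are true here. A real library would compile the table from the regular expression; JIT compilation, as listed above, would go a step further and turn the table into machine code.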

Documentation

  • HILTI
    • Document execution semantics.
    • Document runtime API.
    • Document host interface.
    • Document builder API.
  • Spicy
    • (!) Document the language.
    • Document runtime API.
    • Document host interface.
    • Document builder API.
  • Developer manual
    • Document the various subsystems.
    • Document coding conventions.
    • Document debugging and profiling support.