Skip to content

sheinbergon/dremio-udf-gis

Repository files navigation

GitHub Github Workflow Status GitHub release (latest by date) Maven Central Coveralls Liberapay

Dremio Geo-Spatial Extensions

What you get

  • Widespread OGC implementation for SQL (adheres to PostGIS standards)
    • Supported input formats: WKT, WKB (HEX or BINARY)
      • Supported output formats: WKT, WKB, GeoJSON
  • Easily installable Maven-Central/Github artifacts shaded jar artifact
  • Dremio CE version compatibility (new versions will be released with each community edition)
  • Up-2-date Proj4J & JTS geometry based implementation

Sponsorship

Enjoying my work? A show of support would be much obliged 😁

  

Installation

  • Take the shaded jar for the desired version and place inside your Dremio installation ($DREMIO_HOME/jars/3rdparty)
  • Restart your Dremio server(s)
  • Rejoice! (and see the WIKI for detailed usage instructions)

Version Compatibility

Actively Maintained
Library Version Dremio Version
0.12.x 24.3.x
0.14.x 25.0.x
Legacy
Library Version Dremio Version
0.2.x 20.1.x
0.3.x 21.1.x
0.4.x 21.2.x
0.5.x 22.0.x
0.6.x 22.1.x
0.7.x 23.0.x
0.8.x 23.1.x
0.9.x 24.0.x
0.10.x 24.1.x
0.11.x 24.2.x

Usage Notes

As opposed to PostGIS, Dremio is only a query engine based on existing/projected data sources/lakes.
That means that Geometry is not a natively supported data type, and you can only access it if
it's being properly projected from the data sources (For example, PostGIS Geometry is read as an EWKB HEX encoded string).

In order to successfully use the provided GIS functions, you must first make sure the geometry is in WKB (BINARY) format. If it's not, you need to decode it:

  • if the input is in WKT format, use ST_GeomFromText
  • if the input is a HEX encodedWKB, use Dremio's FROM_HEX

This library uses Dremios' Arrow buffers (ArrowBuf) to maintain geometry data in binary (WKB) format (for performance and efficiency)
when interchanging it between GIS functions, which is of course undecipherable for the naked eye. When running queries from the UI,
WKB output will always be base64 encoded.

In order to resolve Data back to human-readable format (WKT), use ST_AsText/ST_AsGeoJson

Example:

SELECT ST_AsText(
    ST_Difference(
      ST_GeomFromText('LINESTRING(50 100, 50 200)'),
        ST_GeomFromText('LINESTRING(50 50, 50 150)')
))

Roadmap

  • Frequent version/dependency updates
  • Add more OGC/PostGIS matching functionality
  • Add Geography type support

Noteworthy Mentions

Work in this repository was originally based on the following sources: