
Presto Community Roadmap Discussion April 6, 2017

Rao Yendluri edited this page Jun 11, 2017 · 3 revisions

Attendance: Facebook, Teradata, Twitter, Jet, Walmart, Netflix, Bloomberg, Turbine/WB, Innominds & others (please add your company name here!)

Facebook (Martin Traverso)

1. Overview of what's new in Presto

*All community contributions since the last roadmap meeting (over a year ago)

Data types: VARCHAR(n), DECIMAL, REAL, TINYINT, SMALLINT, INTEGER, CHAR(n)

Improved ROW type

Language:

  • Support for correlated subqueries
  • EXISTS
  • INTERSECT/EXCEPT
  • Quantified comparisons
  • Selective aggregates
  • Non-equality outer joins
  • Lambda expressions
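A couple of these SQL features can be shown with plain SQL. Below is a minimal sketch using Python's built-in sqlite3 as a stand-in engine (SQLite also supports correlated EXISTS subqueries and EXCEPT; the tables and data are invented for illustration and are not from the meeting):

```python
import sqlite3

# In-memory database with two illustrative tables (names are invented).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (id INTEGER, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1);
""")

# Correlated subquery with EXISTS: customers that have at least one order.
with_orders = con.execute("""
    SELECT name FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
""").fetchall()
print(with_orders)   # [('alice',)]

# EXCEPT: customer ids that never placed an order.
no_orders = con.execute("""
    SELECT id FROM customers EXCEPT SELECT customer_id FROM orders
""").fetchall()
print(no_orders)     # [(2,)]
```

Presto's versions of these constructs follow the same semantics; features like quantified comparisons and lambda expressions are Presto-specific and not shown here.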

CREATE TABLE IF NOT EXISTS

Prepared statements

  • EXPLAIN ANALYZE
  • GRANT/REVOKE
  • SHOW CREATE [TABLE | VIEW]
  • CREATE/DROP/RENAME SCHEMA

Implicit coercions for INSERT

Performance improvements:

  • JOIN/GROUP BY
  • window functions
  • approx_percentile, map_agg, array_agg
  • local parallelism
  • communication between coordinator-worker
  • adaptive concurrency
  • intra-node parallelism for intermediate stages

Pluggable event listeners

Resource groups

Hive connector

  • Kerberos support
  • Optimized RCFile reader
  • Transactional DELETE+INSERT
  • Writing to bucketed tables
  • Support for mismatched table/partition schema

Cassandra connector improvements

New connectors

  • MongoDB
  • Accumulo

2. Upcoming / planned features

Language:

  • GROUPING
  • Ordered aggregations

Optimizer

Spill to disk

ZSTD support for ORC reader/writer

Optimized RCFile writer

Optimized ORC writer

Bucket-by-bucket execution

Performance and scalability improvements

  • Coordinator CPU utilization
  • Structural types
  • Intra-task scheduling/prioritization
  • Memory-aware scheduling
  • HTTP/2

Connectors

  • TPC-DS
  • Thrift
  • SQL Server

SSL-based authentication

Vaughn also indicated that FB is working on concrete plans for increasing PR throughput.

Teradata (Kamil Bajda-Pawlikowski)

1. Functionality delivered in the past year

*Some may not yet be merged upstream, but they are already shipped in the Teradata distro (www.teradata.com/presto)

Connectors

  • Cassandra (major improvements)
  • MS SQL Server
  • TPC-DS
  • Memory

Security

  • Kerberized Hadoop support
  • LDAP Authentication
  • GRANT / REVOKE / ROLES

Misc

  • JDBC & ODBC by Simba (free to use)
  • BI Tools support & certifications

SQL syntax

  • DECIMAL, REAL, CHAR
  • Subqueries
  • EXISTS / EXCEPT / INTERSECT
  • Non-equi joins
  • grouping()
  • Prepared statements

Performance

  • Windowing functions
  • Joins

2. Roadmap for 2017

Spill to disk

  • Aggregation, Join, and other operators

Statistics for Hive connector

  • Collected in Hive Metastore

Execution Engine performance

  • Distributed sorting
  • Runtime filtering

Cost-Based Optimizer

  • Selectivity / cost calculator
  • Join ordering
  • Join distribution type selection
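The join-ordering piece of a cost-based optimizer can be sketched as a cardinality-based cost comparison over candidate plans. The following toy model is an illustration only, not Teradata's design: the table names, row counts, uniform selectivity, and the cost formula (sum of intermediate result sizes for a left-deep join tree) are all invented assumptions; a real CBO would use per-column statistics from the metastore.

```python
from itertools import permutations

# Invented base-table cardinalities and a uniform join selectivity.
rows = {"orders": 1_000_000, "customers": 50_000, "nation": 25}
selectivity = 1e-5

def plan_cost(order):
    """Cost = sum of estimated intermediate result sizes for a
    left-deep join tree that joins tables in the given order."""
    size, cost = rows[order[0]], 0
    for table in order[1:]:
        size = size * rows[table] * selectivity  # estimated join output rows
        cost += size
    return cost

# Exhaustively enumerate join orders and pick the cheapest plan.
best = min(permutations(rows), key=plan_cost)
print(best, plan_cost(best))  # starting from the small tables wins
```

Even this toy version shows why join ordering matters: starting from the small dimension tables keeps the intermediate results small.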

Twitter (Bill Graham)

LZO Thrift (Persisted)

  • The vast majority of data on Twitter’s internal HDFS is in LZO Thrift format.
  • Working on adding support for it as a new input format in the presto-hive module.

Security - start the UI on a different port

  • Presto currently uses a single port for all HTTP communications (e.g. UI, JMX, client API, coordination, query execution). It would be helpful to be able to configure the use of different ports to have more granular control of firewall settings.
  • Launch a new module on a new port and bind the UI and related instances from the main app into the new module
  • https://github.com/prestodb/presto/pull/7106

Nested Schema evolution/push down for Parquet

  • For non-primitive fields, Presto currently compares the type in a particular partition to what's in the Hive Metastore (HMS), and doesn't support schema evolution on those fields. If the partition type could be passed to the specific reader, letting the reader figure out the conversion from partition schema to table schema, it would greatly reduce the cost of maintaining the metadata (for example, data with an old schema in one table and data with a new schema in another) or of reprocessing old data with a new schema.
  • Depends on https://github.com/prestodb/presto/pull/4714
  • Yaliang’s patch: https://github.com/prestodb/presto/pull/7305
  • Pruning nested fields: https://github.com/prestodb/presto/pull/6675

Dain (FB) commented: "Nezih is working on getting everyone focused on the new optimized parquet reader to get rid of the old one and then getting all the patches together"

Tableau connection

  • The issue with Tableau support is the secure connection. The tricky part is that Presto embeds, in each response, the next URL for fetching the next chunk of results; that URL is plain HTTP and sneaks through the proxy.
  • The JDBC connection path is closed source in Teradata's distribution.

Improved resource pooling

  • Scheduled Zeppelin queries are starting to hog the cluster after midnight. We're exploring how best to mitigate that effect and will be looking into Presto's resource groups functionality.
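Resource groups (listed in the FB section above) are one way to cap a scheduled workload like this. A sketch of an `etc/resource-groups.json` along the lines of the Presto documentation follows; the group names, limits, and the `zeppelin` source are invented for illustration, and exact property names have varied across Presto versions:

```json
{
  "rootGroups": [
    {
      "name": "adhoc",
      "softMemoryLimit": "60%",
      "maxQueued": 100,
      "hardConcurrencyLimit": 20
    },
    {
      "name": "batch",
      "softMemoryLimit": "30%",
      "maxQueued": 500,
      "hardConcurrencyLimit": 5
    }
  ],
  "selectors": [
    { "source": "zeppelin", "group": "batch" },
    { "group": "adhoc" }
  ]
}
```

The idea is that queries tagged with the Zeppelin client source fall into a `batch` group with a low concurrency limit, so scheduled jobs queue up instead of starving interactive users.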

Bloomberg (Adam Shook)

Interested in leveraging statistics for the Accumulo connector

Walmart (???)

  • Using the Swift API connector to connect to cloud storage.
  • Also using the Teradata JDBC drivers.

Miscellaneous Questions

  • Are there plans to implement WholeStageCodeGeneration?

TD has a prototype from a hackathon; we'll probably be working on that in the second half of the year.

  • Are we planning to work on supporting OR predicates in the tuple domain, in addition to AND?

There's an open PR; we may do it that way, or we may expose the whole expression tree to connectors.

  • Is concern about extending the SPI still holding back support for nested data types?

We may be able to do it without changing the SPI by creating virtual columns; from the API's perspective it's then just a regular column.

  • Planning time has increased since 0.160 for a table with 600 columns

Please try the latest version, because we had some regressions that have since been fixed. File an issue on GitHub and we'll look at it.

  • Can we have hooks to determine the number of splits based on the number of worker threads?

Splits are sized so that they're not too small (so you don't spend lots of CPU time setting up the computation rather than doing useful work) and not too large (because we don't want high latency for the query). For the Hive connector we have a heuristic of starting with small splits and then increasing the size, which helps short-running queries.

  • Where's the Raptor documentation? (Lawrence from Jet)

We're rewriting it to be transactional, and it won't be backwards compatible. Once there's a new version out there, it's running in production, and we have a story for upgrading, etc., then we'll document it. It can still be used now as a cache (because then you don't care that it's not backwards compatible).
