We explored the Python data visualization ecosystem by selecting the most commonly used open-source libraries and testing them in a set of 10 standard use cases. The packages were evaluated on their richness of features, capabilities for interaction (primarily in a Jupyter notebook environment), project sustainability, documentation, and performance.
Our findings are best represented as a ranked list:
- matplotlib (+ipyvolume): matplotlib is the most mature and well-established project, with the largest community and user base and great use case coverage. Interaction with Jupyter widgets works well. However, 3D performance is poor, so it should be paired with a 3D-specific library (we found ipyvolume to be a very good candidate).
- plotly: the best use case coverage (including 3D), excellent interactivity, great performance, and a Dash platform for creating in-browser apps. It could benefit from some Datashader-like functionality for large datasets. Performance is poor, however, when back-and-forth communication is required between the plot and the Jupyter kernel, as is the case for slider widgets, for example.
- HoloViz: very similar to plotly in use case coverage, with very high performance (especially via the Datashader method). It makes interactions much easier to implement than bokeh and also has a Panel utility for apps, but it is a young project that still feels scattered across several sub-projects and is not as unified as plotly.
- bokeh: good use case coverage and many interaction possibilities, but it lacks 3D visualization, and interactions with buttons and sliders must be implemented in JavaScript.
- pyqtgraph: outstanding performance, 1D to 3D plotting under a unified interface, and a wide range of interactions, but it has only one developer, the project feels unfinished (especially the 3D graphs), and integration in Jupyter notebooks is poor.
- bqplot: focuses on interactions (every item in a plot is a clickable/draggable widget), but their diversity unfortunately never makes up for the poor performance. It would be worth revisiting this young project in a year's time.
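
To make the "Datashader-like functionality" mentioned above more concrete, here is a minimal sketch of the underlying idea: aggregate a large point cloud onto a fixed-size grid before plotting, so that the renderer only ever receives `width x height` values regardless of the number of points. This is not Datashader's API. The `rasterize` function, its parameters, and the synthetic Gaussian data are hypothetical names chosen for illustration, and the sketch uses only the standard library.

```python
import random
from collections import Counter


def rasterize(points, x_range, y_range, width, height):
    """Count how many points fall into each cell of a width x height grid.

    Hypothetical helper illustrating fixed-grid aggregation; it is not
    part of Datashader or any other library discussed here.
    """
    x0, x1 = x_range
    y0, y1 = y_range
    counts = Counter()
    for x, y in points:
        # Skip points outside the requested viewport.
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue
        # Map data coordinates to cell indices; clamp the upper edge
        # so that x == x1 or y == y1 lands in the last cell.
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        counts[(row, col)] += 1
    # Dense row-major grid that any image-style plot function can display.
    return [[counts[(r, c)] for c in range(width)] for r in range(height)]


# Synthetic data (an assumption for this sketch): 100,000 Gaussian points.
rng = random.Random(0)
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100_000)]

# The plotting library now only has to draw a 64 x 64 image.
grid = rasterize(points, (-4, 4), (-4, 4), width=64, height=64)
```

The resulting `grid` could then be handed to an image-style plot in any of the libraries above, for example `imshow` in matplotlib. Re-running the aggregation on zoom or pan keeps the amount of data sent to the front end constant, which is the property that matters for large datasets.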