Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple from matplotlib and add Bokeh for backend #57

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

Gurnek
Copy link
Collaborator

@Gurnek Gurnek commented Mar 1, 2020

This PR removes the dependency on matplotlib for creating plots. I decoupled the functions that created matplotlib figures and moved them into their own class, and had the strategy classes detect which library will be used for the backend. For each supported backend, there is a class of functions that creates a grid of figures specific to the backend library being used. For example, bokeh is the first additional backend that is being supported, and it has its own class in the backends folder that contains functions for easily adding new figures in an organized manner according to the strategy that was selected initially. This is easily extensible and now we can continue to add support for other graphing libraries in the future.

@Gurnek Gurnek requested a review from pganssle March 1, 2020 00:36
@lgtm-com
Copy link

lgtm-com bot commented Mar 1, 2020

This pull request introduces 3 alerts when merging 0ea4204 into db12cdf - view on LGTM.com

new alerts:

  • 3 for Unused import

@pganssle
Copy link
Member

pganssle commented Mar 1, 2020

Hm.. Test failures all seem to relate to coverage. I think that this might be an issue with coverage 5.0 changing the report format.

I think it's to do with this line.

What happens if we drop it?

@lgtm-com
Copy link

lgtm-com bot commented Mar 1, 2020

This pull request introduces 3 alerts when merging 69feff7 into db12cdf - view on LGTM.com

new alerts:

  • 3 for Unused import

Copy link
Member

@pganssle pganssle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm liking this a lot so far, great job!

I've made some comments about how this can be improved.

I also think we desperately will need some tests, especially now. It's super hard to keep everything in line correctly when you have more than one backend without tests.

Maybe @tacaswell or @story645 or one of the other Matplotlib maintainers (you can see my terrible NYC bias in pinging people) can give you some pointers on good ways to test image-based libraries like this without making tests that are crazy sensitive to accidental properties like the exact pixel-by-pixel arrangements of things.

)
self.library = Bokeh(alignment=self.alignment)

# elif backend == "plotly":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally recommend against checking in commented-out code, we can move this to another branch or a later commit, no?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, i didnt want to get rid of it so I didn't forget how I implemented it before hand. It had a bug that was difficult to understand so I just dropped it to implement bokeh instead.

return ax_specs


# class Plotly:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this all commented out? Does it not work yet?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, I will work on it later.

try:
import bokeh
except ImportError:
print(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should raise your own ImportError here, though actually I'm thinking that we can move this stuff out of the constructor entirely.

Considering my other suggestion, to take a GridStrategyBackend or whatever class (I spent literally no time thinking of that name) instead of a string, I think you can do this at module level:

DEFAULT_BACKEND = None
try:
    from .matplotlib_backend import MatplotlibBackend
    DEFAULT_BACKEND = MatplotlibBackend
except ImportError:
    pass

if DEFAULT_BACKEND is None:
    try:
        from .bokeh_backend import BokehBackend
        DEFAULT_BACKEND = BokehBackend
    except ImportError:
        pass

if DEFAULT_BACKEND is None:
    raise ImportError("Could not find a suitable backend - please install either matplotlib or bokeh")

Then have the constructor take backend=DEFAULT_BACKEND.

In the future we can rework this with importlib and a looping over a list of supported backends, but I figured this is a reasonable start when there's just the two.

@@ -15,8 +18,38 @@ class GridStrategy(metaclass=ABCMeta):
nearly square (nearly equal in both dimensions).
"""

def __init__(self, alignment="center"):
def __init__(self, alignment="center", backend="matplotlib"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think instead of taking a string that says what the backend is, we should define an abstract base class and/or a Protocol and take that.

This will allow anyone to write their own backends for it without needing to submit them upstream.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I notice that the backend objects take "alignment" and so does GridStrategy, which seems wrong, no?

I think if anything the alignment should be passed to the relevant backend objects' functions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was under the impression that Python didn't have this kind of static typing stuff. This is pretty cool. I'll look into it if I have time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't have to worry too much about the protocol stuff. Make it an ABC to start with to organizer your thinking - the point is to advertise exactly what parts of the interface are public.

codecov.yml Outdated
@@ -4,7 +4,7 @@ coverage:
changes: false
project:
default:
target: '50'
target: '10'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might as well not have a coverage target at all if you're going to lower it randomly. I think just bail on it at least for now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think to even get the project up for HackIllinois, I will disable most of the tests and re add them afterwards once I refactor the code again.

from bokeh.io import show, output_file


class JustifiedGrid:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two seem to share a decent amount of code, and I'm assuming they entirely share their substantive interface - maybe refactor into a BokehGrid abstract base class, possibly implementing add_plots, output_file, etc.

if len(self.plots[self.current_row]) == self.grid_arrangement[self.current_row]:
self.plots.append([])
self.current_row += 1
assert (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify - you should not use assertions for exception handling. When to use assertions is a tricky business, but basically think that assertion code won't be run at all in some circumstances (if you run python -O, for example), so assertions are something where it will help you catch a bug if you're debugging the thing and it might fail in a mysterious way later, but not to be used to convey any information to the user, like, "This critical thing failed."

Not quite sure what category this fits in.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll switch it to a try except block I think, since I think it would be fair to consider that user's requesting for n grids and then trying to insert more than n figures would be considered an error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I think you are right, and in fact I think you should move this to the top of add_plot, because you mutate the state of the backend in this function before you check whether it's valid to do so.

  2. To clarify, it's not try/except block, it's raise, probably something like raise RuntimeError(...), but you can look through the Exception hierarchy to see if anything suits your taste.

self.add_plot(plot)

def output_dest(self, file):
output_file(file)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know bokeh - were you supposed to return this, or is it the side effect you care about?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bokeh creates html files as it's output medium, so that function doesn't return anything useful. I just sets the output file.



class Matplotlib:
def __init__(self, alignment="center"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably something for a future change, but I do think it would be nice to be able to pass a pre-existing figure to be populated by the backend.

I think I didn't see a great way to do that without doubling down on the tight coupling to matplotlib, but if you have users pass their own backends to the the GridStrategy objects, this allows you to configure the backend objects however you want, including passing them figures and whatnot.

@@ -7,6 +7,9 @@
import matplotlib.pyplot as plt
import numpy as np
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM bot thing is complaining that these imports are only necessary for the matplotlib backend.

@lgtm-com
Copy link

lgtm-com bot commented Mar 1, 2020

This pull request introduces 3 alerts when merging 77e7491 into db12cdf - view on LGTM.com

new alerts:

  • 3 for Unused import

@@ -40,7 +40,6 @@ steps:
- bash: |
if [[ $TOXENV == "py" ]];
then
$PYTHON -m tox -- --junitxml=unittests/TEST-$(AGENT.JobName).xml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this change is not working, another option is to pin to coverage < 5.0 for now. I know that at a Hackathon you're usually strapped for time, so unless someone wants to make it a project to fix the coverage reporting on this project (I'll look for resources on that if you want), pinning to an older version should get us back to green in the meantime, I think.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do I pin the coverage? I was considering just disabling all of the tests that were failing so the PR could go through.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here and here change the dependency to coverage < 5

@lgtm-com
Copy link

lgtm-com bot commented Mar 1, 2020

This pull request introduces 3 alerts when merging b563216 into db12cdf - view on LGTM.com

new alerts:

  • 3 for Unused import

@pganssle
Copy link
Member

pganssle commented Mar 1, 2020

I definitely don't think you should remove the failing tests, they're super important for the review process and to make sure we don't break anything.

The housekeeping / bitrot stuff you can disable like pinning coverage or whatever, but the unit tests are critical and need to be fixed (considering that last year we had to actually write all the tests in addition to getting them working, doesn't seem like it would be too difficult to get them working again).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants