Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add utilities functions to time functions #286

Merged
merged 1 commit into from Jul 25, 2018

Conversation

gglanzani
Copy link
Contributor

Reference #193

@gglanzani gglanzani mentioned this pull request Jul 7, 2018
dask_ml/utils.py Outdated
"""
Output execution time of a function to the given logger level

:param str name: How to name the timer (will be in the logs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use numpydoc style docstrings. Could you update these to use a Parameters section (don't need ad Returns).

dask_ml/utils.py Outdated
stop = time.perf_counter()
delta = datetime.timedelta(seconds=stop - start)
_logger_level = getattr(_logger, level)
_logger_level("Finished %s in %s", name, delta) # nicer formatting for time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this formatted for a timedelta?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

>>> import logging
>>> import datetime
>>> logging.warning("%s", datetime.timedelta(seconds=50))
WARNING:root:0:00:50
>>> logging.warning("%s", datetime.timedelta(seconds=500))
WARNING:root:0:08:20
>>> logging.warning("%s", datetime.timedelta(seconds=5000))
WARNING:root:1:23:20

Copy link
Member

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay @gglanzani.

The _timer context manager is what I had in mind. I'm not sure whether the _timed decorator will be useful or not. IIRC, there aren't as many place where we would log the entire function execution time, though I may be long. Could you look at a few places where we currently log the time, and replace them with _timer?

@gglanzani
Copy link
Contributor Author

gglanzani commented Jul 19, 2018

@TomAugspurger There are usages of _timed here and there. I've tried to use _timer everywhere I could (note that in some cases the time is given back as the caller — why I don't know — so there's no way to make the context manager return without some magic — but that would defeat the point).

@TomAugspurger
Copy link
Member

Pushed some formatting changes.

The usages of timed look good. I'll take a closer look tomorrow.

dask_ml/utils.py Outdated
level : str
On which level to log the performance measurement
"""
start = time.perf_counter()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like time.perf_counter is new in python 3.3, and we support 2.7.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I fixed that. I'm now checking (locally) why the check_estimator test is so slow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed some strangeness with that test previously. If necessary, we can disable it, but I'm a bit concerned that it's showing this change has a performance impact (though I don't see why that would be).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TomAugspurger I'm running it locally with the decorator turned off, but I don't see different results.

I'll test a bit more and report back.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TomAugspurger I forgot an indentation in the Lloyd loop :) Fixed now

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's entirely possible that was my fault when I did the reformatting. If so then you have my apologies :)

@gglanzani
Copy link
Contributor Author

@TomAugspurger I've rebased against master.

Local build is green, let's see what py27 holds for me.

@TomAugspurger TomAugspurger merged commit d95f1ce into dask:master Jul 25, 2018
@TomAugspurger
Copy link
Member

TomAugspurger commented Jul 25, 2018

All green, thanks @gglanzani!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants