OSError: could not get source code #191

Open
jmrichardson opened this issue Sep 9, 2019 · 9 comments

@jmrichardson

Hello,

I am getting the following error when trying to follow along with this article:

https://towardsdatascience.com/hyperparameter-hunter-feature-engineering-958966818b6e

exp_1 = CVExperiment(AdaBoostRegressor, feature_engineer=[euclidean_norm])
<18:12:03> Uncaught exception!   OSError: could not get source code
Traceback (most recent call last):
  File "D:\Anaconda3\envs\autoquant\lib\code.py", line 91, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "D:\Anaconda3\envs\autoquant\lib\site-packages\hyperparameter_hunter\experiment_core.py", line 165, in __call__
    return super().__call__(*args, **kwargs)
  File "D:\Anaconda3\envs\autoquant\lib\site-packages\hyperparameter_hunter\experiments.py", line 752, in __init__
    target_metric=target_metric,
  File "D:\Anaconda3\envs\autoquant\lib\site-packages\hyperparameter_hunter\experiments.py", line 598, in __init__
    target_metric=target_metric,
  File "D:\Anaconda3\envs\autoquant\lib\site-packages\hyperparameter_hunter\experiments.py", line 251, in __init__
    self.feature_engineer = FeatureEngineer(self.feature_engineer)
  File "D:\Anaconda3\envs\autoquant\lib\site-packages\hyperparameter_hunter\feature_engineering.py", line 915, in __init__
    self.add_step(step)
  File "D:\Anaconda3\envs\autoquant\lib\site-packages\hyperparameter_hunter\feature_engineering.py", line 1024, in add_step
    self._steps.append(self._to_step(step, stage=stage, name=name))
  File "D:\Anaconda3\envs\autoquant\lib\site-packages\hyperparameter_hunter\feature_engineering.py", line 1051, in _to_step
    return EngineerStep(step, name=name, stage=stage, do_validate=self.do_validate)
  File "D:\Anaconda3\envs\autoquant\lib\site-packages\hyperparameter_hunter\feature_engineering.py", line 508, in __init__
    self.params = params
  File "D:\Anaconda3\envs\autoquant\lib\site-packages\hyperparameter_hunter\feature_engineering.py", line 643, in params
    self._params = value if value is not None else get_engineering_step_params(self.f)
  File "D:\Anaconda3\envs\autoquant\lib\site-packages\hyperparameter_hunter\feature_engineering.py", line 1170, in get_engineering_step_params
    source_code = getsource(f)
  File "D:\Anaconda3\envs\autoquant\lib\inspect.py", line 973, in getsource
    lines, lnum = getsourcelines(object)
  File "D:\Anaconda3\envs\autoquant\lib\inspect.py", line 955, in getsourcelines
    lines, lnum = findsource(object)
  File "D:\Anaconda3\envs\autoquant\lib\inspect.py", line 786, in findsource
    raise OSError('could not get source code')
OSError: could not get source code

Here is the code:

from hyperparameter_hunter.utils.learning_utils import get_boston_data
from hyperparameter_hunter import Environment, CVExperiment
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import train_test_split
import numpy as np
from sklearn.preprocessing import StandardScaler, QuantileTransformer

def standard_scale(train_inputs, non_train_inputs):
    s = StandardScaler()
    train_inputs[train_inputs.columns] = s.fit_transform(train_inputs.values)
    non_train_inputs[train_inputs.columns] = s.transform(non_train_inputs.values)
    return train_inputs, non_train_inputs

def euclidean_norm(all_inputs):
    all_inputs["euclidean_norm"] = all_inputs.agg(
        lambda row: np.sqrt(np.sum([np.square(_) for _ in row])), axis="columns",
    )
    return all_inputs

def quantile_transform(train_targets, non_train_targets):
    t = QuantileTransformer(output_distribution="normal", n_quantiles=100)
    train_targets[train_targets.columns] = t.fit_transform(train_targets.values)
    non_train_targets[train_targets.columns] = t.transform(non_train_targets.values)
    return train_targets, non_train_targets, t

env = Environment(
    train_dataset=get_boston_data(),
    results_path="HyperparameterHunterAssets",
    holdout_dataset=lambda train, _: train_test_split(train, test_size=0.1, random_state=1),
    target_column="DIS",
    metrics={"mae": "median_absolute_error"},
    cv_type="KFold",
    cv_params=dict(n_splits=5, random_state=1),
)

exp_0 = CVExperiment(AdaBoostRegressor, model_init_params={})

exp_1 = CVExperiment(AdaBoostRegressor, feature_engineer=[euclidean_norm])

The exp_0 CVExperiment works, but exp_1 fails with the error above.

Thanks for any help.

@HunterMcGushion
Owner

@jmrichardson,

Thanks for opening this! Definitely want to figure out what's going on, but I'm not able to reproduce the issue on my end (thanks for providing a minimal code sample, by the way).

I was able to check the sample code on MacOS 10.14 and Ubuntu 16.04 (both running Python 3.6 and 3.7) without issue.

Can you tell me about your setup? Python version? OS? Thanks again for taking the time to report this problem! I really appreciate it!

@jmrichardson
Author

jmrichardson commented Sep 10, 2019

Thank you for the help. My setup is Windows 10 with Python 3.6, using the PyCharm IDE.

So, it appears that the issue is running the code interactively...

After troubleshooting in different situations, I found that I could run the code above successfully in a script file:

python test.py

But if I paste the code into either PyCharm or the plain Python shell, it fails with the above error. This behavior is the same on Windows and Ubuntu.

Here's the same error on Ubuntu:

>>> exp_1 = CVExperiment(AdaBoostRegressor, feature_engineer=[euclidean_norm])
<10:56:33> Uncaught exception!   OSError: could not get source code
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/asg/anaconda3/envs/autoquant/lib/python3.6/site-packages/hyperparameter_hunter/experiment_core.py", line 165, in __call__
    return super().__call__(*args, **kwargs)
  File "/home/asg/anaconda3/envs/autoquant/lib/python3.6/site-packages/hyperparameter_hunter/experiments.py", line 752, in __init__
    target_metric=target_metric,
  File "/home/asg/anaconda3/envs/autoquant/lib/python3.6/site-packages/hyperparameter_hunter/experiments.py", line 598, in __init__
    target_metric=target_metric,
  File "/home/asg/anaconda3/envs/autoquant/lib/python3.6/site-packages/hyperparameter_hunter/experiments.py", line 251, in __init__
    self.feature_engineer = FeatureEngineer(self.feature_engineer)
  File "/home/asg/anaconda3/envs/autoquant/lib/python3.6/site-packages/hyperparameter_hunter/feature_engineering.py", line 915, in __init__
    self.add_step(step)
  File "/home/asg/anaconda3/envs/autoquant/lib/python3.6/site-packages/hyperparameter_hunter/feature_engineering.py", line 1024, in add_step
    self._steps.append(self._to_step(step, stage=stage, name=name))
  File "/home/asg/anaconda3/envs/autoquant/lib/python3.6/site-packages/hyperparameter_hunter/feature_engineering.py", line 1051, in _to_step
    return EngineerStep(step, name=name, stage=stage, do_validate=self.do_validate)
  File "/home/asg/anaconda3/envs/autoquant/lib/python3.6/site-packages/hyperparameter_hunter/feature_engineering.py", line 508, in __init__
    self.params = params
  File "/home/asg/anaconda3/envs/autoquant/lib/python3.6/site-packages/hyperparameter_hunter/feature_engineering.py", line 643, in params
    self._params = value if value is not None else get_engineering_step_params(self.f)
  File "/home/asg/anaconda3/envs/autoquant/lib/python3.6/site-packages/hyperparameter_hunter/feature_engineering.py", line 1170, in get_engineering_step_params
    source_code = getsource(f)
  File "/home/asg/anaconda3/envs/autoquant/lib/python3.6/inspect.py", line 973, in getsource
    lines, lnum = getsourcelines(object)
  File "/home/asg/anaconda3/envs/autoquant/lib/python3.6/inspect.py", line 955, in getsourcelines
    lines, lnum = findsource(object)
  File "/home/asg/anaconda3/envs/autoquant/lib/python3.6/inspect.py", line 786, in findsource
    raise OSError('could not get source code')
OSError: could not get source code
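
A possible workaround sketch for the script-versus-shell difference described above: if the feature engineering functions live in a real `.py` file and are imported into the interactive session, `inspect.getsource` has a file to read. The module name `my_steps` and the function `my_step` are hypothetical stand-ins, and the temporary directory is only there to keep the example self-contained.

```python
import importlib
import inspect
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a feature engineering step such as `euclidean_norm`
STEP_SOURCE = "def my_step(all_inputs):\n    return all_inputs\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    # Save the step to a real file instead of typing it into the shell
    Path(tmp_dir, "my_steps.py").write_text(STEP_SOURCE)

    # Make the directory importable, then import the module as usual
    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()
    try:
        my_steps = importlib.import_module("my_steps")
        # The function now has a backing file, so its source can be recovered
        recovered_source = inspect.getsource(my_steps.my_step)
    finally:
        sys.path.remove(tmp_dir)

print(recovered_source)
```

In practice the file would simply sit next to the project code, and the interactive session would import the step from it before passing it to `feature_engineer`.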

@HunterMcGushion
Owner

Ahh nice catch.

I was able to reproduce the error in a terminal shell. However, I didn't get the error using the PyCharm debugger to step through the script...

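| (MacOS) | Py3.6.6 | Py3.7.0 |
| --- | --- | --- |
| Script | good | good |
| PyCharm Debug | good | good |
| Shell | BAD | BAD |
| Jupyter | good | good |

| (Ubuntu) | Py3.6.6 | Py3.7.0 |
| --- | --- | --- |
| Script | good | good |
| Shell | BAD | BAD |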
The error being raised in the shell makes sense because inspect.getsource tries to track down the file in which the code was defined. When the functions are defined in the shell, there's no file to track down, which is why it breaks. So I'll look into other ways of recording feature engineering function definitions. I'm still not sure why we got different results using PyCharm's debugger, though...
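
The shell failure can be reproduced without a shell. This is a minimal sketch that assumes compiling a string under a pseudo-filename behaves like typing the function at the prompt; `<fake-shell-input>` and `__demo__` are made-up names for the demo (the real shell uses `<stdin>`).

```python
import inspect

# Simulate typing a function into a plain shell: the code object carries a
# pseudo-filename that does not correspond to any file on disk.
namespace = {"__name__": "__demo__"}
code = compile("def foo():\n    return 1\n", "<fake-shell-input>", "exec")
exec(code, namespace)
foo = namespace["foo"]

# getfile only reports the filename stored on the code object
reported_file = inspect.getfile(foo)

# getsource must read the lines behind that filename, and there are none
try:
    inspect.getsource(foo)
    error = None
except OSError as exc:
    error = exc

print(reported_file)
print(repr(error))
```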

@jmrichardson
Author

I see, makes sense...

Yes, I prefer to use PyCharm in a similar way to Jupyter: I use the PyCharm variable inspector to show the results as each code snippet is executed. I don't use the debugger; instead, I have PyCharm set up to "shift-enter" through each code block and then use the variable inspector to explore. I think if you execute the code in PyCharm (not in the debugger), it will reproduce the error, since that is similar to running it in the shell (with the added benefit of the variable explorer).

Yes, it would be great to have PyCharm support so that experiments can be run interactively without executing the entire script each time. I could always switch to Jupyter, but it lacks some of the benefits of PyCharm.

Thanks, and I'm looking forward to digging deeper into HyperparameterHunter. So far it looks very nice :)

@HunterMcGushion
Owner

Oh, my apologies. I assumed you meant the debugger. You're not referring to PyCharm's builtin Jupyter notebooks, or the "Python Console" tool, are you? I just tried the latter with no problems:
[Screenshot: Screen Shot 2019-09-10 at 6 32 56 PM]

This actually seems strange to me, as I was expecting the Python Console to raise the same error... But are you telling me there's yet another way to run your code in PyCharm that I'm still not getting? Hahaha

I agree completely, and I'll be looking into a workaround for the shell and for the PyCharm tool you're talking about that is still somehow eluding me.

It'll be great to have someone else digging into the project! I'm always open to PRs and appreciate issues!

@jmrichardson
Author

This is weird... I would have expected you to get the same error in PyCharm's console as you would in a shell console...

Yes, I am using the PyCharm console tool the same way as you are in the picture. However, I am getting the error:

[Screenshot: strange]

I was using Python 3.6 but tried 3.7 just for testing and got the same result. I also tried playing around with some settings in PyCharm but wasn't able to get it to work. I did notice that my Python variables are displayed in the explorer window under the "Special Variables" section, but I think that is just a different visual representation.

Hmmmm... Not sure but will continue to look into it...

@HunterMcGushion
Owner

I believe the variable display difference is caused by the left gear icon’s “Simplified Variables View” option. I checked that option and restarted the console, which produced a display closer to yours, so as you said, I think that’s unrelated.

I came up with a smaller snippet for our testing:

import inspect
def foo(): ...
inspect.getsource(foo)
inspect.getfile(foo)

The last line outputs “<ipython-input-3-0ad329aeb3e9>”, which was interesting, so I looked into my settings and found I had the “Build, Execution, Deployment” > “Console” > “Use IPython if available” option checked. When I unchecked this, I got the OSError when running the above snippet. Curiously, only inspect.getsource raises the error. inspect.getfile returns "<input>" with the above setting unchecked.

So it looks like the PyCharm console originally worked for me because of that setting, which must be saving some temporary IPython notebook somewhere, enabling the inspect functions to actually function.
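
As far as I understand it, IPython does not actually save a notebook: it registers each input cell's source in Python's standard `linecache` module under a pseudo-filename like the `<ipython-input-...>` one above, and `inspect` reads from that cache. Below is a minimal sketch of the same idea without IPython. The name `<demo-input-1>` is made up, and writing to `linecache.cache` directly relies on an undocumented tuple layout, so treat this as an illustration rather than a recommended API.

```python
import inspect
import linecache

source = "def foo(x):\n    return x + 1\n"
filename = "<demo-input-1>"  # made-up pseudo-filename for this demo

# Cache entry layout: (size, mtime, lines, fullname).
# mtime=None marks the entry as not backed by a real file, so
# linecache.checkcache() leaves it alone instead of evicting it.
linecache.cache[filename] = (
    len(source),
    None,
    source.splitlines(keepends=True),
    filename,
)

# Compile under the same pseudo-filename so the function points at the entry
namespace = {"__name__": "__demo__"}
exec(compile(source, filename, "exec"), namespace)

# inspect now finds the lines in linecache even though no file exists
recovered = inspect.getsource(namespace["foo"])
print(recovered)
```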

Perhaps we can implement something similar in HH. However, the fact that it only worked because of a fairly obscure PyCharm setting makes me think that for HH, we would basically need to save all interactive console input as it’s coming in. To do that, HH would have to be imported immediately after starting the interactive session, and I don’t think that’s a reasonable requirement. I’ll keep trying to think of other ways around this, though…
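
One direction that would not require recording console input, sketched under the assumption that only the step function's parameter names are needed (HH may need the full source for other purposes, which this does not address): `inspect.signature` reads the names from the function object itself, so it works for functions typed into a shell. `get_param_names` is a hypothetical helper, not part of HH, and `<fake-shell-input>` is a made-up pseudo-filename.

```python
import inspect


def get_param_names(f):
    """Hypothetical helper: list parameter names without reading source."""
    return list(inspect.signature(f).parameters)


# Define a step the way a shell would: under a pseudo-filename with no file
namespace = {"__name__": "__demo__"}
source = (
    "def standard_scale(train_inputs, non_train_inputs):\n"
    "    return train_inputs, non_train_inputs\n"
)
exec(compile(source, "<fake-shell-input>", "exec"), namespace)
step = namespace["standard_scale"]

# The source cannot be recovered for this function...
try:
    inspect.getsource(step)
    source_available = True
except OSError:
    source_available = False

# ...but the parameter names are still available from the signature
param_names = get_param_names(step)
print(source_available, param_names)
```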

@jmrichardson
Author

Great catch! Yes, it's good that there is a workaround in PyCharm, provided you have that option checked and IPython installed (which I didn't have installed when I tried that setting earlier to replicate your success). I wouldn't make this a high priority now that we have a workaround.

BTW, HH so far is very nice! Great work! I have run some simple experiments and really like how easy it is to set up and to keep track of everything. I am planning to dive deeper next week and would like to give you some feedback. Do you have a forum where users can discuss things or ask you questions?

@HunterMcGushion
Owner

Awesome! Glad that worked for you, too!

Thanks a lot! I'd absolutely love any feedback! I do have a Slack channel set up for HyperparameterHunter, but I haven't been able to figure out how to add the nice little invite badge to the README yet, so it's pretty much empty. Nevertheless, that's a great place to give feedback or ask anything you might not think is "issue-worthy". Thanks a lot for asking and for taking the time to dive in!
