Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Noorain ensembling #4

Open
wants to merge 13 commits into
base: main
Choose a base branch
from
Open

Noorain ensembling #4

wants to merge 13 commits into from

Conversation

noorains
Copy link
Collaborator

No description provided.

@Polichinel Polichinel self-assigned this Feb 29, 2024
Copy link
Collaborator

@Polichinel Polichinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've noticed that this Pull Request (PR) seems to be mixed up with your baseline_pgm branch, resulting in changes being scattered throughout. Consequently, I'm stopping the review here.

Please create a PR specifically for the baseline models and another separate one for the ensembles. Otherwise, I'm just going over code that I've already reviewed, which is not an efficient use of time.

Unless there's a very compelling reason to build off a branch that has not yet been merged into the main branch, I strongly advise starting a new branch from the main for each new project. For instance, when transitioning from working on baseline_pgm to working on ensembles, it's best practice to initiate the latter from the main branch. This approach helps in maintaining a clear and organized codebase, making it easier to review and manage changes. It also avoids the complications that arise from intertwining the code of different features or projects, which can lead to confusion and errors. By following this guideline, we ensure that each feature or improvement is isolated, making it simpler to track progress, identify issues, and merge updates into the main branch once they are ready and approved.

The simplest solution for you here is to delete the baseline models on this branch. I'll take a look again after this is done and after some docstrings and comments are added for clarity.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the other PR why are you deleting this?

Comment on lines +7 to +28
def ensemble_median():

forecasts = {}
path_to_test = os.path.dirname(os.getcwd())
print(path_to_test)
for file in glob.glob(path_to_test + "/views_pipeline/models/**/data/generated/forecasts.parquet", recursive=True):
print(file)
# get the model name from the file path
model_name = file.split('/')[-4]
forecasts[model_name] = pd.read_parquet(file)


ensemble = pd.concat(forecasts.values()).groupby(level=[0,1]).median()
ensemble = ensemble.filter(regex='^step_pred_|^ln_ged_sb_dep$')

# get the directory of this function
function_file = inspect.getfile(ensemble_median)
function_dir = os.path.dirname(function_file)
parent_dir = os.path.dirname(function_dir)
ensemble_model_name = parent_dir.split('/')[-2]
ensemble.forecasts.to_store(ensemble_model_name, overwrite=True)
print(ensemble)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add docstrings and comments where needed.
Docstrings tell you what is happening, comments why/how it is done (in the best of worlds)

import inspect

def ensemble_mean():

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the structure is subject to chang, I'm not gonna comment on the configs

Comment on lines +1 to +8
from ..blue_sea.src.dataloaders.fetch_data_run_query import fetch_data
from ..blue_sea.src.forecasting.true_future_36m import forecast
from ..blue_sea.src.evaluation.evaluation_mse import evaluate_mse
from ..blue_sea.configs import config

from .src.utils.utils import wandb_init, wandb_finish
from .configs import wandb_config
from ..blue_sea.src.visualization.visual import visualize_forecasts_in_maps
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a comment on this in the baseline_pgm PR

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will den up in config after new structure implemented

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this aligned with Malika's work? And is this not also in a different PR? or?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, I know this is also in a different PR? If they are all mixing up, this makes me use a lot more time here. Please be a bit considered here.

Comment on lines +1 to +4
from ..green_oracle.src.dataloaders.fetch_data_run_query import fetch_data
from ..green_oracle.src.forecasting.true_future_36m import forecast
from ..green_oracle.src.evaluation.evaluation_mse import evaluate_mse
from ..green_oracle.configs import config
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

..

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I realize this PR is mixup with your baseline pgm branch - so all the changes are everywhere. As such I stop the review here.

Please have a PR specifically for the baseline and a separate one for the ensembles. Otherwise, I do nothing but go over code I've already been over...

Unless you have a very good reason to build of a branch that has not yet been merged to main, please make a new branch from main when starting on a new project - I.e. when you go from making baseline_pgm to ensembles.

I think the easiest solution here is for you to delete your baseline models on this branch - should not be a big deal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants