Use 2023 WEO data in `tools.costs` #187

measrainsey · 2024-04-25T14:16:59Z

Update costs module to move from using the 2022 WEO release to using the 2023 WEO data release

Main changes made:

Upload 2023 data file
Modify code to read in 2023 data release
Use 2022 USD to 2001 USD conversion (instead of 2021 USD to 2001 USD)
Use the new latest historical values (2022) for the regional differentation

Additionally, I've changed a few lines in the code to address the comments that were left on #186 which were not properly fixed in that PR.

How to review

For @khaeru and/or @glatterf42 : Read the diff and note that the CI checks all pass.

PR checklist

Continuous integration checks all ✅
Update doc/whatsnew.

Note that the following change was made from the original data file: in the "Nuclear" data sheet, the trailing first columns were deleted (which are hidden in the original data file) so that column A coincides with the start of the data. This is to make the Nuclear sheet consistent with the rest of the notebook.

- Read in 2023 WEO data - Update the rows to be read in - Convert from 2022 USD to 2001 USD (previously 2021 USD to 2001 USD) - Use 2022 values for base year costs

codecov · 2024-04-25T14:28:03Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.7%. Comparing base (16029b4) to head (5b3a1f8).

Additional details and impacted files

@@          Coverage Diff          @@
##            main    #187   +/-   ##
=====================================
  Coverage   76.7%   76.7%           
=====================================
  Files        112     112           
  Lines       7154    7154           
=====================================
  Hits        5491    5491           
  Misses      1663    1663

Files	Coverage Δ
...tests/tools/costs/test_regional_differentiation.py	`100.0% <100.0%> (ø)`
message_ix_models/tools/costs/projections.py	`84.7% <ø> (ø)`
..._ix_models/tools/costs/regional_differentiation.py	`97.9% <100.0%> (ø)`

khaeru · 2024-04-26T07:14:36Z

@measrainsey thanks for the PR and clean-ups!

For the moment I imagine no one will want to use the older WEO data. But suppose this:

We (or others) publish work using the WEO 2023 data via this module.
Development of message-ix-models carries on, with more updates and releases; to WEO 2024, WEO 2025, etc.
At some point, someone would like to reproduce earlier work, e.g. work that used WEO 2023 per this PR.

The question is, how should they do that?

Simply revert to the older version, e.g. the 2024.4.22 release that had WEO 2022, or an upcoming release that has WEO 2023 per this PR.
- This option could pose problems if there is no pip freeze >requirements.txt file, or if they are using other code that is compatible with future message-ix-models but not with 2024.4.22.
Provide a parameter or option (see for instance .tools.iea.web, or .model.snapshot) that allows the user to select the edition/version of input data used, and adjusts any other parameters to suit.
They can't; we will only support one version of WEO input data at a time, and if they want to use a different one (older or newer) they must figure it out on their own.

To be clear, I don't think either is necessary for this PR, but it would be good to consider which is desired for the future. @glatterf42 and I can then help implement.

measrainsey · 2024-04-26T13:30:46Z

Hi @khaeru, thanks for your comment and feedback!
That's a good point you brought up -- option (2) would probably be best for the long term, both for other users and for ourselves in case we want to test how different releases of data affect our results.

measrainsey · 2024-05-07T14:32:04Z

Circling back, would it be possible to review this PR as-is, and possibly work on a longer-term solution for using previous WEO data versions as a separate enhancement/functionality later on? I would like to get this merged sooner for the sake of the SSP update process. Thanks!

glatterf42

Certainly! As @khaeru mentioned, no need to tackle this issue now, though it would be good to create an issue tracking it.

The PR looks good to me, just one nit-pick: I think F841 errors regard variables that are assigned to but never used; these are easy to fix and don't ever need a # noqa statement. If you need the variable conversion_factor, you should be fine without the # noqa, and if you don't need it, you can use an _ to clarify. After that, happy to approve :)

glatterf42 · 2024-05-08T05:15:35Z

message_ix_models/tools/costs/regional_differentiation.py

    )

    # Retrieve conversion factor
-    conversion_factor = registry("1.0 USD_2021").to("USD_2005").magnitude  # noqa: F841
+    conversion_factor = registry("1.0 USD_2022").to("USD_2005").magnitude  # noqa: F841


Suggested change

conversion_factor = registry("1.0 USD_2022").to("USD_2005").magnitude # noqa: F841

conversion_factor = registry("1.0 USD_2022").to("USD_2005").magnitude # if variable is needed later on

_ = registry("1.0 USD_2022").to("USD_2005").magnitude # if not

Thanks @glatterf42 ! The conversion_factor is an interesting case where it is used later on here:

message-ix-models/message_ix_models/tools/costs/regional_differentiation.py

Line 125 in 0b0ee50

.eval("value = value * @conversion_factor")

But when the variable is used inside a string, the linting does not pick it up as being used. Without the # noqa: F841, when I save the file, the linter will automatically delete the conversion_factor variable line. So that's why the # noqa: F841 is put there. Alternatively, I can change the variable name to _conversion_factor and use it in the string as @_conversion_factor, and I think that should work as well. 😄

Huh, I didn't know about such a use-case. However, reading up on it, I found this note: DataFrame.eval() is only recommended for DataFrames with more than 1e5 rows. Do you have any estimate about the number of rows in your df? If it's lower for sure, df.assign() might be more performant.

I've changed the line to use df.assign() instead, so now conversion_factor is explicitly used :)

message-ix-models/message_ix_models/tools/costs/regional_differentiation.py

Line 125 in 5b3a1f8

.assign(value=lambda x: x.value * conversion_factor)

However, I will note that I believe encounter this same use-case in other places (especially when I use df.query()).

Thanks! No need to change all occurrences of this now, I just found it odd since I hadn't seen it and so wasn't aware this noqa might be needed. If you find you prefer assign() or similar for the other places, feel free to change them later on :)

glatterf42

Looks good to me :)
Following our merge recommendations, please merge without rebase or squash :)

measrainsey added 5 commits April 25, 2024 14:42

Use 2023 WEO data for the regional differentiation

55a5919

- Read in 2023 WEO data - Update the rows to be read in - Convert from 2022 USD to 2001 USD (previously 2021 USD to 2001 USD) - Use 2022 values for base year costs

Update tests to reflect 2023 WEO data being used

94d9675

Condense code based on review in #186

3278ab1

Add comments about converting column type

6f44dd1

measrainsey self-assigned this Apr 25, 2024

Add #187 to doc/whatsnew

0b0ee50

measrainsey marked this pull request as draft April 25, 2024 14:21

measrainsey marked this pull request as ready for review May 7, 2024 14:25

measrainsey mentioned this pull request May 7, 2024

Change the default final year in tools.costs to 2110 #190

Merged

2 tasks

glatterf42 reviewed May 8, 2024

View reviewed changes

Change from df.eval() to df.assign()

5b3a1f8

measrainsey mentioned this pull request May 8, 2024

Add functionality to allow tools.costs to use older version(s) of WEO data #191

Open

glatterf42 approved these changes May 8, 2024

View reviewed changes

measrainsey merged commit 82ac699 into main May 8, 2024
26 checks passed

measrainsey deleted the costs/weo-2023 branch May 8, 2024 11:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use 2023 WEO data in `tools.costs` #187

Use 2023 WEO data in `tools.costs` #187

measrainsey commented Apr 25, 2024 •

edited

codecov bot commented Apr 25, 2024 •

edited

khaeru commented Apr 26, 2024

measrainsey commented Apr 26, 2024

measrainsey commented May 7, 2024

glatterf42 left a comment

glatterf42 May 8, 2024

measrainsey May 8, 2024 •

edited

glatterf42 May 8, 2024

measrainsey May 8, 2024

glatterf42 May 8, 2024

glatterf42 left a comment

	conversion_factor = registry("1.0 USD_2022").to("USD_2005").magnitude # noqa: F841
	conversion_factor = registry("1.0 USD_2022").to("USD_2005").magnitude # if variable is needed later on
	_ = registry("1.0 USD_2022").to("USD_2005").magnitude # if not

Use 2023 WEO data in tools.costs #187

Use 2023 WEO data in tools.costs #187

Conversation

measrainsey commented Apr 25, 2024 • edited

How to review

PR checklist

codecov bot commented Apr 25, 2024 • edited

Codecov Report

khaeru commented Apr 26, 2024

measrainsey commented Apr 26, 2024

measrainsey commented May 7, 2024

glatterf42 left a comment

Choose a reason for hiding this comment

glatterf42 May 8, 2024

Choose a reason for hiding this comment

measrainsey May 8, 2024 • edited

Choose a reason for hiding this comment

glatterf42 May 8, 2024

Choose a reason for hiding this comment

measrainsey May 8, 2024

Choose a reason for hiding this comment

glatterf42 May 8, 2024

Choose a reason for hiding this comment

glatterf42 left a comment

Choose a reason for hiding this comment

Use 2023 WEO data in `tools.costs` #187

Use 2023 WEO data in `tools.costs` #187

measrainsey commented Apr 25, 2024 •

edited

codecov bot commented Apr 25, 2024 •

edited

measrainsey May 8, 2024 •

edited