fsurdat: PCT_SAND, PCT_CLAY, ORGANIC differ with different PE layouts on derecho #2502

Open
slevis-lmwg opened this issue Apr 30, 2024 · 5 comments
Labels
priority: high High priority task to fix soon, e.g., because it is a problem in important configurations tag: bug - impacts science bug causing incorrect science results tag: support tools only Only modifies offline support tools (example in tools/contrib) so less testing required type: -investigation Needs to be verified and more investigation into what's going on.


slevis-lmwg commented Apr 30, 2024

Brief summary of bug

I ran mksurfdata_esmf on derecho to generate fsurdat/landuse files for the VR grids ne0np4CONUS, ne0np4.ARCTIC, and ne0np4.ARCTICGRIS (PR #2490, issue #2487). I inadvertently used two different PE layouts and compared the resulting files:

  • I see no diffs in the landuse files.
  • I see diffs in the fsurdat files. The fsurdat files show the different number of tasks used:
<               :Host = "derecho7" ;
<               :Number-of-tasks = 256 ;
---
>               :Host = "derecho6" ;
>               :Number-of-tasks = 1152 ;

Possibly related to issue #2430.

General bug information

CTSM version you are using: ctsm5.2.001

Does this bug cause significantly incorrect results in the model's science? Maybe

Configurations affected: All ctsm5.2.0 and newer, as well as hacked simulations that use 5.2 fsurdat files

Details of bug

I used /glade/campaign/cesm/cesmdata/cseg/tools/cime/tools/cprnc/cprnc -m <file1> <file2>
to get info like this:

 PCT_SAND   (gridcell,nlevsoi)
        281  1523900  ( 38260,     7) ( 77809,     1) ( 30259,     5) ( 30260,     8)
             1523900   9.500000000000000E+01   8.000000000000000E+00 4.5E+01  8.800000000000000E+01 4.2E-05  1.500000000000000E+01
             1523900   9.500000000000000E+01   8.000000000000000E+00          4.300000000000000E+01          4.300000000000000E+01
             1523900  ( 38260,     7) ( 77809,     1)
          avg abs field values:    4.507733154296875E+01    rms diff: 2.2E-01   avg rel diff(npos):  4.2E-05
                                   4.507754516601562E+01                        avg decimal digits(ndif):  0.8 worst:  0.2
 RMS PCT_SAND                         2.1765E-01            NORMALIZED  4.8284E-03

 PCT_CLAY   (gridcell,nlevsoi)
        269  1523900  ( 30936,     8) ( 38260,     1) ( 30260,     8) ( 42656,     6)
             1523900   7.400000000000000E+01   2.000000000000000E+00 4.6E+01  6.400000000000000E+01 6.5E-05  3.400000000000000E+01
             1523900   7.400000000000000E+01   2.000000000000000E+00          1.800000000000000E+01          6.000000000000000E+00
             1523900  ( 30936,     8) ( 38260,     1)
          avg abs field values:    1.737113952636719E+01    rms diff: 1.9E-01   avg rel diff(npos):  6.5E-05
                                   1.737069702148438E+01                        avg decimal digits(ndif):  0.6 worst:  0.1
 RMS PCT_CLAY                         1.9311E-01            NORMALIZED  1.1117E-02

 ORGANIC   (gridcell,nlevsoi)
        290  1523900  ( 36565,     5) (     1,     1) ( 42634,     8) ( 30207,     1)
             1523900   2.974772033691406E+02   0.000000000000000E+00 1.7E+02  1.733897705078125E+02 1.6E-04  4.729569244384766E+01
             1523900   2.974772033691406E+02   0.000000000000000E+00          0.000000000000000E+00          0.000000000000000E+00
             1523900  ( 36565,     5) (     1,     1)
          avg abs field values:    1.125364875793457E+01    rms diff: 8.5E-01   avg rel diff(npos):  1.6E-04
                                   1.124512195587158E+01                        avg decimal digits(ndif):  0.1 worst:  0.0
 RMS ORGANIC                          8.4824E-01            NORMALIZED  7.5403E-02
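
For readers unfamiliar with the cprnc summary lines, here is a minimal sketch of the statistics involved. It is not cprnc's implementation. It assumes the RMS difference is taken over all values of the field and that the NORMALIZED number is that RMS divided by the mean of the two average absolute field values. The output above is consistent with that assumption: for PCT_SAND, 2.1765E-01 divided by the mean of 45.0773 and 45.0775 gives about 4.828E-03. The function name `diff_stats` and the flat-list inputs are hypothetical.

```python
import math


def diff_stats(a, b):
    """Return (ndiff, rms_diff, normalized_rms) for two equal-length sequences.

    Assumption (not taken from the cprnc source): the normalization is
    rms_diff / mean(avg|a|, avg|b|).
    """
    if len(a) != len(b):
        raise ValueError("fields must have the same number of values")
    n = len(a)
    # Number of positions where the two fields are not bit-for-bit equal
    ndiff = sum(1 for x, y in zip(a, b) if x != y)
    # RMS difference over all values, not only the differing ones
    rms = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / n)
    avg_abs_a = sum(abs(x) for x in a) / n
    avg_abs_b = sum(abs(y) for y in b) / n
    denom = 0.5 * (avg_abs_a + avg_abs_b)
    normalized = rms / denom if denom > 0 else 0.0
    return ndiff, rms, normalized


# Toy data: one value out of four differs
field_256_tasks = [1.0, 2.0, 3.0, 4.0]
field_1152_tasks = [1.0, 2.0, 3.0, 8.0]
ndiff, rms, normalized = diff_stats(field_256_tasks, field_1152_tasks)
print(ndiff, rms, normalized)
```

Because the RMS is averaged over every value, a few hundred differing values out of 1523900 can yield a small RMS even when individual differences are large.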

@ekluzek proposed this follow-up:
Repeat the test with f09 to make the differences easier to visualize (VR grids are unstructured and difficult to view).

@slevis-lmwg slevis-lmwg self-assigned this Apr 30, 2024
@slevis-lmwg slevis-lmwg added priority: high High priority task to fix soon, e.g., because it is a problem in important configurations type: -investigation Needs to be verified and more investigation into what's going on. tag: support tools only Only modifies offline support tools (example in tools/contrib) so less testing required tag: bug - impacts science bug causing incorrect science results labels Apr 30, 2024
@slevis-lmwg slevis-lmwg added the tag: next this issue should get some attention in the next week or two label May 2, 2024
@slevis-lmwg slevis-lmwg added this to the ctsm5.3.0 milestone May 2, 2024
@ekluzek ekluzek removed the tag: next this issue should get some attention in the next week or two label May 2, 2024
@slevis-lmwg
Contributor Author

I have submitted a 4-node job and an 8-node job:

qsub mksurfdata_jobscript_single
qsub mksurfdata_jobscript_single_8nodes.sh

in /glade/work/slevis/git/latest_master/tools/mksurfdata_esmf
git describe: ctsm5.2.003
The jobs point to

surfdata_0.9x1.25_hist_2000_78pfts_c240506.namelist
surfdata_0.9x1.25_hist_2000_78pfts_c240506b.namelist

@slevis-lmwg
Contributor Author

First I compare two files that I expect (hope) to be identical because derecho generated them on the same number of nodes. I'm relieved to find that they are indeed identical:

surfdata_0.9x1.25_hist_2000_78pfts_c240506
surfdata_0.9x1.25_hist_2000_78pfts_c240216

Next I compare the two files that I generated today:

surfdata_0.9x1.25_hist_2000_78pfts_c240506b
surfdata_0.9x1.25_hist_2000_78pfts_c240506

and find diffs as shown in the following sample ncview images.

@slevis-lmwg
Contributor Author

[Image: surfdata_f09_2000_78pfts_c240506b-a_pctsand_nlevsoi0]
[Image: surfdata_f09_2000_78pfts_c240506b-a_pctclay_nlevsoi0]
[Image: surfdata_f09_2000_78pfts_c240506b-a_ORGANIC_nlevsoi0]

@slevis-lmwg
Contributor Author

landmask for reference

[Image: surfdata_f09_2000_78pfts_c240506_pctocn]

@slevis-lmwg
Contributor Author

My assessment of this visual examination:
A very small number of grid cells show differences at f09, but differences in those locations can be large.
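
To put numbers on "few cells, large differences", something like the following sketch could be run on one soil level of each file once the values have been read into flat lists. The helper name `largest_diffs` and the toy values are hypothetical; reading the actual fsurdat files is left out because it depends on the netCDF tooling available.

```python
def largest_diffs(a, b, k=5):
    """Return up to k (index, a_value, b_value, abs_diff) tuples, largest first."""
    if len(a) != len(b):
        raise ValueError("fields must have the same number of values")
    # Keep only the positions where the two fields disagree
    diffs = [
        (i, x, y, abs(x - y))
        for i, (x, y) in enumerate(zip(a, b))
        if x != y
    ]
    # Sort so the worst offenders come first
    diffs.sort(key=lambda t: t[3], reverse=True)
    return diffs[:k]


# Toy data standing in for one level of a percent field from the two runs
run_a = [10.0, 20.0, 30.0, 40.0, 50.0]
run_b = [10.0, 65.0, 30.0, 40.5, 50.0]

worst = largest_diffs(run_a, run_b)
fraction_differing = sum(1 for x, y in zip(run_a, run_b) if x != y) / len(run_a)
for index, va, vb, d in worst:
    print(index, va, vb, d)
print(fraction_differing)
```

Reporting the fraction of differing cells alongside the largest absolute differences separates how widespread the problem is from how severe it is at individual locations.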
