
How to deal with "free list full with 5000 items" #525

Open
tovogt opened this issue Sep 22, 2021 · 15 comments

tovogt (Contributor) commented Sep 22, 2021

It would be great if you could give advice on what to do when the calculation stops with the error message: free list full with 5000 items. This typically appears after a long run of *** adjusting timestep for level 7 at t = lines, together with Note: The following floating-point exceptions are signalling: IEEE_UNDERFLOW_FLAG IEEE_DENORMAL, and usually also the following:

 out of bndry space - allowed ***** bndry grids
 There are    15447 total grids    120351 bndry nbors average num/grid      7.791
 Expanding size of boundary list from       120000  to       180000
 out of nodal space - allowed    30000 grids
    level    1 has       1 grids
    level    2 has       4 grids
    level    3 has      16 grids
    level    4 has      64 grids
    level    5 has     256 grids
    level    6 has    1024 grids
    level    7 has   14107 grids
  Could need twice as many grids as on any given
  level if regridding/birecting
 Expanding maximum number of grids from        30000  to        40000
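(As a quick sanity check of the numbers in that log, the "average num/grid" figure is just the total boundary neighbors divided by the total grid count:)

```python
# Reproduce the "average num/grid" value reported in the log above.
total_grids = 15447    # "15447 total grids"
bndry_nbors = 120351   # "120351 bndry nbors"
print(round(bndry_nbors / total_grids, 3))  # -> 7.791, matching the log
```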

Any ideas what might be causing this or how to deal with this? Thanks in advance!

mandli (Member) commented Sep 23, 2021

This is pretty rare to see; more often we run out of grids before we run out of boundary space. Given that you are seeing underflows as well, I am wondering if something is blowing up instead. Have you tried plotting the results up to this time to see what might be going on?

tovogt (Contributor, Author) commented Sep 24, 2021

Thanks for your response!

I now think that this is somehow related to a very irregular bathymetry. The problem occurs when a TC track crosses the Bahamas archipelago region (bathymetry from SRTM15+V2.0):
[image: bathymetry of the Bahamas region]
Even for medium-strength storms, GeoClaw tends to produce pretty large waves in this region. For example, Hurricane Jeanne (2004):
[image: simulated surface for Hurricane Jeanne (2004)]
I will be on vacation for a week and will come back to this at the beginning of October with plots of the GeoClaw wind fields, AMR regions, and surface. See you!

mandli (Member) commented Sep 24, 2021

The Bahamas are difficult to say the least. Let's pick it up after you get back then.

tovogt (Contributor, Author) commented May 30, 2022

I just wanted to report back that this still comes up from time to time, and not only for the Bahamas (as suggested above). Most of the time it is not a problem. But recently I was experimenting with higher-resolution runs (up to 30 m) with synthetic scenarios where the wind speeds reach 290 km/h, and found that I ran into "free list full with 5000 items". Reproducing this is nasty, since a run takes 5 days (!) before it fails.

In those cases I was running modified versions of Cyclone Idai (2019), but only its final landfall:
[image: track of the modified Cyclone Idai (2019) final landfall]
The bathymetry looks totally harmless in principle:
[image: bathymetry of the landfall region]

As I said, I can't easily rerun this and pull arbitrary plot outputs from the run, because it takes more than 5 days. I just wanted to record this here for future reference.

mjberger (Contributor) commented May 30, 2022 via email

mandli (Member) commented May 31, 2022

@mjberger that would not be too hard to do I suppose.

tovogt (Contributor, Author) commented Feb 12, 2024

Unfortunately, this still occurs from time to time. In a current setup, we would like to model the storm surge of Super Typhoon Haiyan (2013) at up to 9 arc-seconds resolution in GeoClaw, but the run fails after more than 24 hours of run time with "free list full with 5000 items":
[image]
Note that we are only modeling a 48-hour subset of the storm duration:
[image]
How could we implement auto-increase for this data structure?
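(For context, the standard answer to such "fixed capacity exceeded" failures is geometric growth: when the list fills, allocate a larger array and copy. The actual free list lives in amrclaw's Fortran code; the sketch below is a language-neutral illustration in Python with hypothetical names, not GeoClaw code:)

```python
# Illustrative sketch of auto-growing a fixed-size free list.
# In amrclaw this is a Fortran array; the class and names here are hypothetical.
class FreeList:
    def __init__(self, capacity=5000):
        self.items = [None] * capacity
        self.count = 0

    def push(self, item):
        if self.count == len(self.items):
            # Instead of aborting with "free list full", double the storage.
            self.items.extend([None] * len(self.items))
        self.items[self.count] = item
        self.count += 1

fl = FreeList(capacity=2)
for i in range(5):
    fl.push(i)
print(len(fl.items), fl.count)  # -> 8 5 (capacity doubled twice: 2 -> 4 -> 8)
```

In Fortran the same idea would use an allocatable array together with `move_alloc` to swap in the larger buffer.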

mjberger (Contributor) commented Feb 12, 2024 via email

tovogt (Contributor, Author) commented Feb 13, 2024

Thanks for the quick response, Marsha! The code is now running with the lfdim variable increased to 25000. :) I will report back on whether it still fails.
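(For future readers: the change amounts to editing one parameter declaration in the amrclaw Fortran source and recompiling. The exact file and declaration below are assumptions about where lfdim is defined, shown via a string substitution; check your amrclaw version:)

```python
import re

# Assumed original declaration of the free-list size in amrclaw's
# Fortran source (exact file/spelling may differ between versions).
line = "    integer, parameter :: lfdim = 5000"

# The one-line change discussed above: 5000 -> 25000. After editing the
# real source file, rebuild the executable so the larger list takes effect.
patched = re.sub(r"lfdim\s*=\s*5000", "lfdim = 25000", line)
print(patched)  # -> "    integer, parameter :: lfdim = 25000"
```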

tovogt (Contributor, Author) commented Feb 15, 2024

Okay, I have now produced some more time-dependent plots to understand what's going on:
[figure fig1001]
[figure fig1005]

Even though this is a pure surge-driven scenario, it looks as if there is an earthquake at the bottom boundary somewhere around hour "+10". What causes this, and how can I prevent it?

I already enforce that, outside of the inner rectangle (that you see in frame 1), refinement is limited to at most level 4 (of 6). If refinement at the boundary is causing this, would it help to restrict refinement to level 1 in a small strip along the boundary? Here is my regions.data:

5                    =: num_regions         
1  4 -4.32000000000000e+04  1.40400000000000e+05  1.05831993103027e+02  1.40850708007812e+02  9.86385345458984e-02  2.24680137634277e+01  
3  6 -4.32000000000000e+04  1.40400000000000e+05  1.15847679138184e+02  1.30129470825195e+02  8.31493663787842e+00  1.43523235321045e+01  
5  6 -4.32000000000000e+04  1.40400000000000e+05  1.18690833333333e+02  1.21242500000000e+02  9.57416666666733e+00  1.40425000000004e+01  
5  6 -4.32000000000000e+04  1.40400000000000e+05  1.21257500000000e+02  1.23792499999999e+02  9.10750000000069e+00  1.38425000000004e+01  
5  6 -4.32000000000000e+04  1.40400000000000e+05  1.23807499999999e+02  1.26342499999999e+02  8.64083333333405e+00  1.33091666666671e+01  
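(For readers unfamiliar with the file: each regions.data row is, to my understanding, minlevel maxlevel t1 t2 x1 x2 y1 y2, as written out from setrun.py's region specification. A minimal parse of the first row above, as a sketch rather than GeoClaw code:)

```python
# Parse one regions.data row: minlevel maxlevel t1 t2 x1 x2 y1 y2
# (first region from the file above: levels 1-4 over the full domain/time).
row = ("1  4 -4.32000000000000e+04  1.40400000000000e+05  "
       "1.05831993103027e+02  1.40850708007812e+02  "
       "9.86385345458984e-02  2.24680137634277e+01")
fields = row.split()
minlevel, maxlevel = int(fields[0]), int(fields[1])
t1, t2, x1, x2, y1, y2 = map(float, fields[2:])
print(minlevel, maxlevel, x1 < x2 and y1 < y2)  # -> 1 4 True
```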

mjberger (Contributor) commented Feb 15, 2024 via email

tovogt (Contributor, Author) commented Feb 16, 2024

Thanks for your quick response!

Here is the complete set of input data that I used to run the example (with make .output): https://www.pik-potsdam.de/~tovogt/for_mjberger/2013306N07162_60as_2024-02-16.zip (This archive also includes a file stdout.log with the full log output of GeoClaw!) I use clawpack version 5.9.2 and gfortran version 13.2.0, and I run the example on a single HPC cluster node with 16 CPUs and 64 GB of RAM. Running this example on that setup requires more than 6 hours (wall time).

Here I uploaded the complete _plots directory: https://www.pik-potsdam.de/~tovogt/for_mjberger/2013306N07162_60as_plots_2024-02-16.zip

And here is the _output directory: https://www.pik-potsdam.de/~tovogt/for_mjberger/2013306N07162_60as_output_2024-02-16.zip (5.9 GB of data!)

Note that this is not the exact same setup that caused the "free list full with 5000 items" error message for me (this post: #525 (comment)). That one has exceedingly long run times (more than 24 hours), so I reduced the resolution a bit, but left everything else unchanged.

tovogt (Contributor, Author) commented Feb 16, 2024

Oh, and I can now answer my question of whether this is caused by refinement at the boundary: I enforced that there is no refinement at the boundary, and the problem persists:
[figure fig1001]
Maybe this has to do with the bottom boundary being very close to the equator, and there is some kind of division by zero or similar?

mjberger (Contributor) commented Feb 16, 2024 via email

mjberger (Contributor) commented Feb 16, 2024 via email
