lgrngn: error in 2D run #105

Open
trontrytel opened this issue Jan 31, 2020 · 7 comments

@trontrytel
Contributor

I'm getting negative rv values:

CHEATING: turning negative values to small positive values                     
A negative number -1.65644e-05 detected in: rv after mixed_rhs_ante_step apply rhs
CHEATING: turning negative values to small positive values                     
A negative number -0.00276932 detected in: rv after condensation               
CHEATING: turning negative values to small positive values                     
A negative number -0.503947 detected in: rv at start of slvr_common::hook_post_step

And soon after an error:

A negative number detected in: th after mixed_rhs_post_step apply rhs (+ output of th rhs)

for this run from master:

running OMP_NUM_THREADS=1 bicycles --outdir=out_test_lgrngn_base_17 --case=dycoms_rf02 --nx=129 --ny=0 --nz=301 --dt=1 --spinup=3600 --nt=25200 --micro=lgrngn --outfreq=900 --backend=CUDA --rng_seed=17 --prs_tol=6e-7 --sd_conc=512 --sstp_cond=10 --sstp_coal=10 --out_wet="25e-7:25e-6|0,1,2,3,6;25e-6:25e-4|0,1,2,3,6;25e-8:25e-4|0,1,2,3,6;2.50e-06:2.68e-06|0;2.68e-06:2.87e-06|0;2.87e-06:3.08e-06|0;3.08e-06:3.30e-06|0;3.30e-06:3.53e-06|0;3.53e-06:3.78e-06|0;3.78e-06:4.05e-06|0;4.05e-06:4.34e-06|0;4.34e-06:4.66e-06|0;4.66e-06:4.99e-06|0;4.99e-06:5.34e-06|0;5.34e-06:5.73e-06|0;5.73e-06:6.14e-06|0;6.14e-06:6.58e-06|0;6.58e-06:7.05e-06|0;7.05e-06:7.55e-06|0;7.55e-06:8.09e-06|0;8.09e-06:8.67e-06|0;8.67e-06:9.29e-06|0;9.29e-06:9.95e-06|0;9.95e-06:1.07e-05|0;1.07e-05:1.14e-05|0;1.14e-05:1.22e-05|0;1.22e-05:1.31e-05|0;1.31e-05:1.41e-05|0;1.41e-05:1.51e-05|0;1.51e-05:1.61e-05|0;1.61e-05:1.73e-05|0;1.73e-05:1.85e-05|0;1.85e-05:1.99e-05|0;1.99e-05:2.13e-05|0;2.13e-05:2.28e-05|0;2.28e-05:2.44e-05|0;2.44e-05:2.62e-05|0;2.62e-05:2.81e-05|0;2.81e-05:3.01e-05|0;3.01e-05:3.22e-05|0;3.22e-05:3.45e-05|0;3.45e-05:3.70e-05|0;3.70e-05:3.96e-05|0;3.96e-05:4.25e-05|0;4.25e-05:4.55e-05|0;4.55e-05:4.87e-05|0;4.87e-05:5.22e-05|0;5.22e-05:5.60e-05|0;5.60e-05:6.00e-05|0;6.00e-05:6.43e-05|0;6.43e-05:6.89e-05|0;6.89e-05:7.38e-05|0;7.38e-05:7.91e-05|0;7.91e-05:8.47e-05|0;8.47e-05:9.08e-05|0;9.08e-05:9.73e-05|0;9.73e-05:1.04e-04|0;1.04e-04:1.12e-04|0;1.12e-04:1.20e-04|0;1.20e-04:1.28e-04|0;1.28e-04:1.37e-04|0;1.37e-04:1.47e-04|0;1.47e-04:1.58e-04|0;1.58e-04:1.69e-04|0;1.69e-04:1.81e-04|0;1.81e-04:1.94e-04|0;1.94e-04:2.08e-04|0;2.08e-04:2.23e-04|0;2.23e-04:2.39e-04|0;2.39e-04:2.56e-04|0;2.56e-04:2.74e-04|0;2.74e-04:2.94e-04|0;2.94e-04:3.15e-04|0;3.15e-04:3.37e-04|0;3.37e-04:3.61e-04|0;3.61e-04:3.87e-04|0;3.87e-04:4.15e-04|0;4.15e-04:4.45e-04|0;4.45e-04:4.76e-04|0;4.76e-04:5.10e-04|0;5.10e-04:5.47e-04|0;5.47e-04:5.86e-04|0;5.86e-04:6.28e-04|0;6.28e-04:6.73e-04|0;6.73e-04:7.21e-04|0;7.21e-04:7.73e-04|0;7.73e-04:8.28e-04|0;8.28e-04:8.87e-04|0;8.87e-04:9.50e-04|0;9.50e-04:1.02e-03|0;1.02e-03:1.09e-03|0;1.09e-03:1.17e-03|0;1.17e-03:1.25e-03|0;1.25e-03:1.34e-03|0;1.34e-03:1.44e-03|0;1.44e-03:1.54e-03|0;1.54e-03:1.65e-03|0;1.65e-03:1.77e-03|0;1.77e-03:1.90e-03|0;1.90e-03:2.03e-03|0;2.03e-03:2.18e-03|0;2.18e-03:2.33e-03|0;2.33e-03:2.50e-03|0"

@pdziekan
Contributor

negative th is really bad ;)

After the error message, th and the RHS of th should be printed.

RHS of th should be 0, can you check if it is so?
If yes, then the only thing that comes to my mind is that th becomes negative after MPDATA advection.

Can you look at the th array in the output to check at what height the negative th appears?
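
One way to answer the height question is to scan the dumped array level by level. The following is only a minimal sketch: `negative_levels` is a hypothetical helper, not part of UWLCM, and it assumes the 2D field has been copied into a flat `std::vector<double>` indexed as `field[i * nz + k]` (i horizontal, k vertical), which may not match the actual output layout.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of UWLCM).
// Assumed layout: field[i * nz + k], i = horizontal index, k = vertical index.
// Returns the vertical indices k at which at least one value is negative.
std::vector<std::size_t> negative_levels(const std::vector<double>& field,
                                         std::size_t nx, std::size_t nz)
{
  std::vector<std::size_t> levels;
  for (std::size_t k = 0; k < nz; ++k)
  {
    for (std::size_t i = 0; i < nx; ++i)
    {
      if (field[i * nz + k] < 0.0)
      {
        levels.push_back(k);
        break; // one negative value is enough to flag this level
      }
    }
  }
  return levels;
}
```

Multiplying each returned index by the vertical grid spacing of the run gives the height, which can then be compared with the position of the cloud layer.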

@trontrytel
Contributor Author

@pdziekan - I have checked the things you asked about. The RHS is zero.

Since it starts with negative rv, I have plotted the values that are printed in the warnings. Looks like it escalates quickly ;)

image

I also plotted the theta field that is output when the simulation crashes, on a reasonable color scale (left panel) and over its actual range (right panel). Ignore the x and y axis ticks, those are just the gridpoint numbers. Could it be some issue with super-droplet condensation/evaporation? It is happening inside the cloud layer.

image

@trontrytel
Contributor Author

Though it could also be related to updraft velocity, not sure. Plotting things from my last successful output doesn't reveal anything.

@trontrytel
Contributor Author

I was also thinking that those negtozero functions should have a limit based on the absolute value of the number we want to overwrite with zero. If that number is greater than some threshold, we should just raise an error and abort. It seems like this simulation should have crashed much sooner.
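
A minimal sketch of such a thresholded check, to make the proposal concrete. This is not the existing negtozero code: the function name `negtozero_checked`, the `tol` parameter, and the choice to throw `std::runtime_error` are all assumptions made for illustration, and it writes exact zeros whereas the current warnings mention small positive values.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch, not the existing UWLCM negtozero implementation.
// Negative values in [-tol, 0) are treated as round-off and set to zero.
// Anything more negative than -tol throws instead of being silently fixed.
// NaN values are left untouched. Returns the number of values overwritten.
std::size_t negtozero_checked(std::vector<double>& field, double tol,
                              const std::string& where)
{
  std::size_t n_clamped = 0;
  for (double& v : field)
  {
    if (!(v < 0.0)) continue; // non-negative or NaN: nothing to do
    if (v < -tol)
    {
      std::ostringstream msg;
      msg << "negative value " << v << " exceeds tolerance " << tol
          << " in: " << where;
      throw std::runtime_error(msg.str());
    }
    v = 0.0;
    ++n_clamped;
  }
  return n_clamped;
}
```

With a tolerance of, say, 1e-6 (an arbitrary illustrative value), the first warning above would already abort the run instead of letting the values grow to -0.503947.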

@pdziekan
Contributor

pdziekan commented Feb 4, 2020

@trontrytel I'm pretty sure that the negative th is not a result of condensation/evaporation, because there is a negcheck right after that forcing is applied and it doesn't find any negative values. Negative th is found after the advection routine.

I wonder if the negative th has something to do with the negative rv.
I think that rv becomes negative because of the forcing associated with the large-scale subsidence.
You could disable subsidence and see if that helps. To disable it, you need to:

  1. set "params.subsidence = false;" in src/cases/DYCOMS.hpp
  2. compile UWLCM

NOTE: once subsidence is disabled, negative th may not appear for the same rng_seed as before. Therefore you would need to run an ensemble of simulations to know if getting rid of negative rv also helps for the negative th issue.

@trontrytel
Contributor Author

I'll test that. I also have a similar error in a blk_1m simulation:

A negative number -0.00199518 detected in: rv at start of slvr_common::hook_post_step

that soon leads to

adj_cellwise.hpp:109: void libcloudphxx::blk_1m::adj_cellwise_nwtrph(const libcloudphxx::blk_1m::opts_t<real_t>&, const cont_t&, cont_t&, cont_t&, cont_t&, const real_t&) [with real_t = float; cont_t = blitz::Array<float, 2>]: Assertion `th >= 273.15' failed

So your guess looks possible now. I'll start debugging from the blk_1m simulation since it's faster and cheaper.

@trontrytel
Contributor Author

I ran 20 blk_1m simulations with subsidence disabled and did not encounter any negative rv or negative th errors. It could still be a coincidence, but maybe not.
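
A rough way to gauge the coincidence question, assuming each run fails independently with the same unknown probability $p$ (an assumption, not something measured in this thread): the chance of seeing no failures in $n$ runs is

$$
P(\text{no failures in } n \text{ runs}) = (1 - p)^{n} .
$$

For $n = 20$, requiring this probability to be at least $0.05$ gives $p \le 1 - 0.05^{1/20} \approx 0.14$. So 20 clean runs are consistent with any per-run failure rate below roughly 14%, and they are strong evidence only if the failure rate with subsidence enabled is well above that.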
