mpirun signal 9 (killed) #740

Open
subedika opened this issue Jun 5, 2021 · 5 comments

Comments

@subedika

subedika commented Jun 5, 2021

While trying to run go_solver_pbs.bash in the global_s362ani_shakemovie directory, I get the error "mpirun noticed that process rank 7 with PID 0 on node [nodename] exited on signal 9 (killed)", which I suppose means the process ran out of memory. How do I fix this?

@subedika changed the title from "Building own model" to "mpirun signal 9 (killed)" on Jun 7, 2021
@amkearns-usgs

amkearns-usgs commented Aug 31, 2021

I would like to note that I too have run into this sort of issue. My lab is attempting to run the program to generate global simulations with the NEX_XI and NEX_ETA parameters set to 256. I am compiling the program to run with a set of NVIDIA RTX 2080s. While I am aware there is a parameter called MEMORY_INSTALLED_PER_CORE_IN_GB and the related PERCENT_OF_MEM_TO_USE_PER_CORE, assigning different values to these doesn't actually affect the performance of the program. Each GPU thread only seems to use about 150MB of memory on the GPUs; we are running this over 24 threads. The xcreate_header_file binary suggests that we need about 160GB of memory to run the solver. We only have around 128GB of RAM available on the system.

Aside from lowering the value of the NEX_XI and NEX_ETA parameters or increasing the amount of memory available to the system, is there anything we can do to fix this memory issue, especially given that the GPUs' memory is mostly going unused?

@danielpeter
Contributor

@subedika: maybe you can add more details, e.g., attach the output files here? also, try first the most recent devel branch version to check.

@amkearns-usgs: the numbers above don't seem to match: 150MB on the GPU for 24 threads would only amount to 3.6 GB memory needed, not the 160GB mentioned to run the solver. there should be more details in the output_solver.txt files for example. what is the setup in your case, 24 GPU cards spread over 24 compute nodes? or only a single compute node with 1 GPU card and you use CUDA MPS to run all processes on the same card and node?
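
The mismatch pointed out above can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted in the thread (150MB per process from nvidia-smi, 24 processes, the ~160GB estimate from xcreate_header_file); the variable names are made up for illustration.

```python
# Figures quoted in the thread; variable names are illustrative only.
gpu_mb_per_process = 150      # per-process GPU memory as read from nvidia-smi
n_processes = 24              # MPI processes in this run
estimated_total_gb = 160      # solver estimate reported by xcreate_header_file

# Convert MB to GB using decimal units (1 GB = 1000 MB).
gpu_total_gb = gpu_mb_per_process * n_processes / 1000

print(f"GPU memory in use across all processes: {gpu_total_gb:.1f} GB")
print(f"Solver estimate: {estimated_total_gb} GB")
print(f"Unaccounted for on the GPUs: {estimated_total_gb - gpu_total_gb:.1f} GB")
```

The gap suggests that most of the data had not reached the GPUs at the time of the measurement, which is why the solver output is needed to see where the run stops.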

to use more GPU memory per process, you would lower the NPROC_XI/NPROC_ETA values such that the partition size gets closer to your GPU memory size. the parameter MEMORY_INSTALLED_PER_CORE_IN_GB has no effect, it is only used for UNDO_ATTENUATION simulations to estimate the time steps in between wavefield snapshots when SAVE_FORWARD is set or kernel simulations with SIMULATION_TYPE = 3 are run.
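
As a rough illustration of how NPROC_XI/NPROC_ETA change the partitioning: for a 6-chunk global mesh the number of MPI processes is 6 x NPROC_XI x NPROC_ETA, which matches the "24 slices in all the chunks" reported in the solver output posted in this thread. The helper below is a hypothetical sketch, not part of the code base, and it ignores the additional constraints the mesher places on valid NEX/NPROC combinations.

```python
def decomposition(nex_xi, nex_eta, nproc_xi, nproc_eta, nchunks=6):
    """Return process count and per-slice element counts for a chunked mesh.

    Hypothetical helper for illustration; the real mesher enforces further
    constraints on NEX_XI/NEX_ETA that are not checked here.
    """
    if nex_xi % nproc_xi or nex_eta % nproc_eta:
        raise ValueError("NEX must be divisible by NPROC in each direction")
    return {
        "mpi_processes": nchunks * nproc_xi * nproc_eta,
        "nex_xi_per_slice": nex_xi // nproc_xi,
        "nex_eta_per_slice": nex_eta // nproc_eta,
    }


# Setup from this thread: NEX 256 with 2 x 2 slices per chunk.
current = decomposition(256, 256, 2, 2)
# Lowering NPROC to 1 x 1 gives fewer, larger partitions.
larger = decomposition(256, 256, 1, 1)

print(current)  # 24 processes, 128 x 128 surface elements per slice
print(larger)   # 6 processes, 256 x 256 surface elements per slice
```

Fewer processes means each one holds a larger share of the mesh, so per-process GPU memory goes up while the total for the whole simulation stays roughly the same.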

anyway, add more outputs if you want to get more specific answers... :)

@amkearns-usgs

amkearns-usgs commented Sep 7, 2021

To be more precise, the GPU memory usage is only ~150 MB per thread according to nvidia-smi. Main memory usage (according to htop) is multiple GB per thread. Once the program gets past ~5 GB per thread, it crashes with an out-of-memory error.

The system we run on has 4 RTX 2080 GPUs, and I believe it has 10 dual-thread CPU cores (exact hardware according to /proc/cpuinfo is Intel Core i9-9820X, 3.30GHz).
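
The ~5 GB crash point is consistent with a simple host-memory budget. The sketch below assumes memory is split evenly across the 24 processes, which is only an approximation, and uses the 128GB and 160GB figures given earlier in the thread.

```python
# Figures from the thread; the even split across processes is an assumption.
total_ram_gb = 128         # RAM available on the workstation
estimated_total_gb = 160   # estimate reported by xcreate_header_file
n_processes = 24

available_per_process = total_ram_gb / n_processes
required_per_process = estimated_total_gb / n_processes

print(f"available per process: {available_per_process:.2f} GB")
print(f"required per process:  {required_per_process:.2f} GB")
print("fits in RAM" if required_per_process <= available_per_process
      else "does not fit in RAM")
```

Under that assumption each process can grow to a little over 5 GB before the machine runs out of RAM, while each would need closer to 6.7 GB.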

@amkearns-usgs

Here are the contents of output_solver from the last attempted run of the program:


**** Specfem3D MPI Solver ****


Version: v7.0.2-421-gc4c30a79

Planet: Earth

There are 24 MPI processes
Processes are numbered from 0 to 23

There are 256 elements along xi in each chunk
There are 256 elements along eta in each chunk

There are 2 slices along xi in each chunk
There are 2 slices along eta in each chunk
There is a total of 4 slices in each chunk
There are 6 chunks
There is a total of 24 slices in all the chunks

NDIM = 3

NGLLX = 5
NGLLY = 5
NGLLZ = 5

using single precision for the calculations

smallest and largest possible floating-point numbers are: 1.17549435E-38 3.40282347E+38

model: s362ani
incorporating the oceans using equivalent load
incorporating ellipticity
incorporating surface topography
incorporating self-gravitation (Cowling approximation)
incorporating rotation
incorporating attenuation using 3 standard linear solids

incorporating 3-D lateral variations in the mantle
no heterogeneities in the mantle
incorporating crustal variations
using one layer only in crust
incorporating transverse isotropy
no inner-core anisotropy
no general mantle anisotropy

GPU_MODE Active.
runtime : 1
platform: NVIDIA
device : *
GPU number of devices per node: min = 4
max = 4

creating global slice addressing

Spatial distribution of the slices
3 1
2 0

11    9       7    5      19   17
10    8       6    4      18   16

                   23   21
                   22   20

                   15   13
                   14   12

mesh databases:
reading in crust/mantle databases...
reading in outer core databases...
reading in inner core databases...
reading in coupling surface databases...
reading in MPI databases...
for overlapping of communications with calculations:

percentage of edge elements in crust/mantle 5.73075438 %
percentage of volume elements in crust/mantle 94.2692490 %

percentage of edge elements in outer core 14.9479170 %
percentage of volume elements in outer core 85.0520859 %

percentage of edge elements in inner core 14.9107141 %
percentage of volume elements in inner core 85.0892868 %

Elapsed time for reading mesh in seconds = 357.150177

topography:
topography/bathymetry: min/max = -7747 5507

Elapsed time for reading topo/bathy in seconds = 0.508016586

adjacency:
total number of elements in this slice = 167936

using kd-tree search radius = 234.55179839086608 (km)

maximum search elements = 656
maximum of actual search elements (after distance criterion) = 655

estimated typical element size at surface = 39.091966398477680 (km)
maximum distance between neighbor centers = 202.34023456793579 (km)

maximum neighbors found per element = 37 (should be 37 for globe meshes)
total number of neighbors = 4256864

Elapsed time for detection of neighbors in seconds = 16.861965878168121

kd-tree:
total data points: 167936
theoretical number of nodes: 335869
tree memory size: 10.2499084 MB
actual number of nodes: 335871
tree memory size: 10.2499695 MB
maximum depth : 22
creation timing : 6.68773651E-02 (s)

sources: 1


locating sources


source # 1

source located in slice 2
in element 157113

using moment tensor source:
xi coordinate of source in that element: -0.22106384566793172
eta coordinate of source in that element: 0.63624502320449539
gamma coordinate of source in that element: -0.52283578294249500

source time function:
using (quasi) Heaviside source time function

 half duration:    32.399999999999999       seconds
 time shift:    0.0000000000000000       seconds

magnitude of the source:
scalar moment M0 = 2.9586466500749968E+028 dyne-cm
moment magnitude Mw = 8.2807287337668960

original (requested) position of the source:

     latitude:    55.420000000000002     
    longitude:   -157.31999999999999     
        depth:    30.199999999999999       km

position of the source that will be used:

     latitude:    55.419999999999995     
    longitude:   -157.32000000000002     
        depth:    30.200000000000891       km

Error in location of the source: 1.42565528E-12 km

maximum error in location of the sources: 1.42565528E-12 km

Elapsed time for detection of sources in seconds = 3.6215900378301740

End of source detection - done

receivers:

Total number of receivers = 378


locating receivers


reading receiver information...

Stations sorted by epicentral distance:
Station # 120: II.KDAK epicentral distance: 3.530290 degrees
Station # 29: IU.COLA epicentral distance: 10.573919 degrees
Station # 20: IU.ADK epicentral distance: 12.014195 degrees
Station # 162: US.EGAK epicentral distance: 12.325594 degrees
Station # 202: US.WRAK epicentral distance: 14.016299 degrees
Station # 25: IU.BILL epicentral distance: 20.907255 degrees
Station # 191: US.NLWA epicentral distance: 22.179230 degrees
Station # 30: IU.COR epicentral distance: 24.157248 degrees
Station # 169: US.HAWA epicentral distance: 25.199800 degrees
Station # 66: IU.PET epicentral distance: 25.532650 degrees
Station # 189: US.NEW epicentral distance: 25.548803 degrees
Station # 152: US.BMO epicentral distance: 27.369146 degrees
Station # 56: IU.MA2 epicentral distance: 27.600475 degrees
Station # 357: IW.PLID epicentral distance: 27.969683 degrees
Station # 186: US.MSO epicentral distance: 28.138155 degrees
Station # 204: US.WVOR epicentral distance: 28.141666 degrees
Station # 355: IW.MFID epicentral distance: 29.093248 degrees
Station # 350: IW.DLMT epicentral distance: 29.745358 degrees
Station # 171: US.HLID epicentral distance: 29.802465 degrees
Station # 163: US.EGMT epicentral distance: 29.867979 degrees
Station # 153: US.BOZ epicentral distance: 30.159069 degrees
Station # 61: IU.MIDW epicentral distance: 30.790148 degrees
Station # 115: II.FFC epicentral distance: 30.987230 degrees
Station # 164: US.ELK epicentral distance: 31.167278 degrees
Station # 180: US.LKWY epicentral distance: 31.472580 degrees
Station # 353: IW.IMW epicentral distance: 31.572300 degrees
Station # 351: IW.FLWY epicentral distance: 31.595047 degrees
Station # 352: IW.FXWY epicentral distance: 31.682159 degrees
Station # 356: IW.MOOW epicentral distance: 31.774481 degrees
Station # 195: US.RLMT epicentral distance: 31.785198 degrees
Station # 359: IW.TPAW epicentral distance: 31.814299 degrees
Station # 354: IW.LOHW epicentral distance: 31.939100 degrees
Station # 358: IW.SNOW epicentral distance: 31.948198 degrees
Station # 148: US.AHID epicentral distance: 32.187500 degrees
Station # 178: US.LAO epicentral distance: 32.607452 degrees
Station # 172: US.HWUT epicentral distance: 32.669357 degrees
Station # 159: US.DGMT epicentral distance: 32.765194 degrees
Station # 160: US.DUG epicentral distance: 32.902267 degrees
Station # 155: US.BW06 epicentral distance: 33.064751 degrees
Station # 198: US.TPNV epicentral distance: 33.273872 degrees
Station # 88: IU.TIXI epicentral distance: 33.945564 degrees
Station # 48: IU.KIP epicentral distance: 33.954693 degrees
Station # 249: N4.K22A epicentral distance: 34.867802 degrees
Station # 76: IU.RSSD epicentral distance: 35.367790 degrees
Station # 281: N4.O20A epicentral distance: 35.531281 degrees
Station # 69: IU.POHA epicentral distance: 35.630775 degrees
Station # 132: II.PFO epicentral distance: 35.661385 degrees
Station # 144: II.XPFO epicentral distance: 35.661385 degrees
Station # 270: N4.MDND epicentral distance: 35.674683 degrees
Station # 225: N4.E28B epicentral distance: 35.793369 degrees
Station # 102: II.ALE epicentral distance: 36.103497 degrees
Station # 98: IU.YAK epicentral distance: 36.571423 degrees
Station # 203: US.WUAZ epicentral distance: 37.007076 degrees
Station # 173: US.ISCO epicentral distance: 37.251591 degrees
Station # 99: IU.YSS epicentral distance: 37.410580 degrees
Station # 187: US.MVCO epicentral distance: 37.411461 degrees
Station # 147: US.AGMN epicentral distance: 37.489815 degrees
Station # 314: N4.SUSD epicentral distance: 38.048679 degrees
Station # 192: US.OGNE epicentral distance: 38.530491 degrees
Station # 229: N4.F33B epicentral distance: 38.691559 degrees
Station # 197: US.SDCO epicentral distance: 38.722054 degrees
Station # 250: N4.K30B epicentral distance: 38.786301 degrees
Station # 256: N4.KSCO epicentral distance: 39.486588 degrees
Station # 92: IU.TUC epicentral distance: 39.724506 degrees
Station # 43: IU.JOHN epicentral distance: 39.734013 degrees
Station # 161: US.ECSD epicentral distance: 39.810059 degrees
Station # 166: US.EYMN epicentral distance: 40.079155 degrees
Station # 22: IU.ANMO epicentral distance: 40.173866 degrees
Station # 113: II.ERM epicentral distance: 40.240379 degrees
Station # 223: N4.BGNE epicentral distance: 40.435734 degrees
Station # 226: N4.E38A epicentral distance: 40.849354 degrees
Station # 313: N4.SPMN epicentral distance: 41.008095 degrees
Station # 257: N4.L34B epicentral distance: 41.076458 degrees
Station # 156: US.CBKS epicentral distance: 41.268528 degrees
Station # 237: N4.I37B epicentral distance: 41.434464 degrees
Station # 299: N4.R32B epicentral distance: 42.121662 degrees
Station # 272: N4.N35B epicentral distance: 42.219021 degrees
Station # 232: N4.G40A epicentral distance: 42.418022 degrees
Station # 158: US.COWI epicentral distance: 42.481190 degrees
Station # 176: US.KSU1 epicentral distance: 42.854454 degrees
Station # 196: US.SCIA epicentral distance: 42.898548 degrees
Station # 149: US.AMTX epicentral distance: 42.921909 degrees
Station # 271: N4.MSTX epicentral distance: 42.965900 degrees
Station # 238: N4.I40B epicentral distance: 43.061317 degrees
Station # 185: US.MNTX epicentral distance: 43.180031 degrees
Station # 230: N4.F42A epicentral distance: 43.240807 degrees
Station # 273: N4.N38B epicentral distance: 43.625595 degrees
Station # 175: US.JFWS epicentral distance: 43.899185 degrees
Station # 258: N4.L40A epicentral distance: 43.914387 degrees
Station # 239: N4.I42A epicentral distance: 44.004452 degrees
Station # 286: N4.P38B epicentral distance: 44.244911 degrees
Station # 235: N4.H43A epicentral distance: 44.259701 degrees
Station # 315: N4.T35B epicentral distance: 44.421242 degrees
Station # 201: US.WMOK epicentral distance: 44.677055 degrees
Station # 259: N4.L42A epicentral distance: 44.821751 degrees
Station # 227: N4.E46A epicentral distance: 44.923138 degrees
Station # 274: N4.N41A epicentral distance: 45.022800 degrees
Station # 251: N4.K43A epicentral distance: 45.100460 degrees
Station # 287: N4.P40B epicentral distance: 45.148758 degrees
Station # 94: IU.WAKE epicentral distance: 45.152176 degrees
Station # 240: N4.I45A epicentral distance: 45.381012 degrees
Station # 323: N4.TUL3 epicentral distance: 45.541893 degrees
Station # 306: N4.S39B epicentral distance: 45.694679 degrees
Station # 222: N4.ABTX epicentral distance: 45.728127 degrees
Station # 167: US.GLMI epicentral distance: 45.746983 degrees
Station # 45: IU.KBS epicentral distance: 45.761795 degrees
Station # 300: N4.R40B epicentral distance: 45.875923 degrees
Station # 324: N4.U38B epicentral distance: 45.968262 degrees
Station # 170: US.HDIL epicentral distance: 46.000931 degrees
Station # 15: IC.MDJ epicentral distance: 46.089497 degrees
Station # 265: N4.M44A epicentral distance: 46.137505 degrees
Station # 80: IU.SFJD epicentral distance: 46.285595 degrees
Station # 288: N4.P43A epicentral distance: 46.496716 degrees
Station # 347: N4.Z35B epicentral distance: 46.568455 degrees
Station # 27: IU.CCM epicentral distance: 46.619053 degrees
Station # 244: N4.J47A epicentral distance: 46.678482 degrees
Station # 260: N4.L46A epicentral distance: 46.686901 degrees
Station # 58: IU.MAJO epicentral distance: 46.770386 degrees
Station # 82: IU.SLBS epicentral distance: 46.989681 degrees
Station # 174: US.JCT epicentral distance: 47.236687 degrees
Station # 294: N4.Q44B epicentral distance: 47.288155 degrees
Station # 241: N4.I49A epicentral distance: 47.297077 degrees
Station # 312: N4.SFIN epicentral distance: 47.348766 degrees
Station # 316: N4.T42B epicentral distance: 47.423203 degrees
Station # 337: N4.WHTX epicentral distance: 47.438595 degrees
Station # 184: US.MIAR epicentral distance: 47.782181 degrees
Station # 275: N4.N47A epicentral distance: 47.785889 degrees
Station # 289: N4.P46A epicentral distance: 47.809814 degrees
Station # 348: N4.Z38B epicentral distance: 47.912746 degrees
Station # 145: US.AAM epicentral distance: 47.942612 degrees
Station # 307: N4.S44A epicentral distance: 47.996696 degrees
Station # 252: N4.K50A epicentral distance: 48.186100 degrees
Station # 12: IC.HIA epicentral distance: 48.186790 degrees
Station # 282: N4.O48B epicentral distance: 48.511524 degrees
Station # 276: N4.N49A epicentral distance: 48.592594 degrees
Station # 317: N4.T45B epicentral distance: 48.818001 degrees
Station # 266: N4.M50A epicentral distance: 48.889153 degrees
Station # 290: N4.P48A epicentral distance: 48.911953 degrees
Station # 283: N4.O49A epicentral distance: 49.013348 degrees
Station # 188: US.NATX epicentral distance: 49.201656 degrees
Station # 95: IU.WCI epicentral distance: 49.286777 degrees
Station # 277: N4.N51A epicentral distance: 49.588684 degrees
Station # 221: N4.735B epicentral distance: 49.594658 degrees
Station # 318: N4.T47A epicentral distance: 49.706474 degrees
Station # 267: N4.M52A epicentral distance: 49.715790 degrees
Station # 146: US.ACSO epicentral distance: 49.730022 degrees
Station # 301: N4.R49A epicentral distance: 49.870598 degrees
Station # 39: IU.HKT epicentral distance: 49.872437 degrees
Station # 96: IU.WVT epicentral distance: 49.908997 degrees
Station # 165: US.ERPA epicentral distance: 50.060329 degrees
Station # 193: US.OXF epicentral distance: 50.178013 degrees
Station # 302: N4.R50A epicentral distance: 50.348492 degrees
Station # 207: N4.143B epicentral distance: 50.362427 degrees
Station # 295: N4.Q51A epicentral distance: 50.370556 degrees
Station # 245: N4.J55A epicentral distance: 50.435070 degrees
Station # 284: N4.O52A epicentral distance: 50.440907 degrees
Station # 278: N4.N53A epicentral distance: 50.506321 degrees
Station # 177: US.KVTX epicentral distance: 50.548698 degrees
Station # 341: N4.Y45B epicentral distance: 50.571201 degrees
Station # 328: N4.V48A epicentral distance: 50.757614 degrees
Station # 325: N4.U49A epicentral distance: 50.790997 degrees
Station # 215: N4.441B epicentral distance: 50.791645 degrees
Station # 319: N4.T50A epicentral distance: 50.957043 degrees
Station # 296: N4.Q52A epicentral distance: 51.012302 degrees
Station # 291: N4.P53A epicentral distance: 51.126614 degrees
Station # 285: N4.O54A epicentral distance: 51.193005 degrees
Station # 181: US.LONY epicentral distance: 51.202148 degrees
Station # 308: N4.S51A epicentral distance: 51.212288 degrees
Station # 200: US.VBMS epicentral distance: 51.224247 degrees
Station # 246: N4.J57A epicentral distance: 51.298031 degrees
Station # 261: N4.L56A epicentral distance: 51.352261 degrees
Station # 253: N4.K57A epicentral distance: 51.493896 degrees
Station # 338: N4.X48A epicentral distance: 51.562305 degrees
Station # 208: N4.146B epicentral distance: 51.747425 degrees
Station # 297: N4.Q54A epicentral distance: 51.787106 degrees
Station # 183: US.MCWV epicentral distance: 51.852131 degrees
Station # 349: N4.Z47B epicentral distance: 51.895912 degrees
Station # 333: N4.W50A epicentral distance: 51.998638 degrees
Station # 199: US.TZTN epicentral distance: 52.019722 degrees
Station # 247: N4.J59A epicentral distance: 52.042664 degrees
Station # 268: N4.M57A epicentral distance: 52.141338 degrees
Station # 150: US.BINY epicentral distance: 52.145870 degrees
Station # 84: IU.SSPA epicentral distance: 52.226120 degrees
Station # 224: N4.D62A epicentral distance: 52.298801 degrees
Station # 342: N4.Y49A epicentral distance: 52.353786 degrees
Station # 375: NE.VT1 epicentral distance: 52.355080 degrees
Station # 309: N4.S54A epicentral distance: 52.365425 degrees
Station # 228: N4.E62A epicentral distance: 52.386093 degrees
Station # 213: N4.346B epicentral distance: 52.439674 degrees
Station # 182: US.LRAL epicentral distance: 52.621212 degrees
Station # 42: IU.INCN epicentral distance: 52.621807 degrees
Station # 298: N4.Q56A epicentral distance: 52.649811 degrees
Station # 262: N4.L59A epicentral distance: 52.651623 degrees
Station # 303: N4.R55A epicentral distance: 52.678341 degrees
Station # 279: N4.N58A epicentral distance: 52.712772 degrees
Station # 339: N4.X51A epicentral distance: 52.715542 degrees
Station # 179: US.LBNH epicentral distance: 52.835220 degrees
Station # 334: N4.W52A epicentral distance: 52.855061 degrees
Station # 233: N4.G62A epicentral distance: 52.858883 degrees
Station # 218: N4.545B epicentral distance: 52.859642 degrees
Station # 292: N4.P57A epicentral distance: 52.974285 degrees
Station # 374: NE.TRY epicentral distance: 52.994068 degrees
Station # 236: N4.H62A epicentral distance: 52.994175 degrees
Station # 326: N4.U54A epicentral distance: 53.000866 degrees
Station # 364: NE.HNH epicentral distance: 53.025387 degrees
Station # 329: N4.V53A epicentral distance: 53.062744 degrees
Station # 372: NE.PQI epicentral distance: 53.092178 degrees
Station # 248: N4.J61A epicentral distance: 53.139992 degrees
Station # 97: IU.XMAS epicentral distance: 53.209785 degrees
Station # 151: US.BLA epicentral distance: 53.280235 degrees
Station # 242: N4.I62A epicentral distance: 53.394806 degrees
Station # 194: US.PKME epicentral distance: 53.450283 degrees
Station # 231: N4.F64A epicentral distance: 53.497402 degrees
Station # 370: NE.NHFNK epicentral distance: 53.499088 degrees
Station # 243: N4.I63A epicentral distance: 53.656094 degrees
Station # 343: N4.Y52A epicentral distance: 53.676704 degrees
Station # 310: N4.S57A epicentral distance: 53.693417 degrees
Station # 263: N4.L61B epicentral distance: 53.707870 degrees
Station # 123: II.KWJN epicentral distance: 53.708191 degrees
Station # 378: NE.WVL epicentral distance: 53.758553 degrees
Station # 254: N4.K62A epicentral distance: 53.786964 degrees
Station # 211: N4.250A epicentral distance: 53.820126 degrees
Station # 330: N4.V55A epicentral distance: 53.822395 degrees
Station # 327: N4.U56A epicentral distance: 53.925365 degrees
Station # 154: US.BRAL epicentral distance: 53.977077 degrees
Station # 373: NE.QUA2 epicentral distance: 54.000267 degrees
Station # 371: NE.ORNO epicentral distance: 54.020081 degrees
Station # 140: II.TLY epicentral distance: 54.044498 degrees
Station # 320: N4.T57A epicentral distance: 54.078285 degrees
Station # 362: NE.DUNH epicentral distance: 54.123108 degrees
Station # 304: N4.R58B epicentral distance: 54.136063 degrees
Station # 209: N4.152A epicentral distance: 54.186455 degrees
Station # 157: US.CBN epicentral distance: 54.238213 degrees
Station # 41: IU.HRV epicentral distance: 54.250591 degrees
Station # 255: N4.KMSC epicentral distance: 54.273335 degrees
Station # 377: NE.WSPT epicentral distance: 54.279114 degrees
Station # 365: NE.MAACT epicentral distance: 54.305531 degrees
Station # 168: US.GOGA epicentral distance: 54.345608 degrees
Station # 234: N4.G65A epicentral distance: 54.357368 degrees
Station # 280: N4.N62A epicentral distance: 54.376938 degrees
Station # 376: NE.WES epicentral distance: 54.459896 degrees
Station # 369: NE.MATOP epicentral distance: 54.474796 degrees
Station # 368: NE.MANTK epicentral distance: 54.506935 degrees
Station # 361: NE.BCX epicentral distance: 54.573841 degrees
Station # 360: NE.BCDNQ epicentral distance: 54.573841 degrees
Station # 293: N4.P61A epicentral distance: 54.575333 degrees
Station # 367: NE.MAGLO epicentral distance: 54.612709 degrees
Station # 366: NE.MAFXB epicentral distance: 54.719723 degrees
Station # 363: NE.EMMW epicentral distance: 54.761940 degrees
Station # 269: N4.M63A epicentral distance: 54.785454 degrees
Station # 214: N4.352A epicentral distance: 54.951435 degrees
Station # 335: N4.W57A epicentral distance: 55.012547 degrees
Station # 331: N4.V58A epicentral distance: 55.030720 degrees
Station # 264: N4.L64A epicentral distance: 55.031734 degrees
Station # 321: N4.T59A epicentral distance: 55.032310 degrees
Station # 46: IU.KEV epicentral distance: 55.087784 degrees
Station # 216: N4.451A epicentral distance: 55.131645 degrees
Station # 210: N4.154A epicentral distance: 55.138588 degrees
Station # 305: N4.R61A epicentral distance: 55.251312 degrees
Station # 93: IU.ULN epicentral distance: 55.266289 degrees
Station # 107: II.BORG epicentral distance: 55.511566 degrees
Station # 311: N4.S61A epicentral distance: 55.542931 degrees
Station # 344: N4.Y57A epicentral distance: 55.625370 degrees
Station # 322: N4.TIGA epicentral distance: 55.739445 degrees
Station # 340: N4.X58A epicentral distance: 55.815834 degrees
Station # 336: N4.W59A epicentral distance: 55.862198 degrees
Station # 345: N4.Y58A epicentral distance: 56.112461 degrees
Station # 219: N4.553A epicentral distance: 56.199299 degrees
Station # 190: US.NHSC epicentral distance: 56.418922 degrees
Station # 332: N4.V61A epicentral distance: 56.431896 degrees
Station # 10: IC.BJT epicentral distance: 56.437138 degrees
Station # 124: II.LVZ epicentral distance: 56.673843 degrees
Station # 212: N4.257A epicentral distance: 56.786213 degrees
Station # 346: N4.Y60A epicentral distance: 56.846622 degrees
Station # 217: N4.456A epicentral distance: 57.161289 degrees
Station # 220: N4.656A epicentral distance: 57.894760 degrees
Station # 85: IU.TARA epicentral distance: 59.062077 degrees
Station # 50: IU.KNTN epicentral distance: 59.197605 degrees
Station # 33: IU.DWPF epicentral distance: 59.477257 degrees
Station # 17: IC.SSE epicentral distance: 60.356575 degrees
Station # 38: IU.GUMO epicentral distance: 60.847641 degrees
Station # 205: N4.060A epicentral distance: 60.907791 degrees
Station # 206: N4.061Z epicentral distance: 61.500675 degrees
Station # 87: IU.TEIG epicentral distance: 61.715145 degrees
Station # 122: II.KURK epicentral distance: 64.456177 degrees
Station # 19: IC.XAN epicentral distance: 64.744698 degrees
Station # 51: IU.KONO epicentral distance: 64.802391 degrees
Station # 103: II.ARTI epicentral distance: 64.829056 degrees
Station # 86: IU.TATO epicentral distance: 64.895447 degrees
Station # 108: II.BRVK epicentral distance: 64.975082 degrees
Station # 104: II.ARU epicentral distance: 65.179810 degrees
Station # 24: IU.BBSR epicentral distance: 65.437599 degrees
Station # 59: IU.MAKZ epicentral distance: 66.600517 degrees
Station # 34: IU.FUNA epicentral distance: 66.659393 degrees
Station # 18: IC.WMQ epicentral distance: 66.887558 degrees
Station # 9: CU.TGUH epicentral distance: 67.225983 degrees
Station # 11: IC.ENH epicentral distance: 67.372818 degrees
Station # 114: II.ESK epicentral distance: 67.609833 degrees
Station # 130: II.OBN epicentral distance: 69.238815 degrees
Station # 7: CU.MTDJ epicentral distance: 69.420685 degrees
Station # 6: CU.GTBY epicentral distance: 69.447495 degrees
Station # 21: IU.AFI epicentral distance: 70.132820 degrees
Station # 5: CU.GRTK epicentral distance: 70.438568 degrees
Station # 118: II.JTS epicentral distance: 71.592377 degrees
Station # 8: CU.SDDR epicentral distance: 72.364037 degrees
Station # 100: II.AAK epicentral distance: 72.879036 degrees
Station # 40: IU.HNR epicentral distance: 73.780113 degrees
Station # 47: IU.KIEV epicentral distance: 74.105019 degrees
Station # 37: IU.GRFO epicentral distance: 74.819443 degrees
Station # 13: IC.KMI epicentral distance: 75.067055 degrees
Station # 3: CU.BCIP epicentral distance: 75.401962 degrees
Station # 127: II.MSVF epicentral distance: 75.815979 degrees
Station # 117: II.IBFO epicentral distance: 75.917427 degrees
Station # 143: II.XBFO epicentral distance: 75.919067 degrees
Station # 106: II.BFO epicentral distance: 75.919067 degrees
Station # 81: IU.SJG epicentral distance: 75.950951 degrees
Station # 16: IC.QIZ epicentral distance: 76.137657 degrees
Station # 74: IU.RAR epicentral distance: 76.353302 degrees
Station # 32: IU.DAV epicentral distance: 76.893105 degrees
Station # 14: IC.LSA epicentral distance: 77.125587 degrees
Station # 65: IU.PAYG epicentral distance: 77.711273 degrees
Station # 137: II.SIMI epicentral distance: 78.293884 degrees
Station # 109: II.CMLA epicentral distance: 78.469727 degrees
Station # 1: CU.ANWB epicentral distance: 78.730888 degrees
Station # 67: IU.PMG epicentral distance: 79.333153 degrees
Station # 121: II.KIV epicentral distance: 79.550690 degrees
Station # 79: IU.SDV epicentral distance: 80.873322 degrees
Station # 128: II.NIL epicentral distance: 81.302849 degrees
Station # 44: IU.KBL epicentral distance: 82.014137 degrees
Station # 28: IU.CHTO epicentral distance: 82.233315 degrees
Station # 64: IU.PAB epicentral distance: 82.628075 degrees
Station # 36: IU.GNI epicentral distance: 82.955910 degrees
Station # 4: CU.GRGR epicentral distance: 83.308647 degrees
Station # 70: IU.PTCN epicentral distance: 83.481422 degrees
Station # 63: IU.OTAV epicentral distance: 83.485992 degrees
Station # 2: CU.BBGH epicentral distance: 83.648315 degrees
Station # 23: IU.ANTO epicentral distance: 84.687180 degrees
Station # 73: IU.RAO epicentral distance: 86.156281 degrees
Station # 31: IU.CTAO epicentral distance: 89.094749 degrees
Station # 119: II.KAPI epicentral distance: 90.080643 degrees
Station # 57: IU.MACI epicentral distance: 90.171333 degrees
Station # 134: II.RPN epicentral distance: 91.863922 degrees
Station # 129: II.NNA epicentral distance: 94.431213 degrees
Station # 142: II.WRAB epicentral distance: 94.605293 degrees
Station # 71: IU.PTGA epicentral distance: 94.782471 degrees
Station # 141: II.UOSS epicentral distance: 94.983322 degrees
Station # 133: II.RAYN epicentral distance: 98.991165 degrees
Station # 83: IU.SNZO epicentral distance: 99.262672 degrees
Station # 77: IU.SAML epicentral distance: 99.648308 degrees
Station # 135: II.SACV epicentral distance: 99.780228 degrees
Station # 131: II.PALK epicentral distance: 101.315697 degrees
Station # 60: IU.MBWA epicentral distance: 103.267693 degrees
Station # 52: IU.KOWA epicentral distance: 106.796890 degrees
Station # 55: IU.LVC epicentral distance: 107.425476 degrees
Station # 139: II.TAU epicentral distance: 108.617744 degrees
Station # 110: II.COCO epicentral distance: 108.939735 degrees
Station # 53: IU.LCO epicentral distance: 111.527992 degrees
Station # 75: IU.RCBR epicentral distance: 112.243813 degrees
Station # 62: IU.NWAO epicentral distance: 113.976898 degrees
Station # 35: IU.FURI epicentral distance: 114.539948 degrees
Station # 111: II.DGAR epicentral distance: 118.063927 degrees
Station # 126: II.MSEY epicentral distance: 122.967308 degrees
Station # 90: IU.TRQA epicentral distance: 123.109291 degrees
Station # 105: II.ASCN epicentral distance: 124.293205 degrees
Station # 49: IU.KMBO epicentral distance: 124.592354 degrees
Station # 125: II.MBAR epicentral distance: 124.962517 degrees
Station # 112: II.EFI epicentral distance: 134.423752 degrees
Station # 78: IU.SBA epicentral distance: 134.838379 degrees
Station # 136: II.SHEL epicentral distance: 134.975815 degrees
Station # 68: IU.PMSA epicentral distance: 139.134171 degrees
Station # 101: II.ABPO epicentral distance: 139.178696 degrees
Station # 26: IU.CASY epicentral distance: 139.460815 degrees
Station # 54: IU.LSZ epicentral distance: 139.714325 degrees
Station # 91: IU.TSUM epicentral distance: 143.636383 degrees
Station # 72: IU.QSPA epicentral distance: 145.203156 degrees
Station # 116: II.HOPE epicentral distance: 146.808273 degrees
Station # 89: IU.TRIS epicentral distance: 150.073502 degrees
Station # 138: II.SUR epicentral distance: 156.928391 degrees

maximum error in location of all the receivers: 5.45888942E-12 km

Elapsed time for receiver detection in seconds = 0.24783961405046284

End of receiver detection - done

found a total of 378 receivers in all slices
this total is okay

source arrays:
number of sources is 1
size of source array = 1.43051147E-03 MB
= 1.39698386E-06 GB

seismograms:
seismograms written by all processes
writing out seismograms at every NTSTEP_BETWEEN_OUTPUT_SEISMOS = 135500
maximum number of local receivers is 147 in slice 5
size of maximum seismogram array = 227.949142 MB
= 0.222606584 GB

Total number of samples for seismograms = 135500

Reference radius of the globe used is 6371.0000000000000 km

incorporating the oceans using equivalent load

incorporating ellipticity

incorporating surface topography

incorporating self-gravitation (Cowling approximation)

incorporating rotation

incorporating attenuation using 3 standard linear solids

preparing mass matrices
preparing constants
preparing gravity arrays
preparing attenuation
The code uses a constant Q quality factor, but approximated
based on a series of Zener standard linear solids (SLS).
Approximation is performed in the following frequency band:

number of SLS bodies: 3
partial attenuation, physical dispersion only: F

Reference frequency of anelastic model (Hz): 1.00000000
period (s): 1.00000000
Attenuation frequency band min/max (Hz): 1.02351687E-03 / 5.75565845E-02
period band min/max (s) : 17.3742065 / 977.023499
Logarithmic center frequency (Hz): 7.67529383E-03
period (s): 130.288177

using shear attenuation Q_mu

ATTENUATION_1D_WITH_3D_STORAGE : T
ATTENUATION_3D : F
preparing elastic element arrays
using attenuation: shifting to unrelaxed moduli
crust/mantle transverse isotropic and isotropic elements
tiso elements = 98304
iso elements = 69632
inner core isotropic elements
iso elements = 4864
preparing wavefields
allocating wavefields
initializing wavefields

This is where the file ends, because that's where the program stops running.

@danielpeter
Contributor

right, the node or workstation does not have enough memory to fit and run this simulation setup.

the code stops when assigning values to the wavefields. after allocation, the arrays have not been mapped to physical memory yet; that only happens with the first wavefield initialization here. given this setup of 24 MPI processes on a single node and the NEX 256 setting, the estimated requirement is ~160GB of memory.
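
The allocate-then-touch behaviour described above can be illustrated outside the solver. The sketch below uses Python's mmap module as a stand-in; it is not how the Fortran code allocates its wavefields, and the resident-memory reading relies on /proc/self/statm, so it only reports numbers on Linux. The 64 MiB size and the helper names are arbitrary choices for the example.

```python
import mmap
import os

PAGE = mmap.PAGESIZE


def resident_bytes():
    """Return the resident set size in bytes (Linux only), or None elsewhere."""
    try:
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])
        return resident_pages * os.sysconf("SC_PAGE_SIZE")
    except (OSError, ValueError, IndexError, AttributeError):
        return None


def touch_pages(buf):
    """Write one byte into every page of buf and return the page count."""
    touched = 0
    for offset in range(0, len(buf), PAGE):
        buf[offset] = 1
        touched += 1
    return touched


size = 64 * 1024 * 1024          # 64 MiB anonymous mapping
buf = mmap.mmap(-1, size)        # "allocation": address space is reserved
rss_after_alloc = resident_bytes()
pages = touch_pages(buf)         # "initialization": pages are written
rss_after_touch = resident_bytes()

print(f"pages touched: {pages}")
if rss_after_alloc is not None and rss_after_touch is not None:
    growth_mb = (rss_after_touch - rss_after_alloc) / 1e6
    print(f"resident memory grew by about {growth_mb:.0f} MB after first write")
```

On a typical Linux system the resident size should stay nearly flat after the mapping is created and grow by roughly the mapping size once the pages are written, which is why an oversized run tends to die at the initialization step rather than at allocation.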

you will have to run on multiple nodes or workstations (given they can communicate by an MPI installation), or run it on a fat node with more memory.

regarding the GPUs, a GeForce RTX 2080 card has 8GB of memory. the setup with NEX 256 and 24 MPI processes (and model s362ani) will require ~5GB of GPU memory per process, based on my past experience. thus, only a single process would fit onto one card. I'm afraid you will need more GPU cards as well to run this setup.
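
The GPU side of that estimate works out as follows. This is a back-of-the-envelope sketch using the numbers above (8GB per card, ~5GB per process, 4 cards, 24 processes); the ~5GB figure is an experience-based estimate, not a measured value.

```python
import math

# Numbers from the thread; per_process_gb is an estimate, not a measurement.
gpu_mem_gb = 8        # memory on one RTX 2080
per_process_gb = 5    # approximate GPU memory per MPI process at NEX 256
n_gpus = 4            # cards in the workstation
n_processes = 24      # MPI processes in this setup

processes_per_card = int(gpu_mem_gb // per_process_gb)
processes_that_fit = processes_per_card * n_gpus
cards_needed = math.ceil(n_processes / processes_per_card)

print(f"processes per card: {processes_per_card}")
print(f"processes that fit on {n_gpus} cards: {processes_that_fit}")
print(f"cards needed for {n_processes} processes: {cards_needed}")
```

With one process per card, the four available cards cover only 4 of the 24 processes, so this setup would need on the order of 24 such cards.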
