Ghost entries skipped for ILU apply and SpMV operator in all levels of AMG/CPR hierarchy #5182

lisajulia · 2024-02-12T10:31:22Z

Replacement for #4296

lisajulia · 2024-02-12T10:37:48Z

jenkins build this please

lisajulia · 2024-02-12T11:51:44Z

jenkins build this please

blattms · 2024-02-12T12:48:40Z

Fortunately, there are no time-stepping changes and hence no failed tests. There are just new warnings that need to be removed.

lisajulia · 2024-02-12T13:07:53Z

jenkins build this please

lisajulia · 2024-02-12T14:13:31Z

jenkins build this please

lisajulia · 2024-02-12T15:01:00Z

benchmark please

lisajulia · 2024-02-12T17:00:52Z

@blattms: I believe the last comment "benchmark please" did not have any effect - can you show me where I can check that?

lisajulia · 2024-02-12T17:48:29Z

jenkins build this please

blattms · 2024-02-12T18:19:56Z

Turns out you can't. I asked Michael to whitelist you and @aritorto. The command is "benchmark please" and should be used sparingly. We cannot really see whether it worked. Normally the benchmarking report is added to the PR after a few hours, but currently that is broken, too.

blattms · 2024-02-12T18:20:04Z

benchmark please

ytelses · 2024-02-13T02:12:45Z

Benchmark result overview:

Test	Configuration	Relative
opm-git	OPM Benchmark: drogon - Threads: 1	0.999
opm-git	OPM Benchmark: drogon - Threads: 8	0.997
opm-git	OPM Benchmark: punqs3 - Threads: 1	1.007
opm-git	OPM Benchmark: punqs3 - Threads: 8	1.003
opm-git	OPM Benchmark: smeaheia - Threads: 1	0.965
opm-git	OPM Benchmark: smeaheia - Threads: 8	1
opm-git	OPM Benchmark: spe10_model_1 - Threads: 1	1.013
opm-git	OPM Benchmark: spe10_model_1 - Threads: 8	1.004
opm-git	OPM Benchmark: flow_mpi_extra - Threads: 1 - FOIT (Total Oil Injection At End Of Run)	1
opm-git	OPM Benchmark: flow_mpi_extra - Threads: 8 - FOIT (Total Oil Injection At End Of Run)	1
opm-git	OPM Benchmark: flow_mpi_norne - Threads: 1	0.991
opm-git	OPM Benchmark: flow_mpi_norne - Threads: 8	1.006
opm-git	OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 1	1.008
opm-git	OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 8	0.998

Speed-up = Total time master / Total time pull request. Above 1.0 is an improvement. *

View result details @ https://www.ytelses.com/opm/?page=result&id=2370

ytelses · 2024-02-13T10:48:47Z

Benchmark result overview:

Test	Configuration	Relative
opm-git	OPM Benchmark: drogon - Threads: 1	1.005
opm-git	OPM Benchmark: drogon - Threads: 8	0.816
opm-git	OPM Benchmark: punqs3 - Threads: 1	1.002
opm-git	OPM Benchmark: punqs3 - Threads: 8	0.988
opm-git	OPM Benchmark: smeaheia - Threads: 1	0.955
opm-git	OPM Benchmark: smeaheia - Threads: 8	0.885
opm-git	OPM Benchmark: spe10_model_1 - Threads: 1	1.008
opm-git	OPM Benchmark: spe10_model_1 - Threads: 8	0.998
opm-git	OPM Benchmark: flow_mpi_extra - Threads: 1 - FOIT (Total Oil Injection At End Of Run)	1
opm-git	OPM Benchmark: flow_mpi_extra - Threads: 8 - FOIT (Total Oil Injection At End Of Run)	1
opm-git	OPM Benchmark: flow_mpi_norne - Threads: 1	0.99
opm-git	OPM Benchmark: flow_mpi_norne - Threads: 8	0.956
opm-git	OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 1	1.001
opm-git	OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 8	0.931

Speed-up = Total time master / Total time pull request. Above 1.0 is an improvement. *

View result details @ https://www.ytelses.com/opm/?page=result&id=2371

opm/simulators/linalg/FlexibleSolver_impl.hpp

ytelses · 2024-02-13T20:45:26Z

Benchmark result overview:

Test	Configuration	Relative
opm-git	OPM Benchmark: drogon - Threads: 1	1.003
opm-git	OPM Benchmark: drogon - Threads: 8	0.992
opm-git	OPM Benchmark: punqs3 - Threads: 1	0.991
opm-git	OPM Benchmark: punqs3 - Threads: 8	1.012
opm-git	OPM Benchmark: smeaheia - Threads: 1	0.969
opm-git	OPM Benchmark: smeaheia - Threads: 8	1.001
opm-git	OPM Benchmark: spe10_model_1 - Threads: 1	1.002
opm-git	OPM Benchmark: spe10_model_1 - Threads: 8	1.001
opm-git	OPM Benchmark: flow_mpi_extra - Threads: 1 - FOIT (Total Oil Injection At End Of Run)	1
opm-git	OPM Benchmark: flow_mpi_extra - Threads: 8 - FOIT (Total Oil Injection At End Of Run)	1
opm-git	OPM Benchmark: flow_mpi_norne - Threads: 1	0.995
opm-git	OPM Benchmark: flow_mpi_norne - Threads: 8	1.001
opm-git	OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 1	0.995
opm-git	OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 8	0.993

Speed-up = Total time master / Total time pull request. Above 1.0 is an improvement. *

View result details @ https://www.ytelses.com/opm/?page=result&id=2372

ytelses · 2024-02-14T05:20:02Z

Benchmark result overview:

Test	Configuration	Relative
opm-git	OPM Benchmark: drogon - Threads: 1	0.998
opm-git	OPM Benchmark: drogon - Threads: 8	0.992
opm-git	OPM Benchmark: punqs3 - Threads: 1	0.989
opm-git	OPM Benchmark: punqs3 - Threads: 8	1.009
opm-git	OPM Benchmark: smeaheia - Threads: 1	0.969
opm-git	OPM Benchmark: smeaheia - Threads: 8	1.001
opm-git	OPM Benchmark: spe10_model_1 - Threads: 1	1.019
opm-git	OPM Benchmark: spe10_model_1 - Threads: 8	1.005
opm-git	OPM Benchmark: flow_mpi_extra - Threads: 1 - FOIT (Total Oil Injection At End Of Run)	1
opm-git	OPM Benchmark: flow_mpi_extra - Threads: 8 - FOIT (Total Oil Injection At End Of Run)	1
opm-git	OPM Benchmark: flow_mpi_norne - Threads: 1	1.006
opm-git	OPM Benchmark: flow_mpi_norne - Threads: 8	1.005
opm-git	OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 1	0.999
opm-git	OPM Benchmark: flow_mpi_norne_4c_msw - Threads: 8	0.993

Speed-up = Total time master / Total time pull request. Above 1.0 is an improvement. *

View result details @ https://www.ytelses.com/opm/?page=result&id=2373

atgeirr · 2024-02-14T08:31:03Z

Looks like no change in the benchmarks, which is as expected, since they all run using the default linear solver (i.e. ILU0 preconditioner, no CPR/AMG) I believe.

lisajulia · 2024-02-20T07:31:50Z

> Looks like no change in the benchmarks, which is as expected, since they all run using the default linear solver (i.e. ILU0 preconditioner, no CPR/AMG) I believe.

Yes, that makes sense. Currently, I'm still waiting for Andreas to get back to me with the measurements he has done previously.

lisajulia · 2024-03-15T07:51:13Z

> > Looks like no change in the benchmarks, which is as expected, since they all run using the default linear solver (i.e. ILU0 preconditioner, no CPR/AMG) I believe.

> Yes, that makes sense. Currently, I'm still waiting for Andreas to get back to me with the measurements he has done previously.

New results using the current master are below (N = number of processes; simulation time in seconds for the normal run and for the run skipping ghost entries). For 32 and 64 processes the improvement is still around 5%, and for 128 processes it is still around 10%!

N	normal (s)	skip ghost (s)
32	1043.34	1006.81
64	820.60	773.49
128	589.27	541.67
256	still running	still running
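
The rounded percentages can be recomputed from the table. The helper below is only an illustrative sketch (the name `improvementPercent` is made up here, not part of the PR); it takes the relative reduction in run time, (normal - skip ghost) / normal, expressed in percent.

```cpp
#include <cassert>

// Relative reduction in run time, in percent, when going from
// `normal` seconds to `skipGhost` seconds.
// Hypothetical helper for illustration only.
double improvementPercent(double normal, double skipGhost)
{
    assert(normal > 0.0);
    return 100.0 * (normal - skipGhost) / normal;
}
```

Applied to the three finished rows, this gives roughly 3.5% for N = 32, 5.7% for N = 64 and 8.1% for N = 128.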

blattms

Review is not complete yet.

Looking at this PR I suddenly realize that we are already assuming that copy/overlap entries come after owner. This certainly holds when using CpGrid, but I kind of doubt that it holds for other parallel grids 😬 . It seems like there is no check for this in master anywhere, but I'll double check this.

Anyway, if this assumption is true then this restriction has been there for quite some time without us noticing. Hence it should not hold back merging.
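
To make the ordering assumption concrete, here is a minimal sketch of a sparse matrix-vector product that stops after the first `interiorSize` rows. It is not the code from this PR: `CsrMatrix` and `applyInteriorOnly` are hypothetical names, and a plain scalar CSR layout stands in for the block matrices used in the simulator. The point is only that truncating the row loop is correct exactly when all ghost rows are stored last.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal scalar CSR matrix (hypothetical type, not a Dune/OPM class).
struct CsrMatrix {
    std::vector<std::size_t> rowStart; // size = number of rows + 1
    std::vector<std::size_t> col;      // column index of each stored entry
    std::vector<double> val;           // value of each stored entry
    std::size_t rows() const { return rowStart.size() - 1; }
};

// y = A*x restricted to the first `interiorSize` rows.
// Rows [interiorSize, rows) are treated as ghost rows and left untouched,
// which is only correct if every ghost row really is stored last.
void applyInteriorOnly(const CsrMatrix& A, const std::vector<double>& x,
                       std::vector<double>& y, std::size_t interiorSize)
{
    assert(interiorSize <= A.rows());
    assert(y.size() == A.rows());
    for (std::size_t row = 0; row < interiorSize; ++row) {
        double sum = 0.0;
        for (std::size_t k = A.rowStart[row]; k < A.rowStart[row + 1]; ++k) {
            sum += A.val[k] * x[A.col[k]];
        }
        y[row] = sum;
    }
}
```

If an owner row were stored after a ghost row, it would silently be skipped, which is why the ordering matters for grids other than CpGrid.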

blattms · 2024-03-15T09:55:59Z

tests/test_ghostlastmatrixadapter.cpp

+    for(int j=0; j < N; j++)
+        for(int i=overlapStart; i < overlapEnd; i++, ++iter) {


Don't we need to change this to make sure that the entries that are GridAttributes::copy come after the rest?

I think we would need some kind of mapping of the indices to ensure that?

Not sure whether we want an expensive check in GhostLastMatrixAdapter that checks that the GridAttributes::owner entries come first.
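
For reference, one way such a check and index mapping could look is sketched below. This is an assumption-laden illustration, not code from the PR: the `Attribute` enum only mimics the owner/copy distinction, and `ownersComeFirst` and `ownerFirstPermutation` are made-up names. The check is a single linear pass over the local indices.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Stand-in for the grid attribute of each local index (hypothetical enum,
// modelled on the owner/copy distinction discussed above).
enum class Attribute { owner, overlap, copy };

// Returns true if no owner entry appears after the first non-owner entry.
// One linear pass over the local indices.
bool ownersComeFirst(const std::vector<Attribute>& attr)
{
    return std::is_partitioned(attr.begin(), attr.end(),
                               [](Attribute a) { return a == Attribute::owner; });
}

// Permutation that lists owner indices first and keeps the relative order
// inside each group: perm[newIndex] = oldIndex.
std::vector<std::size_t> ownerFirstPermutation(const std::vector<Attribute>& attr)
{
    std::vector<std::size_t> perm(attr.size());
    std::iota(perm.begin(), perm.end(), std::size_t{0});
    std::stable_partition(perm.begin(), perm.end(),
                          [&attr](std::size_t i) { return attr[i] == Attribute::owner; });
    return perm;
}
```

For the attribute sequence owner, copy, owner the check returns false and the permutation is 0, 2, 1. Whether a pass like this is cheap enough to run inside the adapter is exactly the open question above.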

Can we discuss this tomorrow once more and then eventually merge?

lisajulia · 2024-03-15T13:12:37Z

jenkins build this please

lisajulia · 2024-04-17T14:40:12Z

jenkins build this please

lisajulia force-pushed the ilu-op-in-amg branch from 8a44018 to f418bd5 on February 12, 2024 10:36

lisajulia force-pushed the ilu-op-in-amg branch from f418bd5 to 562c15a on February 12, 2024 11:51

lisajulia force-pushed the ilu-op-in-amg branch from 562c15a to 9f95301 on February 12, 2024 13:06

lisajulia force-pushed the ilu-op-in-amg branch from 9f95301 to aa72833 on February 12, 2024 14:13

lisajulia force-pushed the ilu-op-in-amg branch from aa72833 to d443d52 on February 12, 2024 17:47

atgeirr reviewed Feb 13, 2024

opm/simulators/linalg/FlexibleSolver_impl.hpp (outdated)

lisajulia force-pushed the ilu-op-in-amg branch 2 times, most recently from dddb44e to 3818c75 on March 15, 2024 07:37

blattms reviewed Mar 15, 2024

lisajulia force-pushed the ilu-op-in-amg branch from 3818c75 to 05b4af1 on March 15, 2024 13:12

lisajulia force-pushed the ilu-op-in-amg branch from 05b4af1 to 3caa86e on April 17, 2024 14:39

andrthu and others added 2 commits April 18, 2024 17:19

Ghost entries skipped for ilu apply and GL operator in AMG/CPR hierarchy. This works since the ghost entries are the last entries

fb4c7e5

Test for skipping the ghost entries using Jacobi preconditioner

aa9a848

lisajulia force-pushed the ilu-op-in-amg branch from 3caa86e to aa9a848 on April 18, 2024 15:21
