
Cuda heat example w quaditer #913

Draft · wants to merge 15 commits into base: master

Conversation

Abdelrahman912

Heat Example Prototype using CUDA.jl and StaticCellValues

end

Kgpu = CUDA.zeros(dh.ndofs.x, dh.ndofs.x)
gpu_dh = GPUDofHandler(dh)
Member

Maybe that's the plan anyway, but wouldn't it be nicer to write Adapt rules for DofHandler and Grid which return adapt_structure for the GPU structs?

function Adapt.adapt_structure(to, dh::DofHandler)
    return adapt_structure(to, GPUDofHandler(cu(Int32.(dh.cell_dofs)), GPUGrid(dh.grid)))
end

and then just use the "normal" structs from a user perspective, which are automatically converted when needed
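A minimal sketch of that user-facing flow, assuming such Adapt rules exist (the kernel name, launch geometry, and `ngpucells` are hypothetical):

```julia
using CUDA, Adapt

# With an Adapt.adapt_structure rule for DofHandler in place, the user can pass
# the ordinary CPU struct straight to the kernel launch; CUDA.jl runs
# `cudaconvert` (and thus the Adapt rules) on every kernel argument automatically.
Kgpu = CUDA.zeros(Float32, ndofs(dh), ndofs(dh))
@cuda threads = 256 blocks = cld(ngpucells, 256) assemble_kernel!(Kgpu, dh)
```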

Member

I am still a bit split about this, because it contains a performance pitfall: if we have repeated assembly, the dof handler will be converted and copied to the GPU for every assembly instead of only once.
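To avoid that pitfall one can hoist the conversion out of the hot loop; a sketch of the convert-once pattern (the `GPUDofHandler` constructor and `assemble_kernel!` are assumptions following the code above):

```julia
using CUDA

# Convert once: a single host-to-device copy of the dof data.
gpu_dh = GPUDofHandler(dh)

# Reuse the device-resident handler across all assembly launches;
# no further copies of the dof data happen per launch.
for step in 1:nsteps
    @cuda threads = 256 blocks = cld(ncells, 256) assemble_kernel!(Kgpu, gpu_dh)
end
```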

Member

Am I missing something, or wouldn't the adapt call also happen for each assembly kernel launch with the GPUDofHandler in that case?

Member

This should not happen, because we do not need to adapt the GPUDofHandler (it is already a GPU data structure).
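One way to see this: if every field of GPUDofHandler is already a CuArray (or isbits), the per-launch adapt only rewraps device pointers. A sketch using Adapt.jl's helper macro:

```julia
using Adapt

# Generates an adapt_structure method for GPUDofHandler that adapts each field.
# For CuArray fields this is just the CuArray -> CuDeviceArray rewrapping at
# kernel launch time, i.e. a pointer conversion with no host-to-device copy.
Adapt.@adapt_structure GPUDofHandler
```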

end


gm = static_cellvalues.gm
Collaborator

Maybe this is still a work in progress, but the mix of functions and global variables makes the code kind of confusing to read.

@Abdelrahman912
Author

What I did for now, and it's still work in progress:

  1. I added some higher-level abstractions and some restructuring to match the original example (it still needs some refactoring).
  2. I used the QuadratureValuesIterator and edited the StaticCellValues object to be compatible with the GPU.

This is still work in progress; as per my discussion with @termi-official last week, I still need to work on the assembler and the coloring algorithm.

Some problems I have encountered that might not be so straightforward to tackle:

  1. Grid object contains Dict type which is not GPU compatible.

@termi-official
Member

Great to see some quick progress here!

Some problems I have encountered that might not be so straightforward to tackle:

1. `Grid` object contains `Dict` type which is not GPU compatible.

I think that is straightforward to solve. We never really need the Dicts directly during assembly. We should be able to get away with just converting the Vectors (once) to GPUVectors and running the assembly with these. This might require two structs: one holding the full information (e.g. GPUGrid) and one which we use in the kernels (e.g. GPUGridView). Maybe the latter could be something like

struct GPUGridView{TEA, TNA, TSA <: Union{Nothing, <:AbstractVector{Int}, <:AbstractVector{FaceIndex}, ...}, TCA} <: AbstractGrid # (?)
    cells::TEA
    nodes::TNA
    subdomain::TSA
    color::TCA
end

where subdomain just holds the data we want to iterate over (or nothing for all cells), and color is a vector of the elements with one color of the current subdomain.
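For illustration, a hypothetical kernel over such a view could look like this (`assemble_cell!` and the field layout follow the proposal above and are assumptions, not existing Ferrite API):

```julia
# Each thread assembles one cell of the current color. Cells sharing a color
# never share dofs, so the writes into K need no atomics.
function assemble_color_kernel!(K, gv::GPUGridView)
    i = (blockIdx().x - Int32(1)) * blockDim().x + threadIdx().x
    if i <= length(gv.color)
        cellidx = gv.color[i]           # global index of this colored cell
        assemble_cell!(K, gv, cellidx)  # element routine (hypothetical)
    end
    return nothing
end
```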

@KnutAM
Member

KnutAM commented May 23, 2024

A longer-term thing just to throw out the idea, but perhaps a more slim Grid could be nice?

struct Grid{dim, C, T, CV, NV, S}
    cells::CV
    nodes::NV
    gridsets::S
    function Grid(cells::AbstractVector{C}, nodes::AbstractVector{Node{dim, T}}, gridsets) where {C, dim, T}
        return new{dim, C, T, typeof(cells), typeof(nodes), typeof(gridsets)}(cells, nodes, gridsets)
    end
end
struct GridSets
    facetsets::Dict{String, OrderedSet{FacetIndex}}
    cellsets::Dict{String, OrderedSet{Int}}
    ....
end

also allowing `gridsets = nothing`

@termi-official
Member

termi-official commented Jun 4, 2024

A longer-term thing just to throw out the idea, but perhaps a more slim Grid could be nice?

struct Grid{dim, C, T, CV, NV, S}
    cells::CV
    nodes::NV
    gridsets::S
    function Grid(cells::AbstractVector{C}, nodes::AbstractVector{Node{dim, T}}, gridsets) where {C, dim, T}
        return new{dim, C, T, typeof(cells), typeof(nodes), typeof(gridsets)}(cells, nodes, gridsets)
    end
end
struct GridSets
    facetsets::Dict{String, OrderedSet{FacetIndex}}
    cellsets::Dict{String, OrderedSet{Int}}
    ....
end

also allowing `gridsets = nothing`

I have thought about this quite a bit already: whether we should have our grid in the form

struct Grid{dim, C, T, CV, NV, S}
    cells::CV
    nodes::NV
    subdomain_info::S
    function Grid(cells::AbstractVector{C}, nodes::AbstractVector{Node{dim, T}}, subdomain_info) where {C, dim, T}
        return new{dim, C, T, typeof(cells), typeof(nodes), typeof(subdomain_info)}(cells, nodes, subdomain_info)
    end
end

where subdomain_info contains any kind of subdomain information. This could also include some optional topology information, which we need for some problems. In the simplest case it would just be the facesets and cellsets.

However, we should do this in a separate PR. What do you think @fredrikekre ?

@@ -85,16 +86,19 @@ end

function _generate_2d_nodes!(nodes, nx, ny, LL, LR, UR, UL)
    for i in 0:ny-1
        ratio_bounds = i / (ny-1)
        T = typeof(LL[1])
Author

Also regarding grids: I don't know if this is an issue or not, but I specified my tensors (left and right) to be Float32 and generated the grid as follows:

left = Tensor{1,2,Float32}((0, -0)) # the left bottom corner of the grid
right = Tensor{1,2,Float32}((100.0, 100.0)) # the right top corner of the grid
grid = generate_grid(Quadrilateral, (100, 100), left, right);

Because _generate_2d_nodes! uses floating-point division, the division produces Float64 by default, which throws an error when push!(nodes, Node((x, y))) tries to push a node into the nodes list: the list's element type is Float32 whereas the new node is Float64. So, as a quick workaround, I did an explicit cast of ratio_bounds to ensure type compatibility.
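The promotion behavior itself can be reproduced without any grid code; a plain-Julia sketch:

```julia
# Integer / integer division always yields Float64 in Julia:
ratio = 1 / 3
typeof(ratio)  # Float64

# Mixing Float32 coordinates with the Float64 ratio promotes to Float64:
x = 1.0f0 * (1 - ratio)
typeof(x)      # Float64

# Converting the ratio to the coordinate eltype keeps everything Float32:
ratio32 = convert(Float32, ratio)
typeof(1.0f0 * (1 - ratio32))  # Float32
```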

Member

This is nasty. We should be able to circumvent the issue with the following patch for now:

diff --git a/src/Grid/grid_generators.jl b/src/Grid/grid_generators.jl
index 42eca02dc..6e9952b58 100644
--- a/src/Grid/grid_generators.jl
+++ b/src/Grid/grid_generators.jl
@@ -69,7 +69,7 @@ end
 
 function _generate_2d_nodes!(nodes, nx, ny, LL, LR, UR, UL)
       for i in 0:ny-1
-        ratio_bounds = i / (ny-1)
+        ratio_bounds = convert(eltype(LL), (i / (ny-1)))
 
         x0 = LL[1] * (1 - ratio_bounds) + ratio_bounds * UL[1]
         x1 = LR[1] * (1 - ratio_bounds) + ratio_bounds * UR[1]
@@ -78,7 +78,7 @@ function _generate_2d_nodes!(nodes, nx, ny, LL, LR, UR, UL)
         y1 = LR[2] * (1 - ratio_bounds) + ratio_bounds * UR[2]
 
         for j in 0:nx-1
-            ratio = j / (nx-1)
+            ratio = convert(eltype(LL), j / (nx-1))
             x = x0 * (1 - ratio) + ratio * x1
             y = y0 * (1 - ratio) + ratio * y1
             push!(nodes, Node((x, y)))

Member

Can you go ahead and open a PR against master with the patch, together with some test coverage for all grid generators?

5 participants