Add a transformer neural operator model and an accompanying example for training it on Darcy flow #293

Open · wants to merge 23 commits into main

Conversation

zijieli-Jlee (Contributor):

Main changes:

  • Add transformer_no.py under models, which implements a Transformer encoder-decoder architecture for operator learning; __init__.py and model_dispatcher.py are updated accordingly. A rough sketch of the overall pattern is given after this list.
  • Add Gaussian Fourier features and Siren to embeddings.py; these are useful for building the neural field (query-point embedding) in the Transformer decoder.
  • Add a training example train_darcy_transformer.py under scripts; the corresponding config YAML file is darcy_transformer_config.yaml.
  • Fix a bug in the previous attention layer test (#290).
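A rough sketch of the encoder-decoder pattern the model follows (illustrative only; the class and argument names below are not the actual transformer_no.py code): the input function values at the source points are encoded with self-attention, and the output is decoded at arbitrary query coordinates via cross-attention against a query-point embedding.

```python
import torch
import torch.nn as nn

class ToyTransformerNO(nn.Module):
    """Illustrative sketch, not the PR's transformer_no.py: self-attention over
    source points, then cross-attention from query-point embeddings."""

    def __init__(self, in_channels=1, out_channels=1, hidden=64, n_heads=4, n_layers=2):
        super().__init__()
        self.lift = nn.Linear(in_channels + 2, hidden)  # +2 for 2D coordinates
        self.encoder = nn.ModuleList(
            [nn.TransformerEncoderLayer(hidden, n_heads, batch_first=True)
             for _ in range(n_layers)]
        )
        self.query_embed = nn.Linear(2, hidden)  # stand-in for the neural-field query embedding
        self.cross_attn = nn.MultiheadAttention(hidden, n_heads, batch_first=True)
        self.project = nn.Linear(hidden, out_channels)

    def forward(self, u, pos_src, pos_qry=None):
        # u: [batch, n_src, in_channels]; pos_src / pos_qry: [batch, n_pts, 2]
        pos_qry = pos_src if pos_qry is None else pos_qry
        z = self.lift(torch.cat([u, pos_src], dim=-1))
        for layer in self.encoder:
            z = layer(z)
        q = self.query_embed(pos_qry)
        out, _ = self.cross_attn(q, z, z)
        return self.project(out)  # [batch, n_qry, out_channels]
```

Because the decoder is queried at pos_qry, the output function can be evaluated at points that differ from the input grid.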

@dhpitt (Collaborator) left a comment:

Thanks @zijieli-Jlee! This looks awesome. A few minor things:

  1. Thanks for fixing "AttentionKernelLayer fails test due to error in Einsum" (#290)! Would it be OK to split this PR into one that fixes that issue (which we can merge immediately) and one that implements the TransformerNO (which may take longer to go over)?
  2. It would be great to add unit tests for the TransformerNO model, along the lines of neuralop/models/tests/test_fno.py, just verifying that the forward pass produces outputs of the shape we would expect (see the sketch after this list).
  3. It might help keep the code flexible and easier to maintain if the EncoderBlocks were their own module in neuralop.layers, along the lines of the FNOBlocks.
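To illustrate point 2, such a test might look roughly like this (a sketch only; the constructor arguments and forward signature are assumptions, not the model's actual API):

```python
import torch

from neuralop.models import TransformerNO  # assuming the model is exported here


def test_transformer_no_forward_shapes():
    # Hypothetical constructor arguments; adjust to the real signature.
    model = TransformerNO(in_channels=1, out_channels=1, hidden_channels=32, n_dim=2)
    batch_size, n_points = 4, 64
    u = torch.randn(batch_size, n_points, 1)    # input function values
    pos = torch.rand(batch_size, n_points, 2)   # grid point coordinates
    out = model(u, pos=pos)
    # The forward pass should preserve the batch and point dimensions.
    assert out.shape == (batch_size, n_points, 1)
```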

@dhpitt (Collaborator) left a comment:

Thank you for the updates, @zijieli-Jlee; the PR is in really good shape overall. I left a few small comments.

model = get_model(config)


class ModelWrapper(torch.nn.Module):
Collaborator:

What's the function of the ModelWrapper class? It seems like it adds functionality to the Transformer NO's forward pass. Some documentation would be helpful.

Member:

Is it possible to incorporate the things in ModelWrapper into the dataset instead? I think this is the convention we have been following in the library. @JeanKossaifi Feel free to correct me if I'm wrong.

self.enc_pos_emb_module = None
self.dec_pos_emb_module = None

self.encoder = nn.ModuleList(
Collaborator:

nit: FNOBlocks takes n_layers as a parameter instead of exposing an nn.ModuleList within the FNO itself. It might be nice to follow a similar convention here.
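For concreteness, that convention would look something like this (names here are illustrative, not a prescription):

```python
import torch.nn as nn

class TransformerEncoderBlocks(nn.Module):
    """Illustrative sketch: owns the stack of encoder layers, analogous to FNOBlocks."""

    def __init__(self, hidden_channels, n_heads, n_layers=4):
        super().__init__()
        self.n_layers = n_layers
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(hidden_channels, n_heads, batch_first=True)
             for _ in range(n_layers)]
        )

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

# In the model, instead of building an nn.ModuleList inline:
# self.encoder = TransformerEncoderBlocks(hidden_channels, n_heads, n_layers=n_layers)
```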

zijieli-Jlee (Contributor, PR author):

Got it;)

@zijieli-Jlee (Contributor, PR author):

Hi @dhpitt, thanks for the comments!

  1. I saw this was fixed in "copy zijieli-Jlee's fix for attention kernel layer test" (#295); let me know if there is still any remaining issue.
  2. Definitely. I will add tests for the TransformerNO if its current structure looks fine (see the next point).
  3. Thanks for the suggestion! I added transformer_block to layers, following fno_block; it contains implementations of TransformerEncoderBlock and TransformerDecoderBlock. I also added tests for these modules in test_transformer_block.

@dhpitt (Collaborator) commented on Mar 7, 2024:

@zijieli-Jlee, thanks for the edits; your PR is in good shape. If you fix the conflicts, we can re-run the tests and see if it's ready to go.

@dhpitt (Collaborator) commented on Mar 26, 2024:

Thanks for the fixes, @zijieli-Jlee! I fixed a one-line conflict with a model import that arose since the last update to main.

This PR is looking great overall, but I think your layers and util functions could use clearer documentation in their docstrings; see neuralop/layers/integral_transform.py for an example of informative docs.
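For example, a NumPy-style docstring along these lines would go a long way (the parameter names here are placeholders, not the block's actual signature):

```python
import torch.nn as nn


class TransformerEncoderBlock(nn.Module):
    """Self-attention encoder block for the Transformer neural operator.

    Parameters
    ----------
    hidden_channels : int
        Width of the token embeddings processed by the block.
    n_heads : int
        Number of attention heads.
    norm : str, optional
        Normalization applied before the attention layer, by default "layer_norm".
    """
```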

@zijieli-Jlee (Contributor, PR author):

I added docstrings to transformer_block and to some of the newly added positional encoding classes. In addition, I just spotted and fixed a bug in the normalization inside the attention layer.

@dhpitt (Collaborator) left a comment:

This PR is looking great. My only comment is a small nitpick about putting the source attribution for the borrowed code inside the docstrings. Otherwise I think this is ready to go.

pos_src: torch.Tensor, grid point coordinates of shape [batch_size, num_src_grid_points, channels]
pos_emb_module: nn.Module, positional embedding module, by default None
pos_qry: torch.Tensor, grid point coordinates of shape [batch_size, num_sry_grid_points, channels],
by default None and is set to pos_src
Collaborator:

It would be useful to state the specific purpose of pos_qry as compared to pos_src: these are the inputs to the query basis function.
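For example, the entry could read something like this (suggested wording only):

```
pos_qry: torch.Tensor, coordinates of the query points at which the output function is
    evaluated, i.e. the inputs to the query basis function in the decoder, of shape
    [batch_size, num_qry_grid_points, channels]; by default None, in which case pos_src is used
```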

# Gaussian random Fourier features
# code modified from: https://github.com/ndahlquist/pytorch-fourier-feature-networks
class GaussianFourierFeatureTransform(nn.Module):
Collaborator:

nit: these comments probably belong in the docstring
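Concretely, something like this (sketch of the docstring placement only):

```python
import torch.nn as nn


class GaussianFourierFeatureTransform(nn.Module):
    """Gaussian random Fourier feature embedding of input coordinates.

    Code modified from:
    https://github.com/ndahlquist/pytorch-fourier-feature-networks
    """
```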


# SirenNet
# code modified from: https://github.com/lucidrains/siren-pytorch/blob/master/siren_pytorch/siren_pytorch.py
Collaborator:

Same as above: it would be cleaner to move these comments into the docstring.

@dhpitt (Collaborator) commented on Apr 2, 2024:

Looks great! Asking @JeanKossaifi for final approval.

@mliuschi (Member) left a comment:

I left only a couple of comments because it mostly looks good to me!

pos_src=pos,
positional_embedding_module=pos_emb_module,
**kwargs)
u = u + u_attention_skip
Member:

Have you noticed any empirical improvements from making this skip connection a pointwise linear layer (not an MLP) rather than an identity connection? If so, it could be interesting to add this as an option.
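For clarity, the variant I have in mind would replace the identity skip with a learned per-point linear map on the skip branch, roughly like this (shapes and placement are assumptions, not the PR's code):

```python
import torch
import torch.nn as nn

hidden_channels = 64
# A single Linear applied independently at every grid point (equivalent to a
# 1x1 convolution over the point dimension), with no hidden layer or nonlinearity.
skip_proj = nn.Linear(hidden_channels, hidden_channels)

u_attention_skip = torch.randn(4, 256, hidden_channels)  # value saved before attention
u = torch.randn_like(u_attention_skip)                   # attention output

u_identity = u + u_attention_skip              # identity skip, as in the quoted code
u_pointwise = u + skip_proj(u_attention_skip)  # pointwise linear skip variant
```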

dim_in: int, Number of input channels.
dim_out: int, Number of output channels.
w0: float, scaling factor (denominator) used to initialize the weights, by default 6.
Collaborator:

w0 seems to default to 1 instead of 6?

Collaborator:

It is also used to initialize the Sine activation scaling; is this intentional?

weight = torch.zeros(dim_out, dim_in)
bias = torch.zeros(dim_out) if use_bias else None
self.init_(weight, bias, c=c, w0=w0)
Collaborator:

Is there a reason that we do not use torch.nn.Linear and torch.nn.init.uniform_ to initialize?
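That is, something along these lines (a sketch assuming the standard SIREN uniform bound; the helper name is made up):

```python
import math
import torch.nn as nn


def make_siren_linear(dim_in, dim_out, w0=1.0, c=6.0, use_bias=True):
    """Build a SIREN layer with torch.nn.Linear and torch.nn.init.uniform_
    instead of allocating the weight and bias tensors by hand."""
    linear = nn.Linear(dim_in, dim_out, bias=use_bias)
    bound = math.sqrt(c / dim_in) / w0  # standard SIREN uniform bound
    nn.init.uniform_(linear.weight, -bound, bound)
    if use_bias:
        nn.init.uniform_(linear.bias, -bound, bound)
    return linear
```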

Collaborator:

These edits could be removed from the PR



class SirenNet(nn.Module):
Collaborator:

@dhpitt Should we move this to layers/mlp.py?

Collaborator:

That makes sense to me



class GaussianFourierFeatureTransform(nn.Module):
Collaborator:

@dhpitt We now have two Fourier feature embeddings, i.e., PositionalEmbedding above and this one. I think we should name them more consistently, e.g., GaussianFourierEmbedding and FourierEmbedding?

Collaborator:

Good idea @zijieli-Jlee

@dhpitt (Collaborator) left a comment:

This all looks great. @JeanKossaifi
