Adding NTK adaptive loss function #834

Closed · wants to merge 7 commits

Conversation

ayushinav (Contributor)

Checklist

  • Appropriate tests were added
  • Any code changes were done in a way that does not break public API
  • All documentation related to code changes were updated
  • The new code follows the contributor guidelines, in particular the SciML Style Guide and COLPRAC.
  • Any new documentation only uses public API


@ayushinav (Contributor Author) commented Mar 16, 2024

One of the arguments for the NTK adaptive loss would be the kernel size (e.g., the 4th cell here). This is the number of points used to build the NTK, which means we would have to sample points again when computing the adaptive loss weights. We could reuse the already-sampled points, but under the current implementation that is not possible because the points are generated inside merge_strategy_with_loss_function in discretize.jl. Also, the kernel size might differ from the batch size, though reusing points instead of resampling would still be an option. A sketch of the weight computation I have in mind is below.
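For concreteness, here is a minimal sketch (not the PR's code; datafree_pde_loss, datafree_bc_loss, θ, and the point sets are hypothetical stand-ins) of how trace-based NTK weights could be computed from data-free loss functions:

using Zygote

# tr(K) for one loss term: the sum over sampled points of the squared norm of
# that point's residual gradient w.r.t. the parameters θ. Points are assumed
# to be stored column-wise in a matrix (an assumption, not NeuralPDE's API).
function ntk_trace(datafree_loss, θ, points)
    sum(eachcol(points)) do x
        g = Zygote.gradient(p -> datafree_loss(x, p), θ)[1]
        sum(abs2, g)
    end
end

tr_pde = ntk_trace(datafree_pde_loss, θ, pde_points)  # kernel-size many points
tr_bc  = ntk_trace(datafree_bc_loss, θ, bc_points)

# Balance the terms by total trace over per-term trace, as in the NTK paper.
λ_pde = (tr_pde + tr_bc) / tr_pde
λ_bc  = (tr_pde + tr_bc) / tr_bc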

As a workaround, I tried generating the points inside generate_adaptive_loss_function. The issue now is that we don't have access to the data-free loss functions, because the current implementation only passes the loss functions after taking the mean over the losses of the sampled points. While the data-free functions are saved for future calls later on, pinnrep does not hold them until then.
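To illustrate the two shapes involved (schematic signatures only, not NeuralPDE's exact API; residuals is a placeholder):

using Statistics

# A data-free loss maps (points, parameters) to per-point residuals, which is
# what the NTK computation needs; the reduced loss has the points baked in and
# only returns the scalar mean.
datafree_loss = (points, θ) -> residuals(points, θ)          # per-point values
reduced_loss  = θ -> mean(abs2, datafree_loss(points, θ))    # scalar after mean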

So, a simpler solution would be to populate pinnrep.loss_functions earlier, right after bc_loss_functions and pde_loss_functions are defined. Since we do not have full_loss_function and additional_loss at that point, we could set them as null functions, so something like

pinnrep.loss_functions = PINNLossFunctions(bc_loss_functions, pde_loss_functions,
                                           () -> (), () -> (),
                                           datafree_pde_loss_functions,
                                           datafree_bc_loss_functions)

here, or probably even earlier, keeping all the functions as nulls until we have the real ones. Wanted to confirm if it's fine doing this, @ChrisRackauckas @sathvikbhagavan.

IIUC, the sampler in StochasticTraining best resembles their sampler (3rd cell here).

The current implementation I have here takes the gradient after taking the MSE; that is, it computes the square of the gradient of the sum of the squared errors, whereas we want the sum of the squares of the gradients of the individual squared errors.
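In symbols (my reading of the above, with e_i the error at the i-th sampled point and θ the parameters):

$$\Big\lVert \nabla_\theta \tfrac{1}{N}\textstyle\sum_i e_i^2 \Big\rVert^2 \ \text{(current)} \qquad \text{vs.} \qquad \sum_i \big\lVert \nabla_\theta\, e_i^2 \big\rVert^2 \ \text{(intended)}$$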

@ayushinav (Contributor Author) commented

I'd be happy to work on #703, which would help resolve the issue here as well. As I understand it now, it's mostly about making a struct that contains the strategy and the sampled points, and maybe the domains as well? Something like the sketch below.
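A hypothetical shape for that struct (names illustrative, not an actual NeuralPDE type):

# Bundle a training strategy with the points it sampled (and optionally the
# domains), so downstream code such as the NTK adaptive loss can reuse the
# same points instead of regenerating them inside
# merge_strategy_with_loss_function.
struct StrategyWithPoints{S, P, D}
    strategy::S   # e.g. StochasticTraining(...)
    points::P     # the sampled point sets, one per loss term
    domains::D    # problem domains, if resampling is needed
end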

@ayushinav closed this May 14, 2024