P10 toy example: when the grid size reaches 200, the results are inconsistent with the paper. #201

Open
wdy321 opened this issue May 16, 2024 · 3 comments

Comments

@wdy321

wdy321 commented May 16, 2024

This is the code I modified based on example 1; I only extended the list of grid sizes. When the grid size is updated to 200, the loss suddenly becomes larger and starts to jitter. I don't know what went wrong.

import sys 
sys.path.append("..")
from kan import *

# initialize KAN with G=3
model = KAN(width=[2,1,1], grid=3, k=3)


# create dataset
f = lambda x: torch.exp(torch.sin(torch.pi*x[:,[0]]) + x[:,[1]]**2)
dataset = create_dataset(f, n_var=2)

grids = np.array([3,5,10,20,50,100,200,500,1000])

train_losses = []
test_losses = []
steps = 200
k = 3

for i in range(grids.shape[0]):
    if i == 0:
        # start from scratch at the coarsest grid
        model = KAN(width=[2,1,1], grid=grids[i], k=k)
    else:
        # refine: transfer the previously trained model onto the finer grid
        model = KAN(width=[2,1,1], grid=grids[i], k=k).initialize_from_another_model(model, dataset['train_input'])
    results = model.train(dataset, opt="LBFGS", steps=steps, stop_grid_update_step=30)
    train_losses += results['train_loss']
    test_losses += results['test_loss']

These are the training results:

train loss: 1.42e-02 | test loss: 1.49e-02 | reg: 3.02e+00 : 100%|█| 200/200 [00:25<00:00,  7.71it/s
train loss: 6.42e-03 | test loss: 6.57e-03 | reg: 2.97e+00 : 100%|█| 200/200 [00:21<00:00,  9.12it/s
train loss: 2.91e-04 | test loss: 3.35e-04 | reg: 2.97e+00 : 100%|█| 200/200 [00:20<00:00,  9.57it/s
train loss: 2.21e-05 | test loss: 2.31e-05 | reg: 2.97e+00 : 100%|█| 200/200 [00:19<00:00, 10.15it/s
train loss: 7.55e-06 | test loss: 1.56e-05 | reg: 2.97e+00 : 100%|█| 200/200 [00:22<00:00,  8.82it/s
train loss: 5.01e-06 | test loss: 1.49e-05 | reg: 2.97e+00 : 100%|█| 200/200 [00:19<00:00, 10.48it/s
train loss: 2.67e-02 | test loss: 2.11e-01 | reg: 2.96e+00 : 100%|█| 200/200 [01:21<00:00,  2.46it/s
train loss: 2.53e-02 | test loss: 4.15e-01 | reg: 3.61e+00 : 100%|█| 200/200 [01:32<00:00,  2.17it/s
train loss: 1.89e-01 | test loss: 1.38e+00 | reg: 4.19e+00 : 100%|█| 200/200 [02:11<00:00,  1.52it/s

[figure: plot of train/test loss over the accumulated training steps; the loss jumps up once the grid reaches 200]
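(For reference, a curve like the one attached can be drawn from the accumulated loss lists with a short matplotlib snippet along these lines; matplotlib is not part of the original code and is only assumed here.)

import matplotlib.pyplot as plt

# train_losses / test_losses are the lists accumulated in the loop above
plt.plot(train_losses, label='train')
plt.plot(test_losses, label='test')
plt.yscale('log')
plt.xlabel('step')
plt.ylabel('loss')
plt.legend()
plt.show()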

@iiisak

iiisak commented May 16, 2024

See Figure 2.3 and Section 2.4 (Toy Example) in the paper

@wdy321
Author

wdy321 commented May 16, 2024

@iiisak I want to reproduce the results from that part of the paper, but what I got is shown above. Can you tell me what the problem is?

@KindXiaoming
Owner

KindXiaoming commented May 16, 2024

Hi, at high precision the results can be quite sensitive to random seeds. At least when I made the plot, noise_scale_base=0.0 was the default, whereas the default is now noise_scale_base=0.1.
Please try whether model = KAN(width=[2,1,1], grid=3, k=3, noise_scale_base=0.0) helps. You may also try different random seeds to see how they affect the results, e.g., model = KAN(width=[2,1,1], grid=3, k=3, noise_scale_base=0.0, seed=42). Also, stop_grid_update_step=50 is used by default, while you are using 30. Overall, since changes are happening very fast, my feeling is that exact reproduction is hard, but you should get something similar.
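Putting those suggestions together, here is a minimal sketch of the grid-refinement loop from the first comment with noise_scale_base=0.0, a fixed seed, and the default stop_grid_update_step=50. It assumes the same KAN / create_dataset API as the snippet above and is a starting point, not a guaranteed exact reproduction of the paper's figure.

import sys
sys.path.append("..")
from kan import *
import numpy as np

f = lambda x: torch.exp(torch.sin(torch.pi*x[:,[0]]) + x[:,[1]]**2)
dataset = create_dataset(f, n_var=2)

grids = np.array([3,5,10,20,50,100,200,500,1000])
train_losses, test_losses = [], []
steps = 200
k = 3

for i in range(grids.shape[0]):
    # noise_scale_base=0.0 restores the old default; seed fixes the random seed
    if i == 0:
        model = KAN(width=[2,1,1], grid=grids[i], k=k, noise_scale_base=0.0, seed=42)
    else:
        model = KAN(width=[2,1,1], grid=grids[i], k=k, noise_scale_base=0.0, seed=42).initialize_from_another_model(model, dataset['train_input'])
    # stop_grid_update_step=50 matches the library default mentioned above
    results = model.train(dataset, opt="LBFGS", steps=steps, stop_grid_update_step=50)
    train_losses += results['train_loss']
    test_losses += results['test_loss']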
