Section 3.6: loss computation and sgd issues in the train_ch3 function #155

Open
jhj0411jhj opened this issue Aug 22, 2020 · 1 comment

Comments

@jhj0411jhj

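Bug description
In the train_ch3 function from Section 3.6, if the loss passed in already averages over the batch, train_l_sum accumulates only one batch-mean value per iteration, yet the total is divided by the number of samples at the end, so the printed loss comes out far too small. The call in Section 3.9 is an example.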
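
To make the size of this effect concrete, here is a minimal pure-Python sketch. The per-batch loss value of 0.5 is made up for illustration, and the two helper names are hypothetical, not part of the book's code. It assumes 60000 training samples split into 234 batches of 256 plus one batch of 96.

```python
def epoch_loss_as_reported(batch_mean_losses, batch_sizes):
    """Mimic the reported behavior: add one mean per batch, divide by sample count."""
    return sum(batch_mean_losses) / sum(batch_sizes)


def epoch_loss_weighted(batch_mean_losses, batch_sizes):
    """Rescale each batch mean by its batch size before accumulating."""
    total = sum(l * b for l, b in zip(batch_mean_losses, batch_sizes))
    return total / sum(batch_sizes)


# Assumed batch layout: 234 full batches of 256 and a final batch of 96.
batch_sizes = [256] * 234 + [96]
# Hypothetical: every batch happens to have a mean loss of 0.5.
batch_mean_losses = [0.5] * len(batch_sizes)

reported = epoch_loss_as_reported(batch_mean_losses, batch_sizes)
weighted = epoch_loss_weighted(batch_mean_losses, batch_sizes)
print(f"reported: {reported:.5f}, weighted: {weighted:.5f}")
```

With these made-up numbers the reported value is about 0.002 while the per-sample average is 0.5, which is the kind of gap described above. One possible remedy, under the same assumption that the loss returns a batch mean, would be to accumulate l.item() * y.shape[0] instead of l.item().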

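If the batches are not all the same size (for example, with Fashion-MNIST and batch_size=256 the last batch has 96 samples), then when optimizer=None the default sgd is passed batch_size, which should introduce an error on the last batch. It seems y.shape[0] should be used instead.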
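
The following sketch illustrates the last-batch effect with plain numbers. It assumes that sgd updates a parameter as param - lr * grad / batch_size, with grad being the gradient of the summed batch loss; sgd_step is a hypothetical scalar stand-in for that function, and the gradient values are made up.

```python
def sgd_step(param, grad_sum, lr, divisor):
    """One scalar SGD update, dividing the summed gradient by `divisor`."""
    return param - lr * grad_sum / divisor


lr = 0.1
# Hypothetical last batch: 96 samples, each contributing a gradient of 2.0.
per_sample_grads = [2.0] * 96
grad_sum = sum(per_sample_grads)

# Dividing by the nominal batch size (256) versus the actual one (96).
param_nominal = sgd_step(0.0, grad_sum, lr, 256)
param_actual = sgd_step(0.0, grad_sum, lr, len(per_sample_grads))

# Ratio of the two step sizes equals 96 / 256.
ratio = param_nominal / param_actual
print(param_nominal, param_actual, ratio)
```

Under these assumptions the update on the last batch is only 96/256 = 0.375 times as large as a mean-gradient step would be, so the last batch is effectively trained with a smaller learning rate.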

Also, train_ch5 divides by batch_count, so if batch sizes differ, the final printed loss should also be slightly off.
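
A small sketch of why dividing by the number of batches differs from a per-sample average when batch sizes are unequal. The batch sizes and loss values are made up and deliberately exaggerated so the gap is visible; in practice, with many full batches and one short one, the difference would be much smaller.

```python
# Hypothetical epoch with two full batches and one short batch.
batch_sizes = [256, 256, 96]
# Hypothetical mean loss of each batch.
batch_mean_losses = [1.0, 1.0, 2.0]

# Averaging over the batch count treats every batch as equally important.
mean_of_means = sum(batch_mean_losses) / len(batch_mean_losses)

# Weighting by batch size gives the true per-sample average.
weighted_mean = (
    sum(l * b for l, b in zip(batch_mean_losses, batch_sizes)) / sum(batch_sizes)
)

print(f"mean of batch means: {mean_of_means:.4f}")
print(f"sample-weighted mean: {weighted_mean:.4f}")
```

The short batch gets the same weight as a full one in the first figure, so its loss is over-represented.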

def train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size,
              params=None, lr=None, optimizer=None):
    for epoch in range(num_epochs):
        train_l_sum, train_acc_sum, n = 0.0, 0.0, 0
        for X, y in train_iter:
            y_hat = net(X)
            l = loss(y_hat, y).sum()
            
            # Zero the gradients
            if optimizer is not None:
                optimizer.zero_grad()
            elif params is not None and params[0].grad is not None:
                for param in params:
                    param.grad.data.zero_()
            
            l.backward()
            if optimizer is None:
                sgd(params, lr, batch_size)
            else:
                optimizer.step()  # used in the section "Concise Implementation of Softmax Regression"
            
            
            train_l_sum += l.item()
            train_acc_sum += (y_hat.argmax(dim=1) == y).sum().item()
            n += y.shape[0]
        test_acc = evaluate_accuracy(test_iter, net)
        print('epoch %d, loss %.4f, train acc %.3f, test acc %.3f'
              % (epoch + 1, train_l_sum / n, train_acc_sum / n, test_acc))

Version info
pytorch:
torchvision:
torchtext:
...

@CKing111

Hi, I hit the error below in this part. How should I fix it? I'm not sure where to start.


RuntimeError Traceback (most recent call last)
in
31 % (epoch + 1, train_l_sum / n, train_acc_sum / n, test_acc))
32
---> 33 train_ch3(net, train_iter, test_iter, cross_entropy, num_epochs, batch_size, [W, b], lr)

in train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, params, lr, optimizer)
7 train_l_sum, train_acc_sum, n = 0.0, 0.0, 0
8 for X, y in train_iter:
----> 9 y_hat = net(X)
10 l = loss(y_hat, y).sum()
11

in net(X)
1 def net(X):
----> 2 return softmax(torch.mm(X.view((-1, num_inputs)), W) + b)

RuntimeError: The size of tensor a (10) must match the size of tensor b (3) at non-singleton dimension 1
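
The message reads like a broadcasting failure in the addition inside net: the left operand has 10 columns and the right operand has size 3 in that dimension. One plausible cause, which the traceback alone cannot confirm, is that b was created with a length other than num_outputs (for example, left over from an earlier cell). The sketch below is a pure-Python model of the usual trailing-dimension broadcasting rule; broadcast_shape is a hypothetical helper, not a PyTorch function, and the batch size of 256 is assumed.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    ndim = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s so dimensions align from the right.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for dim, (da, db) in enumerate(zip(a, b)):
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            raise ValueError(
                f"size {da} does not match size {db} at dimension {dim}"
            )
    return tuple(result)


# A (256, 10) product plus a bias of length 10 broadcasts fine.
ok_shape = broadcast_shape((256, 10), (10,))

# The same product plus a bias of length 3 does not.
try:
    broadcast_shape((256, 10), (3,))
    mismatch_message = None
except ValueError as err:
    mismatch_message = str(err)

print(ok_shape, mismatch_message)
```

If this is indeed the cause, printing W.shape and b.shape just before calling train_ch3 should show whether b has num_outputs elements.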
