Cross entropy learning #135

Open
mikerabat opened this issue Apr 19, 2024 · 4 comments
Labels: documentation (Improvements or additions to documentation)

@mikerabat

I hope I'm not being too annoying, but you guys are the experts in this area, so I hope I can discuss another neat feature with you...

While browsing through "Neural Networks for Pattern Recognition" by C. M. Bishop, I noticed that there is more than the standard error backpropagation with a mean squared error loss: there is also one called the cross-entropy loss function... A few sources claim that this error/loss function indeed allows faster learning progress...

What do you think? Would that be a viable feature for the library?
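
For reference, the loss I have in mind is the standard categorical cross-entropy (as opposed to the mean squared error), for a one-hot target $y$ and network output $p$:

$$
L_{\mathrm{CE}} = -\sum_i y_i \log p_i
\qquad\text{vs.}\qquad
L_{\mathrm{MSE}} = \tfrac{1}{2}\sum_i (p_i - y_i)^2
$$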

@joaopauloschuler joaopauloschuler self-assigned this Apr 21, 2024
@joaopauloschuler joaopauloschuler added the documentation Improvements or additions to documentation label Apr 21, 2024
@joaopauloschuler
Owner

The forward pass of categorical cross-entropy is implemented via TNNetSoftMax. There is an example at:
https://github.com/joaopauloschuler/neural-api/blob/master/examples/SimpleImageClassifier/SimpleImageClassifier.lpr

Regarding the backpropagation, I have changed my mind about the best approach a number of times. In some APIs, the derivative at the last softmax layer is simply not calculated and the errors are passed back assuming a derivative of 1. You can get this behaviour via the parameter {SkipBackpropDerivative=}1. In most cases, we get faster convergence with {SkipBackpropDerivative=}1.
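
For context, the standard reason this works (not specific to this library): with a softmax output $p_i = e^{z_i} / \sum_j e^{z_j}$ and cross-entropy loss $L = -\sum_i y_i \log p_i$, the gradient with respect to the pre-softmax activations simplifies to

$$
\frac{\partial L}{\partial z_i} = p_i - y_i,
$$

so passing the error back through the softmax layer with a derivative of 1 already gives the correct cross-entropy gradient.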

You can use cross-entropy right now with TNNetSoftMax, as in the example above.
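
To make that concrete, here is a minimal sketch (layer sizes are illustrative, the training setup is omitted, and the {SkipBackpropDerivative=}1 parameter is placed on the TNNetSoftMax constructor as discussed above; see the SimpleImageClassifier example linked above for a complete program):

```pascal
uses neuralnetwork; // unit from the neural-api repository

var
  NN: TNNet;
begin
  NN := TNNet.Create();
  NN.AddLayer([
    TNNetInput.Create(32, 32, 3),        // e.g. a 32x32 RGB input
    TNNetFullConnectReLU.Create(64),     // hidden layer; size is illustrative
    TNNetFullConnectLinear.Create(10),   // one output per class
    // Softmax output: the forward pass implements categorical cross-entropy;
    // SkipBackpropDerivative=1 passes the error back with derivative 1.
    TNNetSoftMax.Create({SkipBackpropDerivative=}1)
  ]);
  // train as in the SimpleImageClassifier example, then:
  NN.Free;
end.
```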

@joaopauloschuler
Owner

@mikerabat , you are certainly not annoying. Glad to help.

@mikerabat
Author

Thank you for the clarification. I guess my misunderstanding was that I thought the cross-entropy type of error propagation would also be applied in the inner layers... I've always struggled to wrap my head around that...

@mikerabat mikerabat changed the title from "Cross entryp learning" to "Cross entropy learning" Apr 24, 2024
@mikerabat
Author

Is there actually a way to implement some kind of weighting in the softmax handling/backpropagation? The reason is that I deal with a dataset that is heavily skewed towards one class (on the order of 100:1, which is a realistic setup for ECG classification...), so there is a heavy bias towards that one class.
One way I have dealt with that problem is to reduce the number of elements in the majority class so that the ratio is at best 5:1... Other approaches could be a weighted loss function, or weighting in the last softmax layer to emphasize the error of the rare classes in the classification step, right? (A rough sketch of the weighting I have in mind is below.)
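
For illustration, the kind of inverse-frequency weighting I have in mind would be roughly the following (plain Pascal sketch, not a neural-api call; the class counts are made up to mimic the ~100:1 skew):

```pascal
program ClassWeightSketch;
// Illustrative only: inverse-frequency class weights for a skewed dataset.
const
  NumClasses = 2;
var
  ClassCount: array[0..NumClasses - 1] of Integer = (100000, 1000); // ~100:1 skew
  ClassWeight: array[0..NumClasses - 1] of Single;
  Total, i: Integer;
begin
  Total := 0;
  for i := 0 to NumClasses - 1 do
    Total := Total + ClassCount[i];
  for i := 0 to NumClasses - 1 do
  begin
    // weight_c = Total / (NumClasses * count_c): the rare class gets a
    // proportionally larger weight, which could scale its loss contribution
    // or its sampling probability.
    ClassWeight[i] := Total / (NumClasses * ClassCount[i]);
    WriteLn('class ', i, ' weight: ', ClassWeight[i]:0:3);
  end;
end.
```

Whether such weights are better applied to the loss or whether plain undersampling is good enough is of course exactly my question.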
