Add tracking of global loss in each epoch #23

Open

wants to merge 6 commits into master

Conversation

ducovrossem

Something to look at while training the model :)

global_loss += 0.5 * entry_weight * (prediction - c_log(count)) **2
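
(For reference, this is the weighted least-squares term of the GloVe objective: entry_weight is the weighting f(X_ij), prediction is the dot product of the two word vectors plus both biases, and count is the raw co-occurrence X_ij. A rough pure-Python equivalent, using illustrative names rather than the actual Cython variables:)

import math

def glove_loss_term(entry_weight, prediction, count):
    # 0.5 * f(X_ij) * (w_i . w_j + b_i + b_j - log X_ij) ** 2
    return 0.5 * entry_weight * (prediction - math.log(count)) ** 2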

@maciejkula
Owner

Makes sense to track the loss. A couple of comments:

  1. Why not just use a primitive double that's initialized in Cython, and then maybe returned from the fit_vectors function? We could then avoid the array + instance attribute approach.
  2. We should only print the loss if verbose == True.
  3. If you are using more than one thread, the loss will be approximate (as the threads will be writing over each other).

fit_vectors returns global loss
Global loss printing over epochs only enabled if verbose is True
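
(Roughly the shape those two commits describe, assuming an epoch loop along the lines of Glove.fit; fit_vectors is passed in as a plain callable here just to keep the sketch self-contained:)

def train(fit_vectors, epochs, verbose=False):
    global_loss = 0.0
    for epoch in range(epochs):
        # fit_vectors now returns the loss accumulated over the whole epoch
        global_loss = fit_vectors()
        if verbose:
            print('Epoch %d: global loss %.5f' % (epoch, global_loss))
    return global_loss
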
@ducovrossem
Author

  1. Changed it around. I had thought about your proposed structure but went with the previous method only because it seemed like less of a rewrite. Why avoid the array + instance attribute approach?
  2. Added.
  3. One could keep a separate global_loss per thread and add them up at the end (sketched below)... I am not sure how much of a detail this is.
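
(A minimal sketch of that per-thread idea in plain Python; the real change would live in the Cython training loop, but the structure is the same: each worker accumulates its own partial loss and the partials are summed at the end.)

from concurrent.futures import ThreadPoolExecutor

def fit_epoch(data_chunks, process_chunk, no_threads):
    # process_chunk trains on one slice of the shuffled co-occurrence entries
    # and returns the loss it accumulated; each worker keeps its own partial
    # sum, so no thread overwrites another's running total.
    with ThreadPoolExecutor(max_workers=no_threads) as pool:
        partial_losses = pool.map(process_chunk, data_chunks)
    return sum(partial_losses)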

@maciejkula
Owner

It just seems strange to have a one-element array when the same purpose can be served by just having a single number.

Regarding the multithreading issue: I don't think this is a huge problem; I just wanted to point out that the loss numbers can be non-deterministic.


-   fit_vectors(self.word_vectors,
+   self.global_loss = fit_vectors(self.word_vectors,
    self.vectors_sum_gradients,
@maciejkula
Owner


Could you just fix the indentation of these lines? self.vectors_sum_gradients should be indented to the same level as self.word_vectors.
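
(That is, with the continuation line aligned under the first argument; the remaining arguments are elided here just as in the diff above:)

self.global_loss = fit_vectors(self.word_vectors,
                               self.vectors_sum_gradients,
                               ...)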

@maciejkula
Owner

I was going to go in sneakily after this is merged and do that :)

@ducovrossem as Radim pointed out, it will be more efficient to only calculate the (prediction - c_log(count)) portion once and then re-use it in the expressions for loss and global_loss.

As for the 0.5 factor: because the expression for the gradient of the loss does not have a 2 * entry_weight scaling factor, it implies that the original loss was 0.5 * (x - y) ** 2. As far as I know this is quite common, because it lets us drop the constant 2 in the derivation.
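
(In code, the suggestion amounts to something like the following pure-Python sketch, with the same illustrative names as before:)

import math

def update_terms(entry_weight, prediction, count):
    residual = prediction - math.log(count)   # computed once, reused twice
    # d/d(prediction) of 0.5 * entry_weight * residual ** 2 is
    # entry_weight * residual: the 2 from differentiating the square
    # cancels the 0.5, so no factor of 2 appears in the gradient.
    gradient_scale = entry_weight * residual
    loss_term = 0.5 * entry_weight * residual ** 2
    return gradient_scale, loss_term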

@piskvorky

Actually, why not factor out log(count) completely, out of the main loop, into the cooccurrence matrix? In other words, do cooccurrence_matrix = log(cooccurrence_matrix).

Or are the actual original counts/weights needed anywhere else, apart from log(count)?

Re. the 0.5 factor: makes sense, thanks!
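
(A sketch of that idea, assuming the counts are stored in a scipy sparse matrix; taking the log of the stored values once up front would replace the per-entry c_log(count) call:)

import numpy as np

def log_cooccurrences(cooccurrence_matrix):
    # Take the element-wise log of the stored non-zero counts once, up front,
    # instead of calling log on every entry inside the training loop.
    logged = cooccurrence_matrix.copy()
    logged.data = np.log(logged.data)
    return logged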

@maciejkula
Owner

The raw value is used for the weighting (and I also quite like the fact that the co-occurrence matrix is marginally model agnostic and could conceivably support a different application).

In general there are a fair number of things that can still be factored out to happen just once (look at the bias updates, for instance). I'll probably do a pass soon and get those out of the way.
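
(The weighting mentioned above is the usual GloVe f(X_ij), which needs the raw count rather than its log; a sketch, with x_max and alpha as the standard GloVe hyperparameters:)

def glove_weight(count, x_max=100.0, alpha=0.75):
    # f(X_ij) = (X_ij / x_max) ** alpha, capped at 1 for very frequent pairs
    return min(1.0, (count / x_max) ** alpha)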

@IronFarm

IronFarm commented Jan 3, 2018

I know this pull request has gotten stale, but is there any interest in getting it merged? I've managed to merge it into master locally and would be willing to fork and open a new pull request where we can discuss it. Three years have passed, but it still looks like a valuable addition!

@gokceneraslan

It'd be really great to add this feature!
