Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

incorrect number of mistakes in structured perceptron code #25

Open
estambolieva opened this issue Jul 23, 2014 · 3 comments
Open

incorrect number of mistakes in structured perceptron code #25

estambolieva opened this issue Jul 23, 2014 · 3 comments
Projects

Comments

@estambolieva
Copy link

not quite sure, but should we not increment the num_mistakes variable at line 56 in the sequences/structured_perceptron.py code by adding the line num_minstakes +=1?

2 minutes later, never mind, I figured it out :). closed.

@miguelbalmeida
Copy link
Contributor

I am reopening this because what comes next is related. The number of mistakes is correctly computed, but the code is very confusing. Me, Luis Sarmento and Zita also had trouble figuring this out.

In day 3, the code for structured perceptron deals with the initial position at two separate places: first it updates the initial weights, then it processes it in the loop over all positions, where mistakes are counted and transition and emission weights are updated.

This led at least four monitors to confusion, so let's make sure each position is processed only in one place. Use one of the following two structures:

Process position 0, update initial and emission weights
Process positions 1 ... N-1 (FOR loop), update emission and transition weights
Process position N, update final, emission and transition weights

or

Process positions 0 ... N (FOR loop)
if position is 0, process initial and emission weights
if position is N, process final, emission and transition weights
else, process emission and transition weights

Thanks to Katia and Luis Sarmento for bringing up the issue.

@miguelbalmeida miguelbalmeida self-assigned this Jul 25, 2014
@andre-martins
Copy link
Contributor

Not sure I agree the proposed solutions are better. There also basically
two kinds of features here:

  • Emission features
  • Transition-like features (which includes transitions, initial [i.e.
    transitions from START] and final [i.e. transitions to STOP]).

The main for loop takes care of the emissions (and counting the mistakes),
hence its range.

Perhaps the most clear way is to separate the transition and emission
updates into two for loops (albeit this would be a bit slower). The code
would become:

1) Make emission updates.

for i in xrange(N)
...

2) Make transition updates.

... # Initial updates.
for i in xrange(1,N)
...
... # Final updates.

2014-07-25 14:43 GMT+01:00 Miguel Almeida notifications@github.com:

I am reopening this because what comes next is related. The number of
mistakes is correctly computed, but the code is very confusing. Me, Luis
Sarmento and Zita also had trouble figuring this out.

In day 3, the code for structured perceptron deals with the initial
position at two separate places: first it updates the initial weights, then
it processes it in the loop over all positions, where mistakes are counted
and transition and emission weights are updated.

This led at least four monitors to confusion, so let's make sure each
position is processed only in one place. Use one of the following two
structures:
Process position 0, update initial and emission weights Process positions
1 ... N-1 (FOR loop), update emission and transition weights Process
position N, update final, emission and transition weights

or
Process positions 0 ... N (FOR loop) if position is 0, process initial
and emission weights if position is N, process final, emission and
transition weights else, process emission and transition weights

Thanks to Katia and Luis Sarmento for bringing up the issue.


Reply to this email directly or view it on GitHub
#25 (comment)
.

@miguelbalmeida
Copy link
Contributor

Fine by me, all three possibilities seem better than the current state.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
lxmls 2019
  
TODO
Development

No branches or pull requests

3 participants