Svm context engine #635

ShraddhaThumsi · 2019-09-16T18:56:53Z

Preparing to merge SVMContextEngine to master

enoriega · 2019-09-16T21:43:01Z

@ShraddhaThumsi, Travis reports that some of the tests are failing because they look for a file at a path hardcoded to your computer.

For example: /home/sthumsi/enter/reach/main/src/main/resources/org/clulab/context/svmFeatures/svmTrainedModel.dat

That is not the only one; please check Travis's build log to find all the affected file paths.

Can you fix those, please?
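One common way to avoid machine-specific paths like the one above is to load the file from the classpath. The following is a minimal sketch, not the code in this PR: `ResourceLookup` and `openResource` are hypothetical names, and the resource path assumes that `main/src/main/resources` is the resource root, so the file would be visible at `/org/clulab/context/svmFeatures/svmTrainedModel.dat`.

```scala
import java.io.InputStream

object ResourceLookup {
  // Hypothetical helper: looks the file up on the classpath instead of the
  // local file system, and returns None when the resource is absent.
  def openResource(path: String): Option[InputStream] =
    Option(getClass.getResourceAsStream(path))

  def main(args: Array[String]): Unit = {
    // Assumed classpath location, derived from the absolute path in the build log.
    val modelResource = "/org/clulab/context/svmFeatures/svmTrainedModel.dat"

    openResource(modelResource) match {
      case Some(stream) =>
        try println(s"Found $modelResource on the classpath")
        finally stream.close()
      case None =>
        println(s"$modelResource is not on the classpath")
    }
  }
}
```

Because the lookup is relative to the classpath, the same code should behave identically on a developer machine and on the CI server, provided the file is packaged under the resources directory.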

enoriega

Just some minor cleanup requests and I think it's ready.

.sbtopts

logfile_IS_UNDEFINED

enoriega · 2019-09-24T01:17:20Z

main/build.sbt

@@ -35,6 +37,8 @@ libraryDependencies ++= {
    // testing
    "org.scalatest"       %%  "scalatest"      % "3.0.1"  % "test",
    "com.typesafe.akka"   %%  "akka-testkit"   % akkaV    % "test"
+    //"org.ml4ai" %% "scalacontext" % "0.1.0-SNAPSHOT"


Please remove this commented-out line.

enoriega · 2019-09-24T01:19:34Z

main/src/main/resources/application.conf

 # the output formats for mentions:
 # "arizona" (column-based, one file per paper)
 # "cmu" (column-based, one file per paper)
 # "fries" (multiple JSON files per paper)
 # "serial-json" (JSON serialization of mentions data structures. LARGE output!)
 # "text" (non-JSON textual format)
-outputTypes = ["fries"]
+outputTypes = ["fries", "arizona", "cmu", "serial-json", "text"]


Please restore this setting to its original value

main/src/main/scala/org/clulab/reach/ReachSystem.scala

enoriega

It looks good now

enoriega · 2019-09-28T21:08:01Z

@MihaiSurdeanu let me know if you want to take a look; otherwise I consider this ready to be merged.

ShraddhaThumsi · 2019-09-28T23:02:47Z

Thank you!

MihaiSurdeanu · 2020-03-06T03:49:24Z

@enoriega, @ShraddhaThumsi, @cl4yton: this PR was never merged.
Should we merge now? Can one of you summarize what is included here?
Thanks!

enoriega · 2020-03-07T22:30:06Z

@MihaiSurdeanu This pull request contains @ShraddhaThumsi's implementation of the SVM context engine and a training script for a linear SVM, and it ensures that no unit test is broken. You requested that we hold off on merging because the validation of the trained model is still pending.

MihaiSurdeanu · 2020-03-08T00:46:38Z

Thanks!

cl4yton

I have merged master into this branch and all tests are currently passing. I have asked @enoriega to do another pass, and I have also added @MihaiSurdeanu as a reviewer. I do not have the authority to complete the pull request.

MihaiSurdeanu · 2020-03-18T14:29:34Z

@enoriega: please let me know when you're done, and I'll do a pass too.

enoriega

It looks good. All tests pass

enoriega · 2020-03-19T02:17:38Z

I think this is okay; however, I wonder whether you want to hold off on merging until I write the README for the SVMContextEngine and co.

As long as you select the Policy4 context engine, nothing should be affected after the merge

MihaiSurdeanu · 2020-03-19T02:28:13Z

Thanks @enoriega !

I'll wait for the README. Please let me know when it's available.

MihaiSurdeanu · 2020-03-22T16:08:45Z

@enoriega, @ShraddhaThumsi, @cl4yton:
I propose to not merge this yet, until we solve the issue discussed in the email thread. Pasted here, for completeness:

ENRIQUE:
My apologies for not getting back to you until today. I wrote a README on how to use the training script left by Shraddha and on how to configure the SVMContextEngine in the application.conf file.

While doing this I found two annoying details I hadn't noticed back then; fortunately, both are easy to fix. The first is that the path to the SVM model file is hard-coded and can't be changed in the config file.

The second is that the file specifying the subset of features to use for training is supposed to be a text file but is instead a serialized Java string. This should be simple to solve too.

The features file is a product of the experiments Paul Hein ran for the paper. It is not a raw data file: Shraddha made several attempts to regenerate it and they produced different results, so instead of following that unreliable approach we decided to reuse the existing file, aiming to get the same results, and then go back into the feature parser code.
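As an illustration of the conversion described above, here is a minimal sketch of reading a Java-serialized string and writing it back out as plain text. It is not the project's code: `FeatureFileConverter`, the made-up feature names, and the comma delimiter are all assumptions, since the thread does not describe the internal layout of the serialized string.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

object FeatureFileConverter {
  // Reads a single Java-serialized String from the given bytes.
  def readSerializedString(bytes: Array[Byte]): String = {
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes))
    try in.readObject().asInstanceOf[String]
    finally in.close()
  }

  // Writes the feature names as UTF-8 text, one name per line.
  def writeAsText(features: Seq[String], target: Path): Unit = {
    Files.write(target, features.mkString("\n").getBytes(StandardCharsets.UTF_8))
    ()
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for the real file: serialize a made-up, comma-separated list.
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    out.writeObject("featureA,featureB,featureC")
    out.close()

    // Assumption: names are comma-separated inside the serialized string.
    val features = readSerializedString(buffer.toByteArray).split(",").map(_.trim).toSeq
    val target = Files.createTempFile("features", ".txt")
    writeAsText(features, target)
    println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8))
  }
}
```

A plain-text file of this kind can be diffed and reviewed in version control, which a serialized object cannot.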

MIHAI:

Small issue: the feature file should be stored as a text file in resources/. This is hopefully an easy change.
The bigger issue: I am worried that if you train using the features from this file, but at testing time, or when Reach is used in the wild, you use features created in the Scala code, they won’t match. Based on your comments, this seems indeed to be the case. This is a big problem. It means that we have unexpected behavior in actual Reach usage. Should we work on training the model used in Reach using the same feature creation code used during testing?
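One way to make the mismatch concern concrete is a check that compares the feature names in the training file against the names the runtime feature code produces. The sketch below is only an illustration under assumptions: `FeatureConsistency`, `Mismatch`, and the feature names are hypothetical, and how each set would actually be obtained from Reach is not covered in this thread.

```scala
object FeatureConsistency {
  // Features present on one side but not on the other.
  final case class Mismatch(missingAtRuntime: Set[String], unseenInTraining: Set[String]) {
    def isEmpty: Boolean = missingAtRuntime.isEmpty && unseenInTraining.isEmpty
  }

  // Compares the feature names used for training with those produced at runtime.
  def compare(trainingFeatures: Set[String], runtimeFeatures: Set[String]): Mismatch =
    Mismatch(
      missingAtRuntime = trainingFeatures diff runtimeFeatures,
      unseenInTraining = runtimeFeatures diff trainingFeatures
    )

  def main(args: Array[String]): Unit = {
    // Made-up feature names, for illustration only.
    val training = Set("featureA", "featureB", "featureC")
    val runtime = Set("featureB", "featureC", "featureD")

    val result = compare(training, runtime)
    println(s"Missing at runtime: ${result.missingAtRuntime.toSeq.sorted.mkString(", ")}")
    println(s"Unseen in training: ${result.unseenInTraining.toSeq.sorted.mkString(", ")}")
  }
}
```

A check like this only catches differences in feature names; features with the same name but different values would need a comparison on actual data.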

kwalcock · 2021-01-06T15:48:33Z

Do I dare ask for an update? March 2020 was a long time ago. At the very least I need to resolve the merge conflicts that have developed in the interim. It's also entirely possible to keep a parallel branch around, never merge it into master, and close the PR. There may be other users who are now concerned with the repercussions of these changes.

MihaiSurdeanu · 2021-01-06T15:49:43Z

Still in progress... But I hope that @enoriega will wrap this up once he starts his new position.

enoriega · 2021-01-15T01:07:00Z

I'll take it over.

ShraddhaThumsi added 16 commits September 8, 2019 16:18

fixing precison and recall code

4f96f30

fixing precison and recall code

d16ba73

fixing precison and recall code

6c8d68e

fixing precison and recall code

1d955f0

fixing precison and recall code

e8c16ce

possibly there were stray rows in a given directory, because of which…

022b96f

… the bug was arising. Hopefully, this patch does it.

possibly there were stray rows in a given directory, because of which…

f7390dd

… the bug was arising. Hopefully, this patch does it.

possibly there were stray rows in a given directory, because of which…

8c9a1c3

… the bug was arising. Hopefully, this patch does it.

possibly there were stray rows in a given directory, because of which…

b6f0003

… the bug was arising. Hopefully, this patch does it.

possibly there were stray rows in a given directory, because of which…

7946b84

… the bug was arising. Hopefully, this patch does it.

possibly there were stray rows in a given directory, because of which…

0b0f35a

… the bug was arising. Hopefully, this patch does it.

possibly there were stray rows in a given directory, because of which…

a50f53f

… the bug was arising. Hopefully, this patch does it.

possibly there were stray rows in a given directory, because of which…

5237f82

… the bug was arising. Hopefully, this patch does it.

possibly there were stray rows in a given directory, because of which…

9bceb54

… the bug was arising. Hopefully, this patch does it.

possibly there were stray rows in a given directory, because of which…

1f31b88

… the bug was arising. Hopefully, this patch does it.

Refactoring code to get ready to merge to master

326cb3c

ShraddhaThumsi requested review from enoriega and cl4yton September 16, 2019 18:56

ShraddhaThumsi added 2 commits September 16, 2019 12:05

removing stray print statements

3076d28

removing stray print statements

21372b9

ShraddhaThumsi and others added 9 commits September 16, 2019 16:18

removing stray print statements

c603b65

trying master branch way of writing root directory

1e2b39a

starting code with events file instead of annotations

d5642d3

starting code with events file instead of annotations

283b77e

Merge branch 'SVMContextEngine' of https://github.com/clulab/reach in…

83f8fdb

…to SVMContextEngine

starting code with events file instead of annotations

b171fa4

starting code with events file instead of annotations

2c64ee5

starting code with events file instead of annotations

2e81457

starting code with events file instead of annotations

5f44311

ShraddhaThumsi added 6 commits September 23, 2019 13:26

absolute path to the temp dir didn't work, so I will use the URL path…

2391ebb

… to prepare the path to the temporary .dat file

inverting order in which I'm calling the instance

d461e6a

inverting order in which I'm calling the instance

e4a1eeb

inverting order in which I'm calling the instance

9697939

removed test to check for exception, since it doesn't add any extra c…

348f159

…heck on the functionality of the code itself.

added detailed comments on the usage of the script and cleaned up the…

5c51a3d

… test code. Next up: merging with SVMContextEngine.

enoriega requested changes Sep 24, 2019

ShraddhaThumsi added 2 commits September 23, 2019 20:38

made modifications to all but log file as per Enrique's suggestions. …

cd1ca54

…Will make another commit for that, and I'll be ready to push it

all suggestion by Enrique on changing build.sbt, application.conf etc…

3eea465

…. have been attended to. The log file is deleted.

enoriega approved these changes Sep 28, 2019

merging master into branch

3eac754

cl4yton requested review from MihaiSurdeanu and enoriega March 18, 2020 14:02

cl4yton approved these changes Mar 18, 2020

enoriega reviewed Mar 19, 2020

Added README on how to use the TrainSVMClassifier script and how to c…

1183f46

…onfigure the SVMContextEngine in application.conf

MihaiSurdeanu approved these changes Mar 22, 2020

enoriega self-assigned this Jan 15, 2021
