This repository has been archived by the owner on Feb 19, 2020. It is now read-only.

Please make the epic-pos-en model available #24

Open
mark-watson opened this issue Jan 27, 2015 · 14 comments

Comments

@mark-watson

If you have time, please make the epic-pos-en model available. I have the Treebank data, but it would be easier to not build it myself (and other people might use it if it were available).

@dlwh
Owner

dlwh commented Jan 27, 2015

actually it is available! version 2015.1.25

@reactormonk
Collaborator

@briantopping

Is it possible to get some instructions on how to use this? The following fails on my machine:

java -Xmx4g -cp target/scala-2.11/epic-assembly-0.4-SNAPSHOT.jar epic.parser.ParseText --model ~/Downloads/epic-parser-en-span_2.10-2014.6.3-SNAPSHOT.jar --nthreads 4 /tmp/travel.txt

My suggestion would be to make a "hello world" that used sbt run. Happy to create one if @dlwh were interested!

@dlwh
Owner

dlwh commented May 14, 2015

what's the error you're getting?


reactormonk reopened this May 14, 2015
@briantopping

Exception in thread "main" java.util.NoSuchElementException: None.get
    at scala.None$.get(Option.scala:344)
    at scala.None$.get(Option.scala:342)
    at epic.parser.ParseText$.classPathLoad(ParseText.scala:21)
    at epic.parser.ParseText$.classPathLoad(ParseText.scala:11)
    at epic.util.ProcessTextMain$class.main(ProcessTextMain.scala:45)
    at epic.parser.ParseText$.main(ParseText.scala:11)
    at epic.parser.ParseText.main(ParseText.scala)

I'm just getting started on my day, gonna take a look as well.

@briantopping

https://github.com/briantopping/epic/blob/master/src/main/scala/epic/models/package.scala#L21-21 is failing. The IOException is being swallowed by the caller; it is:

java.io.InvalidClassException: epic.parser.models.ParserTrainer$$anonfun$2; local class incompatible: stream classdesc serialVersionUID = 0, local class serialVersionUID = 5531977503861241212
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:617)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at epic.models.package$$anonfun$1.applyOrElse(package.scala:21)
        ...

Opened #32.
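
For readers following along, here is a minimal sketch of the failure mode described above: a loader that catches the deserialization error and returns an empty result, so the only visible symptom is a later "no value" exception. It is written in Java rather than Scala, with `Optional` standing in for Scala's `Option`. The class and method names (`SwallowedCause`, `loadQuietly`, `loadOrExplain`) are hypothetical and are not taken from epic.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.util.NoSuchElementException;
import java.util.Optional;

public class SwallowedCause {
    // Hypothetical loader that hides the failure: any error becomes "empty".
    static Optional<Object> loadQuietly(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return Optional.of(in.readObject());
        } catch (IOException | ClassNotFoundException e) {
            return Optional.empty(); // the real cause is lost here
        }
    }

    // Hypothetical loader that keeps the original exception as the cause.
    static Object loadOrExplain(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not deserialize model", e);
        }
    }

    public static void main(String[] args) {
        byte[] notAModel = {1, 2, 3, 4}; // not a valid serialization stream

        try {
            loadQuietly(notAModel).get(); // analogous to None.get in the trace
        } catch (NoSuchElementException e) {
            System.out.println("quiet loader: " + e);
        }

        try {
            loadOrExplain(notAModel);
        } catch (IllegalStateException e) {
            System.out.println("explicit loader cause: " + e.getCause());
        }
    }
}
```

With the second loader, the underlying exception appears in the stack trace as a "Caused by" entry instead of having to be dug out with a debugger.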

@dlwh
Owner

dlwh commented May 14, 2015

Yeah, Scala changed the serialVersionUIDs of all anonymous functions to 0
somewhere in the 2.11.x release cycle, which has broken all serialized model
files, again.

I have to stop using java serialization. I just don't know what to use
instead.
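
As an illustration of the mismatch described above, the following sketch serializes a small class, overwrites the serialVersionUID recorded in the stream, and then tries to read it back. It is plain Java, not epic or Scala code, and `SerialUidDemo`, `Model`, and `withStreamUid` are hypothetical names. Patching the bytes is only a way to simulate "file written by one class version, read by another" inside a single program. The offset arithmetic assumes the top-level object is an ordinary serializable class, following the layout in the Java Object Serialization Specification.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.ByteBuffer;

public class SerialUidDemo {
    // Hypothetical stand-in for a serialized model class.
    static class Model implements Serializable {
        private static final long serialVersionUID = 1L;
        int weight = 42;
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        }
        return buf.toByteArray();
    }

    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    // Returns a copy of the stream with a different UID for the top-level class.
    // Layout: 2-byte magic, 2-byte version, TC_OBJECT, TC_CLASSDESC,
    // 2-byte class-name length, class name, then the 8-byte serialVersionUID.
    static byte[] withStreamUid(byte[] bytes, long uid) {
        byte[] copy = bytes.clone();
        ByteBuffer view = ByteBuffer.wrap(copy); // big-endian, like the stream
        int nameLength = view.getShort(6) & 0xFFFF;
        view.putLong(8 + nameLength, uid);
        return copy;
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = serialize(new Model());

        // Untouched stream: the UIDs agree, so the object loads.
        System.out.println(((Model) deserialize(bytes)).weight);

        // Stream claims UID 0 while the local class declares 1.
        try {
            deserialize(withStreamUid(bytes, 0L));
        } catch (InvalidClassException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The second read fails with a "local class incompatible" message of the same shape as the one in the stack trace above. The general point is that Java serialization ties a file to the exact UID of every class it mentions, so a compiler change that alters those UIDs invalidates existing files even when the data itself is unchanged.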

On Wed, May 13, 2015 at 11:21 PM, Brian Topping notifications@github.com
wrote:

https://github.com/briantopping/epic/blob/master/src/main/scala/epic/models/package.scala#L21-21
is failing, The IOException is getting swallowed by the caller, it is:

java.io.InvalidClassException: epic.parser.models.ParserTrainer$$anonfun$2; local class incompatible: stream classdesc serialVersionUID = 0, local class serialVersionUID = 5531977503861241212
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:617)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at epic.models.package$$anonfun$1.applyOrElse(package.scala:21)
    at epic.models.package$$anonfun$1.applyOrElse(package.scala:19)
    at scala.PartialFunction$$anonfun$runWith$1.apply(PartialFunction.scala:141)
    at scala.PartialFunction$$anonfun$runWith$1.apply(PartialFunction.scala:140)
    at scala.collection.Iterator$class.foreach(Iterator.scala:743)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1195)
    at scala.collection.TraversableOnce$class.collectFirst(TraversableOnce.scala:132)
    at scala.collection.AbstractIterator.collectFirst(Iterator.scala:1195)
    at epic.models.package$.readFromJar(package.scala:19)
    at epic.models.package$.deserialize(package.scala:58)
    at epic.models.package$.deserialize(package.scala:15)
    at epic.util.ProcessTextMain$class.main(ProcessTextMain.scala:42)
    at epic.parser.ParseText$.main(ParseText.scala:11)
    at epic.parser.ParseText.main(ParseText.scala)

Gonna work on clearing that up, but does this exception have any meaning
on your end? I'm running on JDK 1.7.0_79-b15.



@briantopping

Would you entertain a PR that converted the model files to Protocol Buffers (https://developers.google.com/protocol-buffers)? I'm not at all sure I can pull it off; it's just a thought so far. Looking around, though, I don't see the parser model generators.
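
To make the idea concrete, here is a rough sketch of what moving away from Java serialization could look like in principle. It does not use Protocol Buffers itself (that would need the external library and a schema). Instead it hand-rolls a tiny explicit format with `DataOutputStream` to show the property that matters here: the file depends only on a declared layout and version number, not on compiler-generated class identities. The class name, the version constant, and the "feature name to weight" model shape are all assumptions made for illustration; epic's real models are far richer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class ExplicitModelFormat {
    // Hypothetical version tag written at the start of every file.
    static final int FORMAT_VERSION = 1;

    // Writes: version, entry count, then (name, weight) pairs.
    static byte[] write(Map<String, Double> weights) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(FORMAT_VERSION);
            out.writeInt(weights.size());
            for (Map.Entry<String, Double> entry : weights.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeDouble(entry.getValue());
            }
        }
        return buf.toByteArray();
    }

    // Reads the same layout back, rejecting unknown versions explicitly.
    static Map<String, Double> read(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int version = in.readInt();
            if (version != FORMAT_VERSION) {
                throw new IOException("unsupported model format version: " + version);
            }
            int count = in.readInt();
            Map<String, Double> weights = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                double weight = in.readDouble();
                weights.put(name, weight);
            }
            return weights;
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("suffix=ing", 0.75);
        weights.put("prev=the", -1.5);
        System.out.println(read(write(weights)).equals(weights));
    }
}
```

A schema-based library such as Protocol Buffers gives the same decoupling with less hand-written code and better support for evolving the format. The hard part in practice would be expressing the model's closures and nested structures as plain data.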

@briantopping

Ok, once I rolled back to e0238ce, I was able to get things running with the published models.

It would be great to get a CI process running to generate the models to snapshots on Sonatype. I'd be happy to do this.

@dlwh
Owner

dlwh commented May 17, 2015

the problem is that the data files needed to build the models aren't freely
licensed, so I can't just stick them somewhere.


@reactormonk
Collaborator

Isn't that a derivative work?

@dlwh
Owner

dlwh commented May 17, 2015

I meant that I can't, e.g., put the data on GitHub and have Travis CI build
the models. I agree it seems like model files aren't subject to copyright,
but I can't publish an automated rebuild-models script.


@briantopping

I have a private CI server (Atlassian Bamboo) in a hardened installation in Equinix NY4. All the training files would remain protected, and the built models would be published to Sonatype without exposing the data to anyone but yourself. I usually set up Travis CI for OSS projects, but this seems like a good reason to use a private instance.

@Sergey80

I switched back to Scala 2.10.4 and epic 0.2 and am still having this issue. Actually, this library has never worked for me; I just try to get it running about once a year.
