It is hard to work out how the TACReader class is meant to be used. What path should I pass as "corpusRoot"? Here is the file hierarchy of the raw TAC 2014-2015 data, where 2015 has a similar folder structure to 2014.
From what I can tell, TACReader parses XML documents. The only folder containing XML data is source_documents, where the .txt files contain XML markup. Does TACReader parse ONLY the files in source_documents, or does it also read from other folders in the hierarchy?
Here is how I'm trying to use TACReader, followed by the error message I get. Note that I've tried a number of different paths for corpusRoot, and they all give the same error. I'm working completely blind here. Any help would be much appreciated!
import edu.illinois.cs.cogcomp.nlp.corpusreaders.TACReader;
public class PreprocessTAC {
    public static void main(String[] args) throws Exception {
        String path = "/path/to/tac_kbp_eng_event_arg_comp_train_eval_2014-2015/data/";
        TACReader reader_tac = new TACReader(path, false);
    }
}
Error message:
Exception in thread "main" java.lang.NullPointerException: Cannot read the array length because "<local4>" is null
at edu.illinois.cs.cogcomp.core.io.IOUtils.lsFilesRecursive(IOUtils.java:145)
at edu.illinois.cs.cogcomp.nlp.corpusreaders.TACReader.getFileListing(TACReader.java:239)
at edu.illinois.cs.cogcomp.nlp.corpusreaders.XmlDocumentReader.initializeReader(XmlDocumentReader.java:107)
at edu.illinois.cs.cogcomp.nlp.corpusreaders.AnnotationReader.<init>(AnnotationReader.java:47)
at edu.illinois.cs.cogcomp.nlp.corpusreaders.AbstractIncrementalCorpusReader.<init>(AbstractIncrementalCorpusReader.java:61)
at edu.illinois.cs.cogcomp.nlp.corpusreaders.XmlDocumentReader.<init>(XmlDocumentReader.java:89)
at edu.illinois.cs.cogcomp.nlp.corpusreaders.TACReader.<init>(TACReader.java:113)
at PreprocessTAC.main(PreprocessTAC.java:7)
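One thing worth ruling out, going only by the top frame of the trace: the exception is raised while reading an array length inside IOUtils.lsFilesRecursive. I have not confirmed how that method is written, so this is an assumption, but a common cause of this pattern is java.io.File.listFiles() returning null, which it does when the path does not exist, is not a directory, or cannot be read. The sketch below uses only the standard library; CorpusRootCheck and diagnose are hypothetical names, not part of cogcomp-nlp, and the check covers only the top-level directory, not its subfolders.

```java
import java.io.File;

// Hypothetical helper (not part of cogcomp-nlp): reports why a directory
// could not be listed, using only java.io.File.
class CorpusRootCheck {

    /** Returns null if the directory can be listed, otherwise a short description of the problem. */
    static String diagnose(String corpusRoot) {
        File dir = new File(corpusRoot);
        if (!dir.exists()) {
            return "does not exist: " + dir.getAbsolutePath();
        }
        if (!dir.isDirectory()) {
            return "not a directory: " + dir.getAbsolutePath();
        }
        // File.listFiles() returns null (rather than throwing) on an I/O error,
        // for example when the process lacks permission to read the directory.
        File[] children = dir.listFiles();
        if (children == null) {
            return "cannot be listed: " + dir.getAbsolutePath();
        }
        return null;
    }

    public static void main(String[] args) {
        // Placeholder path copied from the snippet above; pass the real one as an argument.
        String path = args.length > 0
                ? args[0]
                : "/path/to/tac_kbp_eng_event_arg_comp_train_eval_2014-2015/data/";
        String problem = diagnose(path);
        System.out.println(problem == null ? "OK: " + path : "Problem: " + problem);
    }
}
```

If this reports a problem for the path passed to TACReader, the NullPointerException is probably a path issue rather than a question of which subfolder TACReader expects. If it reports OK, the path itself is likely fine and the question about the expected corpus layout still stands.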