- JavaSE-1.7
- UTF-8 File Encoding
- Add `ontologyAcquisition.jar` to the classpath
- Get an instance of the ontology file `ehownet_ontology.txt`:
  `EHowNetTree tree = EHowNetTree.getInstance("./docs/ehownet_ontology.txt");`
- For example, to search for 「開心」:
  `List<EHowNetNode> results = tree.searchWord("開心"); EHowNetNode node = results.get(0);`
- If there is no result, an empty List is returned, so check `results.isEmpty()` before calling `results.get(0)`
- `node.getNodeType()`: returns `NodeType.WORD` or `NodeType.TAXONOMY`
  - A node of type `NodeType.WORD` has no hyponym, since it is at the bottom of the ontology
- For a word node:
  - `node.getSid()`: returns an integer denoting the id of the word, for example `61549`
  - `node.getNodeName()`: returns a string denoting the name of the word, for example `開心`
  - `node.getPos()`: returns a string denoting the part-of-speech tag of the word, for example `Nv4,VH21`
  - `node.getEhownet()`: returns a string denoting the E-HowNet definition of the word, for example `{joyful|喜悅}`
- For a taxonomy node:
  - `node.getNodeName()`: returns a string denoting the name of the taxonomy, for example `物體`
  - `node.getEhownet()`: returns a string denoting the E-HowNet definition of the taxonomy, for example `object|物體`
- `node.getHypernym()`: returns an `EHowNetNode` instance, which is the parent of the node. If the node is at the top of the ontology, `null` is returned
- `node.getHyponymList()`: returns a `List<EHowNetNode>` instance containing all the children of the node. If the node is at the bottom of the ontology, an empty List is returned
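The hypernym and hyponym accessors described above lend themselves to simple tree walks. The following is a minimal sketch, not the library's code: `SketchNode` and `OntologyWalk` are hypothetical stand-ins that mimic only `getNodeName()`, `getHypernym()` and `getHyponymList()`, so the example compiles without `ontologyAcquisition.jar`. The node names are toy values, not entries from the real ontology file.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for EHowNetNode; only the accessors described above are mimicked.
class SketchNode {
    private final String name;
    private SketchNode hypernym; // stays null for the top node
    private final List<SketchNode> hyponyms = new ArrayList<SketchNode>();

    SketchNode(String name) {
        this.name = name;
    }

    // Attach a child and return this node so calls can be chained.
    SketchNode addChild(SketchNode child) {
        child.hypernym = this;
        hyponyms.add(child);
        return this;
    }

    String getNodeName() { return name; }
    SketchNode getHypernym() { return hypernym; }
    List<SketchNode> getHyponymList() { return hyponyms; }
}

class OntologyWalk {
    // Follow getHypernym() until it returns null; the result is ordered top node first.
    static List<String> pathFromRoot(SketchNode node) {
        List<String> path = new ArrayList<String>();
        for (SketchNode cur = node; cur != null; cur = cur.getHypernym()) {
            path.add(cur.getNodeName());
        }
        Collections.reverse(path);
        return path;
    }

    // Count all nodes below the given node by recursing through getHyponymList().
    static int countDescendants(SketchNode node) {
        int count = 0;
        for (SketchNode child : node.getHyponymList()) {
            count += 1 + countDescendants(child);
        }
        return count;
    }

    public static void main(String[] args) {
        SketchNode leaf = new SketchNode("leaf");
        SketchNode middle = new SketchNode("middle").addChild(leaf);
        SketchNode top = new SketchNode("top").addChild(middle);
        System.out.println(pathFromRoot(leaf));    // [top, middle, leaf]
        System.out.println(countDescendants(top)); // 2
    }
}
```

With the real jar on the classpath, the same two loops would presumably work on `EHowNetNode` directly, since they rely only on the `null` hypernym at the top and the empty hyponym list at the bottom.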
- Add `ontologyAcquisition.jar` and `jsoup-1.9.2.jar` to the classpath
- Set the input/output files and convert:
  `Converter.toCKIP("ckip_input.txt", "ckip_output.txt");`
- The documents can also be converted online: http://sunlight.iis.sinica.edu.tw/uwextract/demo.htm
- Add `ontologyAcquisition.jar` and `jxl.jar` to the classpath
- Initialize and start with the root concept, the CKIP documents and EHowNet:
  `OntologyAcquisition oa = new OntologyAcquisition("課綱", "./docs/ckip", "./docs/ehownet_ontology.txt"); oa.start();`
- For example, to search for 「會議」:
  `OntologyNode node = oa.searchConcept("會議");`
- If the concept does not exist, `null` is returned
- `node.getConcept()`: returns a string denoting the name of the concept, for example `會議` and `記錄`
- `node.getAttr()`: returns a `List<String>` instance containing all the related concepts (but not the hypernym or hyponyms) of the node. If the node has no attributes, an empty List is returned
- `node.getHypernym()`: returns an `OntologyNode` instance, which is the parent of the node. If the node is at the top of the ontology, `null` is returned
- `node.getCategories()`: returns a `List<OntologyNode>` instance containing all the children of the node. If the node is at the bottom of the ontology, an empty List is returned
- `oa.getTermFreq("教育")`: returns an integer, which is the term frequency of `教育`
- `oa.getDocFreq("教育")`: returns an integer, which is the document frequency of `教育`
- `oa.dump()`: saves the ontology into a new sheet in `result.xls`
- `new UIFrame()`
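As a reminder of what the two frequency queries usually mean, here is a small self-contained sketch. It assumes the common definitions: term frequency is the total number of occurrences across the corpus, and document frequency is the number of documents containing the term at least once. Whether `OntologyAcquisition` counts in exactly this way is not confirmed here. `FrequencySketch` and its methods are hypothetical names, and the documents are toy token lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of ontologyAcquisition.jar.
class FrequencySketch {
    // Total number of occurrences of the term over all documents.
    static int termFreq(List<List<String>> docs, String term) {
        int total = 0;
        for (List<String> doc : docs) {
            for (String token : doc) {
                if (token.equals(term)) {
                    total++;
                }
            }
        }
        return total;
    }

    // Number of documents that contain the term at least once.
    static int docFreq(List<List<String>> docs, String term) {
        int count = 0;
        for (List<String> doc : docs) {
            if (doc.contains(term)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Toy corpus: each inner list is one already-segmented document.
        List<List<String>> docs = new ArrayList<List<String>>();
        docs.add(Arrays.asList("education", "meeting", "education"));
        docs.add(Arrays.asList("meeting", "record"));
        docs.add(Arrays.asList("education"));
        System.out.println(termFreq(docs, "education")); // 3
        System.out.println(docFreq(docs, "education"));  // 2
    }
}
```

A term that is frequent overall but appears in few documents has a high term frequency and a low document frequency, which is why the two numbers are reported separately.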
- Add `ontologyAcquisition.jar` to the classpath
- Build the model with the domain concept, the CKIP documents, EHowNet and the dimension of the output vector:
  `Doc2Vec d2v = new Doc2Vec("課綱", "./docs/ckip", "./docs/ehownet_ontology.txt", 5); VectorModel model = d2v.build();`
- `model.getFeatures()`: returns a `List<String>` instance denoting the features extracted by the model. An empty list is returned if the process fails
- `model.getDimension()`: returns an integer equal to the valid dimension
- `model.getDocVectors()`: returns a `Map<String, List<Double>>` instance containing all the document vectors. The key is the absolute path of a document and the value is its vector
- `model.getDocVector("docs/ckip/97815.txt")`: returns a `List<Double>` instance denoting the vector of the document with path `docs/ckip/97815.txt`. Both relative and absolute paths are accepted for the parameter
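One common use of document vectors is to compare documents by cosine similarity. The sketch below works on a plain `Map<String, List<Double>>`, the same shape that `model.getDocVectors()` is described as returning, so it runs without the jar. `VectorSketch`, `cosine` and `mostSimilar` are hypothetical names, the vectors are toy values, and cosine similarity is an assumed choice of measure rather than something the library prescribes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of ontologyAcquisition.jar.
class VectorSketch {
    // Cosine similarity of two equal-length vectors; 0.0 if either vector has zero norm.
    static double cosine(List<Double> a, List<Double> b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.size(); i++) {
            double x = a.get(i);
            double y = b.get(i);
            dot += x * y;
            normA += x * x;
            normB += y * y;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Key of the vector most similar to the one stored under queryKey,
    // excluding queryKey itself; null if the key is unknown or there is no other entry.
    static String mostSimilar(Map<String, List<Double>> vectors, String queryKey) {
        List<Double> query = vectors.get(queryKey);
        if (query == null) {
            return null;
        }
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, List<Double>> entry : vectors.entrySet()) {
            if (entry.getKey().equals(queryKey)) {
                continue;
            }
            double score = cosine(query, entry.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy vectors standing in for the result of model.getDocVectors().
        Map<String, List<Double>> vectors = new HashMap<String, List<Double>>();
        vectors.put("a.txt", Arrays.asList(1.0, 0.0, 2.0));
        vectors.put("b.txt", Arrays.asList(2.0, 0.0, 4.0));
        vectors.put("c.txt", Arrays.asList(0.0, 3.0, 0.0));
        System.out.println(mostSimilar(vectors, "a.txt")); // b.txt
    }
}
```

Because the real map is keyed by absolute path, the key passed to a helper like `mostSimilar` would need to be an absolute path as well.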
`OntologyDemo` is an Eclipse sample project for EHowNet, CKIP-Converter, Ontology Acquisition and Doc2Vec
- For Eclipse:
  - Open the project in the workspace
  - `Properties - Java Build Path - Libraries`: add all the JAR files in `libs`
  - `Window - Preferences - General - Workspace`: set the text file encoding to `UTF-8`
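- For Shell:
  - A `Makefile` is available: run `OntologyDemo$ make` to compile and `OntologyDemo$ make run` to run
  - Commands to compile and run (the `;` classpath separator is the Windows convention; use `:` on Linux/macOS, and keep the classpath quoted so that a Unix-like shell does not treat `;` as the end of the command):
    - `OntologyDemo$ javac -d bin -sourcepath src -encoding utf8 -cp "libs/jsoup-1.9.2.jar;libs/jxl.jar;libs/ontologyAcquisition.jar" src/Main.java`
    - `OntologyDemo$ java -Dfile.encoding=UTF-8 -cp "bin;libs/jsoup-1.9.2.jar;libs/jxl.jar;libs/ontologyAcquisition.jar" Main`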
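The commands above hard-code `;` between classpath entries, which is the Windows convention; on Linux and macOS the separator is `:`. As a small sketch of how to get the right value for the current platform, Java exposes it as `java.io.File.pathSeparator`. `ClasspathSketch` and `join` are hypothetical names used only for this illustration.

```java
import java.io.File;

// Hypothetical helper for illustration; prints a classpath string for the current platform.
class ClasspathSketch {
    // Join the entries with the given separator.
    static String join(String separator, String... entries) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < entries.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(entries[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // File.pathSeparator is ";" on Windows and ":" on Unix-like systems.
        System.out.println(join(File.pathSeparator,
                "bin",
                "libs/jsoup-1.9.2.jar",
                "libs/jxl.jar",
                "libs/ontologyAcquisition.jar"));
    }
}
```

The printed string can be passed to `-cp` as is, preferably inside double quotes.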
-