Chinese Relation Extraction

Background

Given a sentence and two entity mentions in it, decide whether a relation holds between them and classify it into one of a set of predefined types.

Example

Input:

[李晓华]和她的丈夫[王大牛]前日一起去[英国]旅行了。

Output:

(entity1: 李晓华, entity2: 王大牛, relation: 夫妻) 
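The reference annotation under Standard Metrics lists every ordered pair of mentions, so a system has to label each direction separately. A minimal sketch of how those candidate pairs could be enumerated, assuming mentions are marked with square brackets as in the example (the function name `candidate_pairs` is hypothetical, not part of any dataset tooling):

```python
import re
from itertools import permutations


def candidate_pairs(sentence):
    """Return every ordered (entity1, entity2) pair of bracketed mentions."""
    # Assumption: each mention is wrapped in [ ] and brackets are not nested.
    mentions = re.findall(r"\[([^\[\]]+)\]", sentence)
    # Ordered pairs: (a, b) and (b, a) are separate classification instances.
    return list(permutations(mentions, 2))


sentence = "[李晓华]和她的丈夫[王大牛]前日一起去[英国]旅行了。"
pairs = candidate_pairs(sentence)
for entity1, entity2 in pairs:
    print(entity1, entity2)
```

With three mentions this yields six ordered pairs, each of which receives either a relation type or Other.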

Standard Metrics

Precision, recall and F1, computed over relation instances whose type is not Other.

Input:

[李晓华]和她的丈夫[王大牛]前日一起去[英国]旅行了。

Reference:

(entity1: 李晓华, entity2: 王大牛, relation: 夫妻) 
(entity1: 李晓华, entity2: 英国, relation: Other) 
(entity1: 王大牛, entity2: 李晓华, relation: 夫妻) 
(entity1: 王大牛, entity2: 英国, relation: Other) 
(entity1: 英国, entity2: 李晓华, relation: Other) 
(entity1: 英国, entity2: 王大牛, relation: Other)

System Output:

(entity1: 李晓华, entity2: 王大牛, relation: 夫妻) 
(entity1: 李晓华, entity2: 英国, relation: 夫妻) 
(entity1: 王大牛, entity2: 李晓华, relation: 夫妻) 
(entity1: 王大牛, entity2: 英国, relation: Other) 
(entity1: 英国, entity2: 李晓华, relation: Other) 
(entity1: 英国, entity2: 王大牛, relation: Other) 

Metric:

Precision = 2 / 3 = 0.67
Recall = 2 / 2 = 1.0
F1 = 2 × Precision × Recall / (Precision + Recall) = 0.80
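The counts above can be reproduced with a short script. This is a sketch under the assumption, inferred from the worked example, that Other is treated as the negative class and excluded from both the predicted and the reference counts; the function name `prf1` and the dictionary layout are illustrative only.

```python
def prf1(reference, system, negative="Other"):
    """Precision, recall and F1 over pairs whose label is not `negative`."""
    predicted = {pair for pair, rel in system.items() if rel != negative}
    gold = {pair for pair, rel in reference.items() if rel != negative}
    # A prediction is correct when its non-negative label matches the reference.
    correct = sum(1 for pair in predicted if reference.get(pair) == system[pair])
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


reference = {
    ("李晓华", "王大牛"): "夫妻",
    ("李晓华", "英国"): "Other",
    ("王大牛", "李晓华"): "夫妻",
    ("王大牛", "英国"): "Other",
    ("英国", "李晓华"): "Other",
    ("英国", "王大牛"): "Other",
}
system = dict(reference)
system[("李晓华", "英国")] = "夫妻"  # the one wrong prediction in the example

precision, recall, f1 = prf1(reference, system)
print(round(precision, 2), round(recall, 2), round(f1, 2))
```

Three pairs are predicted as 夫妻, two of them correctly, and both reference 夫妻 pairs are found, which gives the 2 / 3 and 2 / 2 shown above.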

ACE 2005 Relation Extraction.

ACE 2005 employs 6 relation types with 18 subtypes in total, as listed below. The table also lists METONYMY, which has no subtypes.

| Type | Subtypes |
| --- | --- |
| ART (artifact) | User-Owner-Inventor-Manufacturer |
| GEN-AFF (Gen-affiliation) | Citizen-Resident-Religion-Ethnicity, Org-Location |
| * METONYMY | none |
| ORG-AFF (Org-affiliation) | Employment, Founder, Ownership, Student-Alum, Sports-Affiliation, Investor-Shareholder, Membership |
| PART-WHOLE (part-whole) | Artifact, Geographical, Subsidiary |
| * PER-SOC (person-social) | Business, Family, Lasting-Personal |
| * PHYS (physical) | Located, Near |
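Results on ACE 2005 are reported at both the type level and the subtype level, so it can help to keep the hierarchy in a lookup table. The sketch below copies the type and subtype names from the table above; the names `ACE2005_SUBTYPES`, `SUBTYPE_TO_TYPE` and `coarse_label`, and the choice to map unknown labels to Other, are assumptions made for illustration.

```python
# Relation types and their subtypes, as listed in the table above.
# METONYMY is left out because it has no subtypes.
ACE2005_SUBTYPES = {
    "ART": ["User-Owner-Inventor-Manufacturer"],
    "GEN-AFF": ["Citizen-Resident-Religion-Ethnicity", "Org-Location"],
    "ORG-AFF": [
        "Employment", "Founder", "Ownership", "Student-Alum",
        "Sports-Affiliation", "Investor-Shareholder", "Membership",
    ],
    "PART-WHOLE": ["Artifact", "Geographical", "Subsidiary"],
    "PER-SOC": ["Business", "Family", "Lasting-Personal"],
    "PHYS": ["Located", "Near"],
}

# Reverse index: subtype name -> parent type.
SUBTYPE_TO_TYPE = {
    subtype: rel_type
    for rel_type, subtypes in ACE2005_SUBTYPES.items()
    for subtype in subtypes
}


def coarse_label(subtype, default="Other"):
    """Map a subtype label to its parent type (assumed fallback: Other)."""
    return SUBTYPE_TO_TYPE.get(subtype, default)


print(len(ACE2005_SUBTYPES), len(SUBTYPE_TO_TYPE))
print(coarse_label("Family"))
```

Collapsing subtype predictions with `coarse_label` before scoring is one way to obtain a type-level score from a subtype-level classifier; the cited systems may compute their type-level numbers differently.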

Results

| System | F1 (6 relation types) | F1 (18 relation subtypes) | Train/Test split |
| --- | --- | --- | --- |
| Zhang et al. (2018) | 87.87 | 83.40 | 80% / 20% |
| Li et al. (2019) | - | 78.17 | 75% / 25% |
| Chen et al. (2014) | 90.35 | 75.44 | - |

Resources

| ACE 2005 Chinese Corpus | Characters | Files |
| --- | --- | --- |
| Newswire | 121797 | 238 |
| Broadcast news | 120513 | 298 |
| Web blogs | 65681 | 97 |
| Total | 307991 | 633 |

Chinese-Literature-NER-RE-Dataset.

  • Data link
  • Description paper
  • Well-defined train, development and test splits.
  • Contains 9 relation types (Located, Part-Whole, Family, General-Special, Social, Ownership, Use, Create, Near).

Results

| System | F1 |
| --- | --- |
| Li et al. (2019) | 65.61 |
| Zhang et al. (2020) | 63.13 |
| Xu et al. (2020) | 57.43 |

Suggestions? Changes? Please send email to chinesenlp.xyz@gmail.com