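- archive_consult.zip: Online consultation (question-and-answer) interaction data crawled from the 10 provincial archives websites in China with the largest data volume, stored in XML format.
- odp4espm.zip: The ODP dataset used in the paper "Generating Categorical Semantic Path via Explicit Semantic Path Mining".
- sohu-dataset: 1,000 web pages crawled from the sohu website, each with its title, keywords, formatted HTML body, and unformatted plain-text content, stored in XML format. Can be used for testing keyword extraction.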
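
As a rough illustration of how the XML records could feed a keyword extraction test, here is a minimal Python sketch. The element names (`docs`, `doc`, `title`, `keywords`, `text`), the comma-separated keyword format, and the helpers `load_docs` and `precision_recall` are all assumptions made for this example; the actual schema of the dataset files is not documented here and may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout: the real element names in the dataset may differ.
SAMPLE = """<docs>
  <doc>
    <title>Example page</title>
    <keywords>economy,market</keywords>
    <text>The market reacted to news about the economy.</text>
  </doc>
</docs>"""


def load_docs(xml_text):
    """Return a list of dicts holding title, gold keywords, and plain text."""
    root = ET.fromstring(xml_text)
    docs = []
    for doc in root.iter("doc"):
        raw = doc.findtext("keywords", "")
        # Assumed format: keywords separated by commas.
        keywords = [k.strip() for k in raw.split(",") if k.strip()]
        docs.append({
            "title": doc.findtext("title", ""),
            "keywords": keywords,
            "text": doc.findtext("text", ""),
        })
    return docs


def precision_recall(predicted, gold):
    """Score predicted keywords against the gold keywords of one document."""
    pred, ref = set(predicted), set(gold)
    hits = len(pred & ref)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(ref) if ref else 0.0
    return precision, recall


docs = load_docs(SAMPLE)
# Pretend an extractor returned these two keywords for the first document.
p, r = precision_recall(["market", "news"], docs[0]["keywords"])
```

To try this against a real file, the element names would first need to be adjusted to match the XML in the archive, and `ET.parse(path).getroot()` could replace `ET.fromstring`.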
iamxiatian/data
About
Experimental Data