
Yiguan/crawl_bioRxiv2


crawl_bioRxiv2

Summarise the number of words in each section of articles submitted to bioRxiv.

After data cleaning, a total of 42,348 papers submitted to bioRxiv before Oct 15, 2019 were analyzed here.

Summary of word count in each section

  1. ABSTRACT

[Blue vertical dashed lines mark word counts from 150 to 400 in steps of 50. Clear peaks appear at these lines.]

  • It seems that many authors trimmed their abstracts before submission to meet journal word limits.
  2. INTRODUCTION

  3. METHOD

  4. RESULT

  5. DISCUSSION

  6. Number of REFERENCE

  7. Put all sections together

[The x-axis was truncated at 50000.]
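The peaks in the ABSTRACT distribution can be checked numerically by counting how many abstracts fall at or just below each candidate limit. The snippet below is a minimal sketch, not the code used in this repository: `count_words`, `pileup_below_limits`, the `window` parameter, and the toy word counts are all hypothetical, and whitespace splitting is only an assumed counting rule.

```python
from collections import Counter

# Candidate journal limits: the dashed lines in the ABSTRACT figure (150 to 400, step 50).
LIMITS = range(150, 401, 50)


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed rule, not necessarily the repo's)."""
    return len(text.split())


def pileup_below_limits(word_counts, limits=LIMITS, window=5):
    """For each limit, count abstracts whose length n satisfies limit - window < n <= limit."""
    hist = Counter(word_counts)
    return {
        limit: sum(hist[n] for n in range(limit - window + 1, limit + 1))
        for limit in limits
    }


# Toy word counts, invented for illustration only (not taken from bioData_clean.txt).
toy_counts = [148, 150, 150, 173, 199, 200, 212, 249, 250, 250, 250, 263, 300, 318]
pileup = pileup_below_limits(toy_counts)
```

With real data, a count in the window just below a limit that is much larger than the count in neighbouring windows would be consistent with authors trimming abstracts to fit.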

Correlation among sections
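As a rough illustration of how pairwise correlations between section lengths could be computed, here is a small standard-library sketch. It assumes Pearson correlation (the README does not say which coefficient was used), and the `toy` word counts are invented, not read from bioData_clean.txt.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Toy per-paper word counts (hypothetical values, one list entry per paper).
toy = {
    "INTRODUCTION": [500, 700, 600, 900, 400],
    "METHOD": [1000, 900, 1500, 1200, 800],
    "DISCUSSION": [1000, 1400, 1200, 1800, 800],  # exactly 2x INTRODUCTION
}

# One coefficient per unordered pair of sections.
pairwise = {(a, b): pearson(toy[a], toy[b]) for a, b in combinations(toy, 2)}
```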

Relationship between REFERENCE and each section

In a multiple linear regression, all sections except ABSTRACT had an effect on the number of REFERENCE. As expected, the length of DISCUSSION had the largest effect on the number of REFERENCE.
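The kind of model described above can be sketched as an ordinary least-squares fit of the reference count on the section lengths. This is only an illustration under stated assumptions: `fit_ols` is a hypothetical helper that solves the normal equations with the standard library, the predictors are taken to be word counts in thousands, and both the rows and `TRUE_COEFS` are synthetic values chosen so the fit can be checked, not results from this analysis.

```python
def fit_ols(rows, targets):
    """Least-squares fit of targets ~ intercept + predictors via the normal equations."""
    X = [[1.0, *row] for row in rows]  # prepend an intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic papers: (INTRODUCTION, METHOD, RESULT, DISCUSSION) in thousands of words.
rows = [
    (0.5, 1.0, 1.5, 0.8),
    (0.7, 0.9, 2.0, 1.2),
    (0.6, 1.5, 1.8, 1.0),
    (0.9, 1.2, 2.5, 1.5),
    (0.4, 0.8, 1.2, 0.6),
    (0.8, 2.0, 2.2, 0.9),
    (1.0, 1.1, 3.0, 2.0),
]
# Invented coefficients: intercept, then one slope per section (references per 1,000 words).
TRUE_COEFS = [10.0, 10.0, 5.0, 4.0, 20.0]
targets = [
    TRUE_COEFS[0] + sum(c * x for c, x in zip(TRUE_COEFS[1:], row)) for row in rows
]

coefs = fit_ols(rows, targets)  # recovers TRUE_COEFS up to rounding error
```

Because the synthetic targets are an exact linear function of the predictors, the fitted coefficients match the invented ones. On real data the fit would not be exact, and the slopes are comparable across sections only because all predictors share the same unit.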

DATA

https://github.com/Yiguan/crawl_bioRxiv2/blob/master/bioData_clean.txt
