---
title: "Example Paper"
author: "Ryan Safner"
date: "11/29/2018"
format: pdf
editor: visual
abstract: |
  | **Abstract**: In this paper, I use fake data I created to demonstrate how to organize your files and manage your workflow effectively.
bibliography: bibliography/bibexample.bib
execute:
  echo: false
  message: false
  warning: false
---
# Introduction
I am managing all of my files (paper, references, data, code, and figures) by creating an `R Project`. This sets my working directory (which I can check with `getwd()`) to the folder on my computer where the `R Project` is stored. Everything that I put within this folder (including subfolders for my data, figures, etc.) is accessible from the same starting directory. I can send the entire project to you, and you can use all of the content easily, since all files are referenced *relative* to the project's directory.
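As a minimal sketch of what a relative path looks like in practice (the `data` folder name and the existence check are assumptions for illustration, not taken from my scripts), the same line of code works on any computer that opens the project:

```r
# Sketch only: the "data" folder name is an assumption for illustration.
# Relative paths resolve against the working directory, which an R Project
# sets to the project's root folder.
project_root <- getwd()

# Build a path relative to the project root instead of an absolute one
# such as "C:/Users/me/Documents/paper/data/clean_data.csv".
clean_path <- file.path("data", "clean_data.csv")

# Only read the file if it exists, so the sketch runs anywhere.
if (file.exists(clean_path)) {
  clean_data <- read.csv(clean_path)
}
```

Because `clean_path` never mentions a user-specific folder, nothing has to be edited when the project is moved or shared.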
# Literature Review
Any time I wish to cite a reference in my text, I use its citekey (prefixed with `@`) from my `bibexample.bib` file in the **Bibliography** folder, which contains all of my references.
> "Here is an example quote from this article." @citekey1 [p. 2]
@citekey2 [p. 12] says "this short quote", disagreeing with @citekey1.
The references I cite appear at the end of the document (I just need to add a `# References` or `# Bibliography` heading at the end of the file to create a section for them). They are listed in alphabetical order.
# Data Creation
I first created my random data with the script `01-generate-data.R` in my **Scripts** folder. This produces two files, `rawdata1.csv` and `rawdata2.csv`,[^1] which I saved in my **Data** folder. My second script, `02-data-cleaning.R`, takes the two data files and merges them into a single dataset, which I save as `clean_data.csv` in my **Data** folder for later use. I can then send `clean_data.csv` to coauthors and anyone else who wishes to use my data, and I can always show where it came from: running `02-data-cleaning.R` on the two raw data sources, `rawdata1.csv` and `rawdata2.csv`, will produce the same `clean_data.csv` for anyone who runs it!
[^1]: This is to show that often we will need to import multiple data files and wrangle them into a tidy format for data analysis!
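The merge step can be sketched as follows. This is an illustration rather than the contents of `02-data-cleaning.R`: the two small data frames, the `id` key, and the temporary output location are all hypothetical stand-ins for the real raw files.

```r
# Hypothetical stand-ins for rawdata1.csv and rawdata2.csv;
# the column names and the shared `id` key are assumptions.
raw1 <- data.frame(id = 1:4, x = c(2.1, 3.4, 1.8, 5.0))
raw2 <- data.frame(
  id    = 1:4,
  shape = c("circle", "square", "triangle", "rectangle")
)

# Merge the two sources on the shared key into a single dataset.
clean_data <- merge(raw1, raw2, by = "id")

# Save the result; a temporary folder is used here so the sketch
# does not overwrite anything in a real project.
out_path <- file.path(tempdir(), "clean_data.csv")
write.csv(clean_data, out_path, row.names = FALSE)
```

Since the script starts from the raw files and ends by writing the cleaned file, rerunning it regenerates the cleaned data from scratch.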
\clearpage
# Summary Statistics
Here are some summary statistics of my data, the script for which is saved in **Scripts** as `03-summary-statistics.R`. I run the script from `R` chunks in this Quarto `.qmd` document to demonstrate that everything *could* be produced in one document. I hide the raw code by setting `echo=FALSE` in the chunk options, since I do not wish to display code in my final paper.
```{r, summary-statistics, echo=FALSE}
# load R scripts to use
source("scripts/02-data-cleaning.R")
source("scripts/03-summary-statistics.R")
# prints summary statistics table (based on Script 3)
df.sum.tidy
```
```{r}
# count observations by shape (assumes the sourced scripts provide clean_data, %>%, and kable())
table(clean_data$shape) %>%
  kable()
```
\clearpage
# Plots
Here, I include some of the plots generated by my `04-plots.R` script (in my **Scripts** folder), loading them as images from my **Figures** folder.
![Histogram of x](figures/hist_of_x.pdf)
![Scatterplot of x and y](figures/scatter.png)
\clearpage
# Regressions
I run three regressions with my fake data. The first is a simple regression of $Y$ on $X$:
$$Y_i=\beta_0+\beta_1X_i$$
The second adds dummy variables for the categories of `shape`. There are four categories, and triangle is omitted as the reference group.
$$Y_i=\beta_0+\beta_1X_i+\beta_2circle_i+\beta_3rectangle_i+\beta_4square_i$$
```{r, results="asis"}
# load regressions from Scripts/05-regressions.R
source("scripts/05-regressions.R")
```
The third adds state fixed effects using the Least Squares Dummy Variable method.[^2]
$$Y_i=\beta_0+\beta_1X_i+\beta_2circle_i+\beta_3rectangle_i+\beta_4square_i+\alpha_i$$
All of the regressions are run in the script `05-regressions.R` in the **Scripts** folder. I import the script and use an `R` chunk below to run `stargazer` to generate the regression table.
```{r, results="asis"}
source("scripts/05-regressions.R")
stargazer(reg1, reg2, reg3, omit = c("state"), column.labels = c("", "", "With State-effects"),
          type = "latex", header = FALSE, float = FALSE)
```
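For readers without the scripts, here is a minimal sketch of how three such models could be specified in base `R`. It is not the contents of `05-regressions.R`: the simulated data frame `fake`, its coefficients, and the three state labels are assumptions made up for illustration. Only the model names `reg1`, `reg2`, and `reg3` and the variables `x`, `y`, `shape`, and `state` follow the text above.

```r
# Sketch only: simulated data standing in for clean_data.csv.
set.seed(1)
n <- 200
fake <- data.frame(
  x     = rnorm(n),
  shape = factor(sample(c("circle", "rectangle", "square", "triangle"),
                        n, replace = TRUE)),
  state = factor(sample(c("A", "B", "C"), n, replace = TRUE))
)
fake$y <- 1 + 2 * fake$x + rnorm(n)

# Make triangle the reference (omitted) category for the shape dummies.
fake$shape <- relevel(fake$shape, ref = "triangle")

# 1. Simple regression of y on x.
reg1 <- lm(y ~ x, data = fake)

# 2. Add shape dummies; lm() expands the factor into one dummy per
#    non-reference category.
reg2 <- lm(y ~ x + shape, data = fake)

# 3. Least Squares Dummy Variable approach: add one dummy per state
#    (minus a reference state) by including the state factor.
reg3 <- lm(y ~ x + shape + state, data = fake)
```

Including `state` as a factor is what produces the state dummies that `omit=c("state")` later hides from the regression table.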
[^2]: To avoid showing all of the state dummies, I add `omit=c("state")` to the `stargazer()` command.
# References