example_paper.qmd

---
title: "Example Paper"
author: "Ryan Safner"
date: "11/29/2018"
format: pdf
editor: visual
abstract: |
  | **Abstract**: In this paper, I use fake data I created to demonstrate how to organize your files and manage your workflow effectively. 
bibliography: bibliography/bibexample.bib
execute:
  echo: false
  message: false
  warning: false
---

# Introduction

I am managing all of my files (paper, references, data, code, and figures) by creating an `R Project`. This sets my working directory `wd()` to a folder on my computer where the `R Project` is stored. Everything that I put within this folder (including folders, for my Data, my Figures, etc.) is accessible from the same starting directory. I can send the entire project to you, and you can use all of the content easily, since all files are referenced *relative* to the project's directory folder. 

# Literature Review

I use the citekeys (beginning with `@`) from my `bibexample.bib` file in the **Bibliography** folder which contains all of my references, any time I wish to cite a reference in my text. 

> "Here is an example quote from this article." @citekey1[p.2]

@citekey2[p.12] says "this short quote", disagreeing with @citekey1[].

See also the references mentioned appear at the end of the document now (I just need to add a `# References` or `# Bibliography` at the end of the file to create a section for them). It lists all the references you cited in alphabetical order. 

# Data Creation

I first created my random data with the script `01-generate-data.R` in my **Scripts** folder. This produces two files, `rawdata1.csv` and `rawdata2.csv`,[^1] which I saved in my **Data** folder. My second script, `02-data-cleaning.R` takes the two data files and merges them in to a single dataset, which I save as `clean_data.csv` in my **Data** folder for later keeping. I can then send out `clean_data.csv` to coauthors and spectators who wish to use my data, but I can always justify where it came from because running `02-data-cleaning.R` on the two raw data sources, `rawdata1.csv` and `rawdata2.csv` will always produce `clean_data.csv` for anyone who runs it!  

[^1]: This is to show that often we will need to import multiple data files and wrangle them into a tidy format for data analysis! 

\clearpage 

# Summary Statistics

Here are some summary statistics of my data, the script for which is saved in **Scripts** as `03-summary-statistics.R`. I reproduce the script as `R` chunks in my `R Markdown .Rmd` document to demonstrate that everything *could* be produced in one document. I hide the raw code by setting `echo=FALSE` in the chunk options, since I do not wish to display code in my final paper. 

```{r, summary-statistics, echo=FALSE}
# load R scripts to use 
source("scripts/02-data-cleaning.R")
source("scripts/03-summary-statistics.R")

# prints summary statistics table (based on Script 3)
df.sum.tidy
```

```{r}
table(clean_data$shape) %>%
  kable()
```

\clearpage

# Plots 

Here, I load as images from my **Figures** folder some of the plots that I generated in my `04-plots.R` script in my **Scripts** folder. 

![Histogram of x](figures/hist_of_x.pdf)

![Scatterplot of x and y](figures/scatter.png)

\clearpage

# Regressions 

I run three regressions with my fake data. The first is a simple regression of $y$ on $x$:  

$$Y_i=\beta_0+\beta_1X_i$$ 

The second is augmented by including a dummy variable for each of the four categories of `shape` (omitting triangle).

$$Y_i=\beta_0+\beta_1X_1+\beta_2circle_i+\beta_3rectangle_i+\beta_4square_i$$

```{r, results="asis"}
# load regressions from Scripts/05-regressions.R
source("scripts/05-regressions.R")
```

The third adds "state-fixed effects" using the Least Squares Dummy Variable method.[^2] 

$$Y_i=\beta_0+\beta_1X_1+\beta_2circle_i+\beta_3rectangle_i+\beta_4square_i+\alpha_i$$

All of the regressions are run in the script `05-regressions.R` in the **Scripts** folder. I import the script and use an `R` chunk below to run `stargazer` to generate the regression table. 

```{r, results="asis"}
source("scripts/05-regressions.R")
stargazer(reg1, reg2, reg3, omit=c("state"), column.labels = c("","","With State-effects"), type="latex",header=F, float=F)
```

[^2]: To avoid showing all of the state dummies, I add `omit=c("state")` to the `stargazer()` command. 

# References