---
title: "Setting Up R and RMarkdown"
author: "Pablo Barbera"
date: "May 19, 2016"
output: html_document
---
Welcome to the workshop!
The goal of this first lesson is to provide an overview of the tools we will use throughout the rest of today's workshop.
__Why R?__
- Currently used by most statisticians and social scientists interested in data analysis, and becoming one of the most [popular languages in Data Science](http://www.kdnuggets.com/2015/05/r-vs-python-data-science.html).
- Open source: makes it highly customizable and easily extensible through ["packages"](https://cran.r-project.org/web/packages/) (over 7,500 and counting!).
- Powerful tool to conduct automated text analysis even with very large datasets.
- Command-line interface and scripts favor reproducibility.
- Excellent documentation and online help resources.
We will be using [RStudio](https://www.rstudio.com/) to interact with R, and write our annotated R code using [Markdown](http://rmarkdown.rstudio.com).
__RStudio__ is an open-source integrated development environment (IDE). The main advantage of RStudio with respect to other graphical interfaces, such as R GUI (the default), is that it integrates a powerful built-in text editor as well as other tools for plotting, debugging, and workspace management.
__Markdown__ is a simple formatting syntax to generate HTML or PDF documents. In combination with R, it will generate a document that includes the comments, the R code, and the output of running such code.
You can embed R code in chunks like this one:
```{r}
1 + 1
```
You can run each chunk of code one by one by highlighting the code and clicking `Run` (or pressing `Ctrl + Enter` on Windows or `Command + Enter` on OS X). The output of the code appears in the console right below, inside the RStudio window.
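As a slightly longer illustration of running code line by line, here is a minimal sketch. The vector name `word_counts` and its values are made up for this example; any expression you run prints its value to the console.

```{r}
# Store a few illustrative values in a vector (hypothetical data)
word_counts <- c(120, 85, 240)

# Running an expression on its own prints the result to the console
mean(word_counts)
summary(word_counts)
```

Objects created in one line, such as `word_counts`, remain available to the lines and chunks that follow.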
Alternatively, you can generate (or __knit__) an HTML document with all the code, comments, and output in the entire `.Rmd` file by clicking on `Knit HTML`.
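Knitting can also be triggered from the console rather than from the button. The sketch below assumes that the `rmarkdown` package is installed and that the working directory contains this file, `00-setup.Rmd`. The chunk is marked `eval=FALSE` so that it is shown but not executed when the document itself is knitted.

```{r, eval=FALSE}
# Assumption: the rmarkdown package is installed and the working
# directory contains 00-setup.Rmd
rmarkdown::render("00-setup.Rmd", output_format = "html_document")
```

If the call succeeds, the HTML file should be written next to the `.Rmd` file.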
You can also embed plots and graphics, for example:
```{r}
x <- c(1, 3, 4, 5)
y <- c(2, 6, 8, 10)
plot(x, y)
```
If you run the chunk of code, the plot will be generated in the panel in the bottom-right corner. If you instead knit the entire file, the plot will appear in the resulting HTML document.
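The same plot can be given a title and axis labels through arguments to `plot()`. This is only a small sketch reusing the vectors from the chunk above; the title and label text are arbitrary choices for illustration.

```{r}
x <- c(1, 3, 4, 5)
y <- c(2, 6, 8, 10)

# type = "b" draws both points and connecting lines;
# main, xlab and ylab set the title and axis labels
plot(x, y, type = "b", main = "y against x", xlab = "x", ylab = "y")
```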
Using R + Markdown has several advantages: it leaves an "audit trail" of your work, including documentation explaining the steps you took. This helps not only to keep your own progress organized, but also to make your work reproducible and more transparent. You can easily correct errors (just fix them and run the script again), and after you have finished, you can generate a PDF or HTML version of your work.
__Important__: always make sure to set your working directory to the folder where the current Rmd script is located. You can do this by clicking on Session > Set Working Directory > To Source File Location.
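The working directory can also be checked and changed from the console. In the sketch below, the path stored in `workshop_dir` is hypothetical; replace it with the folder where you saved the workshop materials. The chunk is marked `eval=FALSE` so that knitting the document does not change any directory.

```{r, eval=FALSE}
# Print the current working directory
getwd()

# Hypothetical location of the workshop folder; replace with your own path
workshop_dir <- "~/eui-text-workshop"

# Change the working directory only if that folder exists
if (dir.exists(workshop_dir)) setwd(workshop_dir)

# List the Rmd files in the working directory to confirm the location
list.files(pattern = "\\.Rmd$")
```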
We will be exploring R through R Markdown over the next few modules. For more details and documentation see <http://rmarkdown.rstudio.com>.