---
output:
  md_document:
    variant: markdown_github
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```
# The California Department of Education in R
![](https://upload.wikimedia.org/wikipedia/commons/4/40/Seal_of_the_California_Department_of_Education.jpg)
The California Department of Education publishes a large collection of public data files. rCAEDDATA makes several of them available as ready-to-use datasets in R.
## Installation
```{r, eval=FALSE}
devtools::install_github("daranzolin/rCAEDDATA")
library(rCAEDDATA)
```
## Available Datasets
* [Cohort Outcome Data ("cohorts")](http://www.cde.ca.gov/ds/sd/sd/filescohort.asp) -- California Longitudinal Pupil Achievement Data System (CALPADS) cohort outcome data reported by race/ethnicity, program participation, and gender.
* [Dropouts by Race and Gender ("dropouts")](http://www.cde.ca.gov/ds/sd/sd/filesdropouts.asp) -- Data for grade seven through twelve dropouts and enrollment by race/ethnic designation and gender by school.
* [English Learners by Grade and Language ("english_learners")](http://www.cde.ca.gov/ds/sd/sd/fileselsch.asp) -- Data for English learners (ELs) by grade, language, and school.
* [Enrollment by School ("enrollments")](http://www.cde.ca.gov/ds/sd/sd/filesenr.asp) -- Data for school-level enrollment by racial/ethnic designation, gender, and grade.
* [Student Poverty FRPM ("frpm")](http://www.cde.ca.gov/ds/sd/sd/filessp.asp) -- Data for students eligible for Free or Reduced Price Meals (FRPM).
* [Graduates by Race and Gender ("graduates")](http://www.cde.ca.gov/ds/sd/sd/filesgrads.asp) -- Data for graduates and graduates meeting University of California (UC)/California State University (CSU) entrance requirements by race/ethnic designation and gender by school.
* [Primary and Short-Term Enrollment ("primary_and_short_term")](http://www.cde.ca.gov/ds/sd/sd/filesenrps.asp) -- Data for primary and short-term school-level enrollment by racial/ethnic designation, gender, and grade.
* [Expulsion and Suspension Data](http://www.cde.ca.gov/ds/sd/sd/filesesd.asp) -- Student discipline data by ethnicity, covering expulsions, in-school suspensions, and out-of-school suspensions.
* [Truancy](http://www.cde.ca.gov/ds/sd/sd/filestd.asp) -- Aggregate truancy data at the state, county, district, and school levels, including Census Day enrollment, cumulative enrollment, and rates.
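
To check which datasets an installed copy of the package actually ships, base R's `data()` can list them. The following is a minimal sketch, assuming rCAEDDATA is installed and that `graduates` (used elsewhere in this README) is among the bundled datasets. The guard only keeps the snippet from failing when the package is missing.

```r
# Sketch: list the datasets bundled with the package, then load one by name.
# Assumption: rCAEDDATA is installed; otherwise only a message is printed.
if (requireNamespace("rCAEDDATA", quietly = TRUE)) {
  # data(package = ...) returns a list whose `results` matrix has
  # the columns Package, LibPath, Item, and Title.
  bundled <- data(package = "rCAEDDATA")$results[, c("Item", "Title")]
  print(bundled)

  # Load a single dataset into the current environment and peek at it.
  data("graduates", package = "rCAEDDATA")
  str(graduates, list.len = 5)
} else {
  message("rCAEDDATA is not installed; see the Installation section.")
}
```

The `Item` column holds the names to pass to `data()`.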
## Examples
### Graduates
```{r, fig.width=12}
library(rCAEDDATA)
library(tidyverse)
data("graduates")

# Total graduates per year, split by UC eligibility
graduates %>%
  group_by(YEAR) %>%
  summarize(total_grads = sum(GRADS),
            Yes = sum(UC_GRADS),
            No = total_grads - Yes) %>%
  select(-total_grads) %>%
  # Reshape to long format: one row per year and eligibility status
  gather(Eligibility, Graduates, -YEAR) %>%
  ggplot(aes(YEAR, Graduates, fill = Eligibility)) +
  geom_bar(stat = "identity", color = "black") +
  labs(x = "Year",
       y = "Graduates",
       title = "California High School Graduates, 1992-2016",
       fill = "UC Eligible?") +
  scale_y_continuous(labels = scales::comma) +
  scale_fill_manual(values = c("yellow", "lightblue")) +
  theme_minimal()
```
### Dropouts
```{r}
data("dropouts")

dropouts %>%
  # Keep gender plus the per-grade dropout columns (D7 through D12)
  select(GENDER, matches("D[0-9]{1,2}")) %>%
  gather(GRADE, DROPOUTS, -GENDER) %>%
  # Strip the "D" prefix so the grade becomes numeric
  mutate(GRADE = as.numeric(stringr::str_replace(GRADE, "D", ""))) %>%
  group_by(GENDER, GRADE) %>%
  summarize(DROPOUTS = sum(DROPOUTS)) %>%
  ggplot(aes(GRADE, DROPOUTS, fill = GENDER)) +
  geom_bar(stat = "identity", position = "fill") +
  scale_x_continuous(breaks = 7:12) +
  labs(x = "Grade",
       y = "",
       title = "Proportion of Student Dropouts by Gender, Grades 7-12",
       fill = "Gender") +
  theme_minimal()
```
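
The wide-to-long reshaping in the examples uses `tidyr::gather()`, which tidyr now documents as superseded by `pivot_longer()`. Below is a minimal sketch of the same reshaping step with `pivot_longer()`, run on hypothetical toy data rather than the real `dropouts` dataset. The column names `D7`, `D8`, and `D12` and all values are made up for illustration, and `names_transform` assumes tidyr 1.1.0 or later.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data mimicking a wide layout:
# one dropout-count column per grade (values are invented).
toy_dropouts <- tibble(
  GENDER = c("F", "M"),
  D7 = c(1, 2),
  D8 = c(3, 4),
  D12 = c(5, 6)
)

# Reshape to one row per gender and grade. names_prefix strips the
# leading "D" and names_transform converts the remainder to integer,
# replacing the separate mutate() step.
long_dropouts <- toy_dropouts %>%
  pivot_longer(
    cols = matches("^D[0-9]{1,2}$"),
    names_to = "GRADE",
    names_prefix = "D",
    names_transform = list(GRADE = as.integer),
    values_to = "DROPOUTS"
  )

print(long_dropouts)
```

The result has the same three columns (`GENDER`, `GRADE`, `DROPOUTS`) that the `gather()` plus `mutate()` combination produces, so the downstream `group_by()` and `ggplot()` calls would not need to change.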
### Enrollments
```{r, fig.width=12}
enrollments %>%
  # Translate numeric ethnicity codes into readable labels
  mutate(ETHNIC = case_when(
    ETHNIC == 0 ~ "Not Reported",
    ETHNIC == 1 ~ "American Indian",
    ETHNIC == 2 ~ "Asian",
    ETHNIC == 3 ~ "Pacific Islander",
    ETHNIC == 4 ~ "Filipino",
    ETHNIC == 5 ~ "Hispanic",
    ETHNIC == 6 ~ "African American",
    ETHNIC == 7 ~ "White",
    ETHNIC == 9 | ETHNIC == 8 ~ "Two or More")
  ) %>%
  filter(DISTRICT %in% c("Santa Clara Unified",
                         "Milpitas Unified",
                         "San Jose Unified",
                         "Fremont Union High",
                         "Mountain View-Los Altos Union High",
                         "Cupertino Union",
                         "Campbell Union",
                         "Cambrian",
                         "Palo Alto Unified")
  ) %>%
  # Sum the per-grade enrollment columns into one total per group
  select(DISTRICT, YEAR, ETHNIC, starts_with("GR_")) %>%
  gather(GRADE, STUDENTS, -DISTRICT, -YEAR, -ETHNIC) %>%
  group_by(DISTRICT, YEAR, ETHNIC) %>%
  summarize(TOTAL_STUDENTS = sum(STUDENTS)) %>%
  ggplot(aes(YEAR, TOTAL_STUDENTS, fill = ETHNIC)) +
  geom_bar(stat = "identity", position = "fill") +
  facet_wrap(~DISTRICT, nrow = 3) +
  labs(x = "Year",
       y = "",
       title = "Ethnic Diversity in Silicon Valley, 2007-2017",
       subtitle = "Santa Clara Districts",
       fill = "Ethnicity") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
```
### Suspensions
```{r}
library(maps)
library(ggmap)
library(mapdata)

# Outlines for the state and its counties
states <- map_data("state")
ca_df <- subset(states, region == "california")
counties <- map_data("county")
ca_county <- subset(counties, region == "california")

# Share of drug-related suspensions per county for 2014-15
drug_data <- suspensions %>%
  filter(YEAR == "2014-15",
         AGGEGATELEVEL == "O") %>%
  group_by(NAME) %>%
  summarize(TOTAL_DRUGS = sum(DRUGS, na.rm = TRUE),
            TOTAL = sum(TOTAL, na.rm = TRUE),
            DRUG_PROP = round(TOTAL_DRUGS/TOTAL, 2))

# Join on lower-cased county names; named county_map to avoid
# masking ggplot2::map_data()
county_map <- left_join(ca_county, drug_data %>%
                          mutate(subregion = stringr::str_to_lower(NAME)),
                        by = "subregion")

ggplot(data = ca_df, mapping = aes(x = long, y = lat, group = group)) +
  coord_fixed(1.3) +
  geom_polygon(color = "black", fill = "gray") +
  geom_polygon(data = county_map, aes(fill = DRUG_PROP), color = "white") +
  geom_polygon(color = "black", fill = NA) +
  labs(title = "Proportion of Drug-Related Suspensions by County, 2014-2015",
       fill = "Proportion") +
  theme_void() +
  viridis::scale_fill_viridis()
```
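
The county map depends on the lower-cased county names matching the `subregion` values in the map polygons. Any county that fails to match is left with `NA` after the join and falls outside the fill scale. Below is a minimal sketch of how such mismatches could be spotted with `dplyr::anti_join()`. The toy tables and their values are hypothetical and stand in for `ca_county` and `drug_data`.

```r
library(dplyr)

# Hypothetical stand-in for the map polygons: lower-case county keys.
toy_map <- tibble(subregion = c("alameda", "san francisco", "yolo"))

# Hypothetical stand-in for the per-county summary (values invented).
# "Yolo County" is deliberately spelled so that it will not match.
toy_summary <- tibble(
  NAME = c("Alameda", "San Francisco", "Yolo County"),
  DRUG_PROP = c(0.10, 0.20, 0.30)
)

# Rows of the summary that have no matching polygon key.
unmatched <- toy_summary %>%
  mutate(subregion = tolower(NAME)) %>%
  anti_join(toy_map, by = "subregion")

print(unmatched)
```

An empty result would indicate that every county in the summary table found a polygon to fill.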