<!DOCTYPE html>
<html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1">
<!-- Latest compiled and minified CSS -->
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin="anonymous">
<!-- Optional theme -->
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap-theme.min.css" integrity="sha384-rHyoN1iRsVXV4nD0JutlnGaslCJuC7uwjduW9SVrLvRYooPp2bWYgmgJQIXwl/Sp" crossorigin="anonymous">
<!-- Latest compiled and minified JavaScript -->
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<link rel="stylesheet" href="style.css" />
<link rel="icon"
type="image/png"
href="images/icon.png">
<title>INFO 370 Introduction to Data Science</title>
<!-- Global Site Tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-106917583-1"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments)};
gtag('js', new Date());
gtag('config', 'UA-106917583-1');
</script>
</head>
<body>
<!-- Anything with the tag QUARTERLY next to it has to be updated each quarter. -->
<div class="container-fluid">
<div class="row">
<div class="col-xs-12">
<img src="images/world.png" class="img-responsive" />
<h1>INFO 370: Introduction to Data Science</h1>
<!-- QUARTERLY -->
<div class="lead">
<small style="float:right"><a href="mailto:glnelson@uw.edu">email Greg</a> | <a href="mailto:umang93@uw.edu">email Umang</a> | <a href="mailto:info370a_wi18@uw.edu">email class</a></small>
<br/>
Instructor: <a href="http://www.greglnelson.info/" target="_blank">Gregory L. Nelson</a>
<br/>
TA: <a href="" target="_blank">Umang Seghal</a>
<br/>
<small>Winter 2018. Section B</small>
</div>
<p>
<a href="#schedule">Schedule</a>
•
<a href="#grading">Grading</a>
•
<a href="#activities">Activities</a>
•
<a href="#reading">Reading</a>
•
<a href="#homework">Individual Homework</a>
•
<a href="#project">Project</a>
•
<a href="#teams">Teams</a>
•
<a href="#resources">Resources</a>
</p>
<p>In this class you'll learn how to think like a data scientist. You'll learn what data scientists do and how they do it. You'll also learn about the contexts in which a data scientist exists. By the end of the course, you should be able to enter any organization and begin to understand the social and technical contexts in which you help make decisions. If you want to be a great data scientist, this is the course for you.</p>
Learning objectives for the course:
<ol>
<li>Comprehend the practice of data science as an interactive, iterative process.</li>
<li>Critique the quality of data, models, and results within a decision context.</li>
<li>Consider contextual and critical perspectives in data science.</li>
<li>Gain familiarity with computational tools that support data scientists.</li>
<li>Understand what data scientists do in various organizational and social contexts.</li>
</ol>
<!--TODO: Update this-->
<!--<p>To learn these things, this course has three major phases:</p>-->
<!---->
<!--<ul>-->
<!--<li>Two weeks of context setting about software, software engineers, and society</li>-->
<!--<li>Five weeks of readings and skills on grand challenges in software engineering</li>-->
<!--<li>Four weeks of software engineering practice</li>-->
<!--</ul>-->
<!--<p>Additionally, we'll use the labs throughout the quarter to practice interpersonal skills critical to being a great software engineer.</p> -->
<h2>Prerequisites</h2>
<p>You should have aspirations to be a data scientist or to work closely with them. Because we'll use data to inform decisions, you should also know:</p>
<ul>
<li>How to use a scripting language (R, Python) to manipulate data</li>
<li>How to use the command line</li>
<li>How to use Git and GitHub</li>
<li>How to access a Web API</li>
</ul>
<p>The prerequisite course, INFO 201 (the Technical Foundations of Informatics), should be suitable preparation for the above. Refer to the <a target="_blank" href="https://info201.github.io/">INFO 201 online book</a> to refresh your knowledge of the course.</p>
<!-- QUARTERLY -->
<h2 id="office-hours">Office Hours</h2>
<p>We are available to talk about jobs, careers, graduate school, research, class, taboos, and anything else. Greg's office hours this quarter are held twice a week, Monday 12:30pm-1:30pm and Wednesday 5:30pm-6:30pm, both in the CSE 3rd floor breakout area next to the stairs (large whiteboard wall and windowed area). Umang's office hours this quarter are also twice a week, Tuesday and Thursday 11am-12pm, both in MGH Commons. Occasionally we need to schedule things over office hours. To guarantee we'll be around, write to us in advance to secure a time.</p>
<!-- <b>Monday and Tuesday 9-10:30 am in MGH-015</b> (door is locked, so just knock to get in). -->
<h2>Devices in Class</h2>
<p>We will use smartphones and laptops throughout the quarter to facilitate in-class activities and project work. <i>However</i>, research and student feedback clearly show that using devices for non-class-related activities harms not only your own learning but other students' learning as well. Therefore, <b>I only allow device usage during activities that require devices. At all other times, you should not be using your device.</b> We'll help you remember this by announcing when to bring devices out and when to put them away.</p>
<!-- QUARTERLY -->
<h2 id="schedule">Typical Week</h2>
<ul>
<li>Sunday: Do reading assignments (read readings, review and run scripts). Complete reflection survey.</li>
<li>Monday: Go to class and participate. After class, report struggles online.</li>
<li>Tuesday: Do reading assignments (read readings, review and run scripts). Complete reflection survey.</li>
<li>Wednesday: Go to class and participate. After class, report struggles online.</li>
<li>Wednesday-Saturday: Homework or group project work</li>
<li>Sunday: Homework due</li>
</ul>
<h2 id="schedule">Schedule</h2>
<table class="table table-striped text-left">
<tr><td colspan=3 class="text-uppercase lead">Week 0 — What is data science?</td></tr>
<tr>
<!-- QUARTERLY -->
<td>1/3</td><td>Lecture</td>
<td>
Data science: welcome and opportunity
<ul>
<li><a href="https://drive.google.com/open?id=1pGjmRGoROZNkCwQfDGQlpX4IiaDwRs20shL0SJ1cyfg" target="_blank">Slides</a></li> <!-- I'm experienced -->
<li>Lecture (20 min): <a href="class/data-science-process.html">Data science is a process</a></li>
<li>Review this syllabus (20 min)</li>
<li>Surveys (rest of class)</li>
</ul>
Assigned: <a href="homeworks/review-prerequisite.html">Homework 1</a>. Due Fri 1/12.
</td>
</tr>
<tr><td colspan=3 class="text-uppercase lead">Week 1 — Decision Making in Data Science</td></tr>
<tr>
<!-- QUARTERLY -->
<td>1/8</td><td>Lecture and Lab</td>
<td>
Data science is a process
<ul>
<li>Reading: <a href="https://drive.google.com/open?id=17Sbdm7QdMFPCEVkH0VxpB1Q-YKYx9xGWbD7gmY4lMCw">Decision Theory and Probability</a></li>
<li><a href="" target="_blank">Slides</a></li>
<li>Activity (50 min): Investigating a disease outbreak</li>
<li>Activity (40 min): Reflection </li>
<li>Activity (20 min): Introduction to Decision Theory </li>
<!-- <li><a href="https://docs.google.com/presentation/d/150DcoqEa2jguO9O06tuMyBt-xQaXRXMuK5LoDVLiQGs/edit?usp=sharing" target="_blank">Slides</a></li> -->
</ul>
</td>
</tr>
<tr>
<!-- QUARTERLY -->
<td>1/10</td><td>Lecture</td>
<td>
Understanding domain and applying decision theory
<ul>
<li>Reading: <a href="https://docs.google.com/document/d/1qwjT6fjy3WK-h_yxX8ayC4LdsZ-iYLWUxI-BRvfNueg/edit?usp=sharing">Understanding a domain</a></li>
<li><a href="https://docs.google.com/presentation/d/1yuYHoVFHq__Z4M4hVgaFjBkq3yYCI-4aMGePBp7onlU/edit?usp=sharing" target="_blank">Slides</a></li>
</ul>
Assigned: <a href="https://goo.gl/ZQr3DD">Homework 2</a>. Due <del>Sun 1/14</del> before class Weds 1/17.
<br/>
</td>
</tr>
<tr><td colspan=3 class="text-uppercase lead">Week 2 — Framing your analysis </td></tr>
<tr>
<!-- QUARTERLY -->
<td>1/15</td><td>No class, holiday</td><td> <ul>
<li>Reading (due 1/16 at noon): <a href="https://docs.google.com/document/d/18jnXkyRweEfTWJpgs6O6hxo67vMUZIgnZn7GyDVh3Pg/edit?usp=sharing">HW1 Reflection Survey, short R review and Reflection Survey</a></li> </ul></td>
</tr>
<tr>
<!-- QUARTERLY -->
<td>1/17</td><td>Lecture</td>
<td>
Framing: Using data and models for decisions and questions
<ul>
<li>Reading: <a href="https://docs.google.com/document/d/1mUa8O4EISusRyv0Q6Br_LOtJ_Y0gw7ZhjTUo97EIRhw/edit?usp=sharing" target="_blank">Framing and Reframing Data Science with Questions</a></li>
<li>Slides: <a href="https://docs.google.com/presentation/d/17IlP_-5S3Q0-FfCHOLyq31gBYMxtA0ZlBBQU8wYQ2Sc/edit?usp=sharing" target="_blank">Framing and Reframing Data Science with Questions</a></li>
</ul>
Not yet posted: <a href="https://docs.google.com/document/d/1PpjEyVv3jjKotZl5a168CUgaIDd4y8G2p9XucKOFSw0/edit?usp=sharing">Homework 3: Framing </a>. Due Sun 1/21.
<br/>
Assigned: <a href="project/group-formation.html">Project Milestone 1: Group formation & initial domain understanding</a>. Due <del>Sun 1/21</del> Tues 1/23.
</td>
</tr>
<tr><td colspan=3 class="text-uppercase lead">Week 3 — Modeling concepts; finding data</td></tr>
<tr>
<!-- QUARTERLY -->
<td>1/22</td><td>Lecture</td>
<td>
Causal Diagrams and Scoping
<ul>
<li>Reading: <a href="https://docs.google.com/document/d/1FPXD168DDiUfnleOK_55AptDj7hd6MZT903VK_Gpwmk/edit?usp=sharing" target="_blank">Causal Diagrams and Modeling</a></li>
<li>Slides: <a href="https://drive.google.com/open?id=1NJHlmOoidDHSc1ktwhnc7DJBwZUwV8JOdxQkQ5rniic" target="_blank">Slides</a></li>
</ul>
</td>
</tr>
<tr>
<!-- QUARTERLY -->
<td>1/22</td><td>Lab</td>
<td>
Using Causal Loop Diagrams for Scoping
<ul>
<!-- <li>Activity: <a href="labs/data-cleaning.html">Data cleaning with Wrangler</a></li> -->
</ul>
</td>
</tr>
<tr>
<!-- QUARTERLY -->
<td>1/24</td><td>Lecture</td>
<td>
PM1 Review; Ideating, finding and selecting data sources
<ul>
<li>Reading: <a href="https://drive.google.com/open?id=18NvydbuNaFpBOZkSXmEUehAsd42E3Ed83IY4pdm_LUM">Clarifications on Causal Loop Diagrams</a></li>
<li>Slides: <a href="https://docs.google.com/presentation/d/1tYhiypRdJ_FePxvmKkcj0n_P3YXvGkSip_0Yealb8no/edit?usp=sharing">Slides</a></li>
<li>Resources for finding data: <a href="https://drive.google.com/open?id=13NHgb0R9d3kwIjLBEqaj0ywI0ebhXyntmSNN2UxHivQ">Links</a></li>
</ul>
Assigned: <a href="https://docs.google.com/document/d/1cNJAHsXeTQ4DhFDxhfT6gdSGyVL2ywawUEKGM0j4Cc0/edit?usp=sharing">Project Milestone 2: Refining Framing and Evaluating Feasibility and Potential Impact</a>. Due Weds 1/31.
</td>
</tr>
<tr><td colspan=3 class="text-uppercase lead">Week 4 — Collecting and Making Sense of Data</td></tr>
<tr>
<!-- QUARTERLY -->
<td>1/29</td><td>Lecture</td>
<td>
Visualizing Data
<ul>
<li>Reading:
<a href="https://docs.google.com/document/d/1x9q8rDUkJsIzibqJGj_FuPENyey03QFx72Yk4TsdOqk/edit?usp=sharing" target="_blank">
Data Visualization with ggplot2
</a>
</li>
<li>Slides:
<a href="https://docs.google.com/presentation/d/16KIkZaQZXb9jPkmg7f-NlP3Glm-5XIO48NgSEvzzN-M/edit?usp=sharing" target="_blank">
Visualizing Data
</a>
</li>
</ul>
</td>
</tr>
<tr>
<!-- QUARTERLY -->
<td>1/29</td><td>Lab</td>
<td>
Lab - Review PM2, Decisions: Focus on Choices
</td>
</tr>
<tr>
<!-- QUARTERLY -->
<td>1/31</td><td>Lecture</td>
<td>
Tidying and cleaning data
<ul>
<li>Reading:
<a href="https://docs.google.com/document/d/1bHPe1sBPAUK8TlaDa39_0H7dq3jKWZoLvO0dPwiuUsk/edit?usp=sharing" target="_blank">
Tidying and cleaning data
</a>
</li>
<li>Slides:
<a href="https://docs.google.com/presentation/d/17aF8vfRvHYwlGj7QVfomk_SmnPDssBhgRYA-M3j5yK0/edit?usp=sharing" target="_blank">
Tidying and cleaning data
</a>
</li>
<li>Assigned: <a href="https://docs.google.com/document/d/1-tiULDb4q9atpBaBVJhLe2KiEt6WeZoRrx9jIa9Y-vs/edit?usp=sharing">Project Milestone 3: Completing and choosing a scoped framing for your project</a>. Due <del>Tues 2/6</del> Weds 2/7 (but do work on it over the weekend if possible). </li>
<li>Assigned: <a href="https://docs.google.com/document/d/1EI4hyqBHhPuJawMRk4xdAcRe-IfcT2xhMQzjp81fPpQ/edit?usp=sharing">Homework 4</a>. Due Sun 2/4.</li>
</ul>
</td>
</tr>
<tr><td colspan=3 class="text-uppercase lead">Week 5 — Visualization; Model Fitting</td></tr>
<tr>
<!-- QUARTERLY -->
<td>2/5</td><td>Lecture</td>
<td>
Web scraping; Exploratory Data Analysis and PM2 examples
<ul>
<li>Reading: <a href="https://docs.google.com/document/d/1trM5yW5zm8MyGarDLKeip57Z1icM0FPm9Vm6X-Bo0SM/edit?usp=sharing" target="_blank">Web Scraping</a></li>
<li>Slides: <a href="https://docs.google.com/presentation/d/1XL84YRkoTlQGFXHvOxCXgou3QIdrC2N1bE6_aQLSOLI/edit#slide=id.g27ab030d48_0_45" target="_blank">Slides</a></li>
</ul>
</td>
</tr>
<tr>
<!-- QUARTERLY -->
<td>2/5</td><td>Lab</td>
<td>
Web Scraping
<ul>
<li>Slides: <a href="https://docs.google.com/presentation/d/1zLwa2woDl_4Zz19SPIIreaM-tmpdp9GHJO9etRtTUMI/edit?usp=sharing">Slides</a></li>
<li>Code: <a href="https://drive.google.com/open?id=1gGYdw-Ac-DuF4W7Xf5IOCLu_SBE2r1kd" target="_blank">Code</a></li>
</ul>
</td>
</tr>
<tr>
<!-- QUARTERLY -->
<td>2/7</td><td>Lecture</td>
<td>
Models
<ul>
<li>Reading: <a href="https://docs.google.com/document/d/1mRK9UAzaB9wbwsvGbSLoIY7dcNHJVkQEzm2pNITW2_8/edit">Modeling</a></li>
<li>Slides: <a href="https://docs.google.com/presentation/d/1Ev6x2tgjYzAWWXS53c3P_tbeseduAxpDxay2iT408lk/edit?usp=sharing" target="_blank">Slides</a></li>
<li>Assigned: <a href="https://docs.google.com/document/d/1LJ3f8g5bo6tkpbOmR931hzKxC-0S0J0jEbLR1gzd8GY/edit">Homework 5</a>. Due Tuesday 2/13.</li>
</ul>
<ul>
</ul>
</td>
</tr>
<!--<tr><td colspan=3 class="text-uppercase lead">Week 6 — Finding and Comparing Models & Parameters</td></tr>-->
<tr><td colspan=3 class="text-uppercase lead">Week 6 — Modeling</td></tr>
<tr>
<!-- QUARTERLY -->
<td>2/12</td><td>Lecture</td>
<td>
Modeling as a search for "optimal" parameters
<ul>
<!--<li>Reading: Visualizing Models, Their Difference with Data (Residuals)</li>-->
<li>Reading: <a href="https://docs.google.com/document/d/1BDi1P7Tm6LU2w1BtRZdnafydDuW0KzP33ZVHeuLxs5k/edit?usp=sharing">Modeling: Interactions and Bootstrapping</a></li>
<li>Slides: <a href="https://docs.google.com/presentation/d/16uWgZSOa3JIx_OrRst8rakLQelNuCgnWPtuo04GfJsw/edit?usp=sharing">Modeling, Inference, Variation, and Linear Models</a></li>
</ul>
</td>
</tr>
<tr>
<!-- QUARTERLY -->
<td>2/12</td><td>Lab</td>
<td>
Fitting basic models in R
<ul>
<li>Code: <a href="https://github.com/Info-370-Winter-2018/class_6_1/blob/master/Class%206-1%20with%20updates%20after%20lab%20session.Rmd">Fitting and visualizing linear models</a></li>
</ul>
</td>
</tr>
<tr>
<!-- QUARTERLY -->
<td>2/14</td><td>Lecture</td>
<td>
Evaluating quality of model parameters (fitted models)
<!--Finding models for decision-making using computer simulations-->
<ul>
<li>Slides: <a href="https://docs.google.com/presentation/d/1P40dNddzuxbCG91jwOEJmBI-VReZmF6DmTNSzkWJPQE/edit?usp=sharing" target="_blank">Slides</a></li>
<li>Code: <a href="https://github.com/Info-370-Winter-2018/class_examples" target="_blank">Bootstrapping and Monte Carlo Simulations</a></li>
</ul>
<ul>
<li>Assigned: <a href="https://docs.google.com/document/d/1Yan1Ndfy5kxMhfPkF7fhQMYjyxHBNvL6XllNRW8lGmQ/edit?usp=sharing">Homework 6</a>. Due Friday Feb 23.</li>
<li>Assigned: <a href="http://www.greglnelson.info/info370/project/proposal-review.html">Project Milestone 4: Project Review</a>. Due <del>Mon 2/12</del> Friday 2/16.</li>
<li>Assigned: <a href="project/proposal-revision.html">Project Milestone 5: Project Revision</a>. Due <del>Mon 2/19</del> Tuesday 2/20.</li>
<li>Assigned: <a href="project/project-meeting.html">Project Milestone 6: Project check-in meetings</a>. Due Tues 2/20. </li>
<li>Assigned: Project Milestone 7 & 8: <a href="project/presentation.html">Presentation</a> & <a href="project/artifact.html">Artifact</a>. Due by end of day on day of finals (March 13).</li>
</ul>
</td>
</tr>
<!--<tr><td colspan=3 class="text-uppercase lead">Week 7 — Contrasting and Interpreting Models</td></tr>-->
<tr><td colspan=3 class="text-uppercase lead">Week 7 — Interpreting models</td></tr>
<tr>
<!-- QUARTERLY -->
<td>2/19</td><td>Holiday</td>
<td>
</td>
</tr>
<tr >
<!-- QUARTERLY -->
<td>2/21</td><td>Lecture</td>
<td>
<!--Model fit, overfitting and cross-validation-->
<!--Common models and how to use them-->
Logistic Regression; Evaluating models with cross-validation
<ul>
<li>Note: Greg will be at a conference</li>
<li>Reading: <a href="https://docs.google.com/document/d/1JJKH71y-Axm41-IDcaBjQ44wC1YfYL-J8rTYmAGWOBQ/edit?usp=sharing">Logistic regression; cross-validation and overfitting</a></li>
</ul>
</td>
</tr>
<!-- QUARTERLY -->
<!--<tr><td colspan=3 class="text-uppercase lead">Week 8 — Models, Bias, and Social Impacts</td></tr>-->
<tr><td colspan=3 class="text-uppercase lead">Week 8 — Understanding Models </td></tr>
<tr>
<!-- QUARTERLY -->
<td>2/26</td><td>Lecture</td>
<td>
Logistic Regression; simulating decisions using models and residuals
<ul>
<!--<li>Reading: Using simulations to understand logistic regression <a href=""></a></li><-->
<li>Slides: <a href="https://docs.google.com/presentation/d/1u517LRSLY-mDwBr5n2T6S14qGqbYf6PZNsVqYTbMPQw/edit?usp=sharing">Slides</a></li>
<li>Code: <a href="https://github.com/Info-370-Winter-2018/class_examples">see 8.2 Filled In</a></li>
</ul>
</td>
</tr>
<tr>
<!-- QUARTERLY -->
<td>2/26</td><td>Lab</td>
<td>
Trying different thresholds for logistic regression and simulating decisions
<ul>
<li>Code: <a href="https://github.com/Info-370-Winter-2018/class_examples">see 8.1 (not filled in) </a></li>
</ul>
</td>
</tr>
<tr>
<!-- QUARTERLY -->
<td>2/28</td><td>Lecture</td>
<td>
Bias; Simulating decisions using models and residuals
<ul>
<li>Slides: <a href="https://docs.google.com/presentation/d/1xxAK8KfgzwKEW-ISEgN0JbcxNEu_2szFWoRrHCyF_LI/edit?usp=sharing">Slides</a></li>
<li>Code: <a href="https://github.com/Info-370-Winter-2018/class_examples">see simulating decision for class 8.3 and class 8.2 filled in</a></li>
</ul>
</td>
</tr>
<!--<tr><td colspan=3 class="text-uppercase lead">Week 9 — Big Data and Opacity </td></tr>-->
<tr><td colspan=3 class="text-uppercase lead">Week 9 — Debugging and Limitations of Models</td></tr>
<tr >
<!-- QUARTERLY -->
<td>3/5</td><td>Lecture</td>
<td>
Debugging strategies for R code and Monte Carlo Simulations
<!--Scaling to "Big Data"-->
<ul>
<!--<li>Reading: Excerpt from Fourth Paradigm; Business Articles on Big Data</li>-->
<!-- <li>Activity (110 min): <a href="activities/work.html">Work time</a></li> -->
<li>Supplemental Reading:
<a href="https://github.com/berkeley-scf/tutorial-R-debugging/blob/master/R-debugging.Rmd">Debugging in R</a>,
<a href="https://web.stanford.edu/class/archive/cs/cs106a/cs106a.1134/handouts/250%20Debugging%20Strategies.pdf">Debugging Mindset</a>,
<a href="https://bookdown.org/rdpeng/rprogdatascience/">Examples of R language features</a> </li>
<li>Slides: <a href="https://docs.google.com/presentation/d/1v0U22VILRM4QaHJicT15ls5Z0KDP6NOqdzs000THZ1Q/edit?usp=sharing">Slides</a></li>
<li>Code: <a href="https://github.com/Info-370-Winter-2018/class_examples">See code for 9-1</a></li>
</ul>
</td>
</tr>
<tr>
<!-- QUARTERLY -->
<td>3/7</td><td>Lecture</td>
<td>
Interpreting models and limitations; reflecting on class projects
<ul>
<!--<li>Reading: Excerpt from Antifragile and Simulations</li>-->
<li>Reading: Reflections from data scientists</li>
<li>Optional but encouraged Reading: Excerpt from Antifragile, big data and opacity</li>
</ul>
</td>
</tr>
<tr><td colspan=3 class="text-uppercase lead">Week 10 — Project Fair </td></tr>
<tr>
<!-- QUARTERLY -->
<td>3/13 Tues</td><td>Project Fair</td>
<td>
Room
</td>
</tr>
<tr><td colspan="3"><a href="homeworks/reflect.html">Homework 9 (Project and Course Reflection)</a> Due 3/14. <br/></td></tr>
</table>
<h2 id="grading">Grading</h2>
<p>There are <b>100 points</b> you can earn in this class:</p>
<ul>
<li><a href="#activities">Activities</a> (12 points, 0.5 points for each class or lab). Show up and engage to get credit.</li>
<li><a href="#reading">Reading</a> (15 points, 1 point each). Prove you read and understood the reading.</li>
<li><a href="#homework">Individual Homework</a> (33 points). Prove you understand important data science topics.</li>
<li><a href="#project">Project</a> (40 points, <i>team score</i>). Reach several milestones related to your team data science project.</li>
</ul>
<p>We will use the <a href="https://canvas.uw.edu/courses/721562/pages/ischool-standard-grading-scale" target="_blank">iSchool Standard Grading Scale</a> to convert your grade percentage (as shown in Canvas) to a 4.0 scale.</p>
<table class="table table-striped">
<tr>
<td>≥ 97% → 4.0</td>
<td>90.5 → 3.5</td>
<td>83.9 → 3.0</td>
<td>78 → 2.5</td>
<td>73 → 2.0*</td>
<td>68 → 1.5</td>
<td>62 → 0.9</td>
</tr>
<tr>
<td>95.7 → 3.9</td>
<td>89.2 → 3.4</td>
<td>82.6 → 2.9</td>
<td>77 → 2.4</td>
<td>72 → 1.9</td>
<td>67 → 1.4</td>
<td>61 → 0.8</td>
</tr>
<tr>
<td>94.4 → 3.8</td>
<td>87.8 → 3.3</td>
<td>81.3 → 2.8</td>
<td>76 → 2.3</td>
<td>71 → 1.8</td>
<td>65 → 1.2</td>
<td>60 → 0.7***</td>
</tr>
<tr>
<td>93.1 → 3.7</td>
<td>86.5 → 3.2</td>
<td>80 → 2.7</td>
<td>75 → 2.2</td>
<td>70 → 1.7**</td>
<td>64 → 1.1</td>
<td>< 60 → 0.0</td>
</tr>
<tr>
<td>91.8 → 3.6</td>
<td>85.2 → 3.1</td>
<td>79 → 2.6</td>
<td>74 → 2.1</td>
<td>69 → 1.6</td>
<td>63 → 1.0</td>
</tr>
<tr>
<td colspan="7">
<small>*: 2.0 is the minimum grade required for any <i>required</i> INFO course to count towards an informatics degree.</small> <br/>
<small>**: The UW requires a 1.7 or better for non-degree requirements for undergraduate courses.</small> <br/>
<small>***: 0.7 is the lowest passing grade in an undergraduate course.</small>
</td>
</tr>
</table>
<div class="alert alert-danger" id="late-policy">
<p>
Late work receives no credit unless you can provide a note from a health care professional or provost documenting the reason for your absence, or you make arrangements with the instructor.
However, you can miss up to 3 activities without penalty and without documentation.
This should be enough to allow for sickness, unavoidable travel, or other personal matters.
</p>
<p>
If you miss a reading quiz due to sickness,
you can make up the quiz credit by writing a 250-500 word critique of the reading and submitting
it to your Google Drive folder within a week of the quiz you missed.
Title the Google Doc with the class number and "make up quiz".
For example, use "2.3 make up quiz" for the week 2, class 3 (Wednesday lecture) quiz.
</p>
<!--If you miss a reading quiz due to sickness, you can make up the quiz credit by sending a critique of the reading to me within a week of the due date that reports at least three improvements to the content, including high level issues such as topics that should be discussed or points you think are wrong, to low level issues including spelling, grammar, or clarity.-->
</div>
<h2 id="activities">Activities</h2>
<p>Each day in class we'll practice some skill. You'll get <a href="#grading">0.5 points</a> if you engage in and complete the activity. How you get credit depends on the activity; sometimes being present will be enough, sometimes arriving to class on time will be enough, and sometimes you'll have to turn something in.</p>
<h2 id="reading">Reading</h2>
<p>To access the readings, you will do the following:</p>
<ol>
<li>Click on the reading (a link to a Google doc) on the <a href="#schedule">course schedule</a></li>
<li>Copy the Google Doc to your personal INFO 370 folder (which we shared with you at the beginning of the course). <a href="https://support.google.com/docs/answer/49114" target="_blank">Instructions on making a copy of a file in Google Drive.</a></li>
<li>Read through the Google Doc. Highlight and <a href="https://support.google.com/docs/answer/65129" target="_blank">comment</a> on any parts that are confusing.</li>
<li>Complete the questions marked "TODO".</li>
</ol>
<p>
You should complete your readings and reflection <i>before</i> the beginning of each lecture (twice a week).
The Google Doc in your personal Drive folder is your submission (we do not use Canvas for readings).
Each class, you'll come prepared to discuss the assigned reading.
</p>
<p>The day that each reading is due, we'll do the following:</p>
<ul>
<li>We clarify confusions based on reading reflections.</li>
<li>We give you some questions to answer individually about the assigned reading (a "Reading Quiz").</li>
<li>You turn in your answer.</li>
<li>You discuss your answers with your neighbor.</li>
<li>We discuss the correct answers as a class.</li>
</ul>
<p>
You will receive <a href="#grading">0.75 points</a> for completing the reading and reflection before class (on the Google Doc).
You will receive up to another 0.25 points for getting the in-class reading quiz correct.
We will give partial credit for partially correct answers on the reading quiz, at our discretion.
In total, you can receive up to 1 point per reading.
</p>
<h2 id="homework">Individual Homework</h2>
<p>There will be about one individual homework assignment each week. These are separate from reading assignments and project milestones, and they give you practice and feedback on the skills in a narrower context than your project. They will be due on the nearest Sunday.</p>
<p>
All homeworks are due by 11:59:00 PM PST on the specified date.
</p>
<p>
The goal of the individual homework assignments is to check and deepen your understanding of specific concepts which are critical to your understanding of data science.
</p>
<h2 id="project">Project</h2>
<p>The project is split across 8 milestones/assignments, each worth a different amount:</p>
<ol>
<li><a href="project/group-formation.html">Group formation and initial questions</a> (2 points).</li>
<li><a href="project/pilot-study.html">Pilot study</a> (3 points). </li>
<li><a href="">Proposal</a> (4 points). </li>
<li><a href="project/proposal-review.html">Proposal Review</a> (2 points). </li>
<li><a href="project/proposal-revision.html">Proposal Revision</a> (3 points).</li>
<li><a href="project/project-meeting.html">Project check-in meeting</a> (2 points). </li>
<li><a href="project/presentation.html">Presentation</a> (8 points).</li>
<li><a href="project/artifact.html">Artifact</a> (16 points).</li>
</ol>
<p>
All assignments except the Project check-in meeting are due by 11:59:00 PM PST on the specified date.
</p>
<p>
The goal of the project is for you to practice the process of data science to make or inform a decision,
so you can experience the nuances of formulating a good question, setting up process, constraints, and plans in relation to a context.
Note, however, that because the timeline for the project is so short, it <em>won't</em> give you a deep, longitudinal experience with data science, nor will it give you practice with massive complexity or scale.
I believe these are experiences best left to practice in industry, as they're <em>very</em> difficult to replicate in the artificial setting of school.
</p>
<h2 id="teams">Project Teams</h2>
<ul>
<li><a href="https://github.com/Info-370-Winter-2018/group-formation-for-projects-quokka" target="_blank">#Quokka </a> </li>
<li><a href="https://github.com/Info-370-Winter-2018/group-formation-for-projects-giraffekittens" target="_blank">Giraffe-kittens</a> </li>
<li><a href="https://github.com/Info-370-Winter-2018/group-formation-for-projects-seaotters" target="_blank">Sea Otters</a></li>
<li><a href="https://github.com/Info-370-Winter-2018/group-formation-for-projects-tamagotchu" target="_blank">Tamagotchu</a></li>
<li><a href="https://github.com/Info-370-Winter-2018/group-formation-for-projects-honeybadger" target="_blank">Honey Badger</a></li>
<li><a href="https://github.com/Info-370-Winter-2018/group-formation-for-projects-platypus" target="_blank">Platypus</a> </li>
<li><a href="https://github.com/Info-370-Winter-2018/group-formation-for-projects-team-awesome-possum" target="_blank">Awesome Possum</a> </li>
<li><a href="https://github.com/Info-370-Winter-2018/Team-Crabs-Repo" target="_blank">Crabs</a></li>
</ul>
<h2 id="resources">Resources</h2>
<p>
Links to Data Science communities at/near UW:
</p>
<ul>
<li>
<a href="misc/eScience-education-brochure.pdf" target="_blank">Data Science Education at UW</a>
</li>
<li>
<a href="http://escience.washington.edu/" target="_blank">UW eScience Institute</a>: cross-disciplinary community of researchers thinking about data science.
They often have <a href="http://escience.washington.edu/escience-events/" target="_blank">events</a> open to students and the public.
</li>
<li>
<a href="https://www.seattledataforgood.org/" target="_blank">Seattle Data for Good</a>:
A citizen service organization dedicated to supporting data science related needs of local civic organizations.
Benji Xie is the head of the <a href="http://benjixie.com/sdfg" target="_blank">Education Co-op</a>, dedicated to developing a community of learners within the organization.
</li>
</ul>
<p>
Links to recommended learning resources (most of which are free):
</p>
<ul>
<li><i><a href="http://www.cookbook-r.com/" target="_blank">R Graphics Cookbook</a></i> (Chang, 2012): Practical book on data visualizations in R. Website provides R code with great explanations for common tasks.</li>
<li><i><a href="https://alliance-primo.hosted.exlibrisgroup.com/primo-explore/fulldisplay?docid=CP51262790190001451&context=L&vid=UW&search_scope=all&tab=default_tab&lang=en_US" target="_blank">Data mining with R : learning with case studies</a></i> (Torgo, 2017): Practical book on data mining (a former buzzword, mostly like data science) using detailed case studies and good commentary. Good for learning R and how to apply data science to problems (without reframing those problems - it takes the questions mostly as given).</li>
<li><i><a href="http://r4ds.had.co.nz/" target="_blank">R for Data Science</a></i> (Grolemund & Wickham): Free online textbook which teaches you to use R for data science. We use their chapters on exploratory data analysis (Ch 7) and modeling (Part IV).</li>
<li><i><a href="https://infoactive.co/data-design/" target="_blank">Data + Design</a></i> (Infoactive): Free and open-source ebook which (beautifully) introduces the fundamentals of data and how to prepare and visualize it. We draw upon their chapter on "Data Fundamentals."</li>
<li><i><a href="http://xcelab.net/rm/statistical-rethinking/" target="_blank">Statistical Rethinking</a></i> (McElreath, 2016): Textbook which provides a nice framing of statistics as engineering, teaching a Bayesian perspective to statistics with R. We use the first 3 chapters to introduce model creation.</li>
<li><i><a href="https://www.rstudio.com/resources/cheatsheets/" target="_blank">RStudio Cheat Sheets</a></i>: Fantastic cheat sheets for anyone learning or using R.</li>
<li><i><a href="https://www.datacamp.com/" target="_blank">Datacamp</a></i>: Online learning platform for data science that we use for some assignments.</li>
</ul>
<p>
Links to important UW resources:
</p>
<ul>
<li><a href="http://hr.uw.edu/dso/" target="_blank">Disability Services Office</a>: If you require disability accommodations for this course, work with the DSO.</li>
<li><a href="https://depts.washington.edu/safecamp/" target="_blank">SafeCampus</a>: Resources and points of contact to promote a safer UW community.</li>
</ul>
</div>
</div>
</div>
</body>
</html>