bioinformatics_and_data_science

Bioinformatics and Data Science Fall 2019
BIOL 792 - 1044
Prof: Julie Allen; SFB 206; jallen23@unr.edu
Class: Monday 5:30 – 8:15pm DMSC 102
Office Hours: By appointment

Course Description

Online data repositories and individual data sets are growing at unprecedented rates. Correspondingly, the need for bioinformatic and data science skills is growing rapidly to keep pace with these repositories. The main goal of this course is to introduce students to beginning data science and bioinformatic skills for managing large datasets. The course will focus on Python programming and working in the shell, along with other lessons including an introduction to data standards and version control, tools for cleaning up dirty data, and how to work with computing clusters.

With an understanding of how to work with different data sources and link them, we will increase not only the creativity of our science but also our ability to do more integrative research. A prerequisite for this course is enrollment as an M.S. or PhD student, and students are strongly encouraged to have some background in Linux, R, Python, or Perl programming. The course will be capped at 15 students. For questions please email me.

Student Learning Outcomes

The goal of the course is to go through many data science tools, tricks, and hacks from a bioinformatics angle. By the end of the course you should feel comfortable with the tools data scientists use in biology and be able to solve and/or troubleshoot both small and large-scale data challenges in biology.

Material Distribution

All readings, lab instructions, datasets, etc. will be available here.

Attendance and Participation

Because this is a graduate class, I expect full attendance and participation at all times, including all in-class computational exercises, homework, and projects.

Grade

Weekly assignments (40%) Assignments will involve working in Unix, writing simple Python scripts, and other small tasks. These will use data sets that will be provided over the course of the semester. Assignments will be evaluated based on completion. You can work in teams of 2 or 3, but each of you will turn in your own notes.txt and Python or Bash script for each assignment. The notes file should explain, step by step, what you did to complete the task. More guidelines on these files and each specific assignment will be available on GitHub.

Participation (20%) Participation entails showing up for class prepared and doing your best to work through assigned tasks and programming example problems. Because all classes build on previous classes, contact me if you need to miss a class. Some of the material we cover might be easy and quick to figure out. Other material and tasks will present roadblocks that are more difficult. We are building a positive community in this class, so your attitude and helpfulness will be evaluated.

Independent project (40%) Everyone will be responsible for an independent project (this can be done either individually or as a group of no more than 3 people). The goal of your semester project is to incorporate the tools learned in this class into a project of your own design. Ideally this would be something related to your research that will help you move your PhD forward, but you could decide to work on a new project idea. A requirement of the project is to incorporate at least 2 tools learned in the class to resolve a biological question or computational problem. You will turn in a one to two page write up of the project and how you will solve it by week 6. On the last day of class you will turn in a one to three page write up of the project, put the documented code on GitHub (or submit it to me), and present your project in a 10-15 min presentation.

White paper

  • 1-2 page White Paper: The 1-2 page write up should follow the format of a white paper. It should open with an introduction to the biological or other type of problem you are trying to solve (with references). The next section will be the methods. Here, describe what you are going to do. For example: "I will write a Python script to take the data from PHYLIP format to FASTA format". Two techniques from the class should be used (e.g. Python, shell scripts, GitHub, relational databases, cleaning data).
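To make the methods example above concrete, here is a minimal sketch of what such a PHYLIP-to-FASTA script might look like. It is an illustration only, not a required solution. It assumes a relaxed, sequential PHYLIP file in which each taxon sits on one line and the name is separated from the sequence by whitespace; the function name `phylip_to_fasta` and the `width` parameter are hypothetical choices made for this sketch.

```python
def phylip_to_fasta(phylip_text: str, width: int = 60) -> str:
    """Convert relaxed sequential PHYLIP text to FASTA text.

    Assumption: one taxon per line, with the name separated from
    the sequence by whitespace (interleaved files are not handled).
    """
    # Drop blank lines so the header is always the first entry.
    lines = [line for line in phylip_text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty PHYLIP input")

    # The PHYLIP header holds the number of taxa and the number of sites.
    n_taxa, n_sites = (int(value) for value in lines[0].split()[:2])

    records = []
    for line in lines[1:1 + n_taxa]:
        name, sequence = line.split(maxsplit=1)
        sequence = sequence.replace(" ", "")  # remove spacing inside the sequence
        if len(sequence) != n_sites:
            raise ValueError(f"{name}: expected {n_sites} sites, found {len(sequence)}")
        records.append((name, sequence))

    if len(records) != n_taxa:
        raise ValueError(f"expected {n_taxa} taxa, found {len(records)}")

    # Write each record as a '>' header followed by wrapped sequence lines.
    output = []
    for name, sequence in records:
        output.append(f">{name}")
        for start in range(0, len(sequence), width):
            output.append(sequence[start:start + width])
    return "\n".join(output) + "\n"


example = "2 8\ntaxonA ACGTACGT\ntaxonB ACGTTCGT\n"
print(phylip_to_fasta(example), end="")
```

A real project script would typically read the input from a file named on the command line. For interleaved or strict 10-character-name PHYLIP files, an established parser is likely a safer choice than this hand-rolled version.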

Project Summary + Presentation

  • 1-2 page Project Paper: The 1-2 page final paper should follow the same format as the white paper, with added results and discussion sections. Again, explain in detail which two tools from the class you used and how that turned out. In the discussion, talk about how this helped your project and what you would do next (or what you learned).

  • 10-15 min presentation: On the last day of class each of you will present your project to the class. No more than 15 min each. Feel free to show GitHub repos and/or run code in class.

Statement on Academic Dishonesty:

"Cheating, plagiarism or otherwise obtaining grades under false pretenses constitute academic dishonesty according to the code of this university. Academic dishonesty will not be tolerated and penalties can include canceling a student's enrollment without a grade, giving an F for the course or for the assignment. For more details, see the University of Nevada, Reno General Catalog."

Statement of Disability Services:

"Any student with a disability needing academic adjustments or accommodations is requested to speak with the Disability Resource Center (Thompson Building, Suite 101) as soon as possible to arrange for appropriate accommodations."

Statement on Audio and Video Recording:

"Surreptitious or covert video-taping of class or unauthorized audio recording of class is prohibited by law and by Board of Regents policy. This class may be videotaped or audio recorded only with the written permission of the instructor. In order to accommodate students with disabilities, some students may be given permission to record class lectures and discussions. Therefore, students should understand that their comments during class may be recorded."

SCHEDULE

*This is a tentative outline of the schedule; topics may change according to the pace and needs of the students in the course.

| Week | Date | Class | Due |
| --- | --- | --- | --- |
| Week 1 | 26th August | Course intro, Intro to Unix I | |
| --- | 2nd September | Labor Day No Class | |
| Week 2 | 9th September | Unix II | Homework 1 |
| Week 3 | 16th September | Unix III | Homework 2 |
| Week 4 | 23rd September | Version control with Git | Homework 3 |
| Week 5 | 30th September | Gitignore/GitHub | *1-2 page project writeup due |
| Week 6 | 7th October | Git Conflicts, Beginning Programming, Python I | nothing due! |
| Week 7 | 14th October | Python II | Homework 4 |
| Week 8 | 21st October | Python III + regular expressions | Homework 5 |
| Week 9 | 28th October | Python IV regex in Python | Homework 6 |
| Week 10 | 4th November | Python Modules + Open Refine | Homework 7 |
| --- | 11th November | Veterans Day No Class | |
| Week 11 | 18th November | Relational Databases | |
| Week 12 | 25th November | Clusters/Cloud Computing/Pronghorn | |
| Week 13 | 2nd December | Project prep | Homework 8 |
| Week 14 | 9th December | Present Projects | *projects due |
