This repository has been archived by the owner on Jan 15, 2020. It is now read-only.

stevensmiley1989/BreastCancer


BreastCancer

Repository by Steven Smiley

This repository hosts the files I used to analyze and evaluate the Wisconsin Diagnostic Breast Cancer (WDBC) dataset in Python.

Table of Contents to Repository

1 Jupyter Notebook

Jupyter Notebook(s) written in Python.

Notebooks and their descriptions:

  • ML_for_Diagnosing_Breast_Cancer-Steven_Smiley.ipynb: my Jupyter notebook written in Python for Kaggle.
  • Applied_ML_Bioinformatic_BP_WDBC.ipynb: my Jupyter notebook written in Python for Medium.

A single input file, data.csv, contains the full Wisconsin Diagnostic Breast Cancer (WDBC) dataset.

data.csv
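As a rough illustration of how data.csv might be read before modeling, here is a minimal sketch using pandas. It is not taken from the notebooks. The helper name load_wdbc is hypothetical, and the column names "diagnosis" (with values "M" and "B") and "id" are assumptions based on the commonly distributed Kaggle copy of the dataset; adjust them if your copy of the file differs.

```python
import pandas as pd


def load_wdbc(path="data.csv", label_column="diagnosis"):
    """Load the WDBC CSV and split it into features X and labels y.

    Assumptions (not confirmed by this repository): the label column is
    named "diagnosis" and holds "M" (malignant) or "B" (benign), and an
    optional "id" column identifies each sample.
    """
    df = pd.read_csv(path)
    # Drop columns that are entirely empty, such as a trailing blank column.
    df = df.dropna(axis="columns", how="all")
    # Map the text labels to integers: malignant -> 1, benign -> 0.
    y = df[label_column].map({"M": 1, "B": 0})
    # Treat everything except the label and the optional id as a feature.
    X = df.drop(columns=[label_column, "id"], errors="ignore")
    return X, y
```

With the file in the working directory, `X, y = load_wdbc("data.csv")` would return a feature table and a 0/1 label series that can be passed to a scikit-learn estimator.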

The outputs from the Jupyter notebooks are saved in two folders: Models and Figures.

4 Credits/References

4.1 Wisconsin Diagnostic Breast Cancer (WDBC) dataset

Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

Creators of Wisconsin Diagnostic Breast Cancer (WDBC)

Dr. William H. Wolberg, General Surgery Dept., University of Wisconsin, Clinical Sciences Center, Madison, WI 53792 wolberg@eagle.surgery.wisc.edu

W. Nick Street, Computer Sciences Dept., University of Wisconsin, 1210 West Dayton St., Madison, WI 53706 street@cs.wisc.edu 608-262-6619

Olvi L. Mangasarian, Computer Sciences Dept., University of Wisconsin, 1210 West Dayton St., Madison, WI 53706 olvi@cs.wisc.edu

4.2 Dr. Olson's Research on 'Data-driven advice for applying machine learning to bioinformatics problems.'

Olson, Randal S. et al. "Data-driven advice for applying machine learning to bioinformatics problems." Pacific Symposium on Biocomputing 23 (2017): 192-203.

4.3 Coding Libraries

4.3.1 SciPy

Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, CJ Carey, İlhan Polat, Yu Feng, Eric W. Moore, Jake VanderPlas, Denis Laxalde, Josef Perktold, Robert Cimrman, Ian Henriksen, E.A. Quintero, Charles R Harris, Anne M. Archibald, Antônio H. Ribeiro, Fabian Pedregosa, Paul van Mulbregt, and SciPy 1.0 Contributors. (2019) SciPy 1.0–Fundamental Algorithms for Scientific Computing in Python. preprint arXiv:1907.10121

4.3.2 Scientific computing in Python

  • Travis E. Oliphant. Python for Scientific Computing, Computing in Science & Engineering, 9, 10-20 (2007)
  • K. Jarrod Millman and Michael Aivazis. Python for Scientists and Engineers, Computing in Science & Engineering, 13, 9-12 (2011)

4.3.4 IPython

  • Fernando Pérez and Brian E. Granger. IPython: A System for Interactive Scientific Computing, Computing in Science & Engineering, 9, 21-29 (2007)

4.3.5 Matplotlib

  • J. D. Hunter, "Matplotlib: A 2D Graphics Environment", Computing in Science & Engineering, vol. 9, no. 3, pp. 90-95, 2007.

Preprint of the Olson et al. paper cited in 4.2: https://arxiv.org/abs/1708.05070v2

4.3.6 pandas

  • Wes McKinney. Data Structures for Statistical Computing in Python, Proceedings of the 9th Python in Science Conference, 51-56 (2010)

4.3.7 scikit-learn

  • Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Édouard Duchesnay. Scikit-learn: Machine Learning in Python, Journal of Machine Learning Research, 12, 2825-2830 (2011)

5 Contact-Info

Feel free to contact me to discuss any issues, questions, or comments.

6 License

This repository contains a variety of content: some developed by Steven Smiley and some from third parties. Third-party content is distributed under the licenses provided by those parties.

The content developed by Steven Smiley is distributed under the following license:

I am providing code and resources in this repository to you under an open source license. Because this is my personal repository, the license you receive to my code and resources is from me and not my employer.

Copyright 2020 Steven Smiley

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
