SebastianRokholt/Soccer-Predictions

Soccer Predictions

Predicting match outcomes in Toppserien, the top division of Norwegian women's soccer

This repository contains the code for a Data Science project about predicting the outcome of soccer matches, including a simple web application built with Flask.

The Data

The dataset was provided through the course material for INF161: Data Science at the University of Bergen, and was sourced from FBRef. The main dataset contains statistics on Toppserien matches from the 2017 to 2019 seasons. The repository also contains the planned fixtures (games) for the 2020 season, which were used for the final predictions at the end of the project. These predictions were submitted to the INF161 course's Kaggle contest.

The Process

Part 1: Preparations
Data inspection, cleaning and wrangling. Feature engineering and some simple visualisations.
Part 2: Modelling
Exploratory data analysis and machine learning modelling. Evaluations and predictions on the 2020 data.
Part 3: Implementation
Creating a simple Flask application for the model, where a user can enter two Toppserien teams and receive a predicted match outcome.
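
To make the idea in Part 3 concrete, here is a minimal sketch of the kind of function a Flask view could call to turn two team names into a predicted outcome. It is not the code from this repository: the team names, the one-hot encoding, the label mapping and the StubModel class are all hypothetical stand-ins, and the only assumption about the real model is that it exposes a scikit-learn style predict() method.

```python
from typing import Dict, List, Sequence

# Hypothetical team list; the real application would use the Toppserien teams.
TEAMS: List[str] = ["Team A", "Team B", "Team C"]

# Hypothetical mapping from model labels to readable outcomes.
OUTCOMES: Dict[int, str] = {0: "home win", 1: "draw", 2: "away win"}


class StubModel:
    """Stand-in for the trained model, with a scikit-learn style predict()."""

    def predict(self, rows: Sequence[Sequence[int]]) -> List[int]:
        # Toy rule, not a real model: the team listed earlier in TEAMS wins.
        labels = []
        for row in rows:
            n = len(row) // 2
            home_idx = list(row[:n]).index(1)
            away_idx = list(row[n:]).index(1)
            labels.append(0 if home_idx < away_idx else 2)
        return labels


def encode_match(home: str, away: str) -> List[int]:
    """One-hot encode the home and away teams into a single feature row."""
    if home not in TEAMS or away not in TEAMS:
        raise ValueError("unknown team")
    if home == away:
        raise ValueError("a team cannot play itself")
    home_vec = [int(team == home) for team in TEAMS]
    away_vec = [int(team == away) for team in TEAMS]
    return home_vec + away_vec


def predict_outcome(model, home: str, away: str) -> str:
    """Return a readable outcome for one match using the given model."""
    label = model.predict([encode_match(home, away)])[0]
    return OUTCOMES[label]


if __name__ == "__main__":
    print(predict_outcome(StubModel(), "Team A", "Team B"))
```

Keeping the prediction logic in a plain function like this, separate from the web layer, makes it possible to test it without starting the server.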

Repository Content

Setup

  1. Clone the repository. To run only the Flask application, you need at least soccer-predictions-website and model.pkl.
  2. Make sure that Python 3.6 or newer is installed. I recommend running the .ipynb files in Jupyter Notebook or Google Colab.
  3. Run python -m pip install -r requirements.txt in the terminal to install the dependencies.
  4. Run python app.py to start the web application locally on port 8080.
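
Before starting the application, the setup steps above can be sanity-checked with a small script. This is an optional sketch, not part of the repository: the check_setup helper is hypothetical, and it assumes that model.pkl, soccer-predictions-website and requirements.txt all sit in the directory you pass in, which may not match the actual layout.

```python
import sys
from pathlib import Path
from typing import List, Tuple

# Names taken from the setup steps; their location is an assumption.
REQUIRED = ["model.pkl", "soccer-predictions-website", "requirements.txt"]


def check_setup(root: Path, min_version: Tuple[int, int] = (3, 6)) -> List[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < min_version:
        wanted = ".".join(str(part) for part in min_version)
        problems.append(f"Python {wanted} or newer is required")
    for name in REQUIRED:
        if not (root / name).exists():
            problems.append(f"missing: {name}")
    return problems


if __name__ == "__main__":
    # Check the current working directory and report anything that is missing.
    for problem in check_setup(Path(".")):
        print(problem)
```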