Functions in R, Julia, and Python to fit a set of LOESS smoothers to the shot length data of a motion picture in order to identify the temporal structure of a film's editing without committing the analyst to a particular level of smoothing before applying the function.

DrNickRedfern/multiloesssmoothers


Fitting multiple loess smoothers to motion picture shot length data

This repository has functions in a range of languages for fitting a set of LOESS smoothers to the shot length data of a motion picture, iterating over a range of spans specified by the user and plotting the result.

LOESS, or locally estimated scatterplot smoothing, is a nonparametric method for fitting a smooth curve to a response variable as a function of a predictor. Rather than fitting a single global model, LOESS fits a linear or quadratic polynomial, depending on the degree chosen, to a localised section of the data by weighted least squares. The size of that section is determined by the span of the local window: as the span increases, more data are used in each local fit and the resulting curve becomes smoother.
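The local fitting described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions (degree 1, tricube weights, no robustness iterations), not the code used by any of the functions in this repository; `tricube`, `loess_point`, and the shot lengths are made up for the example.

```python
import math

def tricube(u):
    """Tricube kernel: 1 at u = 0, falling smoothly to 0 at |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def loess_point(x0, xs, ys, span):
    """Local linear LOESS estimate at x0 (degree 1, no robustness iterations)."""
    n = len(xs)
    k = min(n, max(3, math.ceil(span * n)))       # number of points in the local window
    h = sorted(abs(x - x0) for x in xs)[k - 1]    # distance to the k-th nearest point
    h = h or 1e-12                                # guard against a zero-width window
    # Weighted sums for a straight line in d = x - x0; its intercept is the fit at x0.
    sw = swd = swy = swdd = swdy = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = tricube(d / h)
        sw += w
        swd += w * d
        swy += w * y
        swdd += w * d * d
        swdy += w * d * y
    denom = sw * swdd - swd * swd
    if denom < 1e-12:                             # degenerate window: use a weighted mean
        return swy / sw if sw > 0 else sum(ys) / n
    slope = (sw * swdy - swd * swy) / denom
    return (swy - slope * swd) / sw

# Made-up shot lengths (seconds) and their cumulative timings.
lengths = [4.2, 3.1, 8.5, 2.0, 1.8, 6.3, 12.4, 3.3, 2.7, 5.9, 9.1, 2.2]
times = [sum(lengths[:i + 1]) for i in range(len(lengths))]
for span in (0.3, 0.9):
    fit = [loess_point(t, times, lengths, span) for t in times]
    print(span, [round(v, 2) for v in fit])
```

Comparing the two printed fits shows the effect of the span: the small span follows individual shots closely, while the large span draws on most of the film for every estimate.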

The functions in this repository allow the analyst to identify the editing structure of a film at different scales by using different spans, without committing to a particular level of smoothing in advance. At the macro-scale, LOESS smoothers with large spans describe the dominant trend in the editing of a film, while at the micro-scale smoothers with small spans reveal transient features associated with the editing of specific moments in a film.

The resulting plot can be used diagnostically in exploratory data analysis, either to decide which spans for the LOESS smoother are the most informative or to narrow the range of spans passed to cross-validation, which speeds up the process of selecting the best span to describe the data.
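As a rough illustration of the cross-validation step, the sketch below scores a short list of candidate spans by leave-one-out error and picks the lowest. It is not part of this repository: `loess_point`, `loo_cv_error`, `best_span`, the data, and the shortlist of spans are all hypothetical, and the smoother is a plain local linear fit with tricube weights.

```python
import math

def loess_point(x0, xs, ys, span):
    """Local linear LOESS estimate at x0 with tricube weights (a plain sketch)."""
    n = len(xs)
    k = min(n, max(3, math.ceil(span * n)))
    h = sorted(abs(x - x0) for x in xs)[k - 1] or 1e-12
    sw = swd = swy = swdd = swdy = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        u = abs(d) / h
        w = (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0
        sw += w
        swd += w * d
        swy += w * y
        swdd += w * d * d
        swdy += w * d * y
    denom = sw * swdd - swd * swd
    if denom < 1e-12:                      # degenerate window: use a weighted mean
        return swy / sw if sw > 0 else sum(ys) / n
    slope = (sw * swdy - swd * swy) / denom
    return (swy - slope * swd) / sw

def loo_cv_error(xs, ys, span):
    """Mean squared leave-one-out prediction error for one span."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]       # drop point i, then predict it
        rest_y = ys[:i] + ys[i + 1:]
        total += (ys[i] - loess_point(xs[i], rest_x, rest_y, span)) ** 2
    return total / len(xs)

def best_span(xs, ys, spans):
    """Return the candidate span with the smallest leave-one-out error."""
    return min(spans, key=lambda s: loo_cv_error(xs, ys, s))

# Made-up shot lengths (seconds); the shortlist stands in for spans read off the plot.
lengths = [4.2, 3.1, 8.5, 2.0, 1.8, 6.3, 12.4, 3.3, 2.7, 5.9, 9.1, 2.2]
times = [sum(lengths[:i + 1]) for i in range(len(lengths))]
shortlist = [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
print(best_span(times, lengths, shortlist))
```

Because every candidate span costs one LOESS fit per observation, trimming the shortlist with the help of the plot cuts the running time roughly in proportion.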

Functions are available for the following languages: R, Julia, and Python (with others on the way!)

R: loessggplot

The loessggplot function fits and visualises multiple loess smoothers using ggplot2. loessggplot will remove NA values and calculate the shot timings.

loessggplot takes the following arguments:

  • x: a numeric vector of the lengths of shots in a film in temporal order.
  • low, high, step: low and high set the limits on the range of the spans of the LOESS smoothers, and step defines the increase in the value of the span for each iteration.
  • title: a character string, enclosed in quotation marks, to use as the title of the plot.
  • ticks: specifies the distance between tick marks on the colour bar in the legend. The lower and upper limits of the colour bar are set by low and high, respectively.
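How low, high, and step translate into the set of fitted spans can be sketched in Python. This shows the general idea only and is an assumption about, not a copy of, what loessggplot does internally; `span_grid` is a hypothetical helper.

```python
def span_grid(low, high, step):
    """Values from low to high inclusive, spaced by step.

    The values are built from an integer count rather than by repeated
    addition, so floating-point drift cannot drop or duplicate the last one.
    """
    n = int(round((high - low) / step))
    return [round(low + i * step, 10) for i in range(n + 1)]

spans = span_grid(0.1, 0.9, 0.01)   # one LOESS smoother per span
ticks = span_grid(0.1, 0.9, 0.1)    # colour bar tick positions
print(len(spans), ticks)
```

With the values used in the example call for Convict 13, this yields 81 smoothers and nine tick marks running from low to high.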

To draw the plot for the Buster Keaton film Convict 13 (1920), with shot length data taken from the Buster Keaton dataset, we use the command:

loessggplot(convict_13, low = 0.1, high = 0.9, step = 0.01, title = "Convict 13", ticks = 0.1)

which returns the following plot:

R: Time series of editing in Buster Keaton's Convict 13 (1920)

Julia: MultiLoessPlot

The MultiLoessPlot function fits and visualises multiple loess smoothers using Gadfly, to produce plots in a similar style to ggplot2.

MultiLoessPlot takes the following arguments:

  • df: a data frame containing shot length data in numeric order in wide format.
  • index: the index of the data frame column containing the shot length data to be visualised.
  • low: the minimum loess span.
  • step: the increment of the span of the loess smoothers.
  • high: the maximum loess span.
  • title: the title of the plot.

To plot the result for Convict 13, we load the csv file for the film and select the column containing the shot length data by its index. MultiLoessPlot will remove NA values and calculate shot timings:

using CSV, DataFrames
df = CSV.read("./path/to/file/data.csv", DataFrame; header=1)
using ColorSchemes, Gadfly, Loess
MultiLoessPlot(df; index=2, low=0.1, step=0.01, high=0.9, title="Convict 13")

which returns the following plot:

Julia: Time series of editing in Buster Keaton's Convict 13 (1920)

Python: multiloessplot

The multiloessplot function fits and visualises multiple loess smoothers using Seaborn.

multiloessplot has the following arguments:

  • x: a data frame in wide format containing shot length data in numeric order.
  • index: the index of the data column in the data frame to be plotted. Note that the index is in Python format and begins at 0.
  • low: the minimum value of the spans of the loess smoothers.
  • high: the maximum value of the spans of the loess smoothers.
  • step: the increment in the span of the loess smoothers.
  • tick_step: specifies the distance between tick marks on the colour bar in the legend. The lower and upper limits of the colour bar are set by low and high, respectively.
  • title: a title for the plot.

To plot the result for Convict 13, we load the csv file for the film and select the column containing the shot length data by its index. multiloessplot will remove NA values and calculate shot timings:

import pandas as pd
df = pd.read_csv('path/to/file/data.csv', delimiter=',')
multiloessplot(df, index=2, low=0.1, high=0.9, step=0.01, tick_step=0.1, title="Convict 13")

which returns the following plot:

Python: Time series of editing in Buster Keaton's Convict 13 (1920)

Note that the loess fits produced in Python appear to differ somewhat from those produced in R and Julia.
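The preprocessing that each function performs, removing NA values and calculating shot timings, might look something like the sketch below. It assumes that a shot timing is the cumulative sum of the shot lengths up to and including that shot, and that missing values arrive as None or NaN; `clean_shot_lengths`, `shot_timings`, and the `normalise` option are hypothetical names, not taken from multiloessplot.py.

```python
import math
from itertools import accumulate

def clean_shot_lengths(values):
    """Drop missing entries (None or NaN) and keep the rest in their original order."""
    return [float(v) for v in values if v is not None and not math.isnan(float(v))]

def shot_timings(lengths, normalise=False):
    """Cumulative time at the end of each shot, optionally scaled to the range (0, 1]."""
    times = list(accumulate(lengths))
    if normalise and times:
        total = times[-1]
        times = [t / total for t in times]
    return times

column = [4.0, float("nan"), 2.5, None, 3.5]    # made-up column with missing values
lengths = clean_shot_lengths(column)
print(lengths, shot_timings(lengths), shot_timings(lengths, normalise=True))
```

Scaling the timings by the running time is one way to put films of different lengths on a common axis; whether the functions here do so is not stated in this README.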

multiloess: a Streamlit app for visualising shot length data

I have created a Streamlit app that takes the code from multiloessplot.py and automates the process of visualising shot length data, for those who don't want to interact with the code directly.

You can access the app at: multiloess

The GitHub repository for the app is here: DrNickRedfern/multiloess
