AlexKMDev/wwdc-session-transcripts

 
 

WWDC Session Transcripts

As used by ASCIIwwdc.

Methodology

Transcripts for WWDC sessions are aggregated from subtitles included in session videos from 2010, 2013, and 2014.

The code used to automatically scrape and collect this information for 2014 sessions is as follows:

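require 'open-uri'
require 'nokogiri'

year = 2014
# URI.open (from open-uri) is needed on Ruby 3.0+, where Kernel#open no longer fetches URLs.
doc = Nokogiri::HTML(URI.open("https://developer.apple.com/videos/wwdc/#{year}/"))

doc.search("p.download").each do |download|
  # Derive the session's base directory URL from its first download link.
  href = download.at("a").attr('href')
  uri = URI(href)
  uri.path = uri.path.split('/')[0...-1].join('/') + '/'
  uri.query = nil

  # The last path component is taken as the session number.
  p session = uri.path.split('/').last.to_i

  begin
    subtitles = []
    # The playlist lists the WebVTT subtitle segments, one per line.
    m3u8 = URI.open(URI.join(uri, "subtitles/eng/prog_index.m3u8"))
    m3u8.each_line do |line|
      next unless /\.webvtt$/ === line
      webvtt = URI.open(URI.join(uri, "subtitles/eng/", line.strip))
      subtitles << webvtt.read
    end

    # Concatenate all segments into one file named after the session.
    File.open("#{session}.vtt", "w") do |f|
      f.write subtitles.join("\n")
    end
  rescue => e
    # Log the failure and move on to the next session.
    p "! #{session} (#{e})"
    next
  end
end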
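
The script above joins the downloaded WebVTT segments as-is, so each `.vtt` file still contains a `WEBVTT` header per segment along with the cue timing lines. As a rough sketch of how such a file might be reduced to plain transcript text, the helper below drops those lines and collapses repeated cue text. The name `vtt_to_text` is hypothetical and not part of this repository, and the sketch assumes the segments carry no cue identifiers.

```ruby
# Hypothetical helper (not part of this repository): reduce concatenated
# WebVTT segments to plain transcript text.
# Assumption: cues have no identifier lines, only a timing line and text.

# Matches cue timing lines such as "00:00:11.516 --> 00:00:14.546 A:middle".
TIMING_LINE = /\A\d{2,}:\d{2}(?::\d{2})?\.\d{3}\s+-->\s+/

def vtt_to_text(vtt)
  # Drop blank lines, per-segment headers, and cue timing lines.
  lines = vtt.each_line.map(&:strip).reject do |line|
    line.empty? ||
      line.start_with?("WEBVTT") ||
      line.start_with?("X-TIMESTAMP-MAP") ||
      line.match?(TIMING_LINE)
  end

  # Remove inline markup such as <c.magenta>...</c>.
  lines = lines.map { |line| line.gsub(/<[^>]+>/, "") }

  # Collapse consecutive duplicate lines, which can appear when
  # neighbouring segments repeat the same cue.
  lines.chunk_while { |a, b| a == b }.map(&:first).join("\n")
end
```

For a file written by the script, this could be called as `vtt_to_text(File.read("#{session}.vtt"))`, with the result written wherever the plain-text transcript should live.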

Additional transcripts for 2012 WWDC content were graciously contributed by Rev.com and community volunteers.

Copyright

All content copyright © 2010–2014 Apple Inc. All rights reserved.

