fix(IMDb): Fix scraping of episode's overviews
bugwelle committed Apr 6, 2024
1 parent ac35f46 commit cb978aa
Showing 8 changed files with 61 additions and 19 deletions.
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -9,7 +9,7 @@

### Fixed

- tbd
- IMDb: Episode's overviews are scraped again (#1724)

### Changed

15 changes: 4 additions & 11 deletions src/scrapers/tv_show/imdb/ImdbTvEpisodeParser.cpp
@@ -168,22 +168,15 @@ void ImdbTvEpisodeParser::parseInfos(TvShowEpisode& episode, const QString& html
// }

// --------------------------------------
rx.setPattern("<p itemprop=\"description\">(.*)</p>");
rx.setPattern(R"re(Plot Summary</td>(.*)</td>)re");
match = rx.match(html);
if (match.hasMatch()) {
QString outline = match.captured(1);
outline = outline.remove("See full summary&nbsp;&raquo;").trimmed();
episode.setOverview(removeHtmlEntities(outline));
}

// --------------------------------------
rx.setPattern(R"(<div class="summary_text">(.*)</div>)");
match = rx.match(html);
if (match.hasMatch()) {
QString outline = match.captured(1);
outline = outline.remove("See full summary&nbsp;&raquo;").trimmed();
outline = outline.remove("Plot Summary").trimmed();
outline = outline.remove("Plot Synopsis").trimmed();
episode.setOverview(removeHtmlEntities(outline));
}

// --------------------------------------

rx.setPattern(R"(<h2>Storyline</h2>\n +\n +<div class="inline canwrap">\n +<p>\n +<span>(.*)</span>)");
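The extraction step in the hunk above (match everything between the `Plot Summary</td>` cell and the next `</td>`, then strip leftover labels) can be sketched outside of Qt. This is only an illustration under stated assumptions, not the project's code: it uses `std::regex` instead of `QRegularExpression`, the helpers `removeAll`, `trim`, and `extractOverview` are hypothetical names, the tag-stripping regex is a crude stand-in for `removeHtmlEntities`, and the HTML layout is assumed from the pattern alone. Whether `(.*)` in the original is lazy or spans newlines depends on options set on `rx` outside this hunk; the sketch makes both explicit with `[\s\S]*?`.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of MediaElch): erase every occurrence of needle.
std::string removeAll(std::string text, const std::string& needle)
{
    if (needle.empty()) {
        return text;
    }
    for (std::size_t pos = text.find(needle); pos != std::string::npos; pos = text.find(needle, pos)) {
        text.erase(pos, needle.size());
    }
    return text;
}

// Hypothetical helper: strip leading and trailing ASCII whitespace.
std::string trim(const std::string& text)
{
    const char* whitespace = " \t\r\n";
    const std::size_t first = text.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        return std::string();
    }
    const std::size_t last = text.find_last_not_of(whitespace);
    return text.substr(first, last - first + 1);
}

// Returns the episode overview, or an empty string if no "Plot Summary" cell is found.
std::string extractOverview(const std::string& html)
{
    // [\s\S]*? takes the shortest run of any characters, newlines included.
    static const std::regex plotRx(R"re(Plot Summary</td>([\s\S]*?)</td>)re");
    std::smatch match;
    if (!std::regex_search(html, match, plotRx)) {
        return std::string();
    }
    std::string outline = match[1].str();
    // Same literal removals as in the hunk above.
    outline = removeAll(outline, "See full summary&nbsp;&raquo;");
    outline = removeAll(outline, "Plot Summary");
    outline = removeAll(outline, "Plot Synopsis");
    // Assumption: stripping tags approximates what removeHtmlEntities() does.
    static const std::regex tagRx(R"re(<[^>]*>)re");
    outline = std::regex_replace(outline, tagRx, "");
    return trim(outline);
}
```

Calling `extractOverview` on a made-up row such as `<tr><td>Plot Summary</td><td>Some text</td></tr>` yields `Some text`; input without a `Plot Summary` cell yields an empty string, which corresponds to the empty `overview:` entries that the reference files showed before this fix.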
15 changes: 13 additions & 2 deletions test/resources/scrapers/imdbtv/Black-Mirror-S05.ref.txt
@@ -20,7 +20,13 @@ season: SeasonNumber=05
episode: EpisodeNumber=01
displaySeason: SeasonNumber=xx
displayEpisode: EpisodeNumber=xx
overview:
overview:
Danny and Karl were best friends in college. Eleven years later and Danny has bu
ilt a stable life with a wife and son while Karl remains wild, seemingly incapab
le of growing up. When Karl attends Danny's birthday party, they seem to find th
eir old groove. The gift of a virtual reality game introduces new possibilities.
While playing the nostalgic video game from their youth, both men discover a ne
w form of satisfaction.
writers: (N=0)
directors: (N=0)
playCount: 0
@@ -56,7 +62,12 @@ season: SeasonNumber=05
episode: EpisodeNumber=02
displaySeason: SeasonNumber=xx
displayEpisode: EpisodeNumber=xx
overview:
overview:
After kidnapping an intern at a tech company, a mysterious Uber driver is surrou
nded by police in a meadow. The story follows an hours long standoff between the
police and the kidnapper, who doesn't want money or to harm his victim, as his
true motive slowly becomes clearer the further that the situation spirals out of
control.
writers: (N=0)
directors: (N=0)
playCount: 0
@@ -15,7 +15,12 @@ season: SeasonNumber=01
episode: EpisodeNumber=00
displaySeason: SeasonNumber=xx
displayEpisode: EpisodeNumber=xx
overview:
overview:
Buffy Summers arrives for her first day at a new school, and already weird thing
s are happening. She investigates a dead body that is found in the girls' locker
room, and, with the help of her new friends Willow and Xander, she fights a gan
g of vampires. Also, she meets Cordelia and her friends, and Giles, her new Watc
her, tells her more about her destiny. Written by page8701
writers: (N=0)
directors: (N=0)
playCount: 0
@@ -15,7 +15,16 @@ season: SeasonNumber=01
episode: EpisodeNumber=01
displaySeason: SeasonNumber=xx
displayEpisode: EpisodeNumber=xx
overview:
overview:
Buffy Summers just moved with her mom from L.A. (where she set fire to the schoo
l gym) to Sunnydale, which is, alas, experiencing a plague of vampires. She meet
s nerd Willow Rosenberg, cool skateboarder Xander Harris, and his mate Jesse McN
ally, fashionable snooty Cordelia Chase and the somewhat creepy British libraria
n, Rupert Giles. When a corpse with bite-marks is found in a locker, she realize
s, but refuses to acknowledge, that her vampire-killing past is catching up with
her and keeps Giles, her watcher (trainer), at arms-length, but is stalked by A
ngel who refers to the hell-mouth. Finally her destiny kicks in. Written by KGF
Vissers
writers: (N=0)
directors: (N=0)
playCount: 0
@@ -15,7 +15,15 @@ season: SeasonNumber=12
episode: EpisodeNumber=19
displaySeason: SeasonNumber=xx
displayEpisode: EpisodeNumber=xx
overview:
overview:
Ned opens a theme park to the memory of his late wife Maude and it becomes a hug
e success when people kneeling in front of a statue of Maude experience mystic v
isions. The reason for this is that a grille in front of the statue is an out-pi
pe for a propane gas line and they are getting high on the gas. Unfortunately th
e park is closed down when Homer and Ned try to stop two children from lighting
a candle before the altar and are charged with assault.Ned does,however,enjoy th
e further company of Rachel Jordan,despite his efforts to turn her into a clone
of Maude. Written by don @ minifie-1
writers: (N=0)
directors: (N=0)
playCount: 0
@@ -15,7 +15,15 @@ season: SeasonNumber=xx
episode: EpisodeNumber=xx
displaySeason: SeasonNumber=xx
displayEpisode: EpisodeNumber=xx
overview:
overview:
Ned opens a theme park to the memory of his late wife Maude and it becomes a hug
e success when people kneeling in front of a statue of Maude experience mystic v
isions. The reason for this is that a grille in front of the statue is an out-pi
pe for a propane gas line and they are getting high on the gas. Unfortunately th
e park is closed down when Homer and Ned try to stop two children from lighting
a candle before the altar and are charged with assault.Ned does,however,enjoy th
e further company of Rachel Jordan,despite his efforts to turn her into a clone
of Maude. Written by don @ minifie-1
writers: (N=0)
directors: (N=0)
playCount: 0
@@ -15,7 +15,15 @@ season: SeasonNumber=xx
episode: EpisodeNumber=xx
displaySeason: SeasonNumber=xx
displayEpisode: EpisodeNumber=xx
overview:
overview:
Ned opens a theme park to the memory of his late wife Maude and it becomes a hug
e success when people kneeling in front of a statue of Maude experience mystic v
isions. The reason for this is that a grille in front of the statue is an out-pi
pe for a propane gas line and they are getting high on the gas. Unfortunately th
e park is closed down when Homer and Ned try to stop two children from lighting
a candle before the altar and are charged with assault.Ned does,however,enjoy th
e further company of Rachel Jordan,despite his efforts to turn her into a clone
of Maude. Written by don @ minifie-1
writers: (N=0)
directors: (N=0)
playCount: 0