metascraper

Metascraper is a small library for web scraping.

You give it a URL, and it lets you easily get the page's title, images, description, and videos.

Installation

Add this to your application's shard.yml:

dependencies:
  metascraper:
    github: malina/metascraper

Usage

require "metascraper"

Initialize a Metascraper instance for a URL, like this:

page = Metascraper.new("https://github.com/malina/metascraper")

puts page.title

Accessing scraped data

page.url                 # URL of the page
page.images              # enumerable collection of every img found on the page
page.title               # title of the page from the head section, as a string
page.description         # the meta description, or the first long paragraph if no meta description is found
page.content             # main readable content of the page

You can also access most of the scraped data as a hash:

  page.to_hash
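
As a rough illustration, the accessors above can be combined into a short script. This is only a sketch: it uses nothing beyond the calls listed in this README, it assumes `page.images` can be iterated with `each` (it is described as an enumerable collection), and it assumes the URL is reachable. The return types and error behavior are not documented here, so check the source before relying on them.

```crystal
require "metascraper"

# Same URL as in the example above; substitute any page you want to inspect.
page = Metascraper.new("https://github.com/malina/metascraper")

puts "URL:         #{page.url}"
puts "Title:       #{page.title}"
puts "Description: #{page.description}"

# Assumption: images is enumerable, so each yields one entry per img found.
page.images.each do |image|
  puts "Image:       #{image}"
end

# Print most of the scraped data in one go.
puts page.to_hash
```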

Contributing

  1. Fork it ( https://github.com/malina/metascraper/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

Contributors

  • malina (Alexandr Shumov) - creator, maintainer

About

Metascraper is a Crystal library for web scraping.
