Configuration

The sections below describe configuration options you can use with Wombat:

Capture response headers

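Response headers can be captured with a headers property. The first argument is a regular expression; judging by the output below, it is matched against header names, so "^[^k]+$" keeps only the headers whose names do not contain the letter k: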

#coding: utf-8
require 'wombat'

class HeadersScraper
  include Wombat::Crawler

  base_url "http://www.rubygems.org"
  path "/"

  headers "^[^k]+$", :headers
end

p HeadersScraper.new.crawl
# outputs =>

{
  "headers": {
    "server": "nginx/1.2.2",
    "date": "Mon, 10 Dec 2012 07:47:59 GMT",
    "content-type": "text/html; charset=utf-8",
    "content-length": "2416",
    "connection": "keep-alive",
    "x-powered-by": "Phusion Passenger (mod_rails/mod_rack) 3.0.11",
    "x-frame-options": "sameorigin",
    "x-xss-protection": "1; mode=block",
    "x-ua-compatible": "IE=Edge,chrome=1",
    "etag": "\"8c531e8f9967f430bf51e0eb5768b13e\"",
    "cache-control": "max-age=0, private, must-revalidate",
    "x-request-id": "f4170b6bc7ee063467f563eaf7950937",
    "x-runtime": "0.246211",
    "status": "200",
    "vary": "Accept-Encoding",
    "content-encoding": "gzip"
  }
}
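
To make the effect of the pattern easier to see, here is a minimal sketch in plain Ruby that does not use Wombat at all. It assumes the pattern is matched against each header name, which is consistent with the output above but is not taken from Wombat's source. The SAMPLE_HEADERS hash and the filter_headers helper are hypothetical names introduced only for this illustration.

```ruby
# Hypothetical sample of response headers (not real output from any site).
SAMPLE_HEADERS = {
  "server"       => "nginx/1.2.2",
  "content-type" => "text/html; charset=utf-8",
  "set-cookie"   => "session=abc123",
  "status"       => "200"
}

# Keep only the headers whose name matches the pattern.
# Assumption: this mirrors how the pattern passed to `headers` is applied.
def filter_headers(headers, pattern)
  regex = Regexp.new(pattern)
  headers.select { |name, _value| name.match?(regex) }
end

filtered = filter_headers(SAMPLE_HEADERS, "^[^k]+$")
p filtered.keys
# prints ["server", "content-type", "status"]
```

The "set-cookie" entry is dropped because its name contains a k. A pattern such as "^x-" would, under the same assumption, keep only the x- prefixed headers.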

Global configuration

You can configure Wombat globally with the utility method Wombat.configure:

require 'wombat'

Wombat.configure do |config|
  config.set_proxy "10.0.0.1", 8080
  config.set_user_agent "Wombat"
  config.set_user_agent_alias "Mac Safari"
end
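
For readers unfamiliar with block-based configuration, the following sketch shows how a configure method of this kind can work in general. DemoScraper and its Configuration class are hypothetical stand-ins written for this example; they are not Wombat's implementation and only imitate two of the setters shown above.

```ruby
# Hypothetical stand-in module; NOT Wombat's actual implementation.
module DemoScraper
  class Configuration
    attr_reader :proxy_host, :proxy_port, :user_agent

    # Remember the proxy host and port for later requests.
    def set_proxy(host, port)
      @proxy_host = host
      @proxy_port = port
    end

    # Remember the User-Agent string for later requests.
    def set_user_agent(name)
      @user_agent = name
    end
  end

  # One shared configuration object for the whole process.
  def self.configuration
    @configuration ||= Configuration.new
  end

  # Yield the shared object so callers can set options inside a block.
  def self.configure
    yield configuration
  end
end

DemoScraper.configure do |config|
  config.set_proxy "10.0.0.1", 8080
  config.set_user_agent "Wombat"
end

p DemoScraper.configuration.proxy_port
# prints 8080
```

Because the settings live on a single shared object, a block like this is typically run once at startup, before any crawler is instantiated.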
