felipecsl edited this page Jul 31, 2012 · 3 revisions

Nesting

You can group related properties by nesting them inside a block:

class GithubScraper
  include Wombat::Crawler
  base_url "http://www.github.com"
  path "/"

  headline "xpath=//h1"

  benefits do
    team_mgmt "css=.column.leftmost h3"
    code_review "css=.column.leftmid h3"
    hosting "css=.column.rightmid h3"
    collaboration "css=.column.rightmost h3"
  end
end

Outputs:

{
  "headline"=>"1,338,564 people hosting over 4,066,093 git repositories", 
  "benefits"=>{
    "team_mgmt"=>"Team management", 
    "code_review"=>"Code review", 
    "hosting"=>"Reliable code hosting", 
    "collaboration"=>"Open source collaboration"
  }
}

In GithubScraper, a block groups all the selectors for the benefits information under a single key. The same rules that apply to top-level properties apply inside the block.
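Because the output is a plain Ruby Hash with string keys, nested properties can be read by chaining keys. The sketch below assumes the result looks exactly like the output shown above; the `result` variable is a hard-coded copy for illustration, not a live crawl.

```ruby
# Hard-coded copy of the output shown above; in a real run this Hash
# would come from the crawler rather than a literal.
result = {
  "headline" => "1,338,564 people hosting over 4,066,093 git repositories",
  "benefits" => {
    "team_mgmt" => "Team management",
    "code_review" => "Code review",
    "hosting" => "Reliable code hosting",
    "collaboration" => "Open source collaboration"
  }
}

# Nested properties are reached by chaining string keys.
puts result["benefits"]["hosting"]

# Hash#dig (Ruby 2.3+) returns nil instead of raising when a level is missing.
puts result.dig("benefits", "code_review")
puts result.dig("benefits", "pricing").inspect
```

Using `dig` is a reasonable choice when a selector might not match and a whole nested group could be absent.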

You can even nest several levels deep!

class GithubScraper
  include Wombat::Crawler
  base_url "http://www.github.com"
  path "/"

  benefits do
    team_mgmt "css=.column.leftmost h3"
    links do
      team_mgmt "xpath=//div[@class='column leftmost']//a/@href"
    end
  end
end

Outputs:

{
  "benefits"=>{
    "team_mgmt"=>"Team management", 
    "links"=>{
      "team_mgmt"=>"/features/projects/collaboration"
    }
  }
}
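Deeply nested results can be awkward to export to flat formats such as CSV. As a minimal sketch, the hypothetical helper `flatten_result` below (it is not part of Wombat) walks a nested Hash and produces dotted key paths. The `result` literal is copied from the output above.

```ruby
# Hypothetical helper (not part of Wombat): flattens a nested result Hash
# into a single-level Hash keyed by dotted paths.
def flatten_result(hash, prefix = nil)
  hash.each_with_object({}) do |(key, value), flat|
    path = prefix ? "#{prefix}.#{key}" : key.to_s
    if value.is_a?(Hash)
      # Recurse into nested groups, carrying the path built so far.
      flat.merge!(flatten_result(value, path))
    else
      flat[path] = value
    end
  end
end

# Hard-coded copy of the nested output shown above.
result = {
  "benefits" => {
    "team_mgmt" => "Team management",
    "links" => { "team_mgmt" => "/features/projects/collaboration" }
  }
}

flatten_result(result).each { |path, value| puts "#{path}: #{value}" }
# prints:
#   benefits.team_mgmt: Team management
#   benefits.links.team_mgmt: /features/projects/collaboration
```

The dotted paths also show why the same property name, `team_mgmt`, can be reused at different nesting levels without a clash.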