Hi Team,
Thanks for this awesome gem. We have been using it for years to scrape blog posts. MetaInspector scrapes almost all of them successfully, but I am facing an issue with one particular blog post.
When I scrape this post on my local machine, I get the expected results for `title`, `best_description`, and `best_image`. However, the same code does not work in the production environment (deployed on an AWS EC2 machine).
MetaInspector returns the expected results on my local machine:
def scrape(url)
  @page = MetaInspector.new(url,
    :connection_timeout => 5, :read_timeout => 5,
    :headers => { 'User-Agent' => user_agent, 'Accept-Encoding' => 'identity' },
    :faraday_options => { :ssl => { :verify => false } },
    :html_content_only => true)
end
url = "https://www.simplyleb.com/recipe/easy-french-fry-nachos/"
page = scrape(url)
page.title
=> "Easy French Fry Nachos - Simply Lebanese"
page.images.best
=> "https://www.simplyleb.com/wp-content/uploads/Mccain-Fries-9.jpg"
page.images.count
=> 25
page.best_description
=> "Melted cheese, sour cream, chopped tomatoes and all your favorite toppings on frozen French fries for an easy and quick kid-friendly lunch or snack after school."
MetaInspector does not work from the AWS EC2 instance:
irb(main):012:0> url = "https://www.simplyleb.com/recipe/easy-french-fry-nachos/"
irb(main):013:0> page = scrape(url)
irb(main):014:0> page.images.best
=> nil
irb(main):015:0> page.images.count
=> 0
irb(main):016:0> page.title
=> "StackPath"
irb(main):017:0> page.best_description
=> "www.simplyleb.com is using a security service for protection against online attacks. The service requires full cookie support in order to view this website."
Please note the `best_description` returned in the response above: "www.simplyleb.com is using a security service for protection against online attacks. The service requires full cookie support in order to view this website."
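As a stopgap, a check along these lines could flag the block page instead of storing it as the post's metadata. This is only a sketch: the helper name `block_page?` and the matched strings are assumptions taken from the output above, not part of MetaInspector.

```ruby
# Hypothetical helper: decides whether scraped metadata looks like the
# security service's interstitial page rather than the real blog post.
# The title and phrase below come from the EC2 output in this issue.
BLOCK_PAGE_TITLES = ['StackPath'].freeze
BLOCK_PAGE_PHRASE = /security service for protection against online attacks/i

def block_page?(title, description)
  BLOCK_PAGE_TITLES.include?(title.to_s.strip) ||
    description.to_s.match?(BLOCK_PAGE_PHRASE)
end

blocked = block_page?(
  'StackPath',
  'www.simplyleb.com is using a security service for protection against online attacks.'
)
normal = block_page?(
  'Easy French Fry Nachos - Simply Lebanese',
  'Melted cheese, sour cream, chopped tomatoes and all your favorite toppings.'
)
puts "blocked=#{blocked} normal=#{normal}"
```

With a real page this would be called as `block_page?(page.title, page.best_description)`.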
This looks like an issue with cookies. Do I have to send any extra cookie-related parameters? Could someone please suggest what the problem might be? I am unable to figure out why it works on my local machine but not from AWS. Any help would be much appreciated. Thanks.
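To make the cookie question concrete, here is a minimal sketch of what sending cookies through the existing `:headers` option might look like. The helper `metainspector_options` and the cookie name and value are hypothetical, and it is an open question whether a static `Cookie` header would satisfy the security service at all.

```ruby
# Hypothetical helper: builds the same options hash used in scrape(url),
# optionally adding a standard HTTP Cookie header.
# The cookie names and values passed in are placeholders, not real ones.
def metainspector_options(user_agent:, cookies: {})
  headers = { 'User-Agent' => user_agent, 'Accept-Encoding' => 'identity' }

  unless cookies.empty?
    # Cookie header format: "name1=value1; name2=value2"
    headers['Cookie'] = cookies.map { |name, value| "#{name}=#{value}" }.join('; ')
  end

  {
    :connection_timeout => 5,
    :read_timeout => 5,
    :headers => headers,
    :faraday_options => { :ssl => { :verify => false } },
    :html_content_only => true
  }
end

options = metainspector_options(
  user_agent: 'Mozilla/5.0',
  cookies: { 'example_cookie' => 'abc123' } # placeholder cookie
)
puts options[:headers]['Cookie']

# With the metainspector gem loaded and network access available:
# page = MetaInspector.new(url, options)
```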
ISSUE DETAILS:
MetaInspector gem version: 5.4.0
RAILS VERSION: 5.2.6