Use splash:mouse_click #106

Closed
crisfan opened this issue Feb 21, 2017 · 2 comments

Comments

@crisfan

crisfan commented Feb 21, 2017

I am using splash:mouse_click to click through to the next page and get that page's HTML.

function process_one(splash)
    -- Locate the "next page" link by its text and return the
    -- top-left corner of its bounding box.
    local get_dimensions = splash:jsfunc([[
    function () {
        var allA = document.getElementsByTagName('a');
        for (var i = 0; i < allA.length; i++) {
            if (allA[i].innerHTML == "\u4e0b\u4e00\u9875") {
                var rect = allA[i].getClientRects()[0];
                return {"x": rect.left, "y": rect.top};
            }
        }
    }
    ]])
    splash:set_viewport_full()
    splash:wait(0.1)
    local dimensions = get_dimensions()
    splash:mouse_click(dimensions.x, dimensions.y)
    -- Fixed pause so the next page has time to load.
    splash:wait(5)
    local content = splash:html()
    return content
end

function process_mul(splash)
   local res={}
   for i=1,3,1 do
       res[i]=process_one(splash)
   end
   return res
end

function main(splash)
   assert(splash:go("http://was.mot.gov.cn:8080/govsearch/gov_list.jsp"))
   return {res=process_mul(splash)}
end

The code above works, but it is inefficient: I have to call splash:wait(5) after every click to make sure the page has finished loading, otherwise I get a lot of duplicate page HTML. I have searched the documentation for a long time but have not found an efficient way to handle this.

Does Splash have anything like Selenium's implicitlyWait, or is there an easier way to solve my problem?

@kmike
Member

kmike commented Feb 21, 2017

Hey @ForkEyes,

There is no implicitlyWait in Splash (yet? it sounds like an interesting idea), but you can do it explicitly, e.g.

function main(splash)
  splash:set_user_agent(splash.args.ua)
  assert(splash:go(splash.args.url))

  -- requires Splash 2.3
  -- todo: use splash:with_timeout here,
  -- to limit total wait time
  while not splash:select('.my-element') do
    splash:wait(0.1)
  end
  splash:select('.my-element'):mouse_click()
  splash:wait(0.5)  -- todo: wait for another element
  return {html=splash:html()}
end

I think adding a helper function like wait_for_element to Splash itself is a good idea (just opened scrapinghub/splash#569 for it).
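For illustration, here is a minimal sketch of what such a helper might look like as plain Lua inside a script, assuming only the splash:select and splash:wait calls used in the snippet above. The function name wait_for_element and its max_wait and step parameters are hypothetical, not part of Splash. It bounds the total polling time with a simple counter rather than splash:with_timeout, so it only approximates a real timeout.

```lua
-- Hypothetical helper, not part of Splash: poll for a CSS selector
-- until it matches, giving up after roughly max_wait seconds of polling.
function wait_for_element(splash, css, max_wait, step)
  max_wait = max_wait or 10   -- assumed default: total seconds to poll
  step = step or 0.1          -- assumed default: pause between checks
  local waited = 0
  while waited < max_wait do
    local element = splash:select(css)
    if element then
      return element
    end
    splash:wait(step)
    waited = waited + step
  end
  return nil                  -- the selector never matched in time
end

function main(splash)
  assert(splash:go(splash.args.url))

  -- '.my-element' is the placeholder selector from the snippet above.
  local element = wait_for_element(splash, '.my-element', 5)
  if not element then
    return {error = "element did not appear"}
  end
  element:mouse_click()

  -- After the click, wait for something that only exists on the new
  -- page (selector is a placeholder) instead of a fixed sleep.
  wait_for_element(splash, '.my-element', 5)
  return {html = splash:html()}
end
```

With this approach the script returns as soon as the element appears, so fast pages are not slowed down by a fixed 5-second wait, and slow pages still get up to the full limit.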

@Gallaecio
Contributor

scrapinghub/splash#569 covers the feature, and scrapinghub/splash#829 covers documenting the best current solution. @crisfan Can we close this issue?
