This repository has been archived by the owner on Mar 25, 2018. It is now read-only.

RunScriptOnNode is failing sometimes in 1.6.0-rc2 because of wrong IP #1574

Open
jaiganeshm opened this issue Apr 27, 2013 · 0 comments


I recently upgraded to 1.6.0 (rc2) and call runScriptOnNode on an EC2 instance. In some cases, I get the following error.
Apparently, the SSH connection is attempted on the instance's private IP instead of its public IP.

6:45:40.505 [SimpleAsyncTaskExecutor-1] ERROR SLF4JLogger << (root:rsa[fingerprint(20:05:08:81:8c:01:99:fc:30:29:23:e6:c3:6b:12:42),sha1(64:43:e5:ec:da:8f:a6:88:f9:a2:9b:8c:0e:d9:41:cb:e5:4e:51:41)]@10.195.7.24:22) error acquiring {hostAndPort=10.195.7.24:22, loginUser=root, ssh=null, connectTimeout=7200000, sessionTimeout=7200000} (out of retries - max 7): Exhausted available authentication methods
net.schmizz.sshj.userauth.UserAuthException: Exhausted available authentication methods
at net.schmizz.sshj.userauth.UserAuthImpl.authenticate(UserAuthImpl.java:114) ~[sshj-0.8.1.jar:na]
at net.schmizz.sshj.SSHClient.auth(SSHClient.java:205) ~[sshj-0.8.1.jar:na]
at net.schmizz.sshj.SSHClient.authPublickey(SSHClient.java:305) ~[sshj-0.8.1.jar:na]
at net.schmizz.sshj.SSHClient.authPublickey(SSHClient.java:324) ~[sshj-0.8.1.jar:na]
at org.jclouds.sshj.SSHClientConnection.create(SSHClientConnection.java:144) ~[jclouds-sshj-1.6.0-rc.4.jar:1.6.0-rc.4]
at org.jclouds.sshj.SSHClientConnection.create(SSHClientConnection.java:40) ~[jclouds-sshj-1.6.0-rc.4.jar:1.6.0-rc.4]
at org.jclouds.sshj.SshjSshClient.acquire(SshjSshClient.java:193) [jclouds-sshj-1.6.0-rc.4.jar:1.6.0-rc.4]
at org.jclouds.sshj.SshjSshClient.connect(SshjSshClient.java:223) [jclouds-sshj-1.6.0-rc.4.jar:1.6.0-rc.4]
at org.jclouds.compute.callables.RunScriptOnNodeUsingSsh.call(RunScriptOnNodeUsingSsh.java:80) [jclouds-compute-1.6.0-rc.4.jar:1.6.0-rc.4]
at org.jclouds.compute.internal.BaseComputeService.runScriptOnNode(BaseComputeService.java:614) [jclouds-compute-1.6.0-rc.4.jar:1.6.0-rc.4]

More Notes:

The ConcurrentOpenSocketFinder class tries to identify a reachable IP from the two IPs available for the node (public and private).
In this case, a host on my local network happened to have the exact IP that Amazon assigned to the node as its private IP, so the socket connect test to the private IP succeeded.
jclouds then tried to ssh to that address, and authentication failed because the host that answered was not the EC2 instance.

The following method builds the FluentIterable by concatenating the public addresses first, yet the ssh connection was still attempted on the private IP.
    private static FluentIterable<String> checkNodeHasIps(NodeMetadata node) {
        FluentIterable<String> ips = FluentIterable.from(concat(node.getPublicAddresses(), node.getPrivateAddresses()));
        checkState(size(ips) > 0, "node does not have IP addresses configured: " + node);
        return ips;
    }

From Adrian:
I think the reason would be evident in the code that calls the method pasted. At any rate, I'd guess it is more about which socket test completed first, given it is in parallel. The code should prefer the local address as that's cheaper in public clouds. Custom routing is possible by making a subclass of this and binding it in a Guice module passed to ContextBuilder.modules.
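
To make the "which socket test completed first" guess concrete, here is a minimal standalone sketch. It is not the jclouds implementation and uses no jclouds classes. The class name, the method names, the simulated probe, and the public address 203.0.113.10 (a documentation-only address) are all assumptions made for illustration; only 10.195.7.24 comes from the log above. It contrasts returning whichever parallel probe finishes first with probing in parallel but honouring the order of the address list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names exist in jclouds.
final class ReachableAddressSketch {

    // Stand-in for a socket connect test: sleeps for the configured latency,
    // then reports the address as reachable. Unknown addresses are unreachable.
    static Predicate<String> simulatedProbe(Map<String, Long> latencyMillis) {
        return ip -> {
            Long delay = latencyMillis.get(ip);
            if (delay == null) return false;
            try {
                Thread.sleep(delay);
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        };
    }

    // Race: the first probe to succeed wins, so list order is ignored.
    // Assumes a non-empty candidate list.
    static String firstToRespond(List<String> candidates, Predicate<String> probe) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(candidates.size());
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String ip : candidates) {
                tasks.add(() -> {
                    if (probe.test(ip)) return ip;
                    throw new IllegalStateException("unreachable: " + ip);
                });
            }
            return pool.invokeAny(tasks);
        } finally {
            pool.shutdownNow();
        }
    }

    // Still probes in parallel, but returns the earliest reachable address
    // in list order, so addresses listed first are preferred.
    static String firstReachableInOrder(List<String> candidates, Predicate<String> probe) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(candidates.size());
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (String ip : candidates) {
                Callable<Boolean> task = () -> probe.test(ip);
                results.add(pool.submit(task));
            }
            for (int i = 0; i < candidates.size(); i++) {
                if (results.get(i).get()) return candidates.get(i);
            }
            throw new NoSuchElementException("no reachable address in " + candidates);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Public address first, as in checkNodeHasIps; latencies are made up.
        List<String> ips = Arrays.asList("203.0.113.10", "10.195.7.24");
        Map<String, Long> latency = new HashMap<>();
        latency.put("203.0.113.10", 200L); // remote EC2 public address
        latency.put("10.195.7.24", 10L);   // colliding host on the local network
        Predicate<String> probe = simulatedProbe(latency);
        System.out.println("first to respond: " + firstToRespond(ips, probe));
        System.out.println("first in order:   " + firstReachableInOrder(ips, probe));
    }
}
```

With these made-up latencies, the race is expected to pick the colliding local address even though the public address is listed first, while the ordered variant picks the public one. Which ordering is right for a given deployment is a policy choice. As noted above, preferring the local address is usually cheaper in public clouds.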

More discussion of this is available here:
https://groups.google.com/forum/?fromgroups=#!topic/jclouds/TBpDtt9jaTo
