Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

URI bind/connect parsing error in zmq.io.net.tcp.TcpAddress::resolve(..) #874

Open
RalphSteinhagen opened this issue Feb 11, 2021 · 4 comments

Comments

@RalphSteinhagen
Copy link

We are implementing a Majordomo broker derivative that uses in addition to the ROUTER also XPUB and SUB socket types for communicating with external/internal clients and workers.

In order to distinguish the endpoints, we bind these sockets internally to:

  • "inproc://broker/router",
  • "inproc://broker/publisher", and
  • "inproc://broker/subscribe",
    which works fine.

In order to keep this symmetric for external clients/workers and to stick to the RFC 3986 URI convention (ie. scheme:[//authority]path[?query][#fragment]), the same sockets are also individually bound to external IP:port addresses.
I'd like to simply replace the 'broker' authority part with a specific 'IP:port' address (not necessarily all the same for each socket), for example 'tcp://localhost:5050/router', which presently fails in TcpAddress::resolve(..) with

java.lang.NumberFormatException: For input string: "5050/router"

at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.base/java.lang.Integer.parseInt(Integer.java:652)
at java.base/java.lang.Integer.parseInt(Integer.java:770)
at jeromq@0.5.2/zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:117)
[..]

since the parser assumes that there is only one number in the portStr after the addStr and colon: port = Integer.parseInt(portStr);

Binding several different sockets (but with different paths) on the same port aside, it would be nice if the addressing scheme could be fixed, honour the RFC URI convention, and either
a) ignore the additional path component, or
b) use the additional path information as an additional specifier similar to the 'inproc:////' syntax.

Maybe, at the same time, it would be worthwhile to reduce a lot of this custom/redundant code with the existing java.net.URI class implementation that handles a lot of the parsing and syntax checking.

A feature request that is related, but goes beyond this issue: would it be conceivable to use the 'path' argument as a differentiating feature to allow binding multiple sockets to the same TCP port ie. same port but different paths as for the 'inproc://' implementation? This would keep the implementation between both low-level protocols a bit more symmetric and cleaner.

@fredoboulo
Copy link
Contributor

Hi, Jeromq is aligned with Zeromq in the transport address definition, for example in tcp or inproc.

The project is based on C4, you are welcome to submit a PR. Please be mindful of the backward-compatibility (and testing, but that's more of an individual inclination).

@RalphSteinhagen
Copy link
Author

@fredoboulo thanks for the links. I started to extend this (which is/should be technically simple) but encountered some stumbling blocks, maybe you could advise how to proceed:

The usage of the asterisk '*' is different from the official URI RFX definition but manageable/ok and a simple replace. However:

  • While the tcp link you sent and examples therein seem to prescribe a leading 'tcp://' scheme prefix (which should be normal, since ZeroMQ supports also many other schemes/protocols), most of the unit-test omit this and assume a default fallback to 'tcp://' if no scheme/protocol is given. Shall one follow the
    a) 'works as implemented' which dumps any notion of URI compatibility, or
    b) 'works as specified' paradigm which may affect some users who used this (undocumented?) behaviour.
    In the latter case, the definition is a bit loose, strongly "inspired" by but not completely following the RFC 3986 URI definition. Personally, I'd prefer/suggest abiding to RFC3986 standard, because it's a more precise/known definition used all over the internet (ie. HTTP, REST).

  • Java API deprecation: jdk.net.Sockets.setOption(..) used in TcpUtils.java has been deprecated since Java 9 and seems to have been dropped from the module system. There is a direct method replacement in Socket itself, but the general question is rather: which Java language standard shall JeroMQ target? 8 (as still stated in pom.xml)? 11, or newer? I am asking because the newest Java standards/JDKs are diverging more and more from Java 8 and supporting it may become increasingly difficult.

Would be willing to dive a bit more into the code, modify it where necessary but am a bit worried that'll it be for nothing if these proposed changes (ie. endpoints following URI semantics, Java 11) which are close to the core are perceived to be too much...

Your input would thus be much appreciated.

@fredoboulo
Copy link
Contributor

fredoboulo commented Feb 14, 2021

The library aims to support Android usage, which means almost-Java-8.
It's painful to maintain (and only my individual perception).

Can you point which unit tests in Jeromq are omitting the scheme/protocol?

@RalphSteinhagen
Copy link
Author

Pretty much all in TcpAddressTests except for testBad(), testGoodIP46Google(), and other test that implicitly use 'tcp'.

I thought of generalising TcpAddress::resolve(..) using URI (which would take most of the parsing/syntax checking) to:

@Override
public InetSocketAddress resolve(String name, boolean ipv6, boolean local)
{
    if (name == null) {
        throw new IllegalArgumentException("declared TCP endpoint is null");
    }
    final URI uri = URI.create(name.replace("*", ipv6 ? "::" : "0.0.0.0"));
    final String newHost = uri.getHost();
    final int newPort = uri.getPort();
    final String path = uri.getPath();
    // [..]
}

There are also some parts in zmq.io.net.Address that drop the protocol implicitly but these are minor and can be fixed transparently.

It's of course possible to change all the unit-tests accordingly, but this may (or not) surprise some users that relied on this implicit behaviour. The first PR would have simply ignored the path argument -- as it should with the present definition.

Then -- as a second step (another PR) -- I would have propagated the path as (an optional) specifier argument similar to how it is done for inproc:// and also to allow a more economic use of TCP ports.

The library aims to support Android usage, which means almost-Java-8.

Well, then I should be probably looking for an Android-compatible JDK8 if I want to get anything merged. :-)

N.B. my IDE refuses to compile/run with the deprecated jdk.net.Sockets.setOption(socket, option, value); due to (probably) modularisation issues and/or it being an intellij issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants