[bug]: chan_iax2: Call setup fails if dialing hostnames that resolve to localhost #413

Open
1 task done
InterLinked1 opened this issue Nov 5, 2023 · 14 comments · May be fixed by #414
Assignees
Labels
bug support-level-core Functionality with core support level

Comments

@InterLinked1
Contributor

InterLinked1 commented Nov 5, 2023

Severity

Minor

Versions

21.0.0

Components/Modules

chan_iax2

Operating Environment

Debian 12

Frequency of Occurrence

Constant

Issue Description

I've uncovered a scenario where all calls fail when they are dialed using a hostname, but succeed when they are dialed using the loopback IP address:

same => n,Dial(IAX2/myuser@some.hostname.example.com/5551212) ; fails
same => n,Dial(IAX2/myuser@127.0.0.1/5551212) ; works

Authentication and encryption settings are not relevant; just changing from the hostname to the IP address makes the bug disappear.

On this system, the hostname in the first example resolves to 127.0.1.1 locally.

I added some debug messages locally to pinpoint the issue, and something strange I've noticed is the following for all such failing calls:

Silently rejecting call without accurate destination call number (32769)

Failing calls always emit this message with 32769, which is likely indicative of a call number not being set somewhere. This happens between the initial call setup request and the point where the request is supposed to be repeated with a call token, but the repeated request with a call token never happens due to the above.

The message above is one I added, here:

if (!fr->callno) {
	int check_dcallno = 0;

	/*
	 * We enforce accurate destination call numbers for ACKs.  This forces the other
	 * end to know the destination call number before call setup can complete.
	 *
	 * Discussed in the following thread:
	 *    http://lists.digium.com/pipermail/asterisk-dev/2008-May/033217.html
	 */

	if ((ntohs(mh->callno) & IAX_FLAG_FULL) && ((f.frametype == AST_FRAME_IAX) && (f.subclass.integer == IAX_COMMAND_ACK))) {
		check_dcallno = 1;
	}

	if (!(fr->callno = find_callno(ntohs(mh->callno) & ~IAX_FLAG_FULL, dcallno, &addr, new, fd, check_dcallno))) {
		ast_log(LOG_NOTICE, "inside the if (no such call: %d)\n", ntohs(mh->callno));
		if (f.frametype == AST_FRAME_IAX && f.subclass.integer == IAX_COMMAND_NEW) {
			send_apathetic_reply(1, ntohs(fh->scallno), &addr, IAX_COMMAND_REJECT, ntohl(fh->ts), fh->iseqno + 1, fd, NULL);
		} else if (f.frametype == AST_FRAME_IAX && (f.subclass.integer == IAX_COMMAND_REGREQ || f.subclass.integer == IAX_COMMAND_REGREL)) {
			send_apathetic_reply(1, ntohs(fh->scallno), &addr, IAX_COMMAND_REGREJ, ntohl(fh->ts), fh->iseqno + 1, fd, NULL);
		} else {
			ast_log(LOG_WARNING, "Silently rejecting call without accurate destination call number (%d)\n", ntohs(mh->callno));
		}
		ast_variables_destroy(ies.vars);
		return 1;
	}
}
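One possible reading of the 32769 value, offered as a sketch rather than a confirmed diagnosis: the log line prints `ntohs(mh->callno)` without masking off `IAX_FLAG_FULL`, so the number still carries the full-frame bit. Assuming that flag is the high bit (0x8000), 32769 is 0x8001, which decodes to a full frame with source call number 1. That would agree with the `SCall: 00001` field in the CTOKEN frame dump in this report. The names `FULL_FRAME_FLAG`, `decoded_callno`, and `decode_callno` below are hypothetical and are not part of chan_iax2.

```c
#include <stdint.h>

/* Assumed value of IAX_FLAG_FULL: the high bit of the 16-bit call number field. */
#define FULL_FRAME_FLAG 0x8000

/* Hypothetical helper type, for illustration only. */
struct decoded_callno {
	int full_frame;      /* 1 if the full-frame bit is set */
	unsigned int callno; /* remaining 15 bits: the source call number */
};

/* Split a raw (host byte order) call number field into flag and number. */
static struct decoded_callno decode_callno(uint16_t raw)
{
	struct decoded_callno d;

	d.full_frame = (raw & FULL_FRAME_FLAG) ? 1 : 0;
	d.callno = (unsigned int) (raw & ~FULL_FRAME_FLAG);
	return d;
}
```

Under that assumption, the logged value is not an unset call number; it is call number 1 with the full-frame flag still attached.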

Here's a failing call with IAX2 debug enabled, showing that a call token is requested but the request is never repeated with the call token:

[2023-11-05 23:33:57] Tx-Frame Retry[ No] -- OSeqno: 000 ISeqno: 001 Type: IAX     Subclass: CTOKEN
[2023-11-05 23:33:57]     Timestamp: 00010ms  SCall: 00001  DCall: 01938 127.0.0.1:4569
[2023-11-05 23:33:57]     CALLTOKEN       : 51 bytes
[2023-11-05 23:33:57]
[2023-11-05 23:33:57] DEBUG[40612]: chan_iax2.c:10321 socket_process_helper: XXX socket_process_helper tracepoint
[2023-11-05 23:33:57] DEBUG[40614]: chan_iax2.c:10198 socket_process_helper: XXX socket_process_helper
[2023-11-05 23:33:57] Rx-Frame Retry[ No] -- OSeqno: 000 ISeqno: 001 Type: IAX     Subclass: CTOKEN
[2023-11-05 23:33:57]     Timestamp: 00010ms  SCall: 00001  DCall: 01938 127.0.0.1:4569
[2023-11-05 23:33:57]     CALLTOKEN       : 51 bytes
[2023-11-05 23:33:57]
[2023-11-05 23:33:57] NOTICE[40614]: chan_iax2.c:10344 socket_process_helper: No call number yet
[2023-11-05 23:33:57] NOTICE[40614]: chan_iax2.c:10360 socket_process_helper: inside the if (no such call: 32769)
[2023-11-05 23:33:57] WARNING[40614]: chan_iax2.c:10366 socket_process_helper: Silently rejecting call without accurate destination call number (32769)
[2023-11-05 23:33:58]     -- Nobody picked up in 1000 ms

The only difference I can think of in this scenario is that the hostname being dialed resolves to a loopback address (127.0.1.1). On a previous system, the hostname was local, but it resolved to a public IP address, not a loopback address. So this is probably being exposed by DNS resolving slightly differently between the two systems, but obviously this is incorrect behavior regardless.

Relevant log output

No response

Asterisk Issue Guidelines

  • Yes, I have read the Asterisk Issue Guidelines
InterLinked1 added a commit to InterLinked1/phreakscript that referenced this issue Nov 6, 2023
It seems that calls to a hostname which
resolves locally to the loopback address will fail
with IAX2. This is outlined here: asterisk/asterisk#413

This might happen on nodes where a (public) hostname
resolves internally to a loopback IP address.

As a workaround, we now resolve the hostname and if it resolves
to a loopback IP address, we explicitly rewrite the destination
with the loopback IP address.

A corresponding change has been made in the verification APIs
to allow loopback calls to successfully verify if the alleged
hostname resolves to the verifying node's IP address, and the
received IP address is the loopback IP address.
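The workaround described in this commit message could be sketched roughly as follows. This is not the PhreakScript implementation; `resolves_to_loopback` is a hypothetical helper that uses the standard `getaddrinfo` call to check whether a hostname resolves to an IPv4 address in 127.0.0.0/8. It only considers IPv4, to keep the example short.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not part of Asterisk or PhreakScript.
 * Returns 1 and writes the dotted-quad address to out if host resolves
 * to an IPv4 address in 127.0.0.0/8, 0 if it does not,
 * and -1 if resolution fails.
 */
static int resolves_to_loopback(const char *host, char *out, size_t outlen)
{
	struct addrinfo hints, *res, *cur;
	int found = 0;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_INET;      /* IPv4 only in this sketch */
	hints.ai_socktype = SOCK_DGRAM; /* IAX2 runs over UDP */

	if (getaddrinfo(host, NULL, &hints, &res) != 0) {
		return -1;
	}
	for (cur = res; cur; cur = cur->ai_next) {
		const struct sockaddr_in *sin = (const struct sockaddr_in *) cur->ai_addr;

		/* In host byte order, the top octet of a 127.0.0.0/8 address is 127. */
		if ((ntohl(sin->sin_addr.s_addr) >> 24) == 127) {
			if (inet_ntop(AF_INET, &sin->sin_addr, out, (socklen_t) outlen)) {
				found = 1;
			}
			break;
		}
	}
	freeaddrinfo(res);
	return found;
}
```

A caller could use a result of 1 as the signal to dial a loopback IP address directly instead of the hostname, as in the working `Dial(IAX2/myuser@127.0.0.1/5551212)` example from the report.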
@jcolp
Member

jcolp commented Nov 6, 2023

I am unable to reproduce this using the sample configuration and the provided information. Please provide a complete iax.conf file that reproduces this, to ensure that no other configuration options are required to expose it.

@InterLinked1
Contributor Author

> I am unable to reproduce this using the sample configuration and the provided information. Please provide a complete iax.conf file that reproduces this, to ensure that no other configuration options are required to expose it.

This happens with the stock iax.conf.sample.

It doesn't even matter if the dialed destination exists or not, since this all happens before that point.

@jcolp
Member

jcolp commented Nov 6, 2023

Welp, still not happening for me I'm afraid.

@InterLinked1
Contributor Author

> Welp, still not happening for me I'm afraid.

Hmm, just to clarify, you're dialing a hostname that resolves publicly to a public interface but internally to the loopback one?
I don't think this bug is manifested in any other scenario.

For example, this doesn't happen on another system if I try to reproduce it, since the above isn't true.

@jcolp
Member

jcolp commented Nov 6, 2023

Hostnames don't resolve to interfaces. If you mean it's using split DNS to resolve internally to loopback, but externally to a public IP address that is what I tried. If there are other specific required conditions, then you'd need to elaborate further on the setup such as how it is resolving it locally (for example using /etc/hosts or a local dnsmasq instance).

@InterLinked1
Contributor Author

> Hostnames don't resolve to interfaces. If you mean it's using split DNS to resolve internally to loopback, but externally to a public IP address that is what I tried. If there are other specific required conditions, then you'd need to elaborate further on the setup such as how it is resolving it locally (for example using /etc/hosts or a local dnsmasq instance).

My precise DNS setup is:

sub1.example.com is a CNAME record for sub2.example.com
sub2.example.com is an A record for the machine's public IP address

In /etc/hosts, the following is present:

127.0.1.1 sub2.example.com sub2
127.0.0.1 localhost

I didn't add that entry myself; it might have been added automatically as part of VPS setup.

In contrast, the old machine only has:

127.0.0.1 sub3

If I remove sub2.example.com from /etc/hosts, the problem goes away.

@jcolp
Member

jcolp commented Nov 6, 2023

And there's the missing piece, it's binding to 127.0.1.1. I was binding to 127.0.0.1 because that is what is on my loopback interface. By default chan_iax2 binds to "0.0.0.0" which does not appear to cover 127.0.1.1. That would be a Linux/socket thing, not chan_iax2 specifically. It looks like an application has to explicitly bind to that IP address to receive its traffic. For example this appears to work for me:

bindaddr=0.0.0.0
bindaddr=127.0.1.1

@jcolp
Member

jcolp commented Nov 6, 2023

At least, that is what caused it to work for me.

@jcolp
Member

jcolp commented Nov 6, 2023

Actually it does receive the traffic, but something gets confused - possibly chan_iax2 itself. Binding explicitly on 127.0.1.1 does cause it to work though as I mentioned.

@InterLinked1
Contributor Author

> And there's the missing piece, it's binding to 127.0.1.1. I was binding to 127.0.0.1 because that is what is on my loopback interface. By default chan_iax2 binds to "0.0.0.0" which does not appear to cover 127.0.1.1. That would be a Linux/socket thing, not chan_iax2 specifically. It looks like an application has to explicitly bind to that IP address to receive its traffic. For example this appears to work for me:
>
> bindaddr=0.0.0.0
> bindaddr=127.0.1.1

That makes sense to me. I guess I'll chalk this up to stupidity in how the loopback interface was set up, then. I'm not sure why anything would be configured to use 127.0.1.1 over 127.0.0.1; I just assumed loopback would take care of 127.0.0.0/8, since that whole range is technically reserved for it. But if this is what they're doing now, more people might run into this issue in the future.

I don't think there's anything that can be done in Asterisk for that. Does the 32769 call number make sense given what would be happening, or does anything else about that seem strange to you? Even if we can't do anything to fix this, I'd still like to add a warning to expose this so people know what they might need to look at.

@jcolp
Member

jcolp commented Nov 6, 2023

I don't know IAX2, I can't comment on that.

@InterLinked1
Contributor Author

> I don't know IAX2, I can't comment on that.

Okay, you can assign this to me then. Based on my analysis, I'll at least add a warning telling people to look at their network configuration, since we can't do much else. The issue is reliably detectable, but currently there are no debug or log messages at all when this scenario happens.

@jcolp jcolp added support-level-core Functionality with core support level and removed triage feedback-required labels Nov 6, 2023
@jcolp
Member

jcolp commented Nov 6, 2023

I would suggest actually spending further time investigating the root cause beyond my high level analysis/experimentation. Placing a warning message without understanding the true context can cause confusion.

InterLinked1 added a commit to InterLinked1/asterisk that referenced this issue Nov 6, 2023
Certain calls can fail due to networking/DNS misconfigurations
where a call is made to a loopback address on which chan_iax2
is not listening. When this happens, there is no relevant
logging currently, making this very difficult to track down.
This issue can be detected reliably, so this adds a warning
message to direct users to check their configuration.

Resolves: asterisk#413
@InterLinked1 InterLinked1 linked a pull request Nov 6, 2023 that will close this issue
InterLinked1 added a commit to InterLinked1/phreakscript that referenced this issue Nov 6, 2023
It seems that calls to a hostname which
resolves locally to the loopback address will fail
with IAX2. This is outlined here: asterisk/asterisk#413

This might happen on nodes where a (public) hostname
resolves internally to a loopback IP address, but chan_iax2
is not listening on that specific address (e.g. 127.0.1.1
instead of 127.0.0.1). This can likely be resolved by adjusting
/etc/hosts.
@seanbright
Contributor

When Asterisk sends a NEW to a peer (in this case 127.0.1.1), it expects to get responses from that IP address. Because we do not explicitly set the source address for outgoing packets, the responses are sourced from 127.0.0.1.
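The mismatch described here can be illustrated with a deliberately simplified sketch. It is not chan_iax2's actual call-matching code; `make_addr` and `same_peer` are hypothetical helpers that only compare the IPv4 address and port of two endpoints, on the assumption that a reply is associated with a pending call by the peer address it was sent to.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: build an IPv4 socket address from a dotted quad and port. */
static struct sockaddr_in make_addr(const char *ip, unsigned short port)
{
	struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	inet_pton(AF_INET, ip, &sin.sin_addr);
	return sin;
}

/* Hypothetical helper: returns 1 if both endpoints have the same address and port. */
static int same_peer(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
	return a->sin_addr.s_addr == b->sin_addr.s_addr && a->sin_port == b->sin_port;
}
```

With the addresses from this issue, the NEW goes to 127.0.1.1:4569 while the CTOKEN arrives from 127.0.0.1:4569 (as the frame dump in the report shows), so `same_peer` on those two endpoints returns 0 and the reply cannot be matched to the pending call under this assumption.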
