Enable split script to use Magic SD-WAN #194

Open
sidprax opened this issue Jul 29, 2023 · 12 comments

@sidprax

sidprax commented Jul 29, 2023

Is there a way to do this with the existing options (WireGuard)? I cannot get internet traffic to go through the Magic S2S tunnel (the subnets on both sides can talk to each other), but I'd like to access the internet from site 1 through the WAN on site 2.

@sidprax
Author

sidprax commented Jul 30, 2023

I tried a few different ways (outputs below are from nexthop from Site B). 10.0.1.0/24 is Site A, 10.0.10.0/24 is B. Magic S2S uses 192.168.X.0 as gateway. wgsts1000 is the interface name for wg Magic S2S.

Seems like something needs to be done on perhaps remote site B firewall, but I'm hitting a wall here. @peacey Any help will be much appreciated!

Pinging a remote client on Site B works both with and without masquerade:

tcpdump -ni any host 10.0.10.64

tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
23:43:12.678197 wgsts1000 In IP 10.0.1.75 > 10.0.10.64: ICMP echo request, id 1, seq 1814, length 40
23:43:12.678243 br0 Out IP 10.0.1.75 > 10.0.10.64: ICMP echo request, id 1, seq 1814, length 40
23:43:12.678271 rai4 Out IP 10.0.1.75 > 10.0.10.64: ICMP echo request, id 1, seq 1814, length 40
23:43:12.758358 rai4 P IP 10.0.10.64 > 10.0.1.75: ICMP echo reply, id 1, seq 1814, length 40
23:43:12.758358 br0 In IP 10.0.10.64 > 10.0.1.75: ICMP echo reply, id 1, seq 1814, length 40
23:43:12.758443 wgsts1000 Out IP 10.0.10.64 > 10.0.1.75: ICMP echo reply, id 1, seq 1814, length 40

Without masquerade, a ping to an internet IP seems to go through the 192.168.X.X subnet and then get dropped (pinged 4.2.2.2 from Site A):

tcpdump -ni any host 4.2.2.2

tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
23:02:51.683089 wgsts1000 In IP 192.168.5.1 > 4.2.2.2: ICMP echo request, id 1, seq 1392, length 40
23:02:51.683175 wgsts1000 Out IP 192.168.5.1 > 4.2.2.2: ICMP echo request, id 1, seq 1392, length 40
23:02:56.566520 wgsts1000 In IP 192.168.5.1 > 4.2.2.2: ICMP echo request, id 1, seq 1393, length 40
23:02:56.566587 wgsts1000 Out IP 192.168.5.1 > 4.2.2.2: ICMP echo request, id 1, seq 1393, length 40

With masquerade it doesn't work either; it seems like the packets keep getting reflected multiple times (pinged 4.2.2.2 from Site A):

tcpdump -ni any host 4.2.2.2

tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
23:51:41.363282 wgsts1000 In IP 10.0.1.75 > 4.2.2.2: ICMP echo request, id 1, seq 1910, length 40
23:51:41.363333 wgsts1000 Out IP 10.0.1.75 > 4.2.2.2: ICMP echo request, id 1, seq 1910, length 40
23:51:41.640179 wgsts1000 In IP 10.0.1.75 > 4.2.2.2: ICMP echo request, id 1, seq 1910, length 40
23:51:41.640199 wgsts1000 Out IP 10.0.1.75 > 4.2.2.2: ICMP echo request, id 1, seq 1910, length 40
23:51:41.913949 wgsts1000 In IP 10.0.1.75 > 4.2.2.2: ICMP echo request, id 1, seq 1910, length 40
23:51:41.913970 wgsts1000 Out IP 10.0.1.75 > 4.2.2.2: ICMP echo request, id 1, seq 1910, length 40
.... Several same rows

@peacey
Owner

peacey commented Jul 30, 2023

Hi @sidprax,

It's difficult for me to debug this because I don't have two UDMs to try the Magic site-to-site on. From the results you show, it's odd that when pinging an external IP (4.2.2.2), with or without masquerade, the requests on Site B are being re-routed back through the WireGuard tunnel instead of out the WAN interface.

23:02:51.683089 wgsts1000 In IP 192.168.5.1 > 4.2.2.2: ICMP echo request, id 1, seq 1392, length 40
23:02:51.683175 wgsts1000 Out IP 192.168.5.1 > 4.2.2.2: ICMP echo request, id 1, seq 1392, length 40

It says In IP then Out IP on the same wgsts1000 tunnel... that shouldn't be the case. It should say Out IP on the WAN interface for it to work. So I suspect Unifi has some weird rules for this tunnel that force all traffic coming out of it to go back through it, perhaps...? Or maybe you're using an incorrect gateway.

First of all, you said you are using 192.168.X.0 as the gateway in the VPN script? In a typical /24, .0 isn't a usable host IP though; it's the network address and isn't assigned to any host. Did you mean 192.168.X.1 or something like that? And how did you figure out this gateway?
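
To check which addresses are reserved in a subnet, here is a minimal sketch using Python's standard `ipaddress` module. The /24 prefix is an assumption made for illustration; the tunnel's actual prefix length has not been stated at this point in the thread.

```python
import ipaddress

# Assumption: treat the tunnel subnet as a /24 purely for illustration.
net = ipaddress.ip_network("192.168.5.0/24")

network_addr = net.network_address      # first address of the block
broadcast_addr = net.broadcast_address  # last address of the block
usable_hosts = list(net.hosts())        # everything in between

print("network address:", network_addr)
print("broadcast address:", broadcast_addr)
print("first usable host:", usable_hosts[0])
print("last usable host:", usable_hosts[-1])

# A /32 contains exactly one address, so the usual reservations do not apply.
single = ipaddress.ip_network("192.168.5.0/32")
print("addresses in /32:", [str(a) for a in single])
```

In a /24, `.0` is the network address and `.255` is the broadcast address, and neither is normally assigned to a host. On a /32 point-to-point link the single address can be used as-is.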

Also, how are you adding a wireguard magic S2S? When I go to S2S options in Unifi, I only see OpenVPN or IPSec options.

@sidprax
Author

sidprax commented Jul 31, 2023

Thanks for replying! See below for some outputs from Site A which may be helpful. Let me know if you want to see some outputs from Site B instead.

I see 192.168.X.0 here:
netstat -r

Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
10.0.0.2 0.0.0.0 255.255.255.255 UH 0 0 0 br5.mac
10.0.1.0 0.0.0.0 255.255.255.0 U 0 0 0 br0
10.0.10.0 192.168.0.0 255.255.255.0 UG 0 0 0 wgsts1000
10.0.10.2 192.168.0.0 255.255.255.255 UGH 0 0 0 wgsts1000
XXXXXXXXX 0.0.0.0 255.255.255.0 U 0 0 0 eth8
192.168.0.0 0.0.0.0 255.255.255.255 UH 0 0 0 wgsts1000
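
As a hedged aid for reading the table above, this sketch converts the `Genmask` column to a prefix length and spells out the standard `U`/`G`/`H` flags. The rows are copied from the output; the helper name `describe` is hypothetical.

```python
import ipaddress

# Standard meanings of the route flags shown by `netstat -r`.
FLAGS = {"U": "up", "G": "uses gateway", "H": "host route"}

def describe(dest, gateway, genmask, flags, iface):
    """Return a one-line summary of a routing table row (hypothetical helper)."""
    net = ipaddress.ip_network(f"{dest}/{genmask}", strict=False)
    meaning = ", ".join(FLAGS[f] for f in flags)
    via = "directly connected" if gateway == "0.0.0.0" else f"via {gateway}"
    return f"{net} {via} dev {iface} ({meaning})"

rows = [
    ("10.0.1.0", "0.0.0.0", "255.255.255.0", "U", "br0"),
    ("10.0.10.0", "192.168.0.0", "255.255.255.0", "UG", "wgsts1000"),
    ("192.168.0.0", "0.0.0.0", "255.255.255.255", "UH", "wgsts1000"),
]
summaries = [describe(*row) for row in rows]
for line in summaries:
    print(line)
```

The last row is a /32 host route on wgsts1000, which suggests that 192.168.0.0 is used here as a single point-to-point address rather than as the network address of a larger subnet.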

and here:
ip route show 10.0.10.1/24

10.0.10.0/24 via 192.168.0.0 dev wgsts1000 proto ospf metric 20 onlink

WireGuard has 0.0.0.0/0 in its allowed IPs. When using your script, I disabled blackhole.
wg

interface: wgsts1000
public key: XXXXXX
private key: (hidden)
listening port: 20001

peer: XXXXXXX
endpoint: XXXXXXXX:22570
allowed ips: 0.0.0.0/0, 192.168.0.0/32
latest handshake: 38 seconds ago
latest receive: 3 seconds ago
transfer: 1.85 MiB received, 1.47 MiB sent
persistent keepalive: every 10 seconds
forced handshake: every 5 seconds
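
WireGuard uses the `allowed ips` list both to pick a peer for outgoing packets and to filter incoming ones. A minimal sketch of that match, using the list shown above (the function name `peer_allows` is hypothetical):

```python
import ipaddress

def peer_allows(address, allowed_ips):
    """Return True if `address` falls inside any of the peer's allowed IPs."""
    ip = ipaddress.ip_address(address)
    return any(ip in ipaddress.ip_network(cidr) for cidr in allowed_ips)

allowed = ["0.0.0.0/0", "192.168.0.0/32"]

# 4.2.2.2 is the internet address pinged earlier in this thread.
print(peer_allows("4.2.2.2", allowed))
# Without the 0.0.0.0/0 entry only the tunnel address itself would match.
print(peer_allows("4.2.2.2", ["192.168.0.0/32"]))
```

Since `0.0.0.0/0` is present, WireGuard itself should accept internet-bound packets for this peer, which points at routing or firewall rules rather than the tunnel's peer configuration.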

The Magic S2S is a new option from firmware 3.1.X when you own multiple unifi OS devices. You can choose the option in unifi dashboard.

(Two screenshots attached.)

@sidprax
Author

sidprax commented Jul 31, 2023

Unifi changes the X in 192.168.X.0 whenever the tunnel reconnects, so don't mind it changing from 5 to 0 in my last reply.

@jeffdoo

jeffdoo commented Jul 31, 2023

I too am having issues while attempting to set up a new remote UDM Pro for my in-laws.

I do not know if this makes a difference, but using the old S2S IPSec method I would see the following:

ip route show 192.168.10.0/24
192.168.10.0/24 dev vti64 proto static scope link metric 30

But when running the same command when using the new Magic method I see the following:

ip route show 192.168.10.0/24
192.168.10.0/24 via 192.168.1.1 dev wgsts1001 proto ospf metric 20 onlink
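
To make the difference between the two routes easier to see, here is a small sketch that splits each `ip route` line into fields. The parser `parse_route` is hypothetical and only handles the keywords that appear in these two lines.

```python
def parse_route(line):
    """Parse one `ip route` output line into a dict (handles only a few keywords)."""
    tokens = line.split()
    route = {"dest": tokens[0], "flags": []}
    keyed = {"via", "dev", "proto", "scope", "metric"}
    i = 1
    while i < len(tokens):
        if tokens[i] in keyed and i + 1 < len(tokens):
            route[tokens[i]] = tokens[i + 1]
            i += 2
        else:
            route["flags"].append(tokens[i])  # bare words such as "onlink"
            i += 1
    return route

ipsec = parse_route("192.168.10.0/24 dev vti64 proto static scope link metric 30")
magic = parse_route("192.168.10.0/24 via 192.168.1.1 dev wgsts1001 proto ospf metric 20 onlink")

for key in ("via", "dev", "proto", "scope", "metric"):
    print(f"{key:6} ipsec={ipsec.get(key)!s:8} magic={magic.get(key)}")
print("magic flags:", magic["flags"])
```

The IPSec route is a static, link-scoped route with no gateway, while the Magic route is learned via OSPF and points at a gateway (192.168.1.1) marked `onlink`.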

If I configure vpn.conf to use 192.168.10.1 (Site A) as I am used to with the old S2S IPSec VPN:

/etc/split-vpn/vpn/updown.sh wgsts1001 up huntersville
[Mon Jul 31 08:41:02 EDT 2023] split-vpn: wgsts1001 up: Loading configuration from /mnt/data/split-vpn/nexthop/huntersville/vpn.conf.
Error: Nexthop has invalid gateway.

If I use 192.168.1.1 everything starts fine but nothing is routed to the internet.

@sidprax
Author

sidprax commented Jul 31, 2023

I think the WireGuard implementation is actually great because I'm pretty sure there's some wizardry happening in the back end for CG-NAT. I never got OpenVPN or IPSec S2S to work for me in the past. The WireGuard implementation is working pretty well for connecting to clients on Site B, but I think some rules (either by Unifi's design or omission) are blocking internet-bound traffic.

We need a networking wizard to help here 😃 @peacey whenever you have some time!

@sidprax
Author

sidprax commented Aug 6, 2023

@peacey @jeffdoo Any chance you were able to look at this?

@jeffdoo

jeffdoo commented Aug 7, 2023

@sidprax I did not have time to further investigate and went back to the old IPSec S2S solution. Hopefully this can be resolved because the Magic method makes connecting UDM Pros incredibly easy.

@sidprax
Author

sidprax commented Aug 8, 2023

This is weird, I'm not sure why my ip route shows a via .0 (the network address) while yours, @jeffdoo, shows via .1 😑

ip route show 10.0.10.0/24
10.0.10.0/24 via 192.168.5.0 dev wgsts1000 proto ospf metric 20 onlink

@angusdavis2

angusdavis2 commented Aug 24, 2023

Magic Sites feels like a much better site-to-site VPN implementation than the alternatives, as it handles the scenarios that plague traditional site-to-site VPN setups in Unifi, such as failover to a secondary WAN connection, dynamic IP addresses, FQDN support, etc. But figuring out how to route traffic over the Magic site VPN remains a mystery to me. If we could get split-vpn to work with Magic sites it would be awesome!

Running ip route show, I see similar output to @sidprax (my remote network reached via the VPN is 192.168.2.0/24):

ip route show 192.168.2.0/24
192.168.2.0/24 via 192.168.1.0 dev wgsts1000 proto ospf metric 20 onlink

The 192.168.1.0 is not a usable host address, as @peacey noted, but it's what appears here. If you just run ip route show, you will see the following; note the two entries related to the WireGuard site-to-site VPN (wgsts1000):

24.171.201.1 dev ppp0 proto kernel scope link src 70.45.6.247   # My Primary Internet 
100.64.0.0/10 dev eth7 proto kernel scope link src 100.100.44.63  # My Starlink Secondary
192.168.0.0/24 dev br0 proto kernel scope link src 192.168.0.1 
192.168.1.0 dev wgsts1000 proto kernel scope link
192.168.2.0/24 via 192.168.1.0 dev wgsts1000 proto ospf metric 20 onlink

I have noticed that even though the address shown is 192.168.1.0 (not a usable host address), going to 192.168.1.1 brings up the UDM Pro (even though my local network is 192.168.0.1).

In vpn.conf, I have experimented getting split-vpn to work with the following settings, to no avail:

  • DEV=wgsts1000
  • BYPASS_MASQUERADE_IPV4="ALL" (also tried BYPASS_MASQUERADE_IPV4="")
  • VPN_ENDPOINT_IPV4= I tried the gateway of the remote network (192.168.2.1), similar to how split-vpn works for an IPSec VPN. I also tried 192.168.1.0, the odd .0 address reported by ip route show, similar to how split-vpn works for OpenVPN. I also tried 192.168.1.1. When using 192.168.1.1 or 192.168.0.1, the script runs without complaining but reaching the FORCED destinations does not work. When using the remote site's router (192.168.2.1), the script fails with Error: Nexthop has invalid gateway.

By "reaching the FORCED destinations does not work", I mean the following. Consider this example: whatismyip.com is forced using IP sets, and its IP address is 172.67.189.152. I attempt to ping it from a host on my local network (192.168.0.84) while running tcpdump on the local UDM Pro.

Behavior when split-vpn is DOWN / turned off (normal, expected behavior, going out over the WAN interface):

#  tcpdump -ni any host 172.67.189.152
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
11:45:27.387920 switch0 In  IP14 (invalid)
11:45:27.387920 switch0.1 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 59900, seq 0, length 64
11:45:27.387920 br0   In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 59900, seq 0, length 64
11:45:27.387980 ppp0  Out IP 70.45.6.247 > 172.67.189.152: ICMP echo request, id 59900, seq 0, length 64
11:45:27.416340 ppp0  In  IP 172.67.189.152 > 70.45.6.247: ICMP echo reply, id 59900, seq 0, length 64
11:45:27.416382 br0   Out IP 172.67.189.152 > 192.168.0.84: ICMP echo reply, id 59900, seq 0, length 64
11:45:27.416387 switch0.1 Out IP 172.67.189.152 > 192.168.0.84: ICMP echo reply, id 59900, seq 0, length 64

Behavior when split-vpn is UP / turned on, with VPN_ENDPOINT_IPV4=192.168.1.1 -- appears to be in a loop:

#  tcpdump -ni any host 172.67.189.152
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
11:43:17.246175 switch0 In  IP14 (invalid)
11:43:17.246175 switch0.1 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.246175 br0   In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.246220 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.319732 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.319759 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.393499 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.393524 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.467263 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.467290 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.541941 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.541962 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.615379 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.615400 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.689211 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.689237 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.764531 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.764555 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.839995 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.840019 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.915014 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.915040 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.989137 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.989169 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
( continues like this ad infinitum)
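
The loop in the capture above can be quantified with a short sketch that counts how often the same ICMP request leaves on one interface and measures the gap between repeats. The regular expression is an assumption tuned to the `tcpdump -ni any` line format shown here, and only the first few lines of the capture are included as sample input.

```python
import re
from datetime import datetime

LINE = re.compile(
    r"^(?P<ts>\d\d:\d\d:\d\d\.\d+)\s+(?P<iface>\S+)\s+(?P<dir>In|Out)\s+IP\s+"
    r"(?P<src>\S+) > (?P<dst>[\d.]+): ICMP echo request, id (?P<id>\d+), seq (?P<seq>\d+)"
)

def out_times(capture, iface):
    """Map (id, seq) -> timestamps at which that request left via `iface`."""
    seen = {}
    for raw in capture.splitlines():
        m = LINE.match(raw.strip())
        if m and m["iface"] == iface and m["dir"] == "Out":
            ts = datetime.strptime(m["ts"], "%H:%M:%S.%f")
            seen.setdefault((m["id"], m["seq"]), []).append(ts)
    return seen

sample = """\
11:43:17.246175 br0   In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.246220 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.319732 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.319759 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.393499 wgsts1000 In  IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
11:43:17.393524 wgsts1000 Out IP 192.168.0.84 > 172.67.189.152: ICMP echo request, id 20476, seq 0, length 64
"""

times = out_times(sample, "wgsts1000")[("20476", "0")]
gaps_ms = [(b - a).total_seconds() * 1000 for a, b in zip(times, times[1:])]
print(f"sent {len(times)} times, gaps in ms: {[round(g, 1) for g in gaps_ms]}")
```

In the sample, the same request (id 20476, seq 0) leaves wgsts1000 three times, roughly 74 ms apart. That is consistent with the packet bouncing between the two sites once per tunnel round trip, presumably until its TTL runs out.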

Sorry I am not much of a networking whiz, but would love to get split-vpn working with the magic site stuff as it is so far superior to any other UDM solution for site-to-site VPN. Feedback / suggestions welcome.

@jacobmr

jacobmr commented Feb 5, 2024

Looks like this thread is stale. I'm curious if anyone has made any progress here. @peacey - given that the new low-cost express devices support Site Magic, I bet that those of us here who would love to have magic site would sponsor the purchase of one for you so that you can test/implement (if possible) support for magic site ... anyone else game for this?

@rigwig

rigwig commented Mar 17, 2024

I would also be down to chip in if this would help dev work on this.

I've tried all the combinations others have mentioned previously, to no avail as well.

10.0.1.0/24 via 192.168.1.1 dev wgsts1000 proto ospf metric 20 onlink
10.0.1.2 via 192.168.1.1 dev wgsts1000 proto ospf metric 20 onlink
10.1.0.0/24 dev br0 proto kernel scope link src 10.1.0.1
10.1.1.0/24 dev br21 proto kernel scope link src 10.1.1.1
10.1.2.0/24 dev br22 proto kernel scope link src 10.1.2.1

I do have a routable address, 192.168.1.1, and am able to bring the tunnel up without error, but no dice on the actual connection.
