Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for discovery from within a docker container #517

Open
pdumais opened this issue Sep 27, 2023 · 14 comments · May be fixed by #524
Open

Add support for discovery from within a docker container #517

pdumais opened this issue Sep 27, 2023 · 14 comments · May be fixed by #524
Labels
enhancement New feature or request

Comments

@pdumais
Copy link

pdumais commented Sep 27, 2023

Since broadcasting a discovery packet won't work within a docker container (or kubernetes), I'd like to propose the following option:
A new discover_from_range(start,end) function that would send a UDP packet to each endpoints within the range and then listen for incomming responses.
This is different than doing a whole bunch of of discover_single and using asyncio.gather because it would create less overhead with tasks creation. I'd like it if we could simply send a whole bunch of UDP packets (a kind of manual broadcast) and then just wait for responses.

Is there an interest from the community in having such functionality? If yes, then I am willing to create a PR for this.

@rytilahti
Copy link
Member

So I think this is a very, very special use-case and I am not a great fan of the idea of having a separate method for such. Is there no way to perform broadcasts / expose networks in docker/kubernetes?

However, I'm open for the idea of refactoring the necessary bits that would make it easier to implement such outside the library. What you are basically looking is to have a way to initialize a _DiscoverProtocol instance that is not bound to a single target but has a do_discover(host: str) instead, right?

@rytilahti rytilahti added the enhancement New feature or request label Oct 4, 2023
@pdumais
Copy link
Author

pdumais commented Oct 4, 2023

Yeah, I wasn't sure if there was an interest. But I think anyone running in a container would have no other real options. You can setup container host networking but that is often undesirable.

The method would be sending a udp packet on multiple hosts but listen only on 1 socket. Right now, I'm creating a discovery for each host in the network and using asyncio.gather. but this creates many listening tasks. Icd rather have 1 task send all packets and then wait for the period of time to see who answers.

But yeah ... if I'm the only one running in docker, then it's not worth it.

@rytilahti
Copy link
Member

rytilahti commented Oct 4, 2023

Yeah, I understand the need and I'm personally open for PRs that make it easier to do exactly that, but I'm not sure if it makes sense to introduce a new high-level API for that.

One option to make it easier to fulfill your use case would be modify the _DiscoverProtocol.do_discover() as I suggested above, and maybe exposing that protocol instead of keeping it internal, to allow doing something like this:

from ipaddress import ip_network
from itertools import chain

async def on_discovered_callback(...):
    """handle discovered devices."""

async def discover_on_networks(*networks: List[IPv4Network]):
	transport, protocol = await loop.create_datagram_endpoint(
	    lambda: DiscoverProtocol(),
        target=None,  # to avoid automatic call for do_discover
	    local_addr=("0.0.0.0", 0),
        on_discovered=on_discovered_callback,
	)
	protocol = cast(DiscoverProtocol, protocol)
	for host in chain(*networks):
	    protocol.do_discover(str(host))

	asyncio.sleep(5)

discover_on_networks(ip_network("192.168.0.0/24").hosts(), ip_network("10.0.0.0/24"))

It's really hard to judge how much interest there is for such a feature, but I'm pretty sure there are people using this project inside docker-environments in a way or another :-)

@pdumais
Copy link
Author

pdumais commented Oct 5, 2023

That sounds like a good option. I'm not sure I fully understand your example, I'll have to look at the current implementation. But if I understand correctly, this would be a way for people to implement custom strategies and hook them into the library? That's a very good option I think.

@rytilahti
Copy link
Member

rytilahti commented Oct 5, 2023

So basically it would involve two changes:

  1. Modifying do_discover() to include a new parameter target (or host) which overrides the target tuples that are currently set inside __init__.
  2. Modifying the logic inside _DiscoverProtocol.__init__ to avoid calling do_discover if no target is specified (and allow None for target for that).
  3. Maybe also: change __init__ to default the target to None, and modify the Discover class to call the do_discover(target="255.255.255.255") explicitly.
  4. Maybe also: expose the protocol class via kasa and rename it to DiscoveryProtocol (instead of DiscoverProtocol..).

So the example above (which I just updated a bit) shows how the library user could use this adapted API to send multiple discoveries, by not using the Discover class but its backing, currently internal _DiscoverProtocol class directly.

Does that make sense to you?

@pdumais
Copy link
Author

pdumais commented Oct 5, 2023

Yeah, that makes sense. I'll try to take a look at this in the next weeks. Thanks.

@rytilahti rytilahti linked a pull request Oct 5, 2023 that will close this issue
@rytilahti
Copy link
Member

rytilahti commented Oct 5, 2023

I did a quick test to see how it would look like, see the linked draft and feel free to give it a try!

edit: looks like there's something off and the datagrams are not being sent as expected, I'll see if I can figure it out...

@pdumais
Copy link
Author

pdumais commented Oct 5, 2023

Nice, thanks for that. I'll give it a try in the next days

@rytilahti
Copy link
Member

Alas, for some reason that does not work in my testing but I'm not sure why. #528 got recently merged that allows constructing device instances by using the regular TCP connectivity, which could potentially be useful in your use-case, too.

@pdumais
Copy link
Author

pdumais commented Oct 8, 2023

I tried adding a sleep between each sendto. I can see that the packets go out but only on a few ip addresses. 8 IPs on my network. I have no idea why nothing is sent out to the other ones.

The documebtation for create_datagram_endpoint is confusing because it talks about establishing a connection. How is that relevant for UDP?

@pdumais
Copy link
Author

pdumais commented Oct 9, 2023

I just had to chance to test more today.
When send a packet to each address in the subnet, an ARP query is sent for all addresses that don't exist (since they are not in the ARP cache yet... and will never be). It seems like after about 64 (or something like that) ARP queries, asyncio stalls the requests for 2 seconds and then sends the rest.

I have a device at 192.168.1.187. If I change the sleep in your example to 5s before closing the transport, and I scan on 192.168.1.128/25, then I get the packet back. If I wanna use 192.168.0/24, then I have to put a sleep for 10s.
If I scan on 192.168.1.160/27, then I get the response instantly.

Odly enough, if I use plain UDP sockets in python, without asyncio, I am able to send the UDP packets instantly to everyone.

@rytilahti
Copy link
Member

rytilahti commented Oct 9, 2023

My testing mirrors yours, but I haven't had time to look into why.. I have earlier experienced ratelimiting caused by sending sending loads (i.e., thousands) of ICMP requests at once, but that was caused by the kernel. As using plain UDP sockets works in your tests, it's likely something in the python's asyncio stack that is causing this..

@pdumais
Copy link
Author

pdumais commented Oct 9, 2023

I'd bet on asyncio instead of the kernel since I dont have this issue if using the plain socket library.

Wanna wait if someone else can chime in? I can try to dig more later.

@rytilahti
Copy link
Member

Given this is a rather esoteric use-case, I doubt anyone will chime in. But I guess it'd be worth to look into how sendto works to figure out how it works and what should be changed. For regular write()s one is expected call drain() to clear the output buffer, maybe sendto() has something similar.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants