What about mobile domains? #18
Thanks @ThirtySomething, I'll take a look at it in the next week or two and let you know what I find.
Not only mobile domains cause issues; some domains have a www/www8/w1 prefix in front that isn't blocked by Pi-hole. To solve it, either all of those domains need to be added to the block list, or the list needs to be converted to a blacklist with wildcard regexes. For example (^|\.)ladies\.de$ will block them all (www.ladies.de / m.ladies.de / ladies.de). I started checking the list manually but forgot the mobile sites. I don't know which is faster, a block list or a regex blacklist. An advantage of the blacklist is that you can shorten the list by collapsing duplicate entries that differ only in subdomain. It is a big job to change the list: not adding the (^|\.) and $ parts, but merging the duplicate items that a single regex can capture.
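The conversion described above can be sketched in a few lines. This is a minimal illustration, not the repo's actual tooling; the sample domain list is made up, and it assumes the Pi-hole wildcard form `(^|\.)domain\.tld$` mentioned in this thread:

```python
# Sketch: convert plain blocklist domains into Pi-hole wildcard regexes
# of the form (^|\.)example\.com$ so that all subdomain variants
# (www., m., w1., www8., ...) match as well.
import re

def to_pihole_regex(domain: str) -> str:
    # Escape regex metacharacters in the domain (mainly the dots),
    # then anchor so any subdomain prefix is also matched.
    return r"(^|\.)" + re.escape(domain.strip().lower()) + "$"

# Hypothetical sample input, standing in for the real list.
domains = ["ladies.de", "example.com"]
for d in domains:
    print(to_pihole_regex(d))
```

Note that the pattern is used with search semantics: `(^|\.)ladies\.de$` matches `ladies.de`, `www.ladies.de` and `m.ladies.de`, but not `notladies.de`, because the subdomain boundary must be a literal dot.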
@MdBruin, exactly my thoughts with the regex. At one point I was going down the path of writing a DNS enumerator to query all possible DNS entries of a domain... but it grew very quickly and took forever to run on 2 million domains, so I killed the idea.
@chadmayfield, I can understand that. As a first step I have filtered out all non-existing pages and domains, checking for a 404 or no HTTP response code; it takes ages to check them. Unfortunately I cannot feed in the whole list at once: it gives a buffer overflow on my virtual machine and stops requesting the HTTP codes of the sites. So I'm running 10k at a time, and around 10% no longer exist (after checking 15% of the total list).
I don't know how the script works. Thinking about it, here is a possible solution in a few steps:
Check the default of the domain with a desktop user-agent. The domain is redirected to "https://ladies.de/", so let's check that one:
This is again redirected to "https://www.ladies.de/"; let's check this:
This seems to be the valid domain. Now let's request this domain with a mobile user-agent:
This is now redirected to "https://m.ladies.de/"; let's check this one, too:
This seems to be the default mobile domain. This way you could find out the full URL of the website and the full URL of the mobile domain. But this only works if the server handles the redirections; otherwise it will not work. The first line of the response gives the HTTP status code: 200 is OK, meaning the URL is a valid URL on the server. If it is a redirection (301), the "Location" header refers to the new location. Looks like some parsing is necessary here. Keep in mind that other HTTP status codes are possible, see the list here.
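The parsing step described above (read the status code from the first line, and the Location header on a redirect) can be done with plain string handling. A minimal sketch, assuming the raw header block comes from something like `curl -sI`; the sample headers below are invented for illustration:

```python
# Sketch: extract the HTTP status code and, for redirects, the
# Location target from the raw header block of an HTTP response.

def parse_response_head(raw: str):
    lines = raw.splitlines()
    # The status line looks like "HTTP/1.1 301 Moved Permanently".
    status = int(lines[0].split()[1])
    location = None
    for line in lines[1:]:
        # Split only on the first ":" so URLs in the value stay intact.
        name, _, value = line.partition(":")
        if name.strip().lower() == "location":
            location = value.strip()
    return status, location

head = "HTTP/1.1 301 Moved Permanently\r\nLocation: https://www.ladies.de/\r\n"
print(parse_response_head(head))  # (301, 'https://www.ladies.de/')
```

A 200 response simply comes back as `(200, None)`, which ends the redirect chase.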
By the way, the proposed solution only works when Pi-hole is not active. Otherwise the domains are blocked, of course.
@ThirtySomething You can check the code in the test/check_domains.sh script. I knew about the -A / --user-agent option and it is used in my script for checking mobile domains. It's still a work in progress and not functional; I just found out that bash has an option to create functions, which makes the code more readable and reduces repetition.
@MdBruin the script is a very rough PoC, so if you extend it, great! I abandoned it early on. I had gone down that path at one point and thought it would be good, but it's kind of a rabbit hole. With CDNs and so many different domains that sites and content are served on, I think it would almost be better to use subdomain/DNS enumeration on sites found in the main list to compile a list of the current A/AAAA/CNAME records in each domain's DNS (using something like SubBrute). See something similar by entering a domain here: https://dnsdumpster.com. Rapid7 has a dataset of all forward DNS responses they receive through their Project Sonar. At some point it would be nice to use that dataset to extend my list... but again, I have other personal priorities at the moment and hope to get to it soon. Ultimately a regex for each domain would be the best option.
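The enumeration idea works in two halves: generate candidate hostnames from a wordlist, then try to resolve each one. A toy sketch of that split; the prefix wordlist here is a made-up example (real lists, like SubBrute's, are far larger), and the resolver step needs network access:

```python
# Sketch: wordlist-based subdomain enumeration, split into a pure
# candidate generator and a network resolution check.
import socket

# Hypothetical mini wordlist; real enumeration wordlists are much larger.
COMMON_PREFIXES = ["www", "m", "mobile", "w1", "www8", "cdn"]

def candidates(domain: str, prefixes=COMMON_PREFIXES):
    """Yield the bare domain plus each prefixed variant."""
    yield domain
    for p in prefixes:
        yield f"{p}.{domain}"

def resolves(hostname: str) -> bool:
    # Network step: does the name have any A/AAAA record?
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False

for name in candidates("ladies.de"):
    print(name)
```

Filtering `candidates(...)` through `resolves(...)` would give the live subdomains for one entry of the list; as noted above, doing this across millions of domains is what made the full approach impractically slow.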
Sorry if this is a bit off topic, but I was curious whether using OpenDNS wouldn't be simpler instead?
@etienne1911, it depends on your requirements. I run my Pi-hole with an Unbound and NSD DNS server. @chadmayfield I have tried the regex option. Using a filter I took 100k domains with a single dot in them (like ladies.de) and made a script to add the regex parts (for example (^|\.)ladies\.de$, including the escape backslash for the dots, which gets filtered out by GitHub). It slowed the Pi-hole down somewhat, but mine is running in a virtual machine on a gen 8 Intel NUC i3. After that I tried it on a first-generation Raspberry Pi and the impact was even bigger: too big to be useful, and that's not even the complete list. It looks to me like Pi-hole first takes the subdomain part (www/m/etc.), prepends it to the full web address and then checks it. The complete list is around 19x bigger, so I don't think this is the way to go. We could make the list smaller by filtering the .com, .net, etc. TLDs, but even that won't shrink it much. Beyond that we could write more complex regexes, but that would mean a bigger chance of false positives and the need for a whitelist, which would be large.
Hello again, sorry for the delay. @MdBruin I am not so familiar with bash scripting, so instead I created a small Python script that you can find here. From my point of view there is no need for sophisticated things like checking DNS records. The webmasters want their pages to be reached, so I assume the web servers are configured correctly. This means that if a wrong URL is called, the server sends a redirection to the correct URL. This can be exploited by following the redirections until you don't get another one. This determination can be made once for a desktop browser and once for a browser on a mobile device. From my point of view no more URLs are necessary to block them with Pi-hole. Maybe I see this topic too simply, but on the one hand this is quite pragmatic, and on the other hand it is quite easy to implement. Anything else might be a more reliable solution, but from my point of view it would also be quite complex. What do you think about the matter? Greetings, ThirtySomething
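The redirect-chasing logic described above can be sketched independently of any HTTP library (this is an illustration of the idea, not the linked Python script itself). The `fetch` callable is injected so the core loop can be shown and tested offline; in practice it would wrap urllib.request or curl with a desktop or mobile User-Agent, and the redirect table below just simulates the ladies.de example from this thread:

```python
# Sketch: follow Location redirects until the server answers with a
# non-redirect status, then report the final URL and status code.

def final_url(url, fetch, max_hops=10):
    """fetch(url) -> (status_code, location_or_None)."""
    for _ in range(max_hops):
        status, location = fetch(url)
        if status in (301, 302, 307, 308) and location:
            url = location  # keep chasing the redirect chain
        else:
            return url, status
    raise RuntimeError("too many redirects")

# Simulated server behaviour for the ladies.de example in this thread.
table = {
    "http://ladies.de/": (301, "https://ladies.de/"),
    "https://ladies.de/": (301, "https://www.ladies.de/"),
    "https://www.ladies.de/": (200, None),
}
fake_fetch = lambda u: table.get(u, (404, None))
print(final_url("http://ladies.de/", fake_fetch))
# ('https://www.ladies.de/', 200)
```

Running it twice, once with a desktop and once with a mobile User-Agent in the real `fetch`, would yield the two canonical URLs to block, exactly as the comment proposes. The `max_hops` cap guards against redirect loops.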
Hi @chadmayfield, I've found an interesting post on this topic here. Regards, ThirtySomething
First of all: Thank you for your list. Makes my network much cleaner than without. Very good work!
I have found that the domain "ladies.de" is always blocked, but "m.ladies.de" is never blocked. Is there a way to block all subdomains of a domain?
I am not sure whether this is a mistake in how Pi-hole interprets your list or an error in the list itself. If you look for all possible subdomains, the runtime of your script will grow enormously. I also don't know if this is an easy task to solve.