added emails from throwawaymail.com #437

Open
wants to merge 1 commit into
base: master

TravisLittlechilds

@haumacher
Contributor

Hi @TravisLittlechilds, I'm wondering how you found all these domains, since the site throwawaymail.com does not allow the user to select a domain from a list and has strong protection against automated queries. However, when I visited the site, I got an e-mail from the domain "mybx.site", which is not in your list...

@TravisLittlechilds
Author

@haumacher it was an annoying process over a few weeks consisting of many sessions with incognito windows. I didn't detect any kind of pattern in the domains it handed out, so I'm not surprised I didn't manage to find them all.

@haumacher
Contributor

@TravisLittlechilds I'm still wondering (a) how to automatically decide whether a PR to this list is valid, and (b) what a good heuristic would look like for deciding whether a new e-mail domain seen in some user registration form is disposable (given this list of "well-known" disposable domains).

There are several services that claim to decide whether an e-mail domain is disposable. One of them (https://check-mail.org/) seems to take into account the DNS MX record associated with the domain.

This "check" applied to the domains you entered gives the following result:

No MX record:

em4.catchservers.com
mx4.catchservers.net

Mail server 164.90.194.37:

jual.me
seoph.website

All other domains have the mail servers 165.22.201.68 and 137.184.154.224
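The grouping above can be reproduced with a small inverted index from mail-server IPs to domains. This is only a sketch: the `OBSERVED` mapping and the function name are made up for illustration, and the data is transcribed by hand from this comment rather than fetched from DNS. The domains covered by "all other domains" are not named in this thread, so they are omitted.

```python
from collections import defaultdict

# Hand-transcribed observations; an empty tuple means "no MX record".
# Hypothetical structure, not part of the repository.
OBSERVED = {
    "em4.catchservers.com": (),
    "mx4.catchservers.net": (),
    "jual.me": ("164.90.194.37",),
    "seoph.website": ("164.90.194.37",),
}


def group_by_mail_servers(observed):
    """Map each distinct set of mail-server IPs to the domains that use it."""
    groups = defaultdict(list)
    for domain, ips in observed.items():
        # frozenset makes the IP set hashable and order-independent.
        groups[frozenset(ips)].append(domain)
    return {key: sorted(domains) for key, domains in groups.items()}


groups = group_by_mail_servers(OBSERVED)
for ips, domains in groups.items():
    label = ", ".join(sorted(ips)) or "no MX record"
    print(f"{label}: {domains}")
```

Domains that land in the same group share the same set of mail servers, which is the signal discussed in the rest of this thread.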

Maybe the maintainers of this list could give some more insights?

@martenson
Member

to my knowledge a valid MX record is not needed for receiving mail

https://datatracker.ietf.org/doc/html/rfc5321#section-5

we've had some discussions about this topic at #84 and #58

@haumacher
Contributor

OK, not having an MX record means the domain/host is its own mail server. So the absence of an MX record ultimately provides no clue as to whether the domain could be a disposable e-mail domain.

But having an MX record that resolves to the same IP address as that of some other "well-known" disposable domain provides some evidence that this domain is also disposable, right?

@haumacher
Contributor

Or the other way around: if a domain has no MX record, but its resolved IP address points to a mail server used by some other "well-known" disposable domain, that also provides some evidence that the domain is disposable.

In the example above, em4.catchservers.com points to 137.184.154.224, which is the same as one of the mail servers of all other domains. mx4.catchservers.net points to 165.22.201.68, which is the other mail server of all other domains entered in this PR.
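A minimal sketch of this check, assuming the two lookups are passed in as plain functions so the logic can be tried without network access. The names `mail_server_ips`, `shares_mail_server`, `no_mx`, and `stub_addr` are hypothetical; a real version would need a DNS library for the MX query, which is not shown here. The fallback to the domain itself when no MX record exists follows RFC 5321, section 5. `unrelated.example` and its address are invented for contrast.

```python
from typing import Callable, Iterable, Set

Lookup = Callable[[str], Iterable[str]]


def mail_server_ips(domain: str, mx_lookup: Lookup, addr_lookup: Lookup) -> Set[str]:
    """Return the IP addresses that would receive mail for `domain`."""
    # No MX record: the domain itself acts as the mail host (RFC 5321, section 5).
    hosts = list(mx_lookup(domain)) or [domain]
    ips: Set[str] = set()
    for host in hosts:
        ips.update(addr_lookup(host))
    return ips


def shares_mail_server(domain: str, known_ips: Set[str],
                       mx_lookup: Lookup, addr_lookup: Lookup) -> bool:
    """True if `domain` receives mail on an IP already seen for a known disposable domain."""
    return not known_ips.isdisjoint(mail_server_ips(domain, mx_lookup, addr_lookup))


# Stub resolvers built from the addresses quoted in this thread.
ADDRESSES = {
    "em4.catchservers.com": ["137.184.154.224"],
    "mx4.catchservers.net": ["165.22.201.68"],
    "unrelated.example": ["192.0.2.10"],  # invented, documentation address range
}


def no_mx(domain: str) -> Iterable[str]:
    return []  # none of the stubbed domains has an MX record


def stub_addr(host: str) -> Iterable[str]:
    return ADDRESSES.get(host, [])


KNOWN_IPS = {"165.22.201.68", "137.184.154.224"}

print(shares_mail_server("em4.catchservers.com", KNOWN_IPS, no_mx, stub_addr))
print(shares_mail_server("unrelated.example", KNOWN_IPS, no_mx, stub_addr))
```

A match is only evidence, not proof: shared hosting could place unrelated domains on the same address.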

@martenson
Member

@haumacher your reasoning seems plausible to me. However I do not have deep knowledge of the intricacies that come with mailing systems.

Are you proposing to create something like a disposable_ip_blocklist?

@haumacher
Contributor

@martenson No, I don't think a block-list of "disposable" IP addresses would be a good solution, because this data is too volatile. However, a dynamically built classification of mail server IP addresses could help decide whether a newly discovered e-mail domain is likely to be disposable.

Look at the domains added in this PR. Nobody can manually verify that they really belong to the fake-mail service mentioned in the PR. The service does not offer a list of domains a user can select from, and repeatedly querying the service (from the same IP?) requires solving strong CAPTCHAs to get another e-mail address. Therefore, even automating the lookup process as proposed in #450 does not seem to be a feasible solution.

What I'm thinking about is a database of "well-known" fake-mail services and the "well-known" fake-mail domains associated with those services. The question is how to classify a new e-mail domain based on the MX records and IP addresses of the associated mail servers. If a reasonable heuristic can be found, it would be sufficient to manually manage a list of well-known fake-mail providers (web sites) and some examples of their fake-mail domains (as training data), which can easily be collected manually. New domains could then be classified automatically...
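One possible shape for such a classifier, sketched under several assumptions: `EXAMPLES`, `build_ip_index`, `classify`, and the `resolve_ips` stub are all hypothetical names, the attribution of the example domains to throwawaymail.com is the claim made in this PR rather than a verified fact, and the index would have to be rebuilt regularly because the IP data is volatile. `unrelated.example` and its address are invented.

```python
from collections import Counter

# Manually maintained training data: provider -> a few example domains.
EXAMPLES = {
    "throwawaymail.com": ["jual.me", "em4.catchservers.com"],
}

# Stub for a resolver that returns the mail-server IPs of a domain.
# The first three entries use addresses quoted in this thread.
STUB_IPS = {
    "jual.me": {"164.90.194.37"},
    "em4.catchservers.com": {"137.184.154.224"},
    "seoph.website": {"164.90.194.37"},
    "unrelated.example": {"192.0.2.10"},  # invented
}


def resolve_ips(domain):
    return STUB_IPS.get(domain, set())


def build_ip_index(examples, resolve):
    """Build a fresh mapping from mail-server IP to the providers seen using it."""
    index = {}
    for provider, domains in examples.items():
        for domain in domains:
            for ip in resolve(domain):
                index.setdefault(ip, set()).add(provider)
    return index


def classify(domain, index, resolve):
    """Return the provider sharing the most mail-server IPs with `domain`, or None."""
    votes = Counter()
    for ip in resolve(domain):
        for provider in index.get(ip, ()):
            votes[provider] += 1
    return votes.most_common(1)[0][0] if votes else None


index = build_ip_index(EXAMPLES, resolve_ips)
print(classify("seoph.website", index, resolve_ips))
print(classify("unrelated.example", index, resolve_ips))
```

Only the provider list and a handful of example domains are curated by hand; the IP index is derived from them at run time, so nothing IP-specific needs to be committed to the repository.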

@martenson
Member

@haumacher Sounds like a honking good idea to me. Would you care to outline this approach in a new issue?

@haumacher
Contributor

@martenson I opened issue #456 with initial thoughts for such heuristics.
