┌────────────────────────────────┐
│ eMail address parser/validator │
└────────────────────────────────┘

The purpose of this library is primarily to enable users to verify
eMail addresses, which cannot be done with a regular expression.
More eMail-related parsers, validators, emitters, etc. may follow
later.

Validation of IP addresses (IPv6, Legacy IPv4) and FQDNs (hostnames,
domains) is provided as well.

Release note
────────────

This early release of the library was made under a tight deadline.
As such, it only parses address/mailbox lists and their dependents
and can validate, using eMail rules, localparts and domains (IP
address or FQDN, also usable separately).

Installation
────────────

Add a suitable dependency to your project, for example with Maven:

<dependency>
	<groupId>org.evolvis.tartools</groupId>
	<artifactId>rfc822</artifactId>
	<version>0.8</version>
</dependency>

Or download releases manually from Maven Central:
	https://repo1.maven.org/maven2/org/evolvis/tartools/rfc822/

Building the library yourself is also possible, of course. The
source code may be retrieved (via anonymous read-only git access) or
inspected (per gitweb) at the Evolvis repository; a mirror on another
platform is also available: https://github.com/qvest-digital/rfc822

Make sure your project can handle Java 8 bytecode.

Usage
─────

To validate eMail addresses, first get a parser instance:

	final String address = "user <localpart@domain>";
	final Path p = Path.of(address);

If address is null or too long (we’re generous here), this
returns null. Otherwise, call the parser object:

	import lombok.val;

	val mailbox = p.forSender(false); // Path.Address
	val addrObj = p.forSender(true);  // Path.Address
	val mbxList = p.asMailboxList();  // Path.AddressList
	val adrList = p.asAddressList();  // Path.AddressList
	val adrspec = p.asAddrSpec();     // Path.AddrSpec

(Using “val” means the protected type of the result can be stored.
This is not necessary, though: the return values all implement the
(public) Path.ParserResult interface, which works as well.)

The return value is null if the address cannot be parsed. The
first thing to do now (with a To: header, for example) is to weed
out parsable but invalid input (e.g. bad domain or IP, or too long):

	if (!adrList.isValid()) {
		LOG.error("invalid recipients: {}", adrList.invalidsToString());
		return null;
	}

Afterwards, decide what you wish to do with the result. Mostly,
you’ll intend to send out eMails, which needs just addresses:

	return adrList.flattenAddrSpecs();  // List<String>

It is also possible to validate hostnames or domains…

	final String hostname = "foo.example.com";
	final boolean ok = FQDN.isDomain(hostname);

… and IP addresses, both IP and Legacy IP (a.k.a. IPv4):

	final String ip = "2001:db8::1";
	final InetAddress ia = IPAddress.v6(ip); // or .v4(ip)

… checking both kinds of IPAddress in one go:

	final InetAddress ia = IPAddress.from(ip);

In all cases ia will not be null for valid addresses.

There are other useful methods on the resulting object; toString()
especially and introspection of the various parts of a mail path.

CLI (command-line interface) utility
────────────────────────────────────

This project ships an executable JAR, offering access to most
validation functionality from the command line. (The run.sh
launcher can be extracted with jar or unzip, then placed in the
same directory as the JAR. It can be used from PATH.) The
downloads available from the Maven Central repo at
https://repo1.maven.org/maven2/org/evolvis/tartools/rfc822/
only offer the .jar file, however, so it may be simpler to run
that directly.

Usage:

$ java -jar rfc822-0.8.jar [-lax] [--] [input …]
or
$ ./run.sh [-lax] [--] [input …]

This will show (in a colourful format intended for human
consumption only!) for each argument which productions match it.
This command exits with errorlevel 40. A double-dash argument
separator before the first input argument is mandatory if the
first input begins with a hyphen-minus. If no “input” is present,
an interactive REPL is started, prompting if stdin isn’t
redirected (d̲o̲n̲’̲t̲ redirect stdout/stderr!), exiting on end of
input (^D on Unix, ^Z on DOS).

The -lax option enables the “UX” (more user-friendly) parser,
which allows some more constructs on input but converts those
to their respective standard forms on output, for all parsing
regarding eMail addresses (Path). In lax mode, all inputs are
additionally whitespace-trimmed on both sides (no matter what
parse mode is selected).

$ java -jar rfc822-0.8.jar -h
or
$ ./run.sh -h

Shows short usage info on stderr and exits nonzero.

$ java -jar rfc822-0.8.jar [-lax] -TYPE [--] input
or
$ ./run.sh [-lax] -TYPE [--] input

Validates the one given “input” for the specified TYPE.
Possible TYPEs are:
-addrspec	(e.g. foo@example.com)
-mailbox	(e.g. Foo Bar <foo@example.com>)
-address	(mailbox or group, e.g. Label:foo@example.com,bar@example.com;)
-mailboxlist	mailbox ["," mailbox]*
-addresslist	address ["," address]*
-domain		FQDN (hostname or domain name)
-ipv4		Legacy IP address, dotted-quad, e.g. 192.0.2.1
-ipv6		IP address (without scope), e.g. 2001:db8::1

If input was valid, exits with errorlevel 0; no other codepath
(other than -extract if all inputs were valid) exits 0 so this
is a reliable validator. Additionally, on success a canonical
representation of the input, followed by a newline, will be
written to stdout. (It is only somewhat canonical, subject to
JRE details; for example, a v4-mapped IPv6 address is rendered
as IPv4 address instead.)

If input was invalid this exits with errorlevel 43 for domain,
ipv4, ipv6; with errorlevel 41 (cannot be parsed) or 42 (fails
post-parsing validation) for the others (eMail-related types);
nothing is printed on standard output in this case.
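
For scripted use from Java, such a call and its errorlevels could
be wrapped roughly as follows. This is only a sketch and not part
of the library: the class ValidatorCall, its methods and the JAR
location are made up for illustration, and java is assumed to be
on PATH. Only the option order and the errorlevels 0, 41, 42 and
43 are taken from the description above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

/* illustrative sketch only; ValidatorCall is not part of rfc822 */
final class ValidatorCall {
	/* assumed location of the downloaded JAR, adjust as needed */
	static final String JAR = "rfc822-0.8.jar";

	/* builds: java -jar JAR [-lax] -TYPE -- input */
	static List<String> command(final boolean lax, final String type,
	    final String input) {
		final List<String> argv = new ArrayList<>();
		argv.add("java");
		argv.add("-jar");
		argv.add(JAR);
		if (lax) {
			argv.add("-lax");
		}
		argv.add("-" + type);
		/* always pass the separator so input may begin with '-' */
		argv.add("--");
		argv.add(input);
		return argv;
	}

	/* maps the errorlevels documented above to a description */
	static String describe(final int errorlevel) {
		switch (errorlevel) {
		case 0:
			return "valid";
		case 41:
			return "cannot be parsed";
		case 42:
			return "fails post-parsing validation";
		case 43:
			return "invalid domain or IP address";
		default:
			return "unexpected errorlevel " + errorlevel;
		}
	}

	/*
	 * runs the command, appends the first stdout line (if any) to
	 * canonical and returns the errorlevel; stderr is passed through;
	 * the platform default charset is assumed for stdout
	 */
	static int run(final List<String> argv, final StringBuilder canonical)
	    throws IOException, InterruptedException {
		final Process proc = new ProcessBuilder(argv)
		    .redirectError(ProcessBuilder.Redirect.INHERIT).start();
		try (BufferedReader r = new BufferedReader(
		    new InputStreamReader(proc.getInputStream()))) {
			final String line = r.readLine();
			if (line != null) {
				canonical.append(line);
			}
		}
		return proc.waitFor();
	}

	/* only prints the command line; call run() once the JAR is in place */
	public static void main(final String[] args) {
		System.out.println(String.join(" ",
		    command(false, "addrspec", "foo@example.com")));
	}
}
```

Passing the double-dash separator unconditionally keeps the caller
from having to inspect the input for a leading hyphen-minus.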

$ java -jar rfc822-0.8.jar [-lax] -extract [--] input …

Validates each given “input” item as address-list. Diagnostics
will be shown on stderr for invalid inputs. The exit status is
45 if no valid inputs were provided; 0 if all inputs given were
valid; 44 otherwise (both invalid and valid inputs presented).

Additionally, one corresponding line will be written to stdout
for every input: an empty line if invalid; the addr-spec items
present in the input, separated by comma and space, otherwise.
This facilitates matching between input and output and sending
eMails to recipients based on header values. This allows users
to check either stdout, stderr or the errorlevel independently
from each other, handling the result in various automated ways
or by manual stderr inspection (characters not printable ASCII
are backslash-escaped).
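
The line-per-input format described above could be consumed from
Java roughly like this. Again a sketch only: the class
ExtractOutput and its methods are hypothetical, and the sample
lines in main are made up. It assumes no addr-spec contains a
comma followed by a space itself, which may not hold for quoted
localparts.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/* illustrative sketch only; ExtractOutput is not part of rfc822 */
final class ExtractOutput {
	/*
	 * splits one stdout line into its addr-spec items; an empty
	 * line (input was invalid) yields an empty list
	 */
	static List<String> addrSpecs(final String line) {
		if (line.isEmpty()) {
			return Collections.emptyList();
		}
		/* assumption: ", " never occurs inside an addr-spec */
		return Arrays.asList(line.split(", "));
	}

	/* maps the -extract errorlevels documented above */
	static String describe(final int errorlevel) {
		switch (errorlevel) {
		case 0:
			return "all inputs valid";
		case 44:
			return "both valid and invalid inputs";
		case 45:
			return "no valid inputs";
		default:
			return "unexpected errorlevel " + errorlevel;
		}
	}

	public static void main(final String[] args) {
		/* made-up stdout for three inputs, the second one invalid */
		final List<String> stdout = Arrays.asList(
		    "foo@example.com, bar@example.com", "", "baz@example.com");
		for (int i = 0; i < stdout.size(); ++i) {
			System.out.println("input " + i + ": " +
			    addrSpecs(stdout.get(i)));
		}
	}
}
```

Because every input yields exactly one line, the index of a line
on stdout identifies the input it belongs to.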

Limitations
───────────

This API checks for both RFC5322 and RFC5321 compliance, with
several other standards (DNS) influencing the rules. It’s intended
to be used by eMail senders towards the public Internet, which
restricts, besides the allowed lengths and characters, these
features:

• IPv6 Zone identifiers (ff02::1%vr0) aren’t valid as they
  are strictly host-specific

• General-address-literal isn’t permitted, as (other than
  IPv6 addresses) there currently isn’t any usable tag

• display-name must follow valid ASCII dot-atom or quoted-string
  (but no tabs) syntax

Future directions
─────────────────

Pass more of the structure up (comments, for example).

Improve UXAddress (the “DWIM mode” Path parser) to handle more
input varieties users will enter into webforms such as:

	Es „ß“ Zett <sz@example.com>; Shaun D. Mäh <sheep@example.com>
	James T. Kirk, III. <jim@example.com>

Currently, the following deviations are accepted:
• <hal@ai.> (trailing dot)
• use of semicolon (‘;’) as mailbox-list separator

The following further ones are planned:
• accept and encode extra punctuation for display names (jim)
• accept, but drop at first, nōn-ASCII display names
• later MIME-encode them
• new “local uses allowed” mode that supports link-local IPs

For more though…
Sorry, no time…

More patches, improvements, tests, etc. are of course welcome!
