yam655/gofor

gofor

This is a stupidly tiny, nonconformant (but functional) Gopher server.

Features

  • Requires gophermap files.
  • Refuses to serve documents outside of the root
  • Refuses to serve anything that isn't world-readable
  • Written in Python 3 (Python 3.7 tested)
  • No outside dependencies
  • No needless features
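
As an illustration of the world-readable rule above, here is a minimal sketch of how such a check could look. It is not taken from gofor's source; the function name is_world_readable is hypothetical, and the sketch assumes a POSIX system where the "other" read bit is meaningful.

```python
import os
import stat


def is_world_readable(path):
    """Return True if 'other' has read permission on path.

    Hypothetical helper, not gofor's actual code. os.stat follows
    symlinks, so the permission bits of the link target are checked.
    """
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing or inaccessible paths are treated as not servable.
        return False
    return bool(mode & stat.S_IROTH)
```

A server could call this just before opening a file and answer with a type 3 error line when it returns False.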

Fewer features, fewer things to go wrong

There's no support for directories that don't contain a gophermap file.

What you've put in the gophermap file is all there is. There's no support for inserting the contents of a directory into a gophermap file.

There's no PHP support. No CGI support. No search support.

The code can be easily audited. The code can be easily understood.

Right now, there are also no unit tests, and the code could be cleaner.

Still, the baseline is simple and small and works. It's simple and small enough that it could serve as a test case for iterating on a variety of potential designs.

Simple code that isn't totally stupid

The Gopher protocol prevents . and .. from being a part of valid selectors. But, I'm not writing my own path parsing code.

If it is running on a port above 1024 as a non-root user, chroot isn't available, and I still want to be very sure there is no possible way to accidentally serve documents from outside the document root. Right now, I'm resolving the expected path (following all symlinks, processing '.' and '..', etc.) and then verifying that it is still within my expected document root.

It would be very possible for me to think, "I'm avoiding '..' and I refuse to serve symlinked files or directories" and then someone creates a symlink to / and references /somedir/link-to-root/etc/passwd. somedir is in my document root, and etc is a real directory and not a symlink. No periods are found in the selector at all!
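
The resolve-then-verify approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not gofor's actual code: the function name resolve_selector and its arguments are hypothetical, and it relies only on os.path.realpath and os.path.commonpath from the standard library.

```python
import os


def resolve_selector(doc_root, selector):
    """Return the real path for a selector, or None if it escapes doc_root.

    Hypothetical helper illustrating the check; not gofor's actual code.
    """
    root = os.path.realpath(doc_root)
    # Treat the selector as relative to the root, even if it starts with '/'.
    candidate = os.path.realpath(os.path.join(root, selector.lstrip("/")))
    # realpath has already followed symlinks and collapsed '.' and '..'.
    # commonpath compares whole path components, so '/srv/gopher-evil'
    # is not mistaken for something inside '/srv/gopher'.
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate
```

With this shape, the /somedir/link-to-root/etc/passwd selector resolves to /etc/passwd, which shares no common path with the document root, so the lookup is refused even though the selector contains no periods.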

I can, however, reliably check for and fail all references to .. if I'm currently running chrooted and I can trust that symlinks that point out of my jail will fail. We can't chroot as a normal user, but then we can't listen on port 70 as a normal user either. So, chroot is a valid additional safety measure, but it isn't enabled by default.

I would like to drop privileges. The current Python 3 asyncio logic doesn't make it easy to drop privileges after a privileged port has been grabbed.

About gophermap files

Each line of a gophermap file is a gopher-type character followed by a TAB-separated list of columns.

The gopher-type indicates what sort of resource it is:

  • 0: plain-text file (US-ASCII encoding)
  • 1: directory (which should have a gophermap in it)
  • 3: Error code (mostly generated by system)
  • 9: binary file
  • h: HTML file or HTML-style URL
  • g: GIF image file (Can also use I, :, or 9.)
  • I: Other image file (Can also use : or 9.)
  • s: Sound or other audio file (Can also use < or 9.)
  • i: info-line (normally implied -- details later)

Less common, but still potentially useful:

  • 8: Telnet (good for BBSes, MUDs, etc.)
  • T: Interactive 3270 emulation sessions (valid, but rare)
  • +: Redundant/Mirror server
  • 7: Gopher full-text search (unsupported by gofor)

Other types include:

  • 4: BinHex encoded file (Text-encoding of binary data? Do not use.)
  • 5: DOS binary archive (Use 9 instead.)
  • 6: Uuencoded file (Text-encoding of binary data? Do not use.)
  • c: Some sort of Calendar (Use 9 instead.)
  • e: Some sort of Event (Use 9 instead.)
  • M: MIME multipart/mixed (Text-encoding of binary data? Do not use.)
  • :: Gopher+ any image (Use I instead.)
  • <: Gopher+ any sound or audio (Use s instead.)
  • ;: Gopher+ any movie (Use 9 instead.)
  • d: Binary document (Use 9 instead.)

Possibly unexpected results:

  • -: do not list entry (Not supported by gofor and will be sent to client.)
  • #: internal comment (Not supported by gofor and will be sent to client.)
  • !: page title (Not supported by gofor and will be sent to client.)

Adding missing columns

Each line of a gophermap file consists of up to four TAB-separated columns. gofor fills in whichever columns are missing.

If no TAB is present, it is treated as an informational line. The 'i' character is prepended automatically, and stub values are filled in for the selector, host and port.

If there are two columns, it is treated as a Gopher link to the current Gopherhole. This adds your current server and current port to the missing columns.

If there are three columns, it is expected that you're referencing a Gopherhole other than your own. Regardless of what port your own Gopherhole is listening on, this always adds the standard Gopher port, 70.

If there are four columns, it is passed as-is.

If there are more than four columns, the remaining columns are dropped. This is required to be compatible with Gopher+. If gofor adds some Gopher+ features later, additional columns may be supported.
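
The column rules above can be sketched as a small function. This is an illustration only, not gofor's implementation: the name expand_gophermap_line is hypothetical, and the stub values used for informational lines are assumptions, since the text above does not say which stubs gofor fills in.

```python
# Assumed placeholder values for informational lines; gofor's real
# stub values are not documented above and may differ.
STUB_SELECTOR = "fake"
STUB_HOST = "(NULL)"
STUB_PORT = "0"


def expand_gophermap_line(line, host, port):
    """Fill in the missing columns of one gophermap line (no newline)."""
    if "\t" not in line:
        # No TAB: informational line, 'i' is prepended automatically.
        return "\t".join(["i" + line, STUB_SELECTOR, STUB_HOST, STUB_PORT])
    cols = line.split("\t")[:4]  # columns past the fourth are dropped
    if len(cols) == 2:
        # Link within the current Gopherhole: add our own host and port.
        cols += [host, str(port)]
    elif len(cols) == 3:
        # Another Gopherhole: always add the standard Gopher port.
        cols.append("70")
    return "\t".join(cols)
```

For example, with a server at example.org on port 7070 (both made up for this sketch), the line "1Docs<TAB>/docs" gains the columns example.org and 7070, while "1Elsewhere<TAB>/<TAB>gopher.example.net" gains only the port 70.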

Other gopher-type characters

You probably see all manner of other gopher types that vary by which Gopher server you're looking at. Here's the thing, though: the client is responsible for handling the gopher-type, not the server.

Is a server listing movies with the Gopher+ ; gopher-type? This means that line will entirely disappear for some clients, when you could have just offered it via the standard 9 binary type. (The whole point of the extended types is so that lines can be dropped from display to users when gopher-types are unsupported.)

Is a server returning all archives with the 5 "DOS binary" gopher-type? Any gopher-client that expects those to be DOS-specific will drop them from display.

gofor is stupidly simple. Those gopher-type characters only matter to servers when they create the gophermaps that they send to clients. Since gofor doesn't support that, you can use whatever new or unusual gopher-type characters you want.

It's the clients that care about the gopher-type characters. They're the ones that need to handle the new and nonstandard gopher-types that you may be using. Personally, I don't have the hardware available to test all of the possible clients, so I keep my own gopherhole strictly standard.

Usage

usage: gofor [-h] [--fqdn FQDN] [--port PORT] [--root ROOT] [--ipv4]
             [--verbose] [--version] [--chroot]

gofor: simple gopher server

optional arguments:
  -h, --help            show this help message and exit
  --fqdn FQDN, -f FQDN  Fully qualified domain name clients should use.
  --port PORT, -p PORT  The port to listen to.
  --root ROOT, -r ROOT  The document root to serve from.
  --ipv4, -4            Bind to 0.0.0.0 instead of ::
  --verbose, -v         Be more verbose.
  --version             show program's version number and exit
  --chroot              chroot in to the document root

Did I say nonconformant?

This gopher server violates the spec.

Gopher is supposed to serve files terminated by a '.' on a line by itself. It's supposed to avoid this for binary files (of course) but everything else is subject to this.

However, clients don't tell the server whether they expect a binary or a text file. All incoming requests look the same. It's just a selector. The gopher-type is only known by the client at selection time.

What qualifies as a text file? What qualifies as a binary file? According to RFC 1436, only 5 (DOS binary archives) and 9 (binary files) are immune from the period requirement. While there weren't that many types of media available, there was g for GIF files and I for arbitrary types of images. According to the spec, those images should be terminated with periods on lines of their own.

How can I serve documents properly when I don't know what the client is expecting? Some Gopher servers track state and remember the expected types of files. But, with the clients I've tested, the only time the terminal period was needed was when dealing with directories.

If clients only need the terminal period in that one case, then I just only send the terminal period in that one case.
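
That rule can be sketched in a few lines. This is a hedged illustration rather than gofor's code: the function name build_response and the is_gophermap flag are hypothetical, and the sketch assumes the server already knows whether the selector resolved to a directory's gophermap.

```python
def build_response(payload, is_gophermap):
    """Return the bytes to send before closing the connection.

    Hypothetical helper, not gofor's actual code.
    """
    if not is_gophermap:
        # Files go out unmodified; closing the connection marks the end.
        return payload
    # Directory listings get the terminal period on a line by itself.
    if payload and not payload.endswith(b"\r\n"):
        payload += b"\r\n"
    return payload + b".\r\n"
```

Because files are passed through untouched, binary data is never corrupted by an appended period, whatever gopher-type the client had in mind.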

Gopher+ got rid of the requirement for the terminal period. Instead, a Gopher+ response can state the amount of data, '-1' for the old period terminator, or '-2' to terminate on connection close. It also clarifies that the terminator is <CRLF>.<CRLF>.

The only way binary images (as opposed to text-based images like PNM) could have worked reliably with Gopher-1 would be if servers violated the spec and ignored the terminal period.

Also, consider that Gopher+ added new gopher types for images, audio, and movies. Gopher+ is supposed to be compatible with Gopher-1 clients. It should be possible to request those resources with an older client. The only way those files can be served is without a terminal period. The only way older clients could access those resources without choking is if clients were expected to treat connection termination as the preferred way to end potentially unknown types of files.

In practice, I don't think any functioning Gopher-1 client could have been relying on the terminal period for anything outside of the gophermaps. It's possible that something accessed as plain-text needed it, but the protocol didn't really support reusable connections. If the server closes the connection, it's pretty easy to treat it as the end of the file.
