GitHub - arpa2/krsd: a remotestorage server for POSIX systems

krsd - A Kerberised RemoteStorage server implementation
=======================================================

Contents:

1. Introduction
2. Overview
2.1 remotestorage
2.2 webfinger
2.3 authorization
2.4 storage system
3. Installing
3.1 Dependencies
3.2 Getting the code
3.3 Building
3.4 Installing system-wide
3.5 Setting options
3.6 Integrating authorization
4. Contributing

1) Introduction
---------------

remotestorage is an open specification for personal data storage. It is intended
to replace the currently popular proprietary "cloud storage" protocols with an
open standard, thereby promoting the separation of applications and their
data on the web.

For more information, check out these links:
* http://remotestorage.io/ - Information about the remotestorage protocol
and current implementations.
* http://unhosted.org/ - Philosophy, hands-on Tutorials and App collection.

2) Overview
-----------

krsd brings two things:
* an HTTP endpoint implementing remotestorage: /storage/{user}
* an HTTP endpoint implementing webfinger: /.well-known/webfinger

User management is based on Kerberos identities. The SPNEGO mechanism
specified in RFC 4559 is used to this end. Briefly put, this is a GSS-API
exchange embedded as base64 in WWW-Authenticate and Authorization headers
of the Negotiate type.
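
As an illustration of the header format only (not of krsd's C code), the
following Python sketch extracts the base64-encoded GSS-API token from an
Authorization header value and builds the matching WWW-Authenticate value.
The function names are hypothetical. The actual GSS-API exchange, which
needs a Kerberos library, is left out.

```python
import base64
import binascii
from typing import Optional


def extract_negotiate_token(authorization: str) -> Optional[bytes]:
    """Return the raw GSS-API token from an Authorization header value.

    Returns None when the header is not of the Negotiate type or when
    its payload is not valid base64.
    """
    scheme, _, payload = authorization.strip().partition(" ")
    payload = payload.strip()
    # HTTP authentication scheme names are compared case-insensitively.
    if scheme.lower() != "negotiate" or not payload:
        return None
    try:
        return base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError):
        return None


def negotiate_challenge(token: Optional[bytes] = None) -> str:
    """Build a WWW-Authenticate header value of the Negotiate type.

    Without a token this is the initial challenge; with a token it
    carries the server's reply in the GSS-API exchange.
    """
    if token is None:
        return "Negotiate"
    return "Negotiate " + base64.b64encode(token).decode("ascii")
```

A server would answer a request that lacks a usable token with status 401
and the initial challenge, then feed each extracted token into its GSS-API
context until the exchange completes.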

User accounts are unrelated to accounts on the server hosting this service.
Storage is both available to users and to services, and each is identified
with their usual principal names, including forms like these:

xmpp/xmpp.arpa2.net@ARPA2.NET
john@ARPA2.NET
john/admin@ARPA2.NET

These forms are translated to paths on the local filesystem as described
below, under "Storage system".

krsd is a fork of the useful work on rs-serve, and the respective
locations of the original and derived work are:
https://github.com/remotestorage/rs-serve
https://github.com/arpa2/krsd
The name has been changed to avoid confusion with application users. It is
not clear to date if this project will live its own life or get integrated
with the original branch. In the first case, a separate name brings
clarity; in the second, it has been harmless.

krsd is entirely written in C, using mostly POSIX library functions. It
relies on a few portable libraries, see the list under "Dependencies" below.
It does, however, currently use the signalfd() system call, which is only
available on Linux. This is a solvable problem: if you want to run krsd on
another system, please open an issue to ask for help.

2.1) remotestorage
------------------

The currently implemented protocol version is "draft-dejong-remotestorage-01". It has been modified to permit implied, that is HTTP-level, authentication and authorisation. Demo code is available in a separate directory.

Currently the following features are supported:
* CORS support for all verbs (TODO: May not work at present)
* GET, PUT, DELETE requests on files and folders
* Opaque version strings (in directory listings and "ETag" header)
* Conditional GET, PUT and DELETE requests ("If-Match", "If-None-Match" headers)
* Protection of all non-public paths via authentication by the browser at the HTTP level, and authorisation based on a .k5remotestorage file in the destination file system
* Special handling of public paths (i.e. those starting with /public/), such that
requests on non-directory paths succeed without authorization.
* HEAD requests on files and folders with "Content-Length" header
(not part of remotestorage-01, only enabled when --experimental flag is given)
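
The conditional requests in the list above can be sketched as follows. This
Python fragment follows generic HTTP semantics for "If-Match" and
"If-None-Match"; it has not been checked against krsd's C implementation,
and the function name is hypothetical. Version strings are treated as opaque
and compared literally. Splitting the header on commas is a simplification
that assumes version strings contain no commas.

```python
from typing import List, Optional


def _parse_version_list(value: str) -> List[str]:
    """Split a comma-separated header value into its version strings."""
    return [item.strip() for item in value.split(",") if item.strip()]


def precondition_status(method: str,
                        current_version: Optional[str],
                        if_match: Optional[str] = None,
                        if_none_match: Optional[str] = None) -> Optional[int]:
    """Return the HTTP status for a failed precondition, or None to proceed.

    current_version is the stored version string (as sent in the ETag
    header), or None when nothing exists at the requested path.
    """
    if if_match is not None:
        wanted = _parse_version_list(if_match)
        exists = current_version is not None
        if not exists or ("*" not in wanted and current_version not in wanted):
            return 412  # Precondition Failed
    if if_none_match is not None:
        unwanted = _parse_version_list(if_none_match)
        if current_version is not None and (
                "*" in unwanted or current_version in unwanted):
            # Reads report "not modified"; writes are refused.
            return 304 if method in ("GET", "HEAD") else 412
    return None
```

For example, a PUT with "If-None-Match: *" proceeds only when nothing is
stored at the path yet, which gives a client a safe "create, but do not
overwrite" operation.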

2.2) webfinger
--------------

The webfinger implementation only serves information about remotestorage
and is currently not extensible.
The hostname part of user addresses is expected to be the hostname set for
the krsd instance. This currently defaults to "local.dev" and can be
overridden with the --hostname option.
Virtual hosting (== hosting storage for multiple domains from a single
instance) is currently not supported.

The pathname returned permits krsd to parse out two components of
the pathname under which it stores data: /storage/domain.tld/user/ should be at
the beginning of the URI if it is to be accepted as a remoteStorage URI
on krsd. Anything after this is taken as a path into the storage
structures used.
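
The split described above can be sketched in Python as follows. This is an
illustration under assumptions, not krsd's actual parser: the names
StorageTarget and parse_storage_path are hypothetical, and the user is
assumed to occupy exactly one path segment. How multi-level principals such
as john/admin appear in the URI is not covered here.

```python
from typing import NamedTuple, Optional


class StorageTarget(NamedTuple):
    domain: str  # e.g. "arpa2.net"
    user: str    # e.g. "john"
    rest: str    # path into the storage structures, may be empty


def parse_storage_path(path: str) -> Optional[StorageTarget]:
    """Split /storage/domain.tld/user/<rest> into its three components.

    Returns None when the path does not start with the expected prefix
    or when the domain or user component is missing.
    """
    prefix = "/storage/"
    if not path.startswith(prefix):
        return None
    parts = path[len(prefix):].split("/", 2)
    # Three parts are required so that the user is followed by a slash.
    if len(parts) < 3 or not parts[0] or not parts[1]:
        return None
    domain, user, rest = parts
    return StorageTarget(domain, user, rest)
```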

2.3) authorization
------------------

This version of rs-serve cannot employ the original authorization backend
and frontend, because it removes the implicit bearer flow described by
OAuth2, for reasons of security. Instead, it relies on the implicit and
time-constrained access provided by Kerberos.

It might have been possible to rely on a bearer token obtained from a
Kerberos-specific OAuth2 node, which would have required minimal or no changes
to rs-serve. The downside is that the intermediate form of the bearer
token provides access for all time, whereas Kerberos tickets
passed directly put a time constraint on the exchange. (This point, however,
is negotiable. It seems that OAuth2 permits it. Encryption to the resource
and information-rich handling in the IDP would be possible, in a site
managed and controlled by the user's side of things. Let's talk?)

Another option might have been to package Kerberos tickets in bearer tokens,
but that would have introduced an extra intermediate party with no added
value but unpacking / repacking the token, so it was deemed less attractive.

The current implementation does not constrain access to particular users,
or groups of users. It is very likely that such a facility will be added
in a future version, possibly based on central management infrastructure.

The --auth-uri option now causes a warning, but is not fatal.

2.4) Storage system
-------------------

The payload data of the remotestorage endpoint is stored on the local filesystem
within the respective user's target directory. Determining this directory is
done by splitting the principal name at the @ sign and reshaping it according
to these steps:

0. Initially, the pathname of a realm is a fixed string that is shared by
all principals.

Example 0a.
A fallback default could be /var/lib/krs/

Example 0b.
When using OpenAFS as a backing store for this service, the starting
point /afs/ may be more useful.

1. Firstly, the realm is removed and translated into its related domain name,
which should be supported in the surrounding infrastructure. In case of
generic hosting, the surrounding infrastructure could be the DNS, with
DNSSEC validated for sites that use it. The result from this mapping is
the DNS domain name, represented as lowercase characters without a
trailing dot.

Example 1a.
ARPA2.NET is set up in the local infrastructure, and its domain name
is found to be arpa2.net -- which is used as a pathname component.

Example 1b.
ARPA2.NET is not set up in local infrastructure, and it is looked up
in DNS under the same domain name, arpa2.net -- and if that site
thought it sufficiently important to sign with DNSSEC, then this
will be validated. What is looked up under the domain name is a
TXT record named _kerberos, holding what should match literally
with the realm name, case included. In case of a mismatch, fail to
help the user. Since the realm is being looked up, rather than a
DNS hostname, there will not be a trace up to parent domains.

We now append the DNS name to the installation directory
as another subdirectory level.

2. Secondly, the local part (before the @ sign) is treated literally as a
path, with one or more levels. Note that this means that a slash is
treated as a directory separator. The last level is, once again, a
directory name.

Example 2a.
john@ARPA2.NET will append john/ to the path found so far.

Example 2b.
john/admin@ARPA2.NET will append john/admin/ to the path found
so far.

Example 2c.
http/chitchat.arpa2.net will append http/chitchat.arpa2.net/ to
the path found so far.

3. Thirdly, a fixed directory name such as remotestorage/ is appended to
the path name. This is done to separate remote storage from local
storage and from other remote storage components, and to make all the
other directories unavailable. This is of no use to local storage, but
it is immensely useful when dealing with storage on an OpenAFS share
that is also employed for other purposes.

The result is a concrete path into a filesystem. A few complete examples may
be helpful at this point.

Example. The user john@ARPA2.NET could be found in DNS zone arpa2.net, and
dependent on local settings his files could end up in mounted locations like:

/afs/arpa2.net/john/remotestorage/...
/var/lib/krs/arpa2.net/john/remotestorage/...

Example. After changing user-ID to john/admin under the same realm, the
files accessible to John are found in places like:

/afs/arpa2.net/john/admin/remotestorage/...
/var/lib/krs/arpa2.net/john/admin/remotestorage/...

Example. When a server is not acting on behalf of a user (through S4U with
Constrained Delegation) but on its own title, it depends on its principal
name. For example, xmpp/xmpp.arpa2.net@ARPA2.NET could be found on:

/afs/arpa2.net/xmpp/xmpp.arpa2.net/remotestorage/...
/var/lib/krs/arpa2.net/xmpp/xmpp.arpa2.net/remotestorage/...
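
The mapping from principal name to directory can be sketched in Python as
follows. This illustrates steps 0 to 3 and is not krsd's C code. The
function names are hypothetical. The default realm lookup simply lowercases
the realm, which stands in for the infrastructure or DNS lookup of step 1.
The rejection of empty, "." and ".." components is an added safety
assumption.

```python
import posixpath
from typing import Callable, Optional


def _lowercase_realm(realm: str) -> str:
    """Assumed stand-in for step 1: derive the domain from the realm name."""
    return realm.lower().rstrip(".")


def principal_to_path(principal: str,
                      base: str = "/var/lib/krs",
                      realm_to_domain: Optional[Callable[[str], str]] = None,
                      leaf: str = "remotestorage") -> str:
    """Map a Kerberos principal name to its storage directory."""
    # Split at the @ sign; the realm is whatever follows the last one.
    local, sep, realm = principal.rpartition("@")
    if not sep or not local or not realm:
        raise ValueError("expected a principal of the form local@REALM")
    # Step 1: translate the realm into a lowercase domain name.
    domain = (realm_to_domain or _lowercase_realm)(realm)
    # Step 2: a slash in the local part is a directory separator.
    components = local.split("/")
    if any(part in ("", ".", "..") for part in components):
        raise ValueError("unsafe component in principal name")
    # Step 0 supplies the base, step 3 appends the fixed directory name.
    return posixpath.join(base, domain, *components, leaf) + "/"
```

With base set to "/afs", the principal xmpp/xmpp.arpa2.net@ARPA2.NET maps
to /afs/arpa2.net/xmpp/xmpp.arpa2.net/remotestorage/, matching the last
example above.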

The filesystem path is configured in the webfinger profile, and may or may
not relate to the Kerberos Principal Name used to access the resource.
For full flexibility, the remotestorage directory should hold a file named
.k5remotestorage which must hold the Kerberos Principal Name on a line of
its own, if it is to have read/write access to anything underneath this
directory.
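
A minimal sketch of the .k5remotestorage check, in Python for brevity. The
function name is hypothetical, and several details are assumptions rather
than documented krsd behaviour: surrounding whitespace on a line is
ignored, the comparison is exact and case-sensitive, and a missing or
unreadable file means that access is denied.

```python
import os


def principal_is_authorized(storage_dir: str,
                            principal: str,
                            filename: str = ".k5remotestorage") -> bool:
    """Check whether principal appears on a line of its own in the file."""
    path = os.path.join(storage_dir, filename)
    try:
        with open(path, encoding="utf-8") as handle:
            return any(line.strip() == principal for line in handle)
    except OSError:
        # Assumption: without a readable file, nobody is granted access.
        return False
```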

The reliance on filesystem paths implies a few noteworthy restrictions:

* The remotestorage endpoint cannot be used to store both a directory and a file
under the same path (ignoring the trailing slash). That means you cannot store
/foo/bar/baz and /foo/bar, but only one of them. This is a natural restriction
of traditional filesystems, that is currently well adhered to by all apps using
remotestorage (as far as I know).

* MIME types may not be exact for files that were added "out-of-band", that is
not added via the remotestorage protocol, but by copying to the remotestorage/
directory by other means. krsd stores MIME type and character encoding
under the "user.mime_type" and "user.charset" extended attributes, given these
are supported by the underlying filesystem. When these attributes aren't set,
a MIME type is guessed using libmagic, which may not always yield desirable
results. (for example an empty file, created using "touch" will be transmitted
via remotestorage with a Content-Type header of "inode/x-empty; charset=binary")
If even libmagic fails to make sense of a file, the Content-Type is set to
"application/octet-stream; charset=binary".

* Filesystem privileges must be set up to grant access to the user. In the
case of OpenAFS, this will be based on the user, and krsd will act
on behalf of the user when storing files. In this case, Constrained
Delegation must be permissive to S4U2Proxy use, and the principal ticket
for the user must probably be created proxiable. The details of this have
not been ironed out yet.
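
The Content-Type fallback chain from the MIME type restriction above can be
sketched in Python as follows. This is an illustration, not krsd's C code.
The function names are hypothetical. The libmagic step is represented by an
optional "guess" callable supplied by the caller, because libmagic is not
part of the Python standard library. Returning the MIME type alone when no
charset attribute is set is an assumption.

```python
import os
from typing import Callable, Optional

FALLBACK_CONTENT_TYPE = "application/octet-stream; charset=binary"


def _read_xattr(path: str, name: str) -> Optional[str]:
    """Return an extended attribute as text, or None if it is unavailable."""
    getxattr = getattr(os, "getxattr", None)  # only present on Linux
    if getxattr is None:
        return None
    try:
        return getxattr(path, name).decode("utf-8", "replace")
    except OSError:
        # Attribute not set, or extended attributes not supported here.
        return None


def content_type_for(path: str,
                     guess: Optional[Callable[[str], Optional[str]]] = None) -> str:
    """Pick a Content-Type: stored attributes, then a guess, then the fallback."""
    mime = _read_xattr(path, "user.mime_type")
    charset = _read_xattr(path, "user.charset")
    if mime and charset:
        return f"{mime}; charset={charset}"
    if mime:
        return mime
    if guess is not None:
        guessed = guess(path)
        if guessed:
            return guessed
    return FALLBACK_CONTENT_TYPE
```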

3) Installing
-------------

These steps should enable you to install krsd.

3.1) Dependencies
-----------------

- GNU make
- pkg-config (or tweak the Makefile)
- gcc
- libc
- libevent (>= 2.0)
- libmagic
- libattr
- BerkeleyDB

On Debian based systems, this should give you all you need:

apt-get install build-essential libevent-dev libmagic-dev libattr1-dev libssl-dev libdb-dev pkg-config

If you want to develop, you may also want debug symbols and valgrind (required by
leakcheck.sh script):

apt-get install libevent-dbg valgrind

3.2) Getting the code
---------------------

Given you are reading this file, you probably have the code already, but just to
be sure:

Currently the krsd code is hosted on GitHub.

You can browse it online, at:

https://github.com/arpa2/krsd

or clone it using git:

git clone https://github.com/arpa2/krsd.git

Note that krsd itself is a clone of rs-serve, with the distinctive
feature that krsd introduces Kerberos authentication.

3.3) Building
-------------

Given you have all dependencies installed, simply run

make

and you should be good to go.

3.4) Installing system-wide
---------------------------

To install the krsd binary to /usr/bin, run

make install

as a privileged user.

To install somewhere else, tweak the Makefile first.

This will also install an init script to /etc/init.d/krsd and a default
configuration to /etc/default/krsd.

On Debian based systems (i.e. when "update-rc.d" is present), "make install"
will also install the krsd init script into /etc/rc*.d/.

3.5) Setting options
--------------------

krsd accepts a variety of options.

If you want to use the init script, you can set options in /etc/default/krsd;
otherwise just pass them on the command line.

Run:

krsd --help

to get a list of supported options.

3.6) Integrating authorization
------------------------------

To integrate an authorization endpoint, you need to do two things:

* configure endpoint URI

Set the --auth-uri option to a printf style format string. "%s" will be
replaced with the username.

* configure your authorization endpoint to manage krsd tokens

krsd doesn't care where tokens come from, but it needs to know them to
decide whether a given request is authorized or not. It maintains an internal
store for authorizations (i.e. structures of [user-name, token, scopes]),
which must be managed from the outside.

The tools to do this are:

* rs-add-token:

Usage: rs-add-token <user> <token> <scope1> [<scope2> ... <scopeN>]

- <user> is the login name of the user (krsd must be able to resolve
it using getpwnam() in order to find the home directory)
- <token> is the token string authenticating future requests. For krsd
it is an opaque string.
- <scope1>..<scopeN> are scope strings in the same form as described in
draft-dejong-remotestorage-01, Section 9.

* rs-remove-token:

Usage: rs-remove-token <user> <token>

<user> and <token> must both be given.
If the token cannot be found, rs-remove-token terminates with non-zero status.

* rs-list-tokens:

Lists all currently installed tokens and their respective scopes.

The output format is primarily meant for (human) debugging and subject to change.
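
To make the semantics of the three tools and of the --auth-uri format string
concrete, here is a small in-memory model in Python. It is an illustration
only: the class and function names are hypothetical, the real store is
internal to krsd, and scopes are treated as opaque strings.

```python
from typing import Dict, List, Tuple


class TokenStore:
    """In-memory model of authorizations: [user-name, token, scopes]."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], List[str]] = {}

    def add(self, user: str, token: str, *scopes: str) -> None:
        """Model of rs-add-token: at least one scope must be given."""
        if not scopes:
            raise ValueError("at least one scope is required")
        self._entries[(user, token)] = list(scopes)

    def remove(self, user: str, token: str) -> int:
        """Model of rs-remove-token: non-zero status if nothing was found."""
        found = self._entries.pop((user, token), None) is not None
        return 0 if found else 1

    def entries(self) -> List[Tuple[str, str, List[str]]]:
        """Model of rs-list-tokens: all tokens with their scopes."""
        return [(user, token, list(scopes))
                for (user, token), scopes in sorted(self._entries.items())]


def expand_auth_uri(template: str, user: str) -> str:
    """Replace the "%s" in a printf-style --auth-uri value with the username."""
    return template % (user,)
```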

4) Contributing
---------------

* Note that krsd is a fork of the original admirable work named rs-serve.
Never blame the rs-serve authors for mistakes that I (Rick) have added.
And don't expect them to support my mistakes!

* If you've found a bug, or have any questions, please open an issue on GitHub:
https://github.com/arpa2/krsd/issues

* If you want to contribute, fork the project on GitHub and send pull requests.

* In any case, don't hesitate to talk with us on IRC:
#remotestorage and #unhosted, both on irc.freenode.org

Webchat links:
- #unhosted: http://webchat.freenode.net/?channels=unhosted
- #remotestorage: http://webchat.freenode.net/?channels=remotestorage