DAVIM

An Instant Message System Based on the WebDAV standard


Introduction

DAVIM is an open source project aimed at creating a distributed Instant Message system utilizing web servers with WebDAV capability. The project is currently in the coding stage.

WebDAV stands for "Web-based Distributed Authoring and Versioning." It is an extension to the HTTP/1.1 protocol that establishes a standardized way to write information to a web server. It adds functionality such as directory creation and file locking, which together provide the basis for turning web servers into Internet file servers that are accessible worldwide. For more information about WebDAV, visit www.webdav.org.

We can implement a simple yet powerful messaging system on WebDAV-enabled servers by putting in place a proper access model and adopting a namespace convention for the placement of messages on the server (i.e. which file goes into which directory).

DAVIM schematic diagram
Figure 1 - Basic DAVIM scheme

In the example above, Jane and her fans have agreed that messages for Jane should be placed in the directory /davim/jane/inbox/. Jane's fans will have write-only access. When one of them wishes to send her a message, he uploads it to /davim/jane/inbox/. Jane herself will have read-write access. After she has downloaded the messages in her inbox, she can delete them.

There can potentially be two types of DAVIM systems: closed and open.

Scenario - Closed System

Let us examine the dynamics of message delivery in a closed system. Both Jack and Jane are valid users on foo.com. We will assume that the server is running Apache with mod_dav.

Closed system diagram
Figure 2 - Message delivery in a closed system

(1) Jack connects to foo.com through HTTP. Apache authenticates Jack as a valid user and grants him permission to write to Jane's inbox. Jack creates the sub-directory 192.168.1.1.2000.04.01.12.01.06.000/ (IP address + timestamp) in /davim/jane/inbox/ - an envelope for his message - using the MKCOL method. Jack then locks the sub-directory with the LOCK method. He uploads the file message.en.html there with the PUT method. The envelope remains locked until Jane performs a lock-discovery on her inbox and unlocks the sub-directory with the discovered lock-token.

(2) Using an optional notification mechanism, the server notifies Jane immediately about the new message's arrival. When this mechanism is not available (because of a firewall, for example), the client software can poll the server periodically for new messages.

(3) Jane connects to foo.com through HTTP. Apache authenticates Jane and grants her read-write access to /davim/jane/inbox/. A scan of the directory using the PROPFIND method reveals the new sub-directory. Jane downloads everything in it using the GET method, then deletes the sub-directory with the DELETE method. She also has the option of keeping the message on the server.

Because foo.com is a closed system, Jane can be reasonably confident about the authenticity of the messages she receives. The identities of the senders are always known to the server. A forged message could be traced to the offender through the server log.
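
The request sequence in steps (1) and (3) can be sketched as plain data. This is only an illustration under stated assumptions: the function names sender_plan and recipient_plan are hypothetical, and a real client would also need to send a lock request body with LOCK, present the returned lock-token where the server requires it, and handle authentication and errors.

```python
def sender_plan(recipient, envelope, filename="message.en.html"):
    """Return the (method, path) pairs a sender issues, in order (step 1)."""
    envelope_path = f"/davim/{recipient}/inbox/{envelope}/"
    return [
        ("MKCOL", envelope_path),           # create the envelope sub-directory
        ("LOCK", envelope_path),            # lock the envelope
        ("PUT", envelope_path + filename),  # upload the message file
    ]


def recipient_plan(recipient, envelope, filename="message.en.html", keep=False):
    """Return the (method, path) pairs the recipient issues, in order (step 3)."""
    inbox = f"/davim/{recipient}/inbox/"
    envelope_path = f"{inbox}{envelope}/"
    plan = [
        ("PROPFIND", inbox),                # scan the inbox for envelopes
        ("GET", envelope_path + filename),  # download the message
    ]
    if not keep:
        # Step (1) says the envelope stays locked until the recipient unlocks it,
        # so UNLOCK is placed before DELETE here (an assumption about ordering).
        plan.append(("UNLOCK", envelope_path))
        plan.append(("DELETE", envelope_path))
    return plan


if __name__ == "__main__":
    name = "192.168.1.1.2000.04.01.12.01.06.000"
    for method, path in sender_plan("jane", name) + recipient_plan("jane", name):
        print(method, path)
```

Each pair could then be handed to an HTTP client such as Python's http.client, whose HTTPConnection.request takes the method name as a string.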

Scenario - Open System

Let us now examine the dynamics of message delivery in an open system, where users can receive messages from anyone on the Internet. Jack is a user on foo.com, while Jill is a user on bar.com. We will assume that both servers are running Apache with mod_dav.

Open system diagram
Figure 3 - Message delivery in an open system

(1) Jack connects to bar.com through HTTP. First, he downloads Jill's public-key for encrypting the message, since the message will travel across a public network. Then he tries to write to Jill's inbox. Apache does not know who Jack is, but since his IP address is not on the blacklist, he is granted access. Jack creates the envelope sub-directory 192.168.1.1.2000.04.01.12.05.32.000 in /davim/jill/inbox/, and locks the envelope, citing his address jack@foo.com as the owner. He uploads the file message.en.html.dhb, encrypted using his private-key and Jill's public-key. The message contains images, so Jack creates the sub-sub-directory image/ and puts all the encrypted .gif.dhb or .jpg.dhb files there.

(2) The server notifies Jill of the new message.

(3) Jill connects to bar.com. Apache authenticates her and gives her read-write access to her inbox. She downloads all the files in the recently created sub-directory.

(4) Jill connects to foo.com to retrieve Jack's public key, located at the URI /davim/jack/public/. She might have Jack's public key already, in which case she would not need to download it again (unless it has changed). Jill deciphers all the files using her private-key and Jack's public-key. If all the files are deciphered correctly, then the message probably is from the real jack@foo.com - unless the secrecy of Jack's private-key (or hers) has been compromised.

Besides his public key, Jill might download additional information about Jack (a picture of the man, for example) from his public directory.
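
As an illustration of step (1), here is one way the body of Jack's LOCK request might be built so that it cites his address as the owner. The lockinfo, lockscope, locktype and owner elements come from the WebDAV standard; choosing an exclusive write lock and placing the bare address as the owner's text are assumptions made for this sketch, and lock_request_body is a hypothetical name.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"  # the WebDAV XML namespace


def lock_request_body(owner_address):
    """Build a LOCK request body naming owner_address as the lock owner."""
    ET.register_namespace("D", DAV)
    lockinfo = ET.Element(f"{{{DAV}}}lockinfo")
    scope = ET.SubElement(lockinfo, f"{{{DAV}}}lockscope")
    ET.SubElement(scope, f"{{{DAV}}}exclusive")   # assumed: exclusive lock
    locktype = ET.SubElement(lockinfo, f"{{{DAV}}}locktype")
    ET.SubElement(locktype, f"{{{DAV}}}write")    # assumed: write lock
    owner = ET.SubElement(lockinfo, f"{{{DAV}}}owner")
    owner.text = owner_address                    # assumed: plain address as text
    return ET.tostring(lockinfo, encoding="utf-8")


if __name__ == "__main__":
    print(lock_request_body("jack@foo.com").decode("utf-8"))
```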

Server Access Model

The sender of a message needs access to execute these WebDAV methods on the inbox directory: MKCOL, LOCK, PUT.

This security set-up is rather peculiar. Thankfully Apache is flexible enough to accommodate it. Each inbox directory will need its own .htaccess, since each is readable by a different person. The following seems to work well:

[/davim/jane/inbox/.htaccess]
AuthUserFile /davim/jane/inbox/.htpasswd
AuthType Basic
AuthName "DAVIM"
<LimitExcept PUT MKCOL LOCK>
require user jane
</LimitExcept>

In the Apache config file, we add the following:

[/etc/httpd/conf/httpd.conf]
<Directory /davim/*/inbox/>
AllowOverride All
DAV On
</Directory>

Using this set-up, adding a new user is as easy as creating a new directory, an .htaccess file and an .htpasswd file.
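
A minimal sketch of that provisioning step, assuming the server's /davim tree is a local directory the script can write to. The add_user name and the davim_root parameter are hypothetical; the .htaccess text mirrors the example above. Creating the .htpasswd file itself is left to Apache's htpasswd utility.

```python
from pathlib import Path

HTACCESS_TEMPLATE = """\
AuthUserFile {inbox}/.htpasswd
AuthType Basic
AuthName "DAVIM"
<LimitExcept PUT MKCOL LOCK>
require user {user}
</LimitExcept>
"""


def add_user(davim_root, user):
    """Create <davim_root>/<user>/inbox and public, and write the inbox .htaccess.

    Returns the inbox path. The caller still has to create
    <inbox>/.htpasswd, for example with Apache's htpasswd tool.
    """
    inbox = Path(davim_root) / user / "inbox"
    public = Path(davim_root) / user / "public"
    inbox.mkdir(parents=True, exist_ok=True)
    public.mkdir(parents=True, exist_ok=True)
    htaccess = HTACCESS_TEMPLATE.format(inbox=inbox.as_posix(), user=user)
    (inbox / ".htaccess").write_text(htaccess)
    return inbox


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        inbox = add_user(tmp, "jane")
        print((inbox / ".htaccess").read_text())
```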

Namespace Convention

Location of inbox: /davim/username/inbox/

Location of public directory: /davim/username/public/

Message Format

DAVIM's message format is very simple. The envelope is a sub-directory in the recipient's inbox. The message content consists of individual files.

The name of the envelope is required to be unique. A reasonable solution is concatenating the sender's IP address and the current time. A more sophisticated implementation would use Universally Unique Identifiers (UUIDs), as described in [ISO-11578].
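
Both naming schemes can be sketched in a few lines. The layout below is inferred from the example name 192.168.1.1.2000.04.01.12.01.06.000 used earlier; reading the last field as milliseconds is an assumption, as are the function names. The time zone is not specified here, so the caller decides which clock to pass in.

```python
import uuid
from datetime import datetime


def envelope_name(ip, when):
    """Concatenate the sender's IP address and a timestamp.

    Assumed layout: IP.YYYY.MM.DD.HH.MM.SS.mmm (mmm = milliseconds).
    """
    millis = when.microsecond // 1000
    return f"{ip}.{when:%Y.%m.%d.%H.%M.%S}.{millis:03d}"


def envelope_name_uuid():
    """Alternative: a random UUID as the envelope name."""
    return str(uuid.uuid4())


if __name__ == "__main__":
    print(envelope_name("192.168.1.1", datetime(2000, 4, 1, 12, 1, 6)))
    print(envelope_name_uuid())
```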

The sender's address is stored as a property of the envelope.

<sender xmlns="http://davim.sourceforge.net">
<user>jack</user>
<host>foo.com</host>
</sender>
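
A client could build and read this element as sketched below. The function names are hypothetical, and how the property is attached to the envelope on the server is outside the scope of this example.

```python
import xml.etree.ElementTree as ET

DAVIM_NS = "http://davim.sourceforge.net"


def build_sender(user, host):
    """Serialize a <sender> element in the DAVIM namespace."""
    ET.register_namespace("", DAVIM_NS)
    sender = ET.Element(f"{{{DAVIM_NS}}}sender")
    ET.SubElement(sender, f"{{{DAVIM_NS}}}user").text = user
    ET.SubElement(sender, f"{{{DAVIM_NS}}}host").text = host
    return ET.tostring(sender, encoding="unicode")


def parse_sender(xml_text):
    """Return the sender as 'user@host', or raise ValueError if malformed."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{DAVIM_NS}}}sender":
        raise ValueError("not a DAVIM sender element")
    user = root.findtext(f"{{{DAVIM_NS}}}user")
    host = root.findtext(f"{{{DAVIM_NS}}}host")
    if not user or not host:
        raise ValueError("sender element needs both user and host")
    return f"{user}@{host}"


if __name__ == "__main__":
    print(parse_sender(build_sender("jack", "foo.com")))
```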

The envelope can contain sub-directories. Files in these must be referenced by files at the top level. A good example would be an HTML file referencing .gif files stored in the image/ sub-directory. Clients should not display files in sub-directories as independent resources.

Files at the top level are independent resources. Currently two types of files are envisioned:

message.[language].[format] - The message itself. [language] is a two letter language code - "en" for English, for example. [format] is the file's format, probably HTML or XML.

Inclusion of the language code makes it possible to send multilingual messages. The idea is that in the future, machine-translation will become more common and there will be a need to send both the translated version and the original.

While it is possible to have something like message.en.wav or message.ru.mpeg, it is probably better to support additional media types through HTML. Plain text should be supported through HTML as well. If a client is incapable of rendering HTML, it should obtain the plain text by stripping all HTML tags and decoding all HTML entities. No client should ever send files such as message.en.txt. HTML files should always have the .html extension and never .htm.

attachments.[language].html - A list of files attached to the message. The files themselves should be stored in a sub-directory. The format of the file is as follows:

<html>
<body>
<ul>
    <li><a href="uri">description</a></li>
    <li><a href="uri">description</a></li>
    <li><a href="uri">description</a></li>
    ...
</ul>
</body>
</html>
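
The two file types above could be handled by a client roughly as follows. This is a sketch, not a reference parser: parse_message_name and parse_attachments are hypothetical names, and the code only covers the unencrypted names shown in this section.

```python
import re
from html.parser import HTMLParser

# message.[language].[format], with a two-letter language code
MESSAGE_NAME = re.compile(r"message\.([a-z]{2})\.(\w+)")


def parse_message_name(filename):
    """Return (language, format) for a message file name, else None."""
    match = MESSAGE_NAME.fullmatch(filename)
    return (match.group(1), match.group(2)) if match else None


class _AttachmentList(HTMLParser):
    """Collect (uri, description) pairs from <a href="uri">description</a>."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.items.append((self._href, "".join(self._text).strip()))
            self._href = None


def parse_attachments(html_text):
    """Return the attachment list as (uri, description) tuples, in order."""
    parser = _AttachmentList()
    parser.feed(html_text)
    parser.close()
    return parser.items


if __name__ == "__main__":
    page = '<html><body><ul><li><a href="files/a.zip">Archive</a></li></ul></body></html>'
    print(parse_message_name("message.en.html"), parse_attachments(page))
```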

Encryption and Digital Signature

The current plan calls for the use of Diffie-Hellman for key-exchange and Blowfish for symmetric encryption. The DH public/private key pair will be 442 bits in length. Fixed P and G will be used, so that the client software doesn't have to perform the difficult task of generating large prime numbers. The secret key created using the DH exchange is fed directly into the Blowfish algorithm.
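
The key agreement can be illustrated with a toy Diffie-Hellman exchange. The P and G below are NOT DAVIM's fixed parameters, which are not given here: P is the 127-bit Mersenne prime and G is an arbitrary small base, chosen only so the example runs. The function names are hypothetical, and the Blowfish step is omitted because the Python standard library has no Blowfish implementation. Blowfish accepts keys of up to 448 bits, so a 442-bit secret would fit without truncation.

```python
import secrets

# Toy parameters for illustration only (assumption, not DAVIM's real P and G).
P = 2**127 - 1   # a known prime, far too small for real use
G = 3            # arbitrary base for the demonstration


def make_keypair():
    """Return (private, public) where public = G^private mod P."""
    private = secrets.randbelow(P - 3) + 2   # uniform in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(own_private, peer_public):
    """Both sides compute the same value: peer_public^own_private mod P."""
    return pow(peer_public, own_private, P)


def secret_bytes(secret):
    """Serialize the shared secret as fixed-length big-endian bytes.

    In the plan described above, these bytes would be used directly
    as the Blowfish key.
    """
    return secret.to_bytes((P.bit_length() + 7) // 8, "big")


if __name__ == "__main__":
    jack_private, jack_public = make_keypair()
    jill_private, jill_public = make_keypair()
    jack_key = secret_bytes(shared_secret(jack_private, jill_public))
    jill_key = secret_bytes(shared_secret(jill_private, jack_public))
    print(jack_key == jill_key)
```

This also shows why correct decryption implies the sender's identity in this scheme: only someone holding Jack's private key (or Jill's) can derive the same secret.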

All files should be encrypted. If the client can correctly decipher the files, then the message is assumed to be from the actual sender. A separate digital signature feature will not be implemented.

Notification Mechanism

While DAVIM uses HTTP for transferring information, a second protocol is needed for message-arrival notification. Without such a protocol, the client would have to connect to the server periodically to check for new messages, much like how e-mail clients work. If the polling occurs fairly frequently, however, this mode of operation places substantial superfluous load on the server. Exchanges of messages would also seem less than instantaneous.
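
For comparison, the polling fallback amounts to listing the inbox with PROPFIND and diffing the result against what the client has already seen. The sketch below parses a WebDAV multistatus response body. It assumes the server reports hrefs as absolute paths and marks collections with a trailing slash; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def envelope_hrefs(multistatus_xml, inbox_path):
    """Return the set of envelope hrefs listed in a PROPFIND response.

    Assumptions: hrefs are absolute paths, and envelopes (collections)
    end with '/'. The inbox itself is excluded from the result.
    """
    root = ET.fromstring(multistatus_xml)
    hrefs = set()
    for response in root.findall("{DAV:}response"):
        href = (response.findtext("{DAV:}href") or "").strip()
        if not href.endswith("/"):
            continue  # not a collection, so not an envelope
        if href.rstrip("/") == inbox_path.rstrip("/"):
            continue  # the inbox directory itself
        hrefs.add(href)
    return hrefs


def new_envelopes(seen, current):
    """Return envelopes present now that were not seen on the last poll."""
    return sorted(current - seen)


if __name__ == "__main__":
    body = (
        '<D:multistatus xmlns:D="DAV:">'
        "<D:response><D:href>/davim/jane/inbox/</D:href></D:response>"
        "<D:response><D:href>/davim/jane/inbox/"
        "192.168.1.1.2000.04.01.12.01.06.000/</D:href></D:response>"
        "</D:multistatus>"
    )
    print(new_envelopes(set(), envelope_hrefs(body, "/davim/jane/inbox/")))
```

Every poll repeats the full listing even when nothing has changed, which is the superfluous load described above.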

Security Consideration

Contemplating...

Project Plan

Stage 1. Brain-storming
Stage 2. Writing preliminary specification
Stage 3. Setting up experimental server
Test experimental server with existing WebDAV clients
~~Stage 4. Writing journal code~~
Stage 5. Writing network code
Stage 6. Writing encryption code
Stage 7. Writing RAP daemon
Stage 8. Writing RAP Apache mod
Stage 9. Writing client code
...
Stage ?. Finalize specification
...

Project News

June 11, 2000

The project is moving along slowly, but things are definitely happening. The WebDAV library code is almost done. A pre-alpha version, EZDAV v.0.3, is available for download. The encryption code is done. I finally came up with a reasonable model for the notification mechanism, consisting of an Apache mod and a relaying daemon. When the mod notices an HTTP method being performed on a particular resource, it sends the name of the method (e.g. "PUT") to the relaying daemon, which then sends it to the client. Both these pieces are done. The client is about, say, 50 percent done. Here's a screenshot.