mbox Mailboxes

This explanation was written in response to a query about how to delete large messages from a user's mailbox. Brain-dead mail user agents like Microsoft Outlook Express couldn't download large messages over a modem connection because the download time exceeded a time-out in the program. Users would have to call tech support and the file would need to be removed from the mailbox.


One of the jobs of a Unix server is to handle email. All Unix machines have this capability. There are a variety of mailbox formats out there. By default, FreeBSD uses the original format, often referred to as the "mbox" format. Other common formats include "mmdf" and "maildir".

With the "mbox" format, all of the messages are stored in a single text file. Under FreeBSD, this is located in /var/mail and the filename is the same as the user's username.

Each message is preceded by a single line, known as the "envelope" or sometimes the "arpa envelope". An example is shown below:

From luser@example.com Mon Aug  5 17:25:51 2002

This shows the date that the message arrived, and the email address of the sender. Note that this is not necessarily the address found in the headers of the message. This email address is communicated as part of the SMTP protocol, and is used when bouncing back an email.
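To make the layout of the envelope line concrete, here is a rough sketch in Python that picks it apart. This is only an illustration, not how any mail program does it. The function name parse_envelope is made up for this example, and it assumes the date is in the layout shown above with English day and month names.

```python
from datetime import datetime

def parse_envelope(line):
    """Split an mbox envelope line into (sender, arrival time).

    Hypothetical helper, for illustration only.
    """
    if not line.startswith("From "):
        raise ValueError("not an envelope line")
    # The word "From", then the sender address, then the rest is the date.
    _, sender, date_text = line.rstrip("\n").split(None, 2)
    # Collapse the double space used to pad single-digit days ("Aug  5").
    date_text = " ".join(date_text.split())
    arrived = datetime.strptime(date_text, "%a %b %d %H:%M:%S %Y")
    return sender, arrived

sender, arrived = parse_envelope("From luser@example.com Mon Aug  5 17:25:51 2002")
print(sender)
print(arrived.isoformat())
```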

After this envelope line is the actual message, stored in the format specified by RFC 2822. Basically, this is a group of headers, followed by a blank line, followed by the body of the message. All headers have a header name, followed by a colon and a space, followed by the content of the header.
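If you want to see the headers/blank line/body structure for yourself, Python's standard email package can parse a message. The message below is invented for the example. Note that it does not include the envelope line, which belongs to the mailbox and not to the message.

```python
from email import message_from_string

# A made-up message: headers, a blank line, then the body.
raw = (
    "From: luser@example.com\n"
    "To: someone@example.org\n"
    "Subject: Lunch?\n"
    "\n"
    "Are you free at noon?\n"
)

msg = message_from_string(raw)

# Each header is a name and its content.
for name, value in msg.items():
    print(name, "->", value)

# Everything after the first blank line is the body.
body = msg.get_payload()
print(body)
```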

An interesting thing about the mbox format is that the only way to separate the messages is to look for lines that begin with "From ". However, users could put lines in their messages that begin with "From ", so when such a message is stored in the mailbox, each of those lines gets an extra character stuck on the front, usually a '>' character.
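Here is a minimal sketch of both halves of that idea in Python: quoting body lines on the way in, and splitting a mailbox on "From " lines on the way out. The function names quote_from_lines and split_mbox are made up for this example, and real delivery programs differ in the details.

```python
def quote_from_lines(body):
    """Put a '>' in front of any body line that begins with "From "."""
    out = []
    for line in body.splitlines(keepends=True):
        if line.startswith("From "):
            line = ">" + line
        out.append(line)
    return "".join(out)

def split_mbox(text):
    """Split mailbox text into messages at each line starting with "From "."""
    messages = []
    current = []
    for line in text.splitlines(keepends=True):
        if line.startswith("From ") and current:
            messages.append("".join(current))
            current = []
        current.append(line)
    if current:
        messages.append("".join(current))
    return messages

body = "Hello,\nFrom here on it gets worse.\n"
quoted = quote_from_lines(body)

mbox_text = (
    "From a@example.com Mon Aug  5 17:25:51 2002\n"
    "Subject: one\n"
    "\n"
    + quoted +
    "\n"
    "From b@example.com Mon Aug  5 17:30:00 2002\n"
    "Subject: two\n"
    "\n"
    "Bye\n"
)
messages = split_mbox(mbox_text)
print(len(messages))
```

Without the quoting step, the second line of the first message would have been mistaken for the start of a new message.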

Deleting all the messages in a user's mailbox can be done by simply deleting or truncating the file. You can delete the file with the Unix "rm" command. You can truncate the file using the command:

cp /dev/null filename

Truncating the file has the advantage of leaving an empty (zero byte) file which retains the original ownership and permissions. However, even if you delete the file, it should be recreated with the correct permissions the next time mail is delivered to the user.
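The same truncation can be sketched in Python. The helper name truncate_mailbox is made up, and the demonstration runs on a scratch file, not a real mailbox. The point is that the file is emptied in place, so its mode survives.

```python
import os
import stat
import tempfile

def truncate_mailbox(path):
    """Empty the file in place; the file itself, its owner, and its mode are kept."""
    with open(path, "r+b") as f:
        f.truncate(0)

# Demonstration on a scratch file standing in for a mailbox.
fd, path = tempfile.mkstemp()
os.write(fd, b"From luser@example.com Mon Aug  5 17:25:51 2002\n\nhello\n")
os.close(fd)
os.chmod(path, 0o600)

truncate_mailbox(path)
size_after = os.path.getsize(path)
mode_after = stat.S_IMODE(os.stat(path).st_mode)
print(size_after, oct(mode_after))
os.remove(path)
```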

You can combine two mailboxes with the Unix "cat" command. This might be necessary after taking extreme measures to kill the popper process, which leaves the user's mail in a mailbox called ".username.pop".

cat fromfile >> tofile

The ">>" operator tells the shell to append the output of the cat command (which is the contents of fromfile) to the end of tofile. You can read more about this by reading about using the command shell.
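For comparison, here is roughly what that append looks like in Python. The helper name append_mailbox and the file names in the demonstration are made up, and the demonstration uses a scratch directory in place of /var/mail.

```python
import os
import shutil
import tempfile

def append_mailbox(from_path, to_path):
    """Append the contents of from_path to the end of to_path."""
    # Mode "ab" opens for appending, like the shell's ">>".
    with open(from_path, "rb") as src, open(to_path, "ab") as dst:
        shutil.copyfileobj(src, dst)

# Demonstration in a scratch directory.
spool = tempfile.mkdtemp()
fromfile = os.path.join(spool, ".luser.pop")
tofile = os.path.join(spool, "luser")
with open(fromfile, "wb") as f:
    f.write(b"From a@example.com Mon Aug  5 17:25:51 2002\n\nold mail\n\n")
with open(tofile, "wb") as f:
    f.write(b"From b@example.com Mon Aug  5 17:30:00 2002\n\nnew mail\n\n")

append_mailbox(fromfile, tofile)
with open(tofile, "rb") as f:
    combined = f.read()
print(combined.decode())
shutil.rmtree(spool)
```

As with cat, fromfile is left untouched, so you still have to remove it afterwards.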

A common problem is a user needing to have a large email deleted from their mailbox. Users have a bad habit of emailing large spreadsheets, video files, and CD-ROMs. Then they use a poorly designed mail program which can't download them.

Conceptually, this is very easy to handle. You can open a file with a text editor (like vi), delete the message, and save the file. Of course, no one ever seems to know how to use vi to do this.

The first thing to do is to determine the line numbers at the beginning and the end of the message. You can search for lines beginning with "From ". Use the slash ('/') key to start a search, type "^From " ('^', capital 'F', lowercase 'r', 'o', 'm', and SPACE), then hit the enter key. You can use the 'n' key to find the next occurrence.

You can hit Ctrl-G to find out what line the cursor is on. Usually the offending message is the first one (because the messages before it were successfully downloaded).
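If you would rather not hunt for the line numbers by hand, a short script can list them. This is a sketch under the assumption that every line beginning with "From " starts a message. The function name message_line_ranges is made up, and the sample data stands in for a real mailbox.

```python
def message_line_ranges(lines):
    """Return (first_line, last_line, size_in_bytes) for each message.

    Line numbers are 1-based, matching what vi reports.
    """
    starts = [n for n, line in enumerate(lines, start=1)
              if line.startswith(b"From ")]
    ranges = []
    for i, first in enumerate(starts):
        # A message runs up to the line before the next envelope line,
        # or to the end of the file for the last message.
        last = starts[i + 1] - 1 if i + 1 < len(starts) else len(lines)
        size = sum(len(line) for line in lines[first - 1:last])
        ranges.append((first, last, size))
    return ranges

sample = [
    b"From a@example.com Mon Aug  5 17:25:51 2002\n",
    b"Subject: big\n",
    b"\n",
    b"x" * 1000 + b"\n",
    b"\n",
    b"From b@example.com Mon Aug  5 17:30:00 2002\n",
    b"Subject: small\n",
    b"\n",
    b"hi\n",
]
ranges = message_line_ranges(sample)
for first, last, size in ranges:
    print(first, last, size)
```

For a real mailbox you would read the lines with open(path, "rb").readlines() and look for the range with the largest size.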

When you know what lines to delete, you can hit the colon key (':') to get to the vi/ed/ex command prompt, type the starting line number, comma, the ending line number, 'd', and hit enter. Then you can save the file.

For example, in my mailbox is an mp3 file that someone sent me. It happens to start on line 120837. Searching from there for the next line that starts with "From " puts me at the beginning of the next message. The line above that (the last line of the large message) is 164248. I would type:

:120837,164248d

That will delete the message. Then I can save the file.
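The same delete-and-save can be sketched in Python for people who are more comfortable with a script than with vi. The helper name delete_lines is made up. It reads the whole file into memory, which is fine for a sketch but worth knowing about for very large mailboxes, and the demonstration runs on a scratch file.

```python
import os
import tempfile

def delete_lines(path, first, last):
    """Remove lines first..last (1-based, inclusive) from the file at path."""
    with open(path, "rb") as f:
        lines = f.readlines()
    if not 1 <= first <= last <= len(lines):
        raise ValueError("line range is outside the file")
    # Rewrite the file in place with the unwanted range left out.
    with open(path, "wb") as f:
        f.writelines(lines[:first - 1] + lines[last:])

# Demonstration: the first message occupies lines 1 through 5.
fd, path = tempfile.mkstemp()
os.write(fd, (
    b"From a@example.com Mon Aug  5 17:25:51 2002\n"
    b"Subject: big\n"
    b"\n"
    b"pretend this is an mp3\n"
    b"\n"
    b"From b@example.com Mon Aug  5 17:30:00 2002\n"
    b"Subject: small\n"
    b"\n"
    b"hi\n"
))
os.close(fd)

delete_lines(path, 1, 5)
with open(path, "rb") as f:
    remaining = f.read()
print(remaining.decode())
os.remove(path)
```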

I would recommend that someone who is completely unfamiliar with vi not attempt this. It is easy to get confused and make a mistake with vi. The result of that would be to delete too much email and/or corrupt the user's mailbox.

There is a man page for vi, but it is a reference. There is an O'Reilly book about vi, Learning the vi and Vim Editors, but maybe I will write a brief introduction to it soon.

Messages are stored in mailboxes by sendmail. sendmail is the program that serves the SMTP protocol. SMTP is used by our users to send their mail out through our server. SMTP is also the protocol used by other mail servers to send email to users on our mail server. sendmail calls another program, "mail.local", to actually write the messages.

Most users use the POP3 protocol to read their email. There is a program called popper that handles this protocol. Once the user's mail program has authenticated (by sending their username and password), the popper program "locks" the mailbox. It does this by renaming the mailbox from "username" to ".username.pop", then it reads the messages from there. This is done so that new email can come in while the mail is being downloaded without confusing popper or the user's mail program.
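The renaming trick is easy to picture with a small sketch. This is not popper's code, only an illustration of the behavior described above. The function name start_pop_session is made up, and a scratch directory stands in for /var/mail.

```python
import os
import shutil
import tempfile

def start_pop_session(spool_dir, username):
    """Move the mailbox aside to ".username.pop" and return the new path."""
    mailbox = os.path.join(spool_dir, username)
    working = os.path.join(spool_dir, "." + username + ".pop")
    os.rename(mailbox, working)
    return working

# Demonstration in a scratch directory.
spool = tempfile.mkdtemp()
mailbox = os.path.join(spool, "luser")
with open(mailbox, "w") as f:
    f.write("From a@example.com Mon Aug  5 17:25:51 2002\n\nold mail\n")

working = start_pop_session(spool, "luser")

# Mail arriving during the session lands in a fresh "luser" file.
with open(mailbox, "a") as f:
    f.write("From b@example.com Mon Aug  5 17:30:00 2002\n\nnew mail\n")

with open(working) as f:
    working_text = f.read()
with open(mailbox) as f:
    new_text = f.read()
print(os.path.basename(working))
shutil.rmtree(spool)
```

The file being downloaded never changes underneath the session, and the new message waits in the regular mailbox.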

The sendmail program fills the role of the Mail Transport Agent (MTA). The mail program used by the end user to read his or her email fills a role called the Mail User Agent (MUA).

The POP3 protocol is very simple, and was originally designed so the MUA could download all the mail from the server. The mail would then be deleted from the server. However, the POP3 protocol was extended in a lame attempt to make email available to multiple computers. This works marginally well, but it means that mail has to be left on the server so the other computer(s) can still access it. This tends to use up a lot of space on our mail server. neon currently has 1030 mailboxes on it, using 3.5 GB of disk space. Some of this is because users leave mail on the server, some of it is mailboxes that aren't being checked (but are collecting spam).

P.S. To find out more about using your shell, you can read the man page (man ksh) for your shell. However, this reads like a reference and isn't a very good guide, especially since much of the shell's functionality involves the shell scripting language. O'Reilly & Associates publishes a good book on shells, Learning the Korn Shell.

Your assignment: use the "more" command to look at your own mailbox on neon, and pay attention to the elements described above. Get visually familiar with the format of mailboxes. Take a look at some of the kinds of headers that appear in each message, and if you see one that doesn't have a self-evident purpose, ask me about it. You might have to stop checking your email for a while or set "Leave mail on server" in your MUA so that your mailbox will not be empty.