How to use "procmail" for mail filtering at CCIS
Introduction: What is procmail?
procmail is a package that lets you automatically filter or otherwise process your mail. Applications include automatically segregating mail from particular mailing lists into separate folders, changing the format of incoming mail, eliminating spam or other unwanted mail, eliminating duplicate messages, adding an email-driven interface to other software, generating automated replies (like vacation messages), delivering incoming mail to your home directory rather than an almost-full system-wide mail disk, and so forth. The author of this document (Jay Sekora) used procmail to implement a simple trouble-ticket system.
procmail's particular advantage is that it's very flexible and general. It's really a toolkit for constructing mail filters, rather than a complete tool in itself, just as sed and awk are tools for constructing text filters. Therefore, if you want to use all of procmail's features, there's a sizable learning curve. But just as you can learn the most common idioms in sed in a few minutes, by example, and start getting useful work done with it, you can also learn the most common procmail idioms very quickly and start filtering your mail right away.
Sources of further information
| • | http://www.procmail.org/ is the home of procmail (and related software). |
| • | The procmailrc(5) manual page describes the syntax of the ~/.procmailrc file that controls how mail is processed. |
| • | The procmailex(5) manual page has short example recipes and demonstrates a lot of the features of procmail. |
| • | The procmail(1) manual page documents the procmail binary itself (e.g. command-line arguments), and the 'NOTES' section at the very end has a sample small ~/.procmailrc file set up to deliver most mail to ~/Mail/mbox in standard Unix mailbox format. |
| • | On our Solaris machines, some commented example .procmailrc files and other tutorial material are in /arch/unix/doc/procmail-3.14/examples/. (That link will only work for you if you're viewing this document from a CCIS Solaris machine.) |
Running procmail on all incoming mail messages
procmail reads a mail message on its standard input and does something with it, so all your mail has to be passed through it. That happens automatically at CCIS, where we use procmail to do all local mail delivery (even if you don't have a .procmailrc).
At other sites, however, you may need to put a Magic Incantation in your .forward file. You can find the Magic Incantation by reading the manual page (on that system) for procmail or asking the system administrator. Of course, that assumes that procmail is installed at all.
Overview of procmail's operation
procmail reads a file called .procmailrc in your home directory to determine how it should process incoming messages. That file can contain environment variable assignments (and some of the environment variables are used by procmail) and recipes which are triggered by particular patterns in a mail message and control how pieces of mail that match them are handled.
Each recipe is examined in turn. If a recipe is triggered by the particular message being processed, it gets a chance to try to deliver the message in some way. That could mean piping it through a filter, appending it to a file, dropping it in a special directory, or even catting it to /dev/null or otherwise discarding it. Then the message is considered delivered, and no further recipes are examined. If the recipe didn't apply to this message or delivery to the folder failed (or if the command to deliver the mail failed and you've asked for that to be checked), then the next recipe is examined until something succeeds in delivering the mail or procmail falls off the end of the .procmailrc file.
If procmail hasn't delivered the message by the time it reaches the end of the .procmailrc file, then it is (by default) delivered to the user's incoming mailbox (at CCIS, /var/mail/username) - the same thing that would happen if procmail weren't involved. (You can override that default action if you choose to.) So it's easy to set procmail up to do something special with certain kinds of mail (from a particular mailing list, say, or messages over a certain size) without affecting other messages at all.
The ~/.procmailrc file
The ~/.procmailrc file consists of environment variable assignments (expressed in a syntax that is very close to the syntax of the Bourne shell) interspersed with recipes for delivering mail. Assignments and recipes can be mixed together, but typically the variable assignments occur at the top of the file, so we'll discuss them first.
Comments in a .procmailrc file can be indicated with hash marks, as in Perl and the shell. To be safe, you should only do this at the beginning of a line. (In some cases it doesn't work at the end of a line.)
Variable assignments
An environment variable assignment consists of a variable name, an equal sign, and a value. You can assign to arbitrary environment variables, but a number are special to procmail, and those are the ones you typically set. The syntax is very complete - you can include backticks and environment variable references in the value, just as in the shell.
As the procmailrc(5) manual page states, ‘Before you get lost in the multitude of environment variables, keep in mind that all of them have reasonable defaults.’ Here are a few of the commonly-set ones:
| Variable | Meaning | Default |
|---|---|---|
| MAILDIR | The current directory for procmail; most conveniently, where you store most of your mail folders. | $HOME |
| ORGMAIL | The normal place where your mail would be delivered by the system, in the absence of procmail. | /var/spool/mail/you |
| DEFAULT | Where procmail will deliver mail if no recipe matches (and succeeds); i.e., if it falls off the end of your .procmailrc file. | $ORGMAIL |
| PATH | As you would expect. You need to add non-OS directories like /arch/unix/bin if you want them. | $HOME/bin:/bin:/usr/bin (pretty minimal) |
| LOGFILE | File to write diagnostics to. (See the manual page.) | (unset) |
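| LOGABSTRACT | Controls the per-message summary (sender, subject, destination folder, size) that procmail appends to $LOGFILE. Set it to no to suppress the summary, or to all to log one for every recipe that delivers. Very useful. | (unset; the summary is written) |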
| UMASK | As with the shell's umask command. Can be set between recipes if you want some but not all of your mail to be publicly readable. Usually left alone. | 077 (make everything private) |
(Typically, setting PATH and DEFAULT is all that's necessary.)
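The assignments described above might look like the following at the top of a ~/.procmailrc. This is only a sketch: the directory and file names are examples, and MONTH is a hypothetical variable included just to show backticks.

```procmail
# Search path for any programs run from recipes.
PATH=$HOME/bin:/bin:/usr/bin
# Folder names in recipes are taken relative to this directory,
# which must already exist.
MAILDIR=$HOME/Mail
# Mail that no recipe delivers ends up here instead of $ORGMAIL.
DEFAULT=$MAILDIR/mbox
# Diagnostics and per-message summaries are appended here.
LOGFILE=$MAILDIR/procmail.log
# Backticks run a command, as in the shell (hypothetical variable).
MONTH=`date +%Y-%m`
```

Each comment sits on its own line, which is the safe placement for comments in this file.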
procmail recipes
A recipe starts with a magic line that usually looks like:
:0:
but can have additional flags before the second colon. (The second colon itself can be omitted if no locking is required. Usually it needs to be there.)
As noted above, once a recipe succeeds in delivering a message, later recipes aren't considered. Sometimes you want to do something special with messages that match a certain pattern, but you still want them to be processed by the rest of the .procmailrc file. You can do this by adding a 'c' flag after the '0', so that the line reads:
:0c:
Other flags you can add include 'B' (to apply conditions, described below, to the body rather than the header), 'D' (to make pattern matches case-sensitive), 'f' (to filter the message in place), and a number of others. See the procmailrc(5) manual page for the full story.
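As a rough illustration of how flags change a recipe, here are two small recipes. The folder names archive and urgent are made up for the example, and the condition line in the second recipe (the one starting with '*') uses syntax explained in the next paragraph.

```procmail
# 'c': append a copy of every message to the folder 'archive',
# then carry on with the recipes that follow.
:0c:
archive

# 'D': match case-sensitively, so only an all-caps URGENT in the
# Subject: header triggers this recipe.
:0D:
* ^Subject:.*URGENT
urgent
```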
After the magic ':0:' line, there are one or more condition lines, which start with asterisks (*). These typically contain regular expressions (à la egrep) which are checked against the headers of the mail message. If they all match, then the recipe is actually triggered - its action will be applied to the current message. (There's a way to grep for the regular expressions in the body rather than the header, and you can also test the value of environment variables or the result of arbitrary Unix commands. So you could take a certain action for all mail messages received on weekends that contain the word 'barbeque' in the body - maybe forward them to your alphanumeric pager so you don't miss the barbeque.) You can 'negate' a condition by preceding it with an exclamation mark (!).
There are some special macros you can use in condition lines. The most important one is '^TO', which causes the following expression to match any recipient of the message. So '* ^TOjay@' will match any message where 'jay@' appears as part of a recipient address, whether on the To:, Cc:, or Bcc: line.
Here are some examples of condition lines:
| Condition | Matches... |
|---|---|
| * ^TOpostmaster\> | messages sent to 'postmaster'. (The \> matches any non-alphanumeric character; it is commonly used to mark a word break.) |
| * ^Subject:.*laser printer toner cart | messages with the indicated text anywhere in the subject |
| * ^From:.*\<mailer-daemon@ | certain bounced mail |
| * !Received:.*by amber\.ccs\.neu\.edu | messages that did not ('!') pass through CCIS' mail server |
| * Subject:.*urgent | messages with 'urgent' anywhere in the subject |
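Condition lines like those in the table are combined inside a single recipe, and all of them must match for the recipe to be triggered. A minimal sketch, assuming a hypothetical list address example-list@ and a folder of the same name:

```procmail
# Triggered only when BOTH conditions hold: the message is
# addressed to the list, and jay is not an explicit recipient.
:0:
* ^TOexample-list@
* !^TOjay\>
example-list
```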
There are some other conditions you can check for besides regular-expression matches. Here are some examples:
| Condition | Matches... |
|---|---|
| * > 10240 | messages larger than 10k long (including headers) |
| * ? grep 'gone until' $HOME/.plan | whenever 'gone until' appears in my .plan file. This test is independent of the message itself. |
Again, you can see the procmailrc(5) manual page for full details, but the examples above cover the normal cases.
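Placed in complete recipes, the size and exit-status conditions might be used as follows. The folder name large and the address me@example.com are placeholders, not anything defined at CCIS.

```procmail
# File anything larger than 10240 bytes in the folder 'large'.
:0:
* > 10240
large

# Forward a copy for as long as 'gone until' appears in ~/.plan.
# The condition holds when grep exits with status 0.
:0c
* ? grep 'gone until' $HOME/.plan
! me@example.com
```

The second recipe has no trailing colon because forwarding does not write to a file that would need locking.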
After the condition lines (all of which start with '*'), there is exactly one action line, which specifies what to do with the mail message. An action line can have one of the following forms:
| • | A mailbox pathname, referring to either a file (in standard Unix mailbox format) or a directory; a directory name followed by '/.' is treated as an MH folder. So procmail can deliver directly to the folders used by almost all mail readers. If you use the '/.' form to deliver to an MH folder, procmail does not update MH's unseen sequence (i.e., it doesn't mark the mail as unread). |
| • | An exclamation mark (!), followed by an email address (or addresses) to forward the mail to. |
| • | A vertical bar (|), followed by a program to pipe the message through. This can be an arbitrarily complex command; it can be a pipeline and can have backticks in it. |
If you're delivering to a mailbox, procmail will consider the message undelivered (and therefore continue trying further recipes) if there's some sort of write error, such as running out of space or permission problems. If you're delivering to a pipe, by default procmail will send the message to the pipe, consider it delivered, and exit without waiting for the command to complete. If you want to handle possible errors, you can add the 'w' (wait) flag to the ':0' line, and then procmail will wait for the command to complete and look at its exit status. If the command fails, procmail will consider the message undelivered and look at the next recipe in your ~/.procmailrc file.
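The effect of the 'w' flag can be sketched with a pipe recipe followed by a fallback. Both $HOME/bin/file-ticket and the folder ticket-failures are hypothetical names chosen for the example.

```procmail
# 'w': wait for the command and check its exit status.
# $HOME/bin/file-ticket is a hypothetical script.
:0w
| $HOME/bin/file-ticket

# Reached only if the command above failed; keep the message
# in an ordinary mailbox file instead.
:0:
ticket-failures
```

Without the 'w', procmail would treat the message as delivered as soon as it had been written to the pipe, and the second recipe would never see it.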
Other utilities that come with procmail
There are some other tools that come with the procmail distribution.
The most important one is probably "formail", which parses and manipulates mail messages. One use is to add, delete, or change particular headers. Another is to generate automated reply mail - "formail" is commonly used in a 'vacation' recipe.
Another one is "lockfile", which creates procmail-compatible lock files; it's useful for writing scripts to work with your mail; that way (if you're careful) they can coordinate with each other and with procmail so that they don't step on each other's toes.
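As one illustration of formail used together with a lock file, the duplicate-suppression recipe from the procmailex(5) manual page looks roughly like this. It assumes formail is on your PATH (at CCIS that means adding /arch/unix/bin); the cache size and file names are the ones used in the manual page, not requirements.

```procmail
# formail -D records Message-IDs in msgid.cache (kept to roughly
# 8192 bytes) and exits successfully when it has seen the ID
# before, so procmail treats the duplicate as delivered and stops.
# 'h' feeds only the header to formail, 'W' waits for its exit
# status, and msgid.lock guards the cache against concurrent use.
:0 Wh: msgid.lock
| formail -D 8192 msgid.cache
```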
Some example .procmailrcs
Set up a safety net
Mistakes in a .procmailrc file can cause you to lose all your incoming mail! Because of that, it's a good idea to have a safety net recipe at the top of your .procmailrc file whenever you make any changes. That will store an independent copy of all your incoming mail in a file somewhere before the rest of your .procmailrc gets at it. When you're confident your whole .procmailrc is working right, you can get rid of (or comment out) the safety net and delete the file it's been writing to. But if there's a problem in the .procmailrc (after the safety net), a copy of all your mail has been saved.
This recipe uses the 'c' flag, because it saves a copy of each message. The rest of the .procmailrc still gets to process the message.
# $HOME/Mail *should already exist*
MAILDIR=$HOME/Mail
# not setting DEFAULT or ORGMAIL, so mail that doesn't match will
# be left in system mailbox
LOGFILE=$MAILDIR/from
# safety net
:0c:
$HOME/tmp_mail
# ... your own recipes would go down here ...
Filter out some spam
This example just recognizes the headers of a few common spam messages and files them in a spam folder.
The '* !^TO.*ccs.neu.edu' line in a couple of these recipes makes the recipe fail to match if a piece of mail is addressed to a CCIS email address. That helps avoid 'false positives', e.g. if somebody sends out mail to faculty@ccs.neu.edu or systems@ccs.neu.edu complaining about one of these pieces of spam, you might want to see that.
# /arch/unix/bin is necessary for 'formail'
PATH=/bin:/usr/bin:/usr/ucb:/arch/unix/bin
# $HOME/Mail *should already exist*
MAILDIR=$HOME/Mail
# not setting DEFAULT or ORGMAIL, so mail that doesn't match will
# be left in system mailbox
LOGFILE=$MAILDIR/from
# # safety net (commented out)
# :0c:
# $HOME/tmp_mail
# Recognize some common spam and save it to $HOME/Mail/spam
# $HOME/Mail must already exist; spam will be created if necessary
:0:
* To:.*friend@public\>
* !^TO.*ccs.neu.edu
spam
:0:
* Subject:[ ]*laser p[re]inter toner advertisement
spam
:0:
* Subject:.*FREE 1 *y(ea)?r USA magazine sub
* !^TO.*ccs.neu.edu
spam
Check the body of a message for a pattern
Normally, the regular expressions in the condition lines in a recipe are only checked against the headers of the message. To test them against the body instead, add the 'B' flag to the first (':0...') line of the recipe. For instance, the following recipe would store any message that contained the URL of our web site in the body in a separate folder.
# (not a complete .procmailrc, just a single recipe)
:0B:
* http://www.ccs.(neu|northeastern).edu/
spam
(If you need to test regular expressions against both the body and the header - for instance if you want to check based on a combination of body contents and a particular subject - you'll need to add the 'H' flag as well as 'B', so that the first line reads ':0BH:'.)
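If you want one condition checked against the header and another against the body, the procmailrc(5) manual page also describes a per-condition prefix: 'B ??' in front of a pattern makes just that condition search the body. A minimal sketch, with a made-up folder name:

```procmail
# The first condition is checked against the header (the default);
# the 'B ??' prefix switches only the second one to the body.
:0:
* ^Subject:.*toner
* B ?? http://www\.ccs\.(neu|northeastern)\.edu/
toner-with-url
```

This differs from ':0BH:', which makes every condition in the recipe search both the header and the body.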
Handle problems delivering mail
This .procmailrc first tries to deliver to the file incoming in the user's home directory (the user would then read mail with something like 'pine -f ~/incoming'). If that fails, it tries to deliver mail to the system-wide mailbox ($ORGMAIL). If that also fails, it tries to save a copy in /ccs/tmp and also sends a copy to an off-site address. So this is an extremely paranoid .procmailrc.
Some points about this file:
| • | $LOGNAME is the user login name |
| • | The recipe that saves to /ccs/tmp has the 'c' flag on its ':0c:' line; that means it's saving a copy of the mail, but processing should continue with the following recipe. That way, if we get that far in the .procmailrc file, a copy will be stored in /ccs/tmp (which gets wiped out periodically) and also the next recipe will be processed, which sends the mail to another address. |
| • | There are no condition lines - each recipe applies to all incoming mail. That means that processing will stop with the first (non-copy) recipe that succeeds. |
# store mailboxes in my home directory by default (atypical)
MAILDIR=$HOME
LOGFILE=$MAILDIR/log
:0:
incoming
:0:
$ORGMAIL
:0c:
/ccs/tmp/$LOGNAME.mbox
:0
! me@me.ne.mediaone.net
Deliver to an MH inbox (method 1)
This .procmailrc doesn't update the MH unseen sequence. Some points:
| • | When you're delivering to a directory (with an action line that ends in '/' or '/.'), you don't need a lockfile; hence ':0' rather than the more normal ':0:'. |
| • | In inbox/., the '/.' means to treat inbox as an MH mail folder (a directory containing numbered message files). |
# $HOME/Mail *should already exist*. In this case we're assuming
# it's your MH directory.
MAILDIR=$HOME/Mail
LOGFILE=$MAILDIR/from
# # safety net (commented out)
# :0c:
# $HOME/tmp_mail
:0
inbox/.
Deliver to an MH inbox (method 2)
This .procmailrc does update the MH unseen sequence, because it pipes the message into MH's normal command for receiving mail.
When you're delivering to a pipe, you don't need a lockfile, but if you want procmail to be able to tell whether the delivery succeeded, you should add the 'w' flag.
# $HOME/Mail *should already exist*. In this case we're assuming
# it's your MH directory.
MAILDIR=$HOME/Mail
LOGFILE=$MAILDIR/from
# If you use EXMH, you might want to uncomment the following line
#MHCONTEXT=.exmhcontext # so EXMH sees the unseen sequence
# # safety net (commented out)
# :0c:
# $HOME/tmp_mail
:0w
|/arch/unix/packages/mh/etc/rcvstore +inbox
Deliver list mail to a pine/elm/Mail mailbox; handle urgent mail specially; leave most mail in inbox
This is a more complicated and realistic example.
| • | '$HOME/mail' is the Pine convention for where your mail folders are stored. If you use Elm, this would probably be '$HOME/Mail'. If you used MH, you'd probably use '$HOME/Mail' and also tack on '/.' to all the folder names (or use 'rcvstore'). |
| • | The '* !^TOjay\>' trick helps ensure that when the user jay is Cc:'ed on a message that also goes to the list, it goes into jay's regular mailbox. (You might want to change that.) |
| • | Food for thought: What happens if you're filtering two lists, and the same message is sent to both of them (so they both appear in the To: line)? What should happen? |
# $HOME/Mail *should already exist*.
MAILDIR=$HOME/Mail
LOGFILE=$HOME/procmail.log
# # safety net (commented out)
# :0c:
# $HOME/tmp_mail
###################################################
# mailing lists
:0:
* !^TOjay\>
* ^TO(alpha-osf|tru64-unix)-managers@
alphamgrs
:0:
* !^TOjay\>
* ^TOsun-managers@
sunmgrs
:0:
* !^TOjay\>
* Sender:.*BUGTRAQ@(netspace.org|securityfocus.com)
bugtraq
# If mail has "urgent" in the subject, send a *copy* to my pager
# and my home address.
:0c
* ^Subject:.*urgent
! mypager@mypagercompany.com, me@myhomeisp.com
A vacation filter
You can use procmail to implement a vacation auto-responder (something that automatically sends mail back saying you're on vacation and telling senders when you'll get to their mail). I'm not going to copy the recipe here, because (1) it uses fancy features (like formail and chaining recipes onto each other) that we haven't discussed, and (2) it's in the procmailex(5) manual page. However, here are a few points to think about when constructing autoresponders:
| • | You should never autoreply to automated mail (e.g. list mail, bounce messages, etc.) |
| • | You definitely want the 'c' flag so that the mail that triggers the vacation recipe will still be delivered normally. |
| • | You want to avoid mail loops, where your vacation bounce somehow gets bounced back to you (and triggers another vacation bounce). Postmasters tend to get upset when that happens. |
| • | It's courteous to autoreply only once, or only once every week or so, to each address you get mail from. If somebody sends you five pieces of mail for you to read when you get back, they don't really need five copies of your automated reply. |
The vacation recipe in the procmailex(5) manual page handles all these issues properly.