Re: Spamassassin Default Install

From: <chrisl@xxxxxxxxxxxxx>

Does the default install of spamassassin with FC4 actually do anything? Is there anything I would have to configure in the postfix main.cf file to get the default installation working?

What is the recommended spamassassin configuration?

Remove the RPM and install a fake. Then install SpamAssassin from the
tarballs directly. You get a present if you do, a nice collection of
management tools that most distros trim off.

Oh, you didn't ask THAT.

Well, getting SpamAssassin to work largely depends on your email
configuration. If the Fedora folks did their homework and require it for
other packages then it should be installed to work. I am not sure
which of the installation options they chose. A desktop install
should have per user scores, Bayes, and whitelists enabled. You can
discover if this is the case by looking in your user directory, NOT
ROOT PLEASE, for the /home/<user>/.spamassassin directory. If that
is there it should have "user_prefs" and several "bayes_*" files.
If they're there, Bob's your second cousin. (You're in good shape;
but, you're not there yet. {^_-})
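
A quick way to do that check from a shell, as a rough sketch only. The
function name check_sa_dir is made up for this example; the file names
are the ones mentioned above.

```shell
# Report whether a per-user SpamAssassin directory looks initialized.
# Usage: check_sa_dir /home/<user>/.spamassassin
check_sa_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ -f "$dir/user_prefs" ]; then
        echo "user_prefs: present"
    else
        echo "user_prefs: absent"
    fi
    # Count bayes_* files; an unmatched glob stays literal and is skipped.
    count=0
    for f in "$dir"/bayes_*; do
        [ -e "$f" ] && count=$((count + 1))
    done
    echo "bayes files: $count"
}

# Run it as your normal user, not as root.
check_sa_dir "$HOME/.spamassassin" || true
```

If it reports the directory as missing, SpamAssassin has most likely
never been run for that account.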

If you are set up as I mentioned above then you have to get SA trained.
Now, I am a little ruthless in this regard. What you should do is collect
at least 200 emails you consider ham and another 200 you consider to be
spam. Save them to two different mailboxes. (I do this with a spam training
account in my Outlook Expunge (don't ask). I have the regular account that
fetches from the mail machine via pop3. I like the resultant folder
behavior in OE. I have a second account that maintains several IMAP boxes,
spam and ham, of course, and in addition old_ham and old_spam. I move
messages to be learned by SpamAssassin to the appropriate ham or spam
folder. Periodically, for speed, I move the already learned ham over to
old_ham and similarly with the spam to old_spam.) Anyway, you take your
spam and ham collection and spoon feed it to SpamAssassin via sa-learn.
But first I'd suggest simply nuking ("rm -f") the .spamassassin/bayes_*
files, because autolearning at startup can lead to a poorly trained Bayes
database. If you keep autolearning on, it also helps to widen the
automatic learning thresholds so only clear-cut ham and spam get learned.
At this site (two people, nearly a dozen accounts) I use manual training
exclusively. Every night around the log rotate time all accounts are
run through their sa-learn cycle.
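
To see whether a collection has reached the 200-message mark before
training, something like the following can help. It is a sketch that
assumes mbox-format files; the function names count_mbox and
ready_to_train are invented for the example.

```shell
# Count messages in an mbox file by counting "From " separator lines.
# Prints 0 for a missing file.
count_mbox() {
    [ -f "$1" ] || { echo 0; return; }
    grep -c '^From ' "$1" || true
}

# Succeed only if both mailboxes hold at least $3 messages (default 200).
# $1 = ham mbox, $2 = spam mbox
ready_to_train() {
    ham=$1 spam=$2 min=${3:-200}
    [ "$(count_mbox "$ham")" -ge "$min" ] &&
        [ "$(count_mbox "$spam")" -ge "$min" ]
}
```

Typical use would be "ready_to_train ~/mail/ham ~/mail/spam && echo ok".
The count is approximate: a body line that starts with an unescaped
"From " gets counted as a message.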

Sa-learn is only modestly funkity. It does not automatically diagnose your
mailbox format, Maildir or mbox. If your mail boxes are single files
with multiple email messages in them you are running mbox format. If the
mail boxes are directories with many files in them then you are running
Maildir format. For the former case you need the "--mbox" command option
on sa-learn. So for me, with mbox format, the commands come out like this:
"sa-learn --mbox --ham ~/mail/ham" and "sa-learn --mbox --spam ~/mail/spam".
Sa-learn is smart enough that it trims SpamAssassin markings off before it
learns. But it does not work with forwarded email. That is why I copy
unmarked spam to the IMAP spam folder rather than use a forwarding tactic.
It is also smart enough to recognize it has already learned from a message.
So you do not need to trim the ham and spam directories daily. Monthly is
often plenty. Whether or not you use automated learning processes this is
the technique for your initial feeding that works best.
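
Since the mailbox format decides whether "--mbox" is needed, a small
wrapper can pick the option for you. This is a dry-run sketch: the
function name learn_cmd is hypothetical, and it only prints the command
line rather than running it.

```shell
# Print the sa-learn command for one mailbox; nothing is executed.
# $1 = "ham" or "spam", $2 = path to the mailbox
learn_cmd() {
    kind=$1 box=$2
    case $kind in
        ham|spam) ;;
        *) echo "kind must be ham or spam" >&2; return 2 ;;
    esac
    if [ -f "$box" ]; then
        # Single file holding many messages: mbox format.
        echo "sa-learn --mbox --$kind $box"
    elif [ -d "$box" ]; then
        # Directory of one-message files: no --mbox.
        echo "sa-learn --$kind $box"
    else
        echo "no such mailbox: $box" >&2
        return 1
    fi
}
```

Look at the printed line before running it by hand. Paths containing
spaces would need quoting, which this sketch does not attempt.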

Then there is this "feature" about SpamAssassin that has a work around but
is not official SpamAssassin. (IMAO it should be part of any distro that
ships SpamAssassin enabled.) SpamAssassin is a siamese twin. It is a Bayesian
filter and a rule based filter wrapped into one package. The above covers
the care and feeding of the Bayesian classifier. The rules shipped with
SpamAssassin are a little thin on the ground. The nice thing is that
SpamAssassin has means for adding rules. (Note that ALL added rule sets
belong in the same directory as "local.cf", generally
/etc/mail/spamassassin. Do NOT muck with the
/usr/share/spamassassin directory. The exception is that if you installed
from the tarball you'll notice /usr/share/spamassassin/tools. There are some
fine fun tools in there.) The bad news is that building rules requires a
good understanding of perl regular expressions. The good news is that there
are people who build rules for themselves and share them. A fine group of
such ninjas got together and created The SpamAssassin Rules Emporium found
at http://www.rulesemporium.org/. Don't let the cute little ninjas put you
off. These folks mean business. Spammers HATE them. They kinda like that.
The first thing to find on the site is the "RulesDuJour" package. It is
the key to easy maintenance of your rule sets. Then visit the Rules page
and READ the rules and their intended uses. Some rule packages are more
stringent and more likely to get false alarms than others. Pick and choose
what fits your needs. RDJ helps with that. And IMAO RDJ should be a part of
any self respecting Fedora Core installation.

With the selection of rules I have from SARE and SpamAssassin 3.0.5 (tricked
out for some rule debugging which is why I have not moved up) the misfire
rate here is about one in ten thousand IF I ignore spam to the LKML and
FreeBSD mailing lists which do not filter. (I am toying with the idea of a
parallel SA filtering process for those two lists with rules optimized for
their unusual needs. Debugs and patches look rather spammy when you get
right down to it. {^_-})

The SpamAssassin wiki is a nice resource if a little opaque. One of the most
common problems is with the "ALL_TRUSTED" rule hitting. If you see this in
logs or message markup you have some configuration to do. Precisely what
that configuration might be depends on how you get your mail. Read up in
the wiki about "ALL_TRUSTED". And remember that "trusted_networks" is not
for networks you do not expect to spam you. It is for networks from which
you expect to receive email without forging any header material. Hence I
have Earthlink's servers included. I do not expect THEIR headers to lie. I
gotta trust someone. And I do not trust the whole world not to forge things.
So I tell SpamAssassin, "When you are tracking back on the headers do not
take on faith anything in any received header past the Earthlink email
headers." That looks like: trusted_networks 127/8 207.217.121/24 209.86.93/24.
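
In the config file that setting would sit roughly as below. The
local.cf location is the usual one but is an assumption for your
install, and the address ranges are simply the ones I quoted; put in
your own provider's relays.

```
# /etc/mail/spamassassin/local.cf  (assumed location)
# Relays whose Received headers are taken at face value.
# Replace these ranges with the mail servers of YOUR provider.
trusted_networks 127/8 207.217.121/24 209.86.93/24
```

Running "spamassassin --lint" afterwards is a cheap way to catch a typo
in the file.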

Another commonly misunderstood option is "auto_whitelist", which is really
a combo white and black list. It learns from messages received and boosts
or reduces scoring for sites considered ham or spam in the past. I don't
use it. But
others do and like it. I manually use "whitelist_from_rcvd" for my own
special good-guys. I also perversely use "allow_user_rules 1". I trust
myself and my co-user is one of the SARE ninjas on a sorta part time basis.
He has particular fun thwarting Leo Kuvayev. (Look Leo up on XBL.)
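
For the curious, those choices look something like this in local.cf.
The address and relay domain are placeholders, not real entries from my
setup.

```
# Turn the automatic white/black list off.
use_auto_whitelist 0

# Whitelist a sender only when the mail really arrived via the named
# relay domain. Both values here are placeholders.
whitelist_from_rcvd friend@example.org example.org

# Let users define rules in their own user_prefs (matters under spamd).
# Only sensible when you trust every user on the machine.
allow_user_rules 1
```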

The SpamAssassin user's list is a nice resource in a real pinch. But please
make sure the problem is not in whatever tool CALLS SpamAssassin. I happen
to have forced the use of lowly old crufty often deprecated by "those who
(think they) know" procmail. It feeds SA and accepts the results and does not
strip SA's markup and install its own. I can also perform some custom
magic with it. So I've no inclination to use amavisd or clamscan or whatever.
(I do use Clam, though. I use the SpamAssassin clam plugin. Search for
"plugin" on the wiki. It's there.) Between Earthlink's AV, ClamAV, and
F-Secure on the Winders machines around here we've not been bit once yet
in just a whole lot of years online. (Note that SpamAssassin may catch a
virus or two on the way to you. But it is a darned poor AV tool. It is also
not really an anti-phish tool except as a fine and worthy side effect.)
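
For anyone going the procmail route, the hand-off is usually a pair of
recipes along these lines. This follows the commonly published pattern
rather than my own .procmailrc; the size limit and the "caught-spam"
folder name are assumptions to adjust.

```
# ~/.procmailrc
# Pipe messages under about 256 KB through SpamAssassin as a filter.
:0fw: spamassassin.lock
* < 256000
| spamassassin

# File anything SpamAssassin tagged into a separate folder.
:0:
* ^X-Spam-Status: Yes
caught-spam
```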

{^_^}   Phew, Joanne got verbose she did. I hope it helps without being
       too overwhelming. (And FWIW I used a CPAN install method after
       pulling the trash FC-4 SA install out by its roots.)

