From: "James Wilkinson" <fedora@xxxxxxxxxxxxxxxxxx>
I wrote:
Also, I'd strongly recommend training SA's Bayesian analysis, using
the sa-learn program. SpamAssassin won't use Bayesian analysis until
it has learnt 200 good ("ham") e-mails and 200 spams.
Tim wrote:
Isn't that supposed to be the point of the junk/not-junk buttons on mail
clients?
Usually, that trains the mail client's *own* spam filter.
Unlike many other things, it's not really ideal to have multiple
different spam filters. This is because if one filter, on its own, thinks
an e-mail is a bit iffy, but probably OK, it will let it through. But if
multiple separate
spam tests all think that an e-mail is dodgy, one can reject it with a
lot more confidence.
SpamAssassin is designed to incorporate different styles of checks --
Bayesian, DNSBL checks on the sending mail server, and *lots* of fixed
rules -- and come up with one overall spam score. The more checks that
SpamAssassin can do reliably (which, in the case of Bayesian analysis,
means training SA's own Bayesian engine), the more accurately it can
spot spam and let through good e-mail.
Scores. The magic is in scores. No single rule (usually) should be
allowed to define spam. (BAYES_99 is good enough here that I score it
high enough to guarantee markup as spam. Then I rely on the small
number of negative-scoring rules to save random ham messages that
might get all the way to 0.99 Bayes spam probability.)
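
To make the arithmetic concrete, here is a rough Python sketch of the
additive scoring idea, not SpamAssassin's actual code. Apart from
BAYES_99, the rule names are hypothetical, and every score and the
threshold are made-up numbers chosen only to illustrate the point.

```python
# Illustrative only: apart from BAYES_99 the rule names are hypothetical,
# and all scores below are invented for this example.
RULE_SCORES = {
    "BAYES_99": 5.0,       # scored high enough to reach the threshold alone
    "DNSBL_LISTED": 2.5,   # hypothetical: sending server is on a blocklist
    "FORGED_DATE": 2.5,    # hypothetical fixed rule
    "HTML_ONLY": 0.5,      # hypothetical weak content rule
    "KNOWN_SENDER": -6.0,  # hypothetical negative-scoring (ham-saving) rule
}

REQUIRED_SCORE = 5.0  # assumed threshold: mark as spam at or above this


def total_score(hits):
    """Sum the scores of every rule that fired on a message."""
    return sum(RULE_SCORES[name] for name in hits)


def is_spam(hits):
    """A message is marked as spam when its total reaches the threshold."""
    return total_score(hits) >= REQUIRED_SCORE
```

With these numbers, a lone DNSBL_LISTED hit stays under the threshold,
several modest hits together cross it, BAYES_99 crosses it by itself,
and BAYES_99 plus KNOWN_SENDER drops back below it, which is the
"negative rules save the ham" case.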
Besides, WTF good is Bayes with image spam?
{^_^}