Re: Spamassassin and Spambayes

From: "Claude Jones" <claude_jones@xxxxxxxxxxxxxx>

On Monday June 26 2006 21:30, jdow wrote:
Puny results, kid. {^_-} With SpamAssassin, rules, and a carefully hand
fed Bayes I'm not kidding when I say I get about one spam in 1000 that
creeps through. (And for the most part those train at near Bayes 0.50.)

>> Am I missing something here? Is there a better way to train spamassassi
>
> Some people find it helpful to change the BAYES_99 test to be equal to
> the spam cutoff or slightly below it. If the spam cutoff value is 5.0,
> set the BAYES_99 test value to 5.0 or 4.9.

NOOOOOOOOOOOOOOOooooooooooooooo!

If you are going to automatically train Bayes widen the automatic
thresholds from the stock settings, at least at first. Once you have
the weight of a working Bayes behind you the stock settings might
work OK. I studied how the automatic classification system was
supposed to work, thought about it a little while, and decided I
am a big girl and can spoon feed SpamAssassin. Over the years I've
been running Bayes (I forget when it appeared. I first hit SA at
2.43 I think it was - or maybe even 2.2 something.) I've trained on
less than 2000 hams and 2000 spams. Bayes 99 alone catches 85% of the
spam and hits almost no ham. Bayes 80 and 95 account for another
almost 6%. The rest comes from the various rule sets I have running.
I suppose I should feed the Bayes a little more. I've seen it doing
better. But at the scoring I have (Bayes 99 is 5.001) I see such good
results I am in the "if it ain't broke, don't fix it" mode. {^_-}
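
As a rough illustration of the two knobs mentioned above, a local.cf
fragment might look something like the following. The widened threshold
values are made-up examples, not recommendations; only the 5.001 score
comes from the message itself.

```
# Hypothetical local.cf fragment (illustrative values only).

# Widen the auto-learn thresholds so that only very confident verdicts
# train Bayes. Stock values are believed to be 0.1 (nonspam) and 12.0 (spam).
bayes_auto_learn_threshold_nonspam  -1.0
bayes_auto_learn_threshold_spam     15.0

# Alternative: turn auto-learning off and hand feed Bayes with sa-learn.
# bayes_auto_learn 0

# Score for the BAYES_99 rule, as quoted in the message above.
score BAYES_99 5.001
```

With auto-learning off, training would be done by hand, for example by
running sa-learn with --spam or --ham over saved mail folders.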

I was hoping you'd chime in, Joan. I looked at the article Aaron linked
to, but anybody who claims to have tested five programs in depth for a
column and presents results like that is just not convincing to me. You
have obviously figured out SpamAssassin. Every time I've tried, I've
found the documentation cryptic and tedious. Maybe there's better
documentation out there and I need to work on it some more. But, in the
spirit of your last quoted sentence just above: after getting Spambayes
working yesterday afternoon and training it on a couple of hundred
messages, I came home this evening and found only two spam mails in my
inbox. There were 313 classified spam mails in the trash, and after
going through those I found only one false positive, from a commercial
advertising list I'm subscribed to. I guess my solution ain't broke
either...

Sounds like it isn't. Double check that you do not see any "ALL_TRUSTED"
messages. That means SA could not guess your "trusted" mail server(s).
And that "trusted" is a rather loose trust. It's the furthest out from
your site that all email passes through and you trust not to lie to you
about the message headers it inserts. In my case that's the Earthlink
servers and my own machine. (Fetchmail often confuses poor little old
SpamAssassin. So I set it explicitly.)
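
Setting it explicitly might look something like this in local.cf. The
address range is a documentation-only placeholder standing in for the
ISP's relays, not a real Earthlink network.

```
# Hypothetical local.cf fragment; 192.0.2.0/24 is a placeholder range.
# List your own machine plus the outermost relays whose Received
# headers you are willing to believe.
trusted_networks 127.0.0.1 192.0.2.0/24
```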

For mailing lists to which you are subscribed you can use
"whitelist_from_rcvd", which whitelists messages that claim to come
from a given sender only when they also arrive through a specific
mail server name.
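
A minimal sketch of such an entry, using a made-up list address and
relay domain:

```
# Hypothetical local.cf entry; both the address and the domain are
# placeholders. Format: whitelist_from_rcvd <sender address> <relay domain>
whitelist_from_rcvd announce@lists.example.com example.com
```

The second argument is compared against the name of the relay that
handed the message to your trusted servers, so a forged From address
alone should not be enough to match.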

{^_^}

