From: <kevin.kempter@xxxxxxxxxxxxxxxxx>
On Wednesday 14 September 2005 14:16, jdow wrote:
Kevin, it's called a "Joe Job". It is exceptionally common. Email
headers are pathetically easy to forge, at least the ones that existed
while the email was still on the sender's machines. Often if you trace
the Received headers you find "discontinuities" in the chain, if the
spammer even bothers to forge them anymore. This is one of the things
that automated tools like SpamAssassin have gotten pretty good at finding.
The spammers are into cleverer tricks these days. Spammers still use
the "Joe Job", the forged sender, most of the time. I use it as one of
my customized SpamAssassin rules, as a matter of fact. It's part of a
set of rules and meta rules that can work on my addresses.
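
A rough sketch of the "discontinuity" idea in Python, for illustration
only. This is not how SpamAssassin does it. The function names and the
two regexes are made up for this example, and the matching is
deliberately simplistic: it only compares the host after "by" in one
Received header with the host after "from" in the next hop. Plenty of
legitimate mail fails that test (HELO names often differ from real host
names), so treat a hit as a hint, not proof.

```python
import re
from email import message_from_string

# Simplified patterns. Real Received headers vary widely, so these are
# an assumption that only covers the common "from X ... by Y" layout.
FROM_RE = re.compile(r"\bfrom\s+([A-Za-z0-9.-]+)", re.IGNORECASE)
BY_RE = re.compile(r"\bby\s+([A-Za-z0-9.-]+)", re.IGNORECASE)


def received_hops(raw_message):
    """Return (from_host, by_host) pairs, oldest hop first."""
    msg = message_from_string(raw_message)
    hops = []
    for header in msg.get_all("Received", []):
        from_match = FROM_RE.search(header)
        by_match = BY_RE.search(header)
        hops.append((
            from_match.group(1).lower() if from_match else None,
            by_match.group(1).lower() if by_match else None,
        ))
    # Each server prepends its header, so the newest hop comes first
    # in the message. Reverse to walk the chain in delivery order.
    hops.reverse()
    return hops


def find_discontinuities(hops):
    """Return indexes where a hop's 'by' host differs from the next hop's 'from' host."""
    breaks = []
    for i in range(len(hops) - 1):
        by_host = hops[i][1]
        next_from = hops[i + 1][0]
        if by_host and next_from and by_host != next_from:
            breaks.append(i)
    return breaks
```

Feed received_hops() the raw text of a message and pass the result to
find_discontinuities(). An empty list means the chain at least looks
consistent by this crude measure.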
... lots deleted
Thanks for the info.
Can you send me info on what a SpamAssassin filter to catch these would
need to look like?
May I suggest several things.
1) Join the spamassassin users list. http://www.spamassassin.org will
get you there even though it is an Apache project now.
2) Visit the SpamAssassin Rules Emporium, SARE, and look over the
various rule sets. http://www.rulesemporium.com/
3) Visit the admittedly iffy SpamAssassin wiki. (Pointer on their
main page.)
4) Brush up on your Perl regular expressions and read the wiki entry
on making your own rules.
5) Do *NOT* run anything other than 2.64 or 3.0.4. Earlier versions of
either series are either trash or subject to DoS attacks. 3.0.4 seems
to be reasonably stable. (It has a Perl-based problem with per user
rules, though not per user scores, if the rules require a Perl eval
to be run.) 2.64 is also stable, but it's getting old.
// Joanne's general rules for a SMALL SpamAssassin server.
6) Do NOT use auto-<anything>. Turn off auto-learn. Turn off
auto-whitelist. They cause problems with the default settings. If you
must use them, set the trigger scores farther apart in both directions.
7) Permit at least per user Bayes. Per user scores are good as well.
Some people consider the darndest things to be ham or spam. One
person's gold is another person's cow manure. Per user rules are
nicer yet, IFF your users are smart enough to do it right. (Loren
and I are. Erm, he is one of the SARE ninjas.)
8) As noted, visit SARE and set up to use as many rule sets as seem safe
for your needs. (I use about 42 of them.) It saves ME having to look
at a new series of spam using a new trick to find a common identifying
feature around which a good rule can be made.
9) Set the BAYES_99 rule up to about 5 points once your Bayes is well
trained. Here it hits over 50% of all spam and 0.000% of ham. If it
hits, I figure there is only a VERY small chance it's wrong. It took
a year to get there.
10) Turn off auto-Bayes everything. Since you are not training Bayes on
almost every message there is no need to expire it periodically
either. I've never expired mine. And I find I need to train with
one or two low scoring spams every few days.
a) Set up an IMAP server on your machine that is NOT accessible from
outside, of course. Create IMAP folders for spam and ham for each
user. Have the users slide samples of good ham and definite spam into
their respective folders. Use a cron job to train on these.
b) I grab ham samples from various mail sorts in my OE setup. I use
POP3 for grabbing mail and a separate IMAP "account" for the ham and
spam. I am particular about copying most escaped spam to the spam
folder. (Some have so little identifying virtue that I shrug
and let them go away, although even the geocities.co.uk URL-only
spams are worth Bayes training.) I also look at the lowest scoring
spam messages and see if their Bayes score is abnormally low. If
it is and there's training meat present I toss them into the
spam training folder.
11) That brings us to another good idea, NEVER simply delete spam. Check
to see if it is a bozo friend sending you a peculiarly formatted
chunk of ham or if it is a customer trying to reach you from AOL.
(Well, same thing, really. But...) I tell SpamAssassin to encapsulate
spam in a MIME layer and add something like **** SPAM **** 024.5 **
to the subject line. Then subject lines with **** SPAM **** in them
get tossed into an OE spam folder. I can sort by score nicely, and
that makes finding the low scorers easy.
12) The man spamassassin page is not quite worthless if you want to do
something peculiar. "man Mail::SpamAssassin" is better if you want
details.
13) You MUST have a trusted mail server somewhere in your chain. It may
simply be your own or it may be your ISP's if you use fetchmail as I
do. Trusted in this context *ONLY* means that the server can be
trusted far enough to NOT forge any addresses itself. So that is the
place the DNS based rules can start from.
14) SURBL (Jeff) and URIBL (Chris) are good guys. Watch the scores on
other BLs you may use. Some are overly enthusiastic and catch some
rather significant chunks of ham in their nets. I generally make sure
they score LOW.
15) About the only useful SPF signal is a message that violates an
existing SPF record. I give that a slight score.
16) The various "habeas" sort of things are not generally worth anything.
The mad Russian sometimes forges them just for the grins and giggles
of it. (He is also a very smart critter who plays manipulative tricks
with his DNS servers that are beyond most people. It took quite a
while of patient explanation before I understood one of them.)
17) <Well, that's enough for now. Just how dedicated to this do you want
to be? If you're normal, which I'm not, I've gone past reality for
you already. {^_-} But I will note that I had zero real ham marked
as spam so far today and zero escaped spam. I did have two messages
that were ham marked as spam because they contained very VERY spammy
bodies. A correspondent and I were discussing the mad Russian's tricks.
And I don't have him whitelisted yet. I've been lazy and remiss. I
pay for it with extra work recovering his emails from my spam folder.>
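
A minimal sketch for item 11, assuming the subject tag looks exactly
like the **** SPAM **** 024.5 ** example given there. It pulls the score
out of tagged subject lines and lists the lowest scorers, which are the
ones worth a second look for Bayes training. The names TAG_RE,
spam_score and lowest_scoring are hypothetical, chosen for this example.
Adjust the pattern to whatever tag your own setup writes.

```python
import re

# Matches a tag shaped like "**** SPAM **** 024.5 **" anywhere in the subject.
# The exact tag text is an assumption taken from the example in item 11.
TAG_RE = re.compile(r"\*{4} SPAM \*{4} (\d+(?:\.\d+)?) \*\*")


def spam_score(subject):
    """Return the score embedded in a tagged subject, or None if untagged."""
    match = TAG_RE.search(subject)
    return float(match.group(1)) if match else None


def lowest_scoring(subjects, limit=5):
    """Return up to `limit` (score, subject) pairs, lowest score first.

    Subjects without a spam tag are skipped.
    """
    scored = [(spam_score(s), s) for s in subjects]
    tagged = [pair for pair in scored if pair[0] is not None]
    return sorted(tagged)[:limit]
```

The zero padding in the tag is presumably what lets a plain
alphabetical sort on the subject column double as a sort by score in a
mail client. The sketch converts to a number instead, so it does not
depend on the padding.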
{^_^}