Man, they'll try anything to hack your system...

Ben Scott dragonhawk at gmail.com
Mon Jan 30 11:31:01 EST 2006


On 1/28/06, Python <python at venix.com> wrote:
> A huge percentage of my spam comes from IP addresses with no reverse
> lookup.

  A huge percentage of my spam contains the character 'e' in the message body.

  Both the above statements illustrate a classic spam-fighting
mistake.  It's *absolutely useless* to say "A huge percentage of my
spam meets such-and-such a criterion" unless you can *also* say "A huge
percentage of my ham does *not* meet the same criterion".  If you've
got that, you've got something useful.  Otherwise, forget it.
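
  To put numbers on that, here is a rough sketch in Python.  The
function names, the 10x threshold, and the counts are all made up for
illustration; they are not measurements from anyone's mail server.  It
compares how often a test fires on spam with how often it fires on ham.

```python
def hit_rates(spam_hits, spam_total, ham_hits, ham_total):
    """Return (spam_rate, ham_rate) for a candidate spam test."""
    return spam_hits / spam_total, ham_hits / ham_total


def is_useful(spam_hits, spam_total, ham_hits, ham_total, min_ratio=10.0):
    """Call a test useful only if it fires far more often on spam than on ham.

    min_ratio is an arbitrary illustrative threshold, not a standard value.
    """
    spam_rate, ham_rate = hit_rates(spam_hits, spam_total, ham_hits, ham_total)
    if ham_rate == 0:
        # Never fires on ham: useful as long as it fires on some spam.
        return spam_rate > 0
    return spam_rate / ham_rate >= min_ratio


# Hypothetical counts for "message body contains the letter 'e'":
# fires on nearly all spam, but on nearly all ham too.
print(is_useful(990, 1000, 985, 1000))   # False

# Hypothetical counts for a test that fires on 90% of spam and 1% of ham.
print(is_useful(900, 1000, 10, 1000))    # True
```

  The first test "catches" 99% of spam and is still worthless, because
it says nothing about whether a given message is spam or ham.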

  Anyone here have figures on what percentage of their ham violates
standards or best practices?  I don't, but based on anecdotal evidence
from operators much larger than me, the answer is: "A lot."

  The key is to distinguish spam from ham, not merely to assign
characteristics to spam.

> A good chunk of the remaining spam comes from roadrunner
> addresses, presumably rooted zombies.

  Blocking the mass-market consumer Internet feed ranges is reportedly
a rather more effective spam/ham separator than looking for standards
compliance.  The vast majority of mail from such ranges is, in fact,
spam.  Of course, there are a few people running their own MX on such
feeds who get rather annoyed by such actions, including people on
this list.  Sadly, those are so few that they are often considered
"justifiable collateral damage".
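
  As a minimal sketch of what such a range check looks like, assuming
Python's standard ipaddress module: the networks below are placeholder
documentation ranges (RFC 5737), not real consumer feed ranges, and the
function name is hypothetical.  A real deployment would normally use a
maintained list or DNSBL rather than a hand-written table.

```python
import ipaddress

# Placeholder networks from the RFC 5737 documentation blocks.
# These are NOT real consumer ISP ranges; substitute a maintained list.
CONSUMER_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def from_consumer_range(client_ip, ranges=CONSUMER_RANGES):
    """Return True if the connecting IP falls inside any listed range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ranges)


print(from_consumer_range("192.0.2.25"))    # inside the first placeholder range
print(from_consumer_range("203.0.113.9"))   # outside both placeholder ranges
```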

> spambayes provides an effective client spam filter (spambayes.org).  The
> Outlook plugin is easy for Windows/Outlook folks.  For everyone else,
> you'd probably run it as an imap/pop proxy.

  I find Bayesian filtering with a good user feedback loop is still
the overall best solution.

  At Net Tech, for IMAP clients, I was working on a solution that used
SpamAssassin and procmail to sort mail into folders, along with a few
specially-named folders clients could move mail into to identify it as
"also-spam" or "not-spam".  A cron job ran nightly to process those
exceptions and train the filter.  I left Net Tech before it went
beyond the initial testing phase, but it looked promising.
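
  A rough sketch of what a nightly training job of that kind could look
like follows.  This is not the actual Net Tech setup: the folder layout,
the function names, and the choice of Python are assumptions made for
illustration.  The only SpamAssassin pieces relied on are the sa-learn
command and its --spam and --ham options.

```python
import subprocess
from pathlib import Path

# Hypothetical folder names, mapped to the sa-learn option used for each.
TRAINING_FOLDERS = {
    "also-spam": "--spam",   # mail the filter missed
    "not-spam": "--ham",     # mail the filter wrongly flagged
}


def build_commands(mail_root):
    """Build one sa-learn command per training folder that exists."""
    commands = []
    for folder, flag in TRAINING_FOLDERS.items():
        path = Path(mail_root) / folder
        if path.is_dir():
            commands.append(["sa-learn", flag, str(path)])
    return commands


def train(mail_root):
    """Run the training commands; meant to be called from a nightly cron job."""
    for command in build_commands(mail_root):
        subprocess.run(command, check=True)
```

  Keeping the command construction separate from the part that runs
sa-learn makes it possible to check the folder handling on a machine
that has no SpamAssassin installed.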

-- Ben


