Junk mail statistics

Written by josh on October 31st, 2009

At Diggy, we take the handling of junk mail (spam and viruses) seriously. If we didn’t, our mailboxes would be full of illegal advertising campaigns, viruses and scams.

I went over our logs for the past week to show how many emails were coming into our main mailservers, how many were rejected based on certain criteria, how many were accepted but marked as spam and how many got all the way through to the inbox.

The number of “clean” messages, ones that our spam detection software believes are OK, was 31,500, so we are averaging a real email roughly every 20 seconds. The number of emails that our spam detector believed were spam was 4,350. These are marked in the headers and, by default, are filtered into the “Spam” folder that can be accessed through webmail.

Now that we have those statistics as a baseline, here are the interesting ones:

The number of viruses: 535

This doesn’t sound like a lot. I assume this is mostly because many more people are proactive with antivirus software these days.

The number of email connections that we deemed dubious and asked to come back later: 235,000

By dubious, we mean that we check each incoming connection to determine whether it comes from a correctly configured mail server, as it may instead be a virus or spamming software running on someone’s computer. If the server comes back later to deliver the same message, we assume it is in fact a real mailserver and allow the email through.
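The “come back later” behaviour described above resembles what is commonly called greylisting. As a rough illustration only, here is a minimal Python sketch of that idea. The class name, the five-minute delay, the choice of (client IP, sender, recipient) as the key and the reply strings are all assumptions made for the example. They are not a description of how our mailservers are actually configured.

```python
import time


class Greylist:
    """Toy greylist: defer the first attempt, accept a retry after a delay.

    Hypothetical sketch; the key and delay are illustrative assumptions.
    """

    def __init__(self, min_delay_seconds=300):
        self.min_delay_seconds = min_delay_seconds
        # Maps (client_ip, sender, recipient) to the time of the first attempt.
        self.first_seen = {}

    def check(self, client_ip, sender, recipient, now=None):
        """Return an SMTP-style reply string for this delivery attempt."""
        now = time.time() if now is None else now
        key = (client_ip, sender, recipient)
        first = self.first_seen.get(key)
        if first is None:
            # First time we see this combination: remember it and defer.
            self.first_seen[key] = now
            return "451 try again later"
        if now - first < self.min_delay_seconds:
            # Retried too quickly: still deferred.
            return "451 try again later"
        # A retry after the delay suggests a real mailserver with a queue.
        return "250 ok"
```

A real mailserver queues the message and retries later, so its second attempt is accepted. Much spamming software never retries, so its message is never delivered.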

The number of email connections that we close with a permanent error: 396,000

We close connections with a permanent error if the server is a known spammer, breaks a lot of rules, or is trying to forward emails through us to an email address we don’t host. In some cases, when we tell the connecting server to come back later, it attempts to continue sending the email, so we then have to force the connection to close. Because such a connection is counted in more than one category, our total number of connections is less than the sum of all of the above.

So in the 7-day period we had 562,000 connections attempting to send us email. That’s almost one every second. Assuming that each of those connections was only going to deliver one email, 94% of incoming emails are spam. Since spammers often use one connection to send many emails at once, the real figure is even higher.
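For anyone who wants to check the arithmetic, here is a short Python sketch that reproduces the rates and the percentage from the counts quoted in this post. The variable names are made up for the example. It also assumes that only the 31,500 clean messages count as legitimate, which is one plausible reading of the 94% figure and not a documented formula.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds in the 7-day period

# Counts quoted in the post.
clean = 31_500
marked_spam = 4_350
viruses = 535
deferred = 235_000
rejected = 396_000
connections = 562_000

# Average gap between clean emails and between incoming connections.
seconds_per_clean = SECONDS_PER_WEEK / clean
seconds_per_connection = SECONDS_PER_WEEK / connections

# Assumption: one email per connection, and only clean messages are legitimate.
spam_share = (connections - clean) / connections

# The categories overlap, so their sum exceeds the connection total.
category_sum = clean + marked_spam + viruses + deferred + rejected

print(f"One clean email every {seconds_per_clean:.1f} seconds")    # 19.2
print(f"One connection every {seconds_per_connection:.2f} seconds")  # 1.08
print(f"Spam share of connections: {spam_share:.1%}")               # 94.4%
print(f"Sum of categories: {category_sum:,}")                       # 667,385
```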

The spam detection battle is ongoing. The spammers get smarter each week and try using different techniques. So far we’ve managed to keep our inboxes relatively clean while not removing real emails.

The spam software that we use is called dspam, and it can be trained. If you receive spam, or an email is incorrectly classified as spam, please send it to the dspam program so it can learn for your mailbox. You can find details on how to do this in the support area of our site under Managing Spam.

This entry was posted on Saturday, October 31st, 2009 at 7:50 am and is filed under Technical.