Anti spam filter reviews, junk mail advice and spam blocker user ratings from WhichSpamFilter.com
Contents

Types Of Filters
Terminology
Content Based Filters
Bayesian Filters
WhiteList/BlackList Filters
Challenge/Response Filters
Community Filters
Filter Placement
Conclusion



Types of Spam Filters

All spam filters available today act in one of a few ways, or a combination of these ways.
It must be noted here that the "perfect" spam filter has not been invented yet - and almost certainly never will be. There is an ongoing game of "cat-and-mouse" between the spammers and the filters designed to protect against them. Whenever new protective measures are put in place, the spammers try to come up with new ways to get round them.
The good news is that, as evidenced by the desperate measures they are currently taking to fool the filters at the expense of getting their message across, we have them on the run.

Before we discuss the different kinds of filters currently available, it will be necessary to introduce some simple terminology:

Terminology

  • False Positive - This is where a spam filter identifies a message as spam when it is innocent. This is the worst kind of error for a spam filter to make. It is better to err on the side of false negatives.
  • False Negative - This is where a spam filter fails to identify a spam message as spam. A lesser problem than false positives, but still to be avoided.
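To make the two terms concrete, here is a minimal sketch that counts both kinds of error for a batch of messages. The function name and the sample data are hypothetical and are not taken from any particular filter.

```python
def error_counts(actual_is_spam, flagged_as_spam):
    """Count false positives and false negatives for a batch of messages."""
    false_positives = 0
    false_negatives = 0
    for is_spam, flagged in zip(actual_is_spam, flagged_as_spam):
        if flagged and not is_spam:
            # An innocent message was identified as spam.
            false_positives += 1
        elif is_spam and not flagged:
            # A spam message slipped through unidentified.
            false_negatives += 1
    return false_positives, false_negatives


# Hypothetical results for five messages.
actual = [True, True, False, False, True]
flagged = [True, False, False, True, True]
fp, fn = error_counts(actual, flagged)
print(f"false positives: {fp}, false negatives: {fn}")
```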

The "ideal" spam filter would produce zero false positives and zero false negatives. This, of course, is an impossibility, but some filters, set up correctly, will get pleasantly close! Let's look at the different kinds available...

Content Based Filters

Content-based filters are the traditional type of spam filter. They simply analyse the message subject, headers and content, looking for "Kill" words or phrases, or other indicators of spam.
Whenever an undesirable message gets through to your inbox, you simply create a new filter by choosing certain words or phrases from the message that indicate it is spam.
Over the years, spammers have been aware that their messages were being killed by these content filters and have resorted to ever more desperate tricks to try to fool the content filters. This would explain why you get so much mail for "Vi@gra", "Mort.gage", "L|0|a|n|$" etc...

This practice has become so prevalent that older content-based filters now perform less well. However, some of the more modern offerings can perform "wildcard" searches, and at least one can "see through" the spammer's attempts at "obfuscating" words and phrases, such as in the examples above. It can even recognise that these attempts are being made and treat that in itself as an indication of spam.
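As a rough illustration of how a filter might "see through" simple obfuscation, the sketch below undoes a few character substitutions before checking a kill-word list. The kill words, the substitution table and the function names are all assumptions made for this example; commercial filters use their own, more elaborate rules.

```python
import re
import string

# Hypothetical kill words; a real filter would let the user edit this list.
KILL_WORDS = ("viagra", "mortgage", "loan")

# A few characters commonly substituted for letters (illustrative only).
SUBSTITUTIONS = str.maketrans({"@": "a", "0": "o", "$": "s", "1": "l", "3": "e"})


def normalise(word):
    """Undo simple substitutions, then drop anything that is not a letter."""
    word = word.lower().translate(SUBSTITUTIONS)
    return re.sub(r"[^a-z]", "", word)


def find_kill_words(text):
    """Return (kill_word, was_obfuscated) for each word that matches."""
    hits = []
    for raw in text.split():
        plain = normalise(raw)
        for kill in KILL_WORDS:
            # Prefix match, so that "loans" is caught by "loan".
            if plain.startswith(kill):
                # Ignore ordinary punctuation at the edges of the word when
                # deciding whether the spelling was tampered with.
                as_written = raw.lower().strip(string.punctuation)
                hits.append((kill, as_written != plain))
    return hits


print(find_kill_words("Cheap Vi@gra and L|0|a|n|$ today"))
print(find_kill_words("I approved the loan."))
```

The second value in each pair records whether the word had to be "cleaned up" before it matched, which is the extra hint that an obfuscation attempt was made. Note that the prefix match is deliberately crude and would also catch innocent words such as "loaned".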

At the end of the day, the spammer has to get his or her message across, and the only way to do that in an e-mail is with the written word. Many spam messages are now virtually illegible because of these attempts to fool content-based filters.

Pros:

  • Flexible. You can easily tailor the filtering to the exact type of spam message you have to deal with and, just as importantly, not to baulk at words or phrases that you use daily in your business or with your friends.

Cons:

  • Require more "hands on" tuning and maintenance. As spammers resort to new tricks to foil the filters, or new products get advertised, extra filters have to be created to deal with them.

Bayesian Based Filters

Born in London around 1702, the son of a minister, Thomas Bayes developed a formula which allowed him to determine the probability of an event occurring based on the probabilities of two or more independent pieces of evidence.

Bayesian filters are based on this theory.
They have to be "trained" on known "good" and "bad" e-mails. During training they extract "tokens" (typically individual words) and store them in a database.


When analysing a new message, the message is split into tokens and each token is given a value according to the following criteria:

  • The frequency of the token in spam messages that the filter has been trained on
  • The frequency of the token in good messages that the filter has been trained on
  • The number of spam messages the filter has been trained on
  • The number of good messages the filter has been trained on

From applying Bayes' formula to these results, a value is extracted that gives the probability of this message being spam or not. This value is often called "spamicity".
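The sketch below shows one simple way the four quantities listed above could be turned into a "spamicity" score. It is an illustration under stated assumptions, not the method of any particular product: the class and method names are made up, unseen tokens are given a neutral 0.5, probabilities are clamped to the range 0.01 to 0.99, and spam and good mail are assumed to be equally likely beforehand.

```python
from collections import Counter


def tokenize(text):
    """Split a message into a set of lower-case word tokens."""
    return set(text.lower().split())


class BayesFilter:
    def __init__(self):
        self.spam_tokens = Counter()  # token -> number of spam messages containing it
        self.good_tokens = Counter()  # token -> number of good messages containing it
        self.spam_count = 0           # spam messages trained on
        self.good_count = 0           # good messages trained on

    def train(self, text, is_spam):
        tokens = tokenize(text)
        if is_spam:
            self.spam_tokens.update(tokens)
            self.spam_count += 1
        else:
            self.good_tokens.update(tokens)
            self.good_count += 1

    def token_probability(self, token):
        """Estimate the probability that a message containing this token is spam."""
        spam_freq = self.spam_tokens[token] / max(self.spam_count, 1)
        good_freq = self.good_tokens[token] / max(self.good_count, 1)
        if spam_freq + good_freq == 0:
            return 0.5  # assumption: an unseen token is treated as neutral
        p = spam_freq / (spam_freq + good_freq)
        # Clamp so that a single token can never be absolute proof either way.
        return min(0.99, max(0.01, p))

    def spamicity(self, text):
        """Combine the token probabilities, treating tokens as independent."""
        spam_product = 1.0
        good_product = 1.0
        for token in tokenize(text):
            p = self.token_probability(token)
            spam_product *= p
            good_product *= 1.0 - p
        return spam_product / (spam_product + good_product)


bayes = BayesFilter()
bayes.train("cheap loan offer now", is_spam=True)
bayes.train("cheap pills offer", is_spam=True)
bayes.train("meeting agenda for monday", is_spam=False)
bayes.train("loan application review meeting", is_spam=False)

print(bayes.spamicity("cheap offer"))      # close to 1
print(bayes.spamicity("meeting agenda"))   # close to 0
print(bayes.token_probability("loan"))     # seen equally in both, so neutral
```

Multiplying many small numbers can underflow on very long messages, so a more robust version would probably work with logarithms or only score the most telling tokens.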

Some current Bayesian-based filters are returning very impressive detection rates with very few false positives or false negatives.

Pros:

  • Require less maintenance than other filters. Once the engine has been "trained", they pretty much look after themselves.
  • They automatically adapt to shifting trends in spam. Because Bayesian filters continue to learn from newly arrived messages they will naturally adapt to shifting trends.
  • Will automatically adapt to the particular user's usual e-mails. If a user is, for instance, a loan officer, then messages that repeatedly mention loans won't necessarily be identified as spam.
  • Good track record of very few false positives.

Cons:

  • Filtering is only as good as the messages on which the filter is "trained". Many filters based on this technology come "pre-trained", but obviously not on your type of messages. All will require some time before they reach optimum filtering ability.
  • Has the potential to be fooled by diluting the spam message with enough obviously innocent words.

Whitelist/Blacklist Filters

These are very basic types of filters which nowadays are rarely used on their own, but are still used as part of an integrated filtering system comprising some of the other methods shown here.

Whitelist filters will not accept e-mail from any address unless it is on a list of known "good" e-mail addresses.

Blacklist filters, conversely, will allow messages from any address unless the address is on a list of known "bad" sources.

Blacklists can be stored and administered on a local system or referenced via the internet. Blacklists available on the internet are referred to as "RBLs", or Realtime Blackhole Lists.
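The two list types can be sketched in a few lines. The addresses and the list zone `rbl.example.org` below are placeholders, not real services. The last function shows the usual convention for looking an IPv4 address up in an internet blacklist: reverse the octets, append the list's zone, and perform a DNS lookup on the result (the lookup itself is left out here so the sketch runs without a network connection).

```python
import ipaddress

# Hypothetical local lists.
WHITELIST = {"alice@example.com"}
BLACKLIST = {"spammer@example.net"}


def whitelist_accepts(sender):
    """A whitelist accepts only senders that are on the list."""
    return sender.lower() in WHITELIST


def blacklist_accepts(sender):
    """A blacklist accepts every sender that is not on the list."""
    return sender.lower() not in BLACKLIST


def rbl_query_name(ip, zone="rbl.example.org"):
    """Build the DNS name that would be looked up for an IPv4 address."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # also validates the address
    return ".".join(reversed(octets)) + "." + zone


print(whitelist_accepts("bob@example.org"))   # False: unknown senders are rejected
print(blacklist_accepts("bob@example.org"))   # True: unknown senders are accepted
print(rbl_query_name("192.0.2.99"))
```

If the generated name resolves, the address is listed; if the lookup fails, it is not. The same unknown sender is rejected by the whitelist and accepted by the blacklist, which is why the two are usually combined with other methods.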

Pros:

  • Whitelists are guaranteed to stop e-mail from unwanted sources.
  • Properly maintained blacklists should result in zero false positives.

Cons:

  • Whitelists are a drastic measure with very little flexibility.
  • Sometimes the people who compile RBLs - the realtime blacklists available on the internet - put entire ranges of IP addresses on their blacklist even though previous abuse occurred only on a certain part of that range. This results in "collateral damage" - the situation where innocent people get blocked as a by-product of stopping the spammer - and is the subject of much contention.

Challenge/Response Filters

Challenge/Response filters are characterised by their ability to automatically send a reply (the "challenge") to an unknown sender, asking them to take some further action to ensure their message will be received. This is often referred to as a "Turing Test" - named after a test devised by British mathematician Alan Turing to determine if machines could think.

Recent years have seen the appearance of some internet services which automatically perform this Challenge/Response function for the user and require the sender of an e-mail to visit their web site to facilitate the receipt of their message.

Critics of this system claim that it is too drastic a measure and that it sends a message that "my time is more important than yours" to the people trying to communicate with you.
While this may be true, it is our opinion that it is a valid measure providing that the challenge is not sent as a matter of course, but only once a message has been analysed and deemed to be questionable.

For some low traffic e-mail users though, this system alone may be a perfectly acceptable method of completely eliminating spam from their inbox - one step above the "Whitelist" system outlined above.
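The flow described in this section can be sketched as follows. This is a simplified, hypothetical model: the class and method names are invented, the actual sending of the challenge e-mail is left out, and the `looks_questionable` flag stands in for whatever analysis decides that a message deserves a challenge.

```python
import secrets


class ChallengeResponseFilter:
    def __init__(self, known_senders):
        self.known_senders = set(known_senders)
        self.pending = {}  # challenge token -> (sender, held message)

    def receive(self, sender, message, looks_questionable):
        """Return ("deliver", None) or ("challenge", token)."""
        # Only challenge unknown senders whose message was deemed questionable.
        if sender in self.known_senders or not looks_questionable:
            return ("deliver", None)
        token = secrets.token_hex(8)
        self.pending[token] = (sender, message)
        return ("challenge", token)

    def confirm(self, token):
        """Called when the sender completes the challenge; releases the message."""
        held = self.pending.pop(token, None)
        if held is None:
            return None  # unknown or already used token
        sender, message = held
        self.known_senders.add(sender)  # no further challenges for this sender
        return message


cr_filter = ChallengeResponseFilter({"friend@example.com"})
action, token = cr_filter.receive("stranger@example.net", "hello", looks_questionable=True)
print(action)                    # the message is held and a challenge is issued
print(cr_filter.confirm(token))  # the sender responded, so the message is released
```

Once a sender has answered a challenge they are treated as known, so the inconvenience is a one-off rather than something repeated for every message.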

Community Filters

These types of filters work on the principle of "communal knowledge" of spam and communicate with a central server. When a user receives a message that is spam, they simply mark it as such. This information is posted to the central server, where a "fingerprint" of the message is added to the database. When enough people have "voted" the message as spam, it will be blocked from users' inboxes in the future.

Pros:

  • Easy to set up and very minimal administration.
  • "Feel Good Factor" - knowing that by marking a message as spam, you are preventing it from being delivered to thousands of others.

Cons:

  • Before enough votes are cast, somebody will be getting the spam messages.
  • One person's idea of spam may not be another's - consequently, some innocent mails may be blocked because overzealous people have voted against them (false positives).
  • Some spammers will slightly change each message sent so that the "fingerprint" of the message is different, meaning it may not be recognised as the previous message blocked.
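The voting scheme, and the last weakness listed above, can be illustrated with a small sketch. Everything here is an assumption made for the example: the class name, the threshold of three votes, and the use of a SHA-256 hash of the whitespace-normalised text as the "fingerprint". Real services use their own, usually fuzzier, fingerprinting methods.

```python
import hashlib

VOTE_THRESHOLD = 3  # hypothetical number of votes needed to block a message


class CommunityServer:
    def __init__(self, threshold=VOTE_THRESHOLD):
        self.threshold = threshold
        self.voters = {}  # fingerprint -> set of user ids that reported it

    @staticmethod
    def fingerprint(message):
        """Hash the message after normalising case and whitespace."""
        normalised = " ".join(message.lower().split())
        return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

    def report_spam(self, user_id, message):
        # A set is used so that each user's vote counts only once.
        self.voters.setdefault(self.fingerprint(message), set()).add(user_id)

    def is_blocked(self, message):
        votes = self.voters.get(self.fingerprint(message), set())
        return len(votes) >= self.threshold


server = CommunityServer()
for user in ("ann", "bob", "cat"):
    server.report_spam(user, "Buy now!")

print(server.is_blocked("Buy now!"))      # enough votes have been cast
print(server.is_blocked("Buy now! x9"))   # slightly changed text, new fingerprint
```

With an exact hash like this one, adding a couple of random characters is enough to produce a different fingerprint, which is precisely the trick spammers use against this type of filter.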

Filter Placement

One final distinguishing factor to consider between spam filters is their actual placement.
There are three main schemes for filter placement:

  • Filters that integrate with your e-mail client - Many modern spam filters will integrate with popular e-mail clients, such as Outlook or Outlook Express.
    Pros:
    • Minimal impact on your normal e-mail reading habits. Spam messages are usually simply moved to a "Junk Mail" folder where they can be reviewed and/or deleted.
    Cons:
    • Ties you to your current e-mail client.
    • Inflexible - Often gives you limited choices as to your alerting level. For instance, when running Microsoft Outlook with an integrated spam filter, whenever a spam message arrives, you still get an alert that a new message has arrived. You have to go into the Outlook interface to confirm that the newly arrived message was spam and not an important e-mail.
      We have been unsuccessful in trying to get Outlook to create either a different audible alert between good and bad messages, or only to alert on the arrival of "good" messages since all messages go into the Inbox before they are acted upon by the filter and moved to a separate folder.
      This will result in either ignoring a new message that arrives, or continually being disturbed only to find that the new message is spam.

  • Filters that act as a "proxy" between the mail server and your e-mail client - These filters run in the background on your desktop and periodically poll your e-mail server, retrieve the messages found and act on them before they reach your normal e-mail client.
    Pros:
    • Flexibility - they usually have more control over the messages on your server and can mark, move or even delete messages before they are seen by your normal e-mail client.
    • They do not tie you to any particular e-mail client.
    • Security - they represent another layer between the internet and your e-mail client. They usually will not run any applications or run scripts found in the e-mail message.
    Cons:
    • Impose a change on your normal e-mail viewing habits. Effective use of these involves turning off auto-checking in your normal e-mail client so that the proxy has a chance to work on the server first.
    • E-mail account information will need to be set up in the filter as well as in your normal e-mail client.

  • Server Based Filters - These are usually only used in a corporate or business environment rather than in the home. All e-mail arrives at a central server, where it is filtered by the server-based filter, and individual users collect their messages on their desktop from the central server.
    Pros:
    • Central management of all e-mail filtering rules ensuring consistency across the network.
    • Individual users have little or no responsibility for spam management, freeing them to be more productive in their work.
    Cons:
    • Usually require more maintenance and require the presence and time of an experienced network administrator to manage the filter.
    • Often more expensive.

Conclusion

The perfect spam filter has not been invented yet, and probably never will be. Furthermore, the situation will continually change as the ongoing battle between the spammers and the anti-spammers progresses. Today's super-filter may easily become tomorrow's "has-been".

In the end, the best filters will probably use a combination of many, or all, of the features listed above.

The best of today's filters are returning quite impressive results in terms of capture rate, low false positives and low false negatives.

For a look at the best available in the world of spam filtering, see our Spam Filters and Reviews sections.
