MailShell Spam Catcher Plugin for CommuniGate Pro


SpamCatcher Plugin Overview

The MailShell Spam Catcher Plugin runs as an External Filter and calculates a spam "score" for each message being processed. Unlike tools with statically defined patterns for spam messages, the MailShell SpamCatcher Plugin dynamically retrieves new patterns from the SpamCatcher Network, which provides greater accuracy for new spam messages.

The score ranges from 0 to 100; the higher the message score the more likely the message is spam. The score info is added to message headers so it can be processed by Server-Wide, Domain-Wide and Account Rules.

By default the added header lines look like this:

X-SpamCatcher-Score:  87 [XXXX]
X-Alert: possible spam!
X-Color: red

Besides the digital score value, the header field contains a "bar score" to simplify automated message processing: the more 'X' characters, the higher the score. The following ratios between the digital and bar scores are used:

Digital score range    Bar score
0                      []
1-39                   [X]
40-76                  [XX]
77-84                  [XXX]
85-90                  [XXXX]
91-99                  [XXXXX]
100                    [XXXXXX]
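
The mapping in the table above can be sketched in a few lines of Python. This is only an illustration of the ranges listed here, not the Plugin's own code; the names BAR_RANGES and bar_score are made up for this example.

```python
# Upper bound of each digital score range and the number of 'X' characters
# shown for it, copied from the table above.
BAR_RANGES = [(0, 0), (39, 1), (76, 2), (84, 3), (90, 4), (99, 5), (100, 6)]


def bar_score(score: int) -> str:
    """Return the bar score, e.g. '[XXXX]', for a digital score of 0..100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for upper, count in BAR_RANGES:
        if score <= upper:
            return "[" + "X" * count + "]"
    raise AssertionError("unreachable: 100 is the last upper bound")


if __name__ == "__main__":
    for value in (0, 39, 40, 87, 100):
        print(value, bar_score(value))
```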

Every day at midnight the Plugin generates a report message about the number of mails processed and their spam scores.

Note: The MailShell Spam Catcher Plugin is available only for some of the platforms supported by the CommuniGate Pro server software. Before you order the SpamCatcher Plugin License, make sure the plugin is available for your CommuniGate Pro Server platform.

Note: The MailShell Spam Catcher Plugin requires CommuniGate Pro version 4.0.6 or later.


Download the SpamCatcher Plugins

MailShell SpamCatcher plugins are available for certain platforms only.
Operating System                  CPU        Download (via http, via ftp)
Sun Solaris                       Sparc
Linux (RedHat 6.x/7.x, SuSE)      Intel
Microsoft Windows NT/2000/XP      Intel
Microsoft Windows 95/98/ME        Intel
Apple MacOS X (Darwin)            PowerPC
FreeBSD                           Intel
IBM AIX                           PowerPC

The current version of the Plugin is 1.5


Installing on a Unix System.


Installing on a MS Windows System.


Testing the SpamCatcher Plugin.

On a Unix System:

On a MS Windows System:


Integrating the SpamCatcher Plugin with CommuniGate Pro.

Please check the External Filters section of the CommuniGate Pro manual.

Open the General page in the Settings section of the WebAdmin Interface and click the Helpers link. Create a Helper for the SpamCatcher Plugin:

Content Filtering
Use Filter:
Log:
Program Path:
Time-out:
Auto-Restart:
Note: For a MS Windows system the Program Path should be CGPSpamCatcher\CGPSpamCatcher.exe

The recommended scanning Rule is as follows:

Data    Operation    Parameter
Action    Parameters

This rule skips messages from the MAILER-DAEMON address (such as non-delivery reports, return-receipts, etc.), and scans only messages for local recipients.

Note: The SpamCatcher License limits the number of messages the Plugin can scan within any 60 minute period. If the E-mail traffic exceeds the licensed limit, the Plugin will let the messages go through unrated. Without the license you can rate up to 5 messages per hour.

Here is a sample account-level Rule for dispatching the rated messages:

Data    Operation    Parameter
Action    Parameters

This Rule moves the incoming messages with score 85 and greater to the "junk_mail" mailbox.
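
The decision this Rule makes can be illustrated outside the server with a short Python sketch. It is not CommuniGate Pro Rule syntax; it only shows the logic of reading the score header and comparing it with the threshold. The function names and the "INBOX" default mailbox are assumptions made for the example.

```python
import email
import re

JUNK_THRESHOLD = 85  # same threshold as the sample Rule


def spam_score(raw_message: str):
    """Return the digital score from X-SpamCatcher-Score, or None if absent."""
    message = email.message_from_string(raw_message)
    value = message.get("X-SpamCatcher-Score")
    if value is None:
        return None  # the message was not rated
    match = re.match(r"\s*(\d+)", value)
    return int(match.group(1)) if match else None


def target_mailbox(raw_message: str) -> str:
    """Pick a mailbox name; "INBOX" is an assumed default for this sketch."""
    score = spam_score(raw_message)
    if score is not None and score >= JUNK_THRESHOLD:
        return "junk_mail"
    return "INBOX"


if __name__ == "__main__":
    sample = "X-SpamCatcher-Score:  87 [XXXX]\nSubject: hello\n\nbody\n"
    print(target_mailbox(sample))
```

Unrated messages (for example, those that passed through after the licensed limit was exceeded) carry no score header, so the sketch leaves them in the default mailbox.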

You can create the above Rule as a Domain-Wide Rule, but make sure the "junk_mail" mailbox exists in every account in the domain.


The Plugin Configuration File

On startup the SpamCatcher Plugin reads the contents of the CGPSpamCatcher.cfg file from the current directory. If this file does not exist, the Plugin creates a new file with the default values.

The default CGPSpamCatcher.cfg has the following contents:

HEADER=X-SpamCatcher-Score: ^1 [^2]
This line defines the header to be added to the rated messages.
The ^1 combination is replaced with the digital message score.
The ^2 combination is replaced with the bar score.
To create a multi-line header, use the \e combination as a line break. Make sure each line is an RFC-compliant header; it is best to start each line with the "X-" prefix. Example: HEADER=X-Score: ^1\eX-Bar-Score: ^2
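
The substitution described above can be sketched as follows. This is a hypothetical illustration, not the Plugin's implementation: render_header is an invented name, and the sketch assumes that ^2 expands to the 'X' characters only, which is inferred from the default template placing the brackets around ^2.

```python
from typing import List


def render_header(template: str, score: int, bar: str) -> List[str]:
    """Expand ^1 and ^2 in a HEADER template and split it on the \\e marker.

    `bar` is assumed to be the run of 'X' characters without brackets.
    """
    expanded = template.replace("^1", str(score)).replace("^2", bar)
    # "\e" is a literal backslash followed by 'e' in the configuration file.
    return expanded.split("\\e")


if __name__ == "__main__":
    for line in render_header("X-Score: ^1\\eX-Bar-Score: ^2", 87, "XXXX"):
        print(line)
```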

ALERT_LEVEL=85
This line defines the score which triggers the ALERT_HEADER to be inserted into the message.

ALERT_HEADER=X-Alert: possible spam!\eX-Color: red
This line defines the header to be added to a rated message if its score is equal to or greater than the value of ALERT_LEVEL. The "X-Color: red" line changes the message color when the message is viewed via the CommuniGate Pro WebMail interface.
Note: To dispatch spam via Rules you may check for the presence of the ALERT_HEADER instead of checking the message scores, but this method is less flexible because different users may want to use different scores as a threshold.

REPORT_ADDRESS=postmaster
This line defines the e-mail address to which the daily reports are sent. With the default settings the reports are sent to the postmaster address in the CommuniGate Pro main domain.

SUBMITTED_DIR=/var/CommuniGate/Submitted
This line defines the CommuniGate Pro Submitted directory (required for submitting the reports).
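
As a rough illustration of how these settings fit together, the sketch below reads KEY=VALUE lines and falls back to the defaults listed above. The exact parsing rules used by the Plugin are not documented here, so the handling of blank or malformed lines is an assumption, as are the names parse_config and needs_alert.

```python
# Default values as listed in this section.
DEFAULTS = {
    "HEADER": "X-SpamCatcher-Score: ^1 [^2]",
    "ALERT_LEVEL": "85",
    "ALERT_HEADER": "X-Alert: possible spam!\\eX-Color: red",
    "REPORT_ADDRESS": "postmaster",
    "SUBMITTED_DIR": "/var/CommuniGate/Submitted",
}


def parse_config(text: str) -> dict:
    """Parse KEY=VALUE lines; keys that are missing keep their default value."""
    settings = dict(DEFAULTS)
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # assumption: blank and malformed lines are skipped
        key, value = line.split("=", 1)
        settings[key.strip()] = value
    return settings


def needs_alert(score: int, settings: dict) -> bool:
    """True when the ALERT_HEADER would apply to a message with this score."""
    return score >= int(settings["ALERT_LEVEL"])


if __name__ == "__main__":
    config = parse_config("ALERT_LEVEL=90\n")
    print(needs_alert(87, config), needs_alert(95, config))
```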


The SpamCatcher Engine Configuration File

At initialization time the SpamCatcher Engine reads the configuration options specified in the data/spamcatcher.conf file.

The following lists the valid options. Note that all arguments are specified as strings. If options are not explicitly set they will assume their default value.

netcheck
Whether to communicate with the Mailshell SpamLabs to determine scoring.
Default: no
Valid values: yes,no

ruleupdate
How often to retrieve new rules from the Mailshell SpamLabs. The value is specified in units of integral seconds. Note that a value of "0" disables this feature and rule files will not be updated.
Default: 3600
Valid values: 0, 600 .. 2^32-1

sntimeout
Limits how long a single request to the Mailshell SpamLabs can take. The value is specified in units of integral seconds. Note that a value of "0" disables this feature and no limit will be placed.
Default: 5
Valid values: 0 .. 2^32-1

extended_rules
Enable the extended rule set for higher accuracy. Note that with this option enabled the program may take several minutes to initialize.
Default: yes
Valid values: yes,no

rbl_list
Specifies a list of Realtime Blackhole List (RBL) servers to query when analyzing messages.
Format: rbl_list=server:response:offset,server2:response2:offset2,...
rbl_list expects a comma separated list of RBL entries. In turn, each RBL entry consists of up to 3 colon separated items. Those items are:
1) server - name of an RBL server
2) response - the response given by an RBL server when an IP address is listed e.g. 127.0.0.2, 127.0.0.3, 127.0.0.4, etc. This is optional. The default is that all responses apply.
3) offset - The numeric offset to apply to the spam score if an IP address is listed on this RBL server. This is optional. The default is an offset of 100
Default: none
Example: rbl_list=bl.spamcop.net::40,bl.spamcop.net:127.0.0.3:75
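
To make the format concrete, here is a small Python sketch that splits an rbl_list value into entries and applies the documented defaults (any response, offset 100). It is an illustration only; RblEntry and parse_rbl_list are invented names, and rbl.example.org in the usage is a placeholder server.

```python
from typing import List, NamedTuple, Optional


class RblEntry(NamedTuple):
    server: str
    response: Optional[str]  # None means that all responses apply
    offset: int              # added to the spam score when the IP is listed


def parse_rbl_list(value: str) -> List[RblEntry]:
    """Parse 'server:response:offset,server2:response2:offset2,...'."""
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        parts = item.split(":")
        if len(parts) > 3:
            raise ValueError("too many fields in RBL entry: " + item)
        parts += [""] * (3 - len(parts))  # pad the optional fields
        server, response, offset = parts
        entries.append(
            RblEntry(server, response or None, int(offset) if offset else 100)
        )
    return entries


if __name__ == "__main__":
    for entry in parse_rbl_list("bl.spamcop.net::40,bl.spamcop.net:127.0.0.3:75"):
        print(entry)
```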

use_https
Communication between the SDK and the Mailshell SpamLabs is always encrypted. This encrypted communication can be sent over standard HTTP (port 80) or over HTTPS (port 443). If this option is set to "no", then HTTP is used. If set to "yes", then HTTPS is used.
Default: no

proxy_host
Specify the host name and port number of an HTTPS proxy to use when connecting to the Mailshell servers.
Default: none
Example: proxy_host=squid.corp.com:8080

proxy_userpwd
Specify the user name and password to use when connecting through the HTTPS proxy, in the form username:password.
Default: none
Example: proxy_userpwd=joe:mypassword

use_score_offsets
Enable the Training Database.
Default: no
Valid values: yes,no

use_score_history
Enable the tracking of historical scores for repeat senders. This can improve accuracy but it is still experimental.
Default: no
Valid values: yes,no


Approved Senders List

The Mailshell Spam Engine will accept a list of sender addresses or domains whose messages will never be considered spam.

The control file for Approved Senders is located at data/approvedsenders and will contain one line per sender. Each line can contain an email address or a domain. Addresses are of the format mailbox@domain and domains are simply of the format domain. Examples:

user@isp.com
spammer.net

Leading and trailing white space is ignored. Lines beginning with the # character are considered comments.

Blocked Senders List

The Mailshell Spam Engine will accept a list of sender addresses or domains whose messages are always considered spam.

The control file for Blocked Senders is located at data/blockedsenders and will contain one line per sender. Each line can contain an email address or a domain. Addresses are of the format mailbox@domain and domains are simply of the format domain. Examples:

user@isp.com
spammer.net

Leading and trailing white space is ignored. Lines beginning with the # character are considered comments.

Precedence of Approved and Blocked Addresses

When an address matches entries in both the Approved Senders and Blocked Senders lists, the following priority will be observed. Email addresses will take precedence over domains, e.g. if you block the domain host.net but approve the specific address joe@host.net, mail from the latter sender will be approved. In addition, approved addresses will take precedence over blocked addresses if identical entries exist on both the Approved Senders and Blocked Senders lists.
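
The precedence rules above can be sketched in Python as follows. This is a hedged illustration rather than the Engine's actual matching code: the function names are invented, matching is assumed to be case-insensitive and exact (no subdomain matching), and white space is assumed to be stripped before the comment check.

```python
def load_sender_list(text: str) -> set:
    """Read one sender per line, skipping blank lines and # comments."""
    entries = set()
    for line in text.splitlines():
        line = line.strip()  # leading and trailing white space is ignored
        if not line or line.startswith("#"):
            continue
        entries.add(line.lower())
    return entries


def classify_sender(sender: str, approved: set, blocked: set):
    """Return 'approved', 'blocked', or None when no list entry matches."""
    sender = sender.strip().lower()
    domain = sender.rpartition("@")[2]
    # Address entries are checked before domain entries, and within each
    # level the Approved Senders list is checked before Blocked Senders.
    if sender in approved:
        return "approved"
    if sender in blocked:
        return "blocked"
    if domain in approved:
        return "approved"
    if domain in blocked:
        return "blocked"
    return None


if __name__ == "__main__":
    approved = load_sender_list("# trusted\njoe@host.net\n")
    blocked = load_sender_list("host.net\n")
    print(classify_sender("joe@host.net", approved, blocked))
    print(classify_sender("ann@host.net", approved, blocked))
```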

Note: In the current version of the SpamCatcher Engine, enabling the Training Database may disable the Approved Senders and Blocked Senders lists. This behavior may change in future versions.

Note: After making any changes to the Approved Senders or Blocked Senders list files, tell the Plugin to restart the SpamCatcher Engine so that the changes take effect. To do so, create an "update.sig" file with any contents in the current directory of the Plugin. The Plugin will delete that file.
Example: echo >update.sig


Launching the Plugin from the Command Shell.

The Plugin is a so-called text-only application that can be launched from a command shell, and it accepts some command line options. When used with CommuniGate Pro as a helper application, the Plugin does not require any command line options.

Rating Message Files.

The Plugin can be used to calculate the spam scores of a number of message files in a directory.

The syntax of the program is: CGPSpamCatcher RATE <directory>

The <directory> must contain message files in RFC822 format or in CommuniGate format (RFC822 with the envelope info). There must be one message per file. Messages in .EML, .MSG, .mbox and other formats must be converted to the RFC822 format.

Example: ./CGPSpamCatcher RATE /var/CommuniGate/Queue
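
For messages stored in an mbox file, one possible way to produce the required one-message-per-file directory is sketched below using Python's standard mailbox module. The function name, the numbered file names, and the .txt extension are arbitrary choices for this example; this section does not say whether the Plugin expects any particular file naming.

```python
import mailbox
import os


def split_mbox(mbox_path: str, out_dir: str) -> int:
    """Write each message of an mbox file to its own file in out_dir.

    Returns the number of messages written.
    """
    os.makedirs(out_dir, exist_ok=True)
    box = mailbox.mbox(mbox_path, create=False)
    count = 0
    try:
        for key in box.keys():
            count += 1
            # get_bytes() returns the message without the mbox "From " line.
            data = box.get_bytes(key)
            path = os.path.join(out_dir, "%06d.txt" % count)
            with open(path, "wb") as handle:
                handle.write(data)
    finally:
        box.close()
    return count
```

The resulting directory can then be passed to the RATE command in place of <directory>.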

Training the SpamCatcher Engine.

The Plugin application can add a directory of messages to a Training Database, which is stored in the data/scoffset.bin.full and data/scoffset.bin.incr files.

Note: By default, the Training Database is disabled and ignored. If you want the Engine to use the Training Database, then the option use_score_offsets must be set to yes. If enabled, the Engine will read Training Database files the next time that it is initialized.

To use the training program, you must first collect a directory of spam messages and/or a directory of legitimate messages. Each message must be in RFC822 format or in CommuniGate format (RFC822 with the envelope info). There must be one message per file. Messages in .EML, .MSG, .mbox and other formats must be converted to the RFC822 format. Then you can use the training program to analyze the directories.

The syntax of the program is: CGPSpamCatcher TRAIN [options] <directory>

The options are:

-forget
Optional. Specify this if you wish to remove the scoring offset set previously. By default, the program will add the messages to the Training Database.

-o <offset>
Optional. If you are adding messages, specify the scoring offset as this parameter. The value should be between -200 and 200. -200 will cause the message to be treated as approved, while 200 will cause it to be treated as blocked.

-score
Optional. Compute scores of messages and factor them into future scoring of messages from the senders.

-v
Optional. Flag to output status of add and delete operations.

-spam
Optional. Indicates message is spam. Equivalent to specifying -o 200

-ham
Optional. Indicates message is not spam. Equivalent to specifying -o -200

-clear
Optional. Remove all entries. The Training Database files will be cleared.

Examples:
  1. Approve all messages in the directory named messagedir:
    ./CGPSpamCatcher TRAIN -ham messagedir

  2. Block all messages in the directory named messagedir:
    ./CGPSpamCatcher TRAIN -spam messagedir

  3. Compute scores for messages in directory dir2. If the messages were sent by the recipients of approved messages (as set by Example 1) then these scores will be used in the analysis of future messages from those senders. This can help reduce false positives.
    ./CGPSpamCatcher TRAIN -score dir2

  4. Forget about messages in a directory.
    ./CGPSpamCatcher TRAIN -forget messagedir

  5. Clear the database. All data set by previous training sessions along with scoring history will be deleted.
    ./CGPSpamCatcher TRAIN -clear


Evaluating the required license type.

The SpamCatcher License limits the number of messages the Plugin can rate within any 60 minute period of time. If the E-mail traffic exceeds the licensed limit, the Plugin will let the messages go through unrated. Without the license you can rate up to 5 messages per hour.

To evaluate the required license type:

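#!/usr/bin/perl
# Counts FILE commands read from standard input instead of scanning messages.
use strict;
use warnings;

$| = 1;         # unbuffered output, so each response is sent immediately
my $count = 0;  # number of FILE commands seen so far

while (<STDIN>) {
  chomp;
  my @line = split(" ");  # first field is echoed back, second is the command
  next unless @line;      # ignore empty lines
  if (defined $line[1] && $line[1] eq "FILE") { $count++; }
  print $line[0] . " OK " . $count . " messages scanned.\n";
}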
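
Because the limit applies to any 60 minute period, a running total alone does not show the peak load. The Python sketch below computes the largest number of messages seen within any 60 minute window from a list of arrival times. It is an illustration under assumptions: the timestamps (in seconds) would have to be collected separately, for example by logging the time of each FILE command, and the way the Plugin itself measures the period is not documented here.

```python
from collections import deque
from typing import Iterable

WINDOW_SECONDS = 60 * 60  # the license period: 60 minutes


def peak_messages_per_hour(timestamps: Iterable[float]) -> int:
    """Return the largest number of messages in any 60 minute window."""
    window = deque()  # arrival times that fall inside the current window
    peak = 0
    for ts in sorted(timestamps):
        window.append(ts)
        # Drop arrivals that are a full hour or more older than this one.
        while window[0] <= ts - WINDOW_SECONDS:
            window.popleft()
        peak = max(peak, len(window))
    return peak


if __name__ == "__main__":
    print(peak_messages_per_hour([0, 10, 20, 3600, 3610]))
```

Comparing the peak over a representative period with the limits of the available licenses gives an estimate of the license type needed.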

CommuniGate® Pro Guide. Copyright © 2003, Stalker Software, Inc.