This is the procedure we use to generate new scores. It takes quite a while and is labour-intensive, so we do it infrequently.

We generate new scores by analyzing a massive collection of mail (a "corpus") and running software to find the score-set that classifies the maximum possible number of mails in that corpus correctly (i.e. so that SA thinks the ham messages are nearly all ham, and the spam messages are nearly all spam).

The corpus consists of many (approximately 1 million) pieces of real-world, hand-sorted mail. A smallish number of people (about 15), including the developers themselves, work as volunteer "corpus submitters". They hand-classify their mail, run mass-check over it, and submit the output logs mass-check generates. Occasionally people review the submitted logs for obvious mistakes, but it is largely a trust system.

If you want to see the statistics from the last corpus run, check the STATISTICS.txt files that come in the SA tarball. They will tell you how many emails were used, and what the hit rates of all the rules were.

Here's the process for generating the scores as of SpamAssassin 3.1.0:

1. Inform everyone in advance on the users and dev lists that we will be starting mass-checks shortly, and that they should get their corpora nice and clean (see CorpusCleaning) and sign up for RsyncAccounts.
2. Enable all rules using the helper script.
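To make the idea of "hit rates" and "correctly classified" mail concrete, here is a minimal Python sketch. It is not the SpamAssassin tooling: the `CORPUS` records, the rule names, and both helper functions are hypothetical, and real mass-check logs carry more fields than this toy structure. The only SpamAssassin fact assumed is the default spam threshold of 5.0.

```python
from collections import Counter

# Hypothetical, simplified records: (hand-assigned label, rules that hit).
# Real mass-check logs contain more fields; this is only an illustration.
CORPUS = [
    ("spam", {"RULE_A", "RULE_B"}),
    ("spam", {"RULE_A"}),
    ("spam", {"RULE_B", "RULE_C"}),
    ("ham", {"RULE_C"}),
    ("ham", set()),
    ("ham", {"RULE_B"}),
]


def hit_rates(corpus):
    """Return {rule: (fraction of spam hit, fraction of ham hit)}."""
    totals = Counter(label for label, _ in corpus)
    hits = {"spam": Counter(), "ham": Counter()}
    for label, rules in corpus:
        hits[label].update(rules)
    all_rules = set(hits["spam"]) | set(hits["ham"])
    return {
        rule: (
            hits["spam"][rule] / max(totals["spam"], 1),  # avoid dividing by zero
            hits["ham"][rule] / max(totals["ham"], 1),
        )
        for rule in sorted(all_rules)
    }


def misclassified(corpus, scores, threshold=5.0):
    """Count (false_positives, false_negatives) for one candidate score set."""
    false_pos = false_neg = 0
    for label, rules in corpus:
        total = sum(scores.get(rule, 0.0) for rule in rules)
        flagged = total >= threshold
        if flagged and label == "ham":
            false_pos += 1
        elif not flagged and label == "spam":
            false_neg += 1
    return false_pos, false_neg
```

With the toy data, `misclassified(CORPUS, {"RULE_A": 5.0, "RULE_B": 1.0, "RULE_C": 0.5})` returns `(0, 1)`: no ham is flagged, and one spam message slips through. This sketch only evaluates a single candidate; score generation as described above searches for the score-set that keeps such errors to a minimum across the whole corpus.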