Final Thoughts
This chapter has introduced you to language classification, with a focus on the conceptual pieces of a language classifier. These components are broadly similar from one filter to the next, which gives filter developers and systems administrators a common framework for understanding and tuning them.
It’s important for systems administrators to find the right filter for their network. Filters come with many different sets of features and are aimed at different implementations. If you operate a very large network with hundreds of thousands of users, find a filter that can scale within your storage and processing constraints. We’ll cover many different ways to tune statistical filters as we explain their technical details throughout the book.
Over the past few years, language classification has become quite popular. It is widely used for filtering spam, but several companies are also using it to categorize documents and route incoming emails. Companies with high volumes of customer support requests, which previously had to dedicate an entire department just to forwarding those requests, have found a way to save time, space, and money by automating these tasks with language classification.
In the next chapter, we’ll explore the mathematics behind these components as they apply to statistical filtering. Statistical filtering is a mathematical harmony, an artful masterpiece. If you’re a developer, understanding the mathematical components should prove rather refreshing. If you are a systems administrator, the next chapter will explain the science behind statistical filtering and describe many of the different tuning knobs related to these components.