An Imperfect Solution
Language classification is imperfect, and a small margin of error will always remain. Because language classifiers learn from historical decisions, they can only decide based on what they already know (namely, what the classifier has learned so far about the recipient). It is therefore impossible to design a perfect algorithm for language classification. And because email evolves, a language classifier must act to some degree as a crystal ball, predicting how the user will classify messages it has never seen.

From a philosophical perspective, language classification is more of an art form than a science. Rather than approaching it as something that can be made perfect, you will save a considerable amount of time by accepting that the process is imperfect; the goal should be to design and implement a system that is “good enough,” with practical resources in mind. This is the same philosophy used to find square roots: an approximation is refined only until it is close enough for the purpose at hand. The term “good enough” as we use it here denotes the ideal balance between accuracy and system resources, achieved with software that remains practical across a wide range of environments and scales.
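To make the square-root comparison concrete, here is a minimal sketch of Newton's method that stops as soon as the answer is "good enough." The function name `approx_sqrt` and the parameters `tolerance` and `max_steps` are hypothetical, chosen only for illustration; they stand in for the accuracy target and the resource budget discussed above and are not taken from any particular library.

```python
def approx_sqrt(x, tolerance=1e-6, max_steps=50):
    """Approximate sqrt(x) with Newton's method.

    Returns (estimate, steps_used). The loop stops when two successive
    estimates differ by no more than `tolerance` (the accuracy target)
    or when `max_steps` is reached (the resource budget).
    Both parameters are illustrative assumptions.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0, 0

    guess = x if x >= 1 else 1.0  # any positive starting point works
    for step in range(1, max_steps + 1):
        next_guess = 0.5 * (guess + x / guess)  # Newton update for g*g = x
        if abs(next_guess - guess) <= tolerance:
            return next_guess, step  # close enough: stop spending effort
        guess = next_guess
    return guess, max_steps  # budget exhausted: return the best estimate


if __name__ == "__main__":
    for tol in (1e-2, 1e-6):
        estimate, steps = approx_sqrt(2.0, tolerance=tol)
        print(f"tolerance={tol}: estimate={estimate}, steps={steps}")
```

A looser tolerance finishes in fewer steps at the cost of a slightly less accurate answer, which is the same trade between accuracy and resources that a practical classifier has to make.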