Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification

Jonathan A. Zdziarski

Chapter 12: Fifth-Order Markovian Discrimination


Overview


Until now, we’ve been discussing the most popular approaches to language classification involving Bayesian analysis. Markovian discrimination is a new implementation used in some of the more advanced language classification projects, specifically the CRM114 discriminator.

Bill Yerazunis invested a considerable amount of research into Markovian discrimination and presented his findings at the MIT Spam Conference in 2004. This chapter was coauthored by Bill and incorporates much of his research into Markovian discrimination to provide an accurate and full understanding of how the Markov principles apply to machine learning.

Just as Bayesian learning starts out with Reverend Thomas Bayes in 1764, Markovian analysis starts with Andrei Markov, who became a professor at St. Petersburg University in 1886. Markov’s early research included number theory and analysis, approximation theory, and convergence of series. He is best known for his work on random processes that have memory, now known as Markov chains. Markov chains are a basic building block for much of advanced statistical theory, and they are particularly useful for spam filtering.

The central idea of Markov’s work was that some things in nature are more complex than Bayes’ independent event statistics can describe. Markov came up with a very simple yet powerful description of nonindependent, related events that accurately models many natural processes and natural languages.
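To make the contrast concrete, here is a minimal sketch in Python that compares an independent-event estimate of a word's probability with a first-order Markov estimate conditioned on the preceding word. The toy token stream and the helper names `unigram_probs` and `transition_probs` are assumptions made purely for illustration; this is a first-order simplification, not the fifth-order scheme or the CRM114 implementation discussed in this chapter.

```python
from collections import Counter, defaultdict


def unigram_probs(tokens):
    """Independent-event view: P(word), ignoring what came before it."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count / total for word, count in counts.items()}


def transition_probs(tokens):
    """First-order Markov view: P(next word | previous word)."""
    follows = defaultdict(Counter)
    # Count how often each word follows each preceding word.
    for prev, nxt in zip(tokens, tokens[1:]):
        follows[prev][nxt] += 1
    # Normalize each row of counts into a probability distribution.
    return {
        prev: {word: count / sum(counter.values())
               for word, count in counter.items()}
        for prev, counter in follows.items()
    }


# Hypothetical toy corpus, chosen only to show the effect of context.
tokens = "buy cheap pills now buy cheap watches now buy now".split()

independent = unigram_probs(tokens)
markov = transition_probs(tokens)

print("P(cheap)       =", independent["cheap"])
print("P(cheap | buy) =", markov["buy"]["cheap"])
```

In this toy stream, "cheap" is far more likely immediately after "buy" than it is overall. That dependence on preceding words is what an independent-event model cannot express and what a Markov chain captures.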
