Floating-Point Renormalization and Underflow


Bayesian classifiers designed for the Graham model typically keep only the 20 to 50 most exemplary local probabilities and evaluate those through the Bayesian chain rule. This makes the math easy, because 0.1 to the 50th power is 10^-50, which is still well within the range of the IEEE floating-point specification. However, 1.0 minus 10^-50 is exactly equal to 1.0, even in 80-bit floating point. (For those readers with a vague familiarity with computer arithmetic, this is called loss of precision due to floating-point normalization.) This is bad; it means that once the Bayesian chain rule hits a Pout of 1.0, it can never recover from that value.
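
A few lines of C make the rounding behavior concrete. This is only an illustrative sketch assuming x86 80-bit long doubles; the variable names and values are mine, not any particular filter's:

    #include <stdio.h>

    int main(void)
    {
        long double tiny  = 1e-50L;        /* one very confident local probability */
        long double p_out = 1.0L - tiny;   /* rounds to exactly 1.0: the 64-bit
                                              significand cannot hold 1 - 1e-50 */

        printf("1.0 - 1e-50 == 1.0?  %s\n", (p_out == 1.0L) ? "yes" : "no");

        /* Once the running value has collapsed to 1.0, further factors of the
           chain rule cannot pull it back down by any meaningful amount. */
        p_out *= (1.0L - tiny);
        printf("after another factor: %.20Lg\n", p_out);
        return 0;
    }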

For classifiers like CRM114 that don’t throw away any of the features, this problem of loss of precision becomes much worse. Unless special steps are taken, the useful dynamic range of an 80-bit IEEE number can be exhausted even before the headers of an email message are fully processed.
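
As a rough illustration (using 64-bit doubles for brevity; the 80-bit case behaves the same way, only with more headroom), the following sketch counts how many features it takes for a naive running product to underflow to zero. The per-feature probability of 0.01 is an assumption chosen for the example, not a figure from CRM114:

    #include <stdio.h>

    int main(void)
    {
        double product  = 1.0;
        int    features = 0;

        /* Multiply in a hypothetical local probability of 0.01 per feature
           until the running product underflows to zero. */
        while (product > 0.0) {
            product *= 0.01;
            features++;
        }

        /* With doubles (smallest subnormal around 5e-324), this happens after
           only a couple hundred features -- far fewer than a message supplies
           when no features are thrown away. */
        printf("underflowed to zero after %d features\n", features);
        return 0;
    }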

To prevent this, CRM114 uses a unique two-range system. Both P(spam) and P(nonspam) are calculated separately, and the smaller is used to recalculate the larger. Thus, even though the larger probability may become numerically indistinguishable from 1.0, the smaller retains full accuracy down to 10^-300, which is quite useful for classifying large documents.
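
The sketch below shows one way such a two-range scheme can work in practice. It is an illustration of the idea under my own assumptions (a single hypothetical local probability of 0.9 repeated over 300 features), not CRM114's actual code:

    #include <stdio.h>

    int main(void)
    {
        long double p_spam = 0.5L, p_nonspam = 0.5L;   /* neutral starting point */

        for (int i = 0; i < 300; i++) {
            long double local = 0.9L;                  /* hypothetical spammy feature */
            p_spam    *= local;
            p_nonspam *= (1.0L - local);

            /* Renormalize so the two values always sum to 1.0.  The larger one
               saturates at 1.0, but the smaller keeps full relative accuracy. */
            long double sum = p_spam + p_nonspam;
            p_spam    /= sum;
            p_nonspam /= sum;
        }

        /* Read the verdict off the smaller value; computing 1.0 - p_spam here
           would just print 0, because p_spam is indistinguishable from 1.0. */
        printf("P(spam) reads as %Lg, but P(nonspam) is still %Lg\n",
               p_spam, p_nonspam);
        return 0;
    }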
