Intelligent Agents for Data Mining and Information Retrieval [Electronic resources] (text version)

ASSOCIATION RULES DESCRIPTION

Given a transaction database DB, I = {I₁, I₂, …, I_m} is the set of the m different items in DB. Each transaction T in DB is a set of items (i.e., an itemset), so T ⊆ I.

Definition 1

Itemset P is defined as A₁ ∩ A₂ ∩ … ∩ A_k, with A_i ∈ I (i = 1, 2, …, k). An itemset P containing k items is called a k-itemset.

Definition 2

The support of itemset P is defined as

σ(P/DB) = (the number of transactions in DB containing P) / (the total number of transactions in DB) = |P/DB| / |DB|.
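As an illustration of Definition 2, the following minimal Python sketch computes the support of an itemset. It assumes that transactions are held in memory as sets; the function name `support` and the toy database `DB` are hypothetical and do not come from the text.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    if not transactions:
        return 0.0
    # A transaction "contains P" when P is a subset of the transaction.
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


# Hypothetical toy database (illustrative only).
DB = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

print(support({"bread", "milk"}, DB))  # 2 of 4 transactions -> 0.5
```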

Definition 3

If A and B are two itemsets, and A ∩ B = Φ, then the confidence of association rule A ↠ B in DB is defined as

ψ(A ↠ B /DB) = σ(A ∩ B /DB) / σ(A /DB).
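Definition 3 can be sketched in Python as follows. This is only an illustration under the assumption that transactions are in-memory sets. Note that σ(A ∩ B /DB) in the text's notation counts transactions containing all items of both A and B, which in set terms is the union of the two item collections. The names `support`, `confidence`, and `DB` are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    a, b = frozenset(antecedent), frozenset(consequent)
    if a & b:
        raise ValueError("antecedent and consequent must share no items")
    support_a = support(a, transactions)
    if support_a == 0:
        return 0.0  # rule never applies; report zero rather than divide by zero
    # Transactions containing both A and B contain the combined item set a | b.
    return support(a | b, transactions) / support_a


# Hypothetical toy database (illustrative only).
DB = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

print(confidence({"butter"}, {"bread"}, DB))  # 0.5 / 0.5 -> 1.0
```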

Definition 4

Let the minimum support be σ_min. Then the set of frequent k-itemsets consists of the k-itemsets P with σ(P/DB) ≥ σ_min, and the set of non-frequent k-itemsets consists of the k-itemsets P with σ(P/DB) < σ_min.

To mine efficacious association rules in DB, the minimum support σ_min and the minimum confidence ψ_min must first be defined. Mining association rules means finding all of the association rules in DB that satisfy σ(A ∩ B /DB) ≥ σ_min and ψ(A ↠ B /DB) ≥ ψ_min. Because ψ(A ↠ B /DB) can be computed from the values of σ(A ∩ B /DB) and σ(A /DB), the key to mining association rule A ↠ B is generating the sets of frequent k-itemsets. Most current work therefore focuses on generating the frequent k-itemsets (see Agrawal & Srikant, 1994; Feng et al., 1998; Zhang et al., 2000), which is the key to improving mining efficiency. We also focus on pattern matching, which is the key to generating frequent k-itemsets. The corresponding Apriori algorithm is as follows:

C₁ = {candidate 1-itemsets}
L₁ = {c ∈ C₁ | c.count ≥ σ_min}
For (k = 2; L_{k−1} ≠ Φ; k++)
    C_k = apriori-gen(L_{k−1})
    Count_support(C_k)
    L_k = {c ∈ C_k | c.count ≥ σ_min}
Resultset = ∪_k L_k
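The level-wise loop can be sketched in Python as below. This is a minimal in-memory illustration, not the authors' implementation: the threshold is given as an absolute count (`min_count`, an assumed parameter), candidate generation uses a simple set-union join followed by a subset check rather than an SQL join, and all names are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {frequent itemset: support count} using a level-wise search."""

    def count_support(candidates):
        # Count, for each candidate, the transactions that contain it.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        return counts

    def apriori_gen(prev_level, k):
        # Join: unite pairs of frequent (k-1)-itemsets that form a k-itemset.
        joined = {a | b for a in prev_level for b in prev_level
                  if len(a | b) == k}
        # Prune: keep a candidate only if every (k-1)-subset is frequent.
        return {c for c in joined
                if all(frozenset(s) in prev_level
                       for s in combinations(c, k - 1))}

    items = {item for t in transactions for item in t}
    candidates = {frozenset([i]) for i in items}               # C_1
    level = {c: n for c, n in count_support(candidates).items()
             if n >= min_count}                                # L_1
    result = dict(level)
    k = 2
    while level:                                               # L_{k-1} not empty
        candidates = apriori_gen(set(level), k)                # C_k
        level = {c: n for c, n in count_support(candidates).items()
                 if n >= min_count}                            # L_k
        result.update(level)
        k += 1
    return result


# Hypothetical toy database (illustrative only).
DB = [frozenset("abc"), frozenset("ab"), frozenset("ac"),
      frozenset("bc"), frozenset("abc")]
frequent = apriori(DB, min_count=3)
print(sorted("".join(sorted(s)) for s in frequent))
```

With this toy data every single item and every pair reaches the threshold of 3, while the triple {a, b, c} occurs in only two transactions and is therefore rejected at level 3, which ends the loop.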

Here, C_k is the set of candidate k-itemsets, L_k is the set of frequent k-itemsets, Count_support(C_k) counts the support of each candidate k-itemset in C_k, and apriori-gen(L_{k−1}) generates C_k in two steps. First, join L_{k−1} with itself to form k-itemsets. This is called the join step:

insert into C_k
select P.A₁, P.A₂, …, P.A_{k−1}, Q.A_{k−1}
from L_{k−1} P inner join L_{k−1} Q
where P.A₁ = Q.A₁, P.A₂ = Q.A₂, …, P.A_{k−2} = Q.A_{k−2}, P.A_{k−1} < Q.A_{k−1}
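The join condition can also be expressed outside SQL. The sketch below assumes that each frequent (k−1)-itemset is stored as a sorted tuple of items; two itemsets are joined when they agree on their first k−2 items and the last item of the first is smaller than the last item of the second. The function name `join_step` and the sample data are hypothetical.

```python
def join_step(prev_level):
    """Join frequent (k-1)-itemsets (sorted tuples) into candidate k-itemsets."""
    prev = sorted(prev_level)
    candidates = []
    for i, p in enumerate(prev):
        for q in prev[i + 1:]:
            # Same first k-2 items, and p's last item precedes q's last item.
            if p[:-1] == q[:-1] and p[-1] < q[-1]:
                candidates.append(p + (q[-1],))
    return candidates


# Hypothetical set of frequent 2-itemsets (illustrative only).
L2 = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d")]
print(join_step(L2))  # [('a', 'b', 'c'), ('b', 'c', 'd')]
```

Requiring a strict ordering on the last item ensures that each candidate is produced exactly once.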

Then, delete from C_k every itemset that has a (k−1)-subitemset not included in L_{k−1}. This is called the prune step:

For all itemsets c ∈ C_k
    For all (k−1)-subitemsets s of c
        If (s ∉ L_{k−1}), then
            Delete c from C_k

This yields the set of candidate k-itemsets C_k.
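A minimal Python sketch of the prune step follows, again assuming that itemsets are stored as sorted tuples. The function name `prune_step` and the sample data are hypothetical.

```python
from itertools import combinations


def prune_step(candidates, prev_level):
    """Drop candidates that have a (k-1)-subset missing from prev_level."""
    frequent = set(prev_level)
    kept = []
    for c in candidates:
        # combinations() preserves item order, so subsets stay sorted tuples.
        if all(s in frequent for s in combinations(c, len(c) - 1)):
            kept.append(c)
    return kept


# Hypothetical data (illustrative only).
L2 = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d")]
C3 = [("a", "b", "c"), ("b", "c", "d")]
print(prune_step(C3, L2))  # [('a', 'b', 'c')]
```

Here ('b', 'c', 'd') is deleted because its subset ('c', 'd') is not a frequent 2-itemset.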

During the mining of association rules, pattern matching mainly occurs in Count_support(C_k), which computes the support count of the candidate k-itemsets. The count is obtained by matching the k-itemsets that can be formed from the items of each transaction in the transaction data set against the set of candidate k-itemsets C_k (k = 1, 2, …). Thus, the pattern match in mining association rules is the match between any k-itemset drawn from a transaction that has at least k items and any itemset in the set of candidate k-itemsets.
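The matching described above can be sketched as follows: for each transaction with at least k items, enumerate its k-item subsets and look each one up in the candidate set. This is a straightforward illustration that assumes candidates are sorted tuples and transactions are sets; it is not an optimized matching scheme, and the names are hypothetical.

```python
from itertools import combinations


def count_support(candidates, transactions, k):
    """Count how many transactions contain each candidate k-itemset."""
    counts = {c: 0 for c in candidates}
    for t in transactions:
        if len(t) < k:
            continue  # a transaction with fewer than k items cannot match
        # Every k-item subset of the transaction is matched against C_k.
        for subset in combinations(sorted(t), k):
            if subset in counts:
                counts[subset] += 1
    return counts


# Hypothetical data (illustrative only).
C2 = [("a", "b"), ("a", "c"), ("b", "c")]
DB = [{"a", "b", "c"}, {"a", "b"}, {"c"}]
print(count_support(C2, DB, 2))
# {('a', 'b'): 2, ('a', 'c'): 1, ('b', 'c'): 1}
```

Enumerating subsets is cheap for short transactions but grows combinatorially with transaction length, which is why the efficiency of this matching step matters.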


Masoud Mohammadian
