Intelligent Agents for Data Mining and Information Retrieval [Electronic resources] نسخه متنی

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی

Intelligent Agents for Data Mining and Information Retrieval [Electronic resources] - نسخه متنی

Masoud Mohammadian

| نمايش فراداده ، افزودن یک نقد و بررسی
افزودن به کتابخانه شخصی
ارسال به دوستان
جستجو در متن کتاب
بیشتر
تنظیمات قلم

فونت

اندازه قلم

+ - پیش فرض

حالت نمایش

روز نیمروز شب
جستجو در لغت نامه
بیشتر
توضیحات
افزودن یادداشت جدید


"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">









  • ASSOCIATION RULES DESCRIPTION

    Given a transaction database DB, I={I1,I2,

    … ,Im}is a set of itemsets with m different itemsets in DB. Each transaction T in DB is a set of items (i.e., itemsets), so T

    ⊆ I.

    Definition 1






    Itemset P is defined as A1

    ∩ A2

    ∩…∩ Ak, Ai

    ∈ I(i=1,2,

    … ,k), and P containing k items is called k-itemset.











    Definition 2






    The support of itemset P is defined as

    σ (P/DB)=the support account containing P in DB/the total transaction amount in DB=|A/DB|/|DB|.











    Definition 3






    If A and B are two itemsets, and A

    ∩ B=

    Φ , then the confidence of association rule A

    ↠ B in DB is defined as

    ψ (A

    ↠ B /DB)=

    σ (A

    ∩ B /DB)/

    σ ( A /DB).











    Definition 4






    Let the minimum support be

    σ min. Then the set of k frequent itemsets and the set of k non-frequent itemsets are defined separately as:












    To mine efficacious association rules in DB, minimum support

    σ min and minimum confidence

    ψ min must first be defined. Mining association rules find all of the association rules satisfying

    σ (A

    ∩ B /DB)

    σ min and

    ψ (A

    ↠ B /DB)

    ψ min in DB. Owing to the fact that the result of

    ψ ( A

    ↠ B /DB) can be gotten from the value of

    σ (A

    ∩ B /DB) and

    σ (A /DB), the key to mining association rule A

    ↠ B is to generate the set of k frequent itemsets. Therefore, the substantive study at present focuses on generating the set of k frequent itemsets (see Agrawal & Srikant, 1994; Feng et al., 1998; Zhang et al., 2000), which is the key to heightening the mining efficiency. We also focus on pattern match, which is the key to generating k frequent itemsets. The corresponding Apriori algorithm is as follows:

    C1={candidate 1-itemsets}



  • L1={c

    ∈ C1|c.count

    ≥σ min }



  • For (k=2; Lk

    − 1

    Φ ; k++)



  • Ck=apriori-gen(Lk

    − 1)



  • Count_support(Ck)



  • Lk ={c

    ∈ C1|c.counte

    ≥σ min}



  • Resultset=

    ∪ Lk



  • Next



  • Here, Ck is candidate k-itemsets, Lk is k-itemsets, Count_support(Ck) is to count the support count of candidate k-itemsets, Ck, apriori-gen(Lk

    − 1) is to generate Ck, which includes two steps. First, join Lk

    − 1 into k-itemsets. This is called the join step:



    • insert into Ck

      select P.A1, P.A2,

      … , P.Ak

      − 1,Q. Ak

      − 1

      from Lk

      − 1 P inner join Lk

      − 1 Q

      where P.A1= Q.A1, P.A2= Q.A2,

      … , P.Ak

      − 2= Q.Ak

      − 2, P.Ak

      − 1< Q.A k

      − 1



    Then, delete any (k

    − 1)-subitemsets of Ck which not be included in Lk

    − 1. This is called the prune step:


    For all itemsets c

    ∈ Ck
    For all k-1_subitemsets s of c
    If (s

    ∉ Lk-1), then
    Delete c from Ck
    and get the candidate k-itemsets Ck.


    During the mining of association rules, pattern match mainly occurs in Count_support(Ck), which is the account of the support count of candidate k-itemsets. The resulting account is a match between the k-itemsets constructed by all the k items, compounded by each transaction in transaction data set and the set of candidate k-itemsets Ck(k=1,2,

    … ). From the above, we know the pattern match of mining association rules is the match between any k-itemsets from each transaction of transaction data set whose item number is not less than k and any one itemset in the set of candidate k-itemsets.

    / 171