12.3 Data Mining
Data mining, as
we define it in this chapter, is the
use of mathematical algorithms to model relationships in data
in order to solve difficult problems that involve large numbers of
variables whose relationships are unknown. You might need such techniques if you are
trying to solve such business problems as fraud detection, customer
churn analysis, and marketing contact response prediction. The
algorithms used to solve such problems include clustering techniques
that show how business outcomes can fall within certain groups (such
as for market basket analysis) and logic models (if A occurs, then B
or C are possible outcomes), validated against small sample sets and
then applied to larger data sets for prediction.

Since the Oracle9i database release, Oracle has
provided a set of data mining algorithms in the
database's Data Mining Option for solving such
problems. Algorithms now embedded in the database include Naïve Bayes, Associations, Adaptive Bayes Networks, Clustering, Support Vector Machines (SVM), and Nonnegative Matrix Factorization (NMF).
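To make the "if A occurs, then B" style of logic model concrete, the following is a minimal sketch of how the support and confidence of a single market basket rule might be computed. It is plain Java (version 9 or later is assumed for List.of and Set.of) and does not use Oracle's data mining API; the class name AssociationRuleSketch, its methods, and the sample baskets are hypothetical and exist only for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not part of any Oracle API.
public class AssociationRuleSketch {

    /** Fraction of baskets that contain every item in the given item set. */
    static double support(List<Set<String>> baskets, Set<String> items) {
        if (baskets.isEmpty()) {
            return 0.0;
        }
        long hits = baskets.stream().filter(b -> b.containsAll(items)).count();
        return (double) hits / baskets.size();
    }

    /** Confidence of the rule "if antecedent occurs, then consequent occurs". */
    static double confidence(List<Set<String>> baskets,
                             Set<String> antecedent,
                             Set<String> consequent) {
        double antecedentSupport = support(baskets, antecedent);
        if (antecedentSupport == 0.0) {
            return 0.0; // the rule never fires, so there is no evidence for it
        }
        Set<String> both = new HashSet<>(antecedent);
        both.addAll(consequent);
        return support(baskets, both) / antecedentSupport;
    }

    public static void main(String[] args) {
        // Made-up sample baskets, standing in for a small validation sample.
        List<Set<String>> baskets = List.of(
                Set.of("bread", "butter"),
                Set.of("bread", "butter", "jam"),
                Set.of("bread", "jam"),
                Set.of("milk"));

        System.out.println("support(bread) = "
                + support(baskets, Set.of("bread")));
        System.out.println("confidence(bread -> butter) = "
                + confidence(baskets, Set.of("bread"), Set.of("butter")));
    }
}
```

In this sample, bread appears in three of the four baskets and butter accompanies it in two of those three, so the rule "bread, then butter" has a confidence of about two-thirds. A rule measured this way on a small sample could then be checked against a larger data set before being used for prediction.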
These algorithms are accessible via a Java API.

Data mining applications can be custom-built using
Oracle's JDeveloper in combination with the Data
Mining for Java (DM4J) interface. DM4J is used to develop, test, and score
the models. It provides the ability to define metadata, tune the
generated Java code, view generated XML files, and test application
components. The generated J2EE applications can be deployed within
the database, to Oracle Application Server, or to other J2EE
platforms.

OracleAS Personalization is one
application of data mining that is bundled with Oracle Application
Server. It provides the tools necessary to instrument a web site,
gather data on how a web-site visitor traverses the site, and then
use the data gathered to provide a
"personalized" web experience by
presenting specific pages that are determined to be of likely
interest to the visitor.

This solution is implemented in three tiers, as shown in Figure 12-5. The Recommendation Engine API (REAPI) is
a set of Java classes that are integrated into the web application on
the web site, enabling the gathering of needed data and specific
actions based on recommendations from the middle tier. The data that
is gathered as a web-site visitor traverses the site is captured and
sent to an Oracle database installed with the data mining algorithms.
There, scoring takes place to create a sorted, ordered list that is
then applied to the middle-tier Recommendation Engine. That engine
services the web site in real time, providing recommendations based
on similar, previously analyzed trends.
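The flow described above (capture the pages a visitor has seen, score candidate pages, and return a sorted list) can be sketched in a few lines of plain Java. This is an illustration under stated assumptions, not the REAPI: the class RecommendationSketch, its recommend method, and the affinity scores are all hypothetical, and the affinity map simply stands in for whatever the database-side scoring would produce. Java 9 or later is assumed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is not the Recommendation Engine API.
public class RecommendationSketch {

    // Assumed precomputed scores: page viewed -> (candidate page -> affinity).
    private final Map<String, Map<String, Double>> affinities;

    public RecommendationSketch(Map<String, Map<String, Double>> affinities) {
        this.affinities = affinities;
    }

    /** Returns up to topN unvisited pages, highest combined score first. */
    public List<String> recommend(Set<String> visitedPages, int topN) {
        Map<String, Double> totals = new HashMap<>();
        for (String visited : visitedPages) {
            Map<String, Double> related =
                    affinities.getOrDefault(visited, Map.of());
            for (Map.Entry<String, Double> e : related.entrySet()) {
                // Skip pages the visitor has already seen.
                if (!visitedPages.contains(e.getKey())) {
                    totals.merge(e.getKey(), e.getValue(), Double::sum);
                }
            }
        }
        // Sort by score descending, breaking ties by page name.
        List<Map.Entry<String, Double>> ranked =
                new ArrayList<>(totals.entrySet());
        ranked.sort(Map.Entry.<String, Double>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Double>comparingByKey()));

        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(topN, ranked.size()); i++) {
            result.add(ranked.get(i).getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up affinity scores for a tiny site.
        Map<String, Map<String, Double>> affinities = Map.of(
                "home", Map.of("products", 0.9, "support", 0.4),
                "products", Map.of("pricing", 0.8, "support", 0.3, "home", 0.5));
        RecommendationSketch engine = new RecommendationSketch(affinities);
        // A visitor who has seen "home" and "products".
        System.out.println(engine.recommend(Set.of("home", "products"), 2));
        // prints [pricing, support]
    }
}
```

Keeping the scored affinities in memory mirrors the idea of a middle-tier engine that answers in real time while the heavier model building happens elsewhere; how the actual product divides that work is not specified beyond the description above.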
Figure 12-5. OracleAS Personalization architecture

The Administration User Interface (OPUI) accesses the
Mining Object Repository (MOR) in the
backend. The MOR schema contains business model information and is
used for configuration and report administration. A
Mining Table Repository (MTR)
contains the basic schema needed to create the model of the business,
also referred to as a
taxonomy.