From data warehousing to data mining?
Data Warehouse Usage
1) Three kinds of data warehouse applications
i) Information processing
a) supports querying, basic statistical analysis, and reporting using crosstabs, tables, charts and graphs
ii) Analytical processing
a) multidimensional analysis of data warehouse data
b) supports basic OLAP operations: slice-and-dice, drilling, and pivoting
iii) Data mining
a) knowledge discovery from hidden patterns
b) supports finding associations, constructing analytical models, performing classification and prediction, and presenting the mining results using visualization tools
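The basic OLAP operations listed under analytical processing can be illustrated with a minimal Python sketch. The fact rows, the dimension names, and the helper functions (`slice_`, `dice`, `roll_up`, `pivot`) are all hypothetical and chosen only for illustration; a real warehouse would run these operations inside an OLAP engine.

```python
from collections import defaultdict

# Hypothetical fact rows: (item, region, quarter, sales)
FACTS = [
    ("tv", "east", "Q1", 100),
    ("tv", "west", "Q1", 80),
    ("tv", "east", "Q2", 120),
    ("radio", "east", "Q1", 30),
    ("radio", "west", "Q2", 50),
]
# Position of each dimension inside a fact row
DIMS = {"item": 0, "region": 1, "quarter": 2}


def slice_(rows, dim, value):
    """Slice: fix one dimension to a single value."""
    return [r for r in rows if r[DIMS[dim]] == value]


def dice(rows, **allowed):
    """Dice: keep rows whose dimensions fall inside the given value sets."""
    return [r for r in rows
            if all(r[DIMS[d]] in vals for d, vals in allowed.items())]


def roll_up(rows, *dims):
    """Roll-up: sum sales over every dimension not listed in dims."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[DIMS[d]] for d in dims)] += r[3]
    return dict(totals)


def pivot(rows, row_dim, col_dim):
    """Pivot: arrange summed sales as a row_dim x col_dim crosstab."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[DIMS[row_dim]]][r[DIMS[col_dim]]] += r[3]
    return {k: dict(v) for k, v in table.items()}
```

In this sketch, drilling down corresponds to calling `roll_up` with more dimensions (for example `roll_up(FACTS, "item", "region")` instead of `roll_up(FACTS, "item")`), which moves from a coarser to a finer level of aggregation.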
Friday, July 4, 2008
Further development of data cube technology?
Discovery-Driven Exploration of Data Cubes
1) Hypothesis-driven: exploration by user, huge search space
2) Discovery-driven
a) pre-computed measures indicating exceptions guide the user in the data analysis, at all levels of aggregation
b) Exception: significantly different from the value anticipated, based on a statistical model
c) Visual cues such as background color are used to reflect the degree of exception of each cell
d) Computation of exception indicators (model fitting and computing SelfExp, InExp, and PathExp values) can be overlapped with cube construction
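The idea of an exception as a value that differs significantly from what a statistical model anticipates can be sketched as follows. This is a simplified stand-in, not the actual SelfExp, InExp, or PathExp computation: it assumes a two-dimensional table, fits a plain additive model (grand mean plus row and column effects), and flags cells whose standardized residual exceeds a threshold. The function names, the sample table, and the threshold of 2.5 are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def exception_scores(table):
    """Return standardized residuals of an additive row + column model."""
    n_rows, n_cols = len(table), len(table[0])
    grand = mean([v for row in table for v in row])
    row_eff = [mean(row) - grand for row in table]
    col_eff = [mean([table[i][j] for i in range(n_rows)]) - grand
               for j in range(n_cols)]
    # Residual = actual value minus the value the model anticipates
    resid = [[table[i][j] - (grand + row_eff[i] + col_eff[j])
              for j in range(n_cols)] for i in range(n_rows)]
    sigma = pstdev([v for row in resid for v in row])
    if sigma == 0:
        # The model fits every cell exactly, so nothing is exceptional
        return [[0.0] * n_cols for _ in range(n_rows)]
    return [[v / sigma for v in row] for row in resid]


def flag_exceptions(table, threshold=2.5):
    """List (row, column) cells whose |score| exceeds the threshold."""
    scores = exception_scores(table)
    return [(i, j) for i, row in enumerate(scores)
            for j, z in enumerate(row) if abs(z) > threshold]


# Hypothetical sales table (rows: items, columns: months).
# The last cell breaks the otherwise additive pattern.
SALES = [
    [10, 12, 11, 13],
    [20, 22, 21, 23],
    [30, 32, 31, 33],
    [40, 42, 41, 90],
]
```

Here `flag_exceptions(SALES)` returns only the last cell. In a discovery-driven interface, the magnitude of each score could drive the visual cue, such as the background color of the cell.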
Complex Aggregation at Multiple Granularities: Multi-Feature Cubes
1) Ex. Grouping by all subsets of {item, region, month}, find the maximum price in 1997 for each group, and the total sales among all maximum price tuples
select item, region, month, max(price), sum(R.sales)
from purchases
where year = 1997
cube by item, region, month: R
such that R.price = max(price)
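Because the `cube by ... such that` syntax above is an extended notation rather than standard SQL, the same computation can be sketched in plain Python. The sample purchase tuples and the function name `multi_feature_cube` are hypothetical; the sketch enumerates every subset of {item, region, month}, and for each group finds the maximum price and then sums sales over only the tuples that carry that maximum price.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical purchase tuples: (item, region, month, year, price, sales)
PURCHASES = [
    ("tv", "east", "jan", 1997, 500, 10),
    ("tv", "east", "jan", 1997, 500, 4),
    ("tv", "west", "feb", 1997, 450, 7),
    ("radio", "east", "jan", 1997, 80, 20),
    ("radio", "west", "feb", 1996, 90, 5),  # excluded by the year filter
]
DIMS = ("item", "region", "month")


def multi_feature_cube(rows, year=1997):
    """Map each cube cell to (max price, total sales at that max price)."""
    rows = [r for r in rows if r[3] == year]
    cube = {}
    # Enumerate all subsets of the grouping dimensions (the "cube by")
    for k in range(len(DIMS) + 1):
        for group_dims in combinations(range(len(DIMS)), k):
            groups = defaultdict(list)
            for r in rows:
                # "*" marks a dimension that is aggregated away
                key = tuple(r[d] if d in group_dims else "*"
                            for d in range(len(DIMS)))
                groups[key].append(r)
            for key, members in groups.items():
                max_price = max(m[4] for m in members)
                # Sum sales only over tuples with the maximum price
                total = sum(m[5] for m in members if m[4] == max_price)
                cube[key] = (max_price, total)
    return cube
```

For the sample data, the apex cell `("*", "*", "*")` holds `(500, 14)`: the maximum price in 1997 is 500, and the two tuples at that price have sales of 10 and 4. The second aggregate depends on the first, which is what distinguishes a multi-feature cube from a simple one.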