Data Mining & Data Warehousing
Data Mining & Data Warehousing: Unit 5, 2-mark questions with answers and 16-mark questions
Unit V
Part A
1) Give examples for complex structure valued data.
Set-valued data, list-valued data, and data with nested structures.
2) Define set-valued attribute.
A set-valued attribute may be of homogeneous or heterogeneous type. It can be generalized in two ways: 1) by generalizing each value in the set into its corresponding higher-level concept, or 2) by deriving the general behavior of the set, such as the number of elements in the set. A set-valued attribute can be generalized into a set-valued or a single-valued attribute.
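The two generalization approaches can be illustrated with a minimal Python sketch. The concept hierarchy and the hobby values below are hypothetical and chosen only for illustration; they are not part of the syllabus answer.

```python
# Hypothetical concept hierarchy (an assumption for this example):
# each low-level value maps to a higher-level concept.
HIERARCHY = {
    "tennis": "sports",
    "hockey": "sports",
    "chess": "games",
    "violin": "music",
}

def generalize_values(values):
    """Approach 1: replace each value by its higher-level concept."""
    return {HIERARCHY.get(v, v) for v in values}

def generalize_behavior(values):
    """Approach 2: describe the set by a general property, here its size."""
    return len(values)

hobbies = {"tennis", "hockey", "chess", "violin"}
print(sorted(generalize_values(hobbies)))  # ['games', 'music', 'sports']
print(generalize_behavior(hobbies))        # 4
```

The first call yields another set-valued result, while the second yields a single value, matching the two possible outcomes named in the answer.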
3) Define list-valued attribute.
A list-valued or sequence-valued attribute is generalized so that the order of the elements in the sequence is preserved. Each value in the list can be generalized into its corresponding higher-level concept. A list may be generalized into a list, a set, or a single value.
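As a rough illustration, the sketch below generalizes a list while keeping element order. The degree hierarchy is hypothetical, and merging adjacent repeated concepts is a design choice made for this example, not something the answer prescribes.

```python
# Hypothetical concept hierarchy for academic degrees (an assumption).
LEVELS = {"BSc": "undergraduate", "MSc": "graduate", "PhD": "graduate"}

def generalize_list(seq):
    """Generalize each element, keep the order, merge adjacent repeats."""
    out = []
    for item in seq:
        concept = LEVELS.get(item, item)
        if not out or out[-1] != concept:
            out.append(concept)
    return out

degrees = ["BSc", "MSc", "PhD"]
as_list = generalize_list(degrees)  # list -> list: ['undergraduate', 'graduate']
as_set = set(as_list)               # list -> set: order is dropped
as_single = as_list[-1]             # list -> single value: the last concept
print(as_list, as_single)
```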
4) Define Plan mining.
A plan consists of a variable sequence of actions. A plan database, or plan base, is a large collection of plans. Plan mining is the task of extracting significant generalized patterns or knowledge from a plan base. For example, it can be used to discover the travel patterns of business passengers in an air flight database.
5) Define spatial data mining.
A spatial database stores a large amount of space-related data, such as maps, preprocessed remote sensing or medical imaging data, and VLSI chip layout data. Spatial data mining refers to the extraction of knowledge, spatial relationships, or other interesting patterns not explicitly stored in spatial databases. It is used for understanding spatial data and has applications in geographic information systems, geomarketing, etc.
6) What is a multimedia database?
A multimedia database system stores and manages a large collection of multimedia objects, such as audio data, image data, video data, and sequence data.
7) What are the two main families of multimedia indexing and retrieval systems?
Description-based retrieval systems and content-based retrieval systems.
8) Give the kinds of queries used in content-based retrieval systems.
There are two kinds of queries: image sample-based queries and image feature specification queries.
9) Give the categories of association mining in multimedia data.
There are three categories: 1) Associations between image content and non-image content features. 2) Associations among image contents that are not related to spatial relationships. 3) Associations among image contents related to spatial relationships.
10) What is a time series database?
It consists of sequences of values or events that change with time. Values are typically measured at equal time intervals. Time-series databases are used in studying daily stock market fluctuations, scientific experiments, and medical treatments.
11) What is a sequence database?
It is a database that consists of sequences of ordered events, with or without a concrete notion of time. Web page traversal sequences are an example.
12) What is sequential pattern mining?
It is the mining of frequently occurring patterns related to time or other sequences.
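A heavily simplified sketch of the idea: count how many sequences contain page a somewhere before page b, and keep the ordered pairs that meet a minimum support. The session data, the function name, and the restriction to patterns of length two are assumptions for illustration; practical sequential pattern miners handle longer patterns and prune the search space.

```python
from collections import Counter
from itertools import combinations

def frequent_ordered_pairs(sequences, min_support):
    """Return ordered pairs (a, b), a before b, found in >= min_support sequences."""
    counts = Counter()
    for seq in sequences:
        # combinations() keeps positional order, so a always precedes b.
        # The set ensures each pair counts once per sequence.
        counts.update({(a, b) for a, b in combinations(seq, 2)})
    return {pair: c for pair, c in counts.items() if c >= min_support}

# Hypothetical web page traversal sequences.
sessions = [
    ["home", "products", "cart"],
    ["home", "cart"],
    ["products", "home", "cart"],
]
patterns = frequent_ordered_pairs(sessions, min_support=2)
print(patterns)
```

With these three sessions, only "home before cart" (support 3) and "products before cart" (support 2) reach the threshold.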
13) What are the parameters in sequential pattern mining?
The duration of the time sequence (T), the event folding window (w), and the time interval between events in the discovered pattern (int).
14) What is periodicity analysis?
It is the mining of periodic patterns, that is, the search for recurring patterns in time-series databases. For example, seasons, tides, and daily traffic all exhibit periodic patterns.
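The answer above does not name a specific technique. One simple way to look for a recurring pattern is autocorrelation: compare the series with a shifted copy of itself and pick the shift (lag) with the strongest correlation. The sketch below assumes a non-constant numeric series and uses made-up traffic counts.

```python
def autocorrelation(series, lag):
    """Autocorrelation of the series at a given lag (assumes non-zero variance)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var

def best_period(series, max_lag):
    """Return the lag in 1..max_lag with the highest autocorrelation."""
    return max(range(1, max_lag + 1), key=lambda lag: autocorrelation(series, lag))

# Hypothetical traffic counts that repeat every 4 time steps.
traffic = [10, 20, 30, 20] * 6
print(best_period(traffic, max_lag=6))  # 4
```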
15) What is information retrieval?
Information retrieval (IR) is a field that has been developing in parallel with database systems for many years. It is concerned with the organization and retrieval of information from a large number of text-based documents. Typical information retrieval systems include online library catalog systems and online document management systems.
16) What are the two basic measures for assessing the quality of text retrieval?
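Precision and recall. Precision is the percentage of retrieved documents that are relevant to the query; recall is the percentage of relevant documents that were retrieved.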
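Both measures can be computed from the set of retrieved documents and the set of relevant documents. The following is a minimal sketch; the document identifiers and the function name are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # documents both relevant and retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

retrieved = ["d1", "d2", "d3", "d4"]  # what the system returned
relevant = ["d1", "d3", "d5"]         # what it should have returned
p, r = precision_recall(retrieved, relevant)
print(p, r)  # precision = 2/4, recall = 2/3
```

Here two of the four retrieved documents are relevant, and two of the three relevant documents were found.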
17) What is keyword-based association analysis?
It collects sets of keywords or terms that occur frequently together and then finds the association or correlation relationships among them.
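The idea can be sketched by treating each document as a set of keywords, counting keyword pairs that occur together, and reporting support and confidence for the frequent pairs. The documents, the function name, and the restriction to pairs are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def keyword_pairs(docs, min_support):
    """Return (a, b, support, confidence of a -> b) for frequent keyword pairs."""
    item_counts = Counter()
    pair_counts = Counter()
    for keywords in docs:
        unique = sorted(set(keywords))  # ignore repeats within one document
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return [
        (a, b, count, count / item_counts[a])
        for (a, b), count in pair_counts.items()
        if count >= min_support
    ]

# Hypothetical keyword sets extracted from three documents.
docs = [["data", "mining", "web"], ["data", "mining"], ["data", "warehouse"]]
rules = keyword_pairs(docs, min_support=2)
print(rules)
```

In this toy data, "data" and "mining" occur together in two documents, and two of the three documents containing "data" also contain "mining".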
18) What is web usage mining?
It mines web log records to discover user access patterns of web pages. A web server usually registers a web log entry for every access of a web page. The entry includes the URL requested, the IP address from which the request originated, and a timestamp. Analyzing and exploring regularities in web log records can identify potential customers for electronic commerce, enhance the quality and delivery of internet information services to the end user, and improve web server system performance.
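A small sketch of the first step, turning log entries into per-visitor access paths and page hit counts. The log format used here (IP address, ISO timestamp, and URL separated by spaces) is a simplified assumption; real server logs carry more fields and need a proper parser.

```python
from collections import Counter, defaultdict

# Hypothetical log in a simplified "ip timestamp url" format.
LOG = """\
192.0.2.1 2024-01-01T10:00:00 /home
192.0.2.1 2024-01-01T10:01:00 /products
192.0.2.2 2024-01-01T10:02:00 /home
192.0.2.1 2024-01-01T10:03:00 /cart
"""

def parse(log_text):
    """Yield (ip, timestamp, url) for each log line."""
    for line in log_text.splitlines():
        ip, timestamp, url = line.split()
        yield ip, timestamp, url

def access_paths(log_text):
    """Group requested URLs by IP address, ordered by timestamp."""
    paths = defaultdict(list)
    # ISO timestamps sort correctly as plain strings.
    for ip, _, url in sorted(parse(log_text), key=lambda entry: entry[1]):
        paths[ip].append(url)
    return dict(paths)

paths = access_paths(LOG)
page_hits = Counter(url for _, _, url in parse(LOG))
print(paths)
print(page_hits.most_common(1))
```

The resulting paths are the kind of input a sequential pattern miner could consume to find common navigation routes.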
19) Define visual data mining.
It discovers implicit and useful knowledge from large data sets using data and/or knowledge visualization techniques.
20) What is intelligent query answering?
It employs data mining techniques to analyze the intent of a user query, providing information relevant to the query. It extends the power and usability of query processing systems.
Part B
1) Explain mining spatial databases in detail.
2) Explain mining multimedia databases in detail.
3) Explain mining time series and sequence data in detail.
4) Explain mining the WWW in detail.
5) Explain the social impacts of data mining.
6) Draw a star schema of a weather spatial data warehouse.
7) Explain in detail the social impacts of applying data mining techniques.
8) Write about any two tools used in mining.
9) Write short notes on text mining.
10) List the applications of data mining and give the methodologies to implement.