Wednesday, August 22, 2012

Data Mining & Data Warehousing, Unit 1: 2-mark questions with answers and 16-mark questions

Unit I

Part A

  1. What is Data mining?

Data mining refers to extracting or "mining" knowledge from large amounts of data. It is often treated as a synonym for another popularly used term, Knowledge Discovery in Databases (KDD).

  2. Give the steps involved in KDD.

KDD consists of the iterative sequence of the following steps:

- Data cleaning

- Data integration

- Data selection

- Data transformation

- Data mining

- Pattern evaluation

- Knowledge presentation
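
The sequence above can be sketched as a small pipeline. This is only a minimal illustration: the two sources, the field names, the threshold, and the function run_kdd are hypothetical and were chosen to make each step visible.

```python
from collections import Counter

# Hypothetical raw records from two sources (illustrative values only).
source_a = [{"id": 1, "city": "Chennai ", "amount": "120"},
            {"id": 2, "city": None, "amount": "80"}]
source_b = [{"id": 3, "city": "chennai", "amount": "200"},
            {"id": 4, "city": "Madurai", "amount": "50"}]


def run_kdd(sources, min_count=2):
    # 1. Data cleaning: drop records that contain missing values.
    cleaned = [[r for r in src if None not in r.values()] for src in sources]
    # 2. Data integration: combine the cleaned sources into one collection.
    records = [r for src in cleaned for r in src]
    # 3. Data selection: keep only the attributes relevant to the task.
    selected = [(r["city"], r["amount"]) for r in records]
    # 4. Data transformation: normalise text and convert amounts to numbers.
    transformed = [(city.strip().lower(), float(amount)) for city, amount in selected]
    # 5. Data mining: count how often each city occurs.
    counts = Counter(city for city, _ in transformed)
    # 6. Pattern evaluation: keep only patterns that meet the threshold.
    interesting = {city: n for city, n in counts.items() if n >= min_count}
    # 7. Knowledge presentation: return a readable summary.
    return [f"{city} appears in {n} transactions" for city, n in interesting.items()]


summary = run_kdd([source_a, source_b])
print(summary)
```

In practice the steps are iterative, so a real process would loop back (for example, from pattern evaluation to data selection) rather than run once from top to bottom.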

  3. Give the architecture of a typical data mining system.

The architecture of a typical data mining system consists of the following components:

- Database, data warehouse, or other information repository

- Database or data warehouse server

- Knowledge base

- Data mining engine

- Pattern evaluation module

- Graphical user interface

  4. Define database management system.

A database system, also called a database management system (DBMS), consists of a collection of interrelated data, known as a database, and a set of software programs to manage and access the data.

  5. Define relational database.

A relational database is a collection of tables, each of which is assigned a unique name. Each table consists of a set of attributes (columns or fields) and usually stores a large set of tuples (records or rows).
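
As a small illustration of tables, attributes, and tuples, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name customer and its columns are made up for the example.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# A table has a unique name and a set of attributes (columns).
conn.execute(
    "CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# Each inserted row is one tuple (record).
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Anu", "Chennai"), (2, "Ravi", "Madurai")],
)

# A relational query selects tuples by attribute value.
rows = conn.execute(
    "SELECT name FROM customer WHERE city = ? ORDER BY cust_id", ("Chennai",)
).fetchall()
print(rows)
conn.close()
```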

  6. Define data warehouse.

A data warehouse is a repository of information collected from multiple sources, stored under a unified schema, and usually residing at a single site. It is constructed via a process of data cleaning, data transformation, data integration, data loading, and periodic data refreshing.

  7. Define data mart and compare it with data warehouse.

A data mart is a departmental subset of a data warehouse. It focuses on selected subjects, and thus its scope is department-wide. A data warehouse, on the other hand, collects information about subjects that span an entire organization, and thus its scope is enterprise-wide.


  8. Define transaction databases.

A transaction database consists of a file where each record represents a transaction. A transaction typically includes a unique transaction identity number and a list of items making up the transaction.
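
A transaction database of this kind can be pictured as a mapping from transaction identifiers to item lists. The identifiers, items, and the helper transactions_containing below are hypothetical and serve only to show the record layout.

```python
# Each record: a unique transaction ID and the list of items in that transaction.
transactions = {
    "T100": ["bread", "milk"],
    "T200": ["bread", "butter", "jam"],
    "T300": ["milk"],
}


def transactions_containing(item, db):
    """Return the IDs of all transactions that include the given item."""
    return sorted(tid for tid, items in db.items() if item in items)


bread_ids = transactions_containing("bread", transactions)
print(bread_ids)
```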

  9. Explain object-oriented databases.

Object-oriented databases are based on the object-oriented programming paradigm, where each entity is considered an object. Each object has the following associated with it:

- A set of variables

- A set of messages

- A set of methods
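
The three parts can be seen in an ordinary class definition. The Employee class below is a hypothetical example, not taken from any database product: its attributes are the variables, the method body is the code that implements a message, and calling the method corresponds to sending that message to the object.

```python
class Employee:
    def __init__(self, name, salary):
        # Variables: the data that describes this object.
        self.name = name
        self.salary = salary

    def raise_salary(self, percent):
        # Method: the code that runs when the object receives
        # the "raise_salary" message.
        self.salary = self.salary + self.salary * percent / 100
        return self.salary


emp = Employee("Anu", 1000.0)
# Sending the message "raise_salary" with the argument 10.
new_salary = emp.raise_salary(10)
print(new_salary)
```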

  10. Explain spatial databases.

Spatial databases contain spatial-related information. Such databases include geographic databases, VLSI chip design databases, and medical and satellite image databases. Spatial data may be represented in raster format, consisting of n-dimensional bit maps or pixel maps. Maps can also be represented in vector format, where features such as roads and bridges are represented as a union of basic geometric constructs such as points, lines, and polygons.
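
The difference between the two formats can be shown with plain Python data structures. The grid values and road coordinates below are invented for illustration.

```python
import math

# Raster format: a 2-dimensional bit map, 1 = cell covered by the feature.
raster = [
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 0],
]
covered_cells = sum(sum(row) for row in raster)

# Vector format: a road stored as a polyline, i.e. a sequence of points
# joined by straight line segments.
road = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
road_length = sum(math.dist(a, b) for a, b in zip(road, road[1:]))

print(covered_cells, road_length)
```

The raster form answers "which cells are covered" directly, while the vector form keeps exact geometry, which is why lengths and shapes are easier to compute from it.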

  11. Explain temporal and time-series databases.

A temporal database usually stores relational data that include time-related attributes. These attributes may involve several timestamps, each having different semantics. A time-series database stores sequences of values that change with time, such as data collected from a stock exchange.
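
A short sketch of the two kinds of data, using made-up values: the temporal record carries two timestamps with different meanings, while the time series is an ordered sequence of (date, value) pairs. The field names and the helper daily_changes are assumptions for the example.

```python
from datetime import date

# Temporal record: ordinary relational data plus time-related attributes.
# "valid_from" is when the fact became true; "recorded_on" is when it was stored.
policy = {
    "policy_id": 7,
    "valid_from": date(2012, 1, 1),
    "recorded_on": date(2012, 1, 15),
}

# Time series: a sequence of values that change with time.
closing_prices = [
    (date(2012, 8, 20), 100.0),
    (date(2012, 8, 21), 104.0),
    (date(2012, 8, 22), 102.0),
]


def daily_changes(series):
    """Difference between each value and the previous one."""
    return [curr - prev for (_, prev), (_, curr) in zip(series, series[1:])]


changes = daily_changes(closing_prices)
print(changes)
```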

  12. Explain text databases and multimedia databases.

Text databases are databases that contain word descriptions of objects. These descriptions are usually long sentences or paragraphs, such as product specifications and error or bug reports. Multimedia databases store image, audio, and video data. They are used in applications such as picture content-based retrieval, voice mail systems, and the World Wide Web.

  13. Define legacy databases.

A legacy database is a group of heterogeneous databases that combines different kinds of data systems, such as relational or object-oriented databases, hierarchical databases, or file systems.

  14. Give the classification of data mining tasks.

Descriptive – characterizes the general properties of the data in the database.

Predictive – performs inference on the current data in order to make predictions.

  15. Describe class/concept description.

Data can be associated with classes or concepts. The individual classes can be described in summarized, concise, and yet precise terms. Such descriptions of a class or a concept are called class/concept descriptions. These descriptions can be derived via data characterization or data discrimination.

  16. Define data characterization.

Data characterization is a summarization of the general characteristics or features of a target class of data. The data corresponding to the user-specified class are typically collected by a database query.
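
A minimal sketch of characterization, assuming a hypothetical customer list and a target class of "customers who spend more than 1000". The list comprehension stands in for the database query, and characterize is an illustrative helper, not a standard function.

```python
from collections import Counter
from statistics import mean

# Hypothetical data; names and values are illustrative only.
customers = [
    {"name": "A", "age": 42, "city": "Chennai", "spend": 1500},
    {"name": "B", "age": 38, "city": "Chennai", "spend": 1200},
    {"name": "C", "age": 23, "city": "Madurai", "spend": 300},
]

# Stand-in for the database query that collects the target class.
target = [c for c in customers if c["spend"] > 1000]


def characterize(rows):
    """Summarize the general features of the selected rows."""
    return {
        "count": len(rows),
        "mean_age": mean(r["age"] for r in rows),
        "most_common_city": Counter(r["city"] for r in rows).most_common(1)[0][0],
    }


profile = characterize(target)
print(profile)
```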

  17. Give the output forms of data characterization.

The output can be presented as pie charts, bar charts, curves, multidimensional data cubes, and multidimensional tables, including crosstabs. The resulting descriptions can also be presented as generalized relations or in rule form, called characteristic rules.

  18. Define data discrimination.

Data discrimination is a comparison of the general features of target class data objects with the general features of objects from one or a set of contrasting classes. The target and contrasting classes are specified by the user, and the corresponding data objects are retrieved through database queries.
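
Discrimination can be sketched as the same kind of summary computed for two classes and placed side by side. The product data and the 10% thresholds below are hypothetical; the two list comprehensions stand in for the database queries.

```python
from statistics import mean

# Hypothetical sales data; "growth" is the percentage change in sales.
products = [
    {"product": "laptop", "growth": 12.0, "price": 900},
    {"product": "phone", "growth": 15.0, "price": 700},
    {"product": "radio", "growth": -20.0, "price": 50},
    {"product": "pager", "growth": -30.0, "price": 30},
]

# Target class: products whose sales rose by at least 10%.
target = [p for p in products if p["growth"] >= 10]
# Contrasting class: products whose sales fell by at least 10%.
contrast = [p for p in products if p["growth"] <= -10]

# Compare one general feature (mean price) across the two classes.
comparison = {
    "target_mean_price": mean(p["price"] for p in target),
    "contrast_mean_price": mean(p["price"] for p in contrast),
}
print(comparison)
```

Here the comparison suggests that the growing products are the expensive ones, which is the kind of contrast a discriminant description is meant to expose.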

  19. What is association analysis?

Association analysis is the discovery of association rules showing attribute-value conditions that occur frequently together in a given set of data. It is widely used for market basket or transaction data analysis.
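
Association rules are usually judged by two measures, support and confidence. For a rule A => B, support is the fraction of transactions that contain both A and B, and confidence is the fraction of transactions containing A that also contain B. The sketch below computes both on a made-up set of market baskets; the item names are illustrative.

```python
# Hypothetical market-basket data: one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, db):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in db if itemset <= t) / len(db)


def confidence(antecedent, consequent, db):
    """support(A and B together) divided by support(A), for the rule A => B."""
    return support(antecedent | consequent, db) / support(antecedent, db)


rule_support = support({"bread", "milk"}, baskets)
rule_confidence = confidence({"bread"}, {"milk"}, baskets)
print(rule_support, rule_confidence)
```

For the rule bread => milk, two of the four baskets contain both items, and two of the three baskets with bread also contain milk.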

  20. Define classification.

Classification is the process of finding a set of models that describe and distinguish data classes or concepts, so that the model can be used to predict the class of objects whose class label is unknown. The derived model is based on the analysis of a set of training data.
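
As one possible illustration, the sketch below uses a nearest-neighbour rule, which is only one of many classification techniques and is not prescribed by the definition above. The training points (age, income) and the labels are invented.

```python
import math

# Training data: objects whose class label is already known.
# Each example is ((age, income), label); values are illustrative only.
training = [
    ((25, 20), "low"),
    ((30, 25), "low"),
    ((45, 80), "high"),
    ((50, 90), "high"),
]


def predict(point, training_data):
    """Return the label of the training example closest to point."""
    nearest = min(training_data, key=lambda example: math.dist(example[0], point))
    return nearest[1]


# Objects whose class label is unknown.
label_a = predict((48, 85), training)
label_b = predict((27, 22), training)
print(label_a, label_b)
```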

Part B

1) Define data mining. Describe the steps involved in data mining when viewed as a process of knowledge discovery. Explain the architecture of a data mining system.

2) Describe the kinds of data on which data mining is performed.

3) Briefly explain the kinds of patterns that can be mined.

4) Give the classification of data mining systems. Describe the issues related to data mining.

5) Define data warehouse. Explain its features. Differentiate between operational database systems and data warehouses.

6) Briefly describe the star, snowflake, and fact constellation schemas with examples.

7) Explain data warehouse architecture in detail.

8) How is a fact table designed for a data warehouse?

9) Explain the steps involved in designing a dimension table.

10) Write briefly about the horizontal partitioning strategy.

11) Explain the vertical partitioning strategy.

12) Explain the hardware partitioning strategy.

Hackerx Sasi

with regards
prem sasi kumar arivukalanjiam
