Wednesday, August 22, 2012

Data Mining & Data Warehousing: Unit 3, 2-Mark Questions with Answers and 16-Mark Questions

Unit III

Part A

1)      Define support in association rule mining.

The rule A => B holds in the transaction set D with support s, where s is the percentage of transactions in D that contain A ∪ B, i.e., both A and B. This is taken to be the probability P(A ∪ B).

2)      Define confidence.

The rule A => B has confidence c in the transaction set D if c is the percentage of transactions in D containing A that also contain B. This is taken to be the conditional probability P(B|A).
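The two definitions above can be checked on a small example. The following is a minimal sketch, assuming a hypothetical toy transaction set D of four market baskets; the function names support and confidence are chosen here for illustration only. Both functions return fractions rather than percentages, and confidence would divide by zero if the antecedent never occurs.

```python
# Toy transaction set D (hypothetical data, for illustration only).
D = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate P(B|A) as support(A ∪ B) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

s = support({"bread", "milk"}, D)        # 2 of the 4 transactions
c = confidence({"bread"}, {"milk"}, D)   # 2 of the 3 transactions with bread
print(f"support = {s:.2f}, confidence = {c:.2f}")
```

For the rule bread => milk this gives a support of 0.50 and a confidence of about 0.67.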

3)      Define occurrence frequency of an item set.

A set of items is referred to as an item set. The occurrence frequency of an item set is the number of transactions that contain the item set.

4)      How are association rules mined from large databases?

Association rule mining is a two-step process:

·        Find all frequent item sets

·        Generate strong association rules from the frequent item sets
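The two steps can be sketched in a few lines. This is a brute-force illustration, not the Apriori algorithm: it counts every possible item set, which is only practical for tiny data. The transaction set D, the thresholds, and the function names frequent_itemsets and strong_rules are assumptions made for this example.

```python
from itertools import combinations

# Toy transaction set D (hypothetical data, for illustration only).
D = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def frequent_itemsets(transactions, min_count):
    """Step 1 (brute force): count every candidate and keep the frequent ones."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            count = sum(set(candidate) <= t for t in transactions)
            if count >= min_count:
                frequent[frozenset(candidate)] = count
    return frequent

def strong_rules(frequent, min_conf):
    """Step 2: split each frequent item set into A => B, keep confident rules."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                # Every subset of a frequent item set is itself frequent,
                # so its count is already stored in `frequent`.
                conf = count / frequent[a]
                if conf >= min_conf:
                    rules.append((set(a), set(itemset - a), conf))
    return rules

freq = frequent_itemsets(D, min_count=2)
rules = strong_rules(freq, min_conf=0.7)
print(rules)
```

With a minimum support count of 2 and a minimum confidence of 0.7, five item sets are frequent and only the rule butter => bread is strong.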

5)      When does an item set satisfy minimum support?

An item set satisfies minimum support if the occurrence frequency of the item set is greater than or equal to the product of min_sup and the total number of transactions in D.

6)      Define minimum support count.

The number of transactions required for an item set to satisfy minimum support is referred to as the minimum support count. If an item set satisfies minimum support, then it is a frequent item set.
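As a small worked example, the minimum support count is the product of min_sup and the number of transactions, rounded up to a whole transaction. The sketch below assumes min_sup is given as a fraction between 0 and 1; the data and the function names are hypothetical.

```python
import math

# Toy transaction set D (hypothetical data, for illustration only).
D = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def min_support_count(min_sup, num_transactions):
    """Smallest whole number of transactions that meets the min_sup fraction."""
    return math.ceil(min_sup * num_transactions)

def is_frequent(itemset, transactions, min_sup):
    """True if the occurrence frequency reaches the minimum support count."""
    count = sum(set(itemset) <= t for t in transactions)
    return count >= min_support_count(min_sup, len(transactions))

print(min_support_count(0.5, len(D)))          # 0.5 * 4 = 2 transactions
print(is_frequent({"bread", "milk"}, D, 0.5))  # occurs in 2 transactions
print(is_frequent({"milk", "butter"}, D, 0.5)) # occurs in 1 transaction
```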

7)      Give the classification of association rules.

·        Based on the types of values handled in the rule

·        Based on the dimensions of data involved in the rule.

·        Based on the levels of abstractions involved in the rule set.

·        Based on various extensions to association mining.

8)      Define Frequent Closed Item Set.

A frequent closed item set is an item set that is both frequent and closed. An item set c is closed if there exists no proper superset c' of c such that every transaction containing c also contains c'.
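The closedness test can be illustrated as follows. This is a minimal sketch on hypothetical data; it relies on the observation that if some larger superset occurred in exactly the same transactions as c, then so would c extended by any single item of that superset, so checking single-item extensions is enough.

```python
# Toy transaction set D (hypothetical data, for illustration only).
D = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions)

def is_closed(itemset, transactions):
    """True if no proper superset occurs in exactly the same transactions."""
    itemset = frozenset(itemset)
    count = support_count(itemset, transactions)
    other_items = set().union(*transactions) - itemset
    return all(
        support_count(itemset | {x}, transactions) < count
        for x in other_items
    )

print(is_closed({"butter"}, D))  # every butter transaction also has bread
print(is_closed({"bread"}, D))   # adding milk or butter loses a transaction
```

Here {butter} is not closed, because its superset {bread, butter} appears in the same two transactions, while {bread} is closed.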

9)      Define Apriori property.

If an item set I does not satisfy the minimum support threshold min_sup, then I is not frequent, i.e., P(I) < min_sup. If an item A is added to the item set I, the resulting item set I ∪ A cannot occur more frequently than I. Therefore I ∪ A is not frequent either, i.e., P(I ∪ A) < min_sup.
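The property can be checked empirically on a small data set. The sketch below, on hypothetical data, extends every item set by one item and records any case where the larger set occurs more often than the smaller one; by the Apriori property no such case can exist.

```python
from itertools import combinations

# Toy transaction set D (hypothetical data, for illustration only).
D = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions)

items = sorted(set().union(*D))
violations = []
for k in range(1, len(items)):
    for subset in combinations(items, k):
        for extra in sorted(set(items) - set(subset)):
            small = support_count(set(subset), D)
            large = support_count(set(subset) | {extra}, D)
            # Adding an item can only keep or shrink the matching transactions.
            if large > small:
                violations.append((subset, extra))

print(violations)
```

The list of violations stays empty, which is why any superset of an infrequent item set can be discarded without counting it.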

10)   Define Anti-Monotone property.

If a set cannot pass a test, all of its supersets will fail the same test as well. It is called anti-monotone because the property is monotonic in the context of failing a test.

11)   List the two steps involved in the Apriori algorithm.

·        Join Step

·        Prune Step
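The join and prune steps can be sketched as one candidate-generation function. This is an illustrative sketch, assuming item sets are kept as sorted tuples; the function name apriori_gen and the sample frequent 2-item sets are chosen for this example and are not taken from a specific library.

```python
from itertools import combinations

def apriori_gen(frequent_prev, k):
    """Build candidate k-item sets from the frequent (k-1)-item sets."""
    prev = sorted(tuple(sorted(s)) for s in frequent_prev)
    prev_set = set(prev)
    candidates = []
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            a, b = prev[i], prev[j]
            # Join step: merge two item sets that share their first k-2 items.
            if a[:k - 2] == b[:k - 2]:
                candidate = a + (b[-1],)
                # Prune step: drop the candidate if any (k-1)-subset
                # is not frequent (Apriori property).
                if all(sub in prev_set
                       for sub in combinations(candidate, k - 1)):
                    candidates.append(candidate)
    return candidates

# Hypothetical frequent 2-item sets.
L2 = [{"A", "B"}, {"A", "C"}, {"A", "D"}, {"B", "C"}]
C3 = apriori_gen(L2, 3)
print(C3)
```

The join step proposes ABC, ABD and ACD; the prune step removes ABD and ACD because BD and CD are not frequent, leaving only ABC to be counted against the database.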

12)   List the search strategies for mining multilevel associations with reduced support.

·        Level by level independent

·        Level cross filtering by single item

·        Level cross filtering by K item set.

13)   Compare the level by level independent and level cross filtering by K item set strategies.

The level by level independent strategy leads to examining numerous infrequent items at low levels and finding associations between items of little importance.

The level cross filtering by K item set strategy allows the mining system to examine only the children of frequent K item sets. This restriction is very strong in that there usually are not many K item sets that, when combined, are also frequent. Hence many valuable patterns may be filtered out using this approach.

14)   Define single dimensional association rule.

buys(X, "IBM desktop computer") => buys(X, "Sony b/w printer")

The above rule is said to be a single dimensional rule since it contains a single distinct predicate (e.g., buys) with multiple occurrences (i.e., the predicate occurs more than once within the rule). It is also known as an intra dimension association rule.

15)   Define multi dimensional association rules.

Association rules that involve two or more dimensions or predicates are referred to as multi dimensional association rules.

age(X, "20…29") ^ occupation(X, "Student") => buys(X, "Laptop")

The above rule contains three predicates (age, occupation, buys) each of which occurs only once in the rule. There are no repeated predicates in the above rule. Multi dimensional association rules with no repeated predicates are called interdimension association rules.

16)   Define categorical attribute.

Categorical attributes have a finite number of possible values with no ordering among the values (e.g., occupation, brand). Categorical attributes are also called nominal attributes since their values are "names of things".

17)   Define Quantitative Attributes.

Quantitative attributes are numeric and have an implicit ordering among values (e.g., age, income, and price).
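Because quantitative values are ordered, they can be grouped into intervals such as the age range "20…29" used in the multi dimensional rule above. The following is a minimal sketch of equal-width binning; the function name age_interval and the width of 10 are assumptions made for illustration, not a prescribed method.

```python
def age_interval(age, width=10):
    """Map a numeric age to an equal-width interval label such as '20…29'."""
    low = (age // width) * width
    return f"{low}…{low + width - 1}"

# Hypothetical ages, for illustration only.
ages = [23, 29, 31, 45]
labels = [age_interval(a) for a in ages]
print(labels)
```

Once the ages are replaced by interval labels, each label can be treated like a categorical value when counting item sets.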

Part B

1)      Explain the Apriori algorithm in detail.

2)      Explain the techniques used to improve the efficiency of the Apriori algorithm.

3)      Explain mining frequent item sets without candidate generation.

4)      Explain mining multi level association rules from transaction databases.

5)      Explain association rule mining in detail with an example.

6)      What are the security requirements in a data warehouse?

7)      What are the overnight operations in the data warehouse?

8)      What is clustering? Briefly describe the following approaches: partitioning methods, hierarchical methods, density-based methods and model-based methods.

9)      Briefly outline how to compute the dissimilarity between objects described by the following types of variables: asymmetric binary variable, nominal variable, ratio-scaled variable, numerical variable.

10)   Describe the OLAP data cube techniques.


--
Hackerx Sasi

with regards
prem sasi kumar arivukalanjiam
