CHAPTER ONE

1.0 INTRODUCTION

Data mining is the extraction of hidden, useful information from large databases. It encompasses a wide range of mathematical and computational techniques, such as link analysis, clustering, classification, summarization, and regression analysis. Data mining tools predict future trends and behaviors, allowing businesses to make knowledge-driven decisions. The automated, prospective analyses offered by data mining go beyond the analysis of past events. Data mining tools can answer business questions that were previously too time-consuming to resolve. They search databases for hidden patterns, finding useful information that is beyond the reach of specialists.

Data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and they can be integrated with new products and systems as these are brought into use. When implemented on high-performance client/server or multiprocessing computers, data mining tools can analyze huge databases to answer questions such as, "Which goods do consumers buy most often, and which goods are bought along with them?"

1.1 PROBLEM STATEMENT

Through in-depth research and observation of supermarkets, we have discovered that retailers want to know which products are purchased together, whether as pairs or as larger groups of items. This knowledge can support their decisions on product placement and on the timing and extent of promotions. It can also give them a better understanding of customer purchasing habits by grouping customers according to their transactions.
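The retailer's question, which products are purchased together, can be illustrated by counting how often each pair of items shares a basket. The following is a minimal sketch, not the project's implementation; the transactions and the function name count_pairs are hypothetical and invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each set is the content of one basket.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "sugar"},
    {"bread", "butter", "sugar"},
]

def count_pairs(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted() gives every pair one canonical order,
        # so ("bread", "butter") and ("butter", "bread") are not counted apart.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

pair_counts = count_pairs(transactions)

# List the pairs that share a basket most often.
for pair, n in pair_counts.most_common(3):
    print(pair, n)
```

In this invented data set, bread and butter appear together in three of the four baskets, which is the kind of pattern a retailer could act on when placing products. Counting every pair directly becomes expensive as the number of products grows, which is one reason dedicated algorithms are used on real data.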

This project aims to design and implement a well-structured market basket analysis software tool that solves the problem stated above, and to compare its results with those of an existing software package, WEKA.

1.2 AIM AND OBJECTIVE OF THE STUDY

The aim of the study is to help retailers maximize profit by providing better services to consumers.

The objectives of this study are:

· Cross-Market Analysis - data mining finds associations and correlations between product sales.

· Identifying Customer Requirements - data mining helps to identify the best products for different customers. It uses prediction to find the factors that may attract new customers.

· Customer Profiling - data mining helps to determine what kind of people buy what kind of products.
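The association between product sales mentioned in the first objective is commonly measured with three quantities: support, confidence, and lift. The sketch below shows one way to compute them for a single rule, assuming baskets are held as Python sets. The data and the function names are hypothetical, chosen only for illustration.

```python
# Hypothetical baskets, invented for illustration only.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "sugar"},
    {"bread", "sugar"},
]

def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for b in baskets if items <= b) / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Of the baskets holding the antecedent, the share that also hold the consequent."""
    base = support(baskets, antecedent)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(baskets, set(antecedent) | set(consequent)) / base

def lift(baskets, antecedent, consequent):
    """Confidence divided by the consequent's own support."""
    expected = support(baskets, consequent)
    if expected == 0:
        return 0.0
    return confidence(baskets, antecedent, consequent) / expected

# Rule under test: customers who buy bread also buy butter.
rule_support = support(baskets, {"bread", "butter"})
rule_confidence = confidence(baskets, {"bread"}, {"butter"})
rule_lift = lift(baskets, {"bread"}, {"butter"})
print(rule_support, rule_confidence, rule_lift)
```

A lift above 1 suggests that the two products are bought together more often than their individual popularity would predict, a lift near 1 suggests no association, and a lift below 1 suggests that one product tends to replace the other.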

1.3 METHODOLOGY

I. Data Pre-Processing

The data we obtain is raw, and raw data in the real world may be incomplete or inconsistent, so it has to be pre-processed. The raw data goes through data cleaning, data integration, data normalization, and data reduction, because without quality data there will be no quality mining results.

Ø Data cleaning: filling in missing values and resolving inconsistencies in the raw data.

Ø Data integration: combining data from multiple sources and providing the user with a unified view of the data.

Ø Normalization: organizing the data so that redundancy is minimized.

Ø Data reduction: producing a reduced representation of the data set that is much smaller in volume yet yields the same, or nearly the same, analytical results.
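The four pre-processing steps above can be sketched on a toy example. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the two sources (till_a and till_b), the row layout of (transaction id, item name), and all function names are hypothetical. As a simplification, a row with a missing item name is dropped rather than filled in, because an item name cannot be sensibly guessed.

```python
from collections import Counter, defaultdict

# Hypothetical raw sales rows (transaction id, item name) from two tills.
till_a = [(1, "Bread"), (1, " butter "), (1, "bread"), (2, "Milk"), (2, None)]
till_b = [(3, "bread"), (3, "BUTTER"), (4, "sugar"), (4, "")]

def clean(rows):
    """Cleaning: drop rows with a missing item and unify case and spacing."""
    cleaned = []
    for tid, item in rows:
        if item is None or not item.strip():
            continue  # missing value, skipped (simplification)
        cleaned.append((tid, item.strip().lower()))
    return cleaned

def integrate(*sources):
    """Integration: combine rows from several sources into one unified list."""
    unified = []
    for rows in sources:
        unified.extend(rows)
    return unified

def to_baskets(rows):
    """Redundancy removal: group rows by transaction; sets drop repeated items."""
    grouped = defaultdict(set)
    for tid, item in rows:
        grouped[tid].add(item)
    return dict(grouped)

def reduce_items(baskets, min_count):
    """Reduction: drop items that occur in fewer than `min_count` baskets."""
    counts = Counter(item for items in baskets.values() for item in items)
    keep = {item for item, n in counts.items() if n >= min_count}
    trimmed = {tid: items & keep for tid, items in baskets.items()}
    # Baskets left empty carry no information about co-purchases.
    return {tid: items for tid, items in trimmed.items() if items}

rows = integrate(clean(till_a), clean(till_b))
baskets = to_baskets(rows)
reduced = reduce_items(baskets, min_count=2)
print(baskets)
print(reduced)
```

The reduction step relies on the fact that an item occurring in fewer than min_count baskets cannot belong to any itemset that occurs in at least min_count baskets, so removing it does not change which itemsets are found frequent at that absolute threshold. If support is expressed as a fraction of all transactions, the original transaction count should be kept for the denominator, since empty baskets are discarded here.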

1.5 SCOPE OF THE STUDY

This study focuses on Babcock Ventures supermarket. The scope of the project includes:

1. We aim to develop our own market basket analysis software, which will be used at Babcock University.

2. The software will have a colorful graphical user interface (GUI).

3. The software will be based on the Apriori algorithm.

4. We intend to research the branches of science on which this software will be based, such as artificial intelligence.

5. We will develop software that will eventually stand out among other data mining software.
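Item 3 above names Apriori as the basis of the software. As a minimal sketch of how that algorithm finds frequent itemsets level by level, consider the following. It is an illustration only, not the project's implementation; the function name, the sample transactions, and the min_support value are all assumptions.

```python
from itertools import combinations

def apriori(baskets, min_support):
    """Return {itemset: support} for every itemset meeting `min_support`."""
    baskets = [frozenset(b) for b in baskets]
    n = len(baskets)

    def frequent(candidates):
        """Keep the candidates whose support reaches the threshold."""
        result = {}
        for candidate in candidates:
            s = sum(1 for b in baskets if candidate <= b) / n
            if s >= min_support:
                result[candidate] = s
        return result

    # Level 1: frequent single items.
    items = {item for b in baskets for item in b}
    level = frequent({frozenset([item]) for item in items})
    all_frequent = dict(level)

    k = 2
    while level:
        prev = set(level)
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = frequent(candidates)
        all_frequent.update(level)
        k += 1
    return all_frequent

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "sugar"},
    {"milk", "sugar"},
    {"bread", "milk"},
]

frequent_itemsets = apriori(transactions, min_support=0.5)
for itemset, s in sorted(frequent_itemsets.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), s)
```

With this invented data and a minimum support of 0.5, the frequent itemsets are bread, butter, milk, and the pair bread with butter. The prune step is what keeps Apriori practical: a candidate is counted against the transactions only if all of its smaller subsets were already found frequent.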

1.6 LIMITATION OF THE STUDY

The limitations of this software include:

1. Data restrictions: this is a major factor that stands in the way of executing this project. Since there is no data on households and individual consumers, such purchases are left out of the analysis.

2. Time constraints: this is also a major factor, because the software cannot work reliably on a small amount of raw data, which tends to mislead the retailer. In short, the software needs large volumes of data.

MARKET BASKET ANALYSIS
