Data Mining tutorial | Data Mining Basics

This data mining tutorial covers data mining basics, including how a data mining architecture works, companies and their tools, applications or use cases, and advantages or benefits. It also mentions links to other resources on data mining, including tools and techniques.

Definition:
Data mining refers to extracting knowledge from large amounts of data. It can be likened to gold mining, in which gold is extracted from the sand, stones and dust deep inside a mine. Examples include searching for mobile phones with particular specifications or prices in the large databases of Amazon or Flipkart, as well as searching for a specific pattern or query on Google or any other search engine. In each case, the data of interest is extracted from the huge amount of data available in a database or data warehouse.

Data mining uses search algorithms, tools and techniques in order to deliver good performance to the user. Many companies across the world are developing software tools that provide data analytics in different forms. The major data mining tools include SPSS Clementine and Intelligent Miner by IBM, MineSet by SGI and Enterprise Miner by SAS.

Data Mining Applications or Use Cases

The following are applications or use cases of data mining in different fields:
• Banking: loan or credit card approval, by predicting good customers from the records of past customers
• Customer relationship management: identify customers who are likely to leave for a competitor
• Targeted marketing: identify likely responders to promotions
• Fraud detection: identify fraudulent events in an online stream of telecommunications or financial transaction events
• Manufacturing and production: automatically adjust knobs when process parameters change
• Medicine: disease outcome, effectiveness of treatments
• Patient disease history analysis: find relationships between diseases
• Molecular/Pharmaceutical: identify new drugs
• Scientific data analysis: identify new galaxies by searching for subclusters
• Web site/store design and promotion: find the affinity of visitors to pages and modify the layout accordingly

The data mining process includes all of the following steps:
• data selection
• pre-processing: cleaning
• transformation
• mining
• result evaluation
• visualization
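
The steps above can be sketched as a chain of small functions, one per step. This is only an illustrative sketch: the sample records, the price bands, the rating threshold and all function names are assumptions made for the example, and the "mining" step is deliberately trivial (a mean rating per price band) rather than a real mining algorithm.

```python
from statistics import mean

# Hypothetical raw records: (phone model, price, user rating); None marks a missing value.
RAW = [
    ("A1", 12000, 4.2),
    ("B2", None, 3.9),    # missing price, dropped during cleaning
    ("C3", 14500, 4.6),
    ("D4", 9000, 4.0),
    ("E5", 30000, 4.8),
    ("F6", 11000, None),  # missing rating, dropped during cleaning
]

def select(records):
    """Data selection: keep only the fields of interest (price, rating)."""
    return [(price, rating) for _model, price, rating in records]

def clean(rows):
    """Pre-processing: drop rows that contain missing values."""
    return [row for row in rows if None not in row]

def transform(rows):
    """Transformation: express the price in thousands (12000 becomes 12.0)."""
    return [(price / 1000, rating) for price, rating in rows]

def mine(rows):
    """Mining: a deliberately simple pattern, the mean rating per price band."""
    bands = {}
    for price_k, rating in rows:
        band = "budget" if price_k < 10 else "mid" if price_k <= 15 else "premium"
        bands.setdefault(band, []).append(rating)
    return {band: round(mean(ratings), 2) for band, ratings in bands.items()}

def evaluate(patterns, min_rating=4.1):
    """Result evaluation: keep only patterns at or above an assumed threshold."""
    return {band: r for band, r in patterns.items() if r >= min_rating}

def visualize(patterns):
    """Visualization: render each pattern as a line of a plain-text bar chart."""
    return [f"{band:<8}{'#' * int(r * 2)} {r}" for band, r in sorted(patterns.items())]

patterns = evaluate(mine(transform(clean(select(RAW)))))
chart = visualize(patterns)
print("\n".join(chart))
```

Each function takes the output of the previous one, so a step can be replaced (for example, a different cleaning rule) without touching the others.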

Data Mining Architecture working

Figure-1: data mining architecture working

Figure-1 depicts how a data mining architecture works. Let us take the example of searching for a smartphone in the 10K to 15K price range on the Amazon website.

➨At the bottom, the figure depicts the data sources from which data is fetched. These include databases, data warehouses, the World Wide Web and other repositories.
➨The next step involves data cleaning, data selection and data integration. Data cleaning refers to removing unwanted data as well as noise. A parser performs this cleaning operation. Data selection refers to picking out the data of interest from the huge amount of available data. Data integration refers to combining data and storing the integrated data in the database.
➨The data warehouse server or database server serves requests from the user, i.e. it finds, extracts and provides relevant data as per the user's interest. Such a request is referred to as a data mining request.
➨Data mining engine: This is an essential module of a data mining system. It performs various tasks, including characterization, prediction, association, correlation analysis, classification and cluster analysis. It interacts with the database or warehouse, the knowledge base and the pattern evaluation module.

Figure-2: how does data mining work

➨Pattern evaluation: This component or module interacts with the other modules to focus the search on the pattern given by the user. If the requested query matches a previous one, then the results are obtained from the previously stored knowledge base. The knowledge base is where the results of all search queries are stored, so that the next time the same pattern is searched, the data mining process need not go through all the steps and can deliver results from its stored knowledge. The steps involved in deriving this knowledge base are shown in figure-2. They include extraction of target data, pre-processing, transformation, finalization of search patterns and storage of results in a knowledge database.
➨User interface: This is the module through which users pose a search query to the data mining system, for example to locate smartphones in the 10K to 15K range. The rest is done for the user by the data mining architecture.
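
The reuse of stored results described under pattern evaluation can be sketched as a simple lookup in front of the full search. This is a minimal sketch under stated assumptions: the catalogue, the class name `MiningSystem` and its methods are hypothetical, and a plain dictionary stands in for the knowledge base.

```python
# Hypothetical catalogue standing in for the database or data warehouse server.
CATALOGUE = [
    {"model": "A1", "price": 12000},
    {"model": "C3", "price": 14500},
    {"model": "D4", "price": 9000},
    {"model": "E5", "price": 30000},
]

class MiningSystem:
    def __init__(self, catalogue):
        self.catalogue = catalogue
        self.knowledge_base = {}  # maps a query to its stored results
        self.scans = 0            # counts how often the full data was searched

    def _search(self, low, high):
        """Stand-in for the full process: scan every record for matches."""
        self.scans += 1
        return sorted(p["model"] for p in self.catalogue if low <= p["price"] <= high)

    def query(self, low, high):
        """User interface entry point: answer from the knowledge base when possible."""
        key = (low, high)
        if key not in self.knowledge_base:
            self.knowledge_base[key] = self._search(low, high)
        return self.knowledge_base[key]

system = MiningSystem(CATALOGUE)
first = system.query(10000, 15000)   # runs the full search and stores the result
second = system.query(10000, 15000)  # same query, served from the knowledge base
print(first, system.scans)
```

The second call returns the stored result without scanning the catalogue again, which is the saving the pattern evaluation step is meant to provide. A real system would also need to refresh stored results when the underlying data changes.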

Benefits or Advantages of Data Mining

Following are the benefits or advantages of data mining for individuals as well as organizations.
• It helps in identifying fraudulent transactions based on user behaviour and patterns in the data. This helps banks and other financial institutions decide whether to issue loans, credit cards etc. based on user behaviour.
• It helps in finding customers who are likely to buy a product, so that relevant advertising campaigns can be targeted using their previous purchase behaviour and their search patterns on Google. Hence data mining techniques such as machine learning help in increasing the sales of a business. Google and other search engines use this to place relevant advertisements on web pages. This benefits customers, advertisers and marketing companies.
• It helps in improving the layout of retail and grocery stores based on customer feedback and previous purchases. This helps retail stores place the best-selling items where they get the most attention from customers.
• It helps in searching for and selecting the right product on popular e-commerce websites such as Amazon and Flipkart.
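
As an illustration of how previous purchases can inform store layout, the sketch below counts which pairs of items are bought together most often. The baskets and the function name `frequent_pairs` are assumptions for the example, and simple pair counting stands in for a full association analysis.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets, one set of items per customer visit.
BASKETS = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "tea"},
    {"bread", "butter", "tea"},
]

def frequent_pairs(baskets, min_support):
    """Count item pairs bought together and keep those seen at least min_support times."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives each pair one canonical order, e.g. ("bread", "butter").
        counts.update(combinations(sorted(basket), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}

pairs = frequent_pairs(BASKETS, min_support=2)
print(pairs)
```

A store could use such pairs to decide which items to shelve near each other.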

To gain complete knowledge of data mining basics, refer to all the links in this data mining tutorial.

Data Mining Tutorial RELATED LINKS

• Data Mining Tools and Techniques