Data Mining In Brief

Data mining is a very popular topic nowadays. Unlike a few years ago, almost everything is now tied to data, and systems are capable of handling such large volumes of information well.

By collecting and examining this data, people can discover patterns. Even when a data set looks like junk as a whole, it may hold hidden patterns, and merging multiple data sources can bring them out and provide important insights. This is called data mining.

Data mining often draws on various sources of data, including business data that is protected by the business and raises privacy concerns. Occasionally multiple sources are integrated, such as third-party data, customer demographics and financial data. The quantity of data available is a critical aspect here. Since the goal is to discover patterns and correlations in sequential or non-sequential data, and to judge whether the acquired data is of good quality, the more data available the better.

Let’s begin with an example. Assume we have some data from the login logs of a web application. Taken as a whole, this data appears to have no value. It may contain the username of a user, the login timestamp, the time spent before logging out, the activities performed, and so on.

At first glance, it is a complete mess. But we can analyze it to extract useful information.

For example, this data can be used to find the regular routine of a particular user. It can also help to find the peak hours of the system. The extracted information may be used to increase the efficiency of the system and make it more user-friendly.
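
The login-log example can be sketched in a few lines of Python. This is only a rough illustration: the records, field layout and function names below are made up for the example, and a real log would need parsing and cleaning first.

```python
from collections import Counter
from datetime import datetime

# Hypothetical login records: (username, login timestamp, minutes until logout).
LOGIN_LOG = [
    ("alice", "2024-03-04 09:05:00", 42),
    ("bob",   "2024-03-04 09:40:00", 15),
    ("alice", "2024-03-05 09:10:00", 38),
    ("carol", "2024-03-05 14:20:00", 60),
    ("alice", "2024-03-06 09:02:00", 45),
    ("bob",   "2024-03-06 20:15:00", 10),
]


def logins_per_hour(records):
    """Count logins for each hour of the day (0-23)."""
    hours = Counter()
    for _user, timestamp, _minutes in records:
        hours[datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").hour] += 1
    return hours


def usual_login_hour(records, username):
    """Return the hour at which the given user logs in most often, or None."""
    user_records = [r for r in records if r[0] == username]
    if not user_records:
        return None
    return logins_per_hour(user_records).most_common(1)[0][0]


peak_hour, peak_count = logins_per_hour(LOGIN_LOG).most_common(1)[0]
print(f"Peak hour: {peak_hour}:00 with {peak_count} logins")
print(f"alice usually logs in around {usual_login_hour(LOGIN_LOG, 'alice')}:00")
```

With this sample data, the peak hour is 9 and that is also when alice usually logs in. The same counting idea works on a much larger log.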

However, data mining is not a simple task. It takes a certain amount of time and needs a specific procedure as well.

Data Mining Steps

The basic steps of data mining are as follows:

Data Collection
Data Cleaning
Data Analysis
Interpretation

Data collection: The first step is to collect information. The more information we have, the easier the analysis will be afterwards. We have to make sure that the source of the data is reliable.
Data cleaning: Because we are collecting a large amount of information, we need to make sure that we keep only the required data and remove the unwanted. Otherwise, it may lead us to false conclusions.
Data analysis: As the title says, the analysis is done and patterns are obtained here.
Interpretation: The analyzed data is reviewed to draw important conclusions, such as forecasts.
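
To make the four steps concrete, here is a minimal Python sketch that chains them together. Everything in it is an assumption for illustration: the record fields, the cleaning rules and the 30-minute threshold are invented, and real projects usually need far more work at each step.

```python
from statistics import mean


def collect():
    """Step 1: gather raw records (hard-coded here; normally read from a reliable source)."""
    return [
        {"user": "alice", "minutes": 42},
        {"user": "bob", "minutes": 15},
        {"user": "alice", "minutes": 38},
        {"user": "", "minutes": 30},        # missing username
        {"user": "carol", "minutes": -5},   # impossible duration
        {"user": "bob", "minutes": 25},
    ]


def clean(records):
    """Step 2: keep only records with a username and a non-negative duration."""
    return [r for r in records if r["user"] and r["minutes"] >= 0]


def analyze(records):
    """Step 3: compute the average session length per user."""
    by_user = {}
    for r in records:
        by_user.setdefault(r["user"], []).append(r["minutes"])
    return {user: mean(values) for user, values in by_user.items()}


def interpret(averages, threshold=30):
    """Step 4: draw a conclusion, here the users whose average session is long."""
    return sorted(user for user, avg in averages.items() if avg >= threshold)


averages = analyze(clean(collect()))
print(averages)
print(interpret(averages))
```

Note how the two bad records are dropped in the cleaning step. If they were kept, the empty username and the negative duration would distort the averages and could lead to false conclusions.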

Data Mining Models

There are different kinds of models associated with data mining:

  1. Descriptive modeling
  2. Predictive modeling
  3. Prescriptive modeling

Descriptive modeling detects the similarities within the gathered data and the reasons behind them. This is very important in drawing the final conclusions from the data set.

Predictive modeling analyzes past data to predict future behavior. Past information gives some hint about what may happen in the future.
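
As a small illustration of predicting from past data, the sketch below fits a straight line to past values with ordinary least squares and extends it one step ahead. The weekly login counts are made up, and a straight line is only one simple choice of model, not the only way to do predictive modeling.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept to the points by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past data: number of logins in weeks 1 to 5.
weeks = [1, 2, 3, 4, 5]
logins = [110, 120, 130, 140, 150]

slope, intercept = fit_line(weeks, logins)
forecast = slope * 6 + intercept  # predicted logins for week 6
print(f"Expected logins in week 6: {forecast:.0f}")
```

Real data is rarely this regular, so the forecast should be treated as a hint about the future rather than a guarantee.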

With the significant growth of the internet, text mining has emerged as a discipline related to data mining. Data must be processed, filtered and analyzed properly to build such predictive models.

Applications of Data Mining

Data mining is useful in many ways. It can be applied successfully in marketing: using data mining, we can evaluate the behavior of customers and market to them more closely.

It helps to identify shoppers' trends for goods in the market, and it enables the retailer to understand the buying behavior of a customer.

In the education domain, we can identify the learning behavior of students, and educational institutions can upgrade their modules and courses accordingly.

We can also use data mining to help deal with natural disasters. If we can collect enough information, we can use it to predict events such as landslides, rainfall, tsunamis, and so on.

There are many more applications of data mining nowadays. They range from simple things like marketing to very complex domains like environmental disaster prediction.

Special Remarks

  • Data mining should not be used when a complete, accurate solution to a particular problem is possible. When such a solution is not possible, we can use data mining techniques with a lot of data to characterize the problem as an input-output relationship.
  • We need to analyze the problem domain to determine whether it is a classification problem (discrete output, e.g. True or False) or an estimation problem (continuous output, e.g. real numbers between 0 and 1).
  • The inputs must carry sufficient information to produce an accurate result. Otherwise, an unavoidable loss of accuracy will follow.
  • There should be sufficient data to produce an accurate result. We need to select a proper algorithm according to the input data. Some algorithms need a large amount of data to reach good accuracy, while others get there quickly.
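
The difference between classification and estimation can be shown with one simple technique used in two ways. The sketch below uses a nearest-neighbour approach, chosen here only for illustration: the sample data, the function names and the choice of k = 3 are all assumptions, not a recommendation.

```python
def nearest(samples, x, k):
    """Return the outputs of the k samples whose input is closest to x."""
    ranked = sorted(samples, key=lambda pair: abs(pair[0] - x))
    return [output for _input, output in ranked[:k]]


def classify(samples, x, k=3):
    """Classification: discrete output, chosen by majority vote of the neighbours."""
    votes = nearest(samples, x, k)
    return max(set(votes), key=votes.count)


def estimate(samples, x, k=3):
    """Estimation: continuous output, the average of the neighbours' values."""
    values = nearest(samples, x, k)
    return sum(values) / len(values)


# Hypothetical data: minutes spent per session -> did the user come back? (True/False)
returned = [(5, False), (8, False), (12, False), (30, True), (35, True), (40, True)]

# Hypothetical data: minutes spent per session -> satisfaction score between 0 and 1
satisfaction = [(5, 0.1), (8, 0.2), (12, 0.3), (30, 0.7), (35, 0.8), (40, 0.9)]

print(classify(returned, 33))                # a discrete answer: True or False
print(round(estimate(satisfaction, 33), 2))  # a continuous answer between 0 and 1
```

The input is the same in both cases. Only the kind of output changes, and that is what decides which type of problem we are solving.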
