Predictive Models and Their Uses

Kelvin Arellano
Jan 28, 2021

Predictive analytics is the use of data, statistical algorithms and machine-learning techniques to identify the likelihood of future outcomes based on historical data.

The goal is to go beyond descriptive statistics and provide a plausible estimate of what will happen in the future. The end result is to streamline decision making and produce new insights that lead to better actions.

Predictive models use known results to develop (or train) a model that can be used to predict values for different or new data. The modeling results in predictions that represent a probability of the target variable (for example, revenue) based on estimated significance from a set of input variables. This is different from descriptive models that help you understand what happened or diagnostic models that help you understand key relationships and determine why something happened.

The top five predictive analytics models are the classification model, the clustering model, the forecast model, the outliers model, and the time series model.

First, the classification model is, in some ways, the simplest of the types of predictive analytics models. It puts data in categories based on what it learns from historical data.

Classification models are best suited to answering yes-or-no questions, providing broad analysis that’s helpful for guiding decisive action. These models can answer questions such as:

  • For a retailer, “Is this customer about to stop using our product?”
  • For a loan provider, “Will this loan be approved?” or “Is this applicant likely to default?”
  • For an online banking provider, “Is this a fraudulent transaction?”

The breadth of possibilities with the classification model, and the ease with which it can be retrained with new data, means it can be applied to many different industries.
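As a rough illustration of the idea, here is a minimal sketch of a classifier for the retailer question above. It uses a simple k-nearest-neighbours vote, which is only one of many possible classification techniques; the `history` data, the two features (logins per month, support tickets), and the `classify` function are all hypothetical and made up for this example.

```python
from collections import Counter
from math import dist

# Hypothetical historical data: (logins per month, support tickets) -> outcome
history = [
    ((25, 0), "stays"),
    ((30, 1), "stays"),
    ((22, 1), "stays"),
    ((3, 4), "churns"),
    ((5, 6), "churns"),
    ((2, 5), "churns"),
]

def classify(customer, labelled, k=3):
    """Label a new customer by majority vote among the k most similar past customers."""
    # Sort past customers by Euclidean distance to the new one and keep the k closest
    nearest = sorted(labelled, key=lambda row: dist(row[0], customer))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(classify((4, 5), history))   # a customer who rarely logs in
print(classify((28, 0), history))  # a customer who logs in often
```

"Retraining" in this sketch simply means appending newly observed customers and their outcomes to `history`.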

Next, the clustering model sorts data into separate, nested smart groups based on similar attributes. If an e-commerce shoe company is looking to implement targeted marketing campaigns for its customers, it could go through the hundreds of thousands of records to create a tailored strategy for each individual. But is this the most efficient use of time? Probably not. Using the clustering model, the company can quickly separate customers into similar groups based on common characteristics and devise strategies for each group at a larger scale.

Other use cases of this predictive modeling technique might include grouping loan applicants into “smart buckets” based on loan attributes, identifying areas in a city with a high volume of crime, and benchmarking SaaS customer data into groups to identify global patterns of use.
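To make the grouping step concrete, below is a minimal sketch of k-means, one common clustering algorithm (the article does not say which algorithm a given product uses). The `customers` data and the `k_means` helper are hypothetical, and the starting centroids are naively taken to be the first `k` points.

```python
from math import dist

def k_means(points, k, iterations=10):
    """Group points into k clusters; returns (centroids, cluster index per point)."""
    centroids = [points[i] for i in range(k)]  # naive init: the first k points
    assignment = []
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid
        assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Step 2: move each centroid to the mean of the points assigned to it
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, assignment

# Hypothetical customers: (orders per year, average order value in dollars)
customers = [(2, 40), (12, 150), (3, 35), (11, 160), (1, 45), (13, 155)]
centroids, groups = k_means(customers, k=2)
print(groups)
print(centroids)
```

In practice the features would usually be rescaled first so that one attribute (here, order value) does not dominate the distance calculation.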

After that, we come to the forecast model. It deals in metric value prediction, estimating a numeric value for new data based on learnings from historical data.

This model can be applied wherever historical numerical data is available. Scenarios include:

  • A SaaS company can estimate how many customers they are likely to convert within a given week.
  • A call center can predict how many support calls they will receive per hour.
  • A shoe store can calculate how much inventory they should keep on hand in order to meet demand during a particular sales period.

The forecast model also considers multiple input parameters. If a restaurant owner wants to predict the number of customers she is likely to receive in the following week, the model will take into account factors that could impact this, such as: Is there an event close by? What is the weather forecast? Is there an illness going around?
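A minimal sketch of this kind of numeric prediction, assuming a single input variable and an ordinary least-squares line, might look as follows. The call-center numbers and the names `fit_line` and `predict` are hypothetical; a forecast with several inputs, like the restaurant example, would need multiple regression or another technique rather than this one-variable version.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Estimate the target value for a new input x."""
    return slope * x + intercept

# Hypothetical history: active users (thousands) vs. support calls per hour
active_users = [1, 2, 3, 4, 5]
calls_per_hour = [12, 19, 33, 41, 50]

slope, intercept = fit_line(active_users, calls_per_hour)
# Estimated calls per hour if the user base grows to 6,000
print(round(predict(6, slope, intercept), 1))
```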

Penultimately, we have the outliers model, which is oriented around anomalous data entries. It can identify anomalies either by themselves or in conjunction with other numbers and categories. Examples include:

  • Recording a spike in support calls, which could indicate a product failure that might lead to a recall
  • Finding anomalous data within transactions, or in insurance claims, to identify fraud
  • Finding unusual information in your NetOps logs and noticing the signs of impending unplanned downtime

The outliers model is particularly useful for predictive analytics in retail and finance. For example, when identifying fraudulent transactions, the model can assess not only amount, but also location, time, purchase history and the nature of a purchase (e.g., a $1000 purchase on electronics is not as likely to be fraudulent as a purchase of the same amount on books or common utilities).
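The support-call spike mentioned above can be sketched with a simple single-metric detector. This example assumes a modified z-score based on the median absolute deviation, using the commonly cited constant 0.6745 and cutoff 3.5; the `daily_calls` data and the `find_outliers` function are hypothetical, and a real fraud model would combine many more signals than one number.

```python
from statistics import median

def find_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a spread measure that outliers barely affect
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread at all, so nothing can be scored
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]

# Hypothetical support calls per day; day 6 contains a spike
daily_calls = [102, 98, 110, 95, 105, 99, 240, 101, 97, 104]
print(find_outliers(daily_calls))
```

The median is used instead of the mean here because a single large spike would otherwise inflate the baseline it is being compared against.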

Lastly, we have the time series model. It works on a sequence of data points captured over time, using time as the input parameter. For example, it might use the last year of data to develop a numerical metric and then predict the next three to six weeks of data using that metric. Use cases for this model include the number of daily calls received in the past three months, sales for the past 20 quarters, or the number of patients who showed up at a given hospital in the past six weeks. It is a potent means of understanding how a single metric develops over time, with a level of accuracy beyond simple averages. It also takes into account seasons of the year or events that could impact the metric.

If the owner of a salon wishes to predict how many people are likely to visit his business, he might turn to the crude method of averaging the total number of visitors over the past 90 days. However, growth is not always static or linear, and the time series model can better capture exponential growth and follow a company’s trend more closely. It can also forecast for multiple projects or multiple regions at the same time instead of just one at a time.
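The gap between a plain average and a trend-aware forecast can be sketched as follows. This example assumes roughly exponential growth and fits a straight line to the logarithm of the values; it ignores seasonality and events, which a fuller time series model would account for. The `weekly_visitors` data and the function names are hypothetical.

```python
from math import exp, log

def fit_exponential_trend(series):
    """Least-squares line through log(value) against time steps 0..n-1."""
    n = len(series)
    times = range(n)
    logs = [log(v) for v in series]
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    slope = (
        sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
        / sum((t - mean_t) ** 2 for t in times)
    )
    intercept = mean_l - slope * mean_t
    return slope, intercept

def forecast(series, steps_ahead):
    """Extend the fitted trend beyond the last observation."""
    slope, intercept = fit_exponential_trend(series)
    n = len(series)
    return [exp(intercept + slope * (n - 1 + h)) for h in range(1, steps_ahead + 1)]

# Hypothetical salon visitors per week, growing about 10% each week
weekly_visitors = [100, 110, 121, 133, 146, 161]

naive_average = sum(weekly_visitors) / len(weekly_visitors)
next_week = forecast(weekly_visitors, steps_ahead=1)[0]
print(round(naive_average, 1))  # the crude average sits below recent weeks
print(round(next_week, 1))      # the trend forecast continues the growth
```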

And that’s all for now. Hopefully this has given you a better understanding of predictive models, how they’re used, and why they’re useful.
