Renowned author and founder of the Predictive Analytics World conference series Eric Siegel shares six key definitions (and The Five Effects of Prediction) from his book, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die.
Predictive Analytics Terminology
1. Predictive Analytics
Technology that learns from experience (data) to predict the future behavior of individuals in order to drive better decisions.
In this definition, individuals is a broad term that can refer to people as well as other organizational elements. Most examples in this book involve predicting people, such as customers, debtors, applicants, employees, students, patients, donors, voters, taxpayers, potential suspects, and convicts. However, predictive analytics also applies to individual companies (e.g., for business-to-business), products, locations, restaurants, vehicles, ships, flights, deliveries, buildings, manholes, transactions, Facebook posts, movies, satellites, stocks, Jeopardy! questions, and much more. Whatever the domain, PA renders predictions over scalable numbers of individuals.
2. Predictive Model
A mechanism that predicts a behavior of an individual, such as click, buy, lie, or die. It takes characteristics (variables) of the individual as input and provides a predictive score as output. The higher the score, the more likely it is that the individual will exhibit the predicted behavior.
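To make the input/output shape of a predictive model concrete, here is a minimal Python sketch. It is only an illustration, not a method from the book: the variable names, the weights, and the logistic form are all assumptions chosen for the example. In practice the weights would be learned from data rather than written by hand.

```python
import math

# Hypothetical weights, as if learned from historical data (illustrative only).
WEIGHTS = {"visits_last_month": 0.4, "items_in_cart": 0.8, "is_returning": 1.2}
BIAS = -3.0


def predictive_score(individual):
    """Map an individual's characteristics to a score between 0 and 1.

    The higher the score, the more likely the individual is to exhibit
    the predicted behavior (here, an assumed "buy" behavior).
    """
    z = BIAS + sum(WEIGHTS[name] * individual.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing into (0, 1)


# Two made-up individuals described by the same variables.
browser = {"visits_last_month": 1, "items_in_cart": 0, "is_returning": 0}
loyal = {"visits_last_month": 5, "items_in_cart": 2, "is_returning": 1}

for name, person in [("browser", browser), ("loyal", loyal)]:
    print(name, round(predictive_score(person), 3))
```

The model assigns the second individual a higher score because that individual's characteristics carry more positive weight, which is exactly the ranking behavior the definition describes.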
3. Artificial Intelligence
Advanced machine capabilities that are by definition impossible to achieve since, once achieved, they have necessarily been trivialized (by way of being mechanized) and are therefore not impressive in the subjective sense of "intelligence," so they no longer qualify. To put it another way, the word "intelligence" has no formal definition, so why use it in an engineering context? However, I still feel like IBM's Watson seems truly intelligent when I watch it play the TV quiz show Jeopardy! I'm like, "Wow!" This definition is not an excerpt from the book Predictive Analytics, but it does summarize one of my conclusions in the book's chapter on Watson.
4. Uplift Model
A type of predictive model that predicts the influence on an individual's behavior that results from applying one treatment over another. Synonyms include: differential response, impact, incremental impact, incremental lift, incremental response, net lift, net response, persuasion, true lift, or true response model.
The uplift score output by an uplift model answers the question, "How much more likely is this treatment to generate the desired outcome than the alternative treatment?" For more information, see the article Personalization Is Back: How to Drive Influence by Crunching Numbers (which includes links for further reading at the end), Chapter 7 of Predictive Analytics, and, for more technical citations, the Notes corresponding to that chapter, which may be downloaded as a PDF.
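As a rough illustration of what an uplift score measures, the Python sketch below estimates, per customer segment, the difference in response rate between two treatments. This is a deliberate simplification and an assumption of this example: a real uplift model would score individuals from their variables, whereas this version only compares observed rates within hypothetical segments on synthetic records.

```python
from collections import defaultdict


def uplift_by_segment(records):
    """Estimate uplift per segment as rate(treatment A) - rate(treatment B).

    records: iterable of (segment, treatment, responded) tuples, where
    treatment is "A" or "B" and responded is a bool.
    """
    # counts[segment][treatment] = [number of responses, number treated]
    counts = defaultdict(lambda: {"A": [0, 0], "B": [0, 0]})
    for segment, treatment, responded in records:
        cell = counts[segment][treatment]
        cell[0] += int(responded)
        cell[1] += 1

    uplift = {}
    for segment, by_treatment in counts.items():
        (resp_a, n_a), (resp_b, n_b) = by_treatment["A"], by_treatment["B"]
        if n_a and n_b:  # need both treatments observed to compare them
            uplift[segment] = resp_a / n_a - resp_b / n_b
    return uplift


# Synthetic data: treatment A helps "new" customers but not "loyal" ones.
records = (
    [("new", "A", True)] * 30 + [("new", "A", False)] * 70
    + [("new", "B", True)] * 10 + [("new", "B", False)] * 90
    + [("loyal", "A", True)] * 50 + [("loyal", "A", False)] * 50
    + [("loyal", "B", True)] * 50 + [("loyal", "B", False)] * 50
)
scores = uplift_by_segment(records)
print(scores)
```

In this made-up data, both segments respond to treatment A at a healthy rate, but only the "new" segment responds more to A than to B. A plain response model would favor the "loyal" segment; an uplift view favors the segment whose behavior the treatment actually changes.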
5. Vast Search
The term that industry leader (and Chapter 1's predictive investor) John Elder coined for predictive modeling's intrinsic automation of testing many predictor variables and the associated peril of stumbling across a correlation with the target variable that may be perceived as significant (if considered in isolation, without considering the search that was employed to unearth it) but that in fact was only due to random perturbations. Synonyms include: multiple comparisons trap, multiple hypothesis testing, researcher degrees of freedom, over-search (akin to overfit), look-elsewhere effect, the garden of forking paths, fishing expedition, cherry-picking findings, data dredging, significance chasing, and p-hacking.
For more information, see my article "HBO Teaches You How to Avoid Bad Science," Chapter 3 of the 2016 updated edition of my book, Predictive Analytics, and, for more technical citations, the Notes corresponding to that chapter, which may be downloaded as a PDF.
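The peril of vast search can be seen in a small simulation. The Python sketch below is an illustration under assumed settings (30 rows, 1,000 candidate predictors, a fixed random seed), not an analysis from the book. Every predictor is pure noise with no real relationship to the target, yet searching across many of them turns up one that looks strongly correlated.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_spurious_correlation(n_rows, n_predictors, rng):
    """Search random predictors and return the largest |correlation| found.

    The target and every predictor are independent noise, so any
    correlation discovered here is due to chance alone.
    """
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    best = 0.0
    for _ in range(n_predictors):
        predictor = [rng.gauss(0, 1) for _ in range(n_rows)]
        best = max(best, abs(pearson(predictor, target)))
    return best


rng = random.Random(0)  # fixed seed so the run is repeatable
one = best_spurious_correlation(30, 1, rng)
many = best_spurious_correlation(30, 1000, rng)
print("best of 1 predictor:    ", round(one, 3))
print("best of 1000 predictors:", round(many, 3))
```

Viewed in isolation, the winning predictor from the large search would look like a meaningful finding. Judged against the size of the search that produced it, it is what one should expect from noise.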
6. Automatic Suspect Discovery (ASD)
In law enforcement, the identification of previously unknown potential suspects by applying predictive analytics to flag and rank individuals according to their likelihood to be worthy of investigation, either because of their direct involvement in, or relationship to, criminal activities.
Further info: This topic is explored in a special sidebar on the NSA's use of predictive analytics within the ethics- and privacy-focused Chapter 2 of Predictive Analytics. Also see my Newsweek op-ed on this topic.
ASD provides a novel means to unearth new suspects. Using it, law enforcement can hunt scientifically, more effectively targeting its search by applying predictive analytics, the same state-of-the-art, data-driven technology behind fraud detection, financial credit scoring, spam filtering, and targeted marketing. ASD flags new persons of interest who may then be elevated to suspect by an ensuing investigation. By the formal law enforcement definition of the word, an individual would not be classified as a suspect by a computer, only by a l