Article published on Daniels Insights, thought leadership from Purdue’s Business School. https://business.purdue.edu/daniels-insights/
Some of the most valuable opportunities in business are also the hardest to find.
Fraud cases make up a tiny fraction of transactions. High-value B2B prospects are buried in massive databases. Niche product adopters hide among mainstream buyers. In health care, rare disease patients represent a minuscule share of the population but carry enormous clinical and financial impact.
Research published in Quantitative Marketing and Economics by Qiang Liu of the Mitch Daniels School of Business and his coauthors Yong Cai, Yunlong Wang and Fan Zhang offers a high-value insight leaders should notice: when events are rare, relationships matter more than isolated data points.
The study, “Predicting rare events in markets with relational data,” focuses on identifying physicians who treat hereditary angioedema (HAE), a life-threatening genetic disorder affecting roughly 1 in 50,000 people. Using national prescription and medical claims data — including nearly 196 million patients and more than 2 million physicians — the researchers built a predictive model designed to detect extremely rare outcomes in a highly interconnected system. Their insights extend far beyond health care.
Rare events break traditional models
Most predictive systems are optimized for overall accuracy. That works when positive outcomes are common. But when the event rate is extremely low — fraud below 0.1% or conversion rates under 0.2% — traditional models often default to the majority class. They predict “no” almost all the time and still appear accurate.
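To make the accuracy trap concrete, here is a minimal sketch in Python. The numbers are hypothetical (100,000 transactions with a 0.1% fraud rate) and are not drawn from the study; the "model" simply says "no" every time.

```python
# Hypothetical data: 100,000 transactions, 0.1% of them fraudulent.
# These figures are illustrative assumptions, not results from the study.
n_total = 100_000
n_positive = 100

labels = [1] * n_positive + [0] * (n_total - n_positive)

# A "model" that always predicts the majority class ("no").
predictions = [0] * n_total

# Accuracy: share of all cases the model gets right.
correct = sum(1 for y, p in zip(labels, predictions) if y == p)
accuracy = correct / n_total

# Recall: share of the rare positives the model actually catches.
true_positives = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
recall = true_positives / n_positive

print(f"accuracy = {accuracy:.3%}")  # accuracy = 99.900%
print(f"recall   = {recall:.0%}")    # recall   = 0%
```

The model scores 99.9% on accuracy while finding none of the cases that matter, which is why accuracy alone is a poor yardstick when positives are this rare.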
The researchers addressed this by modeling the network linking physicians and patients. Patients see multiple physicians; physicians treat many patients. These connections create patterns that amplify rare signals.
The business implication is to pay attention to relational data. Stop treating customers, suppliers or partners as independent units if they clearly operate in networks. Audit your analytics processes. Where are you assuming independence? Where could shared customers, shared vendors or shared behaviors improve prediction? Incorporating relational data can surface opportunities isolated models miss.
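One simple way to picture a relational signal is to count how many of a physician's patients also see a physician already known to treat the condition. The Python sketch below is an illustration only, not the researchers' model; the visit records, the identifiers and the helper shared_patients_with_known are all hypothetical.

```python
from collections import defaultdict

# Hypothetical claims records: (physician_id, patient_id) pairs.
visits = [
    ("dr_a", "p1"), ("dr_a", "p2"),
    ("dr_b", "p2"), ("dr_b", "p3"),
    ("dr_c", "p4"),
    ("dr_d", "p1"), ("dr_d", "p3"),
]

# Physicians already known to treat the rare condition (assumed for illustration).
known_positive = {"dr_a"}

# Build both sides of the physician-patient network.
patients_of = defaultdict(set)
physicians_of = defaultdict(set)
for physician, patient in visits:
    patients_of[physician].add(patient)
    physicians_of[patient].add(physician)

def shared_patients_with_known(physician):
    """Count this physician's patients who also see a known-positive physician."""
    return sum(
        1
        for patient in patients_of[physician]
        if (physicians_of[patient] - {physician}) & known_positive
    )

# Score every physician who is not already labeled.
scores = {
    doc: shared_patients_with_known(doc)
    for doc in sorted(patients_of)
    if doc not in known_positive
}
print(scores)  # {'dr_b': 1, 'dr_c': 0, 'dr_d': 1}
```

A model that treats each physician as an independent row never sees this count. Here dr_b and dr_d each share a patient with a known treater, while dr_c has no such connection, and a feature like this could be fed into whatever classifier a team already uses.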
Small improvements create outsized returns
When the team compared their network-aware model to leading machine learning benchmarks, it performed better at identifying true rare-disease physicians — including “emerging” physicians not yet formally documented.
In rare-event settings, even modest gains matter. A small lift in fraud detection can prevent significant losses. A slight improvement in targeting can uncover a handful of high-value accounts. Identifying physicians likely to see rare-disease patients can accelerate diagnosis and generate meaningful revenue. Leaders should rethink performance metrics. Instead of celebrating overall accuracy, analytics teams should focus on precision among top-ranked targets, lift over baseline and the cost of missed positives. In rare-event markets, marginal improvement often drives disproportionate value.
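The metrics named above can be computed in a few lines. The following Python sketch uses a hypothetical list of ten scored accounts; the scores, the labels and the cutoff k are assumptions chosen for illustration.

```python
# Hypothetical scored accounts: (model_score, true_label), where 1 marks a rare positive.
scored = [
    (0.95, 1), (0.90, 0), (0.85, 1), (0.80, 0), (0.40, 0),
    (0.35, 0), (0.30, 1), (0.20, 0), (0.10, 0), (0.05, 0),
]

def precision_at_k(scored, k):
    """Share of true positives among the k highest-scored cases."""
    top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    return sum(label for _, label in top) / k

def lift_at_k(scored, k):
    """Precision among the top k divided by the overall base rate."""
    base_rate = sum(label for _, label in scored) / len(scored)
    return precision_at_k(scored, k) / base_rate

k = 4
p_at_k = precision_at_k(scored, k)   # 2 of the top 4 are positives
lift = lift_at_k(scored, k)          # compared with a 30% base rate

# Positives that fall outside the top k are the "missed positives" to cost out.
missed = sum(label for _, label in scored) - round(p_at_k * k)

print(f"precision@{k} = {p_at_k:.2f}")
print(f"lift@{k} = {lift:.2f}")
print(f"missed positives = {missed}")
```

In this toy example, a team contacting only the top four accounts would find positives at a higher rate than by picking accounts at random, and would still miss one. Multiplying that miss count by the value of a single positive gives a rough cost of missed positives.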
Indirect signals can unlock opportunity
In health care, strict privacy rules limit direct use of rare diagnostic codes. The researchers instead relied on broader behavioral and relational signals to predict which physicians were likely treating rare-disease patients.
The lesson for business is that when high-signal data is restricted, incomplete or costly, indirect patterns can still reveal opportunity. Engagement trends, transaction overlap, referral networks or usage clusters may signal emerging potential long before formal labels appear.
Rather than waiting for perfect data, leaders should ask what adjacent behaviors correlate with their target outcome and build models around those patterns.
Build for emergence, not just confirmation
Perhaps the most powerful result was the model’s ability to identify physicians likely to treat rare-disease patients in the future, not just those already recognized.
Most organizations build systems that confirm what has already happened. Fewer design analytics to identify what is about to happen.
Leaders should shift analytics from backward-looking validation to forward-looking discovery. That means designing models to surface emerging accounts, emerging risk clusters and emerging markets, not just reinforce existing segments.
The strategic takeaway
Rare events define high-impact markets. They are difficult to detect, easy to overlook and disproportionately valuable.
This research shows that the signals leaders seek can often be found in relationships rather than standalone data points. Organizations that build network-aware thinking into their analytics and focus on minority-class performance will gain a competitive advantage others miss. That advantage doesn’t come from predicting the average case. It comes from identifying the rare one that matters most.
Cai, Y., Liu, Q., Wang, Y., Zhang, F. (Equal Contribution) (2025). Predicting rare events in markets with relational data, Quantitative Marketing and Economics, Vol 23, 544-588.

