Skubl AI Predicts Pregnancy Through Shopping Habits

There is a moment, weeks before a pregnancy test is taken, when something shifts.
Not in the mind. Not yet. But in the body… and quietly, consequentially, in the shopping basket.
Hormones change before awareness does. Nausea arrives before the explanation for it. The body begins adjusting to a new reality that the conscious mind hasn’t caught up with yet. And in that gap, between biological onset and human acknowledgement, a pattern emerges in purchase data that is structured, measurable, and remarkably consistent.
Skubl’s predictive intelligence engine found it.
The Question Behind the Research
The starting point was a well-known piece of retail folklore. In 2012, Forbes reported that Target’s internal analytics team had identified a cluster of product purchases (unscented lotion, certain supplements, specific food items) that reliably predicted pregnancy among shoppers. In some cases, they knew before the shopper’s own family did.
The finding was striking. The methodology was never published. The story entered popular culture as a cautionary tale about corporate surveillance rather than a serious contribution to data science.
Skubl’s research team asked a different question: could this be done transparently, reproducibly, and at scale, using publicly available data, with a published methodology, and with ethics built into the architecture rather than bolted on afterwards?
The answer, it turns out, is yes.
What the Basket Actually Reveals
Working with over 3.4 million grocery orders from the publicly available Instacart dataset, and training the model on a validated corpus of 1,000 labelled consumer records, Skubl’s engine identified a set of purchase signals that precede awareness of pregnancy with high predictive accuracy. The model achieved an AUC-ROC of 0.901: a strong result for any predictive classifier, and particularly significant given that it operates entirely on anonymised transaction sequences, with no access to demographic records, personal information, or post-hoc confirmation.
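For readers less familiar with the metric, AUC-ROC can be read as the probability that the model scores a randomly chosen positive case above a randomly chosen negative one. The sketch below computes it that way on a handful of invented scores. The function name `auc_roc` and the toy data are assumptions for illustration only; this is not Skubl’s evaluation code and the numbers are unrelated to the 0.901 result.

```python
def auc_roc(labels, scores):
    """Share of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win. This is the pairwise (Mann-Whitney) reading
    of AUC-ROC, written for clarity rather than speed.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data, invented for illustration (1 = labelled positive, 0 = negative).
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_auc = auc_roc(toy_labels, toy_scores)
print(f"toy AUC-ROC: {toy_auc:.3f}")
```

A value of 0.5 means the ranking is no better than chance; 1.0 means every positive is ranked above every negative.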
The signals themselves are revealing, not just statistically, but behaviourally.
The strongest early predictors are hormonally driven. Folic acid and prenatal vitamin purchases are the highest-weighted signals in the model, typically appearing eight to twelve weeks before any explicit pregnancy-related behaviour change. These are not conscious decisions to “announce” anything through the basket. They are instinctive responses to physiological signals the buyer may not yet have interpreted.
Nausea-associated purchases follow closely: ginger ale, sea bands, sparkling water as a soft drink substitution. These appear six to ten weeks before explicit confirmation, and they tell a story of a body managing symptoms before the mind has named them.
Dietary composition shifts emerge in parallel: an increase in organic produce, a change in the structure of weekly shops, a broadening of what’s being bought and why. These register in the model’s Basket Composition Shift Index, a measure of how far a user’s current purchasing is diverging from their own historical baseline.
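This post does not give the formula behind the Basket Composition Shift Index, so the following is only one plausible way to measure divergence from a personal baseline. It compares the category mix of a shopper’s recent orders with the mix of their earlier orders using total variation distance. The names `category_shares` and `shift_index`, the choice of distance, and the example baskets are all assumptions made for illustration.

```python
from collections import Counter


def category_shares(orders):
    """Convert a list of orders (each a list of category names) into proportions."""
    counts = Counter(item for order in orders for item in order)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("orders contain no items")
    return {category: n / total for category, n in counts.items()}


def shift_index(baseline_orders, recent_orders):
    """Total variation distance between two category mixes.

    Returns 0.0 when the mixes are identical and 1.0 when they share
    no categories at all. A hypothetical stand-in for the real index.
    """
    base = category_shares(baseline_orders)
    recent = category_shares(recent_orders)
    categories = set(base) | set(recent)
    return 0.5 * sum(
        abs(base.get(c, 0.0) - recent.get(c, 0.0)) for c in categories
    )


# Invented example baskets for one hypothetical shopper.
baseline = [["wine", "snacks", "produce"], ["wine", "snacks", "dairy"]]
recent = [
    ["vitamins", "ginger ale", "produce"],
    ["produce", "sparkling water", "dairy"],
]
toy_shift = shift_index(baseline, recent)
print(f"shift index: {toy_shift:.3f}")
```

Because each shopper is compared only with their own history, a measure of this kind reacts to change in an individual’s behaviour, not to how they compare with other shoppers.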
What is notably less powerful as an early signal is what many people might expect: the suppression of alcohol and tobacco. These do appear in the data, but they appear later. They reflect conscious behavioural adjustment following confirmed awareness of pregnancy. They lag the transition rather than predict it. By the time someone has stopped buying wine, the more meaningful predictive window has already opened and, in a commercial sense, already begun to close.
The basket knows first. Not because it’s reading intent. Because it’s reading life.
The Window That Opens Before Awareness
Applied to a sample of 25,181 eligible Instacart users, the model identified 4,776 users (approximately 19% of the scored cohort) exhibiting purchase patterns consistent with early pregnancy. This figure is consistent with population-level pregnancy prevalence rates adjusted for the younger, urban demographic profile typical of online grocery shoppers.
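The cohort share quoted above follows directly from the two counts. A short check, using only the figures stated in this post (the variable names are illustrative):

```python
# Counts as stated in the post.
scored_users = 25_181
flagged_users = 4_776

# Fraction of the scored cohort flagged by the model, as a percentage.
flagged_share_pct = round(100 * flagged_users / scored_users, 1)
print(f"flagged share: {flagged_share_pct}%")
```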
More significant than the number is the timing. The Basket Composition Shift Index begins to diverge from baseline patterns several order cycles before any explicitly pregnancy-associated product appears. There is a window, measurable in weeks, in which the data has already registered a transition that the consumer has not yet consciously identified or acted upon.
This is the window that changes everything for a brand.
Life transitions are the moments at which consumer loyalties are most fluid. When a person’s life fundamentally changes, whether through pregnancy, relocation, a new pet, or bereavement, their entire consumption landscape is up for renegotiation. The brands they used before may not follow them into the next chapter. The brands that reach them at the threshold of that transition, with relevance and precision, have an extraordinary opportunity to become part of a new, durable relationship.
Reaching a consumer after they’ve announced a pregnancy, after they’ve already filled a new basket, is not the same as reaching them at the onset. The former is reactive. The latter is intelligence.
Beyond Pregnancy: A Generalisable Framework
Pregnancy is the headline finding, but it is not the limit of what this methodology can detect.
The same basket-sequence modelling framework, applied to different labelled training data and different feature signals, produces coherent predictive results for new pet ownership, residential relocation, and bereavement. Each of these transitions reshapes household consumption in structured, temporally predictable ways. Each generates signals in purchase data that precede the visible behavioural shift.
The implication is significant: this is not a pregnancy model. It is a life-stage intelligence infrastructure. Any major life transition that rewrites the shopping basket is, in principle, detectable within sequential purchase data, before it becomes visible through conventional means.
How Skubl Applies This
The research was conducted using publicly available, fully anonymised datasets. No individual was identified. No personal data was accessed. The methodology is published and reproducible, a deliberate contrast to the opaque commercial practices that have historically attracted criticism in this space.
In commercial application, Skubl’s predictive engine operates on first-party data provided by brands, with their customers’ consent. Life-stage intelligence is delivered as a probabilistic enrichment layer, a statistical signal applied to existing customer data, interpreted and actioned by Skubl’s expert team, not simply surfaced in a self-serve dashboard.
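As a rough picture of what a probabilistic enrichment layer could look like in practice, the sketch below attaches a score to existing customer records and leaves non-consenting customers unscored. Everything here is hypothetical: the `CustomerRecord` fields, the `enrich` function, and the stand-in scoring function are assumptions for illustration and do not describe Skubl’s actual system or data model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class CustomerRecord:
    customer_id: str
    consented: bool
    # Probability-like score added by the enrichment step; None if not scored.
    life_stage_score: Optional[float] = None


def enrich(
    records: List[CustomerRecord],
    score_fn: Callable[[str], float],
) -> List[CustomerRecord]:
    """Return copies of the records with a score added where consent exists."""
    enriched = []
    for record in records:
        score = score_fn(record.customer_id) if record.consented else None
        enriched.append(
            CustomerRecord(record.customer_id, record.consented, score)
        )
    return enriched


# Stand-in scorer with invented values; a real model would go here.
toy_scores = {"c1": 0.82, "c2": 0.10}
customers = [
    CustomerRecord("c1", consented=True),
    CustomerRecord("c2", consented=False),
]
enriched_customers = enrich(customers, lambda cid: toy_scores[cid])
for customer in enriched_customers:
    print(customer)
```

The score stays a probability attached to a record, not a label on a person, which matches the description of the output as a statistical signal for experts to interpret.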
This is the distinction that matters. The technology is proprietary. But the application is bespoke, expert-led, and tailored to the specific commercial question each brand is trying to answer.
A Different Kind of Consumer Intelligence
Most consumer data tells you who someone was. What they bought last month. What category they belong to. What their lifetime value has been.
Skubl tells you who they’re becoming.
The grocery basket has always contained more information than most brands have known how to read. It doesn’t respond to demographics. It doesn’t respond to declared intent. It responds to life, to the quiet, structural shifts that precede the moments every brand most wants to be present for.
The cart knows first.
Now, so do you.
This post summarises findings from “The Cart Knows First: Life-Stage Prediction from Large-Scale Consumer Purchase Data” — a paper by Cameron Batt, published by Skubl Research, 2025. The full paper is available at skubl.com.
All research conducted on publicly available, anonymised datasets. No individuals are identified. Skubl’s commercial applications operate on consented first-party data.
Cameron Batt
Founder @ Skubl and published machine learning researcher.