When you run a pricing experiment, you're trying to answer a deceptively simple question: if I charge X instead of Y, will I make more or less money? The challenge is that the answer depends entirely on which customers you're asking about. Your ProductHunt launch buyers, your AppSumo deal customers, and your organic search visitors are three completely different populations — with different willingness to pay, different reasons for buying, and different responses to price changes.
A pricing model that ignores these differences will give you wrong answers. This guide explains what a cohort is in the pricing context, why cohorts need to be treated separately, and how cohort-aware simulations produce more accurate price recommendations for indie founders with limited data.
What Is a Cohort in Pricing?
In general analytics, a cohort is a group of users who share a common characteristic at a specific point in time. In pricing specifically, the most relevant cohort definition is: groups of buyers who came in through the same channel and at the same approximate price point.
Here's why this matters: buyers who come through different channels have fundamentally different price sensitivities. Let's say your product normally sells for $49. You run an AppSumo deal that puts it in front of deal-hunters at $29. The buyers who respond to AppSumo are, by definition, more price-sensitive than your organic buyers — they found you because of a discount, not because they were searching for a solution to a specific problem.
If you mix these cohorts in a price elasticity model, the model sees high sales volume at $29 and lower sales at $49 and concludes: "this product has high price elasticity; demand drops a lot when the price goes up." But that's wrong. Your organic buyers at $49 represent your steady-state demand, and they're much less price-sensitive than the AppSumo cohort suggests.
The consequence of a mixed-cohort model: your price recommendation will be too conservative. It'll suggest prices closer to your current level than your data actually supports, because it's been contaminated by price-sensitive promotional buyers.
The Common Cohort Patterns in Digital Product Sales
1. Organic cohort
Buyers who found your product through search, word of mouth, or organic social. This is your baseline — the cohort that best represents "what will happen if I raise my price for a normal buyer?" Organic buyers tend to have the highest product-market fit and the lowest price sensitivity.
2. Launch cohort
Buyers who arrived during a ProductHunt launch, newsletter feature, or viral moment. These buyers are often early adopters: more price-tolerant on one hand (they're excited about new things), but also skeptical (they've seen dozens of launches and know many products don't last). Their behavior often doesn't predict steady-state demand.
3. Promotional cohort
Buyers who responded to a discount, sale, or bundle deal. These are definitionally price-sensitive. Their conversion rate at the promotional price tells you very little about conversion rate at your standard price.
4. Referral cohort
Buyers who came through a partner, affiliate, or recommendation. These buyers often have higher trust (they came via a recommendation) and may be willing to pay more, or may expect the "friend discount" that referral programs often provide.
How Cohort Mixing Distorts Your Elasticity Estimate
Price elasticity of demand (ε) measures how demand changes in response to price changes. Formally:
ε = % change in quantity demanded / % change in price
If ε = -1.0, a 10% price increase causes a 10% demand decrease — revenue stays roughly flat. If ε = -0.5, a 10% price increase only reduces demand by 5% — revenue increases. If ε = -2.0, a 10% price increase causes a 20% demand decrease — revenue falls significantly.
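The three cases above can be checked with a few lines of Python. This is a minimal sketch that uses the same first-order approximation as the examples (percentage demand change = ε × percentage price change); the function name `revenue_change` is illustrative and not part of any tool.

```python
def revenue_change(elasticity: float, price_change: float) -> float:
    """Approximate fractional revenue change for a small price change.

    Uses the first-order rule: % demand change = elasticity * % price change.
    """
    demand_change = elasticity * price_change
    return (1 + price_change) * (1 + demand_change) - 1


for eps in (-1.0, -0.5, -2.0):
    print(f"elasticity {eps:+.1f}: revenue change {revenue_change(eps, 0.10):+.1%}")
```

For a 10% price increase this prints revenue changes of -1.0%, +4.5%, and -12.0%, which is why ε = -1.0 is described as "roughly flat" rather than exactly flat.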
Now consider what happens when you mix your organic cohort (ε ≈ -0.6) with a promotional cohort (ε ≈ -2.5) in the same dataset. The blended elasticity might be ε ≈ -1.2, which is twice as elastic as your organic baseline and systematically biased toward price sensitivity. Your model then recommends more conservative price tests than your actual organic data would justify.
This isn't a theoretical concern. In practice, AppSumo deals can drive 5–10× your normal monthly sales in a single campaign period. A single AppSumo deal in your dataset can dominate your elasticity estimate and make your organic buyers look 3–4× more price-sensitive than they actually are.
MAD Spike Detection: The First Line of Defense
PricingSim uses a Median Absolute Deviation (MAD) filter to automatically identify and flag sales spikes before estimating elasticity. Here's how it works:
For each transaction period i, we calculate the daily sales rate qty_i (units per day). We then compute the median daily sales rate across all periods. The MAD is the median of the absolute deviations from that median:
MAD = median(|qty_i - median(qty)|)
// Modified Z-score for each period:
z_i = 0.6745 × |qty_i - median(qty)| / MAD
// Flag as spike if z_i > 3.0
The 0.6745 scaling factor makes the MAD score comparable to standard deviations for normally distributed data. The threshold of 3.0 corresponds to roughly 3 standard deviations — flagging only extreme outliers.
Why MAD instead of standard deviation? Because standard deviation is itself inflated by outliers. If you have one period with 200× normal sales (an AppSumo spike), the standard deviation becomes huge, and the spike no longer looks extreme relative to the inflated baseline. MAD is robust to outliers: it is built on the median, which a few extreme values barely move.
After spike removal, PricingSim also allows cohort tagging. If you know that a particular sales period was a promo deal, you can tag it directly and it's excluded from the elasticity estimation regardless of whether the spike detector flags it.
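The filter and the manual tagging step can be sketched in plain Python. This is an illustration of the formula above, not PricingSim's actual code: the function names, the `promo_tagged` parameter, and the fallback for MAD = 0 are assumptions. For simplicity it is applied to the monthly sales counts from the worked example later in this guide rather than to daily rates.

```python
from statistics import median


def modified_z_scores(quantities):
    """Modified Z-score per period: 0.6745 * |q - median| / MAD."""
    med = median(quantities)
    mad = median([abs(q - med) for q in quantities])
    if mad == 0:
        # More than half the periods are identical, so the score is undefined.
        # Treating every period as a non-spike is one simple fallback (assumption).
        return [0.0 for _ in quantities]
    return [0.6745 * abs(q - med) / mad for q in quantities]


def organic_indices(quantities, promo_tagged=(), threshold=3.0):
    """Indices of periods kept for estimation: not a spike, not tagged as promo."""
    scores = modified_z_scores(quantities)
    return [
        i for i, z in enumerate(scores)
        if z <= threshold and i not in promo_tagged
    ]


# Monthly sales, January through December, from the worked example.
monthly_sales = [3, 4, 2, 47, 5, 3, 4, 3, 5, 31, 4, 3]
kept = organic_indices(monthly_sales)
print(kept)  # April (index 3) and October (index 9) are dropped
```

Here the median is 4 and the MAD is 1, so April scores about 29 and October about 18, far above the 3.0 threshold. Passing `promo_tagged={4}` would also exclude May even though the detector does not flag it.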
The Cohort-Aware Simulation Model
Once spikes are removed, the simulation model estimates elasticity from the cleaned, organic-cohort data. It uses a Normal-InvGamma Bayesian conjugate model with the following structure:
The model assumes:
- Log(quantity change) = ε × Log(price change) + noise
- ε has a prior distribution: Normal(-1.0, 0.5²)
- Noise variance has an InvGamma prior: InvGamma(3, 0.5)
The prior on ε is centered at -1.0 because most digital products have near-unit elasticity. With a standard deviation of 0.5, it puts roughly 95% of its probability mass between -2.0 and 0, covering both highly elastic products (|ε| > 1) and inelastic ones (|ε| < 1).
As organic transaction data accumulates, the posterior distribution updates. With 5 data points, the posterior looks very similar to the prior — the data barely moves the estimate. With 50 data points, the data dominates — the prior's influence is small. This is exactly the right behavior: be cautious when you have little data, be confident when you have a lot.
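A conjugate update of this kind can be sketched as follows. This is not PricingSim's implementation. It assumes the textbook Normal-InvGamma parameterization, in which the prior on ε is conditional on the noise variance, and it picks the precision weight `lam = 1.0` so that the marginal prior standard deviation of ε works out to 0.5. The class name, field names, and the synthetic data are all illustrative.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class NormalInvGamma:
    mu: float   # location of the elasticity estimate
    lam: float  # precision weight on mu (assumed parameterization)
    a: float    # InvGamma shape for the noise variance
    b: float    # InvGamma scale for the noise variance

    def elasticity_sd(self) -> float:
        """Marginal standard deviation of elasticity (Student-t, needs a > 1)."""
        return sqrt(self.b / ((self.a - 1) * self.lam))


def update(prior, log_price_changes, log_qty_changes):
    """Posterior for log(qty change) = eps * log(price change) + noise."""
    x, y = log_price_changes, log_qty_changes
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    syy = sum(yi * yi for yi in y)
    lam_n = prior.lam + sxx
    mu_n = (prior.lam * prior.mu + sxy) / lam_n
    a_n = prior.a + len(x) / 2
    b_n = prior.b + 0.5 * (syy + prior.lam * prior.mu ** 2 - lam_n * mu_n ** 2)
    return NormalInvGamma(mu_n, lam_n, a_n, b_n)


prior = NormalInvGamma(mu=-1.0, lam=1.0, a=3.0, b=0.5)

# Synthetic observations whose true elasticity is -0.5 (hypothetical data).
for n in (5, 50):
    post = update(prior, [0.2] * n, [-0.1] * n)
    print(n, round(post.mu, 3), round(post.elasticity_sd(), 3))
```

In this sketch the pull away from the prior depends on the sum of squared log price changes, so it grows with the size of the price changes as well as with the number of observations. Many observations at nearly the same price move the estimate very little.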
Interpreting Percentile Outcomes
The simulation produces not just a point estimate of expected revenue at the test price, but a full probability distribution. PricingSim reports this as five percentiles: p05, p25, p50, p75, p95.
How to read them:
- p50 (median): The middle of the distribution. Half of the simulated outcomes land above it and half below. "If I had to pick one number, this is it."
- p05: The 5th percentile, your downside scenario. "In the worst 5% of outcomes, revenue would be X or lower." This is the floor used in the conservative recommendation rule.
- p95: The 95th percentile, your upside scenario. "In the best 5% of outcomes, revenue would be Y or higher."
The recommendation to "test" a price is only made if p05 >= 0.95 × current_revenue. This means that even in the pessimistic scenario, you don't lose more than 5% of your current revenue. It's a safety constraint, not a performance target.
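One way to produce these percentiles is a small Monte Carlo simulation. The sketch below is a simplification under stated assumptions: it draws ε from a Normal posterior, applies a constant-elasticity demand curve, and ignores the noise term, so its spread reflects elasticity uncertainty only. The function names are hypothetical, and the input numbers are borrowed from the worked example later in this guide.

```python
import random
from statistics import quantiles


def revenue_percentiles(base_price, base_qty, test_price,
                        eps_mean, eps_sd, n_draws=20_000, seed=0):
    """Simulate revenue at test_price and return the five reported percentiles."""
    rng = random.Random(seed)
    ratio = test_price / base_price
    revenues = [
        test_price * base_qty * ratio ** rng.gauss(eps_mean, eps_sd)
        for _ in range(n_draws)
    ]
    cuts = quantiles(revenues, n=20)  # 19 cut points at 5%, 10%, ..., 95%
    return {"p05": cuts[0], "p25": cuts[4], "p50": cuts[9],
            "p75": cuts[14], "p95": cuts[18]}


def passes_floor(p05_revenue, current_revenue, floor=0.95):
    """Safety rule: the downside scenario may lose at most 5% of current revenue."""
    return p05_revenue >= floor * current_revenue


current_revenue = 39 * 3.6
pct = revenue_percentiles(base_price=39, base_qty=3.6, test_price=49,
                          eps_mean=-0.85, eps_sd=0.35)
print({k: round(v, 1) for k, v in pct.items()})
print("recommend test:", passes_floor(pct["p05"], current_revenue))
```

With these inputs the median lands above current revenue while p05 falls well below the 95% floor, so the rule rejects a $49 test even though the expected outcome is positive.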
Worked Example With Numbers
Suppose you have 12 months of Gumroad data for a $39 template. Here are your monthly sales figures:
- Jan–Mar (organic): 3, 4, 2 sales/month at $39
- April (ProductHunt launch at $19): 47 sales
- May–Jun (post-launch at $39): 5, 3 sales/month
- Jul–Sep (organic at $39): 4, 3, 5 sales/month
- Oct (AppSumo deal at $29): 31 sales
- Nov–Dec (organic at $39): 4, 3 sales/month
The MAD spike filter identifies April (47 sales) and October (31 sales) as spikes. Excluding these, your organic dataset has 10 periods with prices of $39 and average sales of 3.6/month.
Your elasticity estimate from this organic data is approximately ε = -0.85. Because the organic periods share a single price point, the data says little about elasticity and the prior carries most of the weight: the posterior is approximately Normal(-0.85, 0.35²).
Testing a price of $49 (26% increase): predicted demand = 3.6 × (49/39)^(-0.85) ≈ 3.0 sales/month. Predicted revenue: $49 × 2.97 ≈ $145/month versus the current $140/month ($39 × 3.6). A modest expected lift of about 3.5%.
At p05 (ε = -1.4): demand = 3.6 × (49/39)^(-1.4) ≈ 2.6 sales → about $128/month, roughly 9% below baseline. This fails the 5% downside floor. The system recommends a more conservative test: $45 instead of $49.
At $45 (15% increase): p50 revenue = $45 × 3.19 ≈ $143/month (+2%). p05 revenue = $45 × 2.95 ≈ $133/month (about -6%). Still fails the downside floor, but barely. The system might recommend testing $43 instead, where the p05 revenue is about $135/month (roughly -4%) and clears the floor.
This iterative tightening is exactly what the conservative optimizer does: it finds the highest price where the downside scenario still stays within 5% of your baseline. For data-sparse situations like this (10 organic periods, 1 price point), the recommendation is conservative by design. More data and more price variation give tighter elasticity estimates and more aggressive recommendations.
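The search itself can be sketched as a scan from high prices down to the baseline. This is a hedged illustration, not the actual optimizer: it assumes a Normal posterior on ε, a constant-elasticity demand curve, whole-dollar price steps, and hypothetical function names. It takes the 5th-percentile elasticity analytically instead of simulating.

```python
from statistics import NormalDist


def p05_revenue(price, base_price, base_qty, eps_mean, eps_sd):
    """5th-percentile revenue at `price` under constant-elasticity demand.

    For a price above the baseline, revenue rises with elasticity, so the
    5th-percentile revenue is the revenue at the 5th-percentile elasticity.
    """
    eps_p05 = NormalDist(eps_mean, eps_sd).inv_cdf(0.05)
    return price * base_qty * (price / base_price) ** eps_p05


def highest_safe_price(base_price, base_qty, eps_mean, eps_sd,
                       max_price, floor=0.95):
    """Highest whole-dollar price whose p05 revenue stays above the floor."""
    baseline = base_price * base_qty
    for price in range(max_price, base_price, -1):
        downside = p05_revenue(price, base_price, base_qty, eps_mean, eps_sd)
        if downside >= floor * baseline:
            return price
    return base_price  # no increase passes the rule; keep the current price


best = highest_safe_price(base_price=39, base_qty=3.6,
                          eps_mean=-0.85, eps_sd=0.35, max_price=60)
print("highest price passing the floor:", best)
```

With the posterior from this example the scan rejects $49 and $45 and settles in the low $40s. If the 5th-percentile elasticity were above -1, downside revenue would rise with price and the scan would simply return `max_price`, so the cap matters in that case.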
Why This Matters in Practice
The difference between a naive elasticity model and a cohort-aware one isn't just academic. In the example above, a naive model including the April and October spikes would estimate ε ≈ -1.8 (very price sensitive) and recommend keeping prices at or below $39. The cohort-aware model, using only organic data, estimates ε ≈ -0.85 and recommends testing a price around $43.
If the cohort-aware model is right, the naive model leaves revenue on the table every month by being too conservative: a few percent in this sparse-data example, and potentially more once additional price variation tightens the elasticity estimate. Over 12 months, that's real money.
For founders doing $500–$5,000 MRR, a 10% revenue increase from a smarter pricing model can be the difference between a hobby project and a real business. That's the practical value of getting the statistics right.