In Part 1 of this series, we covered the five foundational metrics every merchant should track: revenue and margin, customer lifetime value, inventory turnover, conversion rate, and acquisition cost. Those metrics are the starting line. They tell you if the business is healthy at a glance.

But they don't answer the harder questions. Why are customers leaving after their second purchase? Which products are actually profitable after all costs? Are your discounts driving new revenue or just subsidizing sales that would have happened anyway?

This guide covers five intermediate analytics techniques that answer those questions. Each one builds on the foundation from Part 1, and none of them require a data science degree — just your order data and a willingness to look beyond the surface.

Cohort analysis: grouping customers by behavior over time

Your average customer retention rate is a lie. Not because the math is wrong, but because it blends every customer into one number. A customer acquired through a viral Instagram post behaves differently from one who found you via Google Shopping. Averaging them together obscures both stories.

Cohort analysis fixes this by grouping customers based on when they first purchased, then tracking each group's behavior over time.

How it works

A cohort is a group of customers who share a common starting point — usually the month of their first purchase. You create one cohort per month and then measure what percentage of each cohort makes a repeat purchase in subsequent months.

The result is a retention table that looks something like this:

Cohort	Month 0	Month 1	Month 2	Month 3	Month 4	Month 5
Jan	100%	22%	15%	12%	11%	10%
Feb	100%	18%	11%	8%	7%	—
Mar	100%	25%	18%	14%	—	—
Apr	100%	28%	20%	—	—	—

Month 0 is always 100% — every customer in the cohort made at least one purchase (that's how they entered the cohort). The subsequent columns show what percentage came back.
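
If you want to build this table yourself, here is a minimal sketch in Python. It assumes your order export can be reduced to (customer ID, order date) pairs; the function name `retention_table` and the sample customers are made up for illustration and do not come from any particular platform.

```python
from collections import defaultdict
from datetime import date


def month_index(d):
    """Put every month on one numeric axis so month offsets are simple subtraction."""
    return d.year * 12 + d.month


def retention_table(orders):
    """orders: iterable of (customer_id, order_date) pairs.

    Returns {cohort_label: [pct_month_0, pct_month_1, ...]}, where each value is
    the share of the cohort that placed an order that many months after its first.
    """
    first_month = {}
    active = defaultdict(set)  # customer -> month indexes with at least one order
    for customer, day in orders:
        m = month_index(day)
        active[customer].add(m)
        first_month[customer] = min(m, first_month.get(customer, m))

    last_month = max(max(months) for months in active.values())

    cohorts = defaultdict(list)  # first-purchase month -> customers
    for customer, m in first_month.items():
        cohorts[m].append(customer)

    table = {}
    for start in sorted(cohorts):
        members = cohorts[start]
        row = []
        # Only months that have actually happened get a column for this cohort.
        for offset in range(last_month - start + 1):
            returned = sum(1 for c in members if start + offset in active[c])
            row.append(round(100 * returned / len(members), 1))
        label = f"{(start - 1) // 12}-{(start - 1) % 12 + 1:02d}"
        table[label] = row
    return table


sample_orders = [
    ("a", date(2024, 1, 5)), ("a", date(2024, 2, 10)),
    ("b", date(2024, 1, 20)),
    ("c", date(2024, 2, 3)), ("c", date(2024, 3, 1)),
    ("d", date(2024, 1, 9)), ("d", date(2024, 3, 15)),
]
for cohort, row in retention_table(sample_orders).items():
    print(cohort, row)
```

Newer cohorts get shorter rows because their later months have not happened yet, which is why the table above has dashes in the bottom-right corner.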

How to read the table

Look for two things:

The drop from Month 0 to Month 1. This is your first-to-second purchase conversion. In the table above, January lost 78% of its customers after the first purchase. That's normal for most ecommerce — but it's also your biggest leverage point. Moving that number from 22% to 28% (as the April cohort achieved) has a massive impact on LTV.

Whether later cohorts perform better or worse than earlier ones. If March and April have better Month 1 retention than January, something improved — maybe a post-purchase email sequence, a better unboxing experience, or a product quality change. If retention is declining cohort over cohort, something is breaking.

Actionable takeaways

Compare cohorts that received different marketing treatments, post-purchase flows, or product assortments
If your Month 1 retention is below 20%, focus all effort on the first-to-second purchase gap before investing in anything else
Track cohorts by acquisition channel, not just by month — a cohort of customers from paid ads will behave differently from organic search customers

RFM segmentation: scoring your customers on what matters

RFM stands for Recency, Frequency, and Monetary value. It's a framework for scoring every customer on three dimensions so you can treat different segments differently instead of blasting everyone with the same email.

The scoring methodology

Each customer gets a score from 1 to 5 on each dimension:

Recency (R): How recently did they last purchase? A customer who bought 3 days ago scores a 5. One who last bought 200 days ago scores a 1.

Frequency (F): How many times have they purchased in a given period (typically 12 months)? Someone with 10+ orders is a 5. A one-time buyer is a 1.

Monetary (M): How much have they spent in total? Your top spenders score a 5. Your lowest spenders score a 1.

The thresholds between scores depend on your business. A subscription coffee brand and a furniture retailer will have very different definitions of "high frequency." A common approach is to rank your customers on each dimension and split the list into five equal groups (quintiles), so each score bucket contains roughly 20% of your customers.

The result is a three-digit score for each customer. A 5-5-5 is your champion: they bought recently, they buy often, and they spend a lot. A 1-1-1 is long gone.
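
The quintile approach can be sketched in a few lines of Python. This is an illustration under assumptions, not a standard implementation: the input shape (last order date, order count, total spend per customer) and the function names are hypothetical, and tied values are split by their position in the input, which matters if many customers share the same order count.

```python
from datetime import date


def quintile_scores(values, higher_is_better=True):
    """Rank the values and hand out scores 1-5 so each score covers about 20% of them."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * len(values)
    for rank, i in enumerate(order):
        # rank 0 is the worst value; integer division spreads ranks over 1..5
        scores[i] = 1 + (rank * 5) // len(values)
    return scores


def rfm_scores(customers, today):
    """customers: {customer_id: (last_order_date, order_count, total_spend)}.

    Returns {customer_id: (R, F, M)} with each score between 1 and 5.
    """
    ids = list(customers)
    days_since = [(today - customers[c][0]).days for c in ids]
    r = quintile_scores(days_since, higher_is_better=False)  # fewer days is better
    f = quintile_scores([customers[c][1] for c in ids])
    m = quintile_scores([customers[c][2] for c in ids])
    return {c: (r[k], f[k], m[k]) for k, c in enumerate(ids)}
```

On a real customer list you would likely want a tie rule (for example, giving every one-time buyer the same frequency score) before trusting the buckets.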

Practical segments that drive action

You don't need to manage all 125 possible score combinations. Map them into five or six actionable segments:

Champions (R:5, F:4-5, M:4-5): Your best customers. Protect them with loyalty perks, early access, and personal attention. Never send them a discount — they're already buying at full price.

Loyal Customers (R:3-5, F:3-5, M:3-5, excluding anyone already counted as a Champion): Consistent buyers who haven't quite reached champion status. Upsell and cross-sell to increase their monetary score. Recommend higher-value products or complementary items.

At-Risk (R:2, F:3-5, M:3-5): Previously good customers whose recency is slipping. They used to buy regularly but haven't purchased recently. These need a win-back campaign — now, before they slide further.

New Customers (R:4-5, F:1, M:1-3): They just bought for the first time. Everything you do in the next 30 days determines whether they become loyal or disappear. Focus on post-purchase experience, not more marketing.

Lost (R:1, F:1-2, M:1-2): Bought once or twice a long time ago and never came back. Don't spend heavily trying to revive them. A single re-engagement email is fine; beyond that, reallocate the budget to segments that respond.
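
The segment rules above translate into a short lookup. In this sketch the rules are checked from top to bottom, so a customer who fits both the Champions and Loyal Customers ranges lands in Champions. The "Other" bucket is an assumption added here for score combinations the five segments don't cover; the function name is illustrative.

```python
def rfm_segment(r, f, m):
    """Map a 1-5 (R, F, M) score triple to one of the named segments."""
    if r == 5 and f >= 4 and m >= 4:
        return "Champions"
    if r >= 3 and f >= 3 and m >= 3:
        return "Loyal Customers"
    if r == 2 and f >= 3 and m >= 3:
        return "At-Risk"
    if r >= 4 and f == 1 and m <= 3:
        return "New Customers"
    if r == 1 and f <= 2 and m <= 2:
        return "Lost"
    # Assumption: combinations not named in the text fall into a catch-all bucket.
    return "Other"
```

If a large share of your list ends up in "Other", that is a sign you need a sixth segment or wider ranges.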

Why this beats simple "top customers" lists

A top-customers-by-revenue list will include a customer who spent $2,000 in a single order eight months ago and hasn't returned. That customer is not your "best" customer — they're a one-timer. RFM catches this because their frequency and recency scores are low despite a high monetary score. The distinction matters for how you allocate marketing spend.

Contribution margin analysis: what you actually keep

Gross margin — revenue minus cost of goods sold — is a good start. But it doesn't tell you what you actually keep from a sale. Contribution margin goes further by subtracting every variable cost tied to that transaction.

The full calculation

Contribution Margin = Revenue - COGS - Shipping Cost - Transaction Fees - Packaging - Discount Applied - Returns Allowance

Here's an example for a $50 order:

Line Item	Amount
Revenue	$50.00
COGS	-$18.00
Shipping (merchant-paid)	-$5.50
Payment processing (2.9% + $0.30)	-$1.75
Packaging materials	-$1.20
Discount applied (10% off)	-$5.00
Returns allowance (8% avg return rate)	-$4.00
Contribution margin	$14.55
Contribution margin %	29.1%

Most merchants are surprised by this number. The headline gross margin looks like 64% ($50 - $18 = $32, and $32 / $50 = 64%). The contribution margin is 29.1%. That's the real number: the amount left to cover fixed costs (rent, salaries, software) and profit.
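
To run the same arithmetic across many orders or SKUs, a small function is enough. This sketch mirrors the table above, including its simplification of charging the processing fee and the returns allowance on the full $50 rather than on the discounted amount actually collected. The function and parameter names are illustrative, and the 2.9% + $0.30 defaults are just the rates from the example.

```python
def contribution_margin(revenue, cogs, shipping, packaging,
                        discount_rate=0.0, return_rate=0.0,
                        fee_rate=0.029, fee_fixed=0.30):
    """Return (contribution margin in dollars, contribution margin as % of revenue)."""
    processing = round(revenue * fee_rate + fee_fixed, 2)
    discount = revenue * discount_rate
    returns_allowance = revenue * return_rate  # expected cost of returns, per order
    margin = (revenue - cogs - shipping - processing
              - packaging - discount - returns_allowance)
    return round(margin, 2), round(100 * margin / revenue, 1)


# The $50 order from the table above
print(contribution_margin(50.00, cogs=18.00, shipping=5.50, packaging=1.20,
                          discount_rate=0.10, return_rate=0.08))
# prints (14.55, 29.1)
```

Setting `discount_rate=0.0` on the same order shows how much of the gap between gross and contribution margin comes from the discount alone.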

Where to apply this

Product-level contribution margin: Run this calculation for every SKU. You'll find products with high revenue but low or negative contribution margins — often because of high return rates, heavy shipping costs, or deep discounts. These "revenue heroes" might actually be dragging your business down.

Channel-level contribution margin: Different channels carry different variable costs. Your Shopify store has payment processing fees but no marketplace commission. Your Amazon channel has referral fees that typically run 8-15% depending on the category. Your Square POS sales avoid shipping costs entirely. Compare contribution margins across channels to understand where your money actually comes from.

Customer-level contribution margin: Combine contribution margin with RFM data. A customer with a high monetary score but a habit of buying only during sales and returning 30% of orders might have a negative contribution margin. Knowing this changes how much you're willing to spend to retain them.

Basket analysis: what your customers buy together

Basket analysis examines co-purchase patterns — which products appear together in the same order. It turns your transaction data into cross-sell and bundling opportunities.

The three metrics that matter

Support: How often does a product pair appear across all orders? If 800 out of 10,000 orders contain both Product A and Product B, the support is 8%. This tells you how common the pairing is.

Confidence: Of all orders containing Product A, what percentage also contain Product B? If 2,000 orders contain Product A and 800 of those also contain Product B, the confidence is 40%. This tells you how predictable the relationship is.

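Lift: Does buying Product A actually increase the probability of buying Product B beyond random chance? Lift is the confidence divided by the share of all orders that contain Product B. A lift above 1.0 means buyers of A are more likely than the average customer to buy B. A lift close to 1.0 means the products just happen to both be popular, with no genuine affinity. A lift below 1.0 means buying A makes B less likely.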
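
All three metrics can be computed directly from a list of orders. The sketch below uses only Python's standard library and assumes each order has already been reduced to a set of product names; the function name `pair_metrics` is illustrative.

```python
from collections import Counter
from itertools import combinations


def pair_metrics(orders):
    """orders: list of sets of product names, one set per order.

    Returns {(a, b): (support, confidence, lift)} for the rule "bought a -> bought b",
    in both directions, for every pair that appears together at least once.
    """
    n = len(orders)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in orders:
        items = sorted(set(basket))  # sorted so each pair has one canonical key
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    metrics = {}
    for (a, b), together in pair_counts.items():
        support = together / n
        for x, y in ((a, b), (b, a)):
            confidence = together / item_counts[x]    # share of x orders that include y
            lift = confidence / (item_counts[y] / n)  # versus y's overall share of orders
            metrics[(x, y)] = (support, confidence, lift)
    return metrics
```

Support and lift come out the same in both directions. Confidence does not, which is why it is the number to check when deciding which product page should carry the cross-sell recommendation.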

Finding cross-sell opportunities

Run basket analysis on your last 6-12 months of order data. Sort product pairs by confidence and filter for lift above 1.5. The results typically fall into three categories:

Obvious pairs you already know: Shampoo and conditioner. Phone and case. These validate your existing merchandising but don't reveal anything new.

Surprising pairs you didn't expect: A skincare store might discover that 38% of Vitamin C serum buyers also buy a specific SPF product — not the best-seller, but a mid-range option that gets no promotion. These unexpected affinities are where revenue hides.

One-directional relationships: Sometimes buying A predicts buying B, but not the reverse. Coffee maker buyers often add a grinder. Grinder buyers don't necessarily add a coffee maker. This asymmetry determines where you place the cross-sell recommendation.

Acting on the data

Bundle high-confidence pairs with a small discount — enough to feel like a deal, not enough to erode margins
Reposition products that frequently co-occur closer together on your site (or in your physical store)
Build email flows around the relationship: "You bought X — customers who bought X also love Y"
Adjust inventory planning for paired products — if one is trending up, its partner likely needs restocking too

Discount effectiveness: measuring what discounts actually do

Discounts feel like they work. Revenue spikes during a sale. Orders increase. The dashboard looks great. But the question most merchants never answer is: how much of that revenue was incremental, and how much would have happened anyway?

Incremental vs. cannibalized sales

Incremental revenue is the revenue you would not have earned without the discount. A new customer who discovers your store through a 15% off promotion and makes their first purchase — that's incremental.

Cannibalized revenue is revenue that shifts from full price to discounted price. A loyal customer who was going to buy this week anyway but waits for your Saturday sale — that's cannibalization. You got the same sale at a lower margin.

How to measure discount effectiveness

Compare discounted vs. non-discounted periods. Track revenue, order volume, and new customer acquisition during promotional periods and equivalent non-promotional periods. If a 20% off sale generates 40% more orders but 60% of those orders come from existing buyers, many of whom would have purchased anyway, the incremental lift is much smaller than it appears.

Track the pre-sale dip and post-sale hangover. If revenue drops in the days before a sale (customers waiting) and the days after (demand pulled forward), the net impact of the promotion is smaller than the sales-period spike suggests. Plot daily revenue for two weeks before, during, and after each promotion.
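
One rough way to put a number on the dip and the hangover is to compare each day against a baseline. The sketch below assumes you already have a typical non-promotional daily revenue figure (how you estimate that baseline is up to you) and a list of daily revenue covering the promotion and the two weeks on either side. The function name and arguments are illustrative.

```python
def promotion_net_lift(daily_revenue, baseline_daily, promo_start, promo_end, window=14):
    """daily_revenue: list of revenue per day; promo_start/promo_end: inclusive indexes.

    Returns revenue above or below baseline for the pre-sale window, the promotion
    itself, the post-sale window, and the net of all three.
    """
    lo = max(0, promo_start - window)
    hi = min(len(daily_revenue), promo_end + 1 + window)

    def excess(days):
        return sum(daily_revenue[d] - baseline_daily for d in days)

    result = {
        "pre_sale": excess(range(lo, promo_start)),
        "during": excess(range(promo_start, promo_end + 1)),
        "post_sale": excess(range(promo_end + 1, hi)),
    }
    result["net"] = result["pre_sale"] + result["during"] + result["post_sale"]
    return result
```

If "net" is much smaller than "during", most of the spike was demand shifted from nearby days rather than new demand. Keep in mind this measures revenue, not margin, so a positive net figure can still be unprofitable once the discount is counted.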

Calculate the break-even volume increase. For a product with a 50% contribution margin, a 20% discount requires a 67% increase in unit sales just to earn the same total contribution margin as before. The formula:

Break-even volume increase = Discount % / (Contribution margin % - Discount %)

At 50% margin and 20% discount (both measured against the full price): 20% / (50% - 20%) = 67%. If your volume doesn't increase by at least 67%, the promotion earned less margin than selling at full price would have.
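
The formula follows from one line of algebra. With margin m and discount d both expressed as a share of the full price, each discounted unit earns m - d instead of m, so matching the original margin dollars requires (1 + x)(m - d) = m, which gives x = d / (m - d). A small helper (the name is illustrative) makes it easy to test different discount depths:

```python
def break_even_volume_increase(margin_pct, discount_pct):
    """Fractional increase in unit sales needed to earn the same contribution margin.

    Both arguments are fractions of the full price, e.g. 0.50 and 0.20.
    """
    if discount_pct >= margin_pct:
        # Each discounted unit earns zero or negative margin: no volume breaks even.
        raise ValueError("discount must be smaller than the contribution margin")
    return discount_pct / (margin_pct - discount_pct)


for discount in (0.10, 0.20, 0.30, 0.40):
    needed = break_even_volume_increase(0.50, discount)
    print(f"{discount:.0%} off needs {needed:.0%} more units")
```

The required volume climbs much faster than the discount itself, which is the practical argument for keeping deep discounts rare.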

Segment discount performance by customer type. Using your RFM segments, measure how much discount revenue comes from each group. If most of your promotional revenue comes from Champions and Loyal Customers who would have purchased anyway, the discount is destroying margin. If it's driving New Customer acquisition, it might be worth it.

Building a discount policy

Based on effectiveness data, set rules:

Cap discount frequency to avoid training customers to wait for sales
Reserve deep discounts (20%+) for clearance and dead stock only
Use targeted discounts (emailed to At-Risk segments) instead of site-wide promotions
Track every promotion's incremental revenue contribution, not just total revenue during the promotional period

Putting it into practice

These five techniques — cohort analysis, RFM segmentation, contribution margin, basket analysis, and discount effectiveness — move you from knowing your headline numbers to understanding the mechanics behind them. You don't need to implement all five at once. Pick the one that addresses your most pressing question and start there.

If you're losing customers after the first purchase, start with cohort analysis. If you're running frequent promotions but margins are thinning, start with discount effectiveness. If you have no idea which products actually make money after all costs, contribution margin analysis will be eye-opening.

Spark can run all five of these analyses on your store data automatically — cohort retention tables, RFM scoring, product-level contribution margins, co-purchase patterns, and discount impact measurement. Connect your store and ask the questions you've been guessing at.

In Part 3 of this series, we'll go further into advanced territory: predictive LTV modeling, price elasticity testing, vendor concentration risk, channel attribution, and seasonal decomposition.

Ecommerce Analytics: The Intermediate Guide