Is open rate still a useful metric?

Open rate is no longer reliable as an absolute number because Apple Mail Privacy Protection pre-loads tracking pixels for Apple Mail users who have it enabled, which records opens that never happened. Use it only for relative A/B comparisons within the same campaign, never as your primary success metric for cold outbound.

How many contacts do I need for an A/B test?

For reply-rate tests on cold outbound, plan for at least 200 contacts per variant. Open-rate tests can work with 150 per variant because open rates are higher, so real differences surface at smaller samples. Bulk click-rate tests need around 500. Smaller samples produce results that frequently reverse at higher volume.

What is a good spam complaint rate?

Stay below 0.10%. Google's published sender guidelines treat 0.10% as the threshold where filtering begins and 0.30% as the level that triggers bulk filtering. Above 0.30%, deliverability drops sharply across Gmail and any provider following Google's lead.

Email Analytics & Campaign Optimization Guide

Checking your open rate after a campaign is not analytics. Analytics is knowing which metric to focus on, when a difference is meaningful versus noise, and what to change based on what you find. This guide covers the metrics that actually predict performance, how A/B testing works in practice, and how to diagnose the most common campaign problems.

Metrics that actually matter

Not all metrics carry equal weight. Some predict outcomes. Some look interesting but are misleading. The distinction matters because optimizing the wrong metric can lead you in the wrong direction.

For cold outbound

Metric	Priority	Why it matters
Reply rate	Primary	The most direct measure of whether your message connected. A reply requires a human to make a decision. It cannot be faked by a pixel pre-load or a bot.
Positive reply rate	Primary	Interested replies as a share of total sends. A high overall reply rate with a low positive rate often signals targeting problems: the wrong people are responding.
Open rate	Use carefully	Directionally useful for comparing A vs. B within the same campaign. Unreliable as an absolute benchmark due to Apple Mail Privacy Protection.
Spam complaint rate	Critical	Google's sender guidelines set 0.10% as the threshold where filtering begins. This is the single most dangerous metric to ignore. Track it in Postmaster Tools.

Bounce rate is a health indicator rather than a performance metric. It should stay under 2%. Above that, pause the campaign and clean the list. See the List Formatting guide for verification steps.
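
As a rough illustration of how these cold outbound numbers fit together, the sketch below computes them from raw counts and flags the two thresholds mentioned above (2% bounces, 0.10% spam complaints). The `CampaignCounts` fields and the `summarize` helper are hypothetical names, not part of any particular tool's export. The spam figure here is computed from your own counts, so treat it as an approximation of what Postmaster Tools reports.

```python
from dataclasses import dataclass


@dataclass
class CampaignCounts:
    """Raw counts for one campaign (field names are illustrative)."""
    sent: int
    bounced: int
    replies: int
    positive_replies: int
    spam_complaints: int


def rate(numerator: int, denominator: int) -> float:
    """Return numerator / denominator as a percentage, or 0.0 if nothing was sent."""
    return 100.0 * numerator / denominator if denominator else 0.0


def summarize(c: CampaignCounts) -> dict:
    """Compute the cold outbound metrics and flag the two health thresholds."""
    summary = {
        "reply_rate": rate(c.replies, c.sent),
        "positive_reply_rate": rate(c.positive_replies, c.sent),
        "bounce_rate": rate(c.bounced, c.sent),
        "spam_complaint_rate": rate(c.spam_complaints, c.sent),
    }
    # Thresholds taken from the guide: pause above 2% bounces,
    # treat 0.10% spam complaints as the point where filtering begins.
    summary["pause_for_bounces"] = summary["bounce_rate"] > 2.0
    summary["spam_risk"] = summary["spam_complaint_rate"] >= 0.10
    return summary


example = summarize(CampaignCounts(sent=1000, bounced=25, replies=60,
                                   positive_replies=18, spam_complaints=2))
for name, value in example.items():
    print(name, value)
```

All rates use emails sent as the denominator, which matches the definition of positive reply rate given above. Some tools divide by delivered emails instead, so check which one your dashboard uses before comparing numbers.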

For bulk and marketing email

Metric	What it measures	Why it matters
Click-to-Open Rate (CTOR)	Clicks / Opens	Shows whether the body of your email delivered on the subject line's promise.
Conversion rate	Goal completions / Emails sent	The only metric that directly connects email to business outcomes.
Unsubscribe rate	Unsubscribes / Emails sent	Above 0.5% per send signals content irrelevance or sending too frequently.
Revenue per email (RPE)	Revenue attributed / Emails sent	The clearest measure of list value for e-commerce or transactional programs.
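
The formulas in the table can be written out as a short sketch. The function name, argument names, and the sample numbers are illustrative assumptions, not figures from a real campaign.

```python
def safe_div(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def bulk_metrics(sent: int, opens: int, clicks: int, conversions: int,
                 unsubscribes: int, revenue: float) -> dict:
    """Compute the four bulk email metrics from the table above."""
    return {
        "ctor_pct": 100 * safe_div(clicks, opens),               # Clicks / Opens
        "conversion_rate_pct": 100 * safe_div(conversions, sent),
        "unsubscribe_rate_pct": 100 * safe_div(unsubscribes, sent),
        "revenue_per_email": safe_div(revenue, sent),
    }


metrics = bulk_metrics(sent=10_000, opens=3_000, clicks=450,
                       conversions=90, unsubscribes=30, revenue=4_500.0)
print(metrics)

# Flag the unsubscribe threshold named in the table (0.5% per send).
if metrics["unsubscribe_rate_pct"] > 0.5:
    print("Unsubscribe rate is above 0.5%: review relevance and frequency.")
```

Note that CTOR divides by opens, so it inherits the open-inflation problem described for Apple Mail users. It is most useful for comparing sends to the same list.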

Reading your dashboard correctly

Apple Mail Privacy Protection and open rate

Apple launched Mail Privacy Protection with iOS 15 in September 2021. When a user with the feature enabled receives an email in the Apple Mail app, Apple pre-downloads the email content (including tracking pixels) through its proxy servers, even if the user never actually opens the email.

The practical effect: a contact using Apple Mail with the feature enabled registers as "opened" whether or not they read the message. Since Apple Mail is one of the most widely used email clients globally, this affects a meaningful share of most B2B lists.

Do not use open rate as your primary success metric for cold outbound. Use reply rate instead. Open rate is still useful for relative comparisons: if email A is getting 60% opens and email B is getting 30% opens when sent to the same audience, subject line A is working better. But open rate as an absolute number tells you less than it did before 2021.

Aggregate vs. segment-level data

A campaign average can hide significant variation. A 10% overall reply rate might mean one vertical is responding at 20% while another is at 2%. Most outbound tools let you filter analytics by campaign tag, list segment, or sequence step.
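
A small sketch of the effect described above, using made-up segment names and counts chosen so that a 10% campaign average hides a 20% segment and a 2% segment. The row format is an assumption; a real export will have different column names.

```python
from collections import defaultdict

# Hypothetical per-segment export rows.
rows = [
    {"segment": "saas", "sent": 400, "replies": 80},
    {"segment": "logistics", "sent": 500, "replies": 10},
]


def reply_rate_by_segment(rows: list) -> dict:
    """Sum sends and replies per segment, then return reply rate in percent."""
    totals = defaultdict(lambda: {"sent": 0, "replies": 0})
    for row in rows:
        totals[row["segment"]]["sent"] += row["sent"]
        totals[row["segment"]]["replies"] += row["replies"]
    return {
        segment: 100 * t["replies"] / t["sent"]
        for segment, t in totals.items()
        if t["sent"]
    }


overall = 100 * sum(r["replies"] for r in rows) / sum(r["sent"] for r in rows)
by_segment = reply_rate_by_segment(rows)

print(f"overall: {overall:.1f}%")
for segment, pct in by_segment.items():
    print(f"{segment}: {pct:.1f}%")
```

The same grouping works for sequence step or campaign tag: swap the key used to bucket the rows.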

Attribution windows

For bulk email platforms, the attribution window determines which conversions get credited to an email. Most platforms default to a 5-day click attribution and a 1-day open attribution. Know what window your platform uses before drawing conclusions about revenue impact.
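
To make the idea of a window concrete, here is a minimal sketch of the check a platform might apply, using the 5-day click and 1-day open defaults mentioned above. The function and its parameters are hypothetical; real platforms differ in details such as which touch wins when several emails qualify.

```python
from datetime import datetime, timedelta
from typing import Optional

CLICK_WINDOW = timedelta(days=5)  # default click window cited above
OPEN_WINDOW = timedelta(days=1)   # default open window cited above


def is_attributed(conversion_at: datetime,
                  last_click_at: Optional[datetime] = None,
                  last_open_at: Optional[datetime] = None,
                  click_window: timedelta = CLICK_WINDOW,
                  open_window: timedelta = OPEN_WINDOW) -> bool:
    """Return True if the conversion falls inside either attribution window."""
    if last_click_at is not None:
        if timedelta(0) <= conversion_at - last_click_at <= click_window:
            return True
    if last_open_at is not None:
        if timedelta(0) <= conversion_at - last_open_at <= open_window:
            return True
    return False


clicked = datetime(2024, 1, 1, 10, 0)
print(is_attributed(datetime(2024, 1, 4, 9, 0), last_click_at=clicked))   # inside 5 days
print(is_attributed(datetime(2024, 1, 8, 10, 0), last_click_at=clicked))  # outside 5 days
```

Widening either window credits more conversions to email, so revenue figures from two platforms are only comparable when their windows match.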

A/B testing

Most A/B tests in email produce noise that gets called a result. The two most common mistakes are testing too many variables at once and drawing conclusions from too small a sample.

What to test and in what order

Test one variable at a time. If you change the subject line and the opening line simultaneously and see a difference, you do not know which change caused it.

The variables that typically have the most impact, roughly in order of effect size:

Subject line: The highest-impact lever on open rate. Test specificity vs. curiosity vs. name-drop approaches.
Opening line: The highest-impact lever on reply rate. Test generic segment-level vs. personalized openers.
Call to action: Question-based vs. meeting request vs. direct ask. Small changes here can move reply rate meaningfully.
Email length: Very short (50 to 75 words) vs. standard (100 to 125 words). Worth testing when fundamentals are already solid.
Sending day and time: Generally lower impact than the above. Tuesday through Thursday mornings are conventional wisdom, but this varies by audience. Test last.

Sample size and interpreting results

Small samples produce unreliable results. A test run on 50 contacts per variant might show a 4% vs. 8% reply rate (2 replies vs. 4) that completely reverses at 200 contacts. The general minimums per variant:

What you're testing	Minimum contacts per variant
Reply rate	200 contacts
Open rate	150 contacts (higher base rate, so differences surface sooner)
Click rate (bulk email)	500 contacts
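
One way to see why small samples mislead is to look at the margin of error around an observed rate. The sketch below uses the standard normal approximation for a proportion, which is rough at very small samples, and an assumed 5% reply rate chosen only for illustration.

```python
import math


def margin_of_error(rate: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for an observed rate (normal approximation)."""
    return z * math.sqrt(rate * (1 - rate) / n)


# Assumed 5% observed reply rate at three sample sizes per variant.
for n in (50, 200, 500):
    moe = margin_of_error(0.05, n)
    print(f"n={n}: 5% reply rate, give or take about {moe * 100:.1f} points")
```

At 50 contacts the uncertainty is larger than the rate itself, at about 6 points. At 200 it narrows to about 3 points, which is still wide, so the minimums above are a floor rather than a guarantee.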

Statistical significance matters, but so does practical significance. A reply rate difference of 5.8% vs. 6.1% might be statistically significant at a large enough sample size, but it is not a meaningful difference to act on. Look for a 20% or greater relative difference between variants before calling a winner. For example, 8% vs. 10% is a 25% relative difference — that is meaningful.
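
The two checks can be run side by side. The sketch below pairs the relative-difference rule with a standard two-proportion z-test. The function names and the 0.05 significance level are assumptions for illustration, not something the guide prescribes.

```python
import math


def relative_lift(control_rate: float, variant_rate: float) -> float:
    """Relative difference of the variant over the control (0.25 means 25%)."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (variant_rate - control_rate) / control_rate


def two_proportion_p_value(success_a: int, n_a: int,
                           success_b: int, n_b: int) -> float:
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (success_b / n_b - success_a / n_a) / se
    return math.erfc(abs(z) / math.sqrt(2))


def evaluate(success_a: int, n_a: int, success_b: int, n_b: int,
             min_lift: float = 0.20, alpha: float = 0.05) -> dict:
    """Report practical and statistical significance separately."""
    lift = relative_lift(success_a / n_a, success_b / n_b)
    p_value = two_proportion_p_value(success_a, n_a, success_b, n_b)
    return {
        "lift": lift,
        "p_value": p_value,
        "practical": abs(lift) >= min_lift,
        "statistical": p_value < alpha,
    }


# 8% vs. 10% at 200 contacts each: a 25% relative difference.
print(evaluate(16, 200, 20, 200))
# 5.8% vs. 6.1% at 100,000 contacts each: small lift, large sample.
print(evaluate(5_800, 100_000, 6_100, 100_000))
```

The first case passes the practical check but not the statistical one, and the second does the reverse. A result that clears both is the safest one to roll out. One that clears only the practical check is worth re-running at higher volume.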

Wait for the sequence to finish before evaluating. Do not judge a 4-step sequence after the first email sends. Some variants perform better in later steps. Let the full sequence run before comparing results.

Diagnosing common problems

Symptom	Common causes and what to check first
Low open rate	Run an inbox placement test before assuming it is a subject line problem. If emails are going to spam, a better subject line will not help. If placement is clean, test subject lines and check the sender name looks human ("Ethan from SimpleSend" rather than "SimpleSend Support").
Good opens, low replies	The subject line is working but the body is not. Revisit the opening line (too generic?), the pitch (leading with features instead of problems?), and the CTA (clear, singular ask?). The most common cause is an opener that could have been sent to anyone.
High bounce rate	List was not verified before sending. Pause the campaign, run remaining contacts through NeverBounce or ZeroBounce, and remove invalids before resuming. If the list is old, re-verify entirely.
High unsubscribe rate	Content is not matching what the audience expects. Check list targeting first, then send frequency. Also check whether contacts opted in expecting something different.
Rising spam complaints	The most urgent problem. Pause the campaign. Check whether recent sends were going to contacts who are not a good fit. Make sure opt-outs are being processed immediately. A sudden spike can cause immediate filtering even from a previously healthy domain.
Replies but few positives	Usually a targeting problem. You are reaching people who understand the email but it does not apply to them. Narrow the segment and revisit whether the pain point you are describing matches this audience.
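
The table above can be turned into a simple rule-based check. In this sketch the bounce, unsubscribe, and spam thresholds come from this guide, while `low_open_pct`, `low_reply_pct`, and `low_positive_share` are hypothetical cut-offs you would replace with your own baselines.

```python
def diagnose(metrics: dict,
             low_open_pct: float = 30.0,        # assumed cut-off, not from the guide
             low_reply_pct: float = 2.0,        # assumed cut-off, not from the guide
             low_positive_share: float = 0.25   # assumed share of replies that are positive
             ) -> list:
    """Map campaign metrics (all in percent) to the symptoms in the table."""
    findings = []
    if metrics["spam_complaint_pct"] >= 0.10:
        findings.append("Rising spam complaints: pause and check fit and opt-out handling.")
    if metrics["bounce_pct"] > 2.0:
        findings.append("High bounce rate: pause and verify the remaining list.")
    if metrics["unsubscribe_pct"] > 0.5:
        findings.append("High unsubscribe rate: check targeting, then frequency.")
    if metrics["open_pct"] < low_open_pct:
        findings.append("Low open rate: run an inbox placement test first.")
    elif metrics["reply_pct"] < low_reply_pct:
        findings.append("Good opens, low replies: revisit opener, pitch, and CTA.")
    if (metrics["reply_pct"] >= low_reply_pct
            and metrics["positive_reply_pct"] < low_positive_share * metrics["reply_pct"]):
        findings.append("Replies but few positives: narrow the segment.")
    return findings


sample = {
    "spam_complaint_pct": 0.02, "bounce_pct": 3.1, "unsubscribe_pct": 0.1,
    "open_pct": 55.0, "reply_pct": 1.0, "positive_reply_pct": 0.3,
}
for finding in diagnose(sample):
    print(finding)
```

Spam complaints are checked first because the table marks them as the most urgent problem. The order of the remaining checks is a judgment call.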

Building a continuous improvement loop

Individual optimizations compound over time. A campaign that has gone through ten test cycles, each producing a meaningful gain, will dramatically outperform one that has never been tested. The process needs to be systematic rather than ad hoc.

Set a baseline: Run a full campaign and record reply rate, positive reply rate, open rate, and bounce rate. Do not optimize anything yet.
Find the constraint: Which single metric, if improved, would have the biggest impact? Start there.
Run one test: One variable. Document the hypothesis, the control, and the variant before you start. Minimum 200 contacts per variant.
Record and implement: Log the result in a shared doc. Roll out the winner. Update your baseline. Find the next constraint.

A simple log format — date, variable tested, control result, variant result, winner, notes — is enough. Over time it becomes institutional knowledge about what works for your specific audience, which is worth more than any generic benchmark.
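
If the shared doc is a spreadsheet or CSV file, the log format above could look something like this sketch. The `ExperimentLogEntry` class, its field names, and the sample row are illustrative assumptions. The example writes to an in-memory buffer so it runs anywhere; in practice you would open a file in append mode.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class ExperimentLogEntry:
    """One row of the test log: the six fields named above."""
    date: str
    variable_tested: str
    control_result: str
    variant_result: str
    winner: str
    notes: str = ""


def append_entry(stream, entry: ExperimentLogEntry, write_header: bool = False) -> None:
    """Write one log row as CSV, optionally preceded by the header row."""
    writer = csv.DictWriter(stream, fieldnames=[f.name for f in fields(ExperimentLogEntry)])
    if write_header:
        writer.writeheader()
    writer.writerow(asdict(entry))


buffer = io.StringIO()  # stand-in for open("test_log.csv", "a", newline="")
append_entry(buffer, ExperimentLogEntry(
    date="2024-03-01",
    variable_tested="opening line",
    control_result="4.0% reply",
    variant_result="5.5% reply",
    winner="variant",
    notes="200 contacts per variant",
), write_header=True)
print(buffer.getvalue())
```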

Tools for analytics and monitoring

Tool	Use	Link
SmartLead Analytics	Per-campaign and per-inbox reply/open tracking	smartlead.ai
Google Postmaster Tools	Domain reputation and spam rate monitoring	postmaster.google.com
GlockApps	Inbox placement testing by provider	glockapps.com
Mail-Tester	Quick free spam score check	mail-tester.com
Mailchimp / SendGrid Reports	Bulk campaign analytics (CTOR, bounces, conversions)	mailchimp.com
Looker Studio	Custom dashboards from CSV exports or API data	lookerstudio.google.com
