OpenMarketing
  • testing
  • April21st

    Best practices

    Posted in: testing

    Baristas are experts at using standard equipment and premium beans to make a truly outstanding cup of Joe.  So it is with experts in customer marketing and analytics.  True aficionados know how to structure your test to maximize learning without compromising on ROI or revenue delivery. 

    Don’t test what you already know

    • Leverage industry, segment, & audience knowledge
    • Specificity of offer & targeting always improves performance
    • Time-driven offers perform better than open-ended ones

    Start with the fundamentals; strategy comes after the fundamentals are understood

    • Keep it simple
    • Agree upon process
    • Quality control

    Design campaigns so learning can be generalized

    • Track beyond response, all the way to behavior you ultimately desire (e.g. seminar registration)
    • Use controls, e.g. Do Not Mail Control Groups
    • Use statistically valid cell sizes

     

  • April21st

    Package

    Posted in: testing

    The goal of this kind of testing is to determine whether using a different form factor will lift response.  The most common form factors used in direct marketing include: 

    • #10 letter mailing
    • self mailer
    • over-sized postcard mailing
    • dimensional mailing

    Best Practice
    A best practice here is to know that any test you are fielding isn’t testing form factor alone but also the creative execution devised to make the most of that form factor.  For this reason, we refer to this type of testing as a “package test”:

    DM Package = Form Factor + Creative Execution

    Planning
    The minimum sample size for each test cell is determined by the response rate, confidence interval, and allowable percentage error. Click here for a tool that calculates minimum required sample size.
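
    As a rough illustration of how those three inputs combine, the sketch below uses the standard normal-approximation formula for a proportion. It is not the linked tool. The function name `min_sample_size` is hypothetical, and it assumes that "allowable percentage error" means an error relative to the expected response rate (for example, plus or minus 10% of a 1% rate).

```python
import math
from statistics import NormalDist


def min_sample_size(expected_rate, relative_error, confidence=0.95):
    """Minimum cell size for estimating a response rate (normal approximation).

    expected_rate  -- anticipated response rate as a fraction, e.g. 0.01 for 1%
    relative_error -- allowable error as a fraction of that rate (an assumption
                      about what "allowable percentage error" means)
    confidence     -- two-sided confidence level, e.g. 0.95
    """
    # z-score for the two-sided confidence level (about 1.96 at 95%)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Convert the relative error into an absolute margin on the rate
    margin = expected_rate * relative_error
    n = (z ** 2) * expected_rate * (1 - expected_rate) / margin ** 2
    return math.ceil(n)


# Hypothetical planning inputs: 1% expected response, +/-10% relative error
cell_size = min_sample_size(0.01, 0.10)
```

    Note how quickly the required cell size grows as the allowable error shrinks: halving the error roughly quadruples the number of pieces you need to mail per cell.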

    Backend Analysis
    By properly designing the test cells in the planning stages, the back end analysis becomes simpler and allows us to learn with a higher degree of confidence.

    Using a confidence interval worksheet, we can determine whether observed response rates are affected by package. The confidence interval worksheet can be found by clicking here. For this example, at the 95% confidence level, the letter mailer had an expected response rate between 1.11% and 1.29%. Similarly, the expected response rate for the self mailer is between 0.82% and 0.98%.
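
    For readers without the worksheet, a minimal sketch of the usual normal-approximation interval for a response rate follows. This is an assumption about what such a worksheet computes, not a copy of it. The function name `response_rate_interval` and the mailing counts are hypothetical, since the original quantities mailed are not given here.

```python
import math
from statistics import NormalDist


def response_rate_interval(responders, mailed, confidence=0.95):
    """Return (low, high) bounds on the response rate as fractions."""
    rate = responders / mailed
    # z-score for the two-sided confidence level (about 1.96 at 95%)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * math.sqrt(rate * (1 - rate) / mailed)
    # A rate cannot fall below zero, so clamp the lower bound
    return max(0.0, rate - half_width), rate + half_width


# Hypothetical cell: 600 responders out of 50,000 pieces mailed
low, high = response_rate_interval(600, 50_000)
print(f"{low:.2%} to {high:.2%}")
```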

     

    Business Impact
    Here, the letter mailer outperformed the self mailer.  The difference in response rate seen between the two mailings was statistically significant at the 95% level. 

  • April21st

    Catalogers tend to think about their house file in terms of the amount of time that has elapsed since the last purchase. In the example below, the quantity mailed represents all the customers on the file, while buyers are only those people who purchased following the most recent catalog drop. Instead of response rates, catalogers focus on the percentage of customers mailed who went on to purchase. Most commonly, this percentage is called the conversion rate.1

    The purpose of the test is to understand how soon to send another catalog to a given customer based upon their last purchase. In the example below, the relationship is curvilinear: the optimal time to re-contact a recent buyer is some 4-12 months after their last purchase.

    Planning
    The example below is a test to determine the relationship between recency and purchase behavior.

    The minimum sample size for each test cell is determined by the expected conversion rate, confidence interval, and allowable percentage error. A handy sample size tool to calculate the minimum sample size can be found by clicking here.

    Backend Analysis
    By properly designing the test cells in the planning stages, the back end analysis becomes simpler and allows us to learn with a higher degree of confidence.

    Using a confidence interval worksheet, one can determine if conversion varies by recency of last purchase. The confidence interval worksheet can be found by clicking here.

    For this example, at the 95% confidence level, the 0-3 month buyers had an expected conversion rate of between 1.04% and 1.16%. The 4-6 month buyers had an expected conversion rate of between 1.79% and 1.97% at the 95% confidence level. At the 95% confidence level, the 7-12 month buyers had an expected conversion rate of between 1.79% and 2.01%. Finally, the 12 month+ buyers had an expected conversion rate between .93% and .95%.
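
    The comparison across recency cells can be sketched as a simple overlap check on the intervals quoted above. The names `intervals`, `overlaps`, and `overlap_table` are illustrative. Treating non-overlapping intervals as a significant difference is a conservative rule of thumb, so two cells whose intervals overlap slightly may still differ under a formal test.

```python
# 95% confidence intervals (in percent) as quoted in the text
intervals = {
    "0-3 months": (1.04, 1.16),
    "4-6 months": (1.79, 1.97),
    "7-12 months": (1.79, 2.01),
    "12+ months": (0.93, 0.95),
}


def overlaps(a, b):
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def overlap_table(cells):
    """Map every pair of cell names to whether their intervals overlap."""
    names = list(cells)
    table = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            table[(first, second)] = overlaps(cells[first], cells[second])
    return table


for pair, shared in overlap_table(intervals).items():
    verdict = "no clear difference" if shared else "different"
    print(pair, verdict)
```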

    Business Implications

    Here, the best time to mail is 4-12 months after the last purchase. Statistically speaking, there is no difference in conversion rate between customers whose last purchase was 4-6 months ago and those whose last purchase was 7-12 months ago. Mailing to customers who purchased most recently (0-3 months ago) will result in a lower conversion rate.

    Note

    1. Alternatively, this percentage can be referred to as the purchase incidence.
  • April21st

    Frequency

    Posted in: testing

    Frequency is your friend. Research on advertising effectiveness suggests that your message must be repeated 3-4 times before it gets heard, never mind acted upon.

    Planning
    The example below shows how customers respond to frequency of touch based upon a campaign that mails customers 1x, 2x, or 3x. This test is designed to see how frequency impacts response. 

    The minimum sample size for each test cell is determined by the expected response rate, confidence interval, and allowable percentage error. Click here for a tool to calculate minimum required sample size.

    Backend Analysis
    By properly designing test cells in the planning stage, backend analysis becomes simpler and allows us to learn with a high degree of confidence. Using a confidence interval worksheet, one can determine whether the differences in observed response rates among the three cells are statistically significant. The confidence interval worksheet can be found by clicking here.

    For this example, at the 95% confidence level, the customers contacted 1x had an expected response between 0.71% and 0.81%. Similarly, the expected response for customers contacted 2x was between 0.85% and 1.05%, and the expected response for customers touched 3x was between 0.87% and 1.15%.

    Business Impact
    Here, customers touched 2x responded better than customers touched 1x. At the 95% confidence level, there was no statistical difference between customers who were touched 2x or 3x. Therefore the third mailing is not worth doing: it adds to your costs without producing a statistically detectable lift in response.

    Best Practices
    A best practice is to build frequency into your campaigns upfront, when planning your campaign. Make sure you are mailing to the same type of customers or prospects and that the only variable you are changing is the frequency delivered by test cell. Targets should be assigned to test cells at random. Sometimes this is not practical, say when you decide that certain types of customers merit being mailed 3x based on their lifetime value. If this is your situation, you can look at frequency on the backend, when you analyze program results. This isn’t a controlled test but more of a natural experiment. Natural experiments like this are very useful in generating hypotheses that you’ll want to go on and test in a more rigorous and controlled way.
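
    Random assignment to test cells can be sketched in a few lines. This is one simple way to do it, not a prescribed process. The function name `assign_to_cells`, the cell labels, and the customer IDs are all hypothetical.

```python
import random


def assign_to_cells(customer_ids, cell_names, seed=None):
    """Shuffle customers and deal them into cells of (nearly) equal size."""
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(customer_ids)
    rng.shuffle(shuffled)
    cells = {name: [] for name in cell_names}
    # Deal the shuffled customers round-robin so cell sizes differ by at most 1
    for position, customer_id in enumerate(shuffled):
        cells[cell_names[position % len(cell_names)]].append(customer_id)
    return cells


# Hypothetical file of 9,000 customer IDs split across three frequency cells
cells = assign_to_cells(range(9000), ["1x", "2x", "3x"], seed=42)
print({name: len(members) for name, members in cells.items()})
```

    Recording the seed alongside the test plan lets you reproduce the assignment later when you match responders back to their cells.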

  • April21st

    Message

    Posted in: testing

    Many marketing managers want to see how different segments respond to different messaging. Psychological studies have shown that the human brain processes pictures first and then words1. For this reason, it is a best practice to test the message and the creative execution as if they are one and the same thing2.

    Message testing is usually your last priority, not your first. Within a particular target segment, differences in messaging/creative execution ARE NOT typically associated with huge changes in response. For this reason, we typically recommend that marketing managers test list sources and offers first, before testing the message/creative execution.

    Planning
    Below is an example that tests how small and medium business customers (SMBs) respond when presented with different messaging/creative executions. The test is designed to see whether SMBs respond better to vertical or generalized messages.

    The minimum sample size for each test cell is determined by the expected response rate, confidence interval, and allowable percentage error. Click here for a tool to calculate minimum required sample size.

    For each campaign, a diagram needs to be constructed that visually depicts the testing we plan within each segment. Click here for an example of what we mean.

    Backend Analysis
    By properly designing test cells in the planning stage, backend analysis becomes simpler and allows us to learn with a high degree of confidence.

    Using a confidence interval worksheet, one can determine whether we can lift response by varying the message/creative execution. Click here for an example.

    In this example, at the 95% confidence level, the vertical messaging had an expected response between 0.72% and 0.88%. Similarly, the expected response for a generalized message is between 0.62% and 0.78%. As a result, we can conclude at the 95% confidence level that there is no statistical difference in response for SMB customers when presented with vertical or general messages.

    Business impact
    Vertical messaging appears to lift response slightly when compared with general messages. However, this result is not statistically significant. This suggests that the added expense and time involved in fielding vertical messages may not be worth it. Further testing is warranted to explore this finding in more detail. For example, it may be that vertical messages work to lift response in a more dramatic way for some segments and not others.

    Notes
    1. Source: How Customers Think: Essential Insights into the Mind of the Market, by Gerald Zaltman, HBS Press, January 2003.
    2. For a valid test, you should vary only the message/creative execution and not the type of DM package used.

  • April21st

    Media Mix

    Posted in: testing

    The media mix refers to the way we are using different marketing media to drive response in an integrated fashion. In this example, we look at how traditional direct mail (DM, sometimes called “snail mail”) works alone vs. email (EM) alone versus DM + EM used in combination.

    Planning
    A test cell must be large enough to be statistically valid and to enable comparison across cells. If the test cell is too small, the results are not statistically significant and are therefore meaningless. The minimum sample size for each test cell is determined by the expected response rate, confidence interval, and allowable percentage error. Click here for a tool that calculates the minimum sample size.

    Backend Analysis

    Using a confidence interval worksheet, we can determine whether the observed response rates between the three programs are statistically significant. The confidence interval worksheet can be found by clicking here.

    For this example, at the 95% confidence level, the expected response rate for the DM-only cell ranged from a low of 1.36% to a high of 1.63%. The expected response rate for the EM-only cell ranged from a low of 1.28% to a high of 1.52%. The expected response rate for the DM + EM cell where direct mail and electronic mail were used in combination ranged from a low of 2.03% to a high of 2.37%.

    Business Impact
    Here we can conclude at the 95% confidence level that the targeted customers respond best to a combination of email and direct mail. Additionally, there is no statistical difference in response for direct mail (alone) versus email (alone).
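
    These conclusions can be cross-checked with an approximate two-sample z-test reconstructed from the quoted intervals alone. The sketch assumes each interval is a symmetric normal-approximation interval, so the midpoint is the observed rate and the half-width divided by about 1.96 is its standard error. The names `Z95`, `z_from_intervals`, and `is_significant` are illustrative, and this is not the worksheet's own method.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96 for a two-sided 95% level

# 95% confidence intervals (in percent) as quoted in the text
cells = {
    "DM": (1.36, 1.63),
    "EM": (1.28, 1.52),
    "DM + EM": (2.03, 2.37),
}


def z_from_intervals(a, b):
    """Approximate z-score for the difference between two cells."""
    rate_a, rate_b = (a[0] + a[1]) / 2, (b[0] + b[1]) / 2
    # Recover each standard error from the interval half-width
    se_a = (a[1] - a[0]) / 2 / Z95
    se_b = (b[1] - b[0]) / 2 / Z95
    return (rate_a - rate_b) / math.sqrt(se_a ** 2 + se_b ** 2)


def is_significant(a, b):
    """True when the difference is significant at the two-sided 95% level."""
    return abs(z_from_intervals(a, b)) > Z95


print("DM vs EM:", is_significant(cells["DM"], cells["EM"]))
print("DM + EM vs DM:", is_significant(cells["DM + EM"], cells["DM"]))
print("DM + EM vs EM:", is_significant(cells["DM + EM"], cells["EM"]))
```

    Under these assumptions the check agrees with the interval comparison: the combined cell differs from each single-medium cell, while DM alone and EM alone do not differ from each other.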

    Since EM is much less expensive than DM, the fact that the two media work equally well is an important finding. Assuming this result continues to hold up with additional testing, it means that we should target customers based on their addressability. Customers who can be reached through DM + EM should come first in the order of priority, customers who can be reached through EM should come second, and customers who can be reached through DM only should come third in priority.

  • April21st

    Offer

    Posted in: testing

    It’s been said many times but needs to be said again. The offer is one of the most important ingredients of any customer marketing program. The purpose of offer testing is to determine which offer drives the most response.

    Best Practice
    A best practice is to test various offers against each other and against a “no offer” control. This is particularly important when you are selling something that the target values. Through such testing you may find out that an offer of a free gift with purchase – for example – actually does little to drive incremental response.

    Planning
    The example below is a test to determine differences in response rates for two different offers (Offer #1 – Personal Offer; Offer #2 – Job Support Offer) versus a Control group (no offer). The minimum sample size for each test cell is determined by the expected response rate, confidence interval, and allowable percentage error. A handy sample size tool to calculate the minimum sample size can be found by clicking here.

    For each campaign, a diagram needs to be constructed that visually depicts the testing we plan within each segment. Click here for an example of what we mean.

    Backend Analysis
    By properly designing test cells in the planning stage, backend analysis becomes simpler and allows us to learn with a high degree of confidence.

    Using a confidence interval worksheet, we can determine whether observed response rates between the three are statistically significant. The confidence interval worksheet can be found by clicking here.

    For this example, at the 95% confidence level, consumers receiving the job support offer had an expected response rate between 1.13% and 1.27%. Consumers who received the personal offer had a response rate between 1.42% and 1.58%. The “No Offer” control group had an expected response between .98% and 1.22%.

    Business Impact
    Here, we can say at the 95% confidence level that the target responded best when given the personal offer. The personal offer (Offer #1) outperformed the job support offer (Offer #2) in a way that was statistically significant. However, the job support offer worked about as well as no offer at all. In other words, the difference in response rate seen between these two cells was not statistically significant at the 95% level.

    Keep in mind that a 95% confidence level doesn’t mean that you are 95% sure of any single result. It means that if you repeated this particular test 100 times and calculated a confidence interval each time, about 95 of those 100 intervals would contain the true response rate. Go to the glossary to learn more about the confidence interval and the margin of error.
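
    A small simulation can make the meaning of the confidence level concrete: repeat a mailing many times against a known "true" response rate and count how often the 95% interval captures that rate. Everything here is hypothetical, including the function name `coverage`, the 5% true rate (set higher than a typical mail response so the simulation runs quickly), and the cell size.

```python
import math
import random
from statistics import NormalDist


def coverage(true_rate, mailed, trials, confidence=0.95, seed=0):
    """Fraction of simulated tests whose interval contains the true rate."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        # Each mailed piece responds independently with probability true_rate
        responders = sum(rng.random() < true_rate for _ in range(mailed))
        rate = responders / mailed
        half_width = z * math.sqrt(rate * (1 - rate) / mailed)
        if rate - half_width <= true_rate <= rate + half_width:
            hits += 1
    return hits / trials


# 500 simulated mailings of 2,000 pieces each at a hypothetical 5% true rate
print(f"{coverage(0.05, 2000, 500):.1%} of intervals contained the true rate")
```

    The printed share should land close to 95%, which is what the confidence level describes: a property of the procedure over many repetitions, not a guarantee about any one test.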

  • April21st

    List Source

    Posted in: testing

    Typically this type of testing focuses on determining how external lists perform relative to internal lists and also which external lists can deliver results most cost-effectively. The example below is for a two-step mailing program, where step 1 asks prospects to attend a seminar and step 2 asks those who attend to go on and purchase the product.

    Planning
    As is usually the case, the test plan needs to be developed before the campaign; it will be extremely difficult to measure in the back-end of the campaign if the test plan was not designed in the planning stages.

    The minimum sample size for each test cell is determined by the expected response rate, confidence interval, and allowable percentage error. A handy sample size tool to calculate the minimum sample size can be found by clicking here.

    Backend Analysis
    By properly designing the test cells in the planning stages, the backend analysis becomes simpler and allows us to learn with a high degree of confidence.

    Using a confidence interval worksheet, one can determine if the differences in observed response rates among the four lists are statistically significant. The confidence interval worksheet can be found by clicking here.

    Business Impact
    Here, we see that the internal list performed better than all the external lists combined at the 95% confidence level. This will almost always be the case: your house file or internal list will outperform any external list.

    Looking at response rates of the three external lists, we can see that external list #2 performed significantly better than list #1 or list #3.