
The Ultimate Guide to Message Testing: Best-in-Class Practices on How to Test And Optimize Brand Messaging.

Everything you ever wanted to know about the art and science of message testing in pharma, all in one place.
Newristics, 01 Feb 2024

Messaging: The most important marketing lever for brands

Good messaging can get the attention of potential customers and inspire them to take action. Great messaging can even disrupt categories, create brand-defining opportunities for companies, and lead to significant growth in market share.


On the other hand, bad messaging can confuse or discourage customers and lead to missed revenue growth opportunities. Even worse, in today’s social media-driven world, bad messaging can go viral faster than good messaging, ruin a brand’s reputation, and create a never-ending stream of memes that continue to negatively impact brand equities over time.

The best way to eliminate guesswork and ensure the effectiveness of messaging efforts is through message testing. By testing messages, market researchers get a clear direction on how to talk about their products in a way that will get customers to buy.

In this article, we'll discuss the basics of message testing, methods to improve the effectiveness of messaging, their limitations, and how innovation in message testing can help to find winning messaging campaigns.

B2B and B2C Messaging

Message testing in B2B and B2C environments differs significantly due to the unique characteristics of these target audiences and their decision-making processes.


Messaging Objectives: In B2B, the focus is on building trust, establishing credibility, and providing detailed information to support rational decision-making. B2C messaging, on the other hand, emphasizes emotional appeal, creating a connection with the consumer, and highlighting the benefits and value of the product or service.


Extensive knowledge base: Only when one has significant knowledge about many diverse topics can one use that information (via the working memory) to pay attention to new changes and be able to access relevant experiences that enable creative thinking.


Interpersonal trust: There is a significant amount of faith shown in others’ abilities while improvising. The popular idea of “having chemistry with someone” is relevant to improvisation. When people trust each other spontaneously and do not rely on scripts or protocols, they can bounce ideas off each other and collectively “build” new solutions.


Self-monitoring: Improvisation requires a significant amount of self-reflection and awareness of one’s thought processes. It requires immediate feedback, noticing one’s own biases, and mental flexibility that is purposeful (as opposed to daydreaming). It is hard to improvise without self-monitoring because one cannot often keep track of new changes in the problem and solution space.



What is Message Testing?

The importance of message testing: Why it matters

In the last decade, marketing has changed a lot. Leading brands now use digital-first and omnichannel marketing in almost every industry, including the pharmaceutical industry.


It is important to ensure that marketing messages break through the digital noise and effectively connect with customer segments to drive growth. Even if the brand teams know their customers well, figuring out which messages will work best in the real world can be tricky. That's why message testing is important. It helps to fine-tune the messaging and create winning message bundles for different campaigns.

When teams skip message testing, they risk creating echo chambers where their own beliefs and preferences cloud their judgment. This often leads to messaging campaigns that leave us scratching our heads, wondering, "What were they thinking when they came up with this campaign?"

That's precisely why leading brands test messages multiple times before launching their campaigns. It boosts their chances of getting a message just right before the first time a customer sees it.

In a nutshell, message testing serves a dual purpose:

  1. It helps brands understand how potential customers will respond, taking into account their preferences, behavior, pain points, and goals.
  2. When teams use it to find the winning messaging, they significantly boost the chances of their campaign success.

How to structure a successful message testing plan?

Message testing is done using two methods: qualitative and quantitative. Qualitative testing involves non-numerical research based on observations, while quantitative testing gathers hard data through polls, surveys, and similar closed-ended questions.

Here’s the roadmap to get started:

  1. Establishing Objectives and Identifying the Audience: The initial phase involves outlining clear messaging objectives that are Specific, Measurable, Achievable, Relevant, and Time-bound (SMART). These objectives could range from boosting brand awareness and loyalty to improving purchase intent within a targeted customer segment. During this stage, teams should answer key questions such as:
    • What specific goals do they aim to achieve through their messages?
    • Who is the intended audience?
    • What emotions, actions, or behavior changes should be prompted in potential customers after they read or hear the messages?
  2. Finding Messaging Assumptions: This includes defining the beliefs about customer segments—their needs, preferences, and objections. These assumptions guide message strategy and help brand teams decide what to emphasize and how to present their product. These messaging assumptions should be easy to understand, short, and able to be tested. For example, an assumption could be that a certain group of people care more about speed and convenience than price and features when choosing a product. Messaging should be aligned with these assumptions.
  3. Choosing a Message Testing Method: The next step involves choosing the right methods of message testing, depending on the available resources, timeline, and objectives. There are different options to test messages, such as surveys, A/B testing, interviews, landing page testing, and many more.

    Multivariate testing is also an alternative approach, where teams experiment with different combinations of elements within their messages to find the most effective one.

    Pre-testing and post-testing are useful methods to test the message with a sample of customer segments before and after launching it to the market. These methods can provide feedback and insights into their reactions and perceptions, as well as measure the impact and effectiveness of the message.

  4. Analyze Results: After conducting the tests, teams should analyze the results to understand what worked and what didn't. They should conduct a comprehensive data analysis for informed decision-making.

    • Descriptive statistics, such as mean, median, mode, frequency, and percentage, provide concise overviews of key data attributes.
    • Inferential statistics, including hypothesis testing, correlation, and regression, enable broader insights.
    • Data visualization through charts, graphs, and tables enhances data communication.
    • Specialized software like Excel, SPSS, and R automates complex analyses, expediting workflows and ensuring accuracy.
  5. Apply Findings: This signifies applying the findings and recommendations to improve brand messaging. Brand teams should make adjustments based on feedback and insights and implement the best-performing versions of messages. These messages should be tailored to different audience segments, and tested for better optimization.
  6. Learnings: Lastly, the results and insights should be shared with the team, stakeholders, or partners, and these findings should be used to inform future brand communication strategies and campaigns.
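
As a rough illustration of the descriptive statistics mentioned in the Analyze Results step above, the sketch below summarizes survey ratings for two messages. The message names and ratings are made-up example data, and the "top-2 box" share (ratings of 4 or 5 on a 5-point scale) is an added convention that the roadmap itself does not prescribe.

```python
from statistics import mean, median, mode

# Hypothetical 5-point ratings (1 = strongly disagree ... 5 = strongly agree).
# These numbers are illustrative only, not real survey results.
ratings = {
    "Message A": [5, 4, 4, 3, 5, 4, 2, 5, 4, 4],
    "Message B": [3, 3, 4, 2, 3, 4, 3, 2, 3, 3],
}


def describe(scores):
    """Return basic descriptive statistics for one message."""
    # Share of respondents who picked one of the top two scale points
    top2_share = sum(1 for s in scores if s >= 4) / len(scores)
    return {
        "mean": mean(scores),
        "median": median(scores),
        "mode": mode(scores),
        "top2_box_pct": round(100 * top2_share, 1),
    }


summary = {name: describe(scores) for name, scores in ratings.items()}
for name, stats in summary.items():
    print(name, stats)
```

A summary like this only describes the sample. Deciding whether the gap between two messages is real still calls for an inferential test.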

Testing messages BEFORE launching a messaging campaign

Most brands in most industries test messages in primary market research with a representative sample of customers BEFORE launching a messaging campaign. This helps them optimize their messaging platform before launch so that there is less course correction needed later and the brand team has more confidence that they are maximizing ROI on their campaign spend.

In more digital or ecommerce-oriented industries, marketers don’t test messages in primary market research before launching a campaign. Instead, they approach campaigns as a way to test messages live in the market, identify the best messages through A/B testing, and spend the rest of the marketing budget against the winning message.

Message testing in primary market research before launching a campaign helps brands communicate effectively, minimize risks, and maximize their chances of success in an increasingly competitive marketplace.

Using qualitative market research to test messages before launch
Qualitative research involves non-numerical analysis, such as in-depth interviews, to gain a better understanding of customer motivations. It's great for gaining a well-rounded insight into your audience.


During qualitative message testing, you gather opinions and reactions from participants through individual or group interviews, focus groups, and video recordings. These conversations can occur face-to-face, over the phone, or in online chat rooms, and they are usually guided by an interviewer or moderator.

Market researchers can then analyze these responses to uncover common themes and patterns. This analysis can be done manually or using automated tools, depending on your resources.


In-depth Interviews: In-depth interviews involve a skilled researcher having a detailed conversation with a participant. This method is useful for thoroughly exploring individual messages and understanding the psychology behind them. It can provide valuable and in-depth data, and can even be conducted online. However, it can be time-consuming, expensive, and influenced by biases.


Focus Groups: Focus groups are another type of qualitative research where a small group of 6-10 people engage in a discussion, instead of a one-on-one interview. These discussions can take place in person at a designated facility or online through a chat room-like platform. A skilled moderator guides the conversation and gathers feedback from the group regarding specific messages or stimuli. However, focus groups can be costly, participants may not feel comfortable expressing their opinions openly, and there is a risk of groupthink.


Online Chat Groups: They're low-cost online qualitative testing methods where participants engage in real-time discussions, allowing brands to gather instant feedback on their messaging. An interactive session provides insights into audience reactions, preferences, and perceptions, that help teams to optimize their messaging for maximum impact.


Ethnographic research: This kind of research involves studying the social interaction of users in a natural environment or real-life setting rather than a lab. Such observations can be made anywhere from the user's home to the workplace to their outing with friends and family. They help to understand how they view, act, and respond to different situations in these environments.


Web micro surveys: Web micro surveys are popular nowadays because they enable brands to reach a large number of respondents with minimal investment. Creating a survey is also simple and can be done using user-friendly platforms like Qualtrics, Suzy, etc. Although results can be obtained quickly, micro surveys are not suitable for handling a high volume of messages and cannot differentiate between messages with subtle language differences. Learn about the limitations of surveys for all message testing methodologies in our latest blog.


UI/UX Heatmaps: Heatmaps are visual tools used in message testing and website usability analysis to illustrate how users interact with a webpage containing messaging. They are particularly useful for understanding where users focus their attention, which areas they click on, and how far they scroll down a page.


Neuroscience Testing: Neuroscience research uses techniques such as EEG or fMRI to capture brain activity in response to messages, ads, creative concepts, etc. In an EEG study, for example, respondents participate in a lab-like setting with sensors attached to various spots on their heads that capture neuronal activity in the brain. After a message is shown, if a region of the brain associated with fear-based decision-making becomes active, the message is considered effective in evoking fear.

Using quantitative market research to test messages before launch

Quantitative research, on the other hand, focuses on collecting hard data through techniques such as surveys, polls, and close-ended questions. It's a scalable way to identify trends and patterns.

Measurement techniques commonly used in quantitative messaging market research


1. Likert Scaling: The Likert Scale is a scaling method to measure how strongly consumers feel about something. Typically, a 5-point or 7-point Likert scale is used to collect feedback in surveys. For example, if a 5-point Likert scale was used to capture agreement with an attitudinal statement, the scaling options could be: 5 = Strongly agree, 4 = Somewhat agree, 3 = Neutral, 2 = Somewhat disagree, and 1 = Strongly disagree. It is simpler to use and understand than other methods, but it doesn't give a full explanation for why a customer might disagree. Additionally, the data is often not discriminating enough, as responses often regress toward the mean.

2. Maxdiff Scaling: Maxdiff, also known as Best-Worst analysis, is a straightforward survey technique where respondents choose the 'Best' and 'Worst' options from a provided set. When used in a well-designed experiment, it provides relative rankings for each option. Maxdiff helps teams get a direct insight into consumer preferences and rank their products accordingly. This method is also a form of Conjoint analysis.

3. Line Scaling: Line scaling is used to measure the intensity or degree of a respondent's feelings, attitudes, opinions, or perceptions about a specific topic or stimulus. It involves presenting respondents with a line or scale and asking them to mark a point along the line to indicate their level of agreement, satisfaction, preference, or other subjective responses. The line or scale has labeled points or anchors at each end, representing extreme positions or contrasting statements. Respondents then mark the line to express their position or degree of agreement.

4. Choice-Based Conjoint: Advanced quantitative choice-based message testing studies are a good option if the goal is to test multiple messages in a single study. However, they don't allow for in-depth exploration of each message and don't provide detailed drivers/barriers of appeal for each message.

5. Discrete Choice: Discrete choice models help us understand why people choose one option from a group of two or more options. They allow researchers to analyze and predict how people's choices are influenced by their characteristics and the available alternatives. This technique is used to guide not only brand messaging, but also product positioning, pricing, product concept testing, and other strategic and tactical areas of interest.

6. Tournament Style Choice: Tournament style choice methodologies use a very simple sports tournament-type approach to identify winners and losers in messaging. Pairs of messages (A vs. B, B vs. C, A vs. C, etc.) are tested with respondents, and messages with the highest wins and fewest losses/ties rise to the top.
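
To make the tournament-style tallying described above concrete, here is a minimal sketch that counts wins, losses, and ties from paired comparisons and ranks the messages. The matchup data is invented, and the tie-breaking order (most wins, then fewest losses, then fewest ties) is an assumption. Real studies may weight or model these results differently.

```python
from collections import defaultdict

# Hypothetical head-to-head results: (message_1, message_2, winner).
# winner is None when the respondent rated the pair as a tie.
matchups = [
    ("A", "B", "A"),
    ("B", "C", "C"),
    ("A", "C", "A"),
    ("A", "B", "A"),
    ("B", "C", None),
    ("A", "C", "C"),
]


def tournament_table(results):
    """Tally wins/losses/ties per message and return them ranked."""
    record = defaultdict(lambda: {"wins": 0, "losses": 0, "ties": 0})
    for first, second, winner in results:
        if winner is None:
            record[first]["ties"] += 1
            record[second]["ties"] += 1
        elif winner in (first, second):
            loser = second if winner == first else first
            record[winner]["wins"] += 1
            record[loser]["losses"] += 1
        else:
            raise ValueError(f"winner {winner!r} is not part of the pair")
    # Assumed ranking rule: most wins, then fewest losses, then fewest ties
    return sorted(
        record.items(),
        key=lambda item: (-item[1]["wins"], item[1]["losses"], item[1]["ties"]),
    )


ranking = tournament_table(matchups)
for message, tally in ranking:
    print(message, tally)
```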

Diagnostic techniques commonly used in quantitative messaging market research

Brands can use diagnostic questions within surveys to gain deeper insights into respondents' perspectives. By using such questions, teams can effectively discover the underlying perceptions and misconceptions held by potential and existing customers regarding their brand and product. They help businesses evaluate various aspects of their messaging strategy.


Purchase Intent Based on Message: This metric measures if the message influenced respondents' likelihood to buy a product or service. It's a key indicator of how persuasive the message is.


Differentiation vs. Competitors: Researchers can check if the message successfully sets the brand or product apart from competitors, identifying unique selling points.


Relevance: This aspect shows how well the message aligns with the target audience's needs and interests. A relevant message is more likely to connect with customers.


Believability: This metric assesses how trustworthy the message appears. A believable message is more likely to impact consumer decisions.


Comprehension (Easy to Understand): Messages should be clear and easy to understand. High comprehension ensures the audience understands the intended meaning.


Value: Value reflects if the message effectively communicates the benefits and value of the product. It helps the team determine if the message is compelling to consumers.


Key Message Takeaway: Researchers evaluate if respondents accurately recall the main message of the advertisement or communication. This shows how memorable the message is.


Fast Message Recall: Fast message recall measures how quickly respondents remember the message. Faster recall often indicates a stronger and more impactful message.


Slow Message Recall: Slow message recall assesses if the message remains memorable over time. It helps determine if the message has a lasting impact and if consumers remember it beyond the initial exposure.


Message Sentiment: Does the message evoke a positive, neutral or negative sentiment in the customer?


Emotional Resonance: Beyond sentiment, what specific emotions are triggered by the message in the customer and are they desirable emotions for the brand?


Implicit Response: Tapping into the philosophy of implicit biases, some research methodologies test messages in a way that measures the implicit response. A message is quickly exposed on the screen, after which the respondent is asked to complete a task that has a right vs. wrong answer. Exposure to the message can change their response and can be used to assess their implicit reaction to the message.

By conducting surveys and assessments on these dimensions, marketers and researchers can gain a comprehensive understanding of how well their messages resonate with their audience and make informed adjustments to optimize their messaging strategies.

Some diagnostic measures (e.g., Purchase Intent, Believability) are mandatory for brand teams, no matter what types of messages they are testing. Other diagnostic measures (e.g., Emotional Resonance, Implicit Response) can be important depending on the objectives of the messaging campaign, the target customer for the campaign, etc.

Sourcing participants for messaging market research studies (qual/quant)

Whether one is using qualitative or quantitative market research to test messages, one needs to get enough respondents to participate in the study. It is critical that the respondents recruited for the study are representative of the customers of the brand whose messages are being tested.

Let’s explore a few ways to recruit participants for messaging market research. The quantity and quality of respondents recruited to participate in the study heavily influence how reliable the findings of the message testing research study are and how projectable they can be to the real world.

Over the past 10 years, respondent recruiting for primary market research has gone through major industry shifts from third-party maintained panels to first-party assembled customer communities.

Third-party research panels – A research panel or market research company creates a database of people with contact information and other relevant data for future studies. This allows brands to speed up their market research studies by accessing a ready-made sample of participants from a database of millions. The main benefits of using third-party customer panels are:

  1. Speed, as recruiting a panel can be time-consuming
  2. Short-term cost, as buying a sample may be cheaper for a few studies
  3. Bandwidth, as managing a panel may not be the best use of a limited research team's time
  4. Diversity, as buying from an external panel allows for choosing specific demographics to expand the audience.

First-party customer online communities – First-party data is information obtained directly from customers, site visitors, or social media followers. It includes data from online and offline sources such as websites, apps, CRM, and surveys. This data is valuable and reliable as it comes straight from customers. Examples include population statistics, user behavior, CRM data, social media discussions, and survey information.

Web page exit surveys – Exit intent surveys are shown to visitors as they are about to leave a website, helping teams understand why visitors are leaving. These surveys are an important part of your Voice of the Customer (VoC) program. Getting feedback directly from customers and website visitors is the best way to understand how well the customer experience and user experience are working for current and potential customers.


Social media followers – Engaging with social media followers is a great way to find participants for market research. These people have already shown interest in a brand and its messages by choosing to follow it on social platforms. By tracking their engagement, collecting feedback, conducting surveys, or running A/B tests, teams can evaluate the impact and effectiveness of their messaging strategies.

Using the social media community can help marketers make sure their communication connects with those who are already engaged with their brand, which could lead to more loyalty and conversions.

YouTube Polls - YouTube polls are a great tool for interacting with viewers and getting their feedback. They can help marketers get more insights about their customers in real-time and improve their brand without having to use external pages or surveys. YouTube polls are becoming more popular as they are quick and easy to complete, with a better response rate. They often provide immediate feedback on how others have answered the same question.

LinkedIn Brand Survey - LinkedIn surveys let social media marketers conduct short brand surveys for interested audiences. These surveys appear in the LinkedIn feed or inbox and are displayed on the LinkedIn Page. The audience can choose from various options to answer about the brand. After submitting the survey, the participants cannot see other members' responses. Brands can see the number of answers received but not who answered or how they responded.


Affinity organizations - Surveys can be fielded to members of affinity-based organizations like the VA, Worker Unions, Student clubs, etc. While members of these organizations are likely to engage quickly due to their affinity, they don’t offer the diversity in respondent recruitment that most brands seek.

Retail Intercepts - In many smaller countries, mall or retail store intercepts are still used as an important source of research participants for surveys. Before the advent of internet-based surveys, mall intercepts followed by kiosk-based surveys were the primary channel for surveys even in developed countries.

Using AI-based synthetic market research for message testing

AI is transforming the way marketers can create and predictively score messages before launching a campaign. Message writing was historically done by copywriters working at brand agencies. Now, Generative AI can use large language models like GPT-4, LLaMA, BLOOM, etc. to generate copy for brand messaging. Similarly, Predictive AI can be used to train machine learning algorithms on messaging databases so that they can replace primary market research-based message testing with synthetic research.

Generative AI – Changing the future of marketing content development

Generative AI is rapidly changing the landscape of how brand teams and even their agencies will develop marketing content in the future.


Large Language Models (LLMs) are demonstrating great potential in generating natural, human-like language and can be used to produce marketing content of reasonably high quality within seconds.

Large Language Models have been trained on very large volumes of text, much of it from the internet, using self-supervised machine learning. Many of the largest LLMs have hundreds of billions of parameters and can generate text for a wide range of use cases.

Generating useful messaging for a brand is not easy for Generative AI platforms because the output has to satisfy a number of criteria before it can even be legitimately considered an option:

  • The messaging has to be ON-STRATEGY, which means that it has to deliver against marketing goals and objectives of the brand
  • The messaging has to have the right TONALITY, because every marketing/brand team follows a tonality that they have created for their brand.
  • The messaging has to be legally approvable, because no messaging campaign can get to market if the brand’s legal team doesn’t believe that the messages can be supported with evidence.

While LLMs are still working through many glitches like hallucinations, they can already produce high-quality brand messages by paraphrasing, spinning, or rewriting existing marketing content. They can also generate new marketing content and messages through prompt engineering on product specs and features.

Predictive AI – Creating a new way to test messages using synthetic research

Machine learning algorithms can be good at making predictions if they are trained on high-quality/quantity data. Since billions of messages have been tested in years past, either through primary market research or through A/B testing on digital platforms, there are large volumes of training data available on message testing that can be used to train machine learning algorithms to predictively score messages based on appeal to customers.

Predictive AI is paving the way for a new field of research called SYNTHETIC RESEARCH that has the potential to disrupt primary market research in the future.

Synthetic research replaces customer surveys with algorithmic predictions, which means no fieldwork, no survey programming, no data analysis, etc.

Messages can simply be fed into a machine learning model and scored predictively. If the output of synthetic research can be 80+% similar to primary market research, it offers a faster, cheaper, almost as good alternative to the primary market research process.
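
The source does not define how "80+% similar" would be measured. One possible reading, sketched below purely as an assumption, is the share of message pairs that the model and the survey order the same way. The function name `pairwise_agreement` and all scores are hypothetical.

```python
from itertools import combinations


def pairwise_agreement(predicted, observed):
    """Share of message pairs that both score sets order the same way."""
    shared = sorted(set(predicted) & set(observed))
    agree = total = 0
    for a, b in combinations(shared, 2):
        pred_diff = predicted[a] - predicted[b]
        obs_diff = observed[a] - observed[b]
        if pred_diff == 0 or obs_diff == 0:
            continue  # skip pairs that are tied in either source
        total += 1
        if (pred_diff > 0) == (obs_diff > 0):
            agree += 1
    return agree / total if total else float("nan")


# Hypothetical scores: model predictions vs. survey results for the same messages
model_scores = {"M1": 0.82, "M2": 0.64, "M3": 0.71, "M4": 0.40, "M5": 0.55}
survey_scores = {"M1": 74, "M2": 61, "M3": 58, "M4": 35, "M5": 49}

agreement = pairwise_agreement(model_scores, survey_scores)
print(f"Pairwise agreement: {agreement:.0%}")
```

Other similarity measures, such as rank correlation or overlap in the top-ranked messages, would be equally defensible. The choice should match how the results will be used.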

Generative/Predictive AI solutions for synthetic message testing

Anyword uses LLMs to help brands create, test, and score messaging across various communication platforms. With its advanced AI technology, the platform generates multiple variations of original content, each with a unique tone, style, and perspective.

Anyword scores these content variations on a scale from 1 to 10 using its predictive performance algorithm trained on billions of data points from Facebook ads. This feature allows marketers to identify and select the highest-performing messages generated by Anyword, whether for email campaigns, social media posts, or ad copies.


Newristics is famous for using messaging science, algorithms, and databases to help brands create, test, and score their messaging. Focusing exclusively on healthcare messaging, Newristics has trained deep learning algorithms on millions of data points from past message testing primary market research studies and can now predict the effectiveness of any message in any disease state.

Newristics' KRISTL machine learning algorithm can replace primary market research with synthetic research for message testing. KRISTL can identify winning/losing messages in an inventory of messages without the need for customer surveys. KRISTL can also analyze the effectiveness of 1,000s of messages across all brands in a product category and create a messaging performance scorecard comparing the effectiveness of brand messaging.


Phrasee is an AI-powered platform that uses natural language generation and machine learning to analyze and score messages. It helps businesses test the effectiveness of different message elements like subject lines, email copy, and social media content. The platform relies on data from historical A/B testing to train algorithms that can predict how audiences will respond to specific phrases, guiding teams to choose the most effective messaging.

Testing messages DURING a live campaign using A/B testing

With the advent of digital marketing, brand marketers are producing and using a lot more marketing content and it is simply not practical for such large volumes of content to be tested in primary market research using qualitative or even quantitative research methodologies.

As a result, digital-heavy or digital-first industries have adopted a different modality of message testing called A/B testing.

What is A/B testing and how can it be used to test messages?


A/B testing is based on a simple concept. If you have two ideas (A and B), you can expose each idea to customers in the market using a split cell experimental design approach and measure how many customers respond to each idea. Customer response can be measured using a variety of behavioral metrics like Open Rate, Click Through Rate, Registrations, Form Fill Rate, Purchase Rate, etc.

The idea (let’s say B) that gets statistically significantly better customer response at the end of this real-world experiment becomes the winning idea and replaces the losing idea (in this case A) so that all customers are now exposed to only idea B for the remainder of the campaign.
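
One common way to check whether B's response is "statistically significantly better" is a two-proportion z-test. The sketch below is a minimal version using only the Python standard library. The conversion counts are invented, and the test choice itself is an assumption, since the article does not say which significance test a given platform uses.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in response rates between A and B.

    Assumes both arms have responses and non-responses, so the pooled
    standard error is non-zero.
    """
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical campaign data: 5,000 exposures per arm
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
if p < 0.05 and z > 0:
    print("B is the statistically significant winner at the 5% level")
```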

If A and B are very different ideas, then the number of exposures needed for each idea can be smaller because the response rate can be very different. If A and B are similar in nature, then even after many exposures, it is possible that no winner emerges from the A/B test.
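
The relationship between how different the two ideas are and how many exposures are needed can be sketched with a standard sample-size approximation for comparing two proportions. The response rates, the 5% significance level, and the 80% power below are illustrative assumptions, not values from the article.

```python
from math import ceil
from statistics import NormalDist


def exposures_per_arm(rate_a, rate_b, alpha=0.05, power=0.80):
    """Approximate exposures needed per arm to detect rate_a vs. rate_b."""
    if rate_a == rate_b:
        raise ValueError("rates must differ; identical ideas never separate")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = rate_a * (1 - rate_a) + rate_b * (1 - rate_b)
    return ceil((z_alpha + z_beta) ** 2 * variance / (rate_a - rate_b) ** 2)


# Very different ideas: 4% vs. 8% response rate
print(exposures_per_arm(0.04, 0.08))
# Similar ideas: 4% vs. 4.5% response rate
print(exposures_per_arm(0.04, 0.045))
```

Under these assumptions, the similar pair needs far more exposures per arm than the very different pair. This is why tests of near-identical messages often end without a winner.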

Due to the efficient nature of A/B testing, it has become very popular and has even replaced primary market research-based message testing completely in many industries.

Popularity of A/B testing across digital and social media platforms

A/B testing means trying different ways of saying something to see which one works best. Teams choose what to test, make two versions of the message, run the test, see which version is better, and replace the losing version with the winning version.

A/B testing is being used in almost every digital channel as a way to test and continuously optimize messaging while running live campaigns.

Email A/B testing - Email A/B testing is the process of sending two different variants of emails to different lists of subscribers with a goal to determine which variant performs better. A simple A/B test can involve sending multiple subject lines to see which one generates more opens. A more advanced A/B test can involve testing completely different email templates to see which one generates more click-throughs. This allows teams to measure how a change in the email subject or content affects the email's performance.

Web page A/B testing – Web A/B testing is a process where different versions of a variable (such as a web page or page element) are shown to different segments of website visitors at the same time. This helps determine which version has the greatest impact and drives business metrics. By eliminating guesswork, this method helps teams make data-backed decisions for website optimization.

Digital ad A/B testing – Digital ad A/B testing allows advertisers to determine the most effective version of their ads. They create multiple versions of an ad with slight variations, such as different words or images, and identify the most successful elements based on their performance. Such an analysis helps to maximize the ROI of existing campaigns and improve the performance of future ones.

Social media ad A/B testing – This is used to test and analyze the performance of ads running on social media platforms like Facebook, Twitter, LinkedIn, and more. Marketers test multiple versions of their ads or different variants of the same ad on social channels and analyze which one works best for their target audience.

Mobile App A/B testing – Mobile app A/B testing involves testing different experiences within mobile apps to determine which one is most effective in achieving desired actions or improving key metrics. Users are randomly assigned to different arms of the A/B experiment and shown different in-app experiences.

By analyzing the results of the experiment, app developers and marketers can optimize the app, improve user engagement, and achieve better campaign results.

Mobile app A/B testing has historically been challenging due to technical difficulties and the need to test on both Android and iOS platforms. However, it is crucial for app marketing as it helps identify the best user experience and understand user behavior, ultimately improving key performance indicators.

Top software solutions for Email A/B testing

Email A/B testing software helps marketers to experiment with different aspects of their email campaigns, such as subject lines, content, send times, and design, to determine what works best for their audience. Below are some of the top options available.


Mailchimp offers a wide range of testing features for optimizing email campaigns. For example, the subject line can be fully customized, allowing for various verb choices, value propositions, and tones of voice.

The platform allows populating the sender's name and can replace an email address with an actual name. Targeting specific members based on factors like age or purchase history is possible, and A/B tests can be conducted within these targeted communities. Different messages and layouts can be trialed by scheduling the same email at different times of the day.


GetResponse has a user base of 350,000 and offers a comprehensive A/B testing feature list. Marketers can create variants of their visual or text content to reflect a different style or message.

Testing different subject lines can help to get better open rates and make a first impression that stands out, while the ‘From Field’ allows for a personal touch and can help avoid the spam folder. The software allows scheduling emails for a specific time of day or around an important event to test the engagement rate. Another A/B testing feature, List Segmentation, helps brands to target specific groups of contacts with similar attributes.


ActiveCampaign is another popular email A/B testing software with many customization options for subscribers, like personalizing senders' information, email addresses, subject lines, and content.

The software allows testing up to five variable versions, but it does not support A/B testing based on time or audience segmentation. While this may not be ideal for teams that require these options, it is a suitable choice for businesses that target a specific type of customer.

Top software solutions for Web Landing page A/B testing

Choosing the best A/B testing software can be challenging, especially with hundreds of options available in the market. However, if brand teams have a clear objective of what they want to test, they can select one based on its strengths and weaknesses. Here are our top 5 recommendations for web page A/B testing:


Google Optimize is a free A/B testing tool from Google that helps SMBs run a limited number of experiments with different objectives and personalization. Its premium version, called Optimize 360, comes with additional bandwidth and testing features to support enterprise-level businesses.


Visual Website Optimizer (VWO) is a platform for experimentation that offers a complete set of CRO tools. It allows you to A/B test various elements of your website, including the headline, CTA button, and images, to determine which variations result in higher user conversions.


Omniconvert helps to optimize the website with capabilities like testing, surveys, personalization, customer segmentation, and behavioral targeting. Marketers can generate detailed reports with insights on test variations, conversion rates, and statistical significance.


This landing page builder software includes A/B testing and analytics features, enabling marketers to optimize their landing pages and improve conversion rates using data-driven insights.


Crazy Egg is a website optimization tool that lets you analyze user behavior on your website. It has heat maps, scroll maps, and click reports to help you test different web page versions and determine which generates more engagement or conversions.

Top software solutions for Digital/Social Ad A/B testing

A/B testing of digital and social media campaigns helps marketers test different ad elements and identify the most successful parameters.


Facebook and Instagram Ads: Facebook Ads Manager allows marketers to conduct split testing on their Facebook and Instagram ad campaigns. The interface enables testing of ad creatives, headlines, descriptions, call-to-action buttons, and audience targeting to determine the most effective approach.

By ensuring that audiences are evenly split and statistically comparable, Facebook Ads A/B testing allows marketers to confidently assess the performance of different campaign variables. This empowers teams to make informed decisions about which strategy performs the best, enabling them to allocate more resources to the most successful campaign.


Google Ads A/B testing is a helpful way for marketers and business owners to compare two versions of an ad and measure their effectiveness. Drafts and experiments can be used for both ad variations and campaign testing.

Campaign experiments allow testing variables like ad copy, bid strategies, landing pages, ad extensions, display URLs, keywords, negative keywords, ad scheduling, targeting settings, and bid adjustments.

The feature allows creating a copy of an existing campaign and making desired changes, while the original campaign remains unchanged. The original campaign serves as the "control" and the new campaign serves as the "experiment," where the variable is being tested. Both campaigns run simultaneously, allowing marketers to monitor real-time performance and compare the experiment's performance to the control.

One advantage of A/B testing with campaign experiments is that the experiment shares the same budget as the original campaign. Owners have control over how to split the budget. Running the experiment and control at the same time helps limit the impact of outside variables on your results, such as seasonality.

Additionally, Google notifies the account manager when certain metrics in the campaign experiment reach statistical significance, which helps them analyze the results and make informed decisions.


LinkedIn Ads offers A/B testing for sponsored content and sponsored InMail. This feature allows customizing B2B ad campaigns to increase the interaction rate with the desired segments or audience.

Marketers can try out different ad headlines, ad copy, images, and targeting criteria. After the test, LinkedIn determines the winning campaign based on the chosen KPI, such as cost per click, if applicable.

For account managers who prioritize campaign optimization and market sentiment, LinkedIn A/B testing provides data-driven insights on the audience, creative elements, and ad placement. These insights assist in making campaign decisions that enhance ROI.


AdEspresso optimizes Facebook and Instagram ads and offers advanced A/B testing. It allows the creation of multiple variations of ad campaigns with different elements like ad copy, headlines, images, call-to-action buttons, and audience targeting. This helps experiment with various combinations to identify what resonates best with the target audience. Marketers can track KPIs such as CTR, conversion rates, CPC, and ROAS.

Top software solutions for Mobile App A/B testing

Mobile app product managers and marketers can gain valuable insights into user behavior, reduce the costs of new ideas, and optimize conversion rates through A/B testing and a culture of experimentation. Here are a few top mobile app A/B testing tools to analyze user engagement, purchases, and app abandonment.


Firebase is a platform by Google for developing mobile and web apps. It offers various tools and services for developers. With Firebase A/B Testing, product managers can manage backend infrastructure, monitor performance, and conduct experiments. It seamlessly integrates with other Google tools like Google Analytics, making data sourcing and insights easy. The app owner can roll back features if issues arise during testing. Setting up and deploying experiments is simple.


Optimizely Classic Mobile comes with the capability to run experiments in iOS or Android apps. It enables developers to conduct UI-based and server-side experiments, reducing the risk when launching new features. Users can enjoy full-stack and multi-channel experimentation, phased feature rollouts, instant app updates, and other mobile optimization features offered by Optimizely.


The VWO Mobile App Testing solution offers developers a comprehensive way to optimize their mobile apps. With the ability to experiment with various in-app user experiences and test key features before and after launch, developers can easily make changes to improve app conversion rates, engagement, usage, and retention. This includes testing basic UI changes like CTA or banner copy, color, and placement, as well as more significant optimizations to search engine algorithms and game experiences.

Limitations of message testing methodologies

Market researchers and marketers have several options for testing messages before and during the launch. However, each testing methodology has its pros and cons. Let's look at the challenges faced in testing messages and explore the unmet needs that remain unaddressed in the field.

Limitations of all message testing methodologies


1. One of the primary limitations of all message testing methodologies is the inability to test a large number of messages within a single study. Even with quantitative methods, it remains impractical to test numerous messages. However, testing more messages is crucial as it increases the likelihood of finding effective and winning messaging bundles.

2. Another challenge arises from the sea of sameness that emerges in primary market research as well as A/B testing. Many messages fail to differentiate themselves or receive distinct scores in the research. This lack of separation makes decision-making after the research more challenging.

3. Investing considerable time and effort in testing messages only to find that none of them perform well presents another dilemma. Going back to the drawing board to develop new messages and conducting further testing is neither efficient nor cost-effective. This limitation raises questions about the feasibility of improving messages before or during the testing process.

4. Understanding why people like or dislike certain messages is not easy in message testing. Currently, there is no effective way to gather this information silently from respondents without directly asking them. While real-world A/B testing can provide insights into which message performs better, it fails to explain the underlying reasons.

5. Primary market research tends to focus on identifying the top-performing messages rather than determining the optimal bundle and story flow for a campaign. This approach overlooks the importance of crafting a cohesive messaging strategy.

Limitations of qualitative market research-based message testing

  • Qualitative message testing surveys are great for understanding the psychology behind individual messages, but they have limitations on the number of messages that can be tested and the time it takes to test them.

  • Qualitative message testing research also does not accurately represent how customers react to messages in the real world. During qualitative testing, the interviewer can focus on a single message with the respondent for several minutes, asking a dozen or more questions to collect their feedback on each message. However, in the real world, the same respondent may not spend more than a few seconds on a message, and that's if they even notice it in the first place.
  • Qualitative testing creates an artificial environment for message testing research, making the findings less projectable in the real world.
  • Qualitative testing feedback is likely to be "stated, not derived." During qualitative interviews, respondents are repeatedly asked to provide explanations for why they like/dislike each message.
    Behavioral science suggests that people can rarely explain the true drivers of their decisions because they themselves are not aware of the mental shortcuts they use to make a decision in the first place. Therefore, it is questionable whether qualitative testing can accurately identify the drivers of message appeal by simply asking respondents stated questions.

Limitations of quantitative market research-based message testing

  • Quantitative message testing can test more messages, but it doesn't allow for deep exploration of each message or provide detailed drivers/barriers of appeal.

  • Quantitative message testing software is not ideal for getting ideas from respondents on how to improve messages.
  • Many quantitative message testing methodologies produce similar scores across messages, making it difficult for marketing teams to make decisions based on research alone.

  • Some methods only deliver a rank order/hierarchy of messages and a TURF-type analysis, which is not enough for marketing teams to be campaign-ready. They need a segment-level, channel-specific message map, which is not currently provided by research.
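For readers unfamiliar with TURF (Total Unduplicated Reach and Frequency), the core idea is to find the set of messages that appeals to the largest share of respondents at least once. The sketch below searches every possible two-message bundle on a tiny, made-up data set. The respondent IDs, message labels, and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical survey data: the messages each respondent found appealing
appealing = {
    "r1": {"M1", "M2"},
    "r2": {"M1"},
    "r3": {"M3"},
    "r4": {"M2", "M4"},
    "r5": {"M4"},
    "r6": set(),
}

def reach(bundle, appealing):
    """Share of respondents who like at least one message in the bundle."""
    reached = sum(1 for liked in appealing.values() if liked & set(bundle))
    return reached / len(appealing)

def best_bundle(messages, appealing, size):
    """Exhaustively search all bundles of the given size for the highest reach."""
    return max(combinations(sorted(messages), size), key=lambda b: reach(b, appealing))

top = best_bundle({"M1", "M2", "M3", "M4"}, appealing, size=2)
print(top, reach(top, appealing))
```

In this toy data the best pair is M1 and M4, which together reach four of the six respondents. Note that the output is still only a bundle and a reach figure; it says nothing about segments, channels, or story flow, which is the gap described above.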

Limitations of real-world A/B message testing


  • A/B testing can be time-consuming and resource-intensive. It often takes longer than other testing methods and can drain the time and resources of marketing teams. This is especially true for low-traffic sites, where it may take a significant amount of time to reach statistical significance and review the results. Additionally, deploying the winning variation requires expertise and experience that may not be readily available within the team.

  • Another limitation of A/B testing is that it assumes a static worldview and does not account for changes in trends, consumer behavior, or seasonal events. The winning variation may change over time due to these influencing factors.
  • It is important to note that A/B testing is only suitable for specific goals, such as determining which product page yields the best results. If the goals are less easily measurable, pure A/B testing may not provide the desired answers. Additionally, A/B testing does not improve a website that already has usability problems; it only tests variations of the existing design.
  • Even companies with heavily visited sites or those who spend a lot of media dollars on digital channels can’t test more than 8-10 options in A/B testing. Testing more options would take months and cost a lot because of the experimental design of the test.
  • A research study on A/B testing revealed that 80% of A/B tests never reach statistical significance and are abandoned by marketers, thereby reducing the actionability of A/B testing.
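A back-of-the-envelope calculation illustrates why adding options stretches the timeline. The sketch below assumes traffic is split evenly across options and that each option needs the same number of exposures. The figures are hypothetical, and it ignores multiple-comparison corrections, which would lengthen the test further.

```python
from math import ceil

def days_to_complete(options, exposures_per_option, daily_exposures):
    """Days needed when daily traffic is split evenly across all options."""
    return ceil(options * exposures_per_option / daily_exposures)

# Hypothetical: 20,000 exposures needed per option, 2,000 exposures available per day
two_options = days_to_complete(2, 20_000, 2_000)    # 20 days
ten_options = days_to_complete(10, 20_000, 2_000)   # 100 days
print(two_options, ten_options)
```

Under these assumptions, a two-option test finishes in about three weeks, while a ten-option test on the same traffic takes more than three months.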

Limitations of AI synthetic market research-based message testing

AI synthetic market research-based message testing offers many advantages, but it also comes with several limitations and challenges. Here are some of the key limitations:

Data Quality and Bias: AI models are highly dependent on the quality of the training data. If the data used to train the AI model is biased or unrepresentative, it can lead to biased results and inaccurate message testing.

Contextual Understanding: AI may struggle to grasp the full context and nuances of messages, especially when dealing with sarcasm, humor, or cultural references. This can result in misinterpretation of the message's impact.

Lack of Creativity: AI models often lack the creativity and intuition of human researchers. They may not be as adept at understanding novel or unconventional message elements that could be effective.

Overfitting: AI models can sometimes overfit to the training data, meaning they perform well on the data they were trained on but struggle when faced with new, unseen data. This can limit the model's ability to adapt to evolving market trends.

Ethical Concerns: The use of AI in message testing raises ethical concerns, especially in terms of data privacy, surveillance, and the potential manipulation of consumer behavior.

Inability to Predict Long-Term Impact: AI may not accurately predict the long-term impact of messages, including their influence on brand loyalty or customer retention.

Message testing needs innovation to support the future of marketing

For 30+ years, there has been little to no innovation in how message testing is conducted by brands. Methodologically, few/no new techniques have been proposed that address unmet needs in the field of message testing:

  • Marketers ideally want to test a large quantity of messages in one study/experiment, but there are no good solutions to test 100s of messages together.
  • Marketers need better differentiation between good/better/best messages from message testing research so that they can make more confident decisions about which messages to use in their messaging campaigns.


    Conventional message testing methodologies often produce scores that regress to the mean, making it difficult for marketers to choose messages for their campaigns.

  • Marketers want to simultaneously test a wide range of ideas (akin to a color palette) and many ways to express each idea (akin to shades of the same color) when they conduct message testing. Most conventional message testing approaches can’t simultaneously accommodate this request. Creating many expressions of the same idea also requires a different approach to messaging development – one that is driven more by the use of science and algorithms and less by creativity.
  • Marketers look for message testing research to explain the drivers and barriers of appeal for every message. However, none of the message testing methodologies can answer the question, “Why do consumers like this message?” very well because of the stated nature of survey questions.

Lately, advances in the fields of decision heuristics science and AI have created an opportunity to finally transform message testing.

  • Improving messages using decision heuristics science: Decision heuristic science can be employed to carefully curate the language used in a message so that it subconsciously talks to the mental shortcuts of the target customer.


    This science can be used as a tool to create more persuasive messages without relying on creative copywriting. It can efficiently create multiple alternative versions of a single message, each written to a different decision heuristic.

  • Testing a greater number of messages: If every message being tested is written to a specific decision heuristic, then appeal of the underlying heuristic in a message can help predict appeal of other messages written to the same heuristic without the need to test every message with consumers. This means that decision heuristics science can also be used to create a more efficient research methodology that can test a large number of messages in one survey.
  • Increasing separation in message scores: Decision heuristics science can also help to get better separation in scores between messages. Lack of differentiation in message scores results from randomly showing messaging choices to respondents in a survey. If the decision heuristics of a respondent are known, messaging choices can be customized to each respondent in the survey, thereby forcing them to make tougher decisions between multiple messages that are all highly persuasive. Researchers can also use this same science to get a signal on what inside a message is driving its appeal without relying on direct or diagnostic questions.
  • Drivers of message appeal: Finally, decision heuristics science allows us to identify the hidden drivers of message appeal without having to ask stated survey questions. If a respondent repeatedly prefers messages written to a decision heuristic like Ambiguity Aversion, we can conclude that they are driven by that heuristic and language in the messages that talked to Ambiguity Aversion is driving the appeal of the messages. We don’t have to ask, and even if we did, it is highly unlikely that the respondent would have offered an explanation centered around the idea that they have an aversion to unknowns and uncertainties.
  • AI-powered identification of winning message bundles: If marketers develop and then test a large number (100s) of messages, billions or even trillions of message bundles can possibly be generated using individual messages. How can marketers identify the best message bundles for their brand efficiently using the survey data? By employing advanced genetic algorithms, they can identify the most effective messaging bundles out of billions or trillions of possibilities quickly without having to simulate every possible message bundle out of all possibilities.
  • Rules-based AI for channel and segment customization of messaging: AI can determine not only the best message bundle that can serve as the overall campaign strategy, it can also identify the best message bundle for every channel and customer segment using rules-based simulations.


    This level of precision can significantly improve the actionability of message testing research and provide better campaign readiness for marketers. Historically, message testing studies would produce two main deliverables – a message hierarchy and a TURF analysis. Rules-based AI can generate an omni-channel messaging playbook with the optimal message bundles and storyflow for digital displays, webpage design, or personal promotion channels.
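To make the genetic-algorithm idea concrete, here is a minimal sketch of how such a search could look. It is not the method behind any specific platform. The scores, the heuristic labels, and the fitness rule (only the best-scoring message per heuristic counts, which rewards bundles that cover several heuristics) are all assumptions made for illustration.

```python
import random

def evolve_bundle(scores, heuristics, bundle_size,
                  generations=200, population_size=40, seed=0):
    """Search for a high-scoring message bundle without enumerating every bundle."""
    rng = random.Random(seed)
    n = len(scores)

    def fitness(bundle):
        # Hypothetical objective: count only the best message per heuristic
        best = {}
        for i in bundle:
            best[heuristics[i]] = max(best.get(heuristics[i], 0.0), scores[i])
        return sum(best.values())

    def crossover(a, b):
        # Child draws its messages from the union of both parents
        pool = list(set(a) | set(b))
        return tuple(sorted(rng.sample(pool, bundle_size)))

    def mutate(bundle):
        # Swap one message for a message that is not yet in the bundle
        outside = [i for i in range(n) if i not in bundle]
        if not outside:
            return bundle
        kept = list(bundle)
        kept[rng.randrange(bundle_size)] = rng.choice(outside)
        return tuple(sorted(kept))

    population = [tuple(sorted(rng.sample(range(n), bundle_size)))
                  for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]   # keep the fitter half
        children = []
        while len(parents) + len(children) < population_size:
            child = crossover(rng.choice(parents), rng.choice(parents))
            if rng.random() < 0.3:                      # occasional mutation
                child = mutate(child)
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return best, fitness(best)

# Hypothetical appeal scores and the heuristic each message was written to
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
heuristics = ["loss", "loss", "social", "social", "anchor", "anchor"]
bundle, value = evolve_bundle(scores, heuristics, bundle_size=3)
print(bundle, value)
```

The point of the design is that each generation evaluates only a few dozen bundles, so the cost grows with the number of generations rather than with the number of possible bundles. A genetic algorithm does not guarantee the single best bundle, only a strong one.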

What is the Roadmap for Using Behavioral Science and AI to Test Messages Differently?

Step 1: Preparing Better Messages Using Decision Heuristic Science

Marketing teams are currently spending weeks or even months developing and fine-tuning messages in preparation for message testing research. The process can involve long meetings with the agency, message review workshops, and rounds of editing the same few words over and over again.

Using decision heuristics science to optimize language in messages can simplify the process and create significant time and process efficiencies.
By heuristicizing messages, many alternative versions of the same message can be created quickly and either all of them can be tested in research or the team can collectively choose the best ones to test.

Step 2: Testing and Customizing Messages with Respondent Feedback

All of these messages are now used in research, where respondents participate in a quantitative research survey that includes not just the messages involved in testing but also the underlying heuristics behind those messages.

If every message is written to a specific decision heuristic and a message-to-heuristic map is available for all messages, this information can be leveraged in real time during the research survey to customize which messages are presented to each respondent based on their own dominant decision heuristics.

Researchers can now test and customize individual messages as well as create message bundles in real-time according to each respondent's decision heuristics.
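The real-time customization described in this step could be sketched as follows. The message-to-heuristic map, the message labels, and the simple "most frequently chosen heuristic" rule are hypothetical. Ambiguity aversion is taken from the discussion above, and the other heuristics are included only as examples.

```python
from collections import Counter

# Hypothetical message-to-heuristic map
MESSAGE_HEURISTIC = {
    "M1": "ambiguity aversion",
    "M2": "ambiguity aversion",
    "M3": "social proof",
    "M4": "social proof",
    "M5": "loss aversion",
    "M6": "ambiguity aversion",
}

def dominant_heuristic(chosen_messages):
    """Heuristic behind the messages the respondent has preferred most often."""
    counts = Counter(MESSAGE_HEURISTIC[m] for m in chosen_messages)
    return counts.most_common(1)[0][0] if counts else None

def next_messages(chosen_messages, already_shown, limit=2):
    """Unseen messages written to the respondent's dominant heuristic."""
    target = dominant_heuristic(chosen_messages)
    candidates = [m for m, h in MESSAGE_HEURISTIC.items()
                  if h == target and m not in already_shown]
    return candidates[:limit]

# A respondent who preferred M1 and M2 over M3 is shown M6 next
chosen = ["M1", "M3", "M2"]
print(dominant_heuristic(chosen), next_messages(chosen, already_shown={"M1", "M2", "M3"}))
```

The same tally also gives a derived, rather than stated, read on what drives appeal for that respondent, since it comes from choices instead of direct questions.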

Step 3: Analyzing Messages and Optimizing Campaigns Using AI


Instead of analyzing the survey data in SPSS, Sawtooth or SAS, data can be fed into rules-based AI platforms, enabling the identification of optimal message bundles and story flows from billions or trillions of possibilities.
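The "billions or trillions" figure follows from simple counting. As an illustration, assuming 100 tested messages (a hypothetical but plausible study size), the number of possible bundles and ordered story flows can be computed directly:

```python
from math import comb, perm

messages = 100  # hypothetical number of tested messages

bundles_of_5 = comb(messages, 5)   # unordered bundles: 75,287,520
flows_of_5 = perm(messages, 5)     # ordered story flows: 9,034,502,400
bundles_of_8 = comb(messages, 8)   # unordered bundles: 186,087,894,300

print(f"{bundles_of_5:,} {flows_of_5:,} {bundles_of_8:,}")
```

Once order matters, even five-message story flows run into the billions, so exhaustively simulating every option quickly becomes impractical.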

Rule engines can help create segment-specific message maps and communication strategies for different channels.

Step 4: Translating Research Findings into Actionable Messaging Playbooks

The output of message testing research doesn’t have to be limited to traditional PowerPoint decks or TURF simulators. Message testing research can and should produce a messaging playbook for the brand which contains a comprehensive message map by channel and customer segment. Marketing teams can utilize these playbooks to execute targeted messaging campaigns effectively.

The Upside of Combining Behavioral Science and AI in Message Testing

Using Decision Heuristic Science and AI can potentially transform message-testing research:

  • Decision heuristics science can change how one writes messages before testing them in market research.
  • Decision heuristics can change how one tests messages in a research survey, allowing for more messages to be tested at once, getting greater separation in message scores and also identifying the hidden drivers of message appeal without asking stated questions.
  • AI and machine learning algorithms can streamline the translation of survey data into compelling message bundles and storyflow that are campaign-ready.

Traditionally, research data was analyzed using conventional statistical packages such as SPSS or SAS. However, leveraging artificial intelligence (AI) on survey data presents a new opportunity where researchers can generate output that is more campaign-ready and reduces the number of steps required to translate research findings into marketing campaigns. This direct line of sight from research to execution not only accelerates time-to-market but also enhances campaign accuracy.

Why is this significant?

Under the conventional approach, when humans manually interpret market research findings and translate them into messaging and marketing campaigns, a significant amount of subjective judgment is involved. This often leads to disparities between the tested messages and the actual ones launched in the market.

Consequently, a considerable degree of “messaging de-optimization” occurs after the market research is complete and the resulting marketing campaigns launched often don’t reflect the output of the research. Messages are rewritten based on judgment, message bundles are changed based on personal preference, and message edits are made without realizing the negative impact on campaign performance.

By incorporating machine learning algorithms or AI into the process, this de-optimization of campaigns can be minimized. AI can identify the optimal message bundles and storyflows for every channel and customer segment based on rules. Marketing campaigns that performed the best in message testing research can now be launched in the market with few, if any, unnecessary alterations.

Meta-Analysis Approach and Key Findings

The effectiveness of this innovative message testing approach, which involves infusing behavioral science into message testing research and then using AI to analyze the data from the research, was studied in a large meta-analysis of research studies and was consistently proven to identify winning messaging campaigns.

A meta-analysis was conducted across 35 message testing projects involving thousands of physicians and patients. These studies included 20 different pharmaceutical brands across 22 disease states. With more than 6,500 participants and 4,800 messages tested, the analysis conducted over 3 years yielded promising results for innovation in message testing.

In every one of the message testing research studies analyzed in the meta-analysis, messages were tested using heuristics and then the optimal bundles of messages were identified using AI. Preference share for the new message bundles identified through research was compared against benchmark message bundles. In most studies, benchmark bundles included current in-market messaging for the brand and also messaging for key competitors.


The results are based on a comparison of preference share data on new optimal message bundles vs. benchmark controls from the 35 studies.

Success rate

The optimal message bundles and the story flows identified by testing messages using the heuristics + AI approach consistently outperformed the preference share of any benchmark used, including current in-market messaging and competitive messaging controls.

100% of projects resulted in improvements versus current messages and competitors. This was statistically significant every time, in every study.


The preference share of the optimal bundles was found to be 1.7 times that of the current in-market messaging or competitor message bundles. This 70% improvement in messaging effectiveness offers substantial growth potential for brands in the marketplace. In 26 out of 35 studies, the improvement in preference share vs. benchmark controls was greater than 30%, and even the lowest improvement level was still in double digits!

Market Leadership

Brands that employed the heuristics + AI-based message testing approach also improved their market leadership vs. competitors in the market research study. 7 out of 10 brands identified new message bundles and improved their market share projections vs. key competitors. For brands that were already market leaders, the new messaging was projected to further widen their lead in the market. For challenger brands, the new messaging identified through research was projected to close the market share gap vs. the leaders.


The results are promising as this kind of innovation in message testing can lead to more effective messaging campaigns, resulting in better outcomes.
Generally, when marketers are asked what they would need to do to improve the effectiveness of their messaging campaigns, the most common responses are that they need better messages and insights. However, teams rarely suggest that the way messages are tested can have a significant impact on the outcome of the campaigns.

This analysis suggests otherwise. The method in which messages are tested has an impact on which messages are selected for the campaigns, the message bundles that are used, the story flow that is created for different marketing assets, and eventually, the success rate of the campaigns. These findings are intriguing and suggest that there is room for improvement in message testing research.


If a messaging campaign doesn't succeed in changing customer behavior, it is a wasted marketing opportunity. Message testing helps brand teams understand which messages have an impact. If they don't include message testing in their marketing strategy, they may waste resources on ineffective copy and expensive campaigns that don't engage audiences. By using decision heuristic science and AI, brands can identify successful message combinations and create more impactful campaigns.