What You Should Know About the Net Promoter Score (NPS)

Recently, we have noticed an increase in requests from our customer survey clients to include the NPS. Although we routinely include a variant of the NPS question in our Strategic Customer Research surveys, we find this trend somewhat disturbing. This paper discusses some problematic aspects of NPS, both conceptual and technical.

NPS is a concept and research method promoted by customer loyalty guru Frederick Reichheld in his book, The Ultimate Question: Driving Good Profits and True Growth (2006).

The idea is that customer satisfaction and loyalty are strongly linked to revenue growth and profitability. An updated version of the book, to be called The Ultimate Question 2.0, is expected to be released in the fall of 2011.

In The Ultimate Question, Reichheld argues that most customer surveys do little more than annoy customers. All that businesses critically need to know about how they stand with customers is provided by customers’ answers to the question “How likely is it that you would recommend our company to a friend or colleague?” Respondents score themselves on an eleven-point rating scale that runs from “0” (not at all likely) to “10” (extremely likely).

The “net promoter” score is so called because the measure is computed by subtracting the percentage of detractors from the percentage of promoters. Detractors are defined as respondents rating their likelihood to recommend as 6 or less, with promoters only those who rated their likelihood a 9 or 10 (respondents who selected 7 or 8 are considered neutral). The NPS measure can run from -100% (0% promoters, 100% detractors) to 100% (100% promoters, 0% detractors), with typical results in the 25-40% range.
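The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not an official implementation: the function name and the sample ratings are hypothetical, and the cut-offs are the ones stated in the preceding paragraph.

```python
def net_promoter_score(ratings):
    """Return the NPS, in percentage points, for a list of 0-10 ratings."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 0 or r > 10 for r in ratings):
        raise ValueError("ratings must lie on the 0-10 scale")

    promoters = sum(1 for r in ratings if r >= 9)   # rated 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # rated 0 through 6
    # Neutrals (7 or 8) count only toward the total number of respondents.
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical sample of ten responses, for illustration only.
sample_ratings = [10, 9, 9, 8, 7, 6, 3, 10, 9, 5]
sample_nps = net_promoter_score(sample_ratings)
print(f"NPS = {sample_nps:.0f}%")  # 5 promoters, 3 detractors out of 10: 20%
```

Note that the two neutral respondents affect the result only through the denominator.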

Although a casual reader might form the impression that Reichheld more or less invented the NPS question, in fact it and variants of it have been used for many decades by market researchers as a standard surrogate measure of customer loyalty. Asking respondents directly about loyalty has been shown to be ineffective, whereas someone who is willing to recommend you to others is highly likely to be at least somewhat loyal. (A complete discussion of customer loyalty would take a whole book, which is, in fact, how Reichheld earned his guru status.)

One of the positive attributes of NPS, in the view of Reichheld and others, is that it allows direct comparisons of scores between and among industries and companies, and also between internal business units in a given company. Among its other virtues are its simplicity and its appealing and rather intuitive model of detractors and promoters. Managers find it easy to describe and explain to co-workers, and setting measurable NPS improvement goals is straightforward.

Although NPS is popular with managers, and seemingly increasingly so, most research professionals have been skeptical at best. A number of analysts and academics have published studies questioning and even refuting Reichheld’s research. Research blogger Dr. Bob Hayes surveyed customer feedback professionals in 2008 as to whether they agreed with Reichheld that NPS was a better predictor of growth than other loyalty questions or indices. Eighty-one percent disagreed or were neutral (“Customer Feedback Professionals Do Not Believe the NPS Claims”).

It is worth noting that Reichheld and his associates are very protective of the NPS name and image. Because of its simplicity and the fact that some online survey providers offer the question and its scoring as an optional feature, some companies using NPS on their own have been surprised to receive letters from or on behalf of Reichheld requiring at a minimum that credit be given.

Many people are now taking issue with the Net Promoter Score methodology for several reasons. Most importantly:

1. The NPS is not diagnostic.

Although the NPS may suggest you have a problem, the score alone doesn’t tell you what needs to be fixed. To the extent that Reichheld acknowledges this limitation, he suggests addressing it with an open-ended follow-on question asking why the respondent gave that rating. While such responses may provide some insight into the nature of the organisational and customer issues faced by the company, they are anecdotal at best and tend not to lend themselves to the level of analytical rigour needed for effective problem solving and improvement planning.

2. The division of respondents into the categories of promoters, neutrals and detractors is arbitrary and has no scientific basis.

This fact by itself robs the NPS of any objective meaning. How is it that a one-point difference in score on an eleven-point scale can accurately determine whether an individual is a promoter rather than a neutral, or a neutral rather than a detractor, whatever those terms mean? “The rule-of-thumb score classes proposed by Reichheld (promoters are those respondents who give a likelihood of recommendation of 9 or 10 while the detractors give 6 or less) are not supported statistically, mask important changes and potentially mislead management that there is negative NPS when this may not be the case.” – Ken Roberts, Forethought Research Australia.
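The sensitivity to the cut-offs can be illustrated with a small sketch. The five-person sample and the function names below are hypothetical, chosen only to show how a one-point change by a single respondent either moves the score sharply or leaves it untouched, depending on where that respondent sits on the scale.

```python
def classify(rating):
    """Map a 0-10 rating to Reichheld's three categories."""
    if rating >= 9:
        return "promoter"
    if rating <= 6:
        return "detractor"
    return "neutral"


def net_promoter_score(ratings):
    """Percentage of promoters minus percentage of detractors."""
    labels = [classify(r) for r in ratings]
    return 100.0 * (labels.count("promoter") - labels.count("detractor")) / len(ratings)


# Hypothetical five-person sample, for illustration only.
baseline = [9, 8, 7, 6, 5]
shift_8_to_9 = [9, 9, 7, 6, 5]  # one neutral crosses the line into "promoter"
shift_7_to_8 = [9, 8, 8, 6, 5]  # one neutral moves up a point but stays neutral

for name, sample in [("baseline", baseline),
                     ("8 -> 9", shift_8_to_9),
                     ("7 -> 8", shift_7_to_8)]:
    print(f"{name:10s} NPS = {net_promoter_score(sample):+.0f}")
```

In this small sample the move from 8 to 9 lifts the score by 20 points, while the equally large move from 7 to 8 changes nothing.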

3. The wording of the NPS question is questionable.

“How likely would you be to recommend…?” is a question about future intention with the implication that the question is behavioural. Yet a large body of research indicates that claimed intention is a better reflection of present attitudes than it is of future behaviour (Bird, Ehrenberg and Barnard). In addition, some who read the question literally may respond with a low score while feeling a high degree of loyalty simply because they think they may have few opportunities in the future to recommend the company. For these reasons, we prefer to ask how willing the respondent is to recommend, a question that is plainly attitudinal and does not pretend to be behavioural. This phrasing reduces ambiguity for the respondent.

In addition, the NPS question itself is unipolar (likelihood to recommend) but Reichheld treats it as bipolar (likely to detract vs. likely to promote). The implications of this are unclear but may reduce the validity of the results. Using our recommended wording avoids this trap.

4. The single NPS question is less reliable than a composite index would be.

Researchers and statisticians generally agree that composite indices are more stable than individual item scores. Some have suggested that ratings of customer satisfaction and intention to repurchase might be added to the recommend question to improve reliability. “In his Harvard Business Review article ‘The One Number You Need to Grow’, Reichheld maintained that since his tests showed propensity to recommend to be the single question that had the strongest statistical relationship to future company performance, there was no point asking any other questions in customer surveys… (However) a single item question is much less reliable and more volatile than a composite index.” – Nigel Hill, Greg Roche and Rachel Allen, Customer Satisfaction: The Customer Experience Through the Customer’s Eyes.
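The reliability argument can be illustrated with a short simulation. This is a sketch under strong, hypothetical assumptions: each question measures the same underlying attitude with independent random error of equal size, and the composite is a simple average of three questions. Real survey items will not satisfy these assumptions exactly.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

N_RESPONDENTS = 5000
NOISE_SD = 2.0  # assumed measurement error per question (hypothetical)
N_ITEMS = 3     # e.g. recommend, satisfaction, repurchase intention

single_item_errors = []
composite_errors = []
for _ in range(N_RESPONDENTS):
    true_loyalty = random.gauss(7.0, 1.5)  # assumed underlying attitude
    items = [true_loyalty + random.gauss(0.0, NOISE_SD) for _ in range(N_ITEMS)]
    # Error of the first question alone versus error of the averaged index.
    single_item_errors.append(items[0] - true_loyalty)
    composite_errors.append(statistics.mean(items) - true_loyalty)

single_sd = statistics.stdev(single_item_errors)
composite_sd = statistics.stdev(composite_errors)
print(f"single-item error SD: {single_sd:.2f}")
print(f"composite error SD:   {composite_sd:.2f}")
```

Under these assumptions, averaging three questions shrinks the error by a factor of roughly the square root of three, which is the statistical basis for the claim that a composite index is more stable than a single item.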

5. The eleven-point scale used by NPS is problematic.

By collapsing the responses into three groups (0-6, 7-8 and 9-10), NPS ignores much of the information it collects. A 3-point scale (1, detractor; 2, neutral; 3, promoter) would be equivalent and would provide the same information. Yet there is no reason to believe that someone scoring the company a zero has the same attitude and customer behaviour as someone scoring it a six. Treating the two alike is nonsense on the face of it, yet NPS does not account for these obvious differences, which it measures but doesn’t use.
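A brief sketch makes the information loss concrete. The two samples below are hypothetical and differ only in how low the detractors score. The NPS and the equivalent 3-point recoding cannot tell them apart, while the mean rating can.

```python
import statistics


def to_three_point(rating):
    """Recode a 0-10 rating as 1 (detractor), 2 (neutral) or 3 (promoter)."""
    if rating >= 9:
        return 3
    if rating <= 6:
        return 1
    return 2


def net_promoter_score(ratings):
    """Percentage of promoters minus percentage of detractors."""
    codes = [to_three_point(r) for r in ratings]
    return 100.0 * (codes.count(3) - codes.count(1)) / len(ratings)


# Two hypothetical samples that differ only among the detractors.
mild = [10, 10, 9, 9, 8, 7, 6, 6, 6, 6]
severe = [10, 10, 9, 9, 8, 7, 0, 0, 0, 0]

for name, sample in [("mild", mild), ("severe", severe)]:
    print(f"{name:7s} NPS = {net_promoter_score(sample):+.0f}, "
          f"mean rating = {statistics.mean(sample):.1f}")
```

Both samples produce an NPS of zero, although their mean ratings are 7.7 and 5.3 respectively.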

Additionally, many psychometricians have suggested that the average respondent can comfortably discriminate among no more than seven points at a time. Thus longer scales may pose difficulties. This may in part be why the NPS 11-point scale has been shown to have lower predictive validity than other scales. The paper “Measuring Customer Satisfaction and Loyalty: Improving the ‘Net-Promoter’ Score” by Daniel Schneider, Matt Berent, Randall Thomas and Jon Krosnick demonstrates that the 11-point scale has the lowest predictive value of any of the scales tested. The authors recommend a 7-point scale with labeled ends and midpoint for the NPS question. The authors also recommend a bipolar scale for a reworded variant.

In response to these issues, we suggest continuing to use the recommend question, but substituting “willing to recommend” for “likely to recommend”. We believe that a 7-point scale is optimum for this question and that the 11-point scale should be avoided. Finally, we recommend against using the NPS framework itself. Instead, use the recommend question as the outcome variable in a driver analysis, or as one element, together with other factors, of a loyalty scale.
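As a rough illustration of using the recommend question as an outcome variable, the sketch below ranks two attribute ratings by the strength of their correlation with willingness to recommend. Everything here is hypothetical: the attribute names, the eight respondents and their 7-point ratings are invented for illustration, and simple correlation is only a stand-in for a full driver analysis, which would more commonly use regression or a similar multivariate method.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical 7-point ratings from eight respondents, for illustration only.
willing_to_recommend = [7, 6, 5, 4, 3, 2, 6, 5]
attribute_ratings = {
    "support": [7, 6, 5, 4, 3, 2, 5, 6],
    "price": [4, 4, 3, 5, 4, 3, 5, 4],
}

# Rank attributes by how strongly they move with the outcome variable.
drivers = sorted(
    ((name, pearson(scores, willing_to_recommend))
     for name, scores in attribute_ratings.items()),
    key=lambda pair: abs(pair[1]),
    reverse=True,
)
for name, r in drivers:
    print(f"{name:8s} r = {r:.2f}")
```

Unlike the NPS, this approach uses the full range of each rating and points toward the attributes most closely associated with willingness to recommend.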

