Likert Scale

The Likert Scale is a common attitudinal survey format, requiring survey participants to select an option between “Strongly agree” and “Strongly disagree”. The most common scales have 5 or 7 points, but others such as 4 or 3 point scales are also used.

The method is named after Rensis Likert, who published examples of such surveys in the 1932 paper A Technique For The Measurement of Attitudes. Likert is often incorrectly listed as the inventor of the method by people who have not read the original paper. In fact, the paper presents an overview of several types of surveys conducted around the 1930s to poll American audiences on topics such as racial segregation and military interventions, including Yes/No and other types of “surveys of opinions”.

Although Likert did not invent the scales, his important contribution was to popularise them, and to show that simply assigning a numerical value to each point on the scale, then adding up the response values from a set of related questions, provides an easy and useful way to measure people’s attitudes.

Examples of Likert scales

In a wider sense, the Likert scale can refer to any rating between two extreme feelings or behaviours.

Likert’s original paper actually shows approval scales rather than agreement scales, but the disagree/agree opposites are now the most commonly used option in surveys (so much so that some authors insist that only Disagree/Agree scales should be called “Likert scales”). Examples of popular surveys using such scales are SUS and UMUX.

Many survey formats do not label all points on the scale, but just label the extreme values.

UMUX-LITE

# | Question | Response (1 = Strongly Disagree, 7 = Strongly Agree)
1 | [This system’s] capabilities meet my requirements | 1 2 3 4 5 6 7
2 | [This system] is easy to use | 1 2 3 4 5 6 7

The UMUX-LITE survey uses a 7-point Likert scale to capture responses.

Generally, scales with fewer points may leave respondents indecisive, as they might not feel strongly about any of the available options. Scales with more points allow for finer-grained selection, but they can confuse respondents if the difference between adjacent points is not clear. Scales with five to seven points seem to strike a good balance between those forces.

Scales with an odd number of points usually include a neutral or undecided opinion in the middle. You can force the respondents to make a decision by using an even number of potential options that excludes that middle point.

Scoring Likert scale surveys

A common way to score surveys based on Likert scales is to assign a numerical value to each point in the scale (for example 1 for Strongly Disagree, 2 for Disagree, 3 for Neutral and so on), then calculate an average value from all the survey respondents.
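The averaging approach described above can be sketched in a few lines of Python. This is only an illustration: the label-to-number mapping, the function name average_score and the sample responses are assumptions made up for the example, not part of any particular survey format.

```python
from statistics import mean

# Hypothetical 5-point agreement scale; the labels and values are assumptions.
SCALE = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}


def average_score(responses):
    """Map each response label to its numerical value and average the values."""
    return mean(SCALE[label] for label in responses)


# Made-up responses from five participants to a single question.
sample = ["Agree", "Strongly Agree", "Neutral", "Agree", "Disagree"]
print(average_score(sample))
```

Looking up labels in a dictionary means a misspelt or unexpected label raises a KeyError instead of silently skewing the average.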

Some survey formats, such as SUS, alternate between redundant positive and negative statements to detect bias. In such cases, the scoring method requires normalisation. A typical way to do that is to subtract 1 from positive statement scores (normalising the scale to start with 0), and to subtract the negative statement scores from the maximum value (inverting the scale and normalising it to start with 0). For example, choosing 1 on a positive statement would score as 0, but choosing 1 on a negative statement with a 5-point scale would score as 5 - 1 = 4. This makes it possible to calculate an average from both positive and negative statement responses, and even to score the entire survey with a single numerical value.
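As a minimal sketch of the normalisation described above, the following Python snippet converts positive and negative statement responses to a common scale starting at 0. The function names and the sample answers are hypothetical, and this is not the official scoring formula of SUS or any other specific survey, which may apply further scaling.

```python
def normalise(value, positive, points=5):
    """Convert a raw response (1..points) to a score in 0..points-1.

    Positive statements: subtract 1.
    Negative statements: subtract the response from the maximum value.
    """
    if not 1 <= value <= points:
        raise ValueError(f"response {value} is outside 1..{points}")
    return value - 1 if positive else points - value


def survey_score(answers, points=5):
    """Average the normalised scores of (value, is_positive) pairs."""
    scores = [normalise(value, positive, points) for value, positive in answers]
    return sum(scores) / len(scores)


# Made-up answers from one respondent: (raw response, statement is positive?)
answers = [(4, True), (2, False), (5, True), (1, False)]
print(survey_score(answers))
```

After normalisation a higher number always means a more favourable response, regardless of how the statement was worded, so the values can be averaged together.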

How many participants are required for a Likert scale survey?

If using the standard normal (z) distribution, Batterton and Hale suggest including at least 200 to 300 people in Likert-style surveys, quoting work by Carmen R. Wilson VanVoorhis and Betsy L. Morgan. Wilson VanVoorhis and Morgan actually suggest that the survey sample size should depend on the number of independent factors (questions), but as a general rule of thumb they suggest 50 as very poor, 100 as poor, 200 as fair, 300 as good, 500 as very good and 1000 as excellent.

In the book Quantifying The User Experience, Jeff Sauro and James R. Lewis suggest that even smaller samples of 30 or so can be used with the t-distribution.

Interpreting and analysing Likert-scale surveys

In the article The Likert Scale: What It Is and How To Use It, Katherine Batterton and Kimberly Hale point out that a key source of confusion when interpreting Likert scale data is that the individual responses are ordinal, but the aggregate scores can be treated as interval data.

Gail M Sullivan and Anthony R Artino Jr suggest in Analyzing and Interpreting Data From Likert-Type Scales that when looking at individual questions, it’s best to treat data as ordinal and look at the frequency of responses. Effectively, it’s much more useful to know that the most common answer to a question is “Highly Unlikely”, even if the average trends towards neutral.
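To illustrate why frequencies can be more informative than averages for a single question, here is a small Python sketch. The likelihood labels, the sample data and the function name are assumptions invented for the example. The data is deliberately polarised, so the mean lands on the neutral point while the most common answer is at one extreme.

```python
from collections import Counter
from statistics import mean

# Hypothetical 5-point likelihood scale; labels are assumptions.
LABELS = {
    1: "Highly Unlikely",
    2: "Unlikely",
    3: "Neutral",
    4: "Likely",
    5: "Highly Likely",
}


def frequency_table(responses):
    """Count how many times each point on the scale was chosen."""
    counts = Counter(responses)
    return {LABELS[point]: counts.get(point, 0) for point in LABELS}


# Made-up, polarised responses from ten participants.
responses = [1, 1, 1, 1, 5, 5, 5, 3, 4, 4]

table = frequency_table(responses)
most_common = max(table, key=table.get)

print(table)
print("Most common answer:", most_common)
print("Mean:", mean(responses))
```

Here the mean equals the neutral value of 3, even though only one of the ten participants actually chose “Neutral”.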

However, when looking at surveys composed of several related questions, treating the data as numeric intervals and computing a single score is quite useful for tracking relative comparisons. In the book Quantifying The User Experience, Sauro and Lewis suggest computing the mean and standard deviation of the responses, then using confidence intervals based on the t-distribution. The t-distribution is similar to the standard normal distribution for larger sample sizes, and helps to account for smaller groups of participants (30 or fewer).
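The calculation can be sketched as follows, using the usual formula mean ± t × s / √n with n - 1 degrees of freedom. This is an illustration under stated assumptions rather than the exact procedure from the book: the function names and the sample scores are made up, and the t critical value is approximated numerically with the standard library only (a statistics package would normally supply it). The approximation relies on math.gamma, so it is only suitable for small samples.

```python
import math
from statistics import mean, stdev


def t_cdf(x, df, steps=2000):
    """Approximate the Student t CDF for x >= 0 using Simpson's rule."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(v):
        return c * (1 + v * v / df) ** (-(df + 1) / 2)

    h = x / steps
    total = pdf(0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # The distribution is symmetric, so the area left of 0 is 0.5.
    return 0.5 + total * h / 3


def t_critical(df, confidence=0.95):
    """Find the two-sided critical value by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2
    low, high = 0.0, 100.0
    for _ in range(60):
        mid = (low + high) / 2
        if t_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


def confidence_interval(scores, confidence=0.95):
    """Return (low, high) bounds for the mean, based on the t-distribution."""
    n = len(scores)
    centre = mean(scores)
    margin = t_critical(n - 1, confidence) * stdev(scores) / math.sqrt(n)
    return centre - margin, centre + margin


# Made-up overall survey scores from ten participants.
scores = [4, 5, 3, 4, 4, 2, 5, 4, 3, 4]
low, high = confidence_interval(scores)
print(f"mean = {mean(scores):.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

With small samples the t critical value is noticeably larger than the z value of about 1.96, which widens the interval to reflect the extra uncertainty.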

Learn more about the Likert Scale