What Is Interaction Cost?

Interaction cost is the sum of efforts — mental and physical — that users must deploy in interacting with a digital product in order to reach their goals.

Ideally, we’d like users to go to a site and find the answer they’re looking for immediately. That would mean zero interaction cost and is the holy grail of usability as a field.

Unfortunately, zero interaction cost is rarely attainable, since most sites and apps offer many things that users may want to do. Most of the time, users have to look around, read, possibly scroll, find a promising link, click on it, wait for the page to load, and then repeat the process all over again. Sometimes a new window may pop up on top of the existing one, and in that case, users must switch attention to the new window and perhaps also look back to the old one to integrate information in both windows. In other situations, users may need to remember information on one page and apply it on a different one. All these actions require effort, both mental and physical, and together they make up the interaction cost.

Usable sites minimize the interaction cost required to attain a variety of user goals. That is, they minimize:

  • Reading
  • Scrolling
  • Looking around in order to find relevant information
  • Comprehending information presented to them
  • Clicking or touching (without making mistakes)
  • Typing
  • Page loads and waiting times
  • Attention switches
  • Memory load: the information that users must remember in order to complete their task

These user actions contribute differently to the total interaction cost.

  • Their relative importance may depend on the user: for example, dyslexic users may have a harder time reading than clicking around, whereas users with motor impairments may find clicking more difficult.
  • They also depend on the device: a page load on a desktop connected to a high-speed network may be insignificant, but a page load on a mobile device may take a very long time if the cellular connection is slow.

Many usability guidelines address the question of minimizing the various components of the interaction cost. For instance, the rules of writing for the web lower the cost of reading by recommending bullet points and short, to-the-point sentences and paragraphs.

An Example of Interaction Cost

Let’s take a simple example. Assume we want to find where the word “ceremony” comes from. We’ll use the Dictionary.com iPhone app for this task. We ignore the cost involved in finding the app on the phone, and we start our analysis immediately after launching the Dictionary app.

Dictionary.com displays on a blue background
The first thing that appears after starting the app is a splash screen.

At this point, the interaction cost involves waiting for a few seconds for the splash screen to disappear and make room for the first actionable screen of the app:

Word of the day, cummerbund and definition
On this page, the interaction cost comes from locating the search box and moving the finger to it to start typing. Locating the box may take some effort because of the competing graphical elements on the screen. The search box is a fairly big, easy-to-touch target, so moving the finger to it will likely be low-effort.

Next, users have to enter the search query.

A dropdown list of recent queries displays under the search input field.
When the input focus moves to the search field, the main screen displays a list of recent queries. These suggestions potentially lower the interaction cost if the user happens to search for something they’ve searched for before. However, if the user is looking for a different word (as in our case), the suggestions will increase the interaction cost, because the user might spend time scanning the list before deciding that none of the suggestions is relevant.

As users start to type, a new list of autosuggestions that match the typed letters is automatically displayed. Users could look at the autosuggestions and decide if they want to continue typing or select something from the list.

Typing into the search changes the words on the suggestion list with each character typed.
As the user starts typing the word “ceremony”, suggestions are displayed underneath. The user can inspect the suggestions and decide whether they want to continue typing or stop and pick a suggestion.
The list shortens until only one suggestion is left, the one that matches the typed-in characters.
It’s likely that the user will type until their target word becomes visible in the suggestion box and then pick it.

Once the word “ceremony” has been selected (or typed), the users have to press Search to get to the result page. They need to wait for a few moments for the new page to appear:

Main and alternate meanings for the word are shown, along with the part of speech (noun, etc.) and derivative words, such as the plural form.
On this page, some users will probably scroll down to find out whether the etymology is listed farther down the page. Others may notice the tabs at the top and realize that they can scroll horizontally to see more options.
Besides the current tab "DEFINITION", there are multiple other tabs on this page, including "LEARNERS" and "GRAMMAR". More tabs appear when users scroll to the right.
If the users decide to look into the options in the tabs, they will scroll horizontally to explore additional tabs. None of the tabs is called Origin or Etymology, so they will go back to the current tab and inspect it before deciding to select another.
A section titled "ORIGIN OF CEREMONY" appears after scrolling down in the default tab.
When users scroll down in the default tab, they will find the ORIGIN OF CEREMONY section following the definition.

Finally, users will spend time reading the explanation in the ORIGIN OF CEREMONY section.

Let’s summarize the various components of the interaction cost to find the origin of the word “ceremony”:

  1. Wait for the splash page
  2. Search
    • Find the search box and tap to move the input focus to it
    • Scan the list of recently searched queries
    • Decide that the recent queries are not relevant
    • Type and/or choose autosuggestions
      • Enter a few characters
      • Scan the list of autosuggestions to see whether the desired word is among them
        • If no, enter more characters and repeat the previous step
        • If yes, choose the desired word by tapping it
    • Tap Search
  3. Wait for the result page
  4. Find where the relevant etymology information may be on the result page
    • Notice the tabs and scan visible tab labels
      • Notice that there are more hidden tabs to the right
      • Infer that etymology may be one of the hidden tabs
    • Look for etymology in the tabs
      • Remember that swiping exposes content to the right
      • Swipe to the right
      • Decide the tabs are not relevant
      • Swipe back to the original DEFINITIONS tab
    • Scroll down the DEFINITIONS tab content and scan the content to find the etymology information
    • Notice ORIGIN OF CEREMONY section
  5. Read about where the word “ceremony” comes from in the ORIGIN OF CEREMONY section

As you can see, a fairly simple and painless process takes a lot of steps and substeps; each of them incurs an interaction cost.
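
To make the idea of a summed cost concrete, here is a minimal sketch in Python that tallies the top-level steps listed above. The effort scores are purely hypothetical placeholders (they were not measured in this analysis), and the function names are illustrative only; the point is simply that the total interaction cost is the sum of the per-step costs, and that a breakdown shows where the effort concentrates.

```python
# A minimal sketch: total interaction cost as the sum of per-step costs.
# All effort scores below are HYPOTHETICAL, in arbitrary units; they are
# not measurements from the Dictionary.com example.
steps = [
    ("wait for the splash page", 2.0),
    ("find and tap the search box", 1.5),
    ("scan the recent queries", 1.0),
    ("type and/or choose an autosuggestion", 3.0),
    ("tap Search", 0.5),
    ("wait for the result page", 2.0),
    ("locate the etymology on the result page", 4.0),
    ("read the ORIGIN OF CEREMONY section", 2.5),
]


def total_interaction_cost(step_costs):
    """Add up the cost of every step in the task."""
    return sum(cost for _, cost in step_costs)


def costliest_step(step_costs):
    """Return the (name, cost) pair with the highest cost."""
    return max(step_costs, key=lambda step: step[1])


print("Total cost:", total_interaction_cost(steps))
print("Costliest step:", costliest_step(steps)[0])
```

With real data, the scores could come from timing observations or an expert estimate; under these made-up numbers, the sketch would point to locating the etymology as the step most worth optimizing.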

For some steps, the interaction cost is insignificant. For instance, remembering that swiping to the right exposes more content has a very low interaction cost, because people have encountered horizontal scrolling many times before on mobile devices or on the web.

Other steps can be optimized to minimize the interaction cost. For example, the placement and the visual design of the search field impact how quickly people locate it on the page. Similarly, making the buttons and search fields big can help with tapping the targets. A descriptive heading in plain language like ORIGIN OF CEREMONY also helps users quickly find the information they are looking for. (Though, in this case, using all caps makes the label harder to read.)

Expected Utility

Note that for some of the steps in the previous sections, users have multiple choices. For instance, they can either pick a suggestion from the autosuggest list or type the string to the end.

How do people decide which action to pick? The answer lies in the concept of expected utility:

Expected utility = Expected benefits – Expected interaction costs

Users try to maximize the expected utility of an action: In other words, they weigh the benefits and the costs of each action, and they choose the one that has the best balance of benefits versus costs.

When there are several ways to reach the same goal with similar benefits, users typically tend to pick actions that minimize the estimated interaction cost.

For instance, many people will not scroll down the list of autosuggestions to find the word “ceremony”; they would rather type one or a few more characters until the word becomes visible. The cost of scrolling down the small list and scanning it for the right word is higher than the cost of typing those extra characters.

The list of suggestions shortens to display the target word at the bottom of the list, just above the onscreen keyboard.
Autosuggestions are less accurate when fewer characters are typed out. As a result, the target word may appear lower in the list or not at all, increasing the interaction cost of scanning the suggestions. In such cases, it's often faster to finish typing the word than to sift through the list.
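
The typing-versus-scrolling choice can be sketched as a small expected-utility comparison. The Python snippet below is only an illustration: the `Action` class, the `choose_action` function, and all benefit and cost values are hypothetical, chosen so that both actions lead to the same result (the same benefit) while differing in estimated cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A possible user action with HYPOTHETICAL benefit and cost estimates."""
    name: str
    expected_benefit: float  # arbitrary units, assumed for illustration
    expected_cost: float     # arbitrary units, assumed for illustration

    @property
    def expected_utility(self) -> float:
        # Expected utility = expected benefits - expected interaction costs
        return self.expected_benefit - self.expected_cost


def choose_action(actions):
    """Pick the action with the highest expected utility."""
    return max(actions, key=lambda action: action.expected_utility)


# Both actions reach the same result page, so the benefit is identical;
# only the estimated interaction cost differs (made-up values).
options = [
    Action("scroll the suggestion list and tap the word", 10.0, 4.0),
    Action("type a few more characters", 10.0, 2.5),
]

best = choose_action(options)
print(best.name, best.expected_utility)
```

Because the benefits are equal here, maximizing expected utility reduces to picking the action with the lowest estimated cost, which is the behavior described above.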

This type of thinking generalizes at the site level as well. If it looks like it is going to be really hard to reach their goal on any given site, most users will just move to another site with a lower estimated interaction cost unless the benefit of interacting with the initial site is really high. To give an example, if the user really wants to buy an Apple computer, they probably are going to stick with Apple’s site because it’s unlikely that they will be able to buy it elsewhere. In this case, the user motivation is really high, so they may be willing to put up with a high interaction cost. However, if the user wants to buy a grill, they may not care if they buy it from Home Depot or Lowe's or some other site, and they will navigate away from sites that have high interaction costs.

Marketing and branding usually have the job of increasing the user motivation and expected benefits for engaging with a particular site or brand; usability deals with lowering the interaction cost. Both methods ultimately address the issue of increasing the expected utility of using a site or a piece of software.

Why You Should Care About Interaction Cost

Interaction cost is a direct measure of usability. In fact, the concept was introduced back in the early days of human-computer interaction to evaluate the usability of a software system. Usability heuristics all aim, in one way or another, to reduce the interaction cost for the user.

A quick assessment of the interaction cost of a design can save a lot of money in the long run, as it can give you a good measure of how difficult the interface is going to be for the user. It can also serve as a comparison tool between design alternatives: usually, the one that minimizes the interaction cost has a better chance of success.