Scroll-based interaction is incredibly popular for interactive storytelling. There are many compelling reasons for this, yet scrolling is surprisingly nuanced and easy to break. So here are five rules for employing scrolling effectively.
As a form of interaction, scrolling is ubiquitous; we do it constantly. You will “interact” with this static page by scrolling while reading. Scrolling’s ubiquity makes it almost effortless, as you need not consciously coordinate your hand and eye as you scan. In contrast, clicking on a tab or stepper is a comparatively complex action that requires greater deliberation. Making content visible by scrolling is almost always better than hiding it behind a click.
Scrolling is especially good for narratives because it is linear: you can only scroll in one dimension.* It provides a single track with no branching, starting at the top and progressing continuously to the bottom. The only choice the user must make is whether to go forward to new content or backward to content already seen. With a click-based interface, it’s easy to overload the reader with choice, and even pieces that encourage exploration often have a primary path favored by the author.
While scrolling is efficient, the eye can scan even faster without scrolling, so it is preferable to avoid hiding content in the first place. Custom scrolling gives you a lot of freedom, but remember that browser windows come in a range of aspect ratios and sometimes it’s useful to show more content simultaneously. Also beware of full-screen designs that lack any scrolling affordance: the reader might not realize they can scroll. A visual indication of additional content, either as partially visible content or an explicit prompt, keeps the reader from being stranded on the cover.
As readers, we have strong expectations of how scrolling should behave. As designers, we may yet change scrolling behavior, but too much change risks confusing or frustrating readers, a phenomenon known as scrolljacking.
In typical usage, scrolling is a continuous gesture: it is only constrained by the height of the page. The reader can slide the viewport to any position between the top and bottom. In contrast, pages are not typically an endless column of homogeneous text but a composition of discrete sections, such as photos, graphics and videos. How should we optimize the display for heterogeneous content, given that scrolling is continuous? How should we isolate an individual graphic or video to present it dramatically, rather than awkwardly cropped and cluttered by the arbitrary viewport?
There are two schools of thought here. The first is to constrain scrolling to fixed positions at content boundaries. The second is to leave scrolling as-is and instead adapt the content to the viewport.
The first approach fundamentally alters the experience of scrolling. Instead of continuously sliding the viewport, the reader now performs discrete swipe-down or swipe-up gestures. Yet without knowing whether the reader’s fingers are touching the input device, it is difficult to detect swipes;* we must instead interpret wheel events and guess (poorly)! Rolling your own swipe gesture detection leads to several common problems:
It’s extraordinarily sensitive. Even the smallest wheel event — scrolling a single pixel — is interpreted as scrolling an entire window height, an effective magnification of three orders of magnitude!
Swipe gestures that occur during the animation of a previous swipe are either dropped on the floor or queued until the animation finishes. This makes scrolling feel laggy or unresponsive and is exacerbated by sensitivity; there’s no way to interrupt an accidental scroll.
You cannot control the speed of the swipe animations, which frustrates readers who want to scroll faster (or slower) and prevents rapid visual scanning of the entire page. Even a quick peek ahead requires sitting through two animations.
You lose the scroll bar, so you can no longer click and drag the scroll thumb for direct manipulation of the viewport or click the scroll bar to jump to an arbitrary point. And you lose a visual representation of the currently-visible region of the page. The custom replacement — tiny vertical dots — restores only a subset of this functionality.
You lose standard keyboard controls (see #5).
You lose rubber-banding at the top and bottom of the page, abandoning helpful visual feedback and further making the page feel unresponsive.
The second approach retains standard scrolling behavior but alters the display based on the current viewport through the use of position-fixed elements. The page is divided vertically into “screens” with scroll-based thresholds to trigger transition between screens.
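As a rough illustration of the threshold idea, the sketch below maps a scroll offset to the index of the "screen" that should be shown. The function name `activeScreen` and the threshold values are assumptions made for this example, not part of any particular library or of the pieces discussed here.

```typescript
// Hypothetical helper: `thresholds` holds the scroll offsets (in px, ascending)
// at which the page switches to the next screen. Returns the index of the
// screen to display at scroll offset `y`: 0 before the first threshold,
// 1 once the first threshold is reached, and so on.
function activeScreen(thresholds: number[], y: number): number {
  let index = 0;
  for (const t of thresholds) {
    if (y < t) break; // thresholds are ascending, so no later one can match
    index++;
  }
  return index;
}

// In a browser this would typically be called from a scroll listener, e.g.
//   window.addEventListener("scroll", () => show(activeScreen(ts, window.scrollY)));
// where `show` and `ts` are placeholders for your own code.
```

Because the native scroll position remains the only source of truth, the scroll bar, keyboard shortcuts, and rubber-banding keep working; only the choice of which fixed element to show is derived from it.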
Transitions between screens may be time-based with an instantaneous trigger (such as a 250ms cross-fade when pageYOffset crosses 1200px), or position-based within a transitional region (such as a cross-fade from zero opacity at 1100px to full opacity at 1200px). Position-based transitions automatically adjust speed to match how the reader scrolls; however, triggered time-based transitions avoid the possibility of sitting in a transition indefinitely, which can be unsettling. Importantly, even when time-based transitions are used, the reader can still interrupt the transition and has full control over the viewport.
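The two kinds of transition can be sketched as a pair of small functions, using the 1100px and 1200px figures from the example above. The names `crossFadeOpacity` and `crossedThreshold` are hypothetical, and this is one possible way to structure the logic rather than a definitive implementation.

```typescript
// Position-based: opacity is a linear function of the scroll offset `y`
// inside the transitional region [start, end], clamped to [0, 1] outside it.
function crossFadeOpacity(y: number, start: number, end: number): number {
  if (end <= start) return y >= end ? 1 : 0; // degenerate region: hard switch
  const t = (y - start) / (end - start);
  return Math.min(1, Math.max(0, t));
}

// Time-based trigger: report whether the scroll offset crossed `threshold`
// between two consecutive scroll events. Returns 1 when crossing downward
// (past the threshold), -1 when crossing back up, and 0 otherwise.
function crossedThreshold(previousY: number, currentY: number, threshold: number): -1 | 0 | 1 {
  if (previousY < threshold && currentY >= threshold) return 1;
  if (previousY >= threshold && currentY < threshold) return -1;
  return 0;
}

// In a browser, the first result could be assigned to element.style.opacity on
// every scroll event. The second could toggle a CSS class whose rule declares
// `transition: opacity 250ms`, so the fade runs on its own clock once triggered.
```

In both cases scrolling itself is left alone, so the reader can reverse direction mid-transition and the display follows.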
As you may have guessed, I am a strong proponent of the second approach.* It preserves direct manipulation whereas the first approach feels indirect and detached; I would rather scroll the page myself than hand-wave a butler to advance the page on my behalf. The first school seems to value the purity of the designer’s aesthetics over the user’s experience. Rapid, incremental, reversible scrolls are more usable than slow, animated swipes.
But don’t take my word for it; you decide! Two well-known examples of the first school are Huge’s home page and Apple’s Mac Pro page. Two examples of the second school with position-based scroll transitions are A Game of Shark and Minnow and our preview of Ted Ligety for Sochi 2014. For an example of triggered time-based transitions, see my Stack presentation library.
Scrolling should provide instantaneous visual feedback in response to user input; when you scroll using the mouse, touch, or keyboard, you should instantly see the display respond. This feedback is an essential component of direct manipulation and lets you rapidly adjust your input to scroll the desired distance and velocity.
When implementing custom scrolling, provide feedback by having at least some visible content scroll normally at all times. This means content that moves in direct proportion to the amount scrolled, and thereby responds consistently and predictably to input. Don’t simply make everything position-fixed; if nothing moves when the reader scrolls, even temporarily, the page will again feel laggy or unresponsive. This is why Shark and Minnow has text scrolling on top of the video. If you find free-floating text too messy, use something smaller; any visible element that scrolls normally should suffice.
This advice applies not just to scroll-based transitions between screens, but any time you alter normal scrolling behavior. In 342,000 Swings Later, Derek Jeter Calls It a Career, a looped animation of Jeter’s swing zooms out as you scroll down, offering a sense of scale. The scrolling text in the foreground serves as an anchor of predictable behavior while the background animation zooms dramatically.
The viewport does not precisely correspond with the reader’s eye position; a typical viewport is many times larger than the area of foveal vision. Use caution playing animations or video as soon as the elements scroll into view, particularly with audio; even though the elements are visible, the reader may not yet have finished reading preceding text. Surprises are likely to distract the reader’s eye and interrupt reading, and repeated disruptions can frustrate readers.
To play video on scroll safely, isolate it by making it fill the viewport; then you can assume the reader is ready for it, since it will be the only thing visible. Non-disruptive auto-play is simply a matter of cueing so that the reader knows what is about to happen, and can choose to scroll in or wait. It’s only the unexpected trigger that is disruptive. A fade-to-black can further reinforce the visual cue of an imminent video.
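One way to express "play only when the video fills the viewport" is a small predicate over the element's viewport-relative box. The name `fillsViewport`, the `Box` shape, and the optional `tolerance` parameter are assumptions for this sketch; in a browser the box would come from `getBoundingClientRect()`.

```typescript
// Minimal shape of a viewport-relative bounding box; a DOMRect returned by
// element.getBoundingClientRect() has these fields.
interface Box {
  top: number;
  bottom: number;
}

// Returns true when the box covers the full height of the viewport, i.e. its
// top edge is at or above the viewport top and its bottom edge is at or below
// the viewport bottom. `tolerance` (px) is a hypothetical allowance for
// near misses.
function fillsViewport(box: Box, viewportHeight: number, tolerance = 0): boolean {
  return box.top <= tolerance && box.bottom >= viewportHeight - tolerance;
}

// Possible wiring in a browser (names are placeholders):
//   if (fillsViewport(video.getBoundingClientRect(), window.innerHeight)) video.play();
//   else video.pause();
```

This triggers later than a plain "has scrolled into view" check, which is the point: playback waits until the video is the only thing the reader can be looking at.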
While we may primarily consider the mouse wheel or touch pad, scrolling supports secondary input modes as well: pressing an arrow key, clicking the scroll bar or dragging the scroll thumb. There are at least twelve standard keyboard shortcuts for scrolling.
It’s reasonable to override some of these to tweak scrolling behavior, say to change up and down arrows to jump between discrete screens of content rather than individual lines of text. But don’t inadvertently break other shortcuts when you implement custom keyboard controls. And most importantly, don’t break browser history navigation (command-left, command-right, delete)!
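A cautious keyboard override might look like the following sketch: it claims only the unmodified up and down arrow keys and reports "not handled" for everything else, so modified combinations such as command-left are left to the browser. The function name `screenStep` and the `KeyInfo` shape are assumptions for this example; a real `KeyboardEvent` provides these fields.

```typescript
// The subset of KeyboardEvent fields this sketch relies on.
interface KeyInfo {
  key: string;
  altKey: boolean;
  ctrlKey: boolean;
  metaKey: boolean;
  shiftKey: boolean;
}

// Returns 1 to advance one screen, -1 to go back one screen, or 0 when the
// key press should be left to the browser's default handling.
function screenStep(e: KeyInfo): -1 | 0 | 1 {
  // Any modifier means this is some other shortcut (history navigation,
  // jump to top or bottom, and so on), so do not intercept it.
  if (e.altKey || e.ctrlKey || e.metaKey || e.shiftKey) return 0;
  if (e.key === "ArrowDown") return 1;
  if (e.key === "ArrowUp") return -1;
  return 0;
}

// Possible wiring in a browser (goToScreen is a placeholder):
//   window.addEventListener("keydown", (event) => {
//     const step = screenStep(event);
//     if (step !== 0) { event.preventDefault(); goToScreen(current + step); }
//   });
```

Calling `preventDefault()` only when the step is nonzero is the important design choice: every key the handler does not explicitly claim keeps its standard behavior.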