Streaming and Suspense in Next.js

How streaming works with the App Router, what loading.tsx gives you for free, and how to wrap slow components in Suspense to keep the rest of the page f...

Most pages have at least one slow bit. Maybe it's a database query that takes 400ms, or a third-party API call you can't cache. With a traditional server-rendered page, that slow bit blocks everything. The browser stares at a blank screen until the server is done. Streaming fixes this by sending HTML in chunks as it becomes ready, so the fast parts of your page arrive while the slow parts are still being computed.

The App Router is built around this. Streaming works by default, with no configuration needed. But knowing how it works helps you use it deliberately.

How Streaming Works

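HTTP/1.1 has supported chunked transfer encoding for a long time. The server sends a response with Transfer-Encoding: chunked, then writes data to the connection in pieces rather than buffering the whole thing. The browser parses and renders each chunk as it arrives. (HTTP/2 and HTTP/3 drop that header and send the body as a sequence of data frames instead, but the effect for the page is the same.)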
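As a rough illustration of what chunked framing looks like on the wire, here is a minimal sketch in TypeScript. The function name encodeChunked is made up for this example. In a real app the HTTP server does this framing for you, so you would never write it by hand.

```typescript
// Illustrative only: frames each piece of a body the way HTTP/1.1
// chunked transfer encoding does. `encodeChunked` is a hypothetical helper.
function encodeChunked(parts: string[]): string {
  const encoder = new TextEncoder()
  let wire = ''
  for (const part of parts) {
    // The chunk size is the byte length of the chunk, written in hexadecimal.
    const size = encoder.encode(part).length
    if (size === 0) continue // a zero-length chunk would end the body early
    wire += size.toString(16) + '\r\n' + part + '\r\n'
  }
  // A zero-length chunk marks the end of the body.
  return wire + '0\r\n\r\n'
}

// Two pieces of HTML sent as two separate chunks.
console.log(JSON.stringify(encodeChunked(['<header>Dashboard</header>', '<main>Activity</main>'])))
```

Each chunk carries its own length, so the receiver can start processing it without knowing how long the full response will be.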

React's server rendering takes advantage of this. When you render a Server Component tree, React can flush the already-resolved HTML down to the browser immediately, then send additional chunks as Suspense boundaries inside the tree resolve. The browser gets a progressively more complete page rather than waiting for everything at once.

In practice, this means:

  • Your page shell (header, nav, layout chrome) arrives fast
  • Slow sections show a fallback while their data loads
  • The full page fills in without a full navigation or any client-side fetch
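The flush-then-fill behavior can be modeled in a few lines of plain TypeScript. This is a conceptual sketch, not React's actual implementation: streamPage, the Section type, and the template wrapper are all invented for illustration, and section ids are assumed to be unique. It writes the shell first, then writes each section in whatever order its data resolves.

```typescript
// Conceptual model of streaming; every name here is hypothetical.
type Section = { id: string; html: Promise<string> }

async function streamPage(
  shell: string,
  sections: Section[],
  write: (chunk: string) => void,
): Promise<void> {
  // 1. The shell, with fallbacks in place, goes out immediately.
  write(shell)

  // 2. Tag each pending section with its id so we know which one finished.
  const pending = new Map<string, Promise<{ id: string; html: string }>>()
  for (const section of sections) {
    pending.set(section.id, section.html.then((html) => ({ id: section.id, html })))
  }

  // 3. Write each section as soon as it resolves, in resolution order.
  while (pending.size > 0) {
    const done = await Promise.race(Array.from(pending.values()))
    pending.delete(done.id)
    write(`<template data-for="${done.id}">${done.html}</template>`)
  }
}

// Simulated data source that resolves after `ms` milliseconds.
const after = (ms: number, html: string) =>
  new Promise<string>((resolve) => setTimeout(() => resolve(html), ms))

streamPage(
  '<nav>Dashboard</nav><div id="activity">Loading</div>',
  [
    { id: 'activity', html: after(300, '<ul><li>Signed in</li></ul>') },
    { id: 'summary', html: after(50, '<p>Balance</p>') },
  ],
  (chunk) => console.log(chunk),
)
```

In this run the summary chunk is written before the activity chunk even though it is listed second, because the order depends on when each promise resolves rather than on the order in the array.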

loading.tsx: Suspense for Free

The easiest way to get streaming in the App Router is loading.tsx. Drop one in any route segment folder and Next.js wraps the page in a Suspense boundary automatically, using your loading file as the fallback.

app/
└── dashboard/
    ├── page.tsx
    └── loading.tsx   <- automatic Suspense boundary

// app/dashboard/loading.tsx
export default function Loading() {
  return (
    <div className="space-y-4 p-6">
      <div className="h-8 w-48 animate-pulse rounded bg-gray-200" />
      <div className="h-4 w-full animate-pulse rounded bg-gray-200" />
      <div className="h-4 w-3/4 animate-pulse rounded bg-gray-200" />
    </div>
  )
}

While the page component is still awaiting its data, the browser shows your loading UI. When the data is ready, React streams the real content and swaps it in. From the user's perspective, the page structure appears immediately and the content follows.

This is also why navigations feel fast in the App Router. Next.js shows the loading.tsx fallback instantly on navigation, even before the new page has fetched anything. The user gets feedback right away.

Wrapping Individual Components in Suspense

loading.tsx covers the whole page. That's useful, but sometimes you want finer control. If a page has one slow section and three fast sections, you don't want to hide everything behind a spinner. You want to show three sections immediately and stream in the fourth when it's ready.

That's where explicit Suspense boundaries come in.

// app/dashboard/page.tsx
import { Suspense } from 'react'
import { RecentActivity } from '@/components/recent-activity'
import { AccountSummary } from '@/components/account-summary'
import { ActivitySkeleton } from '@/components/skeletons'

export default function DashboardPage() {
  return (
    <div className="grid grid-cols-2 gap-6">
      <AccountSummary />
      <Suspense fallback={<ActivitySkeleton />}>
        <RecentActivity />
      </Suspense>
    </div>
  )
}

AccountSummary fetches fast data (or no data at all) and renders immediately. RecentActivity hits a slow API, so it's wrapped in Suspense with a skeleton fallback. The browser gets the account summary right away while the activity feed catches up in its own time.

The key is that RecentActivity itself is a Server Component that does its own data fetching:

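// components/recent-activity.tsx
// Hypothetical data helper: the import path is an assumption, point it at your own code.
import { fetchRecentActivity } from '@/lib/activity'

export async function RecentActivity() {
  const activity = await fetchRecentActivity() // slow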
  return (
    <ul>
      {activity.map((item) => (
        <li key={item.id}>{item.description}</li>
      ))}
    </ul>
  )
}

No useEffect, no loading state in the component, no client-side fetch. The component just awaits its data and returns JSX. Suspense handles the rest.
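If you want to try this without a real slow API, a stand-in for fetchRecentActivity could look like the sketch below. The ActivityItem shape, the sample data, and the 400ms default delay are assumptions made for illustration, not part of any real backend. In a project you would export the function from whichever module your component imports it from.

```typescript
// Hypothetical stand-in for a slow data source.
type ActivityItem = { id: string; description: string }

// Resolves after `ms` milliseconds.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

async function fetchRecentActivity(delayMs = 400): Promise<ActivityItem[]> {
  await sleep(delayMs) // simulate the slow query or API call
  return [
    { id: '1', description: 'Signed in from a new device' },
    { id: '2', description: 'Updated billing details' },
    { id: '3', description: 'Invited a teammate' },
  ]
}
```

Raising the delay makes the skeleton fallback stay on screen longer, which is a quick way to check how the page behaves when the data really is slow.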

Skeleton UIs

A skeleton is a placeholder that mimics the shape of the real content. It gives users a sense of what's coming, which feels faster than a spinner even if the actual load time is identical.

// components/skeletons.tsx
export function ActivitySkeleton() {
  return (
    <ul className="space-y-3">
      {Array.from({ length: 5 }).map((_, i) => (
        <li key={i} className="flex items-center gap-3">
          <div className="h-8 w-8 animate-pulse rounded-full bg-gray-200" />
          <div className="h-4 flex-1 animate-pulse rounded bg-gray-200" />
        </li>
      ))}
    </ul>
  )
}

Keep skeletons close to the components they stand in for. They're part of the component's public contract: here's what you'll see while I'm loading.

The Performance Case

The reason to care about this is straightforward. A page that blocks on its slowest query is only as fast as that query. If you have a page with four data sources and one of them takes 800ms, the whole page takes 800ms minimum.

With Suspense, each slow section is isolated. The shell and the fast sections are sent as soon as they've rendered, so users can start reading, scrolling, and interacting while the slow parts stream in. Time to First Byte and First Contentful Paint are no longer bottlenecked by your worst query. Largest Contentful Paint, which is a Core Web Vital, improves too, as long as the largest element isn't sitting inside one of the slow boundaries.
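The arithmetic can be sketched with a simple model. It assumes all data sources start fetching in parallel at time zero and ignores network and render time. The visibleAt helper and the latency figures are made up for illustration.

```typescript
// Simplified timing model; assumes parallel fetches and ignores network and render cost.
type Source = { name: string; latencyMs: number }

// Returns the sections a user can see `tMs` milliseconds after the request starts.
function visibleAt(sources: Source[], tMs: number, mode: 'blocking' | 'streaming'): string[] {
  if (mode === 'blocking') {
    // Nothing is sent until the slowest source has finished.
    const slowest = sources.reduce((max, s) => Math.max(max, s.latencyMs), 0)
    return tMs >= slowest ? sources.map((s) => s.name) : []
  }
  // With streaming, each section appears once its own data is ready.
  return sources.filter((s) => s.latencyMs <= tMs).map((s) => s.name)
}

const sources: Source[] = [
  { name: 'summary', latencyMs: 40 },
  { name: 'notifications', latencyMs: 60 },
  { name: 'billing', latencyMs: 120 },
  { name: 'activity', latencyMs: 800 },
]

console.log(visibleAt(sources, 150, 'blocking'))  // []
console.log(visibleAt(sources, 150, 'streaming')) // ['summary', 'notifications', 'billing']
console.log(visibleAt(sources, 800, 'blocking'))  // all four sections
```

At 150ms the blocking page is still blank while the streamed page already shows three of its four sections. Both versions finish at 800ms. Streaming changes what the user sees in the meantime, not the total.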

It also makes slow third-party integrations less catastrophic. If an analytics widget or a recommendation engine occasionally takes two seconds, you can wrap it in Suspense and the rest of the page renders without waiting for it.

Composing Suspense Boundaries

You can use as many Suspense boundaries as you like, side by side or nested. A page can have several sibling boundaries, each streaming independently:

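import { Suspense } from 'react'
// The import paths below are assumptions; adjust them to wherever these components live.
import { KeyMetrics, TrafficChart, TopPages } from '@/components/analytics'
import { MetricsSkeleton, ChartSkeleton, TableSkeleton } from '@/components/skeletons'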

export default function AnalyticsPage() {
  return (
    <div className="space-y-8">
      <Suspense fallback={<MetricsSkeleton />}>
        <KeyMetrics />
      </Suspense>
      <Suspense fallback={<ChartSkeleton />}>
        <TrafficChart />
      </Suspense>
      <Suspense fallback={<TableSkeleton />}>
        <TopPages />
      </Suspense>
    </div>
  )
}

Each section streams in as its data resolves. There's no coordination needed. The boundaries are independent.

The mental model is the same as the App Router overall: each piece of the tree is responsible for its own loading state. Compose them and you get a page that feels fast even when the underlying data isn't.
