Working with Large Datasets
Loading an entire table into memory works fine for small datasets. It does not work when your table has millions of rows. Laravel provides several tools to process large volumes of data efficiently without exhausting your server's memory.
Chunking
`chunk()` retrieves a fixed number of rows at a time and passes them to a callback. Laravel issues a new query for each chunk, keeping memory usage flat:

```php
User::orderBy('id')->chunk(1000, function (Collection $users) {
    foreach ($users as $user) {
        // Process each user
        $user->update(['last_synced_at' => now()]);
    }
});
```
Return `false` from the callback to stop early:

```php
User::orderBy('id')->chunk(500, function (Collection $users) {
    foreach ($users as $user) {
        if ($user->balance < 0) {
            // Stop processing when we hit a negative balance
            return false;
        }
    }
});
```
Lazy Collections
Laravel's `lazy()` method returns a `LazyCollection` backed by PHP generators. Under the hood it still queries in chunks (1,000 rows by default), but instead of handing each chunk to a callback it yields a single flattened stream of models, so memory stays flat and the iteration code stays simple:

```php
User::orderBy('id')->lazy()->each(function (User $user) {
    $user->update(['status' => 'migrated']);
});
```
You can chain standard collection methods such as `filter`, `map`, and `take` on a lazy collection, and they evaluate lazily:

```php
User::orderBy('id')->lazy()
    ->filter(fn (User $user) => $user->is_active)
    ->take(100)
    ->each(function (User $user) {
        SendNewsletterJob::dispatch($user);
    });
```
Cursor Pagination
Offset-based pagination (`LIMIT x OFFSET y`) becomes painfully slow on large tables because the database must scan and skip `y` rows every time. Cursor pagination avoids this by using a `WHERE id > last_seen_id` clause instead:

```php
$users = User::orderBy('id')->cursorPaginate(25);
```
The resulting paginator exposes `next_cursor` and `prev_cursor` values that the client passes back as the `cursor` query parameter on the next request. This stays efficient regardless of table size.
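As a minimal sketch (the `/api/users` route is illustrative, not from the original), an API endpoint can simply return the paginator: Laravel serializes it with `next_cursor` and `prev_cursor` strings and reads the `cursor` query parameter automatically on the following request:

```php
use App\Models\User;
use Illuminate\Support\Facades\Route;

// Hypothetical endpoint: clients follow next_cursor by requesting
// /api/users?cursor=<value>, so no OFFSET scan is ever performed.
Route::get('/api/users', function () {
    return User::orderBy('id')->cursorPaginate(25);
});
```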
paginate() vs simplePaginate() vs cursorPaginate()
| Method | Knows total pages? | Efficient on large tables? | Use when |
|---|---|---|---|
| `paginate()` | Yes | No (runs an extra COUNT query) | Small-to-medium tables, admin interfaces |
| `simplePaginate()` | No | Partly (skips the COUNT query, but deep pages still pay the OFFSET cost) | You only need next/previous links |
| `cursorPaginate()` | No | Yes | Very large tables, infinite scroll |
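For comparison, `simplePaginate()` is a drop-in swap on the same query; a brief sketch:

```php
// No COUNT query is run, so only "next" and "previous" links are available,
// but deep pages still pay the LIMIT/OFFSET cost.
$users = User::orderBy('id')->simplePaginate(25);
```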
Inserting Large Datasets
Inserting rows one at a time is slow. Use `insert()` to batch them:

```php
$records = collect(range(1, 10000))->map(fn ($i) => [
    'name' => "User {$i}",
    'email' => "user{$i}@example.com",
    'created_at' => now(),
    'updated_at' => now(),
]);

// Insert in batches of 1,000
$records->chunk(1000)->each(function (Collection $batch) {
    User::insert($batch->toArray());
});
```
For upserts (insert, or update if a matching row already exists), Laravel provides `upsert()`. Note that it expects a plain array, so the collection built above is converted with `all()`:

```php
User::upsert($records->all(), ['email'], ['name', 'updated_at']);
```

The second argument is the unique column(s) used to detect conflicts; the third is the list of columns to update when a conflict occurs. On every database except SQL Server, the columns in the second argument must be covered by a primary or unique index.
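For very large payloads, the same batching idea used with `insert()` applies to `upsert()` as well. A minimal sketch (the batch size of 1,000 is an assumption, not something the original specifies):

```php
use Illuminate\Support\Collection;

// Upsert in batches so each statement stays a manageable size.
$records->chunk(1000)->each(function (Collection $batch) {
    User::upsert($batch->all(), ['email'], ['name', 'updated_at']);
});
```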
Tips
- Prefer `lazyById()` over `lazy()` when your table has an auto-incrementing primary key. It uses keyed chunking under the hood, which is more efficient and avoids issues with rows being added or deleted mid-scan (see the sketch after this list).
- Always add an `orderBy` clause when chunking or using cursor pagination. Without a stable sort order, rows can appear in different positions between queries and you may miss or duplicate data.
- For long-running batch work, dispatch a queued Laravel job so the processing does not block a web request.
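As a rough sketch combining the first and last tips (the `SyncUsersJob` class name and the chunk size are assumptions; `last_synced_at` comes from the earlier chunking example), a keyed scan can be wrapped in a queued job:

```php
<?php

namespace App\Jobs;

use App\Models\User;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class SyncUsersJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function handle(): void
    {
        // lazyById() pages with a WHERE id > last_seen_id clause
        // (1,000 rows per query), so rows added or deleted mid-scan
        // are not skipped or duplicated.
        User::lazyById(1000)->each(function (User $user) {
            $user->update(['last_synced_at' => now()]);
        });
    }
}
```

Dispatch it with `SyncUsersJob::dispatch()` from a controller or scheduled command, and the scan runs on a queue worker instead of inside a web request.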