API Rate Limiting Implementation
Every public API faces a fundamental challenge: how do you serve thousands of legitimate users while protecting your infrastructure from abuse, scraping bots, and runaway clients? The answer is rate limiting — a technique that controls how many requests a client can make within a given time window. Without it, a single misbehaving client can degrade service for everyone or rack up enormous cloud bills.
In this guide, we'll go deep on rate limiting: the algorithms behind it, HTTP response conventions, real implementations in Node.js/Express and Laravel, Redis-backed distributed solutions, and the nuances that separate a naive implementation from a production-grade one.
GitHub's REST API limits unauthenticated requests to 60/hour and authenticated requests to 5,000/hour. Stripe enforces 100 requests/second. Twitter's API has complex tiered limits. These aren't arbitrary — they protect infrastructure, ensure fair resource sharing, and enable predictable capacity planning.
Rate Limiting Algorithms Explained
There are several algorithms for implementing rate limiting, each with different trade-offs in accuracy, memory usage, and burst tolerance. Understanding them helps you choose the right one for your use case.
Rate Limiting Algorithm Comparison
Fixed Window Counter
The simplest approach: divide time into fixed windows (e.g., every minute) and count requests per window. Fast and memory-efficient, but suffers from a boundary burst problem — a client can make N requests at the end of one window and N more at the start of the next, effectively making 2N requests in a short time.
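To make the trade-off concrete, here is a minimal fixed-window counter in JavaScript — an illustrative in-memory sketch, not production code; the clock is injectable so the window logic is easy to test:

```javascript
// Minimal fixed-window counter (illustrative sketch, not production code).
function createFixedWindowLimiter(limit, windowMs, now = Date.now) {
  const counters = new Map(); // key -> { windowStart, count }

  return function allow(key) {
    const windowStart = Math.floor(now() / windowMs) * windowMs;
    const entry = counters.get(key);

    if (!entry || entry.windowStart !== windowStart) {
      // A new window has begun: reset the counter for this key
      counters.set(key, { windowStart, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}
```

Note how the counter resets abruptly at the window edge — that reset is exactly what permits the boundary burst described above.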
Sliding Window Log
Stores a timestamp log of every request for each user. When a new request arrives, remove all timestamps older than the window and check the count. Perfectly accurate but memory-intensive for high-traffic APIs.
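A sketch of the log-based approach (again with an injectable clock, for illustration only) makes the memory cost visible — one stored timestamp per request:

```javascript
// Sliding-window log (sketch): exact, but memory grows with request volume.
function createSlidingLogLimiter(limit, windowMs, now = Date.now) {
  const logs = new Map(); // key -> array of request timestamps

  return function allow(key) {
    const ts = now();
    // Keep only timestamps still inside the window
    const fresh = (logs.get(key) ?? []).filter((t) => t > ts - windowMs);
    if (fresh.length >= limit) {
      logs.set(key, fresh);
      return false;
    }
    fresh.push(ts);
    logs.set(key, fresh);
    return true;
  };
}
```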
Sliding Window Counter (Recommended)
A hybrid approach: keep two counters (current window, previous window) and estimate the rate using a weighted calculation. This approximates a true sliding window while using constant memory per user — the best balance for most production APIs.
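The weighted calculation is simple arithmetic. A hypothetical helper to illustrate it (the Redis-backed implementation later in this guide applies the same formula):

```javascript
// Sliding-window counter estimate: weight the previous window's count
// by how much of it still overlaps the sliding window.
function estimateSlidingCount(prevCount, currentCount, nowMs, windowMs) {
  const progress = (nowMs % windowMs) / windowMs; // fraction of current window elapsed
  return prevCount * (1 - progress) + currentCount;
}
```

For example, 25% into the current window with 100 requests in the previous window and 20 in the current one, the estimate is 100 × 0.75 + 20 = 95.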
Token Bucket
Each user has a bucket that holds up to N tokens. Tokens refill at a constant rate. Each request consumes one token. This allows short bursts while enforcing long-term rate limits — ideal for APIs where brief spikes are acceptable.
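A token bucket can be sketched in a few lines (illustrative only; the clock is injectable for testability). Capacity caps the burst size, while the refill rate sets the long-term limit:

```javascript
// Token bucket (sketch): capacity = max burst, refillPerSec = sustained rate.
function createTokenBucket(capacity, refillPerSec, now = Date.now) {
  const buckets = new Map(); // key -> { tokens, lastRefill }

  return function allow(key) {
    const ts = now();
    const b = buckets.get(key) ?? { tokens: capacity, lastRefill: ts };
    // Refill tokens for the elapsed time, never exceeding capacity
    b.tokens = Math.min(capacity, b.tokens + ((ts - b.lastRefill) / 1000) * refillPerSec);
    b.lastRefill = ts;
    buckets.set(key, b);
    if (b.tokens < 1) return false; // bucket empty: reject
    b.tokens -= 1;
    return true;
  };
}
```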
HTTP Rate Limit Response Standards
Rate limiting responses should follow established conventions so clients can react programmatically. The standard HTTP status code for rate limit exceeded is 429 Too Many Requests (defined in RFC 6585), and you should include informative headers.

| Header | Meaning | Example Value |
|---|---|---|
| X-RateLimit-Limit | Max requests allowed in the window | 100 |
| X-RateLimit-Remaining | Requests remaining in the current window | 73 |
| X-RateLimit-Reset | Unix timestamp when the window resets | 1713456000 |
| Retry-After | Seconds to wait before retrying (on 429) | 47 |
| RateLimit-Policy (IETF draft standard) | Machine-readable policy description | 100;w=60 |
Never use 403 Forbidden for rate limiting — that implies the client is permanently unauthorized. Always use 429 Too Many Requests with a Retry-After header so clients know the block is temporary.
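On the client side, these conventions enable automatic back-off. A hedged sketch (assumes Node 18+'s global fetch; `fetchWithRetry` and its parameters are illustrative, not a library API):

```javascript
// Compute how long to wait before retrying: prefer the server's
// Retry-After header, fall back to exponential backoff.
function backoffDelayMs(retryAfterHeader, attempt) {
  const retryAfter = Number(retryAfterHeader);
  return Number.isFinite(retryAfter) && retryAfter > 0
    ? retryAfter * 1000
    : 2 ** attempt * 1000;
}

// Retry a request on 429, honoring Retry-After (assumes global fetch, Node 18+)
async function fetchWithRetry(url, options = {}, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const res = await fetch(url, options);
    if (res.status !== 429) return res;
    const delay = backoffDelayMs(res.headers.get('retry-after'), attempt);
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
  return fetch(url, options); // final attempt; caller handles a lingering 429
}
```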
Implementing Rate Limiting in Node.js / Express
The most popular approach in the Express ecosystem is express-rate-limit, which provides a flexible middleware with multiple storage backends.
Install Dependencies
Install express-rate-limit for the middleware and rate-limit-redis for distributed storage.
npm install express-rate-limit rate-limit-redis ioredis
Basic Rate Limiter Middleware
Create a rate limiter that applies globally or to specific routes.
const rateLimit = require('express-rate-limit');
// General API limiter: 100 requests per 15 minutes
const generalLimiter = rateLimit({
windowMs: 15 * 60 * 1000, // 15 minutes in milliseconds
max: 100, // Max requests per window
standardHeaders: true, // Send the standard RateLimit-* headers (IETF draft)
legacyHeaders: false, // Disable the legacy X-RateLimit-* headers
// Custom key generator: rate limit by API key or IP
keyGenerator: (req) => {
return req.headers['x-api-key'] || req.ip;
},
// Custom response when limit is exceeded
handler: (req, res, next, options) => {
res.status(429).json({
error: 'Too Many Requests',
message: `You have exceeded the ${options.max} requests per ${options.windowMs / 60000} minutes limit.`,
retryAfter: Math.ceil(options.windowMs / 1000),
});
},
// Skip rate limiting for trusted IPs (e.g., internal services)
skip: (req) => {
const trustedIPs = ['127.0.0.1', '::1'];
return trustedIPs.includes(req.ip);
},
});
// Strict limiter for auth endpoints: 10 requests per 15 minutes
const authLimiter = rateLimit({
windowMs: 15 * 60 * 1000,
max: 10,
standardHeaders: true,
legacyHeaders: false,
message: {
error: 'Too Many Requests',
message: 'Too many login attempts. Please try again later.',
},
});
// Very strict: password reset, 3 per hour
const passwordResetLimiter = rateLimit({
windowMs: 60 * 60 * 1000, // 1 hour
max: 3,
standardHeaders: true,
legacyHeaders: false,
});
module.exports = { generalLimiter, authLimiter, passwordResetLimiter };
Apply to Routes
Apply limiters globally or per-route depending on your needs.
const express = require('express');
const { generalLimiter, authLimiter, passwordResetLimiter } = require('./middleware/rateLimiter');
const app = express();
app.use(express.json());
// Trust proxy headers if behind Nginx/load balancer
// This makes req.ip reflect the real client IP, not the proxy
app.set('trust proxy', 1);
// Apply general limiter to all API routes
app.use('/api/', generalLimiter);
// Stricter limits on sensitive auth routes
app.post('/api/auth/login', authLimiter, loginController);
app.post('/api/auth/register', authLimiter, registerController);
app.post('/api/auth/forgot', passwordResetLimiter, forgotPasswordController);
// Public routes with higher limits (or no limit)
app.get('/api/public/status', (req, res) => {
res.json({ status: 'ok' });
});
app.listen(3000);
If your app runs behind Nginx or a load balancer, always set app.set('trust proxy', 1). Without it, req.ip will be the proxy's IP, and all users will share the same rate limit bucket — a catastrophic misconfiguration.
Redis-Backed Distributed Rate Limiting
In-memory rate limiters break in distributed environments: if you have 3 Node.js instances, each has its own counter, so the effective limit is 3× your intended limit. The solution is a shared store — Redis is the industry standard for this.
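The multiplication effect is easy to see in a toy simulation (illustrative only): three instances, each with its own in-memory counter of 100, let a round-robining client through 300 times.

```javascript
// Each instance keeps an independent in-memory counter with its own limit
function makeInstanceCounter(limit) {
  let count = 0;
  return () => ++count <= limit;
}

// A load balancer round-robins requests across the instances
function simulateRoundRobin(instances, totalRequests) {
  let allowed = 0;
  for (let i = 0; i < totalRequests; i++) {
    if (instances[i % instances.length]()) allowed += 1;
  }
  return allowed;
}
```

With three instances limited to 100 each and 600 incoming requests, 300 get through — triple the intended limit.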
Distributed Rate Limiting with Redis as Shared Store
const rateLimit = require('express-rate-limit');
const { RedisStore } = require('rate-limit-redis'); // named export in v4+
const Redis = require('ioredis');
// Create Redis client
const redisClient = new Redis({
host: process.env.REDIS_HOST || 'localhost',
port: process.env.REDIS_PORT || 6379,
password: process.env.REDIS_PASSWORD,
// Reconnect on failure
retryStrategy: (times) => Math.min(times * 50, 2000),
});
redisClient.on('error', (err) => {
console.error('Redis error:', err);
});
// Redis-backed rate limiter
const redisRateLimiter = rateLimit({
windowMs: 15 * 60 * 1000,
max: 100,
standardHeaders: true,
legacyHeaders: false,
// Use Redis store for distributed environments
store: new RedisStore({
sendCommand: (...args) => redisClient.call(...args),
// Key prefix to avoid collisions with other Redis keys
prefix: 'rl:',
}),
keyGenerator: (req) => {
// Rate limit by API key if present, fall back to IP
return req.headers['x-api-key']
? `apikey:${req.headers['x-api-key']}`
: `ip:${req.ip}`;
},
});
module.exports = { redisRateLimiter, redisClient };
Sliding Window with Raw Redis Commands
For full control over the algorithm, you can implement a sliding window counter directly using Redis's atomic operations. This is the approach popularized by Cloudflare's rate limiter design:
const Redis = require('ioredis');
const redis = new Redis();
/**
* Sliding window rate limiter using Redis.
* Uses two keys: current window counter and previous window counter.
* Approximates a true sliding window with O(1) memory per user.
*
* @param {string} key - Unique identifier (user ID, API key, IP)
* @param {number} limit - Max requests allowed per window
* @param {number} windowMs - Window duration in milliseconds
* @returns {{ allowed: boolean, remaining: number, resetAt: number, estimatedCount: number }}
*/
async function slidingWindowRateLimit(key, limit, windowMs) {
const now = Date.now();
const windowSec = Math.floor(windowMs / 1000);
const currentSlot = Math.floor(now / windowMs);
const prevSlot = currentSlot - 1;
const currentKey = `rl:sw:${key}:${currentSlot}`;
const prevKey = `rl:sw:${key}:${prevSlot}`;
// Atomically increment current window counter
const pipeline = redis.pipeline();
pipeline.incr(currentKey);
pipeline.expire(currentKey, windowSec * 2); // TTL: 2 windows
pipeline.get(prevKey);
const [[, current], , [, prev]] = await pipeline.exec();
const currentCount = parseInt(current, 10);
const prevCount = parseInt(prev, 10) || 0;
// Calculate weight of previous window's requests in the current window
const windowProgress = (now % windowMs) / windowMs;
const prevWindowWeight = 1 - windowProgress;
const estimatedCount = Math.ceil(prevCount * prevWindowWeight) + currentCount;
const allowed = estimatedCount <= limit;
const remaining = Math.max(0, limit - estimatedCount);
const resetAt = (currentSlot + 1) * windowMs;
return { allowed, remaining, resetAt, estimatedCount };
}
// Express middleware wrapper
function createSlidingWindowMiddleware(limit, windowMs) {
return async (req, res, next) => {
const key = req.headers['x-api-key'] || req.ip;
let result;
try {
result = await slidingWindowRateLimit(key, limit, windowMs);
} catch (err) {
// Fail open: if Redis is unreachable, let the request through
console.error('Rate limiter error:', err);
return next();
}
// Always set headers
res.set('X-RateLimit-Limit', limit);
res.set('X-RateLimit-Remaining', result.remaining);
res.set('X-RateLimit-Reset', Math.floor(result.resetAt / 1000));
if (!result.allowed) {
const retryAfter = Math.ceil((result.resetAt - Date.now()) / 1000);
res.set('Retry-After', retryAfter);
return res.status(429).json({
error: 'Too Many Requests',
retryAfter,
limit,
resetAt: new Date(result.resetAt).toISOString(),
});
}
next();
};
}
module.exports = { slidingWindowRateLimit, createSlidingWindowMiddleware };
Rate Limiting in Laravel
Laravel has first-class rate limiting support via the ThrottleRequests middleware and the RateLimiter facade; the expressive named-limiter API (RateLimiter::for) was introduced in Laravel 8 and supports custom limiters with dynamic keys.
Apply Laravel's built-in throttle middleware directly in routes:
// routes/api.php
use Illuminate\Support\Facades\Route;
// 60 requests per minute (format: max,minutes)
Route::middleware('throttle:60,1')->group(function () {
Route::get('/users', [UserController::class, 'index']);
Route::get('/posts', [PostController::class, 'index']);
});
// 10 requests per 15 minutes for auth endpoints
Route::middleware('throttle:10,15')->group(function () {
Route::post('/login', [AuthController::class, 'login']);
Route::post('/register', [AuthController::class, 'register']);
});
// Use a named rate limiter
Route::middleware('throttle:api')->group(function () {
Route::get('/profile', [ProfileController::class, 'show']);
});
Define custom rate limiters in App\Providers\RouteServiceProvider (in Laravel 11+, which dropped RouteServiceProvider, define them in the boot method of App\Providers\AppServiceProvider instead):
// app/Providers/RouteServiceProvider.php
use Illuminate\Cache\RateLimiting\Limit;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\RateLimiter;
protected function configureRateLimiting(): void
{
// Default API limiter (authenticated users get higher limits)
RateLimiter::for('api', function (Request $request) {
return $request->user()
? Limit::perMinute(60)->by($request->user()->id)
: Limit::perMinute(20)->by($request->ip());
});
// Per-plan limits for SaaS tiering
RateLimiter::for('api-tiered', function (Request $request) {
$user = $request->user();
$limits = [
'free' => 60,
'pro' => 500,
'enterprise' => 5000,
];
$max = $limits[$user?->plan ?? 'free'] ?? $limits['free']; // unknown plans fall back to free
return Limit::perHour($max)
->by($user?->id ?? $request->ip())
->response(function (Request $request, array $headers) {
return response()->json([
'error' => 'Too Many Requests',
'message' => 'You have exceeded your plan limit.',
'upgrade_at' => 'https://mayurdabhi.com/pricing',
], 429, $headers);
});
});
// Strict limiter for password resets
RateLimiter::for('password-reset', function (Request $request) {
return [
Limit::perHour(3)->by($request->ip()),
Limit::perHour(3)->by($request->input('email')),
];
});
}
Configure Redis as the cache store for rate limiting in config/cache.php:
# .env (the variable is named CACHE_STORE in Laravel 11+)
CACHE_DRIVER=redis
REDIS_HOST=127.0.0.1
REDIS_PORT=6379
// config/cache.php
'redis' => [
'driver' => 'redis',
'connection' => 'cache',
'lock_connection' => 'default',
],
// config/database.php — Redis connections
'redis' => [
'client' => env('REDIS_CLIENT', 'phpredis'),
'default' => [
'url' => env('REDIS_URL'),
'host' => env('REDIS_HOST', '127.0.0.1'),
'password' => env('REDIS_PASSWORD'),
'port' => env('REDIS_PORT', '6379'),
'database' => env('REDIS_DB', '0'),
],
'cache' => [
'url' => env('REDIS_URL'),
'host' => env('REDIS_HOST', '127.0.0.1'),
'password' => env('REDIS_PASSWORD'),
'port' => env('REDIS_PORT', '6379'),
'database' => env('REDIS_CACHE_DB', '1'),
],
],
Custom Rate Limit Middleware in Laravel
When you need finer control — e.g., different limits per HTTP method or endpoint-specific logic — create a custom middleware:
<?php
namespace App\Http\Middleware;
use Closure;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\RateLimiter;
use Symfony\Component\HttpFoundation\Response;
class ApiRateLimiter
{
public function handle(Request $request, Closure $next): Response
{
// Build a unique key from API key + endpoint + method
$apiKey = $request->header('X-API-Key', $request->ip());
$key = "rl:{$apiKey}:{$request->method()}:{$request->path()}";
// Determine limit based on request method
$maxAttempts = match ($request->method()) {
'GET' => 120, // Read-heavy
'POST' => 30,
'PUT', 'PATCH' => 30,
'DELETE' => 10, // Destructive ops — very strict
default => 60,
};
if (RateLimiter::tooManyAttempts($key, $maxAttempts)) {
$seconds = RateLimiter::availableIn($key);
return response()->json([
'error' => 'Too Many Requests',
'retry_after' => $seconds,
'message' => "Limit of {$maxAttempts} requests per minute exceeded.",
], 429, [
'X-RateLimit-Limit' => $maxAttempts,
'X-RateLimit-Remaining' => 0,
'X-RateLimit-Reset' => now()->addSeconds($seconds)->timestamp,
'Retry-After' => $seconds,
]);
}
RateLimiter::hit($key, 60); // Decay after 60 seconds
$remaining = $maxAttempts - RateLimiter::attempts($key);
$response = $next($request);
// Append rate limit headers to every response
return $response->withHeaders([
'X-RateLimit-Limit' => $maxAttempts,
'X-RateLimit-Remaining' => max(0, $remaining),
]);
}
}
Advanced Strategies
Tiered Rate Limiting
Production APIs rarely have one-size-fits-all limits. Implement tiers based on user plans, roles, or API key types:
// Plan-based limits
const PLAN_LIMITS = {
free: { requests: 100, windowMs: 60 * 60 * 1000 }, // 100/hour
starter: { requests: 1000, windowMs: 60 * 60 * 1000 }, // 1k/hour
pro: { requests: 10000, windowMs: 60 * 60 * 1000 }, // 10k/hour
enterprise: { requests: 100000, windowMs: 60 * 60 * 1000 }, // 100k/hour
};
async function tieredRateLimiter(req, res, next) {
// Look up user's plan from database/cache
const apiKey = req.headers['x-api-key'];
const user = apiKey ? await getUserByApiKey(apiKey) : null;
const plan = user?.plan ?? 'free';
const { requests, windowMs } = PLAN_LIMITS[plan];
const key = apiKey ? `plan:${apiKey}` : `ip:${req.ip}`;
const result = await slidingWindowRateLimit(key, requests, windowMs);
res.set('X-RateLimit-Limit', requests);
res.set('X-RateLimit-Remaining', result.remaining);
res.set('X-RateLimit-Plan', plan);
if (!result.allowed) {
return res.status(429).json({
error: 'Rate limit exceeded',
plan,
limit: requests,
upgrade: plan !== 'enterprise' ? 'https://example.com/pricing' : null,
});
}
next();
}
Endpoint-Specific Limits
Some endpoints are inherently more expensive than others. A search endpoint that queries your database should have a lower limit than a simple status check:
const rateLimit = require('express-rate-limit');
// Cheap: status and health checks — very permissive
const statusLimiter = rateLimit({ windowMs: 60000, max: 300 });
// Moderate: standard CRUD
const crudLimiter = rateLimit({ windowMs: 60000, max: 60 });
// Expensive: search, reports, exports
const searchLimiter = rateLimit({ windowMs: 60000, max: 10 });
// Very expensive: bulk operations
const bulkLimiter = rateLimit({ windowMs: 60000, max: 3 });
app.get('/api/health', statusLimiter, healthController);
app.get('/api/users', crudLimiter, userListController);
app.get('/api/search', searchLimiter, searchController);
app.post('/api/reports/generate', searchLimiter, reportController);
app.post('/api/users/bulk-import', bulkLimiter, bulkImportController);
Rate Limiting vs Other Defenses
| Defense | Protects Against | Works With Rate Limiting? |
|---|---|---|
| Rate Limiting | Abuse, brute force, resource exhaustion | — |
| Input Validation | Injection, malformed data | Yes — complementary |
| Authentication | Unauthorized access | Yes — identify users for per-user limits |
| WAF / Firewall | DDoS, bot traffic, known bad IPs | Yes — first line of defense |
| CAPTCHA | Automated bots | Yes — after repeated failures |
| IP Blocklist | Known malicious IPs | Yes — in combination |
Testing Your Rate Limiter
A rate limiter you haven't tested is a rate limiter that may not work. Here's how to verify your implementation works correctly:
const request = require('supertest');
const app = require('../app');
describe('Rate Limiter', () => {
it('allows requests within the limit', async () => {
const res = await request(app)
.get('/api/users')
.set('X-API-Key', 'test-key-1');
expect(res.status).toBe(200);
expect(res.headers['x-ratelimit-limit']).toBeDefined();
expect(res.headers['x-ratelimit-remaining']).toBeDefined();
});
it('returns 429 after exceeding the limit', async () => {
const apiKey = 'test-key-abuse';
// Exhaust the limit
for (let i = 0; i < 10; i++) {
await request(app)
.post('/api/auth/login')
.set('X-API-Key', apiKey)
.send({ email: 'a@a.com', password: 'wrong' });
}
// This one should be blocked
const res = await request(app)
.post('/api/auth/login')
.set('X-API-Key', apiKey)
.send({ email: 'a@a.com', password: 'wrong' });
expect(res.status).toBe(429);
expect(res.body.error).toBe('Too Many Requests');
expect(res.headers['retry-after']).toBeDefined();
});
it('resets after the window expires', async () => {
// Use fake timers to simulate window expiry; keep nextTick and
// setImmediate real so supertest's HTTP round-trips still complete
jest.useFakeTimers({ doNotFake: ['nextTick', 'setImmediate'] });
const apiKey = 'test-key-reset';
for (let i = 0; i < 10; i++) {
await request(app)
.post('/api/auth/login')
.set('X-API-Key', apiKey)
.send({});
}
// Advance time by 15 minutes + 1 second
jest.advanceTimersByTime(15 * 60 * 1000 + 1000);
const res = await request(app)
.post('/api/auth/login')
.set('X-API-Key', apiKey)
.send({});
expect(res.status).not.toBe(429);
jest.useRealTimers();
});
});
You can also use curl to quickly test manually:
# Send 15 requests in a loop and watch the status line and rate limit headers
for i in $(seq 1 15); do
  echo "Request $i:"
  curl -s -o /dev/null -D - \
    -H "X-API-Key: my-test-key" \
    http://localhost:3000/api/users | grep -i "^HTTP/\|ratelimit\|retry-after"
done
# Check specific headers from a single request
curl -I -H "X-API-Key: my-test-key" http://localhost:3000/api/users | grep -i "ratelimit\|retry"
Production Checklist and Conclusion
Implementing rate limiting is not a set-and-forget task. As your API evolves, so should your limits. Here's a production checklist to ensure your implementation is solid:
Production Rate Limiting Checklist
- Use Redis (or another shared store) in any multi-instance deployment
- Set `trust proxy` correctly if behind a load balancer
- Differentiate limits by user plan, endpoint cost, and HTTP method
- Always return standard headers: `X-RateLimit-*` and `Retry-After`
- Return 429 (not 403) for rate-limited requests
- Log rate limit hits to detect abuse patterns and tune limits
- Whitelist internal services (by IP or service token) to avoid self-throttling
- Test with automated tests — including window reset behavior
- Monitor and alert on 429 spike rates in your observability dashboard
- Document your limits in your API reference so clients can implement back-off
"Rate limiting is not about punishing users — it's about guaranteeing a quality experience for everyone. A well-designed rate limiter with clear, documented limits is a feature, not a restriction."
From simple in-memory fixed windows to distributed Redis-backed sliding window counters, you now have the full toolkit to implement robust API rate limiting. Start with express-rate-limit or Laravel's built-in throttle middleware for most projects, graduate to Redis-backed stores when you scale horizontally, and layer in per-plan and per-endpoint limits as your API matures. Your infrastructure — and your users — will thank you.
