API Rate Limiting Implementation
Every public API faces a fundamental challenge: how do you serve thousands of legitimate users while protecting your infrastructure from abuse, scraping bots, and runaway clients? The answer is rate limiting — a technique that controls how many requests a client can make within a given time window. Without it, a single misbehaving client can degrade service for everyone or rack up enormous cloud bills.
In this guide, we'll go deep on rate limiting: the algorithms behind it, HTTP response conventions, real implementations in Node.js/Express and Laravel, Redis-backed distributed solutions, and the nuances that separate a naive implementation from a production-grade one.
GitHub's REST API limits unauthenticated requests to 60/hour and authenticated requests to 5,000/hour. Stripe enforces 100 requests/second. Twitter's API has complex tiered limits. These aren't arbitrary — they protect infrastructure, ensure fair resource sharing, and enable predictable capacity planning.
Rate Limiting Algorithms Explained
There are several algorithms for implementing rate limiting, each with different trade-offs in accuracy, memory usage, and burst tolerance. Understanding them helps you choose the right one for your use case.
Rate Limiting Algorithm Comparison
Fixed Window Counter
The simplest approach: divide time into fixed windows (e.g., every minute) and count requests per window. Fast and memory-efficient, but suffers from a boundary burst problem — a client can make N requests at the end of one window and N more at the start of the next, effectively making 2N requests in a short time.
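To make the trade-off concrete, here is a minimal fixed-window counter in JavaScript — an illustrative in-memory sketch, not production code; the clock is injectable so the window logic is easy to test:

```javascript
// Minimal fixed-window counter (illustrative sketch, not production code).
function createFixedWindowLimiter(limit, windowMs, now = Date.now) {
  const counters = new Map(); // key -> { windowStart, count }

  return function allow(key) {
    const windowStart = Math.floor(now() / windowMs) * windowMs;
    const entry = counters.get(key);

    if (!entry || entry.windowStart !== windowStart) {
      // A new window has begun: reset the counter for this key
      counters.set(key, { windowStart, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}
```

Note how the counter resets abruptly at the window edge — that reset is exactly what permits the boundary burst described above.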
Sliding Window Log
Stores a timestamp log of every request for each user. When a new request arrives, remove all timestamps older than the window and check the count. Perfectly accurate but memory-intensive for high-traffic APIs.
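A sketch of the log-based approach (again with an injectable clock, for illustration only) makes the memory cost visible — one stored timestamp per request:

```javascript
// Sliding-window log (sketch): exact, but memory grows with request volume.
function createSlidingLogLimiter(limit, windowMs, now = Date.now) {
  const logs = new Map(); // key -> array of request timestamps

  return function allow(key) {
    const ts = now();
    // Keep only timestamps still inside the window
    const fresh = (logs.get(key) ?? []).filter((t) => t > ts - windowMs);
    if (fresh.length >= limit) {
      logs.set(key, fresh);
      return false;
    }
    fresh.push(ts);
    logs.set(key, fresh);
    return true;
  };
}
```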
Sliding Window Counter (Recommended)
A hybrid approach: keep two counters (current window, previous window) and estimate the rate using a weighted calculation. This approximates a true sliding window while using constant memory per user — the best balance for most production APIs.
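The weighted calculation is simple arithmetic. A hypothetical helper to illustrate it (the Redis-backed implementation later in this guide applies the same formula):

```javascript
// Sliding-window counter estimate: weight the previous window's count
// by how much of it still overlaps the sliding window.
function estimateSlidingCount(prevCount, currentCount, nowMs, windowMs) {
  const progress = (nowMs % windowMs) / windowMs; // fraction of current window elapsed
  return prevCount * (1 - progress) + currentCount;
}
```

For example, 25% into the current window with 100 requests in the previous window and 20 in the current one, the estimate is 100 × 0.75 + 20 = 95.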
Token Bucket
Each user has a bucket that holds up to N tokens. Tokens refill at a constant rate. Each request consumes one token. This allows short bursts while enforcing long-term rate limits — ideal for APIs where brief spikes are acceptable.
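A token bucket can be sketched in a few lines (illustrative only; the clock is injectable for testability). Capacity caps the burst size, while the refill rate sets the long-term limit:

```javascript
// Token bucket (sketch): capacity = max burst, refillPerSec = sustained rate.
function createTokenBucket(capacity, refillPerSec, now = Date.now) {
  const buckets = new Map(); // key -> { tokens, lastRefill }

  return function allow(key) {
    const ts = now();
    const b = buckets.get(key) ?? { tokens: capacity, lastRefill: ts };
    // Refill tokens for the elapsed time, never exceeding capacity
    b.tokens = Math.min(capacity, b.tokens + ((ts - b.lastRefill) / 1000) * refillPerSec);
    b.lastRefill = ts;
    buckets.set(key, b);
    if (b.tokens < 1) return false; // bucket empty: reject
    b.tokens -= 1;
    return true;
  };
}
```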
HTTP Rate Limit Response Standards
Rate limiting responses should follow established conventions so clients can react programmatically. The standard HTTP status code for rate limit exceeded is 429 Too Many Requests (defined in RFC 6585), and you should include informative headers.

| Header | Meaning | Example Value |
|---|---|---|
| X-RateLimit-Limit | Max requests allowed in the window | 100 |
| X-RateLimit-Remaining | Requests remaining in the current window | 73 |
| X-RateLimit-Reset | Unix timestamp when the window resets | 1713456000 |
| Retry-After | Seconds to wait before retrying (on 429) | 47 |
| RateLimit-Policy (IETF draft standard) | Machine-readable policy description | 100;w=60 |
Never use 403 Forbidden for rate limiting — that implies the client is permanently unauthorized. Always use 429 Too Many Requests with a Retry-After header so clients know the block is temporary.
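On the client side, these conventions enable automatic back-off. A hedged sketch (assumes Node 18+'s global fetch; `fetchWithRetry` and its parameters are illustrative, not a library API):

```javascript
// Compute how long to wait before retrying: prefer the server's
// Retry-After header, fall back to exponential backoff.
function backoffDelayMs(retryAfterHeader, attempt) {
  const retryAfter = Number(retryAfterHeader);
  return Number.isFinite(retryAfter) && retryAfter > 0
    ? retryAfter * 1000
    : 2 ** attempt * 1000;
}

// Retry a request on 429, honoring Retry-After (assumes global fetch, Node 18+)
async function fetchWithRetry(url, options = {}, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const res = await fetch(url, options);
    if (res.status !== 429) return res;
    const delay = backoffDelayMs(res.headers.get('retry-after'), attempt);
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
  return fetch(url, options); // final attempt; caller handles a lingering 429
}
```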
Implementing Rate Limiting in Node.js / Express
The most popular approach in the Express ecosystem is express-rate-limit, which provides a flexible middleware with multiple storage backends.
Install Dependencies
Install express-rate-limit for the middleware and rate-limit-redis for distributed storage.
npm install express-rate-limit rate-limit-redis ioredis
Basic Rate Limiter Middleware
Create a rate limiter that applies globally or to specific routes.
const rateLimit = require('express-rate-limit');
// General API limiter: 100 requests per 15 minutes
const generalLimiter = rateLimit({
windowMs: 15 * 60 * 1000, // 15 minutes in milliseconds
max: 100, // Max requests per window
standardHeaders: true, // Send the standard RateLimit-* headers (IETF draft)
legacyHeaders: false, // Disable the legacy X-RateLimit-* headers
// Custom key generator: rate limit by API key or IP
keyGenerator: (req) => {
return req.headers['x-api-key'] || req.ip;
},
// Custom response when limit is exceeded
handler: (req, res, next, options) => {
res.status(429).json({
error: 'Too Many Requests',
message: `You have exceeded the ${options.max} requests per ${options.windowMs / 60000} minutes limit.`,
retryAfter: Math.ceil(options.windowMs / 1000),
});
},
// Skip rate limiting for trusted IPs (e.g., internal services)
skip: (req) => {
const trustedIPs = ['127.0.0.1', '::1'];
return trustedIPs.includes(req.ip);
},
});
// Strict limiter for auth endpoints: 10 requests per 15 minutes
const authLimiter = rateLimit({
windowMs: 15 * 60 * 1000,
max: 10,
standardHeaders: true,
legacyHeaders: false,
message: {
error: 'Too Many Requests',
message: 'Too many login attempts. Please try again later.',
},
});
// Very strict: password reset, 3 per hour
const passwordResetLimiter = rateLimit({
windowMs: 60 * 60 * 1000, // 1 hour
max: 3,
standardHeaders: true,
legacyHeaders: false,
});
module.exports = { generalLimiter, authLimiter, passwordResetLimiter };
Apply to Routes
Apply limiters globally or per-route depending on your needs.
const express = require('express');
const { generalLimiter, authLimiter, passwordResetLimiter } = require('./middleware/rateLimiter');
const app = express();
app.use(express.json());
// Trust proxy headers if behind Nginx/load balancer
// This makes req.ip reflect the real client IP, not the proxy
app.set('trust proxy', 1);
// Apply general limiter to all API routes
app.use('/api/', generalLimiter);
// Stricter limits on sensitive auth routes
app.post('/api/auth/login', authLimiter, loginController);
app.post('/api/auth/register', authLimiter, registerController);
app.post('/api/auth/forgot', passwordResetLimiter, forgotPasswordController);
// Public routes with higher limits (or no limit)
app.get('/api/public/status', (req, res) => {
res.json({ status: 'ok' });
});
app.listen(3000);
If your app runs behind Nginx or a load balancer, always set app.set('trust proxy', 1). Without it, req.ip will be the proxy's IP, and all users will share the same rate limit bucket — a catastrophic misconfiguration.
Redis-Backed Distributed Rate Limiting
In-memory rate limiters break in distributed environments: if you have 3 Node.js instances, each has its own counter, so the effective limit is 3× your intended limit. The solution is a shared store — Redis is the industry standard for this.
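The multiplication effect is easy to see in a toy simulation (illustrative only): three instances, each with its own in-memory counter of 100, let a round-robining client through 300 times.

```javascript
// Each instance keeps an independent in-memory counter with its own limit
function makeInstanceCounter(limit) {
  let count = 0;
  return () => ++count <= limit;
}

// A load balancer round-robins requests across the instances
function simulateRoundRobin(instances, totalRequests) {
  let allowed = 0;
  for (let i = 0; i < totalRequests; i++) {
    if (instances[i % instances.length]()) allowed += 1;
  }
  return allowed;
}
```

With three instances limited to 100 each and 600 incoming requests, 300 get through — triple the intended limit.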
Distributed Rate Limiting with Redis as Shared Store
const rateLimit = require('express-rate-limit');
const { RedisStore } = require('rate-limit-redis'); // named export in v4+
const Redis = require('ioredis');
// Create Redis client
const redisClient = new Redis({
host: process.env.REDIS_HOST || 'localhost',
port: process.env.REDIS_PORT || 6379,
password: process.env.REDIS_PASSWORD,
// Reconnect on failure
retryStrategy: (times) => Math.min(times * 50, 2000),
});
redisClient.on('error', (err) => {
console.error('Redis error:', err);
});
// Redis-backed rate limiter
const redisRateLimiter = rateLimit({
windowMs: 15 * 60 * 1000,
max: 100,
standardHeaders: true,
legacyHeaders: false,
// Use Redis store for distributed environments
store: new RedisStore({
sendCommand: (...args) => redisClient.call(...args),
// Key prefix to avoid collisions with other Redis keys
prefix: 'rl:',
}),
keyGenerator: (req) => {
// Rate limit by API key if present, fall back to IP
return req.headers['x-api-key']
? `apikey:${req.headers['x-api-key']}`
: `ip:${req.ip}`;
},
});
module.exports = { redisRateLimiter, redisClient };
Sliding Window with Raw Redis Commands
For full control over the algorithm, you can implement a sliding window counter directly using Redis's atomic operations. This is the approach popularized by Cloudflare's rate limiter design:
const Redis = require('ioredis');
const redis = new Redis();
/**
* Sliding window rate limiter using Redis.
* Uses two keys: current window counter and previous window counter.
* Approximates a true sliding window with O(1) memory per user.
*
* @param {string} key - Unique identifier (user ID, API key, IP)
* @param {number} limit - Max requests allowed per window
* @param {number} windowMs - Window duration in milliseconds
* @returns {{ allowed: boolean, remaining: number, resetAt: number, estimatedCount: number }}
*/
async function slidingWindowRateLimit(key, limit, windowMs) {
const now = Date.now();
const windowSec = Math.floor(windowMs / 1000);
const currentSlot = Math.floor(now / windowMs);
const prevSlot = currentSlot - 1;
const currentKey = `rl:sw:${key}:${currentSlot}`;
const prevKey = `rl:sw:${key}:${prevSlot}`;
// Atomically increment current window counter
const pipeline = redis.pipeline();
pipeline.incr(currentKey);
pipeline.expire(currentKey, windowSec * 2); // TTL: 2 windows
pipeline.get(prevKey);
const [[, current], , [, prev]] = await pipeline.exec();
const currentCount = parseInt(current, 10);
const prevCount = parseInt(prev, 10) || 0;
// Calculate weight of previous window's requests in the current window
const windowProgress = (now % windowMs) / windowMs;
const prevWindowWeight = 1 - windowProgress;
const estimatedCount = Math.ceil(prevCount * prevWindowWeight) + currentCount;
const allowed = estimatedCount <= limit;
const remaining = Math.max(0, limit - estimatedCount);
const resetAt = (currentSlot + 1) * windowMs;
return { allowed, remaining, resetAt, estimatedCount };
}
// Express middleware wrapper
function createSlidingWindowMiddleware(limit, windowMs) {
return async (req, res, next) => {
const key = req.headers['x-api-key'] || req.ip;
let result;
try {
result = await slidingWindowRateLimit(key, limit, windowMs);
} catch (err) {
// Fail open: if Redis is unreachable, let the request through
console.error('Rate limiter error:', err);
return next();
}
// Always set headers
res.set('X-RateLimit-Limit', limit);
res.set('X-RateLimit-Remaining', result.remaining);
res.set('X-RateLimit-Reset', Math.floor(result.resetAt / 1000));
if (!result.allowed) {
const retryAfter = Math.ceil((result.resetAt - Date.now()) / 1000);
res.set('Retry-After', retryAfter);
return res.status(429).json({
error: 'Too Many Requests',
retryAfter,
limit,
resetAt: new Date(result.resetAt).toISOString(),
});
}
next();
};
}
module.exports = { slidingWindowRateLimit, createSlidingWindowMiddleware };
Rate Limiting in Laravel
Laravel has first-class rate limiting support via the ThrottleRequests middleware and the RateLimiter facade; the expressive named-limiter API (RateLimiter::for) was introduced in Laravel 8 and supports custom limiters with dynamic keys.
Apply Laravel's built-in throttle middleware directly in routes:
// routes/api.php
use Illuminate\Support\Facades\Route;
// 60 requests per minute (format: max,minutes)
Route::middleware('throttle:60,1')->group(function () {
Route::get('/users', [UserController::class, 'index']);
Route::get('/posts', [PostController::class, 'index']);
});
// 10 requests per 15 minutes for auth endpoints
Route::middleware('throttle:10,15')->group(function () {
Route::post('/login', [AuthController::class, 'login']);
Route::post('/register', [AuthController::class, 'register']);
});
// Use a named rate limiter
Route::middleware('throttle:api')->group(function () {
Route::get('/profile', [ProfileController::class, 'show']);
});
Define custom rate limiters in App\Providers\RouteServiceProvider (in Laravel 11+, which dropped RouteServiceProvider, define them in the boot method of App\Providers\AppServiceProvider instead):
// app/Providers/RouteServiceProvider.php
use Illuminate\Cache\RateLimiting\Limit;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\RateLimiter;
protected function configureRateLimiting(): void
{
// Default API limiter (authenticated users get higher limits)
RateLimiter::for('api', function (Request $request) {
return $request->user()
? Limit::perMinute(60)->by($request->user()->id)
: Limit::perMinute(20)->by($request->ip());
});
// Per-plan limits for SaaS tiering
RateLimiter::for('api-tiered', function (Request $request) {
$user = $request->user();
$limits = [
'free' => 60,
'pro' => 500,
'enterprise' => 5000,
];
$max = $limits[$user?->plan ?? 'free'] ?? $limits['free']; // unknown plans fall back to free
return Limit::perHour($max)
->by($user?->id ?? $request->ip())
->response(function (Request $request, array $headers) {
return response()->json([
'error' => 'Too Many Requests',
'message' => 'You have exceeded your plan limit.',
'upgrade_at' => 'https://mayurdabhi.com/pricing',
], 429, $headers);
});
});
// Strict limiter for password resets
RateLimiter::for('password-reset', function (Request $request) {
return [
Limit::perHour(3)->by($request->ip()),
Limit::perHour(3)->by($request->input('email')),
];
});
}
Configure Redis as the cache store for rate limiting in config/cache.php:
# .env (the variable is named CACHE_STORE in Laravel 11+)
CACHE_DRIVER=redis
REDIS_HOST=127.0.0.1
REDIS_PORT=6379
// config/cache.php
'redis' => [
'driver' => 'redis',
'connection' => 'cache',
'lock_connection' => 'default',
],
// config/database.php — Redis connections
'redis' => [
'client' => env('REDIS_CLIENT', 'phpredis'),
'default' => [
'url' => env('REDIS_URL'),
'host' => env('REDIS_HOST', '127.0.0.1'),
'password' => env('REDIS_PASSWORD'),
'port' => env('REDIS_PORT', '6379'),
'database' => env('REDIS_DB', '0'),
],
'cache' => [
'url' => env('REDIS_URL'),
'host' => env('REDIS_HOST', '127.0.0.1'),
'password' => env('REDIS_PASSWORD'),
'port' => env('REDIS_PORT', '6379'),
'database' => env('REDIS_CACHE_DB', '1'),
],
],
Custom Rate Limit Middleware in Laravel
When you need finer control — e.g., different limits per HTTP method or endpoint-specific logic — create a custom middleware:
<?php
namespace App\Http\Middleware;
use Closure;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\RateLimiter;
use Symfony\Component\HttpFoundation\Response;
class ApiRateLimiter
{
public function handle(Request $request, Closure $next): Response
{
// Build a unique key from API key + endpoint + method
$apiKey = $request->header('X-API-Key', $request->ip());
$key = "rl:{$apiKey}:{$request->method()}:{$request->path()}";
// Determine limit based on request method
$maxAttempts = match ($request->method()) {
'GET' => 120, // Read-heavy
'POST' => 30,
'PUT', 'PATCH' => 30,
'DELETE' => 10, // Destructive ops — very strict
default => 60,
};
if (RateLimiter::tooManyAttempts($key, $maxAttempts)) {
$seconds = RateLimiter::availableIn($key);
return response()->json([
'error' => 'Too Many Requests',
'retry_after' => $seconds,
'message' => "Limit of {$maxAttempts} requests per minute exceeded.",
], 429, [
'X-RateLimit-Limit' => $maxAttempts,
'X-RateLimit-Remaining' => 0,
'X-RateLimit-Reset' => now()->addSeconds($seconds)->timestamp,
'Retry-After' => $seconds,
]);
}
RateLimiter::hit($key, 60); // Decay after 60 seconds
$remaining = $maxAttempts - RateLimiter::attempts($key);
$response = $next($request);
// Append rate limit headers to every response
return $response->withHeaders([
'X-RateLimit-Limit' => $maxAttempts,
'X-RateLimit-Remaining' => max(0, $remaining),
]);
}
}
Advanced Strategies
Tiered Rate Limiting
Production APIs rarely have one-size-fits-all limits. Implement tiers based on user plans, roles, or API key types:
// Plan-based limits
const PLAN_LIMITS = {
free: { requests: 100, windowMs: 60 * 60 * 1000 }, // 100/hour
starter: { requests: 1000, windowMs: 60 * 60 * 1000 }, // 1k/hour
pro: { requests: 10000, windowMs: 60 * 60 * 1000 }, // 10k/hour
enterprise: { requests: 100000, windowMs: 60 * 60 * 1000 }, // 100k/hour
};
async function tieredRateLimiter(req, res, next) {
// Look up user's plan from database/cache
const apiKey = req.headers['x-api-key'];
const user = apiKey ? await getUserByApiKey(apiKey) : null;
const plan = user?.plan ?? 'free';
const { requests, windowMs } = PLAN_LIMITS[plan];
const key = apiKey ? `plan:${apiKey}` : `ip:${req.ip}`;
const result = await slidingWindowRateLimit(key, requests, windowMs);
res.set('X-RateLimit-Limit', requests);
res.set('X-RateLimit-Remaining', result.remaining);
res.set('X-RateLimit-Plan', plan);
if (!result.allowed) {
return res.status(429).json({
error: 'Rate limit exceeded',
plan,
limit: requests,
upgrade: plan !== 'enterprise' ? 'https://example.com/pricing' : null,
});
}
next();
}
Endpoint-Specific Limits
Some endpoints are inherently more expensive than others. A search endpoint that queries your database should have a lower limit than a simple status check:
const rateLimit = require('express-rate-limit');
// Cheap: status and health checks — very permissive
const statusLimiter = rateLimit({ windowMs: 60000, max: 300 });
// Moderate: standard CRUD
const crudLimiter = rateLimit({ windowMs: 60000, max: 60 });
// Expensive: search, reports, exports
const searchLimiter = rateLimit({ windowMs: 60000, max: 10 });
// Very expensive: bulk operations
const bulkLimiter = rateLimit({ windowMs: 60000, max: 3 });
app.get('/api/health', statusLimiter, healthController);
app.get('/api/users', crudLimiter, userListController);
app.get('/api/search', searchLimiter, searchController);
app.post('/api/reports/generate', searchLimiter, reportController);
app.post('/api/users/bulk-import', bulkLimiter, bulkImportController);
Rate Limiting vs Other Defenses
| Defense | Protects Against | Works With Rate Limiting? |
|---|---|---|
| Rate Limiting | Abuse, brute force, resource exhaustion | — |
| Input Validation | Injection, malformed data | Yes — complementary |
| Authentication | Unauthorized access | Yes — identify users for per-user limits |
| WAF / Firewall | DDoS, bot traffic, known bad IPs | Yes — first line of defense |
| CAPTCHA | Automated bots | Yes — after repeated failures |
| IP Blocklist | Known malicious IPs | Yes — in combination |
Testing Your Rate Limiter
A rate limiter you haven't tested is a rate limiter that may not work. Here's how to verify your implementation works correctly:
const request = require('supertest');
const app = require('../app');
describe('Rate Limiter', () => {
it('allows requests within the limit', async () => {
const res = await request(app)
.get('/api/users')
.set('X-API-Key', 'test-key-1');
expect(res.status).toBe(200);
expect(res.headers['x-ratelimit-limit']).toBeDefined();
expect(res.headers['x-ratelimit-remaining']).toBeDefined();
});
it('returns 429 after exceeding the limit', async () => {
const apiKey = 'test-key-abuse';
// Exhaust the limit
for (let i = 0; i < 10; i++) {
await request(app)
.post('/api/auth/login')
.set('X-API-Key', apiKey)
.send({ email: 'a@a.com', password: 'wrong' });
}
// This one should be blocked
const res = await request(app)
.post('/api/auth/login')
.set('X-API-Key', apiKey)
.send({ email: 'a@a.com', password: 'wrong' });
expect(res.status).toBe(429);
expect(res.body.error).toBe('Too Many Requests');
expect(res.headers['retry-after']).toBeDefined();
});
it('resets after the window expires', async () => {
// Use fake timers to simulate window expiry; keep nextTick and
// setImmediate real so supertest's HTTP round-trips still complete
jest.useFakeTimers({ doNotFake: ['nextTick', 'setImmediate'] });
const apiKey = 'test-key-reset';
for (let i = 0; i < 10; i++) {
await request(app)
.post('/api/auth/login')
.set('X-API-Key', apiKey)
.send({});
}
// Advance time by 15 minutes + 1 second
jest.advanceTimersByTime(15 * 60 * 1000 + 1000);
const res = await request(app)
.post('/api/auth/login')
.set('X-API-Key', apiKey)
.send({});
expect(res.status).not.toBe(429);
jest.useRealTimers();
});
});
You can also use curl to quickly test manually:
# Send 15 requests in a loop and watch the status line and rate limit headers
for i in $(seq 1 15); do
  echo "Request $i:"
  curl -s -o /dev/null -D - \
    -H "X-API-Key: my-test-key" \
    http://localhost:3000/api/users | grep -i "^HTTP/\|ratelimit\|retry-after"
done
# Check specific headers from a single request
curl -I -H "X-API-Key: my-test-key" http://localhost:3000/api/users | grep -i "ratelimit\|retry"
Production Checklist and Conclusion
Implementing rate limiting is not a set-and-forget task. As your API evolves, so should your limits. Here's a production checklist to ensure your implementation is solid:
Production Rate Limiting Checklist
- Use Redis (or another shared store) in any multi-instance deployment
- Set `trust proxy` correctly if behind a load balancer
- Differentiate limits by user plan, endpoint cost, and HTTP method
- Always return standard headers: `X-RateLimit-*` and `Retry-After`
- Return 429 (not 403) for rate-limited requests
- Log rate limit hits to detect abuse patterns and tune limits
- Whitelist internal services (by IP or service token) to avoid self-throttling
- Test with automated tests — including window reset behavior
- Monitor and alert on 429 spike rates in your observability dashboard
- Document your limits in your API reference so clients can implement back-off
"Rate limiting is not about punishing users — it's about guaranteeing a quality experience for everyone. A well-designed rate limiter with clear, documented limits is a feature, not a restriction."
From simple in-memory fixed windows to distributed Redis-backed sliding window counters, you now have the full toolkit to implement robust API rate limiting. Start with express-rate-limit or Laravel's built-in throttle middleware for most projects, graduate to Redis-backed stores when you scale horizontally, and layer in per-plan and per-endpoint limits as your API matures. Your infrastructure — and your users — will thank you.
