File Descriptor Exhaustion: The Kernel Limit That Silently Drops Node.js Connections

The 3 a.m. page was blunt: “Customers reporting intermittent connection failures.” The load balancer showed all targets healthy. CPU usage on the Node.js pods was under 15%. Memory was flat. There were no application errors in the logs. Yet every few minutes a burst of requests failed with ECONNREFUSED before they ever reached our HTTP handlers.

We scaled the deployment. We restarted the pods. We blamed the cloud provider’s load balancer. Then one engineer ran lsof -p $(pgrep -f "node server.js") | wc -l on a pod and saw the number: 65,536 open file descriptors. The soft limit was 65,536. The process had hit the ceiling. Every new inbound TCP connection needed a new fd. Every new Postgres checkout needed a new fd. The kernel said no. The connections were refused at the syscall layer, before any of our code saw them.

This is file descriptor exhaustion, and it is one of the nastiest silent failures in production backend work. It does not crash your process. It does not log a stack trace. It just drops traffic. This post covers how fds are consumed, how to diagnose the leak, how to raise the limit without creating a new problem, and how to monitor fd usage so you catch it before the pager goes off.

Why file descriptors matter more than you think

In Linux, everything is a file. TCP sockets, Unix domain sockets, pipes, actual files on disk, epoll instances, and eventfd objects all consume a file descriptor. Each is just an integer index into the process’s file descriptor table. When that table is full, any syscall that needs a new fd (socket, open, accept, pipe, epoll_create) returns EMFILE (too many open files for this process) or ENFILE (too many open files in the system).
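In Node.js these failures surface as errors whose code property is 'EMFILE' or 'ENFILE'. The helper below is a minimal sketch for telling the two apart in logs; the name describeFdError and the returned strings are illustrative and do not come from any library.

```javascript
// Hypothetical helper: map an fd-related error code to the limit that was hit.
function describeFdError(err) {
  switch (err?.code) {
    case 'EMFILE':
      return 'per-process fd limit reached (RLIMIT_NOFILE)';
    case 'ENFILE':
      return 'system-wide file table full (fs.file-max)';
    default:
      return null; // not an fd exhaustion error
  }
}

// Example: an error shaped like the ones Node raises when it cannot get a new fd.
const sample = Object.assign(new Error('EMFILE: too many open files'), { code: 'EMFILE' });
console.log(describeFdError(sample));
```

Logging the distinction matters because the remedies differ: EMFILE points at this process's limit or a leak in it, while ENFILE points at the host as a whole.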

A Node.js server in a microservices architecture can open fds in surprising quantities:

Inbound client connections: One fd per active HTTP connection. With keepalive, browsers and mobile clients hold these open for seconds or minutes.
Database connection pools: A pg.Pool with a max of 100 holds 100 fds to Postgres. If you have two databases (primary and read replica), that is 200.
Redis connections: One persistent connection per ioredis instance, plus one per BullMQ queue worker, plus sentinel connections if you use Sentinel.
Outbound HTTP agents: Whether you use http.Agent, undici.Pool, or axios, keepalive connections to downstream services hold fds open.
Files and pipes: Log file streams, child process stdio, temporary file uploads, and fs.watch instances all count.
Event loop internals: libuv creates an epoll fd with epoll_create1, plus a few more for internal wakeups and signal handling. Diagnostic tools such as the inspector add more.

Add these up on a busy pod. If you have 1,000 concurrent inbound clients, a database pool of 200, a Redis connection count of 50, an outbound HTTP agent holding 100 connections to each of three downstreams, and a handful of log streams, you are already north of 1,500 fds. That sounds modest, but the default soft limit on many Linux distributions is 1,024. Ubuntu 22.04 and Debian 12 keep the soft limit at 1,024 and set the hard limit to 1,048,576, so a process gets the low number unless something raises it. Inside a container with no explicit limit, the process inherits whatever the container runtime was configured with, which on some hosts is still 1,024. Kubernetes passes that runtime default through to pods.

The result: a service that works fine in development, works fine under light integration tests, and collapses under production load when the fd count crosses a threshold that has nothing to do with your code quality.

Diagnosing fd exhaustion in real time

When the incident is happening, you have three tools that matter.

First, lsof gives you the breakdown by fd type:

lsof -p $NODE_PID | awk '{print $5}' | sort | uniq -c | sort -rn

On a Node.js server, you will see IPv4, IPv6, FIFO, REG, and unix. If IPv4 dominates, your connections (inbound, outbound, or both) are the issue. If REG dominates, you are leaking file handles on disk (log rotation without closing streams is a common cause).

Second, the /proc filesystem gives you the raw count instantly without parsing lsof output:

ls /proc/$NODE_PID/fd/ | wc -l

Third, prlimit tells you the exact limits the kernel is enforcing on that process:

prlimit --pid $NODE_PID

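Look at the NOFILE row. prlimit reports the soft and hard limits, not current usage, so compare the soft limit against the open count from /proc. If the count is within a few hundred of the soft limit, you have found your smoking gun.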

For a programmatic check inside the process (for a health endpoint or metrics), Node.js can read its own fd count from /proc/self/fd:

import fs from 'node:fs';

// Linux only: /proc/self/fd holds one entry per open fd of this process.
function getOpenFdCount() {
  // readdirSync opens the directory itself, so the listing includes that fd.
  return fs.readdirSync('/proc/self/fd').length - 1;
}

console.log(`Open fds: ${getOpenFdCount()}`);

This is safe to run every few seconds in production. It is synchronous but only reads a small directory; on Linux, the cost is negligible.
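The same directory can also give a rough in-process equivalent of the lsof breakdown, because each entry in /proc/self/fd is a symlink whose target names the kind of object behind the fd. The following is a sketch under that assumption; the function names and category labels are illustrative.

```javascript
import fs from 'node:fs';

// Classify a /proc/<pid>/fd symlink target such as "socket:[12345]".
function classifyFdTarget(target) {
  if (target.startsWith('socket:')) return 'socket';
  if (target.startsWith('pipe:')) return 'pipe';
  if (target.startsWith('anon_inode:')) return 'anon_inode'; // epoll, eventfd, timerfd
  return 'file'; // regular files, directories, devices
}

// Count this process's open fds by category. Linux only (needs procfs).
function getFdBreakdown() {
  const counts = { socket: 0, pipe: 0, anon_inode: 0, file: 0 };
  for (const name of fs.readdirSync('/proc/self/fd')) {
    try {
      const target = fs.readlinkSync(`/proc/self/fd/${name}`);
      counts[classifyFdTarget(target)] += 1;
    } catch {
      // The fd used to list the directory is already closed; skip it.
    }
  }
  return counts;
}

try {
  console.log(getFdBreakdown());
} catch {
  console.log('procfs is not available on this platform');
}
```

A socket count that climbs while traffic is flat points at connections; a climbing file count points at streams that are never closed.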

The mathematics of connection pooling

The most common self-inflicted fd spike is connection pool sprawl. Here is a formula every service should document somewhere:

total_fds_estimate = (
  max_inbound_connections +
  sum(database_pool_maxes) +
  sum(redis_connection_counts) +
  sum(outbound_agent_max_sockets_per_host * number_of_downstream_hosts) +
  baseline_files_and_pipes
) * safety_margin

For a typical Node.js API, that might look like:

max_inbound_connections:        4,000    (Node.js http server maxConnections or effective concurrency)
Postgres primary pool:            100
Postgres replica pool:            100
Redis primary:                     10
Redis BullMQ workers:              40
Outbound HTTP to 3 services:      300    (100 per host)
Log streams and misc:              20
Baseline subtotal:              4,570
Safety margin (1.5x):           6,855

Your ulimit -n should be set to at least 7,000 for this service. Prefer 16,384 or 32,768 so you have headroom for traffic spikes, connection pileups induced by memory pressure, or rollouts (when traffic briefly concentrates on fewer pods while others drain or start). The limit applies per process, so neighbouring pods on the same node do not draw from the same allowance.
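The estimate is easy to keep next to the service configuration as code. Here is a minimal sketch of the formula using the numbers from the worked example; the function name and field names are illustrative.

```javascript
// Sketch of the fd budget formula; all names are illustrative.
function estimateFdBudget({
  maxInboundConnections,
  databasePoolMaxes,
  redisConnectionCounts,
  outboundMaxSocketsPerHost,
  downstreamHosts,
  baselineFilesAndPipes,
  safetyMargin = 1.5,
}) {
  const sum = (values) => values.reduce((total, v) => total + v, 0);
  const subtotal =
    maxInboundConnections +
    sum(databasePoolMaxes) +
    sum(redisConnectionCounts) +
    outboundMaxSocketsPerHost * downstreamHosts +
    baselineFilesAndPipes;
  return { subtotal, withMargin: Math.ceil(subtotal * safetyMargin) };
}

const budget = estimateFdBudget({
  maxInboundConnections: 4000,
  databasePoolMaxes: [100, 100], // primary and replica
  redisConnectionCounts: [10, 40], // primary and BullMQ workers
  outboundMaxSocketsPerHost: 100,
  downstreamHosts: 3,
  baselineFilesAndPipes: 20,
});

console.log(budget); // { subtotal: 4570, withMargin: 6855 }
```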

The mistake many teams make is raising the database pool max to 200 “just to be safe” without realizing that fds are a finite resource shared by every subsystem. A connection pool is not a performance knob you turn up arbitrarily. It is a congestion-control parameter for the downstream. If you double your pool max, you double the fds consumed, increase memory usage, and increase the risk of Postgres max_connections exhaustion. Size pools from estimates, not hopes.
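One way to size a pool from an estimate is to work backwards from the database's connection ceiling. The sketch below assumes every pod uses the same pool max and that some connections are held back for migrations and admin sessions; the function name, parameters, and numbers are hypothetical.

```javascript
// Hypothetical sizing helper: largest per-pod pool max that keeps the whole
// fleet under the database's max_connections.
function maxPoolSizePerPod({ maxConnections, reservedConnections, podCount }) {
  if (podCount <= 0) throw new RangeError('podCount must be positive');
  const usable = Math.max(0, maxConnections - reservedConnections);
  return Math.floor(usable / podCount);
}

// Illustrative numbers only: 500 max_connections, 20 reserved, 12 pods.
console.log(maxPoolSizePerPod({ maxConnections: 500, reservedConnections: 20, podCount: 12 })); // 40
```

Use the pod count at the peak of a rollout rather than the steady-state count if old and new pods hold connections at the same time.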

Raising the limit: sysctl, systemd, Docker, and Kubernetes

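There are four layers where the per-process fd limit is set (the shell, systemd, Docker, and Kubernetes), and you need to understand which one wins. All of them sit under two system-wide sysctls: fs.nr_open caps how high any process’s limit can be raised, and fs.file-max caps the total number of open files on the host.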

The shell and systemd

For a Node.js service running under systemd (most modern Linux servers), the limit is controlled by the unit file:

[Service]
Type=simple
ExecStart=/usr/bin/node /opt/app/server.js
LimitNOFILE=65536

After reloading systemd and restarting the service, verify with:

systemctl show your-service.service --property=LimitNOFILE

If you run the process directly from a shell, it inherits the shell’s limits, which come from the user session. You can raise the soft limit with ulimit -n 65536 before starting Node, as long as the hard limit allows it, but systemd is the durable fix.

Docker

Containers inherit the limits that the Docker daemon passes down, which depend on the Docker version and the daemon’s configuration rather than on your login shell. Always specify the limit explicitly:

docker run --ulimit nofile=65536:65536 your-image

Or in docker-compose.yml:

services:
  app:
    ulimits:
      nofile:
        soft: 65536
        hard: 65536

Kubernetes

Kubernetes has no native field for per-container ulimits: as of this writing nothing in the pod spec, including securityContext, sets RLIMIT_NOFILE. Kubernetes delegates fd limits to the container runtime (containerd or CRI-O), which usually inherits them from its own service configuration on the node. The reliable ways to control the limit are to configure the runtime on the node or to raise the soft limit in the container’s entrypoint. An init container cannot help, because it exits before the application process starts. The entrypoint approach looks like this (ulimit -n can only go as high as the hard limit the container was given):

#!/bin/sh
ulimit -n 65536
exec node server.js

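Some guides suggest baking the limit into the image through /etc/security/limits.conf. That file is only read by pam_limits during a PAM login session, so it has no effect on a process started directly as the container entrypoint. If your image does launch the app through a PAM session, the entry needs the full domain, type, item, and value fields:

RUN echo "* - nofile 65536" >> /etc/security/limits.conf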

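Then verify inside the running pod. Check the limits of the Node.js process itself (PID 1 when the entrypoint uses exec) rather than those of a fresh shell:

kubectl exec pod-name -- /bin/sh -c "grep 'Max open files' /proc/1/limits"

If the soft limit column shows 1024, your containers are still carrying the default, and every connection pool decision you make is walking on a tightrope.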
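The same check can run at process startup so that a misconfigured pod announces itself in the logs. This is a sketch assuming Linux procfs; parseNofileLimits and checkFdLimit are hypothetical names, and 7000 stands in for whatever your own estimate is.

```javascript
import fs from 'node:fs';

// Parse the "Max open files" row of /proc/<pid>/limits into soft and hard values.
function parseNofileLimits(limitsText) {
  const line = limitsText.split('\n').find((l) => l.startsWith('Max open files'));
  if (!line) return null;
  const [soft, hard] = line.trim().split(/\s+/).slice(3, 5);
  const toNumber = (v) => (v === 'unlimited' ? Infinity : Number.parseInt(v, 10));
  return { soft: toNumber(soft), hard: toNumber(hard) };
}

// Hypothetical startup guard: returns false and warns when the soft limit is
// below the required fd budget. Returns true when the limit cannot be read.
function checkFdLimit(requiredFds, limitsPath = '/proc/self/limits') {
  let limits = null;
  try {
    limits = parseNofileLimits(fs.readFileSync(limitsPath, 'utf8'));
  } catch {
    // procfs not available (non-Linux); nothing to verify
  }
  if (limits && limits.soft < requiredFds) {
    console.warn(`fd soft limit ${limits.soft} is below the required ${requiredFds}`);
    return false;
  }
  return true;
}

checkFdLimit(7000);
```

Whether to warn or refuse to start is a judgment call; a warning is the gentler default when rolling this out to an existing fleet.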

Application discipline: what to change in Node.js

Raising the limit buys you breathing room. It does not fix a leak. Here are the application-level patterns that keep fd usage honest.

1. Set explicit pool max values.

Do not rely on defaults. pg’s default pool max is 10, which is conservative. undici’s Pool and Node’s http.Agent place no cap on connections unless you set one. Check every library:

import pg from 'pg';

const pool = new pg.Pool({
  connectionString: process.env.DATABASE_URL,
  max: 40, // sized from the formula above
  idleTimeoutMillis: 10_000,
  connectionTimeoutMillis: 5_000,
});
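An explicit max only bounds fd usage if every checked-out client goes back to the pool; a client that is never released keeps its socket until the process exits. The wrapper below is a minimal sketch of that discipline. It relies only on the pg.Pool-style connect() and client.release() calls, and the fakePool object is a stand-in so the example runs without a database.

```javascript
// Run `work` with a checked-out client and always return the client to the pool.
// `pool` is anything with a pg.Pool-style connect() that resolves to a client
// exposing release().
async function withClient(pool, work) {
  const client = await pool.connect();
  try {
    return await work(client);
  } finally {
    client.release(); // runs on success and on error
  }
}

// Stand-in pool (not part of pg) that tracks how many clients are checked out.
const fakePool = {
  checkedOut: 0,
  async connect() {
    this.checkedOut += 1;
    return { release: () => { this.checkedOut -= 1; } };
  },
};

withClient(fakePool, async () => {
  throw new Error('query failed');
}).catch(() => {
  console.log(`checked out after failure: ${fakePool.checkedOut}`);
});
```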

2. Close streams deliberately.

If you open a file for logging, ensure it is closed or rotated by a library that tracks fds. If you spawn child processes, always handle their stdio explicitly:

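import { spawn } from 'node:child_process';

// Placeholder arguments for illustration; substitute your real ffmpeg options.
const args = ['-version'];

const child = spawn('ffmpeg', args, {
  stdio: ['ignore', 'pipe', 'pipe'], // no stdin; pipes for stdout and stderr
});

// A failed spawn (for example, ffmpeg not installed) emits 'error'.
child.on('error', (err) => console.error('spawn failed:', err.message));

// Drain both pipes: a paused stream never emits 'end', so its fd stays open.
child.stdout.resume();
child.stderr.resume();

child.stdout.on('end', () => child.stdout.destroy());
child.stderr.on('end', () => child.stderr.destroy());

// Safety net: release the parent's ends even if something else still holds the pipes.
child.on('exit', () => {
  child.stdout?.destroy();
  child.stderr?.destroy();
});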

A stdio pipe that is never drained, or that a lingering grandchild process still holds open, keeps its fd in the parent’s fd table after the child exits.

3. Monitor your HTTP agent sockets.

If you use http.Agent or https.Agent, socket reuse is good, but socket leaks are catastrophic. Log the agent’s current socket count periodically:

import http from 'node:http';

const agent = new http.Agent({ keepAlive: true, maxSockets: 50 });

setInterval(() => {
  const sockets = Object.values(agent.sockets).flat().length;
  const freeSockets = Object.values(agent.freeSockets).flat().length;
  console.log(JSON.stringify({ event: 'agent_socket_gauge', sockets, freeSockets }));
}, 30_000);

If sockets grows monotonically while request rate is flat, you have a leak (often caused by not consuming response bodies, which prevents the socket from returning to the free pool).
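The unconsumed-body case is worth spelling out, because it is easy to write a request helper that only looks at the status code. The sketch below discards the body with res.resume() so the keepalive socket can return to the agent; getStatus and countSockets are illustrative names.

```javascript
import http from 'node:http';

// Count sockets across all hosts in an agent.sockets or agent.freeSockets map.
function countSockets(socketMap) {
  return Object.values(socketMap).reduce((total, list) => total + list.length, 0);
}

// GET a URL and resolve with only the status code, discarding the body so the
// keepalive socket can go back to the agent's free pool.
function getStatus(url, agent) {
  return new Promise((resolve, reject) => {
    const req = http.get(url, { agent }, (res) => {
      res.resume(); // discard the body; without this the socket stays busy
      res.on('end', () => resolve(res.statusCode));
      res.on('error', reject);
    });
    req.on('error', reject);
  });
}
```

Calling getStatus repeatedly against one host with a keepAlive agent should leave countSockets(agent.sockets) near zero between requests; if it keeps climbing, something is still holding responses open.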

4. Use an explicit server connection limit.

net.Server, which http.Server extends, exposes a maxConnections property. If you know your architecture cannot handle more than 5,000 concurrent connections due to downstream constraints, enforce it at the server:

import http from 'node:http';

// `app` is your request handler (for example, an Express app).
const server = http.createServer(app);
server.maxConnections = 5000;

This is not just about fds. It is backpressure. When the server is at capacity, Node closes each additional connection as soon as it is accepted, so the client sees a reset or closed socket right away. That is faster and cheaper than accepting connections, queuing them, and timing them out in application code.
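It helps to count the connections that the cap turns away, since a rising number is a capacity signal. The sketch below assumes a Node.js release in which net.Server emits the 'drop' event (18.6 or later, to my knowledge); createLimitedServer and the stats object are illustrative names.

```javascript
import http from 'node:http';

// Create a server with a connection cap and a counter of dropped connections.
function createLimitedServer(handler, maxConnections) {
  const server = http.createServer(handler);
  server.maxConnections = maxConnections;
  const stats = { dropped: 0 };
  // 'drop' fires when a new connection is closed because the cap was reached
  // (assumes a Node.js version that supports this event).
  server.on('drop', () => {
    stats.dropped += 1;
  });
  return { server, stats };
}

const { server, stats } = createLimitedServer((req, res) => res.end('ok'), 5000);
// server.listen(3000); // left commented so the sketch does not bind a port
```

Exporting stats.dropped alongside the fd gauge shows whether the cap is protecting the process or simply set too low for normal traffic.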

Monitoring fd usage in production

You need a metric that tracks (open fds / fd limit) and alerts when it crosses 0.7. Here is a minimal Prometheus-style exporter hook you can attach to an existing /metrics endpoint or health check:

import fs from 'node:fs';

// Linux only: both reads depend on procfs.
function getFdMetrics() {
  // The listing includes the fd opened to read the directory, so subtract it.
  const open = fs.readdirSync('/proc/self/fd').length - 1;

  // Node has no built-in getrlimit, so parse the soft limit from /proc/self/limits.
  // Row format: "Max open files  <soft>  <hard>  files"
  let limit;
  try {
    const limits = fs.readFileSync('/proc/self/limits', 'utf8');
    const line = limits.split('\n').find((l) => l.startsWith('Max open files'));
    const soft = line ? parseInt(line.trim().split(/\s+/)[3], 10) : NaN;
    if (Number.isFinite(soft)) limit = soft; // "unlimited" parses to NaN
  } catch {
    // limits file unreadable; fall through to the conservative default
  }

  const effectiveLimit = limit ?? 1024; // assume the common default when unknown
  const ratio = open / effectiveLimit;

  return {
    open_file_descriptors: open,
    file_descriptor_limit: effectiveLimit,
    fd_utilization_ratio: Number(ratio.toFixed(4)),
  };
}

Ship fd_utilization_ratio to your metrics pipeline. Alert on > 0.7. Page on > 0.85. The gap between those two thresholds is usually the difference between a planned restart and a 3 a.m. incident.
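If your alerting lives in application code rather than in the metrics backend, the two thresholds reduce to a small classifier. This is a sketch with the 0.7 and 0.85 values as defaults; the function name and the returned labels are illustrative.

```javascript
// Map an fd utilization ratio to an alert level using the thresholds above.
function classifyFdUtilization(ratio, { warnAt = 0.7, pageAt = 0.85 } = {}) {
  if (!Number.isFinite(ratio) || ratio < 0) return 'unknown';
  if (ratio > pageAt) return 'page';
  if (ratio > warnAt) return 'warn';
  return 'ok';
}

console.log(classifyFdUtilization(0.42)); // ok
console.log(classifyFdUtilization(0.78)); // warn
console.log(classifyFdUtilization(0.91)); // page
```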

The fix checklist

Before you declare fd work done, verify:

ulimit -n inside the running container is at least 16,384 (or 4x your estimated peak fd count, whichever is larger).
Every database, cache, and outbound HTTP pool has an explicit max sized from the formula above.
File streams and child process stdio are explicitly closed or destroyed after use.
You are exporting fd_utilization_ratio as a metric with alerts at 0.7 and 0.85.
Load tests or canary deployments confirm fd count stays flat under sustained traffic.
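"Stays flat" can be checked numerically by sampling the fd count at a fixed interval during the test and fitting a line through the samples. The sketch below uses an ordinary least-squares slope; the function name is illustrative, and the tolerance you accept is a judgment call for your service.

```javascript
// Least-squares slope of fd counts sampled at a fixed interval.
// The result is in fds per sample; a value near zero means the count is flat.
function fdSlopePerSample(samples) {
  const n = samples.length;
  if (n < 2) return 0;
  const meanX = (n - 1) / 2;
  const meanY = samples.reduce((total, y) => total + y, 0) / n;
  let numerator = 0;
  let denominator = 0;
  samples.forEach((y, x) => {
    numerator += (x - meanX) * (y - meanY);
    denominator += (x - meanX) ** 2;
  });
  return numerator / denominator;
}

console.log(fdSlopePerSample([100, 100, 100])); // 0
console.log(fdSlopePerSample([100, 110, 120])); // 10
```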

A note from Yojji

The kind of work this post describes (tracing a silent kernel failure through lsof and /proc, sizing connection pools from first principles, and wiring metrics that catch the problem before it becomes an outage) is the unglamorous infrastructure craft that separates a service that survives real traffic from one that looks fine until it does not.

Yojji is an international custom software development company founded in 2016, with offices in Europe, the US, and the UK. Their teams specialize in the JavaScript ecosystem (React, Node.js, TypeScript), cloud platforms (AWS, Azure, GCP), and the backend operational rigor that keeps production systems honest when load increases and defaults betray you.