Durable Web Workers: From One-Off Scripts to Robust Off‑Main‑Thread Systems
A practical guide to designing Worker-based systems that don’t jank, leak, or stall. Start with a baseline, add timeouts and cancellation, coalesce requests, and use a bounded pool when parallelism actually helps.
Long tasks on the main thread cause stutters, input lag, and “Page Unresponsive” dialogs. Web Workers move work off the main thread—but naïve usage still leads to freezes (no timeouts), memory leaks (dangling listeners), and CPU thrash (too many workers). This article builds a Worker system step by step, explaining the trade‑offs as we go.
What you’ll learn
- When one worker is enough, and when to pool
- How to cancel, time out, and avoid hung requests
- How to prevent duplicate in‑flight work (coalescing)
- When to transfer large payloads vs copy
- How to size parallelism without starving the main thread
Assumptions: browser context, modern engines. Where features require cross‑origin isolation (SharedArrayBuffer), we’ll call it out and provide fallbacks.
1) Baseline: a single persistent Worker
Start simple: a long‑lived Worker that receives requests and responds with results. Avoid spinning workers up per request (startup overhead and GC churn add up).
// worker.js — minimal protocol
self.onmessage = async (e) => {
  const { id, payload } = e.data;
  try {
    const result = await expensive(payload);
    postMessage({ id, ok: true, result });
  } catch (err) {
    postMessage({ id, ok: false, error: String(err) });
  }
};

async function expensive(payload) {
  // … do real work here …
  return heavyCompute(payload);
}

// main.js — persistent client with a simple request/response protocol
class WorkerClient {
  constructor(url) {
    this.worker = new Worker(url, { type: 'module' });
    this.pending = new Map();
    this.seq = 0;
  }

  call(payload, { timeout = 10000, signal } = {}) {
    const id = ++this.seq;
    return new Promise((resolve, reject) => {
      const timer = setTimeout(() => {
        this.cancel(id);
        reject(new Error('timeout'));
      }, timeout);
      const onAbort = () => {
        this.cancel(id);
        reject(signal.reason || new Error('aborted'));
      };
      const onMessage = (e) => {
        const msg = e.data;
        if (msg.id !== id) return;
        cleanup();
        msg.ok ? resolve(msg.result) : reject(new Error(msg.error));
      };
      const cleanup = () => {
        clearTimeout(timer);
        this.worker.removeEventListener('message', onMessage);
        // Also detach the abort handler, or completed requests leak listeners
        // on a long-lived AbortSignal.
        if (signal) signal.removeEventListener('abort', onAbort);
        this.pending.delete(id);
      };
      this.worker.addEventListener('message', onMessage);
      this.pending.set(id, { cleanup });
      this.worker.postMessage({ id, payload });
      if (signal) signal.addEventListener('abort', onAbort, { once: true });
    });
  }

  cancel(id) {
    const req = this.pending.get(id);
    if (req) req.cleanup();
    // Inform the worker (optional; see section 2):
    this.worker.postMessage({ control: 'abort', id });
  }
}

Why this baseline works
- Persistent worker amortizes startup cost and reduces GC churn.
- Explicit `id` correlates requests to responses; the worker can be reused safely.
- Timeouts on the main thread prevent hung UI if the worker crashes or gets stuck.
Limitations
- There’s no real cancellation yet—the worker finishes whatever it started.
- Duplicate in‑flight requests still waste CPU.
2) Real cancellation: timeouts, AbortController, and cooperative abort
Workers can’t force‑kill an operation mid‑JavaScript turn. Instead, implement cooperative abort checks inside the worker’s algorithms, and pair it with a main‑thread timeout.
Two approaches to pass an abort signal:
- Fallback (works everywhere): send a control message `{ control: 'abort', id }` and make the worker check an in‑memory map.
- Optimal (requires cross‑origin isolation): pass a `SharedArrayBuffer` flag and check it with `Atomics` in tight loops.
SharedArrayBuffer path
- Enable cross‑origin isolation for `SharedArrayBuffer`.
- Use a tiny `Int32Array` as an abort flag.
// setup
const sab = new SharedArrayBuffer(4);
const flag = new Int32Array(sab);
// call
const ac = new AbortController();
const p = client.call({ abortFlag: flag }, { signal: ac.signal, timeout: 8000 });
// cancel
Atomics.store(flag, 0, 1);
ac.abort();

Fallback control-message path
- Maintain a `Map<id, { aborted: boolean }>` in the worker.
- On `{ control: 'abort', id }`, mark the entry aborted and have the task poll it periodically.
Design notes
- In CPU‑bound loops, check the flag every N iterations (e.g., every 1–5ms worth of work) to balance responsiveness with throughput.
- Always pair cooperative abort with a main‑thread timeout as a safety net.
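A minimal sketch of the fallback path's worker side, assuming the names here (`tasks`, `startTask`, `abortTask`, `sumWithAbort`) are our own and the real work is a CPU-bound loop:

```javascript
// Worker-side registry for the control-message fallback.
const tasks = new Map(); // id -> { aborted: boolean }

function startTask(id) {
  tasks.set(id, { aborted: false });
}

// Called when the worker receives { control: 'abort', id }.
function abortTask(id) {
  const t = tasks.get(id);
  if (t) t.aborted = true;
}

// CPU-bound loop that polls the abort flag every `checkEvery` iterations,
// trading a little throughput for abort responsiveness.
function sumWithAbort(id, n, checkEvery = 10000) {
  const state = tasks.get(id);
  let total = 0;
  for (let i = 0; i < n; i++) {
    if (i % checkEvery === 0 && state.aborted) {
      tasks.delete(id);
      throw new Error('aborted');
    }
    total += i;
  }
  tasks.delete(id);
  return total;
}
```

Tune `checkEvery` so each stretch between checks costs roughly 1–5ms of work, per the design notes above.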
3) Request deduping (coalescing)
- Hash inputs and coalesce identical concurrent calls to avoid duplicate work.
class Coalescer {
  constructor() { this.inflight = new Map(); }

  async get(key, fn) {
    if (this.inflight.has(key)) return this.inflight.get(key);
    const p = fn().finally(() => this.inflight.delete(key));
    this.inflight.set(key, p);
    return p;
  }
}

Where this helps
- Typeahead or search: repeated calls with the same query.
- Image processing with identical parameters.
Tip: Normalize keys; small differences in options should produce distinct keys.
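One way to normalize keys is to serialize options with sorted property names so that caller ordering doesn't matter; `makeKey` here is a hypothetical helper sketching that idea:

```javascript
// Build a stable coalescing key: same name + same options => same key,
// regardless of the order the caller listed the options in.
function makeKey(name, options = {}) {
  const parts = Object.keys(options)
    .sort()
    .map((k) => `${k}=${JSON.stringify(options[k])}`);
  return `${name}?${parts.join('&')}`;
}
```

Note this only sorts top-level keys; deeply nested option objects would need a recursive variant.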
4) Bounded worker pool: parallelize without thrash
class WorkerPool {
  constructor(url, size = navigator.hardwareConcurrency || 4) {
    this.size = size;
    this.idle = [];
    this.busy = new Set();
    for (let i = 0; i < size; i++) this.idle.push(new WorkerClient(url));
    this.queue = [];
  }

  run(payload, opts) {
    return new Promise((resolve, reject) => {
      this.queue.push({ payload, opts, resolve, reject });
      this.drain();
    });
  }

  drain() {
    while (this.idle.length && this.queue.length) {
      const client = this.idle.pop();
      const job = this.queue.shift();
      this.busy.add(client);
      client.call(job.payload, job.opts)
        .then(job.resolve, job.reject)
        .finally(() => {
          this.busy.delete(client);
          this.idle.push(client);
          this.drain();
        });
    }
  }
}

How to size the pool
- Start with `Math.max(1, Math.min(4, navigator.hardwareConcurrency - 1))`.
- Measure input latency in DevTools while under load; reduce the size if the main thread janks.
Transfers vs copies
- For big `ArrayBuffer`/`TypedArray` payloads, transfer ownership to avoid copying: `postMessage(value, [buffer])`.
- After transfer, the sender’s buffer is detached—design call sites accordingly.
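Transfer semantics can be seen without a worker: `structuredClone` with a transfer list uses the same structured-clone algorithm as `postMessage`, so it detaches the source buffer the same way (requires a runtime with `structuredClone`, e.g. modern browsers or Node 17+):

```javascript
// Move a 1024-element buffer instead of copying it.
const payload = new Float64Array(1024);
payload[0] = 42;

// Same detach behavior as worker.postMessage(payload, [payload.buffer]).
const moved = structuredClone(payload, { transfer: [payload.buffer] });

console.log(moved[0]);                  // 42: the data traveled with the buffer
console.log(payload.buffer.byteLength); // 0: the sender's buffer is now detached
```

Any code still holding `payload` will see an empty, detached buffer, which is why call sites must not reuse a value after transferring it.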
5) Measuring and validating
- Use the Performance panel to watch main‑thread long tasks; aim to keep them under ~50ms.
- Add simple markers around calls to measure queue time vs execution time.
- Track memory: ensure listeners and timers are cleaned up when requests finish.
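The marker idea above can be sketched with the standard User Timing API (`performance.mark`/`performance.measure`); the mark names are made up, and the entries show up in the DevTools Performance panel:

```javascript
// Mark three points in a job's life: enqueued, started, finished.
performance.mark('job:queued');
// ... job waits in the pool queue ...
performance.mark('job:start');
// ... worker executes ...
performance.mark('job:end');

// measure() returns the entry, so durations can be logged or aggregated.
const queueTime = performance.measure('job:queue-time', 'job:queued', 'job:start');
const execTime = performance.measure('job:exec-time', 'job:start', 'job:end');
console.log(queueTime.duration, execTime.duration);
```

A persistently large queue time relative to execution time suggests the pool is undersized (or the work should be coalesced), while a large execution time points at the task itself.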
Common pitfalls and fixes
- Spawning a worker per task → Use a persistent worker or a pool.
- No timeouts → Add main‑thread timeouts on all calls.
- No cancellation → Add cooperative abort checks (control messages or SAB flags).
- Duplicate in‑flight work → Introduce request coalescing.
- Oversubscription (too many workers) → Bound concurrency and size via measurement.
- Copying huge buffers → Transfer instead of copy.
Checklist
- Persistent workers; avoid per‑task churn
- Timeouts on every request
- Cooperative cancellation in worker code
- Coalesce identical requests
- Pool with bounded concurrency when parallelism helps
- Transfer large payloads; avoid unnecessary copies
These practices turn workers from “just off‑thread code” into a predictable, maintainable subsystem that keeps your UI responsive under real‑world pressure.