Letting Users Run JavaScript On Our Backend: Anatomy of a Four-Layer Plugin Sandbox

Four-layer plugin sandbox: Static Analysis → V8 Isolate → RPC Bridge → SSRF Validator

TL;DR

Disclosure: I'm the founder of Apex Bridge Technology and the creator of BugSpotter. This piece walks through a real production sandbox we ship, including the bypasses I missed and shipped fixes for.

We let customers ship custom JavaScript plugins that run inside our backend worker process. The plugin author writes a factory({ rpcBridge, config }) that we instantiate per project, and our outbox worker calls its createFromBugReport whenever a matching bug arrives. That code is untrusted by definition.

Two distinct boundaries protect the host:

The trust boundary — Layer 0, human plugin review — decides whether a plugin gets to run at all. Approval grants capability; nothing in the technical stack downstream protects against a malicious-but-approved author. Covered in § "What I Don't Catch".
The isolation stack — four technical layers below — polices what an approved plugin can do once it executes. The rest of this article is about the isolation stack:

Static analysis — Babel AST pass, blocks known dangerous patterns (require('fs'), eval, Function, .constructor access, dotted globals like process.binding) before the code is loaded into V8.
isolated-vm sandbox — fresh V8 isolate per project: 128 MB memory cap, 15 s execution timeout, fetch / XMLHttpRequest / WebSocket / EventSource / require / process / Buffer all set to undefined on first run.
RPC bridge — the only way out of the isolate. Whitelisted methods (utils.makeApiRequest, db.bugReports.findById, storage.getPresignedUrl, …), project-scoped, with a header blocklist (Authorization, Cookie, X-Forwarded-*, all Sec-*) and credential sanitization.
SSRF on every egress — every outbound URL the plugin asks for is validated before we call fetch host-side. Blocks RFC 1918, loopback, link-local (including AWS/GCP/Azure metadata at 169.254.169.254), CGNAT, IPv6 unique-local and link-local, IPv4-mapped IPv6, and alternative IPv4 encodings (0177.0.0.1, 2130706433, 0x7f.0.0.1).

Numbers from the production code:

~2.5 k lines of security code, ~2.2 k lines of dedicated security tests (size of the surface that has to stay correct, not a quality claim).
~40 distinct SSRF cases parameterised via it.each into 484 inputs (IP-range tables, encoding variants). 113 plugin-utils tests; 76 added in a single resource-leak fix.
The static analyzer alone went through 6 iterations and 11 distinct bypass classes — most of them documented in vm2's issue tracker and Patrik Fehrenbach / XmiliaH writeups long before I started. The iterations measure how much catching-up I had to do to the public literature, not how novel the work was.

What this doesn't catch (acknowledged, not pretended-away):

CPU within the 15 s budget — a while (true) {} ties up the worker for 15 s before timeout fires.
Side-channels from sharing a V8 process with other plugins (Spectre-class).
Egress to public hosts — that's by design (the plugin needs to call Jira / Linear / your tracker), but it means approving a custom plugin = trusting the author with bug-report data.

Everything below is the long version, with the actual bypass code samples that did and didn't get caught at each layer.

What I Almost Shipped

The first cut of the plugin executor used new Function(code) to materialise the plugin's factory. The integration tests passed. Plugins ran. Bug reports got pushed to mock Jira instances. It would have shipped — if I hadn't refactored it before the first review pass.

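new Function() compiles a string into a function that runs against the host's global object. It doesn't capture the caller's local variables, but globalThis (and everything hanging off it) is fully reachable. The "isolation" was vibes. A plugin could have done: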

// Inside the "sandboxed" plugin
return globalThis.process.env;

…and gotten the host environment. Database URL, S3 keys, the JWT signing secret, the lot.
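To make the failure mode concrete, here is a small self-contained sketch. `HOST_SECRET` is a hypothetical stand-in for anything reachable from the host's global object (in a Node worker the real equivalent is `process.env`), and `runPluginUnsafely` is an illustrative name, not the original executor.

```typescript
// Hypothetical stand-in for a host secret reachable from the global object.
// In a real Node worker the equivalent is globalThis.process.env.
(globalThis as any).HOST_SECRET = "postgres://app:hunter2@db.internal/prod";

// Illustrative name: roughly what a new Function()-based executor does.
function runPluginUnsafely(code: string): unknown {
  const hostLocal = "only visible to host code";
  void hostLocal; // locals are not captured by new Function() ...
  return new Function(code)(); // ... but the whole global object is.
}

const leaked = runPluginUnsafely("return globalThis.HOST_SECRET;");
const local = runPluginUnsafely("return typeof hostLocal;");

console.log(leaked); // the secret string
console.log(local); // "undefined"
```

The local variable stays hidden, which is probably why the pattern feels safe in a quick test. The global object is where the secrets live, and that is fully exposed.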

The replacement commit message describes the fix in the tone you write at 2 AM after realising what you almost shipped:

Replace new Function() vulnerability with service wrapper pattern. Add IIFE wrapper for scope isolation and script return pattern for result extraction.

The fix wasn't a tweak. It was throwing the whole approach out and rebuilding the executor on top of isolated-vm — a real V8 sandbox with a separate heap and event loop, not a string-eval with optimistic naming. The replacement PR was substantial (a four-figure-line diff dominated by tests and security harness; LoC isn't the point, the architectural reset is) and went through three review rounds.

Two takeaways, and they sit a little uncomfortably together:

new Function(code) and vm.runInNewContext() are not sandboxes. They share the host heap; the Node docs explicitly say the vm module is not a security mechanism. Every introductory sandbox article says this.
People who know this still write it. I did. The pattern is convenient, the type errors don't surface, and "I'll harden it later" gets shipped. The honest framing isn't "if you do this you don't know what a sandbox is" — it's that the gravitational pull toward new Function(code) is strong enough that even people who've read the warnings end up there if there's no enforcing review.

The four layers below exist because that lesson cost a refactor; the rest of this article is what gets stacked on top once V8 isolation is the actual foundation.

Why I Let Users Ship JS

Customers want to push bug reports into systems we don't ship built-in plugins for: an internal Phabricator, a self-hosted Redmine, a Salesforce custom object. A generic webhook isn't enough — most real integrations need multi-step auth and per-field mapping. So I let them ship a small JavaScript factory:

export const factory = ({ rpcBridge, config }) => ({
  validateConfig:    (cfg) => Promise<{ valid, error? }>,
  testConnection:    (projectId) => Promise<boolean>,
  createFromBugReport: (bug, projectId, integrationId, meta?) => Promise<IntegrationResult>,
});

Our worker calls factory(...) once per project, then createFromBugReport(...) per matching bug. The plugin can hold an HTTP client, format ADF/markdown, run retries — but the only way it talks to the outside world is through the rpcBridge we hand it. That bridge is what the rest of this article protects.
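A minimal sketch of a plugin that fits this contract, with a fake bridge standing in for the host. Every type here (`RpcBridgeLike`, `BugLike`, the `baseUrl` config field) and the `tracker.example.com` URL are assumptions for illustration; the article only names the three methods and the `rpcBridge.callMethod(method, args)` entry point.

```typescript
// Assumed shapes; the real types are not shown in the article.
interface RpcBridgeLike {
  callMethod(method: string, args: unknown[]): Promise<unknown>;
}
interface BugLike { id: string; title: string; }
interface ConfigLike { baseUrl: string; }

const factory = ({ rpcBridge, config }: { rpcBridge: RpcBridgeLike; config: ConfigLike }) => ({
  validateConfig: async (cfg: Partial<ConfigLike>) =>
    cfg.baseUrl ? { valid: true } : { valid: false, error: "baseUrl is required" },

  testConnection: async (_projectId: string) => {
    await rpcBridge.callMethod("utils.makeApiRequest", [{ url: config.baseUrl, method: "GET" }]);
    return true;
  },

  // All outbound traffic goes through the bridge; the plugin never calls fetch itself.
  createFromBugReport: async (bug: BugLike, _projectId: string, _integrationId: string) =>
    rpcBridge.callMethod("utils.makeApiRequest", [{
      url: `${config.baseUrl}/issues`,
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ summary: bug.title }),
    }]),
});

// Host-side stand-in: records calls instead of doing any I/O.
const recordedCalls: { method: string; args: unknown[] }[] = [];
const fakeBridge: RpcBridgeLike = {
  callMethod: async (method, args) => {
    recordedCalls.push({ method, args });
    return { status: 201 };
  },
};

const plugin = factory({ rpcBridge: fakeBridge, config: { baseUrl: "https://tracker.example.com" } });
void plugin.createFromBugReport({ id: "bug-1", title: "Crash on save" }, "project-1", "integration-1");
console.log(recordedCalls.length); // 1
```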

Why not just use an existing untrusted-code runtime

There are reasonable alternatives to building this in-process — none of them slotted in for us:

vm2 — the natural answer for "JS sandbox as an npm install" until 2023. We didn't pick it because the project archived itself after a chain of escapes (CVE-2023-29017 etc.) the maintainer concluded couldn't be fixed inside the vm2 architecture without a rewrite. "Fork and patch" wasn't a serious option — the escapes were structural, not bugs. isolated-vm isn't better than a fixed vm2; it's a different architecture (separate V8 instance, not a vm.runInNewContext() wrapper) that doesn't share the broken assumption.
Cloudflare Workers / Vercel Edge / Fastly Compute — V8 isolates as a service. Excellent for the isolation primitive itself, but the plugin's job is to call our internal DB / S3 / queue to attach screenshots and replay, which means either egressing back into our VPC (defeats the boundary) or pushing all the context into the Worker invocation (defeats the latency budget).
Deno's permission model — solid for CLI tools and edge functions, but cohabiting Deno and Node in the same long-running worker process isn't a thing. We'd be running a separate Deno process per plugin, paying fork-and-IPC cost on every bug.
AWS Lambda nested / Firecracker microVMs — the gold standard for per-execution isolation. Cost and cold-start are both wrong for our pattern (one plugin call per bug report, latency-sensitive).
Figma's plugin sandbox — closest in shape to what we're doing, but their host is a browser and ours is a Node worker; the constraints don't transfer cleanly.

The in-process isolated-vm design is what fits when (a) plugin code needs to talk to your internal services, (b) per-call latency matters, and (c) you don't want a second runtime in production. If any of those assumptions don't apply to your case, the alternatives above probably beat what's described below.

Layer 1: Static Analysis

Threat addressed: make obvious malice expensive — a plugin with require('fs') or eval(...) shouldn't even reach the isolate.

Prior art note. The bypass classes in this section are not new findings. .constructor.constructor escape, alternative IPv4 encodings, and IPv4-mapped IPv6 are documented in the vm2 issue tracker, the OWASP SSRF Prevention Cheat Sheet, and CVE writeups going back to 2018–2019 (Capital One's metadata-endpoint incident is the famous one). The narrative below shows how those classes surfaced in my code, in the order code-review bots caught them — not as discovery, as catch-up to literature that was already there.

Before a single byte of plugin code touches V8, it goes through code-analyzer.ts. Two passes: a regex sweep (the obvious things — require('fs'), child_process, eval(, new Function(, process.exit, __proto__) and a Babel AST walk for the things regex can't see structurally.
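A rough sketch of what the regex sweep might look like. The pattern list mirrors the examples named above; the actual expressions in code-analyzer.ts are not shown in this article, so `FORBIDDEN_PATTERNS` and `regexSweep` are guesses at the shape rather than the shipped code.

```typescript
// Assumed pattern list, derived only from the examples in the text.
const FORBIDDEN_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "require('fs')", pattern: /require\s*\(\s*['"`]fs['"`]\s*\)/ },
  { label: "child_process", pattern: /child_process/ },
  { label: "eval(", pattern: /\beval\s*\(/ },
  { label: "new Function(", pattern: /\bnew\s+Function\s*\(/ },
  { label: "process.exit", pattern: /\bprocess\s*\.\s*exit\b/ },
  { label: "__proto__", pattern: /__proto__/ },
];

// Returns the labels of every pattern that matches the plugin source.
function regexSweep(source: string): string[] {
  return FORBIDDEN_PATTERNS
    .filter(({ pattern }) => pattern.test(source))
    .map(({ label }) => label);
}

console.log(regexSweep("const fs = require('fs'); eval('1');")); // two hits
console.log(regexSweep("Function('return 1')()"));               // no hits
```

The second call shows why the sweep alone is not enough: a bare `Function(...)` call has no `new` in front of it, so a text pattern written for `new Function(` never sees it. That class of gap is what the structural pass is for.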

The AST walk parses with @babel/parser (TypeScript plugin enabled) and:

only allows require('axios' | 'lodash' | 'date-fns' | 'crypto'); everything else, plus dynamic require(variable), is a violation;
flags any Function identifier reference (catches Function('...')() and aliases such as const F = Function; the .constructor.constructor route never names Function and is handled by the next rule);
flags any .constructor property access in all four spellings (obj.constructor, obj?.constructor, obj['constructor'], obj[`constructor`]);
reconstructs dotted globals (process.binding, require.cache, require.main) from MemberExpression structure since the per-Identifier visitor only sees single names;
flags setTimeout('...', 0) with a string argument and any dynamic import().

How the literature applied to my code, in order

The version I shipped on day one looked solid. Then I tested it against the known-bypass corpus from the vm2 issue tracker and adjacent writeups, and three code-review bots from different vendors took several more passes at the diff. Six rounds in total before the dust settled.

The table below is the order in which those classes — all of them public before I started — surfaced in my code. It's not a list of original findings; it's a record of how much of the literature I had to actually walk through to harden a real implementation.

Iteration	What slipped through	Fix
1	bare `Function('...')()` (no `new`); alias `const F = Function; F(...)`; the literal-prototype escape `(1).constructor.constructor(...)`, `[].constructor.constructor(...)` etc.	flag any reference to the `Function` identifier; add a `MemberExpression` visitor for `.constructor` access
2	optional chaining `(1)?.constructor...`; template-literal computed (1)[`constructor`]; computed-string for dotted globals `process['binding']`	register the visitor for both `MemberExpression` and `OptionalMemberExpression`; normalise the property name across `Identifier`, `StringLiteral`, and zero-expression `TemplateLiteral`
3	TS runtime wrappers: `(Function as any)(...)`, `Function!(...)`, `<any>Function(...)`. The article version of the TS-position filter skipped them.	introduce a `TS_RUNTIME_WRAPPERS` allowlist; the type-position skip applies only to nodes not in that set
4	TS constructs with runtime initializers — `enum E { F = Function }` and `export = Function` — slip past the same filter from a different angle	extend the wrapper allowlist with `TSEnumMember` and `TSExportAssignment`
5	object-side wrappers: `(process as any).binding`, `process!.binding`, even the comma operator `(0, process).binding`	add `unwrapToIdentifier` — peels TS wrappers and `SequenceExpression` from the object side before the dotted-global lookup
6	combined `(1)?.[`constructor`]?.[`constructor`](...)`: optional chaining stacked on template-literal computed access	already covered by the two fixes from iteration 2 (the optional-chaining visitor plus property-name normalisation); no new code, but a regression test was needed

Eleven distinct bypasses, six iterations. Each one was a way to spell the same dangerous access that an earlier fix had stopped — not new functionality, just new syntax.
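The two normalisation steps from iterations 2 and 5 can be sketched without Babel by working on hand-built objects that mirror Babel's node field names (`type`, `name`, `value`, `quasis`, `expressions`, `expression`). The `AstNode` union, `staticPropertyName`, and `isBlockedDottedGlobal` are illustrative names; only `unwrapToIdentifier` is named in the table, and its body here is an assumption about what it does.

```typescript
// Minimal node shapes; only the fields this sketch reads.
type AstNode =
  | { type: "Identifier"; name: string }
  | { type: "StringLiteral"; value: string }
  | { type: "NumericLiteral"; value: number }
  | { type: "TemplateLiteral"; quasis: { value: { cooked?: string | null } }[]; expressions: AstNode[] }
  | { type: "TSAsExpression" | "TSNonNullExpression" | "TSTypeAssertion"; expression: AstNode }
  | { type: "SequenceExpression"; expressions: AstNode[] }
  | { type: "BinaryExpression" };

// Resolve a member-access key to a static name, or null if it is dynamic.
function staticPropertyName(property: AstNode, computed: boolean): string | null {
  if (property.type === "Identifier") return computed ? null : property.name; // obj.x vs obj[k]
  if (property.type === "StringLiteral") return property.value;               // obj['x']
  if (property.type === "TemplateLiteral" && property.expressions.length === 0) {
    return property.quasis[0]?.value.cooked ?? null;                          // obj[`x`]
  }
  return null; // e.g. 'cons' + 'tructor': not statically known
}

// Peel TS runtime wrappers and the comma operator off the object side.
function unwrapToIdentifier(node: AstNode): string | null {
  let current: AstNode | undefined = node;
  while (current) {
    switch (current.type) {
      case "Identifier":
        return current.name;
      case "TSAsExpression":
      case "TSNonNullExpression":
      case "TSTypeAssertion":
        current = current.expression;
        break;
      case "SequenceExpression":
        current = current.expressions[current.expressions.length - 1]; // (0, x) evaluates to x
        break;
      default:
        return null;
    }
  }
  return null;
}

const BLOCKED_DOTTED = new Set(["process.binding", "require.cache", "require.main"]);

function isBlockedDottedGlobal(object: AstNode, property: AstNode, computed: boolean): boolean {
  const objectName = unwrapToIdentifier(object);
  const propertyName = staticPropertyName(property, computed);
  return objectName !== null && propertyName !== null && BLOCKED_DOTTED.has(`${objectName}.${propertyName}`);
}
```

For example, `(0, process)['binding']` arrives as a `SequenceExpression` object with a `StringLiteral` property, and both sides normalise to the same `process.binding` string that plain `process.binding` produces.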

What Layer 1 still doesn't catch

Pure obfuscation defeats static analysis by definition:

// String concatenation -- analyzer sees a binary expression, not 'constructor'
obj['cons' + 'tructor']

// Variable-keyed access
const k = 'constructor'; obj[k];

// eval-by-string
const e = 'eva' + 'l'; globalThis[e]('...');

These are caught at runtime by Layer 2 (the V8 isolate strips fetch / process / require from globalThis regardless of how you spell them) and Layer 3 (the RPC bridge has its own whitelist that doesn't care what you called the method on the client side). Layer 1 is a first line of friction; it slows down attackers and catches honest mistakes. It is not a security boundary by itself.

Layer 2: V8 Isolate

Threat addressed: stop the plugin from reading host memory — env vars, secrets, other customers' data — even when Layer 1 misses an obfuscated escape.

Layer 1 is best-effort; Layer 2 is the actual isolation boundary (memory-level separation between plugin and host). Each plugin runs in its own isolated-vm Isolate — a separate V8 instance with its own heap, its own event loop, and zero shared memory with the host process. Wiring is in plugin-executor.ts.

The constants

15 seconds, 128 MB. Going over either kills the script. The 15 s number isn't accidental: it's 10 s HTTP timeout (Layer 3) + 5 s overhead, picked after I discovered the original 5 s default killed plugins mid-flight on slow tracker APIs.

A subtler trap I shipped with the original code: parsing the env-overridable timeout naively.

// What I had (silently disables the timeout if env var is malformed)
const envTimeout = process.env.PLUGIN_EXECUTION_TIMEOUT_MS
  ? parseInt(process.env.PLUGIN_EXECUTION_TIMEOUT_MS, 10)
  : 15_000;
this.defaultTimeout = options?.timeout ?? envTimeout;

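If anyone deploys with PLUGIN_EXECUTION_TIMEOUT_MS=15s (the unit-suffix mistake everyone makes), parseInt('15s', 10) doesn't fail: it stops at the first non-digit and returns 15, so every plugin now gets a 15 ms budget. A value with no leading digits (PLUGIN_EXECUTION_TIMEOUT_MS=default, say) is worse: parseInt returns NaN, NaN is not nullish, so the ?? doesn't catch it, and script.run({ timeout: NaN }) is undefined behaviour in isolated-vm, likely disabling the wall-clock kill entirely. Half of Layer 2's enforcement gone, silently, on a typo.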

The fix is two lines:

// What I shipped after the fix
const envParsed = Number(process.env.PLUGIN_EXECUTION_TIMEOUT_MS);
const envTimeout = Number.isFinite(envParsed) && envParsed > 0 ? envParsed : 15_000;
this.defaultTimeout = options?.timeout ?? envTimeout;

Number.isFinite() rejects NaN and infinities; the > 0 rejects zero and negatives. If the env var is anything other than a positive finite number, we fall back to the 15 s default. Lesson: don't trust ?? to catch parse failures, and don't trust parseInt to refuse garbage.
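The same rule pulled into a standalone function, so the edge cases can be exercised directly. `resolveTimeoutMs` and `DEFAULT_TIMEOUT_MS` are illustrative names; the logic is the two-line fix above, nothing more.

```typescript
const DEFAULT_TIMEOUT_MS = 15_000;

// Accept only a positive, finite number; anything else falls back to the default.
function resolveTimeoutMs(raw: string | undefined, fallback: number = DEFAULT_TIMEOUT_MS): number {
  const parsed = Number(raw); // Number("15s") is NaN, unlike parseInt("15s", 10)
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Side-by-side comparison of the two parsers on typical bad inputs.
for (const raw of ["20000", "15s", "default", "0", "-1", "Infinity", ""]) {
  console.log(JSON.stringify(raw), parseInt(raw, 10), resolveTimeoutMs(raw));
}
// parseInt("15s", 10) is 15 and parseInt("default", 10) is NaN;
// resolveTimeoutMs returns 15000 for both.
```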

What the isolate strips

Before any plugin code runs, we compile and execute DISABLE_UNSAFE_APIS_SCRIPT into the new context:

const DISABLE_UNSAFE_APIS_SCRIPT = `
  globalThis.fetch          = undefined;
  globalThis.XMLHttpRequest = undefined;
  globalThis.WebSocket      = undefined;
  globalThis.EventSource    = undefined;
  globalThis.require        = undefined;
  globalThis.process        = undefined;
  globalThis.Buffer         = undefined;
`;

This is what catches the Layer 1 obfuscation bypasses. obj['cons' + 'tructor']['cons' + 'tructor']('return process')() resolves at runtime to the isolate's own Function constructor, which we can't strip (every function value has a .constructor chain back to it). But the function it returns runs in the same isolate, where process is undefined. The escape gets you a reference; it doesn't get you anything to dereference.
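A single-realm simulation of that argument. This does not use isolated-vm, and `fakeProcess` is a made-up global standing in for `process`; it only illustrates that the constructor chain reaches `Function`, and that code built through it sees whatever the global object holds at call time.

```typescript
// Made-up sensitive global, then stripped the same way the script above strips process.
(globalThis as any).fakeProcess = { env: { SECRET: "s3cr3t" } };
(globalThis as any).fakeProcess = undefined;

// The obfuscated spelling: no literal "constructor" for a static pass to match.
const chainKey = "cons" + "tructor";
const FunctionViaChain = ({} as any)[chainKey][chainKey]; // Object, then Function

const reachedFunction = FunctionViaChain === Function;
const escapedValue = FunctionViaChain("return globalThis.fakeProcess")();

console.log(reachedFunction); // true
console.log(escapedValue);    // undefined
```

In the real setup the `Function` reached this way belongs to the isolate, not the host, which is the part this in-process sketch cannot show.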

The lifecycle

Two callsites need an isolate, and they treat it differently:

validateConfig / testConnection (called during the connection wizard) — temporary isolate. Create, use, dispose(). Even on exception: the cleanup is in finally.
executeFactory (the actual production path) — persistent isolate, kept alive across calls so the plugin doesn't pay startup cost per bug. Disposal is explicit, via the service proxy's dispose() method.
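The temporary-isolate discipline boils down to a try/finally wrapper. This sketch is synchronous and uses a fake isolate so it runs standalone; `IsolateLike`, `FakeIsolate`, and `withTemporaryIsolate` are illustrative names, and an async version would await the callback inside the same try block.

```typescript
// Anything with a dispose() method; isolated-vm's Isolate has one.
interface IsolateLike {
  dispose(): void;
}

// Create, use, and always dispose, whether the callback returns or throws.
function withTemporaryIsolate<I extends IsolateLike, R>(create: () => I, use: (isolate: I) => R): R {
  const isolate = create();
  try {
    return use(isolate);
  } finally {
    isolate.dispose();
  }
}

// Fake isolate that counts disposals, for demonstration only.
class FakeIsolate implements IsolateLike {
  static disposed = 0;
  dispose(): void {
    FakeIsolate.disposed += 1;
  }
}

const answer = withTemporaryIsolate(() => new FakeIsolate(), () => 42);
console.log(answer, FakeIsolate.disposed); // 42 1
```

The persistent path is the opposite trade: the caller keeps the isolate and is responsible for calling dispose() exactly once, which is harder to get right than the wrapper above.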

What the host actually exposes

Inside the isolate, the plugin's globals look like this:

console.log / console.error / console.warn — proxied through the RPC bridge so they end up in our logger, tagged [Plugin].
ERROR_CODES — frozen object of canonical error codes (so plugins can return { code: ERROR_CODES.AUTH_FAILED } consistently).
validators — names of stock validators (required, url, email, pattern, …). Calling them goes through RPC.
rpcBridge.callMethod(method, args) — the gate. Covered in the next layer.
module and exports — set up before the plugin code runs, so module.exports = { factory, metadata } works.
pluginConfig — the saved config blob, copied in via ExternalCopy.

Nothing else. Specifically: no setTimeout, no setInterval, no Promise.race against a host clock — the only thing that races is script.run({ timeout: 15_000 }).

Going from the isolate to the host

script.run(...) returns a JSON string by default; we explicitly opt into a Reference for the factory function so we can call it later. References are unidirectional handles — we can apply() them, but the plugin can't reach back into our memory. The Reference API plus ExternalCopy plus the globalThis strip is what makes the boundary real.

If you've used vm (the Node built-in) and assumed it was a sandbox: it isn't. vm.runInNewContext() shares the heap with the host and has well-known escapes. isolated-vm is the version that actually works, at the cost of native compilation (which is its own ops story — pnpm v10's build-script hardening, Node 22 V8 API requirements, and a CommonJS-via-dynamic-import quirk all came up).

Layer 3: The RPC Bridge — The Only Way Out

Threat addressed: stop the plugin from reaching another customer's data, or from spoofing host identity headers on outbound calls.

fetch is undefined in the isolate, require is undefined, process is undefined. The plugin still needs to make HTTP calls to Jira, read its assigned project's bug reports, fetch a presigned URL for a screenshot. Everything it can do, it does through one entry point — implemented in rpc-bridge.ts:

// In the plugin (inside the isolate)
const result = await rpcBridge.callMethod('utils.makeApiRequest', [{
  url: 'https://your-instance.atlassian.net/rest/api/3/issue',
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(issuePayload),
}]);

That call serializes to JSON, crosses the isolate boundary, and lands in RpcBridge.handleCall host-side, which routes by exact name through an explicit handler registry — utils.makeApiRequest, db.bugReports.findById, db.bugReports.update, storage.getPresignedUrl, plus the three log methods. There is no handler for db.users.findAll, so the call fails before any code runs. The update validator only allows mutating the metadata field; everything else is rejected.
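A minimal sketch of exact-name routing plus a metadata-only update check. `RpcRegistry` and `assertMetadataOnly` are hypothetical names, and the handlers are synchronous stubs; the real rpc-bridge.ts is not shown here and presumably does async I/O.

```typescript
type RpcHandler = (args: unknown[]) => unknown;

class RpcRegistry {
  private readonly handlers = new Map<string, RpcHandler>();

  register(method: string, handler: RpcHandler): void {
    this.handlers.set(method, handler);
  }

  handleCall(method: string, args: unknown[]): unknown {
    // Exact-name lookup: no prefix matching, no dynamic property access.
    const handler = this.handlers.get(method);
    if (!handler) throw new Error(`Unknown RPC method: ${method}`);
    return handler(args);
  }
}

// Reject any update that touches a field other than metadata.
function assertMetadataOnly(update: Record<string, unknown>): void {
  const disallowed = Object.keys(update).filter((key) => key !== "metadata");
  if (disallowed.length > 0) {
    throw new Error(`Fields not updatable by plugins: ${disallowed.join(", ")}`);
  }
}

const registry = new RpcRegistry();
registry.register("utils.makeApiRequest", () => ({ status: 200 }));
registry.register("db.bugReports.update", (args) => {
  assertMetadataOnly((args[1] ?? {}) as Record<string, unknown>);
  return true;
});
```

One reason to prefer a `Map` over a plain object in a sketch like this: a lookup for `constructor` or `__proto__` on an object literal resolves to something inherited, while `Map.get` returns undefined for any name that was never registered.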

Project scoping

RpcBridge is constructed with a projectId baked in. Every database read filters by it: a bug whose project_id doesn't match returns null, regardless of whether the plugin guesses the UUID correctly. Cross-tenant reads stop here, because the bridge has no method that can return another project's data.
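The scoping rule can be illustrated with an in-memory stand-in for the database. `ScopedBugReports` and `BugReportRow` are made-up names; the point is only that the project filter lives inside the lookup rather than in the caller.

```typescript
interface BugReportRow {
  id: string;
  project_id: string;
  title: string;
}

class ScopedBugReports {
  private readonly projectId: string;
  private readonly rows: BugReportRow[];

  constructor(projectId: string, rows: BugReportRow[]) {
    this.projectId = projectId; // fixed at construction; the plugin never supplies it
    this.rows = rows;
  }

  findById(id: string): BugReportRow | null {
    // Equivalent in spirit to: WHERE id = $1 AND project_id = $2
    const row = this.rows.find((r) => r.id === id && r.project_id === this.projectId);
    return row ?? null;
  }
}

const allRows: BugReportRow[] = [
  { id: "bug-a", project_id: "project-1", title: "Mine" },
  { id: "bug-b", project_id: "project-2", title: "Someone else's" },
];
const scoped = new ScopedBugReports("project-1", allRows);

console.log(scoped.findById("bug-a")?.title); // "Mine"
console.log(scoped.findById("bug-b"));        // null, even though the id is valid
```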

The header blocklist

utils.makeApiRequest is the most powerful method — the plugin controls the URL, method, body, and almost all headers. The exceptions are listed in BLOCKED_HEADERS: Authorization (can't pretend to be the host), Cookie and Set-Cookie (can't replay session), X-API-Key, X-Auth-Token, Proxy-Authorization, every Sec-* and every X-Forwarded-*. Case-insensitive match. The plugin can set its own Authorization to a token it brought via config; it cannot impersonate the host on a domain that already has one of our tokens.
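A sketch of the name-matching step, built from the header names listed above. Whether the real bridge drops a blocked header or rejects the whole request is not stated, so dropping is an assumption here, as are the names `isBlockedHeader` and `filterHeaders`. How a plugin's own tracker token gets attached is outside this sketch.

```typescript
const BLOCKED_EXACT = new Set([
  "authorization",
  "cookie",
  "set-cookie",
  "x-api-key",
  "x-auth-token",
  "proxy-authorization",
]);
const BLOCKED_PREFIXES = ["sec-", "x-forwarded-"];

// Case-insensitive: header names are compared in lower case.
function isBlockedHeader(name: string): boolean {
  const lower = name.trim().toLowerCase();
  return BLOCKED_EXACT.has(lower) || BLOCKED_PREFIXES.some((prefix) => lower.startsWith(prefix));
}

// Assumption: blocked headers are silently dropped rather than failing the call.
function filterHeaders(headers: Record<string, string>): Record<string, string> {
  const allowed: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    if (!isBlockedHeader(name)) allowed[name] = value;
  }
  return allowed;
}

const filtered = filterHeaders({
  "Content-Type": "application/json",
  "COOKIE": "session=abc",
  "X-Forwarded-For": "10.0.0.1",
  "Sec-Fetch-Site": "same-origin",
});
console.log(Object.keys(filtered)); // only Content-Type survives
```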

Response limits and error sanitization

10 second host-side HTTP timeout (under the 15 s isolate timeout). Response body capped at 10 MB to prevent a hostile upstream from exhausting host memory with a multi-gigabyte stream. And exceptions thrown host-side run through sanitizeErrorMessage before they cross back into the isolate — it redacts Postgres connection strings, password=… query parameters, API tokens (sk_…, pk_…, Bearer …), file paths, private IPs, and emails. A plugin can tell whether its request worked. It cannot tell whether it failed because the database was down or which user ran the worker process — the leaky details that would help an attacker pivot get redacted before they're visible.
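A rough sketch of regex-based redaction in the spirit of sanitizeErrorMessage. The patterns and placeholder strings are my own guesses, not the shipped ones, and file-path redaction is left out because a naive path pattern also mangles URLs.

```typescript
// Order matters: redact whole connection strings before their parts can match other rules.
const REDACTIONS: [RegExp, string][] = [
  [/postgres(?:ql)?:\/\/\S+/gi, "[redacted-connection-string]"],
  [/(password=)[^&\s]+/gi, "$1[redacted]"],
  [/Bearer\s+[A-Za-z0-9._~+\/-]+=*/g, "Bearer [redacted]"],
  [/\b[sp]k_[A-Za-z0-9_]+/g, "[redacted-token]"],
  [/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, "[redacted-email]"],
  [/\b(?:10\.\d{1,3}|192\.168|172\.(?:1[6-9]|2\d|3[01]))\.\d{1,3}\.\d{1,3}\b/g, "[redacted-ip]"],
];

function sanitizeErrorMessage(message: string): string {
  return REDACTIONS.reduce((text, [pattern, replacement]) => text.replace(pattern, replacement), message);
}

console.log(sanitizeErrorMessage("connect failed: postgres://app:hunter2@10.0.0.5:5432/prod"));
console.log(sanitizeErrorMessage("GET /x?password=hunter2&retry=1 failed"));
console.log(sanitizeErrorMessage("timeout talking to 192.168.1.20 with key sk_live_abc123"));
```

Pattern lists like this are inherently best-effort; a stricter design would return only a small set of canonical error codes to the plugin and keep the raw message in host logs.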

Layer 4: SSRF on Every Egress

Threat addressed: stop the plugin from pivoting into the cloud metadata endpoint, the internal network, or any other private address through the very fetch it just asked the bridge to make.

Prior art note. Everything in this section — the IPv4 private-range list, IPv6 loopback / link-local / unique-local handling, IPv4-mapped IPv6 recursion, the cloud-metadata endpoint at 169.254.169.254, the alternative IPv4 encoding bypass class (octal / hex / decimal), and even the signed-int shift gotcha — is canonical SSRF-prevention literature. The OWASP SSRF Prevention Cheat Sheet has been documenting these since the mid-2010s; Capital One's 2019 breach (metadata-endpoint exfil) and subsequent GitLab / Shopify / Atlassian incidents made the lessons painfully concrete. The work below is implementation; the curriculum was already on the shelf.

The plugin asks for utils.makeApiRequest({ url: 'http://...' }). Before the host process opens a single TCP connection, the URL goes through validateSSRFProtection in ssrf-validator.ts. It's the most heavily tested function in the security folder (~40 distinct cases parameterised via it.each over IP-range tables, 484 inputs total), because every line of it is a defence against a real bypass class with at least one public CVE writeup behind it.

SSRF validation flow: alternative encodings → URL parse → hostname → cloud metadata → IPv4 → IPv6 (mapped IPv6 recurses)

What gets blocked

IPv4 (parsed from dotted-decimal only — see "alternative encodings" below) — full list in BLOCKED_IPV4_RANGES:

127.0.0.0/8         loopback
10.0.0.0/8          RFC 1918 private
172.16.0.0/12       RFC 1918 private
192.168.0.0/16      RFC 1918 private
169.254.0.0/16      link-local -- includes 169.254.169.254 (AWS/GCP/Azure metadata!)
100.64.0.0/10       CGNAT (RFC 6598)
0.0.0.0/8           "this network"
224.0.0.0/4         multicast
240.0.0.0/4         reserved
255.255.255.255/32  broadcast

IPv6:

::1                 loopback
fc00::/7            unique-local
fe80::/10           link-local
ff00::/8            multicast
::ffff:0:0/96       IPv4-mapped -- recursively checks the embedded IPv4
fd00:ec2::254       AWS EC2 IPv6 metadata

Hostnames (exact match, case-insensitive):

localhost
ip6-localhost
ip6-loopback
localhost.localdomain
broadcasthost
metadata
metadata.google.internal
instance-data

Alternative IPv4 encodings — the easy bypass

Browsers and Node's URL parser are equally happy with all of these:

http://0177.0.0.1            # octal -- 127.0.0.1
http://0x7f.0.0.1            # hex -- 127.0.0.1
http://2130706433            # 32-bit decimal -- 127.0.0.1
http://127.1                 # short form -- 127.0.0.1
http://0177.0x0.0.1          # mixed

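WHATWG-style parsers canonicalise all of these to 127.0.0.1, but a validator that inspects the raw string, or whose IPv4 check only understands plain dotted-decimal, sees a harmless-looking hostname. Once the validator and the HTTP client disagree about what the host is, you've already lost. Rather than depend on canonicalisation, the validator rejects these spellings outright by running containsAlternativeIPFormat(url) before new URL(url):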

// Octal: starts with 0 then more digits
if (/^0\d/.test(part)) return true;
// Hex
if (/0x[0-9a-f]/i.test(hostname)) return true;
// Pure decimal greater than 255
if (/^\d+$/.test(hostname) && parseInt(hostname, 10) > 255) return true;

Anything matching → reject before parsing. A handful of regex checks implementing the alternative-encoding mitigation OWASP has recommended since the mid-2010s, and one that bigger products have repeatedly shipped SSRF defences without.
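The fragment above references `part` and `hostname` without showing where they come from, so here is one way the whole check could fit together. The `rawHost` helper, the per-part anchoring of the hex test, the trailing-dot handling, and the short-form rule (for `127.1`) are my additions, not the shipped code.

```typescript
// Pull the host out of the raw string without new URL(), which would canonicalise it.
function rawHost(url: string): string | null {
  const match = /^[a-z][a-z0-9+.-]*:\/\/(?:[^\/?#@]*@)?(\[[^\]]*\]|[^:\/?#]*)/i.exec(url);
  return match ? (match[1] ?? null) : null;
}

function containsAlternativeIPFormat(url: string): boolean {
  const host = rawHost(url);
  if (host === null || host.startsWith("[")) return false; // bracketed IPv6 is checked elsewhere

  const hostname = host.replace(/\.$/, ""); // tolerate one trailing dot
  const parts = hostname.split(".");

  if (parts.some((part) => /^0x[0-9a-f]*$/i.test(part))) return true;      // hex part
  if (parts.some((part) => /^0\d+$/.test(part))) return true;               // octal part
  if (/^\d+$/.test(hostname) && Number(hostname) > 255) return true;        // 32-bit decimal
  if (parts.length < 4 && parts.every((part) => /^\d+$/.test(part))) return true; // short form
  return false;
}

for (const url of [
  "http://0177.0.0.1",
  "http://0x7f.0.0.1",
  "http://2130706433",
  "http://127.1",
  "http://127.0.0.1",
  "https://your-instance.atlassian.net/rest/api/3/issue",
]) {
  console.log(url, containsAlternativeIPFormat(url));
}
```

Plain `127.0.0.1` deliberately returns false here: it is a normal dotted-decimal address, and blocking it is the job of the range check that runs afterwards.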

IPv4-mapped IPv6 — the second easy bypass

http://[::ffff:127.0.0.1]/ is a perfectly legal IPv6 URL that resolves to an IPv4 socket pointing at loopback. If your IPv4 check and IPv6 check don't talk to each other, this slips between them. The validator pulls the embedded IPv4 out of two formats — hex (::ffff:7f00:0001) and dotted-decimal (::ffff:127.0.0.1) — and recurses through the IPv4 check.
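A sketch of extracting the embedded IPv4 address from the two formats mentioned. `embeddedIPv4` is a hypothetical name, and this version only recognises the compressed `::ffff:` prefix; a fully expanded form such as `0:0:0:0:0:ffff:...` would need extra handling.

```typescript
// Returns the embedded IPv4 in dotted-decimal, or null if the host is not IPv4-mapped.
function embeddedIPv4(host: string): string | null {
  const bare = host.replace(/^\[|\]$/g, "").toLowerCase(); // strip URL brackets

  // Dotted form: ::ffff:127.0.0.1
  const dotted = /^::ffff:(\d{1,3}(?:\.\d{1,3}){3})$/.exec(bare);
  if (dotted) return dotted[1] ?? null;

  // Hex form: ::ffff:7f00:0001 (two 16-bit groups hold the four octets)
  const hex = /^::ffff:([0-9a-f]{1,4}):([0-9a-f]{1,4})$/.exec(bare);
  if (hex) {
    const high = parseInt(hex[1] ?? "0", 16);
    const low = parseInt(hex[2] ?? "0", 16);
    return [high >> 8, high & 0xff, low >> 8, low & 0xff].join(".");
  }
  return null;
}

console.log(embeddedIPv4("[::ffff:127.0.0.1]")); // "127.0.0.1"
console.log(embeddedIPv4("::ffff:7f00:0001"));   // "127.0.0.1"
console.log(embeddedIPv4("2001:db8::1"));        // null
```

Whatever string comes back is then fed through the same IPv4 range check as any other address, which is the recursion the text describes.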

The signed-int bug I shipped

The original IPv4-range comparison used straight bitwise OR:

const ipNum = (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3];

Looks fine until you parse 255.255.255.255, which sets the sign bit: JavaScript's bitwise operators work on signed 32-bit integers. ipNum becomes -1, the comparison flips, and any IP at or above 128.0.0.0 mis-classifies. The fix is one extra operator:

const ipNum = (((octets[0] << 24) | …) >>> 0);

>>> 0 coerces to unsigned 32-bit. 255.255.255.255 becomes 4_294_967_295 instead of -1. The test that caught this had 192.168.1.1 as the input — exactly in the half of address space the bug affected.
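The difference is easy to see side by side, along with a CIDR membership check built on the unsigned form. `ipToIntSigned`, `ipToInt`, and `inCidr` are illustrative names, and the sketch assumes already-validated dotted-decimal input (no octet range checks).

```typescript
// The buggy version: the result is a signed 32-bit integer.
function ipToIntSigned(ip: string): number {
  const [a = 0, b = 0, c = 0, d = 0] = ip.split(".").map(Number);
  return (a << 24) | (b << 16) | (c << 8) | d;
}

// The fixed version: >>> 0 reinterprets the same 32 bits as unsigned.
function ipToInt(ip: string): number {
  return ipToIntSigned(ip) >>> 0;
}

// CIDR membership using unsigned arithmetic throughout.
function inCidr(ip: string, cidr: string): boolean {
  const [base = "", bitsText = "32"] = cidr.split("/");
  const bits = Number(bitsText);
  // A shift by 32 wraps to a shift by 0 in JavaScript, so /0 needs its own case.
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipToInt(ip) & mask) >>> 0) === ((ipToInt(base) & mask) >>> 0);
}

console.log(ipToIntSigned("192.168.1.1")); // -1062731519
console.log(ipToInt("192.168.1.1"));       // 3232235777
console.log(inCidr("169.254.169.254", "169.254.0.0/16")); // true
```

With the signed value, a comparison such as `ipNum >= rangeStart` against an unsigned range constant fails for every address in the upper half of the space, which includes 192.168.0.0/16 and 169.254.0.0/16.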

What I Don't Catch

A sandbox post that doesn't include an honest "what I don't catch" section is selling something. Mine, as a tracked list:

#	Gap	Status	Why it's open	Mitigation
1	CPU exhaustion within 15 s budget	Accepted	An extra `fork()` per bug is meaningful at production volume	15 s wall-clock kills the script; queue absorbs the latency
2	Spectre-class side channels in shared V8	Accepted	OS-level isolation (separate Node process per plugin) costs more than the threat justifies today	Will revisit if/when we onboard customers on a higher-assurance tier
3	Egress to public hosts	By design	The plugin must reach the customer's tracker — public-host blocking would defeat the feature	Human review of plugin code before approval; customer-data trust model
4	DNS rebinding race between validate and fetch	Open — fix planned	Node's `fetch` doesn't expose a clean way to pin a pre-resolved IP without breaking TLS SNI	Custom `dispatcher` PR scoped. Exploitable when an attacker controls a DNS server and knows the request pattern; at production scale that's a meaningful surface, not a "small race." Treating as a real open item, not a theoretical one.
5	Compromised npm package in the allowlist	Tracked externally	Layer 1 allowlists `axios` / `lodash` / `date-fns` for plugins; the host also depends on `axios` directly — a supply-chain attack would compromise both the host and every plugin simultaneously	pnpm-lock + Dependabot/Renovate review; out of analyzer scope
6	`isolated-vm` is in maintenance mode	Accepted (monitored)	Maintainer states no architectural improvements are coming; new Node versions still supported. No alternative library exists today	Pinned to exact `6.0.2` (no caret); watching upstream issues; would migrate if breakage outpaces patches

Threats 1–2 and 6 are accepted risks with stated triggers for revisiting. Threat 3 is the trust boundary itself — nothing in any of the four technical layers protects you from a malicious-but-approved plugin author. Threats 4–5 are open and tracked.

Re-stating what is caught, since it's easy to lose track in a list of caveats: cross-tenant reads (Layer 3), credential theft from the host (Layer 2 + Layer 3), SSRF into cloud metadata or internal network (Layer 4), and resource exhaustion at the obvious thresholds (Layer 2 memory + timeout).

Layer 0 — human review (the layer that actually defines the trust)

The four technical layers above police what plugin code can do. None of them decide whether a given plugin gets to run at all. That decision is mine, made before any of those layers ever execute. This is Layer 0 in everything but the diagram, and it's worth being explicit about it because honest disclosure of the technical limits is meaningless if the process around them isn't described.

How it currently works (small-scale, founder-driven; this will need to change at some point):

Trigger. Plugin upload via the admin UI marks the integration as awaiting_review. It cannot be enabled by the customer until review passes — the runtime won't load awaiting_review plugins regardless of any other config.
Scope of review. I read every line. For a non-trivial plugin (multi-step auth flow, custom field mapping) that's typically 30–90 minutes of work plus another 30 for any follow-up questions to the author.
What I'm looking for. Specifically: outbound URL list (does it talk only to the customer's claimed tracker, or to other public hosts too?), the data fields it reads from bug.metadata (does it pull more than it strictly needs?), how it handles errors (does it swallow them, or surface them in a way a malicious version could exfiltrate through?), and any axios interceptor / lodash setter that touches data in transit.
What gets rejected. Plugins that hit any public host other than the declared tracker, plugins that read bug.metadata fields not used for their stated purpose, plugins that use eval-equivalent patterns even when Layer 1 would catch them at load time (signal of intent), plugins whose author can't explain what their code does line-by-line on a follow-up call.
False-positive cost. Some legitimate plugins get pushed back over things that turn out to be fine (a third-party domain that's actually the tracker's CDN, etc.). I'd rather false-positive than false-negative on this gate.
False-negative cost. A malicious-but-approved plugin gets full access to that customer's bug-report data — including PII fields the SDK already sanitised but that may still contain enough context to be useful. Once approved, the plugin can quietly POST any bug it processes to any public host. The technical layers do not stop this.

Where this falls apart at scale. Right now I'm reviewing every plugin myself. That's not a scalable process — somewhere around the 20th custom-plugin customer, this becomes a full-time job and the review quality degrades. I haven't designed the next stage yet; honest answer is that "second reviewer required for plugins touching financial / health-data fields" is the next step, and we're not there. If your custom-plugin offering becomes a real product, Layer 0 is the layer you'll spend the most time on, not Layers 1–4.

Numbers

Surface area

File	Lines	What it does
`plugin-executor.ts`	~1.0k	isolated-vm wiring, factory lifecycle, dispose
`rpc-bridge.ts`	~0.9k	Method whitelist, project scoping, header blocklist
`ssrf-validator.ts`	~0.4k	IPv4 / IPv6 / hostname / alternative-encoding checks
`code-analyzer.ts`	~0.2k	Babel AST visitors + regex sweep
Total Layer 1–4 production	~2.5k	All four layers combined

LoC isn't a quality metric — every line is more attack surface and more maintenance debt. Numbers above are what has to stay correct, not a virtue. Worth showing because "is this big or small" is a reasonable question before copying any of it; rounded to hundreds because the exact count isn't load-bearing.

Tests:

File	Cases
`ssrf-protection.test.ts`	~40 distinct cases × `it.each` over IP-range tables → 484 inputs
`rpc-bridge-security.test.ts`	~70
`plugin-utils-*.test.ts`	113
`code-analyzer.test.ts`	61

What it costs

Rough, single-run measurements on my dev machine. Not p95/p99 from production telemetry, not benchmarked against a baseline (e.g. JSON-RPC over HTTP without a sandbox), and production numbers may differ materially under contention and GC pressure. A typical plugin invocation lands in low single-digit ms hot (warm isolate, before HTTP) and ~30 ms cold (one-time per project per worker lifetime); read these as order-of-magnitude, not as performance claims. Memory: 128 MB cap per active plugin × number of customers using custom plugins. On a 1 GB-RAM API container, that's room for ~6 active plugin isolates before a separate worker tier becomes necessary — most customers use built-in plugins, so this rarely binds.

Not free, but cheap enough to not think about. The cost that actually matters is human review time on every plugin upload (an hour per non-trivial plugin — Layer 0 in everything but the diagram) and the queue-position cost of the 15-second timeout if something runs away.