Custom pixels without the SSRF footgun
Letting merchants define their own webhook destination sounds like a checkbox feature. It's actually a security project. Here's what a naive implementation lets attackers do, and the validation layer that stops it.
A merchant adds a custom destination to your app. The URL field accepts any
string. They paste http://169.254.169.254/latest/meta-data/iam/security-credentials/.
Your worker dutifully POSTs to it, gets back AWS IAM credentials, and
forwards them to a Slack webhook the same merchant configured five minutes
ago.
Game over — for you, not them.
The naive version
Most “custom webhook” features start here: a text field for the URL, a text
field for the auth header, drop both into http.Post(...). It works in the
demo. It ships Friday. It wakes someone up on Monday.
Every field a user can type into is an attack surface. A URL field is a remote-code-reach-ability field.
What an attacker can reach
Without validation, the full blast radius:
- Cloud metadata endpoints. AWS 169.254.169.254, GCP metadata.google.internal, Azure equivalents. These return IAM credentials to anything that can make a TCP connection from inside the VM.
- Internal services on the same Docker network. http://postgres:5432, http://redis:6379, anything resolvable by compose service name.
- Localhost. Your admin endpoints, debug routes, profilers, anything bound to 127.0.0.1.
- Private RFC1918 ranges. 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16. Whatever’s on your private network.
- IPv6 link-local and unique-local. fe80::/10, fc00::/7. Easy to forget if your denylist only covers IPv4.
- DNS rebinding. The URL resolves to a public IP on the first lookup, an internal IP on the second. Validation that only checks once gets bypassed by the actual request.
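The ranges above translate fairly directly into a denylist. Here is a minimal Go sketch using the standard `net/netip` package; the names `blockedPrefixes`, `isBlockedAddr`, and `isBlockedIP` are made up for illustration, and the list is a starting point rather than a complete inventory of everything worth blocking.

```go
package main

import "net/netip"

// blockedPrefixes is an illustrative denylist, not an exhaustive one.
var blockedPrefixes = []netip.Prefix{
	netip.MustParsePrefix("0.0.0.0/8"),          // "this network"
	netip.MustParsePrefix("10.0.0.0/8"),         // RFC 1918
	netip.MustParsePrefix("127.0.0.0/8"),        // loopback
	netip.MustParsePrefix("169.254.0.0/16"),     // IPv4 link-local
	netip.MustParsePrefix("169.254.169.254/32"), // cloud metadata, listed explicitly
	netip.MustParsePrefix("172.16.0.0/12"),      // RFC 1918
	netip.MustParsePrefix("192.168.0.0/16"),     // RFC 1918
	netip.MustParsePrefix("224.0.0.0/4"),        // IPv4 multicast
	netip.MustParsePrefix("::1/128"),            // IPv6 loopback
	netip.MustParsePrefix("fc00::/7"),           // IPv6 unique-local
	netip.MustParsePrefix("fe80::/10"),          // IPv6 link-local
	netip.MustParsePrefix("ff00::/8"),           // IPv6 multicast
}

// isBlockedAddr reports whether addr falls inside any denied range.
func isBlockedAddr(addr netip.Addr) bool {
	// Unmap so ::ffff:10.0.0.1 is checked as 10.0.0.1, and drop the zone
	// because Prefix.Contains never matches a zoned address.
	addr = addr.Unmap().WithZone("")
	if !addr.IsValid() || addr.IsUnspecified() {
		return true
	}
	for _, p := range blockedPrefixes {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

// isBlockedIP parses a literal IP and fails closed on anything unparseable.
func isBlockedIP(s string) bool {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return true
	}
	return isBlockedAddr(addr)
}
```

The two details that are easy to miss are the IPv4-mapped IPv6 form and zoned link-local addresses; both are normalized before the prefix check so they cannot slip past an IPv4-only or zone-unaware comparison.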
What the validation actually has to do
Layered, in order:
- Resolve the hostname before the HTTP request. Reject if any resolved IP matches a denylist: private ranges, loopback, link-local, multicast, plus the cloud metadata IPs explicitly. Check both A and AAAA records.
- Bind the HTTP client to the resolved IP, not the hostname. Dial the IP you validated, pass the original hostname in the Host header and SNI. The OS cannot re-resolve mid-request. DNS rebinding closed.
- Disallow redirects, or re-validate every redirect target. A 302 back to 169.254.169.254 should not bypass step 1.
- Cap response body size. A malicious endpoint can stream gigabytes just to tie up your worker and egress budget.
- Aggressive timeouts. 5s connect, 15s total is a reasonable starting point. Custom destinations aren’t latency-sensitive.
- Restrict schemes to http and https. No file://, no gopher://, no ftp://. Most libraries default to everything.
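Here is one way several of these layers could fit together in Go. This is a sketch under assumptions, not the eventabee code. Instead of resolving first and then dialing the pinned IP with a hand-set Host header, it uses a close cousin: a `net.Dialer` `Control` hook, which runs after DNS resolution and immediately before the socket connects. The address it checks is therefore the address that gets used, which closes the same rebinding gap. The names `checkDial`, `checkURL`, `newGuardedClient`, `send`, and the 1 MiB cap are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/netip"
	"net/url"
	"syscall"
	"time"
)

const maxBodyBytes = 1 << 20 // assumed 1 MiB response cap

var errBlocked = errors.New("destination address is not allowed")

// allowedAddr leans on netip's built-in classifiers for brevity; an explicit
// CIDR denylist would slot in here instead.
func allowedAddr(a netip.Addr) bool {
	a = a.Unmap()
	return a.IsValid() && !a.IsUnspecified() && !a.IsLoopback() &&
		!a.IsPrivate() && !a.IsLinkLocalUnicast() && !a.IsMulticast()
}

// checkDial is a net.Dialer Control hook. It runs after DNS resolution and
// before connect, so address is the literal IP:port the socket will use.
func checkDial(network, address string, _ syscall.RawConn) error {
	ap, err := netip.ParseAddrPort(address)
	if err != nil || !allowedAddr(ap.Addr()) {
		return errBlocked
	}
	return nil
}

// checkURL restricts destinations to http and https with a non-empty host.
func checkURL(raw string) (*url.URL, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, err
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return nil, fmt.Errorf("scheme %q is not allowed", u.Scheme)
	}
	if u.Hostname() == "" {
		return nil, errors.New("URL has no host")
	}
	return u, nil
}

// newGuardedClient builds a client with 5s connect and 15s total timeouts,
// no proxy, and redirects returned to the caller instead of followed.
func newGuardedClient() *http.Client {
	dialer := &net.Dialer{Timeout: 5 * time.Second, Control: checkDial}
	return &http.Client{
		Timeout:   15 * time.Second, // whole exchange, including the body read
		Transport: &http.Transport{DialContext: dialer.DialContext},
		CheckRedirect: func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse // hand back the 3xx, never follow it
		},
	}
}

// send posts payload and reads at most maxBodyBytes of the response.
func send(c *http.Client, rawURL string, payload io.Reader) ([]byte, error) {
	u, err := checkURL(rawURL)
	if err != nil {
		return nil, err
	}
	resp, err := c.Post(u.String(), "application/json", payload)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(io.LimitReader(resp.Body, maxBodyBytes))
}
```

One caveat on this design: the dial-time check inspects whatever the client connects to, so if the transport is ever pointed at an HTTP proxy, the hook would see the proxy's address rather than the destination's. The sketch leaves `Proxy` unset for that reason.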
The eventabee implementation
The SSRF guard lives once, in the connector layer, shared between the custom-pixel connector and the generic webhook connector. One source of truth, one place to audit, one place to add the next IP range we didn’t think of yet.
The field-mapper pattern matters here too. Merchants map event fields to destination fields through dropdowns, not a template DSL. There is no arbitrary template execution, no shell expansion, no “interesting” string interpolation. The Business-tier raw JSON template is sandboxed to event-derived field substitution — no code, no functions, no escapes.
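To make "substitution only" concrete, here is a hedged Go sketch of what such a renderer could look like. The `{{field}}` placeholder syntax, the `renderTemplate` name, and the flat event map are assumptions for illustration; the actual template format is not described in this post.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"regexp"
)

// placeholder matches {{field_name}}; the syntax is assumed for illustration.
var placeholder = regexp.MustCompile(`\{\{\s*([A-Za-z_][A-Za-z0-9_.]*)\s*\}\}`)

// renderTemplate swaps each placeholder for the JSON encoding of the matching
// event field. Nothing is evaluated: no functions, no conditionals, and no
// lookups outside the event map.
func renderTemplate(tmpl string, event map[string]any) (string, error) {
	var firstErr error
	out := placeholder.ReplaceAllStringFunc(tmpl, func(m string) string {
		name := placeholder.FindStringSubmatch(m)[1]
		v, ok := event[name]
		if !ok {
			if firstErr == nil {
				firstErr = fmt.Errorf("unknown field %q", name)
			}
			return "null"
		}
		b, err := json.Marshal(v) // escapes quotes, so values cannot break out
		if err != nil {
			if firstErr == nil {
				firstErr = err
			}
			return "null"
		}
		return string(b)
	})
	if firstErr != nil {
		return "", firstErr
	}
	// Anything that is not a plain field name is left untouched above and
	// rejected here, because the result is no longer valid JSON.
	if !json.Valid([]byte(out)) {
		return "", errors.New("rendered template is not valid JSON")
	}
	return out, nil
}
```

Because every value goes through `json.Marshal` and the output is re-validated as JSON, a merchant-controlled string containing quotes or braces ends up as data inside the payload rather than as structure.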
Validation runs at config save and at every send. If DNS resolves differently between Tuesday and Wednesday, the send-time check catches it.
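A small Go sketch of that idea, with DNS stubbed so the Tuesday/Wednesday case can be exercised without a network. `lookupFunc`, `staticLookup`, and `validateHost` are hypothetical names; `systemLookup` simply wraps the standard library's default resolver. A check like this complements connection-time pinning and does not replace it.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/netip"
	"time"
)

// lookupFunc has the signature of (*net.Resolver).LookupNetIP so DNS can be
// swapped for a stub.
type lookupFunc func(ctx context.Context, network, host string) ([]netip.Addr, error)

// systemLookup resolves through the process's default resolver.
var systemLookup lookupFunc = net.DefaultResolver.LookupNetIP

// staticLookup is a stub that always returns the given literal IPs.
func staticLookup(ips ...string) lookupFunc {
	return func(context.Context, string, string) ([]netip.Addr, error) {
		addrs := make([]netip.Addr, 0, len(ips))
		for _, s := range ips {
			a, err := netip.ParseAddr(s)
			if err != nil {
				return nil, err
			}
			addrs = append(addrs, a)
		}
		return addrs, nil
	}
}

// validateHost resolves host and rejects it if ANY returned address is
// internal. The same function is meant to run at config save and before
// every send.
func validateHost(lookup lookupFunc, host string) error {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	addrs, err := lookup(ctx, "ip", host) // "ip" asks for both A and AAAA
	if err != nil {
		return fmt.Errorf("resolve %q: %w", host, err)
	}
	if len(addrs) == 0 {
		return fmt.Errorf("%q resolved to no addresses", host)
	}
	for _, a := range addrs {
		a = a.Unmap()
		if a.IsUnspecified() || a.IsLoopback() || a.IsPrivate() ||
			a.IsLinkLocalUnicast() || a.IsMulticast() {
			return fmt.Errorf("%q resolves to blocked address %s", host, a)
		}
	}
	return nil
}
```

In production the call would be `validateHost(systemLookup, host)`. With the stub, a hostname that passes on one day and picks up a private address the next is rejected on the second check, which is the case the send-time validation exists for.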
What you still can’t prevent
Merchants pointing the destination at themselves and racking up egress bandwidth on your dime. Per-merchant rate limits and a circuit breaker on consecutive 5xx are not optional. Not glamorous. Required.
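For the circuit-breaker half of that, a minimal Go sketch might look like the following. The threshold of 5, the 300-second cooldown, and the `breaker` and `registry` names are assumptions; timestamps are passed in as Unix seconds to keep the logic easy to test. Rate limiting is not shown.

```go
package main

import "sync"

// breaker trips after `threshold` consecutive 5xx responses and stays open
// for `cooldown` seconds. Timestamps are Unix seconds supplied by the caller.
type breaker struct {
	mu        sync.Mutex
	threshold int
	cooldown  int64
	failures  int
	openedAt  int64
}

func newBreaker(threshold int, cooldownSeconds int64) *breaker {
	return &breaker{threshold: threshold, cooldown: cooldownSeconds}
}

// allow reports whether a send may proceed at time now.
func (b *breaker) allow(now int64) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.failures < b.threshold {
		return true // closed: traffic flows normally
	}
	// Open: block until the cooldown has passed, then let probes through.
	return now-b.openedAt >= b.cooldown
}

// record feeds the outcome of a send back into the breaker.
func (b *breaker) record(status int, now int64) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if status >= 500 {
		b.failures++
		if b.failures >= b.threshold {
			b.openedAt = now // trip, or re-trip after a failed probe
		}
		return
	}
	b.failures = 0 // any non-5xx response closes the breaker
}

// registry hands out one breaker per merchant ID.
type registry struct {
	mu         sync.Mutex
	byMerchant map[string]*breaker
}

func (r *registry) forMerchant(id string) *breaker {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.byMerchant == nil {
		r.byMerchant = make(map[string]*breaker)
	}
	b, ok := r.byMerchant[id]
	if !ok {
		b = newBreaker(5, 300) // assumed defaults
		r.byMerchant[id] = b
	}
	return b
}
```

Keying the breaker by merchant keeps one misbehaving destination from affecting anyone else's sends.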
If your app lets users define an outbound HTTP destination and you didn’t write SSRF validation in the first sprint, assume it’s exploitable. Then go check. We built this into eventabee before the feature shipped — not because it was hard, but because adding it after a merchant pastes the metadata URL is the wrong time to learn.