URL rewriting

URL rewriting is fundamental to how Scramjet works. Every URL that passes through the proxy must be encoded so that it routes through the Service Worker, and every URL in proxied content must be rewritten so that follow-up requests also stay inside the proxy.

URL encoding and decoding

The codec system

Scramjet uses a configurable codec to encode and decode URLs. The codec is a pair of functions defined in your configuration:

const scramjet = new ScramjetController({
  prefix: "/scramjet/",
  codec: {
    encode: (url: string) => encodeURIComponent(url),
    decode: (url: string) => decodeURIComponent(url),
  },
});

The default codec uses encodeURIComponent and decodeURIComponent, but you can use any encoding scheme (base64, custom obfuscation, etc.).
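
As an illustration of a non-default scheme, here is a minimal sketch of a hex codec. The Codec type and the hexCodec name are assumptions made for this example and are not part of Scramjet's API; only the encode/decode pair shape comes from the configuration above.

```typescript
// Hypothetical codec shape, mirroring the encode/decode pair shown above.
type Codec = {
  encode: (url: string) => string;
  decode: (url: string) => string;
};

// Illustrative hex codec: each UTF-16 code unit becomes four hex digits,
// so the output contains only [0-9a-f] and needs no further escaping.
const hexCodec: Codec = {
  encode: (url) => {
    let out = "";
    for (let i = 0; i < url.length; i++) {
      out += ("000" + url.charCodeAt(i).toString(16)).slice(-4);
    }
    return out;
  },
  decode: (encoded) => {
    let out = "";
    for (let i = 0; i < encoded.length; i += 4) {
      out += String.fromCharCode(parseInt(encoded.slice(i, i + 4), 16));
    }
    return out;
  },
};

const sampleUrl = "https://example.com/page?query=value";
console.log(hexCodec.decode(hexCodec.encode(sampleUrl)) === sampleUrl); // true
```

An empty string encodes to an empty string here, which matters because the hash is encoded separately and is often empty.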

How encoding works

When you navigate to a URL through Scramjet:

frame.go("https://example.com/page?query=value#hash");

The controller’s encodeUrl() method transforms it:

Parse the URL

The URL is parsed into a URL object.

Extract and encode hash

The hash fragment is separated and encoded independently:

const encodedHash = codecEncode(url.hash.slice(1));
const realHash = encodedHash ? "#" + encodedHash : "";
url.hash = ""; // Remove from URL before encoding

Encode the URL

The main URL (without hash) is encoded:

return config.prefix + codecEncode(url.href) + realHash;

Result

https://example.com/page?query=value#hash
↓
/scramjet/https%3A%2F%2Fexample.com%2Fpage%3Fquery%3Dvalue#hash

Note that the hash is preserved separately so browser navigation works correctly.
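
The steps above can be combined into one standalone function. This is a sketch under stated assumptions, not the controller's actual code: the prefix value and the names sketchPrefix, codecEncode, and encodeUrlSketch are stand-ins for the configured values.

```typescript
// Assumed values for illustration; a real setup reads these from config.
const sketchPrefix = "/scramjet/";
const codecEncode = (s: string): string => encodeURIComponent(s);

function encodeUrlSketch(input: string): string {
  const url = new URL(input);
  // Encode the fragment on its own (without the leading "#").
  const encodedHash = codecEncode(url.hash.slice(1));
  const realHash = encodedHash ? "#" + encodedHash : "";
  // Drop the fragment so it is not encoded into the path as well.
  url.hash = "";
  return sketchPrefix + codecEncode(url.href) + realHash;
}

console.log(encodeUrlSketch("https://example.com/page?query=value#hash"));
// /scramjet/https%3A%2F%2Fexample.com%2Fpage%3Fquery%3Dvalue#hash
```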

How decoding works

When the Service Worker intercepts a request, it decodes the URL:

const url = new URL(unrewriteUrl(requestUrl));

The unrewriteUrl() function in src/shared/rewriters/url.ts reverses the process:

Check for special protocols

Handle special URL types first:

if (url.startsWith("javascript:")) return url;
if (url.startsWith("mailto:")) return url;
if (url.startsWith("about:")) return url;

Handle blob/data URLs

Proxied blob: and data: URLs carry only the proxy prefix, so stripping it restores the original. Here prefixed is the proxy origin followed by config.prefix:

if (url.startsWith(prefixed + "blob:")) {
  return url.substring(prefixed.length);
}
if (url.startsWith(prefixed + "data:")) {
  return url.substring(prefixed.length);
}

Decode the main URL

const realUrl = tryCanParseURL(url);
const decodedHash = codecDecode(realUrl.hash.slice(1));
const realHash = decodedHash ? "#" + decodedHash : "";
realUrl.hash = "";

return codecDecode(realUrl.href.slice(prefixed.length) + realHash);
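
To see the whole decode path end to end, here is a self-contained sketch. It is a simplification and may differ from the source: it uses a hypothetical proxy origin (https://proxy.example), hard-codes the default codec, and decodes the path and the hash separately.

```typescript
// Hypothetical proxy origin + prefix; real code derives this at runtime.
const proxyPrefix = "https://proxy.example/scramjet/";
const codecDecode = (s: string): string => decodeURIComponent(s);

function decodeUrlSketch(proxied: string): string {
  // Special schemes are never proxied, so they pass through untouched.
  for (const scheme of ["javascript:", "mailto:", "about:"]) {
    if (proxied.startsWith(scheme)) return proxied;
  }
  // blob: and data: URLs only carry the prefix; strip it.
  for (const scheme of ["blob:", "data:"]) {
    if (proxied.startsWith(proxyPrefix + scheme)) {
      return proxied.substring(proxyPrefix.length);
    }
  }
  const url = new URL(proxied);
  const decodedHash = codecDecode(url.hash.slice(1));
  const realHash = decodedHash ? "#" + decodedHash : "";
  url.hash = "";
  return codecDecode(url.href.slice(proxyPrefix.length)) + realHash;
}

console.log(
  decodeUrlSketch(
    "https://proxy.example/scramjet/https%3A%2F%2Fexample.com%2Fpage%3Fquery%3Dvalue#hash"
  )
);
// https://example.com/page?query=value#hash
```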

URL rewriting in content

Once content is fetched, all URLs within it must be rewritten to point through the proxy.

The rewriteUrl function

The rewriteUrl() function is the core rewriter, defined in src/shared/rewriters/url.ts:

export function rewriteUrl(url: string | URL, meta: URLMeta): string

It takes:

url: The URL to rewrite (absolute or relative)
meta: Context about the current page (origin, base URL, frame names)

And returns the proxied URL.

URLMeta context

The URLMeta object provides context for rewriting:

type URLMeta = {
  origin: URL;              // Real origin of the current page
  base: URL;                // Base URL for resolving relative URLs
  topFrameName?: string;    // Top Scramjet frame name
  parentFrameName?: string; // Parent frame name
};

The base URL can differ from origin if the page contains a <base> tag. When that happens, meta.base is updated dynamically during HTML rewriting.
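
The difference matters for relative references. The following sketch uses the standard URL constructor with a hypothetical page that declares a <base> tag pointing at a CDN; the URLs and variable names are made up for illustration.

```typescript
// Hypothetical page: served from example.com, but it declares
// <base href="https://cdn.example.com/assets/">
const pageUrl = new URL("https://example.com/articles/post.html");
const baseUrl = new URL("https://cdn.example.com/assets/");

// The same relative reference resolves differently against each URL.
const againstPage = new URL("img/logo.png", pageUrl).href;
const againstBase = new URL("img/logo.png", baseUrl).href;

console.log(againstPage); // https://example.com/articles/img/logo.png
console.log(againstBase); // https://cdn.example.com/assets/img/logo.png
```

Rewriting against the page URL instead of the base would send the request for the image to the wrong host.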

Special URL handling

rewriteUrl() handles different URL schemes:

javascript: URLs

JavaScript URLs are rewritten by rewriting the JavaScript code:

if (url.startsWith("javascript:")) {
  return (
    "javascript:" +
    rewriteJs(url.slice("javascript:".length), "(javascript: url)", meta)
  );
}

blob: URLs

Blob URLs are prefixed with the proxy origin and the configured prefix, without codec encoding:

if (url.startsWith("blob:")) {
  return location.origin + config.prefix + url;
}

This routes blob fetches through the Service Worker so they can be rewritten.

data: URLs

Data URLs are also prefixed:

if (url.startsWith("data:")) {
  return location.origin + config.prefix + url;
}

mailto: and about: URLs

These are returned unchanged:

if (url.startsWith("mailto:") || url.startsWith("about:")) {
  return url;
}

HTTP(S) URLs

Regular URLs are resolved against the base, then encoded:

let base = meta.base.href;
if (base.startsWith("about:")) {
  base = unrewriteUrl(self.location.href);
}

const realUrl = tryCanParseURL(url, base);
if (!realUrl) return url; // Invalid URL, return as-is

const encodedHash = codecEncode(realUrl.hash.slice(1));
const realHash = encodedHash ? "#" + encodedHash : "";
realUrl.hash = "";

return (
  location.origin + config.prefix + codecEncode(realUrl.href) + realHash
);
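
The scheme checks above can be read as one dispatch. The sketch below strings them together under stated assumptions: proxyOrigin and pathPrefix stand in for location.origin and config.prefix, the default codec is hard-coded, and the javascript: branch is omitted because it depends on the JS rewriter.

```typescript
// Assumed values; a real deployment takes these from location.origin and config.
const proxyOrigin = "https://proxy.example";
const pathPrefix = "/scramjet/";
const encodePart = (s: string): string => encodeURIComponent(s);

function rewriteUrlSketch(url: string, base: string): string {
  // Schemes that never reach the network are returned unchanged.
  if (url.startsWith("mailto:") || url.startsWith("about:")) return url;
  // blob: and data: URLs are prefixed as-is, without codec encoding.
  if (url.startsWith("blob:") || url.startsWith("data:")) {
    return proxyOrigin + pathPrefix + url;
  }
  let resolved: URL;
  try {
    resolved = new URL(url, base);
  } catch {
    return url; // unparseable input is left alone
  }
  const encodedHash = encodePart(resolved.hash.slice(1));
  const realHash = encodedHash ? "#" + encodedHash : "";
  resolved.hash = "";
  return proxyOrigin + pathPrefix + encodePart(resolved.href) + realHash;
}

const docsBase = "https://example.com/docs/";
console.log(rewriteUrlSketch("mailto:a@b.c", docsBase));
// mailto:a@b.c
console.log(rewriteUrlSketch("data:text/plain,hi", docsBase));
// https://proxy.example/scramjet/data:text/plain,hi
console.log(rewriteUrlSketch("/about", docsBase));
// https://proxy.example/scramjet/https%3A%2F%2Fexample.com%2Fabout
console.log(rewriteUrlSketch("#top", docsBase));
// https://proxy.example/scramjet/https%3A%2F%2Fexample.com%2Fdocs%2F#top
```

Note that "/about" is a relative path, not an about: URL, so it falls through to the HTTP(S) branch.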

Relative URL resolution

Relative URLs are resolved against meta.base:

const realUrl = tryCanParseURL(url, base);

This uses the browser’s native URL parser:

function tryCanParseURL(url: string, origin?: string | URL): URL | null {
  try {
    return new URL(url, origin);
  } catch {
    return null;
  }
}

If parsing fails, tryCanParseURL() returns null and rewriteUrl() returns the original string unchanged. This prevents malformed URLs from breaking the page.
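
A few resolutions make the behavior concrete. The helper is repeated so the example runs on its own; the base URL and inputs are hypothetical.

```typescript
function tryCanParseURL(url: string, origin?: string | URL): URL | null {
  try {
    return new URL(url, origin);
  } catch {
    return null;
  }
}

const pageBase = "https://example.com/app/page.html"; // hypothetical base

console.log(tryCanParseURL("../api/data.json?x=1", pageBase)?.href);
// https://example.com/api/data.json?x=1
console.log(tryCanParseURL("/root.css", pageBase)?.href);
// https://example.com/root.css
console.log(tryCanParseURL("//cdn.example.com/lib.js", pageBase)?.href);
// https://cdn.example.com/lib.js

// Without a base, a relative reference cannot be parsed, so the result is null.
console.log(tryCanParseURL("img/logo.png")); // null
```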

Where URLs are rewritten

Scramjet rewrites URLs in multiple places throughout the stack:

HTML rewriting

In src/shared/rewriters/html.ts, URLs in HTML attributes are rewritten:

for (const rule of htmlRules) {
  for (const attr in rule) {
    if (node.attribs[attr] !== undefined) {
      const value = node.attribs[attr];
      const v = rule.fn(value, meta, cookieStore);
      if (v === null) delete node.attribs[attr];
      else node.attribs[attr] = v;
    }
  }
}

The htmlRules array (from src/shared/htmlRules.ts) defines which attributes to rewrite:

{ 
  href: ["a", "link", "area", "base"],
  fn: (value, meta) => rewriteUrl(value, meta)
},
{
  src: ["script", "img", "iframe", "embed", "source", "track", "video", "audio"],
  fn: (value, meta) => rewriteUrl(value, meta)
},
{
  action: ["form"],
  fn: (value, meta) => rewriteUrl(value, meta)
}
// ... etc

Special HTML cases

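The srcset attribute holds a comma-separated list of image candidates, each a URL with an optional width or density descriptor, so it needs a dedicated rewriter:

export function rewriteSrcset(srcset: string, meta: URLMeta) {
  // Split candidates on a comma followed by whitespace. A greedy pattern
  // such as / .*,/ would swallow every candidate between the first and last comma.
  const sources = srcset.split(/,\s+/).map((src) => src.trim());
  const rewrittenSources = sources.map((source) => {
    // First token is the URL; anything after it is a descriptor (e.g. "2x", "480w")
    const [url, ...descriptors] = source.split(/\s+/);
    const rewrittenUrl = rewriteUrl(url.trim(), meta);
    return descriptors.length > 0
      ? `${rewrittenUrl} ${descriptors.join(" ")}`
      : rewrittenUrl;
  });
  return rewrittenSources.join(", ");
}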

JavaScript rewriting

JavaScript rewriting is more complex. It uses an oxc-based WASM rewriter to:

Parse the JavaScript AST
Identify API calls that accept URLs
Wrap those calls with runtime functions that rewrite URLs

For example, this code:

fetch("https://example.com/api");

Is rewritten to:

fetch($scramjet$rewrite("https://example.com/api"));

The $scramjet$rewrite function calls rewriteUrl() at runtime with the current page’s metadata.

CSS rewriting

In src/shared/rewriters/css.ts, URLs in CSS are rewritten:

export function rewriteCss(css: string, meta: URLMeta): string {
  return css.replace(/url\(["']?([^"')]+)["']?\)/gi, (match, url) => {
    const rewritten = rewriteUrl(url.trim(), meta);
    return `url("${rewritten}")`;
  });
}

This handles:

background-image: url(...)
@import url(...)
@font-face { src: url(...) }
etc.
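
To try the pattern in isolation, the sketch below applies the same regular expression with a stand-in for rewriteUrl(). The stub, the fixed prefix, and the stylesheet URL are assumptions for illustration only.

```typescript
// Stand-in for rewriteUrl(): resolves against a base and applies a fixed
// prefix. Both values are assumptions for this sketch.
const cssBase = "https://example.com/styles/main.css";
const stubRewriteUrl = (url: string): string =>
  "/scramjet/" + encodeURIComponent(new URL(url, cssBase).href);

function rewriteCssSketch(css: string): string {
  return css.replace(/url\(["']?([^"')]+)["']?\)/gi, (_match, url: string) => {
    return `url("${stubRewriteUrl(url.trim())}")`;
  });
}

const sampleCss = "body { background-image: url('../img/bg.png'); }";
console.log(rewriteCssSketch(sampleCss));
// body { background-image: url("/scramjet/https%3A%2F%2Fexample.com%2Fimg%2Fbg.png"); }
```

Because the pattern only matches url(...) tokens, a bare string import such as @import "theme.css" would not be matched by this expression alone.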

Client-side interception

In the ScramjetClient, DOM APIs are intercepted to rewrite URLs at runtime:

client.Trap("HTMLAnchorElement.prototype.href", {
  get(ctx) {
    const href = ctx.get();
    return unrewriteUrl(href); // Show real URL
  },
  set(ctx, value) {
    const rewritten = rewriteUrl(value, client.meta);
    ctx.set(rewritten); // Set proxied URL
  },
});

This ensures that even programmatic URL manipulation is proxied.
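
The getter/setter idea can be tried outside the browser. This is a minimal sketch, not Scramjet's Trap implementation: FakeAnchor stands in for HTMLAnchorElement, and toProxied/toReal are hypothetical stand-ins for rewriteUrl() and unrewriteUrl() with a fixed prefix.

```typescript
// Hypothetical stand-ins for rewriteUrl()/unrewriteUrl() with a fixed prefix.
const trapPrefix = "/scramjet/";
const toProxied = (u: string): string => trapPrefix + encodeURIComponent(u);
const toReal = (u: string): string =>
  decodeURIComponent(u.slice(trapPrefix.length));

// A plain class standing in for HTMLAnchorElement so the sketch runs anywhere.
class FakeAnchor {
  href = "";
}

// Replace the property with an accessor pair: the proxied URL is what gets
// stored, but page scripts reading the property see the real URL.
function trapHref(target: FakeAnchor): { raw: () => string } {
  let stored = "";
  Object.defineProperty(target, "href", {
    configurable: true,
    get: (): string => (stored ? toReal(stored) : ""),
    set: (value: string): void => {
      stored = toProxied(value);
    },
  });
  // raw() exposes the stored value so the example can inspect it.
  return { raw: () => stored };
}

const anchor = new FakeAnchor();
const trap = trapHref(anchor);
anchor.href = "https://example.com/";
console.log(trap.raw());  // /scramjet/https%3A%2F%2Fexample.com%2F
console.log(anchor.href); // https://example.com/
```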

URL preservation

Scramjet preserves original URLs in HTML using scramjet-attr-* attributes:

<!-- Before rewriting -->
<a href="https://example.com">Link</a>

<!-- After rewriting -->
<a href="/scramjet/https%3A%2F%2Fexample.com" scramjet-attr-href="https://example.com">Link</a>

This allows:

Debugging and inspection
Restoring original URLs when needed
Compatibility with scripts that read attributes directly

Hash handling

Hash fragments require special handling because:

They’re not sent to the server
They’re used for client-side routing
They need to work with browser navigation APIs

Scramjet encodes hashes separately:

https://example.com/page#section
↓
/scramjet/https%3A%2F%2Fexample.com%2Fpage#section

The hash is encoded but kept as a real hash fragment so:

window.location.hash works correctly
Hash-based routers work
The browser’s back/forward buttons work
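
A short comparison shows why the fragment is kept outside the encoded path. The proxy origin below is hypothetical, and the default codec is assumed.

```typescript
const proxyBase = "https://proxy.example/scramjet/"; // hypothetical
const realTarget = "https://example.com/page#section";

// Naive approach: encode everything, fragment included.
const naive = new URL(proxyBase + encodeURIComponent(realTarget));

// Separate approach: encode the fragment on its own and re-attach it.
const parsedTarget = new URL(realTarget);
const fragment = encodeURIComponent(parsedTarget.hash.slice(1));
parsedTarget.hash = "";
const separate = new URL(
  proxyBase +
    encodeURIComponent(parsedTarget.href) +
    (fragment ? "#" + fragment : "")
);

console.log(naive.hash === ""); // true: the fragment is buried in the path as %23section
console.log(separate.hash);     // #section
```

With the naive form, the browser sees no fragment at all, so anchor scrolling and hashchange-based routing would have nothing to act on.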

Performance considerations

URL rewriting can be expensive for large documents. The HTML rewriter uses htmlparser2 for performance, and the JS rewriter is written in Rust/WASM.

Use the rewriterLogs flag during development to see timing information:

scramjet.modifyConfig({
  flags: { rewriterLogs: true }
});

Common pitfalls

Always use ScramjetFrame for iframes. Direct <iframe> elements won’t have proper frame tracking, breaking nested iframe URL resolution.

Be careful with custom URL encoders. They must:

Be deterministic (same input = same output)
Produce only URL-safe output (no characters that would need further escaping or that act as delimiters, such as # and ?)
Be reversible (encode and decode must be inverses)
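
These three requirements can be checked mechanically before a codec is deployed. The helper below is a hypothetical sketch (checkCodec and UrlCodec are not part of Scramjet), and its notion of "safe" is deliberately conservative: unreserved characters, percent escapes, and the few punctuation marks encodeURIComponent leaves alone.

```typescript
type UrlCodec = {
  encode: (s: string) => string;
  decode: (s: string) => string;
};

// Returns a list of problems found for the given samples (empty if none).
function checkCodec(codec: UrlCodec, samples: string[]): string[] {
  const problems: string[] = [];
  for (const s of samples) {
    const first = codec.encode(s);
    const second = codec.encode(s);
    if (first !== second) problems.push(`not deterministic: ${s}`);
    if (codec.decode(first) !== s) problems.push(`not reversible: ${s}`);
    // Anything outside this set would need escaping in a URL path, and
    // "#" or "?" would be misread as a fragment or query delimiter.
    if (!/^[A-Za-z0-9\-._~%!*'()]*$/.test(first)) {
      problems.push(`unsafe output: ${s}`);
    }
  }
  return problems;
}

const defaultCodec: UrlCodec = {
  encode: (s) => encodeURIComponent(s),
  decode: (s) => decodeURIComponent(s),
};
const identityCodec: UrlCodec = { encode: (s) => s, decode: (s) => s };

const codecSamples = ["https://example.com/a?b=c", ""];
console.log(checkCodec(defaultCodec, codecSamples));
// []
console.log(checkCodec(identityCodec, codecSamples));
// [ 'unsafe output: https://example.com/a?b=c' ]
```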

Next steps

Configuration

Learn about codec configuration and other options

Service Worker

See how URLs are decoded in the Service Worker
