URL rewriting is fundamental to how Scramjet works. Every URL that passes through the proxy must be encoded so that it routes through the Service Worker, and every URL in proxied content must be rewritten so that follow-up requests stay inside the proxy.
URL encoding and decoding
The codec system
Scramjet uses a configurable codec to encode and decode URLs. The codec is a pair of functions defined in your configuration:
const scramjet = new ScramjetController({
  prefix: "/scramjet/",
  codec: {
    encode: (url: string) => encodeURIComponent(url),
    decode: (url: string) => decodeURIComponent(url),
  },
});
The default codec uses encodeURIComponent and decodeURIComponent, but you can use any encoding scheme (base64, custom obfuscation, etc.).
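As an illustration of a non-default scheme, here is a minimal sketch of a URL-safe base64 codec. The name base64UrlCodec is hypothetical; only the encode/decode shape comes from the configuration above. It assumes the standard btoa, atob, TextEncoder, and TextDecoder globals are available.

```typescript
// Hypothetical codec for illustration; only the { encode, decode } shape
// comes from the configuration shown above.
const base64UrlCodec = {
  encode(url: string): string {
    // UTF-8 encode first so characters outside Latin-1 survive btoa().
    const bytes = new TextEncoder().encode(url);
    let binary = "";
    for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
    // Swap the characters that are unsafe in a URL path and drop the padding.
    return btoa(binary).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
  },
  decode(encoded: string): string {
    const base64 = encoded.replace(/-/g, "+").replace(/_/g, "/");
    // Restore the padding that encode() stripped.
    const padded = base64 + "=".repeat((4 - (base64.length % 4)) % 4);
    const binary = atob(padded);
    const bytes = new Uint8Array(binary.length);
    for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
    return new TextDecoder().decode(bytes);
  },
};
```

An object like this could be passed as the codec option. Note that an empty string encodes to an empty string, so a URL without a hash still produces no fragment.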
How encoding works
When you navigate to a URL through Scramjet:
frame.go("https://example.com/page?query=value#hash");
The controller’s encodeUrl() method transforms it:
Parse the URL
The URL is parsed into a URL object.
Extract and encode hash
The hash fragment is separated and encoded independently:
const encodedHash = codecEncode(url.hash.slice(1));
const realHash = encodedHash ? "#" + encodedHash : "";
url.hash = ""; // Remove from URL before encoding
Encode the URL
The main URL (without hash) is encoded:
return config.prefix + codecEncode(url.href) + realHash;
Result
https://example.com/page?query=value#hash
↓
/scramjet/https%3A%2F%2Fexample.com%2Fpage%3Fquery%3Dvalue#hash
Note that the hash is preserved separately so browser navigation works correctly.
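The three steps can be put together in one small function. This is a sketch under stated assumptions, not the controller's actual code: config and codecEncode are local stand-ins using the default prefix and codec, and encodeUrlSketch is a hypothetical name.

```typescript
// Assumed stand-ins for Scramjet's configuration; the names are illustrative.
const config = { prefix: "/scramjet/" };
const codecEncode = (value: string): string => encodeURIComponent(value);

function encodeUrlSketch(input: string): string {
  const url = new URL(input); // 1. parse
  const encodedHash = codecEncode(url.hash.slice(1)); // 2. encode the hash without "#"
  const realHash = encodedHash ? "#" + encodedHash : "";
  url.hash = ""; // strip the fragment before encoding the rest
  return config.prefix + codecEncode(url.href) + realHash; // 3. encode the main URL
}

console.log(encodeUrlSketch("https://example.com/page?query=value#hash"));
// /scramjet/https%3A%2F%2Fexample.com%2Fpage%3Fquery%3Dvalue#hash
```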
How decoding works
When the Service Worker intercepts a request, it decodes the URL:
const url = new URL(unrewriteUrl(requestUrl));
The unrewriteUrl() function in src/shared/rewriters/url.ts reverses the process:
Check for special protocols
Handle special URL types first:
if (url.startsWith("javascript:")) return url;
if (url.startsWith("mailto:")) return url;
if (url.startsWith("about:")) return url;
Handle blob/data URLs
// prefixed is the proxy origin followed by config.prefix
if (url.startsWith(prefixed + "blob:")) {
  return url.substring(prefixed.length);
}
if (url.startsWith(prefixed + "data:")) {
  return url.substring(prefixed.length);
}
Decode the main URL
const realUrl = tryCanParseURL(url);
const decodedHash = codecDecode(realUrl.hash.slice(1));
const realHash = decodedHash ? "#" + decodedHash : "";
realUrl.hash = "";
return codecDecode(realUrl.href.slice(prefixed.length) + realHash);
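A self-contained sketch of the same decoding flow follows. The proxy origin, the default codec, and the name decodeUrlSketch are assumptions made for illustration. Unlike the snippet above, this version decodes the body and the hash separately, which is a deliberate simplification so that it would also work with a codec that cannot decode a string with "#hash" appended.

```typescript
// Assumed values: the proxy origin is hypothetical, the prefix matches the examples above.
const proxyOrigin = "https://proxy.example";
const prefixed = proxyOrigin + "/scramjet/";
const codecDecode = (value: string): string => decodeURIComponent(value);

function decodeUrlSketch(proxied: string): string {
  // Special schemes that were never proxied pass through untouched.
  if (/^(javascript|mailto|about):/.test(proxied)) return proxied;
  // Proxied blob: and data: URLs only need the prefix removed.
  if (proxied.startsWith(prefixed + "blob:") || proxied.startsWith(prefixed + "data:")) {
    return proxied.substring(prefixed.length);
  }
  const url = new URL(proxied);
  const decodedHash = codecDecode(url.hash.slice(1));
  const realHash = decodedHash ? "#" + decodedHash : "";
  url.hash = "";
  // Decode the body on its own, then re-attach the decoded hash.
  return codecDecode(url.href.slice(prefixed.length)) + realHash;
}

console.log(
  decodeUrlSketch("https://proxy.example/scramjet/https%3A%2F%2Fexample.com%2Fpage%3Fquery%3Dvalue#hash")
);
// https://example.com/page?query=value#hash
```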
URL rewriting in content
Once content is fetched, all URLs within it must be rewritten to point through the proxy.
The rewriteUrl function
The rewriteUrl() function is the core rewriter, defined in src/shared/rewriters/url.ts:
export function rewriteUrl(url: string | URL, meta: URLMeta): string
It takes:
url: The URL to rewrite (absolute or relative)
meta: Context about the current page (origin, base URL, frame names)
And returns the proxied URL.
URLMeta context
The URLMeta object provides context for rewriting:
type URLMeta = {
  origin: URL; // Real origin of the current page
  base: URL; // Base URL for resolving relative URLs
  topFrameName?: string; // Top Scramjet frame name
  parentFrameName?: string; // Parent frame name
};
The base URL can differ from origin if the page contains a <base> tag. This is updated dynamically during HTML rewriting.
Special URL handling
rewriteUrl() handles different URL schemes:
JavaScript URLs are rewritten by rewriting the JavaScript code:
if (url.startsWith("javascript:")) {
  return (
    "javascript:" +
    rewriteJs(url.slice("javascript:".length), "(javascript: url)", meta)
  );
}
Blob URLs are prefixed with the proxy origin:
if (url.startsWith("blob:")) {
  return location.origin + config.prefix + url;
}
This routes blob fetches through the Service Worker so they can be rewritten.
Data URLs are also prefixed:
if (url.startsWith("data:")) {
  return location.origin + config.prefix + url;
}
mailto: and about: URLs are returned unchanged:
if (url.startsWith("mailto:") || url.startsWith("about:")) {
  return url;
}
Regular URLs are resolved against the base, then encoded:
let base = meta.base.href;
if (base.startsWith("about:")) {
  base = unrewriteUrl(self.location.href);
}
const realUrl = tryCanParseURL(url, base);
if (!realUrl) return url; // Invalid URL, return as-is
const encodedHash = codecEncode(realUrl.hash.slice(1));
const realHash = encodedHash ? "#" + encodedHash : "";
realUrl.hash = "";
return (
  location.origin + config.prefix + codecEncode(realUrl.href) + realHash
);
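To see how the branches fit together, here is a condensed sketch of the dispatch order. It is not the library's implementation: proxyBase stands in for location.origin + config.prefix with a hypothetical origin, the default codec is assumed, and the javascript: branch is left as a pass-through because the JS rewriter is out of scope here.

```typescript
// Illustrative stand-ins; none of these values come from Scramjet itself.
const proxyBase = "https://proxy.example/scramjet/"; // location.origin + config.prefix
const encodePart = (value: string): string => encodeURIComponent(value);

type MetaSketch = { base: URL };

function rewriteUrlSketch(url: string, meta: MetaSketch): string {
  // javascript: URLs need the JS rewriter, which this sketch does not include.
  if (url.startsWith("javascript:")) return url;
  if (url.startsWith("blob:") || url.startsWith("data:")) return proxyBase + url;
  if (url.startsWith("mailto:") || url.startsWith("about:")) return url;

  let real: URL;
  try {
    real = new URL(url, meta.base); // resolve relative URLs against the base
  } catch {
    return url; // invalid URL, return as-is
  }
  const encodedHash = encodePart(real.hash.slice(1));
  real.hash = "";
  return proxyBase + encodePart(real.href) + (encodedHash ? "#" + encodedHash : "");
}

const metaSketch: MetaSketch = { base: new URL("https://example.com/docs/") };
console.log(rewriteUrlSketch("/about#team", metaSketch));
// https://proxy.example/scramjet/https%3A%2F%2Fexample.com%2Fabout#team
```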
Relative URL resolution
Relative URLs are resolved against meta.base:
const realUrl = tryCanParseURL(url, base);
This uses the browser’s native URL parser:
function tryCanParseURL(url: string, origin?: string | URL): URL | null {
  try {
    return new URL(url, origin);
  } catch {
    return null;
  }
}
If parsing fails, tryCanParseURL() returns null and rewriteUrl() returns the original string unchanged, so a malformed URL does not break the page.
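The effect of the base is easiest to see by resolving the same relative URL against two different bases. The page URL and <base> value below are hypothetical; tryCanParseURL is repeated so the example runs on its own.

```typescript
function tryCanParseURL(url: string, origin?: string | URL): URL | null {
  try {
    return new URL(url, origin);
  } catch {
    return null;
  }
}

// Hypothetical page URL and <base href> value.
const pageUrl = "https://example.com/docs/page.html";
const baseTagUrl = "https://cdn.example.com/assets/";

console.log(tryCanParseURL("img/logo.png", pageUrl)?.href);
// https://example.com/docs/img/logo.png
console.log(tryCanParseURL("img/logo.png", baseTagUrl)?.href);
// https://cdn.example.com/assets/img/logo.png
console.log(tryCanParseURL("/root.css", baseTagUrl)?.href);
// https://cdn.example.com/root.css
console.log(tryCanParseURL("page.html"));
// null (a relative URL cannot be parsed without a base)
```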
Where URLs are rewritten
Scramjet rewrites URLs in multiple places throughout the stack:
HTML rewriting
In src/shared/rewriters/html.ts, URLs in HTML attributes are rewritten:
for (const rule of htmlRules) {
  for (const attr in rule) {
    if (node.attribs[attr] !== undefined) {
      const value = node.attribs[attr];
      const v = rule.fn(value, meta, cookieStore);
      if (v === null) delete node.attribs[attr];
      else node.attribs[attr] = v;
    }
  }
}
The htmlRules array (from src/shared/htmlRules.ts) defines which attributes to rewrite:
{
  href: ["a", "link", "area", "base"],
  fn: (value, meta) => rewriteUrl(value, meta)
},
{
  src: ["script", "img", "iframe", "embed", "source", "track", "video", "audio"],
  fn: (value, meta) => rewriteUrl(value, meta)
},
{
  action: ["form"],
  fn: (value, meta) => rewriteUrl(value, meta)
}
// ... etc
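The rule-driven approach can be tried in isolation with simplified types. Everything in this sketch is hypothetical: RuleSketch names the attribute explicitly instead of using it as an object key, proxify stands in for rewriteUrl(value, meta), and the node is a plain object rather than a parser node.

```typescript
// Simplified, hypothetical shapes; the real rule objects use the attribute
// name as a key, as shown above.
type RuleSketch = {
  attr: string;
  tags: string[];
  fn: (value: string) => string | null;
};
type NodeSketch = { name: string; attribs: Record<string, string> };

// Stand-in for rewriteUrl(value, meta).
const proxify = (value: string): string => "/scramjet/" + encodeURIComponent(value);

const rules: RuleSketch[] = [
  { attr: "href", tags: ["a", "link", "area", "base"], fn: proxify },
  { attr: "src", tags: ["script", "img", "iframe"], fn: proxify },
  { attr: "action", tags: ["form"], fn: proxify },
];

function applyRules(node: NodeSketch): void {
  for (const rule of rules) {
    const value = node.attribs[rule.attr];
    // Skip attributes that are absent or not covered by the rule for this tag.
    if (value === undefined || !rule.tags.includes(node.name)) continue;
    const rewritten = rule.fn(value);
    if (rewritten === null) delete node.attribs[rule.attr];
    else node.attribs[rule.attr] = rewritten;
  }
}

const anchor: NodeSketch = { name: "a", attribs: { href: "https://example.com" } };
applyRules(anchor);
console.log(anchor.attribs.href);
// /scramjet/https%3A%2F%2Fexample.com
```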
Special HTML cases
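Some HTML needs more than a single-URL rewrite: srcset attributes, event handlers, and import maps. A srcset attribute, for example, holds several URLs, each with optional descriptors, so every candidate is rewritten individually: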
export function rewriteSrcset(srcset: string, meta: URLMeta) {
  const sources = srcset.split(/ .*,/).map((src) => src.trim());
  const rewrittenSources = sources.map((source) => {
    const [url, ...descriptors] = source.split(/\s+/);
    const rewrittenUrl = rewriteUrl(url.trim(), meta);
    return descriptors.length > 0
      ? `${rewrittenUrl} ${descriptors.join(" ")}`
      : rewrittenUrl;
  });
  return rewrittenSources.join(", ");
}
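For experimenting with srcset handling outside the proxy, the following is a simplified sketch rather than the library's code. It assumes candidates are separated by a comma followed by whitespace, so it will not split compact input such as "a.png 1x,b.png 2x"; proxifySrc is a stand-in for rewriteUrl(url, meta).

```typescript
// Stand-in for rewriteUrl(url, meta); purely illustrative.
const proxifySrc = (url: string): string => "/scramjet/" + encodeURIComponent(url);

function rewriteSrcsetSketch(srcset: string): string {
  return srcset
    .split(/,\s+/) // assumption: candidates are separated by a comma plus whitespace
    .map((candidate) => candidate.trim())
    .filter((candidate) => candidate.length > 0)
    .map((candidate) => {
      // The first token is the URL; the rest are descriptors such as "2x" or "480w".
      const [url, ...descriptors] = candidate.split(/\s+/);
      return [proxifySrc(url), ...descriptors].join(" ");
    })
    .join(", ");
}

console.log(rewriteSrcsetSketch("small.png 1x, large.png 2x"));
// /scramjet/small.png 1x, /scramjet/large.png 2x
```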
JavaScript rewriting
JavaScript rewriting is more complex. It uses an oxc-based WASM rewriter to:
Parse the JavaScript AST
Identify API calls that accept URLs
Wrap those calls with runtime functions that rewrite URLs
For example, this code:
fetch("https://example.com/api");
Is rewritten to:
fetch($scramjet$rewrite("https://example.com/api"));
The $scramjet$rewrite function calls rewriteUrl() at runtime with the current page’s metadata.
CSS rewriting
In src/shared/rewriters/css.ts, URLs in CSS are rewritten:
export function rewriteCss(css: string, meta: URLMeta): string {
  return css.replace(/url\(["']?([^"')]+)["']?\)/gi, (match, url) => {
    const rewritten = rewriteUrl(url.trim(), meta);
    return `url("${rewritten}")`;
  });
}
This handles:
background-image: url(...)
@import url(...)
@font-face { src: url(...) }
etc.
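A quick way to check what this pattern matches is to run it on a small stylesheet. In this sketch, proxifyCss is a hypothetical stand-in for rewriteUrl(url, meta). Note that the pattern only matches url(...) tokens, so a bare string import such as @import "theme.css" would pass through this particular regex unchanged.

```typescript
// Stand-in for rewriteUrl(url, meta); purely illustrative.
const proxifyCss = (url: string): string => "/scramjet/" + encodeURIComponent(url);

function rewriteCssSketch(css: string): string {
  return css.replace(/url\(["']?([^"')]+)["']?\)/gi, (_match: string, url: string) => {
    // Always emit a double-quoted url() regardless of the original quoting.
    return `url("${proxifyCss(url.trim())}")`;
  });
}

const cssInput = `@import url('theme.css'); body { background-image: url(bg.png); }`;
console.log(rewriteCssSketch(cssInput));
// @import url("/scramjet/theme.css"); body { background-image: url("/scramjet/bg.png"); }
```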
Client-side interception
In the ScramjetClient, DOM APIs are intercepted to rewrite URLs at runtime:
client.Trap("HTMLAnchorElement.prototype.href", {
  get(ctx) {
    const href = ctx.get();
    return unrewriteUrl(href); // Show real URL
  },
  set(ctx, value) {
    const rewritten = rewriteUrl(value, client.meta);
    ctx.set(rewritten); // Set proxied URL
  },
});
This ensures that even programmatic URL manipulation is proxied.
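The underlying technique is ordinary accessor replacement on a prototype. The sketch below shows it with plain Object.defineProperty on a hypothetical FakeAnchor class so that it runs outside a browser; it does not use or reproduce Scramjet's Trap API, and toProxy and fromProxy are simplified stand-ins for rewriteUrl and unrewriteUrl.

```typescript
// Hypothetical element stand-in so the sketch runs outside a browser.
class FakeAnchor {
  private stored = "";
  get href(): string {
    return this.stored;
  }
  set href(value: string) {
    this.stored = value;
  }
}

const trapPrefix = "/scramjet/";
const toProxy = (url: string): string => trapPrefix + encodeURIComponent(url);
const fromProxy = (url: string): string =>
  url.startsWith(trapPrefix) ? decodeURIComponent(url.slice(trapPrefix.length)) : url;

// Keep the original accessor, then install a wrapper in its place.
const nativeHref = Object.getOwnPropertyDescriptor(FakeAnchor.prototype, "href")!;
Object.defineProperty(FakeAnchor.prototype, "href", {
  configurable: true,
  get(this: FakeAnchor): string {
    return fromProxy(nativeHref.get!.call(this)); // page code sees the real URL
  },
  set(this: FakeAnchor, value: string) {
    nativeHref.set!.call(this, toProxy(value)); // the element stores the proxied URL
  },
});

const link = new FakeAnchor();
link.href = "https://example.com";
console.log(link.href); // https://example.com
console.log(nativeHref.get!.call(link)); // /scramjet/https%3A%2F%2Fexample.com
```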
URL preservation
Scramjet preserves original URLs in HTML using scramjet-attr-* attributes:
<!-- Before rewriting -->
<a href="https://example.com">Link</a>
<!-- After rewriting -->
<a href="/scramjet/https%3A%2F%2Fexample.com" scramjet-attr-href="https://example.com">Link</a>
This allows:
Debugging and inspection
Restoring original URLs when needed
Compatibility with scripts that read attributes directly
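As a rough illustration of the pattern, the sketch below rewrites one attribute on a plain attribute map and stores the original next to it. The helper names rewriteAndPreserve and readOriginal are hypothetical, and proxifyAttr stands in for rewriteUrl(value, meta); only the scramjet-attr-* naming comes from the example above.

```typescript
type Attribs = Record<string, string>;

// Stand-in for rewriteUrl(value, meta); purely illustrative.
const proxifyAttr = (url: string): string => "/scramjet/" + encodeURIComponent(url);

function rewriteAndPreserve(attribs: Attribs, attr: string): Attribs {
  const original = attribs[attr];
  if (original === undefined) return attribs;
  return {
    ...attribs,
    [attr]: proxifyAttr(original),
    [`scramjet-attr-${attr}`]: original, // keep the real value for later lookup
  };
}

// Prefer the preserved value; fall back to the live attribute if none was stored.
function readOriginal(attribs: Attribs, attr: string): string | undefined {
  return attribs[`scramjet-attr-${attr}`] ?? attribs[attr];
}

const rewrittenAttribs = rewriteAndPreserve({ href: "https://example.com" }, "href");
console.log(rewrittenAttribs.href); // /scramjet/https%3A%2F%2Fexample.com
console.log(readOriginal(rewrittenAttribs, "href")); // https://example.com
```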
Hash handling
Hash fragments require special handling because:
They’re not sent to the server
They’re used for client-side routing
They need to work with browser navigation APIs
Scramjet encodes hashes separately:
https://example.com/page#section
↓
/scramjet/https%3A%2F%2Fexample.com%2Fpage#section
The hash is encoded but kept as a real hash fragment so:
window.location.hash works correctly
Hash-based routers work
The browser’s back/forward buttons work
URL rewriting can be expensive for large documents. The HTML rewriter uses htmlparser2 for performance, and the JS rewriter is written in Rust/WASM.
Use the rewriterLogs flag during development to see timing information:
scramjet.modifyConfig({
  flags: { rewriterLogs: true },
});
Common pitfalls
Always use ScramjetFrame for iframes. Direct <iframe> elements won’t have proper frame tracking, breaking nested iframe URL resolution.
Be careful with custom URL encoders. They must:
Be deterministic (the same input always produces the same output)
Produce only characters that are safe in a URL without further escaping
Be reversible (decode(encode(url)) must return the original url)
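These three requirements can be spot-checked before a codec is deployed. The helper below is a hypothetical sketch, not part of Scramjet. Its notion of "safe" is an assumption: only RFC 3986 unreserved characters plus "%" are accepted, which is stricter than what browsers tolerate.

```typescript
type Codec = { encode: (url: string) => string; decode: (url: string) => string };

// Returns a list of problems; an empty list means every sample passed.
function checkCodec(codec: Codec, samples: string[]): string[] {
  const problems: string[] = [];
  for (const sample of samples) {
    const encoded = codec.encode(sample);
    if (codec.encode(sample) !== encoded) problems.push(`not deterministic: ${sample}`);
    if (codec.decode(encoded) !== sample) problems.push(`not reversible: ${sample}`);
    // Assumption: only RFC 3986 unreserved characters and "%" count as safe.
    if (!/^[A-Za-z0-9\-._~%]*$/.test(encoded)) problems.push(`unsafe output: ${sample}`);
  }
  return problems;
}

const defaultCodec: Codec = { encode: encodeURIComponent, decode: decodeURIComponent };
console.log(checkCodec(defaultCodec, ["https://example.com/a?b=c", "https://example.com/ü"]));
// []
```

A check like this only covers the samples it is given, so it is worth including URLs with query strings, non-ASCII characters, and empty strings.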
Next steps
Configuration: Learn about codec configuration and other options
Service Worker: See how URLs are decoded in the Service Worker