Web Mocks
Clone websites and host them as stable test environments for AI agents using HUD page archives.
Page Cloning
This guide demonstrates how to create and host web archives for testing AI agents with consistent, offline-first environments. By cloning websites into WACZ (Web Archive Collection Zipped) files, you can ensure your agents always test against specific, unchanging versions of web pages.
Goal: Create reproducible web environments for testing browser-based agents without depending on live websites that might change or go offline.
Concepts Covered:
- Using ArchiveWeb.page to clone websites into WACZ files
- Hosting archives locally with the HUD page archives repository and CustomGym
- Uploading archives to app.hud.so for immediate cloud hosting
- Creating tasks that use these stable archived environments
Prerequisites
- HUD SDK installed
- Docker installed (for local hosting option)
- ArchiveWeb.page browser extension (for cloning pages)
- API keys for HUD and your chosen agent
Part 1: Cloning the Page
Installing ArchiveWeb.page
- Install the Browser Extension:
  - Visit ArchiveWeb.page
  - Install the extension for Chrome/Chromium-based browsers
  - The extension icon will appear in your browser toolbar
- Create a New Archive:
  - Click the ArchiveWeb.page extension icon
  - Click “Create New Collection”
  - Give your collection a descriptive name (e.g., “my-test-site”)
Capturing Web Pages
- Start Archiving:
  - Click “Start” in the extension popup to begin an archiving session
  - Navigate to the website you want to clone
  - Interact with the site as your agent would (login, navigate through pages, fill forms)
  - All pages and resources will be captured automatically
- Best Practices for Agent Testing:
  - Capture all relevant pages and states your agent will interact with
  - Include error pages and edge cases
  - If testing login flows, capture both logged-out and logged-in states
  - For form submissions, capture the form page and success/error pages
- Stop and Download:
  - Click “Stop” in the extension when done capturing
  - Click “Download” to save your collection
  - Choose WACZ format (default)
  - Save with a meaningful filename (e.g., my-test-site.wacz)
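Before hosting a downloaded archive, it can help to confirm that the file is a well-formed WACZ container. The sketch below uses only the Python standard library. It assumes the layout described in the WACZ specification (a ZIP file with datapackage.json at the root and files under archive/, indexes/, and pages/); the function name check_wacz is hypothetical and not part of the HUD SDK or ArchiveWeb.page.

```python
import zipfile
from pathlib import Path

# Layout assumed from the WACZ specification; adjust if your files differ.
REQUIRED_FILE = "datapackage.json"
EXPECTED_DIRS = ("archive/", "indexes/", "pages/")


def check_wacz(path):
    """Return a list of problems found in a WACZ file (empty if none)."""
    path = Path(path)
    if not path.is_file():
        return [f"{path} does not exist"]
    if not zipfile.is_zipfile(path):
        return [f"{path} is not a ZIP file"]

    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()

    problems = []
    if REQUIRED_FILE not in names:
        problems.append(f"missing {REQUIRED_FILE}")
    for prefix in EXPECTED_DIRS:
        # A bare directory entry does not count; at least one file is needed.
        if not any(n.startswith(prefix) and n != prefix for n in names):
            problems.append(f"no files under {prefix}")
    return problems
```

Calling check_wacz("my-test-site.wacz") returns an empty list when the expected entries are present. This checks structure only; it does not replay the pages.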
Example: Cloning a Login Flow
Part 2: Hosting the Website
You have two options for hosting your archived website:
Option 1: Local Hosting with CustomGym
This approach uses the HUD page archives repository to host archives locally and access them via CustomGym.
Step 1: Clone the Page Archives Repository
Step 2: Add Your Archive
- Place your WACZ file:
- Update archives/archive_list.json:

Note: The name field must match your WACZ filename without the .wacz extension.
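The naming rule in the note can be checked mechanically. The following is a minimal sketch, not part of the repository: it assumes archive_list.json parses to a list of objects that each carry a name key (the full schema is not shown in this guide), and the helper names expected_name and unlisted_archives are hypothetical.

```python
from pathlib import Path


def expected_name(wacz_path):
    """Name an archive entry should use: the filename without .wacz."""
    return Path(wacz_path).stem


def unlisted_archives(wacz_paths, entries):
    """Return WACZ paths whose expected name has no matching entry.

    `entries` is assumed to be a list of dicts with a "name" key,
    as loaded from archives/archive_list.json.
    """
    listed = {entry["name"] for entry in entries}
    return [str(p) for p in wacz_paths if expected_name(p) not in listed]
```

For example, expected_name("archives/my-test-site.wacz") gives "my-test-site", so an entry with that name would match. The entries argument can be produced with json.load on the list file if it has the assumed shape.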
Step 3: Create a CustomGym for the Archive Server
Step 4: Create Tasks Using the Archived Site
Advanced: Query Parameters
The archive viewer supports useful query parameters:
Option 2: Cloud Hosting on app.hud.so
For immediate hosting without local setup, use the HUD platform’s built-in page cloning feature.
Step 1: Access Page Clone Feature
- Go to app.hud.so
- Click “Create” in the navigation
- Select “Page Clone”
Step 2: Upload Your Archive
- Click “Upload WACZ file”
- Select your .wacz file created in Part 1
- Provide a name for your cloned environment
- Click “Create”
Step 3: Use the Hosted Archive
Once uploaded, you’ll receive a URL for your hosted archive (e.g., https://archives.hud.so/your-archive-id).
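A quick reachability check on the returned URL can catch a failed upload before any task runs. This is a sketch using only the Python standard library; it assumes the hosted archive answers a plain unauthenticated GET, which this guide does not confirm. The function name archive_is_reachable and the opener parameter (there to allow substitution in tests) are hypothetical.

```python
from urllib.request import urlopen


def archive_is_reachable(url, timeout=10.0, opener=urlopen):
    """Return True if the URL answers with a 2xx or 3xx status."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError, and socket timeouts;
        # ValueError covers malformed URLs.
        return False
```

Pass the URL you received after upload, for example archive_is_reachable("https://archives.hud.so/your-archive-id") with your own archive ID substituted.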
Tips for Effective Page Cloning
- Capture Complete Flows: Don’t just capture individual pages - capture entire user journeys
- Include Resources: Ensure CSS, JavaScript, and images are properly captured
- Test Your Archives: Always verify your archives work correctly before using them in evaluations
- Document States: Keep notes on what states and pages are included in each archive
- Update Regularly: Re-clone sites when significant changes occur
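To support the "Test Your Archives" and "Document States" tips, the pages recorded in an archive can be listed programmatically. The sketch below reads pages/pages.jsonl from the WACZ file; it assumes the page-list format from the WACZ specification (one JSON object per line, with a header line that has no url field). The function name list_archived_pages is hypothetical.

```python
import json
import zipfile


def list_archived_pages(wacz_path):
    """Return (url, title) pairs recorded in the archive's page list."""
    pages = []
    with zipfile.ZipFile(wacz_path) as archive:
        # Raises KeyError if the archive has no pages/pages.jsonl entry.
        with archive.open("pages/pages.jsonl") as handle:
            for raw_line in handle:
                line = raw_line.decode("utf-8").strip()
                if not line:
                    continue
                record = json.loads(line)
                if "url" in record:  # skip the header line
                    pages.append((record["url"], record.get("title", "")))
    return pages
```

The returned pairs can be pasted into your notes for each archive, or compared against the list of pages your agent is expected to visit.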
Key Takeaways
- ArchiveWeb.page makes it easy to create WACZ archives of any website
- Local hosting with CustomGym gives you full control and fast performance
- Cloud hosting on app.hud.so provides instant deployment without infrastructure
- Page cloning ensures consistent, reproducible testing environments for AI agents
- Archived sites eliminate external dependencies and enable offline testing