Architecture

A technical deep-dive into how Lobster works.

Introduction

Most load testing tools require you to manually specify every endpoint you want to test. Tools like Apache Bench, wrk, and hey excel at hammering a single URL with concurrent requests, but they leave URL discovery entirely to you. Enterprise solutions like JMeter and k6 offer more flexibility but demand significant configuration overhead. This creates a workflow gap: developers want to validate their application’s performance under load without spending hours mapping endpoints or writing test scenarios.

Lobster bridges this gap by combining web crawling with concurrent stress testing. Point it at your application’s base URL, and it automatically discovers all linked pages, tests them under configurable load, validates performance against targets, and generates actionable reports. This article examines Lobster’s architecture, implementation decisions, and practical applications.

The Problem Space

Why Another Load Testing Tool?

The load testing landscape offers two extremes:

Simple single-URL tools (ab, wrk, hey, vegeta):

  • Fast and lightweight
  • Perfect for testing a specific endpoint
  • Require manual URL enumeration for multi-page applications
  • Provide raw metrics without validation logic

Enterprise testing platforms (JMeter, k6, Gatling):

  • Highly configurable with complex scenario support
  • Steep learning curve
  • Require test script development
  • Often overkill for quick validation during development

The gap: developers need a tool that can stress-test an entire application automatically during development, without writing test scripts or mapping endpoints by hand. This is particularly valuable for:

  1. Pre-deployment validation: Quickly verify that recent changes haven’t introduced performance regressions
  2. Development workflows: Test the entire application during feature development
  3. CI/CD integration: Automated performance gates before deployment
  4. Capacity planning: Understand current performance characteristics

Design Goals

Lobster was designed with specific constraints:

  1. Zero configuration for basic use: lobster -url http://localhost:3000 should just work (a flag-wiring sketch follows this list)
  2. Intelligent discovery: Automatically find all testable endpoints
  3. Validation-focused: Go beyond raw metrics to provide pass/fail criteria
  4. Multi-format reporting: Console output for development, JSON for CI/CD, HTML for stakeholders
  5. Respectful behavior: Honor robots.txt, implement proper rate limiting, handle errors gracefully
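
In practice, goal 1 means every flag ships with a usable default, so only -url is required. The sketch below shows roughly how that wiring might look; the flag names all appear later in this article, but the defaults other than -rate (whose documented default is 5.0) are assumptions for illustration, not the project's actual values.

package main

import (
    "errors"
    "flag"
    "time"
)

// options is an illustrative container; the real internal/config types may differ.
type options struct {
    URL         string
    Duration    time.Duration
    Concurrency int
    Rate        float64
    ConfigPath  string
    OutputPath  string
}

func parseFlags() (*options, error) {
    o := &options{}
    flag.StringVar(&o.URL, "url", "", "base URL to crawl and stress test (required)")
    flag.DurationVar(&o.Duration, "duration", 2*time.Minute, "total test duration")
    flag.IntVar(&o.Concurrency, "concurrency", 10, "number of concurrent workers")
    flag.Float64Var(&o.Rate, "rate", 5.0, "requests per second (token bucket rate)")
    flag.StringVar(&o.ConfigPath, "config", "", "optional JSON file with performance targets")
    flag.StringVar(&o.OutputPath, "output", "", "optional path for the JSON report")
    flag.Parse()

    if o.URL == "" {
        return nil, errors.New("-url is required")
    }
    return o, nil
}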

Architecture Overview

Lobster follows Clean Architecture principles with a domain-driven design approach. The system is organized into distinct layers with clear separation of concerns.

System Components

graph TB
    CLI[CLI Entry Point<br/>cmd/lobster/main.go]
    Config[Configuration Layer<br/>internal/config]
    Domain[Domain Layer<br/>internal/domain]

    CLI --> Config
    Config --> Domain

    Domain --> Crawler[URL Discovery<br/>internal/crawler]
    Domain --> Tester[Load Testing Engine<br/>internal/tester]
    Domain --> Validator[Performance Validation<br/>internal/validator]
    Domain --> Reporter[Report Generation<br/>internal/reporter]
    Domain --> Robots[robots.txt Parser<br/>internal/robots]

    Tester --> GoFlow[goflow Rate Limiter<br/>Token Bucket]
    Crawler --> HTTP[net/http Client]
    Reporter --> Templates[HTML Templates]

    style CLI fill:#e1f5ff
    style Domain fill:#fff4e1
    style Crawler fill:#e8f5e9
    style Tester fill:#e8f5e9
    style Validator fill:#e8f5e9
    style Reporter fill:#e8f5e9
    style Robots fill:#e8f5e9
    style GoFlow fill:#f3e5f5

Execution Flow

The tool operates in four distinct phases:

sequenceDiagram
    participant User
    participant CLI
    participant Crawler
    participant Queue
    participant Tester
    participant Validator
    participant Reporter

    User->>CLI: lobster -url http://app
    CLI->>Crawler: Start URL Discovery

    Note over Crawler,Queue: Phase 1: URL Discovery
    Crawler->>Queue: Add base URL (depth=0)
    loop For each URL in queue
        Crawler->>Crawler: Fetch page
        Crawler->>Crawler: Extract <a href> links
        Crawler->>Crawler: Resolve relative URLs
        Crawler->>Queue: Add discovered URLs
    end

    Crawler-->>CLI: Return URL list

    Note over Tester,Queue: Phase 2: Concurrent Testing
    CLI->>Tester: Start load testing
    Tester->>Tester: Spawn worker pool

    par Worker 1
        Tester->>Queue: Get URL
        Tester->>Tester: Rate limit check
        Tester->>Tester: Send HTTP request
        Tester->>Tester: Record metrics
    and Worker 2
        Tester->>Queue: Get URL
        Tester->>Tester: Rate limit check
        Tester->>Tester: Send HTTP request
        Tester->>Tester: Record metrics
    and Worker N
        Tester->>Queue: Get URL
        Tester->>Tester: Rate limit check
        Tester->>Tester: Send HTTP request
        Tester->>Tester: Record metrics
    end

    Note over Validator: Phase 3: Validation
    Tester-->>Validator: Collected metrics
    Validator->>Validator: Calculate percentiles
    Validator->>Validator: Validate targets
    Validator-->>CLI: Validation results

    Note over Reporter: Phase 4: Reporting
    CLI->>Reporter: Generate reports
    Reporter->>Reporter: Create JSON
    Reporter->>Reporter: Create HTML
    Reporter->>Reporter: Create console output
    Reporter-->>User: Display results

Core Data Flow

flowchart LR
    BaseURL[Base URL] --> Discovery{URL Discovery}
    Discovery --> |Crawl & Parse|URLQueue[(URL Queue<br/>sync.Map)]

    URLQueue --> Distribution{URL Distribution}
    Distribution --> |Round-robin|W1[Worker 1]
    Distribution --> |Round-robin|W2[Worker 2]
    Distribution --> |Round-robin|WN[Worker N]

    W1 --> RateLimiter{Token Bucket<br/>Rate Limiter}
    W2 --> RateLimiter
    WN --> RateLimiter

    RateLimiter --> |Acquire token|HTTP[HTTP Request]
    HTTP --> Metrics[(Metrics Collection<br/>Thread-safe)]

    Metrics --> Analysis{Statistical<br/>Analysis}
    Analysis --> |Calculate|Percentiles[P95/P99<br/>Percentiles]
    Analysis --> |Aggregate|Totals[Request Counts<br/>Success Rate]
    Analysis --> |Identify|Errors[Error Analysis]

    Percentiles --> Validation{Validation}
    Totals --> Validation
    Errors --> Validation

    Validation --> Reports{Report<br/>Generation}
    Reports --> JSON[JSON Output]
    Reports --> HTML[HTML Report]
    Reports --> Console[Console Summary]

    style URLQueue fill:#e3f2fd
    style Metrics fill:#e3f2fd
    style RateLimiter fill:#fff3e0
    style Reports fill:#f3e5f5

Key Implementation Details

1. URL Discovery: The Crawler

The crawler is the first phase of execution and arguably the most critical for automation. It must handle various edge cases while maintaining performance.

Implementation (internal/crawler/crawler.go):

type Crawler struct {
    baseURL      *url.URL
    discovered   sync.Map             // Thread-safe deduplication
    queue        chan domain.URLTask  // BFS traversal queue
    client       *http.Client
    maxDepth     int
    robotsParser *robots.RobotsData
}

func (c *Crawler) Crawl(ctx context.Context) ([]string, error) {
    // Seed with base URL at depth 0
    c.queue <- domain.URLTask{URL: c.baseURL.String(), Depth: 0}
    c.discovered.Store(c.baseURL.String(), true)

    // BFS traversal with depth limiting
    // (termination once the queue drains is elided for brevity)
    for {
        select {
        case <-ctx.Done():
            return c.collectResults(), ctx.Err()
        case task := <-c.queue:
            if task.Depth >= c.maxDepth {
                continue
            }
            c.processURL(ctx, task)
        }
    }
}

Key design decisions:

  1. Breadth-first search: Ensures shallow pages are discovered first, allowing testing to begin on high-priority pages while deep crawling continues
  2. Depth limiting: Prevents infinite loops in cyclic link structures and controls memory usage
  3. Same-domain filtering: Focuses testing on the target application, avoiding external links
  4. robots.txt compliance: Respects website preferences by default (configurable); see the sketch below
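
The robots.txt check wraps the enqueue step. The exact API of internal/robots is not shown in this article, so the Allowed method in this sketch is a hypothetical stand-in:

// Hypothetical sketch: consult robots.txt before a discovered URL is enqueued.
// Allowed is an assumed method name, not necessarily the real internal/robots API.
func (c *Crawler) allowedByRobots(u *url.URL) bool {
    if c.robotsParser == nil {
        return true // compliance disabled or no robots.txt found
    }
    return c.robotsParser.Allowed("lobster", u.Path)
}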

Link extraction uses regex for simplicity and performance:

// Pattern: href=["']([^"']+)["']
// Matches: <a href="...">, <a href='...'>
pattern := regexp.MustCompile(`href=["']([^"']+)["']`)
matches := pattern.FindAllStringSubmatch(body, -1)

for _, match := range matches {
    rawURL := match[1]

    // Filter obvious non-HTTP links
    if strings.HasPrefix(rawURL, "javascript:") ||
       strings.HasPrefix(rawURL, "mailto:") ||
       strings.HasPrefix(rawURL, "#") {
        continue
    }

    // Parse and resolve relative URLs against the base URL
    parsedURL, err := url.Parse(rawURL)
    if err != nil {
        continue
    }
    absoluteURL := c.baseURL.ResolveReference(parsedURL)

    // Same-domain check
    if absoluteURL.Host != c.baseURL.Host {
        continue
    }

    c.enqueueIfNew(absoluteURL.String(), depth+1)
}
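
The enqueueIfNew call above is where the sync.Map deduplication happens. A minimal sketch, assuming LoadOrStore semantics (the project's actual implementation may differ):

// LoadOrStore reports loaded=true if the URL was already recorded,
// so each URL enters the queue at most once.
func (c *Crawler) enqueueIfNew(rawURL string, depth int) {
    if _, loaded := c.discovered.LoadOrStore(rawURL, true); loaded {
        return // already discovered
    }
    select {
    case c.queue <- domain.URLTask{URL: rawURL, Depth: depth}:
    default:
        // Queue full: drop rather than block the crawl (illustrative policy)
    }
}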

Trade-offs:

  • Regex vs. HTML parser: Regex is faster for this specific use case (href extraction only) and doesn’t require parsing malformed HTML. For complex scenarios (JavaScript-rendered content, shadow DOM), an HTML parser would be necessary.
  • Memory vs. completeness: Storing all discovered URLs in memory limits scale but ensures complete coverage for typical applications. For very large sites (10,000+ pages), a database-backed queue would be more appropriate.

2. Concurrent Testing: The Tester

The tester implements a worker pool pattern with rate limiting to generate realistic load while respecting server capacity.

Architecture:

type Tester struct {
    config       *domain.Config
    client       *http.Client
    results      *domain.TestResults
    resultsMutex sync.Mutex           // guards results (see recordMetrics below)
    rateLimiter  *bucket.RateLimiter  // goflow token bucket
    urlQueue     chan string
}

func (t *Tester) Test(ctx context.Context, urls []string) error {
    // Initialize rate limiter: tokens per second with burst capacity
    burst := int(math.Ceil(t.config.Rate * 2))
    limiter, err := bucket.NewSafe(bucket.Limit(t.config.Rate), burst)
    if err != nil {
        return err
    }
    t.rateLimiter = limiter

    // Populate URL queue for round-robin distribution
    go t.feedURLQueue(ctx, urls)

    // Spawn worker pool
    var wg sync.WaitGroup
    for i := 0; i < t.config.Concurrency; i++ {
        wg.Add(1)
        go t.worker(ctx, &wg)
    }

    wg.Wait()
    return nil
}
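
feedURLQueue is not shown above; conceptually it cycles through the discovered URLs for the life of the test so every worker keeps receiving work. A minimal sketch of that round-robin shape (the real helper may differ):

func (t *Tester) feedURLQueue(ctx context.Context, urls []string) {
    if len(urls) == 0 {
        return
    }
    // Round-robin: keep cycling through the discovered URLs until the test
    // context is cancelled (workers stop on the same signal).
    for i := 0; ; i++ {
        select {
        case <-ctx.Done():
            return
        case t.urlQueue <- urls[i%len(urls)]:
        }
    }
}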

Worker implementation:

func (t *Tester) worker(ctx context.Context, wg *sync.WaitGroup) {
    defer wg.Done()

    for {
        select {
        case <-ctx.Done():
            return
        case url := <-t.urlQueue:
            // Acquire token from rate limiter (blocking)
            if err := t.rateLimiter.Wait(ctx); err != nil {
                return
            }

            // Execute request and collect metrics
            start := time.Now()
            resp, err := t.client.Get(url)
            duration := time.Since(start)

            t.recordMetrics(url, resp, duration, err)

            // Drain and close the body so the connection can be reused
            if resp != nil {
                io.Copy(io.Discard, resp.Body)
                resp.Body.Close()
            }
        }
    }
}

Rate limiting deep dive:

Lobster uses the token bucket algorithm via the goflow library. This provides smooth rate limiting with burst capacity:

  • Tokens per second: Set by -rate flag (default: 5.0)
  • Burst capacity: 2x the rate per second
  • Blocking behavior: Workers wait for tokens, preventing request dropping

Example: -rate 10 allows 10 requests/second sustained, with bursts up to 20 requests if tokens have accumulated.
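
For readers who have not used goflow, the same semantics can be sketched with the standard golang.org/x/time/rate package, which also implements a token bucket. This is an analogy for illustration, not Lobster's actual code:

package main

import (
    "context"
    "fmt"
    "time"

    "golang.org/x/time/rate"
)

func main() {
    // 10 tokens/second sustained, bursts of up to 20 if tokens have accumulated,
    // mirroring -rate 10 with Lobster's 2x burst policy.
    limiter := rate.NewLimiter(rate.Limit(10), 20)

    ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
    defer cancel()

    for i := 0; ; i++ {
        if err := limiter.Wait(ctx); err != nil {
            return // context expired: stop issuing requests
        }
        fmt.Println("request", i, "allowed at", time.Now().Format("15:04:05.000"))
    }
}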

Why token bucket over alternatives:

  • vs. Fixed window: Prevents traffic spikes at window boundaries
  • vs. Leaky bucket: Allows controlled bursts for realistic traffic patterns
  • vs. No limiting: Prevents overwhelming the server and getting 429 responses

Thread-safe metrics collection:

func (t *Tester) recordMetrics(url string, resp *http.Response, duration time.Duration, err error) {
    t.resultsMutex.Lock()
    defer t.resultsMutex.Unlock()

    t.results.TotalRequests++
    t.results.ResponseTimes = append(t.results.ResponseTimes, domain.ResponseTimeEntry{
        Timestamp:    time.Now(),
        ResponseTime: duration,
        URL:          url,
    })

    if err != nil {
        t.results.FailedRequests++
        t.results.Errors = append(t.results.Errors, domain.ErrorInfo{
            Timestamp: time.Now(),
            URL:       url,
            Error:     err.Error(),
        })
        return
    }

    t.results.SuccessfulRequests++
    // Additional status code tracking...
}

3. Performance Validation

Validation transforms raw metrics into actionable pass/fail criteria. This is what distinguishes Lobster from pure benchmarking tools.

Percentile calculation:

func CalculatePercentiles(responseTimes []time.Duration) (p50, p95, p99 time.Duration) {
    if len(responseTimes) == 0 {
        return 0, 0, 0
    }

    // Sort response times
    sorted := make([]time.Duration, len(responseTimes))
    copy(sorted, responseTimes)
    sort.Slice(sorted, func(i, j int) bool {
        return sorted[i] < sorted[j]
    })

    // Calculate percentile indices
    p50Index := int(float64(len(sorted)) * 0.50)
    p95Index := int(float64(len(sorted)) * 0.95)
    p99Index := int(float64(len(sorted)) * 0.99)

    return sorted[p50Index], sorted[p95Index], sorted[p99Index]
}

Target validation logic:

func Validate(results *domain.TestResults, targets domain.PerformanceTargets) []domain.PerformanceTarget {
    _, p95, p99 := CalculatePercentiles(extractDurations(results.ResponseTimes))

    checks := []domain.PerformanceTarget{
        {
            Name:        "Requests per Second",
            Target:      fmt.Sprintf("≥ %.1f req/s", targets.RequestsPerSecond),
            Actual:      fmt.Sprintf("%.1f req/s", results.RequestsPerSecond),
            Passed:      results.RequestsPerSecond >= targets.RequestsPerSecond,
            Description: "Sustained request throughput",
        },
        {
            Name:        "95th Percentile Response Time",
            Target:      fmt.Sprintf("≤ %.0fms", targets.P95ResponseTimeMs),
            Actual:      fmt.Sprintf("%.1fms", float64(p95.Milliseconds())),
            Passed:      float64(p95.Milliseconds()) <= targets.P95ResponseTimeMs,
            Description: "95% of requests complete within target",
        },
        // Additional checks...
    }

    return checks
}
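
The check results drive both the console summary and the process exit code that CI pipelines gate on (see Use Case 1 below). A rough sketch of that wiring; the exit-code handling here is illustrative, not a verbatim excerpt:

checks := Validate(results, targets)

passed := 0
for _, check := range checks {
    if check.Passed {
        passed++
    }
}
fmt.Printf("Overall: %d/%d targets met (%.1f%%)\n",
    passed, len(checks), 100*float64(passed)/float64(len(checks)))

// A non-zero exit lets shell scripts and CI jobs gate deployments on the result
if passed < len(checks) {
    os.Exit(1)
}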

Default targets are designed for typical web applications:

func DefaultPerformanceTargets() PerformanceTargets {
    return PerformanceTargets{
        RequestsPerSecond:   100,    // Moderate throughput
        AvgResponseTimeMs:   50,     // Snappy response
        P95ResponseTimeMs:   100,    // Acceptable tail latency
        P99ResponseTimeMs:   200,    // Maximum acceptable latency
        SuccessRate:         99.0,   // High reliability
        ErrorRate:           1.0,    // Low error tolerance
    }
}

These can be customized via configuration file for specific application requirements.

4. Multi-Format Reporting

Lobster generates three report formats, each serving different use cases:

Console output for development:

=== STRESS TEST RESULTS ===
Duration: 2m0s
URLs Discovered: 42
Total Requests: 2,450
Successful Requests: 2,442
Failed Requests: 8
Average Response Time: 18.7ms
Requests/Second: 20.4
Success Rate: 99.67%

🎯 PERFORMANCE TARGET VALIDATION
============================================================
✅ PASS Requests per Second:         20.4 req/s
✅ PASS Average Response Time:        18.7ms
✅ PASS 95th Percentile Response Time: 35.2ms
✅ PASS Success Rate:                 99.67%

Overall: 4/4 targets met (100.0%)
🎉 ALL PERFORMANCE TARGETS MET!

JSON output for CI/CD integration:

{
  "duration": "2m0s",
  "urls_discovered": 42,
  "total_requests": 2450,
  "performance_validation": {
    "targets_met": 4,
    "total_targets": 4,
    "overall_status": "PRODUCTION_READY",
    "checks": [...]
  },
  "url_validations": [...],
  "errors": [...],
  "slow_requests": [...]
}

HTML report for stakeholders:

  • Overview dashboard with key metrics
  • Interactive charts (response time distribution, status codes)
  • Sortable table of all tested URLs
  • Error analysis with actionable recommendations

The HTML report embeds its CSS and JavaScript, loading Chart.js from a CDN, so a single file is all you need to open and share the results in a browser.

Real-World Applications

Use Case 1: Pre-Deployment Validation

Scenario: A team is deploying a new feature that modifies database queries. They want to ensure no performance regressions were introduced.

Workflow:

# Start local application with production-like data
./start-local.sh

# Run Lobster with production-like targets
lobster -url http://localhost:3000 \
  -duration 5m \
  -concurrency 20 \
  -rate 10 \
  -config production-targets.json

# Check exit code
if [ $? -eq 0 ]; then
  echo "Performance validation passed"
else
  echo "Performance regression detected"
  exit 1
fi

Configuration (production-targets.json):

{
  "performance_targets": {
    "requests_per_second": 50,
    "avg_response_time_ms": 100,
    "p95_response_time_ms": 200,
    "success_rate": 99.5
  }
}

Use Case 2: CI/CD Integration

GitHub Actions workflow:

name: Performance Tests

on: [pull_request]

jobs:
  performance:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Start application
        run: |
          docker-compose up -d
          ./scripts/wait-for-app.sh

      - name: Install Lobster
        run: go install github.com/1mb-dev/lobster/cmd/lobster@latest

      - name: Run stress tests
        run: |
          lobster -url http://localhost:3000 \
            -duration 3m \
            -output results.json

      - name: Upload results
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: performance-results
          path: results.json

      - name: Comment PR
        uses: actions/github-script@v7
        if: always()
        with:
          script: |
            const fs = require('fs');
            const results = JSON.parse(fs.readFileSync('results.json'));
            const status = results.performance_validation.overall_status;
            const comment = `## Performance Test Results\n\n` +
              `Status: ${status}\n` +
              `Requests: ${results.total_requests}\n` +
              `Success Rate: ${results.success_rate}%\n` +
              `Avg Response Time: ${results.average_response_time}`;
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: comment
            });

Use Case 3: Capacity Planning

Scenario: Understanding current application limits before scaling.

Progressive load testing:

# Test at increasing concurrency levels
for concurrency in 5 10 20 50 100; do
  echo "Testing with $concurrency workers..."
  lobster -url http://staging.example.com \
    -duration 10m \
    -concurrency $concurrency \
    -rate 10 \
    -output "results-c${concurrency}.json"

  # Extract key metrics
  jq '.requests_per_second, .success_rate' "results-c${concurrency}.json"
done

Analysis: Plot throughput vs. concurrency to identify bottlenecks and optimal worker count.

Design Decisions and Trade-offs

1. Why Go?

Chosen advantages:

  • Excellent concurrency primitives (goroutines, channels, sync package)
  • Fast compilation and single-binary deployment
  • Strong standard library for HTTP and networking
  • Good performance for I/O-bound workloads

Alternatives considered:

  • Python: Easier HTML parsing but poorer concurrency story
  • Rust: Better performance but higher development complexity
  • Node.js: Good for async I/O but less suitable for CPU-intensive percentile calculations

2. Why Clean Architecture?

Benefits realized:

  • Clear separation allows independent testing of crawler, tester, validator
  • Easy to extend (e.g., adding GraphQL support would be a new crawler implementation)
  • Domain models are reusable across different interfaces (CLI today, API server tomorrow)

Costs:

  • More files and indirection than a monolithic approach
  • Possibly over-engineered for current scope (~1,000 LOC)

Verdict: Worth it for maintainability and future extensions.

3. In-Memory URL Storage

Current approach: All discovered URLs stored in sync.Map

Limitations:

  • Memory usage grows with URL count (~1KB per URL)
  • Unsuitable for very large sites (10,000+ pages)

Alternatives for future:

  • SQLite for large crawls
  • Redis for distributed crawling
  • Streaming approach (test as you discover)

Current reasoning: Simplicity and performance for the target use case (typical web applications with hundreds to low thousands of pages).

4. Synchronous Crawling

Current approach: Crawl phase completes before testing phase begins

Benefits:

  • Simpler implementation
  • Easier to estimate test duration
  • Complete URL list available for reporting

Alternative: Stream discovered URLs to testing queue

Benefits of streaming:

  • Faster time to first test
  • Lower memory footprint
  • Can begin testing high-priority pages immediately

Trade-off decision: Synchronous approach chosen for v1.0 due to simpler error handling and clearer user experience. Streaming mode could be added as an option in future versions.
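
If a streaming mode is added later, the likely shape is a shared channel: the crawler emits URLs as it discovers them and workers consume them directly, with no intermediate slice. A rough sketch of that alternative (CrawlInto and Hit are hypothetical helpers, not current APIs):

// Sketch of a streaming pipeline: discovery and testing overlap instead of
// running as two sequential phases.
discovered := make(chan string, 256)

go func() {
    defer close(discovered)
    crawler.CrawlInto(ctx, discovered) // hypothetical: emits URLs as they are found
}()

var wg sync.WaitGroup
for i := 0; i < concurrency; i++ {
    wg.Add(1)
    go func() {
        defer wg.Done()
        for url := range discovered {
            tester.Hit(ctx, url) // hypothetical: one rate-limited request
        }
    }()
}
wg.Wait()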

Performance Characteristics

Benchmarks

Test environment: MacBook Pro M1, 16GB RAM, testing against local Nginx serving static HTML

Crawler performance:

  • 100 URLs: ~2 seconds
  • 1,000 URLs: ~20 seconds
  • 5,000 URLs: ~95 seconds (bottleneck: network I/O)

Tester performance (after URL discovery):

  • 5 workers, 5 req/s: Smooth, no dropped requests
  • 50 workers, 50 req/s: Smooth, <1% CPU usage
  • 100 workers, 100 req/s: 429 errors on test server

Memory usage:

  • Base: ~10MB
  • With 1,000 URLs: ~15MB
  • With 10,000 URLs: ~45MB

Optimization Opportunities

1. Crawler parallelization: Current single-threaded crawler could spawn multiple workers

// Current: Sequential
for task := range queue {
    c.processURL(task)
}

// Potential: Parallel with worker pool
var wg sync.WaitGroup
for i := 0; i < crawlerWorkers; i++ {
    wg.Add(1)
    go func() {
        defer wg.Done()
        for task := range queue {
            c.processURL(task)
        }
    }()
}
wg.Wait()

2. Response body sampling: Currently reads up to 64KB for link extraction

// Current behavior: cap the body read at 64KB before extracting links
body := http.MaxBytesReader(nil, resp.Body, 64*1024)
// Potential optimization: stream the body and stop once the first N links are found

3. HTML parsing: Regex is fast but misses edge cases. For complex pages:

import "golang.org/x/net/html"

func extractLinks(body io.Reader) []string {
    doc, _ := html.Parse(body)
    return traverseForLinks(doc)
}
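
traverseForLinks is left undefined above; a minimal recursive walk over the parsed tree (an illustrative sketch, not the project's code) could look like:

func traverseForLinks(n *html.Node) []string {
    var links []string
    if n.Type == html.ElementNode && n.Data == "a" {
        for _, attr := range n.Attr {
            if attr.Key == "href" {
                links = append(links, attr.Val)
            }
        }
    }
    // Recurse into child nodes and collect their links
    for c := n.FirstChild; c != nil; c = c.NextSibling {
        links = append(links, traverseForLinks(c)...)
    }
    return links
}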

When to Use Lobster

Good Fit

  • Development workflows: Quick validation during feature development
  • Small to medium web applications: Up to a few thousand pages
  • Pre-deployment smoke tests: Automated in CI/CD pipelines
  • Regression testing: Ensure performance doesn’t degrade over time
  • API endpoint discovery: REST APIs with hypermedia links

Not a Good Fit

  • JavaScript-heavy SPAs: Lobster parses static HTML; it won’t execute JavaScript
  • Extremely large sites: 10,000+ pages may be slow and memory-intensive
  • Complex authentication flows: Lobster supports bearer, basic, header, and cookie auth but not multi-step OAuth or SAML flows
  • Production load testing: Use dedicated solutions with distributed workers
  • Precise latency measurements: Not a replacement for dedicated APM tools

Conclusion

Lobster demonstrates that there’s value in the space between simple single-URL load testing tools and complex enterprise solutions. By focusing on automation (crawler-first design), validation (pass/fail criteria), and developer experience (zero-config defaults), it addresses a specific workflow gap in the testing toolkit.

The implementation choices—Clean Architecture, token bucket rate limiting, multi-format reporting—prioritize maintainability and extensibility over raw performance. This positions Lobster well for future enhancements like distributed testing and real-time result streaming.

The tool is most effective as part of a continuous integration pipeline, providing automated performance validation before deployments. While it won’t replace dedicated monitoring solutions or distributed load testing platforms, it fills a valuable niche for rapid, automated performance validation during development and in CI/CD workflows.


Project: github.com/1mb-dev/lobster
Version: 2.0.0
License: MIT
Author: @1mb-dev