stasis-proxy - v0.1.0

    The Zero-Config Local Cache for AI Engineers.
    Stop burning your budget on repeated test runs.



    Building AI apps involves running the same prompts hundreds of times:

    1. Expenses Pile Up: Every npm test run costs real money.
    2. Speed Kills Flow: Waiting 5s for GPT-4 breaks your thought process.
    3. Flaky Tests: Non-deterministic LLMs make unit tests unreliable.

    You could use a heavyweight Enterprise Gateway (Helicone, Portkey) or install Python tools (LiteLLM), but why add friction to your Node.js workflow?

    stasis-proxy is the missing json-server for AI.
    It is a local-first, zero-config HTTP proxy that caches LLM responses in a local SQLite file.

    | Feature | AWS Bedrock Caching | Enterprise Gateways (Helicone) | stasis-proxy |
    |---------|---------------------|--------------------------------|--------------|
    | Goal | Lower latency for huge contexts | Production observability | Free & instant local development |
    | Cost | You still pay (discounted) | You pay for their service | $0.00 |
    | Setup | Complex CloudFormation | API Keys, Cloud Accounts | npx stasis start |
    | Data | Ephemeral (5 min TTL) | Sent to 3rd party cloud | 100% Local (SQLite) |


    # Clone and install
    git clone <repo-url> stasis-proxy
    cd stasis-proxy
    npm install
    npm run build
    # Start the proxy (development)
    npm run dev -- start --port 4000 --upstream https://api.openai.com

    # Start the proxy (production)
    npm start -- start --port 4000 --upstream https://api.openai.com

    # Or use npx after global install
    npx stasis start --port 4000 --upstream https://api.openai.com

    Point your OpenAI/Anthropic client to the proxy:

    // OpenAI
    import OpenAI from 'openai';

    const openai = new OpenAI({
      baseURL: 'http://localhost:4000/v1', // Point to stasis-proxy
      apiKey: process.env.OPENAI_API_KEY,
    });

    // Anthropic (using OpenAI-compatible endpoint)
    const anthropic = new OpenAI({
      baseURL: 'http://localhost:4000/v1',
      apiKey: process.env.ANTHROPIC_API_KEY,
    });

    Or set environment variables:

    export OPENAI_API_BASE=http://localhost:4000/v1
    

    To use stasis-proxy with AWS Bedrock, route your requests through the proxy. The proxy forwards your client's original AWS Signature V4 header to the upstream while using a normalized "smart key" for caching, so all you need to do is point your Bedrock client at the proxy.

    However, not every version of the AWS SDK makes it easy to override the full service URL with a simple baseURL setting. The most reliable approach is to use a custom request handler or middleware in your application code.

    Example: Using a custom Request Handler (Node.js SDK v3)

    import http from "node:http";
    import https from "node:https";
    import { BedrockRuntimeClient, InvokeModelCommand } from "@aws-sdk/client-bedrock-runtime";
    import { NodeHttpHandler } from "@smithy/node-http-handler";

    const proxyClient = new BedrockRuntimeClient({
      region: "us-east-1",
      requestHandler: new NodeHttpHandler({
        // Point the underlying HTTP handler to the proxy
        // Note: This requires the proxy to be running on localhost:4000
        httpAgent: new http.Agent({
          host: 'localhost',
          port: 4000,
          protocol: 'http:',
        }),
        httpsAgent: new https.Agent({
          host: 'localhost',
          port: 4000,
          protocol: 'http:',
        }),
      }),
      // IMPORTANT: You might need to disable SSL verification if using a local proxy
      // without valid certs for 'bedrock.us-east-1.amazonaws.com', or simply rely on
      // the proxy to handle the upstream connection.
    });

    // Stasis Proxy specific: The proxy listens on /model/<model-id>/invoke

    Alternatively, if you are using a library like LangChain, you can often set the endpoint URL directly:

    // Import path assumes the LangChain community package; it may differ by LangChain version.
    import { Bedrock } from "@langchain/community/llms/bedrock";

    const model = new Bedrock({
      model: "anthropic.claude-v2",
      region: "us-east-1",
      endpointUrl: "http://localhost:4000/model/anthropic.claude-v2/invoke", // Full path to proxy
    });

    Whichever approach you use, your client must send standard AWS headers (Authorization: AWS4-HMAC-SHA256 ...) for stasis-proxy's "Smart Auth" caching to work.


    See the examples/ directory for complete, runnable examples:

    • OpenAI Integration — Full setup with unit tests demonstrating cached testing
    • Development Workflow — Prompt engineering iteration with instant feedback
    • AI Service Pattern — Realistic content analyzer service with sentiment analysis, NER, and categorization
    cd examples
    npm install
    npm run test:with-proxy # See the caching magic!

    Key Results:

    | Run | Time | Cost |
    |-----|------|------|
    | First run | ~175s | ~$0.02 |
    | Cached run | ~5s | $0.00 |

    Unlike a dumb HTTP cache, stasis-proxy understands LLM semantics.

    The cache key is generated from:

    1. Normalized JSON body — Keys are deeply sorted, so {"a":1,"b":2} and {"b":2,"a":1} produce identical cache keys
    2. Authorization header — Prevents data leakage between API keys
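    For illustration, here is a minimal sketch of how such a key can be derived. This is not the proxy's actual hasher.ts implementation; the sortKeys helper, the SHA-256 choice, and the key layout are assumptions:

    import { createHash } from 'node:crypto';

    // Recursively sort object keys so that semantically identical JSON bodies
    // serialize to the same string regardless of key order.
    function sortKeys(value: unknown): unknown {
      if (Array.isArray(value)) return value.map(sortKeys);
      if (value !== null && typeof value === 'object') {
        return Object.fromEntries(
          Object.keys(value as Record<string, unknown>)
            .sort()
            .map((k) => [k, sortKeys((value as Record<string, unknown>)[k])]),
        );
      }
      return value;
    }

    // Hypothetical cache key: normalized body plus the Authorization header.
    function cacheKey(body: unknown, authorization: string): string {
      const normalized = JSON.stringify(sortKeys(body));
      return createHash('sha256').update(normalized).update(authorization).digest('hex');
    }

    // {"a":1,"b":2} and {"b":2,"a":1} hash to the same key:
    console.log(
      cacheKey({ a: 1, b: 2 }, 'Bearer sk-...') === cacheKey({ b: 2, a: 1 }, 'Bearer sk-...'),
    ); // true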

    Control caching behavior with the X-Stasis-Mode header:

    | Mode | Behavior |
    |------|----------|
    | cache (default) | Return cached response if available, otherwise fetch and cache |
    | fresh | Force a new fetch and update the cache, ignoring any existing entry |
    | bypass | Proxy directly without touching the cache at all |
    • temperature: 0 — Responses are cached indefinitely (deterministic)
    • temperature > 0 — Caching still works; use fresh mode when you need new creative outputs

    For this MVP, requests with "stream": true automatically bypass the cache. Full streaming support is planned for v0.2.


    Every response includes the X-Stasis-Status header:

    | Status | Meaning |
    |--------|---------|
    | HIT | Response served from cache |
    | MISS | Response fetched from upstream and cached |
    | BYPASS | Response proxied without cache interaction (streaming, bypass mode) |
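    A quick way to confirm caching is working is to read this header directly. A minimal sketch with fetch (the model name and prompt are placeholders):

    const res = await fetch('http://localhost:4000/v1/chat/completions', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      },
      body: JSON.stringify({
        model: 'gpt-4o-mini',
        messages: [{ role: 'user', content: 'ping' }],
        temperature: 0,
      }),
    });

    console.log(res.headers.get('x-stasis-status')); // "MISS" on the first call, "HIT" after that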

    stasis start [options]

    Options:
    -p, --port <port> Port to listen on (default: 4000)
    -u, --upstream <url> Upstream API URL (required)
    -d, --db <path> SQLite database path (default: ./stasis-cache.db)
    -l, --log-level <level> Log level: fatal|error|warn|info|debug|trace (default: info)
    -h, --help Show help
    -v, --version Show version
    # OpenAI
    stasis start -p 4000 -u https://api.openai.com

    # Anthropic
    stasis start -p 4000 -u https://api.anthropic.com

    # With custom database location
    stasis start -p 4000 -u https://api.openai.com -d ~/.stasis/cache.db

    # Verbose logging
    stasis start -p 4000 -u https://api.openai.com -l debug

    curl http://localhost:4000/health

    {
      "status": "healthy",
      "cache": {
        "entries": 42,
        "tokensSaved": 128500,
        "dbSizeBytes": 1048576
      }
    }

    curl http://localhost:4000/stats

    {
      "totalEntries": 42,
      "totalTokensSaved": 128500,
      "dbSizeBytes": 1048576,
      "oldestEntry": "2024-01-15T10:30:00.000Z",
      "newestEntry": "2024-01-15T14:45:00.000Z"
    }
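    Both endpoints return plain JSON, so they are easy to poll from a script. A small sketch (the field names match the stats response above):

    // Print a one-line cache summary, e.g. in a CI log.
    const stats = (await fetch('http://localhost:4000/stats').then((r) => r.json())) as {
      totalEntries: number;
      totalTokensSaved: number;
    };
    console.log(`${stats.totalEntries} cached responses, ~${stats.totalTokensSaved} tokens saved`);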

    ┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
    │     Your App     │────▶│   stasis-proxy   │────▶│    OpenAI/etc    │
    │                  │◀────│                  │◀────│                  │
    └──────────────────┘     └──────────────────┘     └──────────────────┘
                                       │
                                       ▼
                             ┌──────────────────┐
                             │   SQLite Cache   │
                             │ (stasis-cache.db)│
                             └──────────────────┘
    • Runtime: Node.js 20+
    • Framework: Fastify (low overhead, plugin architecture)
    • Database: better-sqlite3 (single-file, zero-dependency)
    • Validation: Zod (runtime type checking)
    • Logging: Pino (high-performance, pretty-printed)
    • CLI: cac (lightweight CLI framework)

    stasis-proxy/
    ├── src/
    │   ├── cli.ts              # CLI entry point
    │   ├── index.ts            # Library exports
    │   ├── types.ts            # Shared types and schemas
    │   ├── core/
    │   │   ├── server.ts       # Fastify server setup
    │   │   ├── hasher.ts       # JSON normalization & hashing
    │   │   └── interceptor.ts  # Request interception & caching logic
    │   └── store/
    │       └── sqlite.ts       # SQLite database wrapper
    ├── examples/               # Integration examples and tests
    │   ├── src/
    │   │   ├── services/       # Realistic AI service patterns
    │   │   ├── __tests__/      # Unit tests demonstrating caching
    │   │   └── dev/            # Development workflow utilities
    │   └── README.md           # Examples documentation
    ├── package.json
    ├── tsconfig.json
    └── README.md

    # Run tests
    npm test

    # Watch mode
    npm run test:watch

    • [ ] v0.2: Streaming response caching
    • [ ] v0.3: Cache TTL and expiration policies
    • [ ] v0.4: Web UI for cache inspection
    • [ ] v0.5: Anthropic-native format support

    MIT © 2025 Greg King