Autonomous Code Agent

Shipyard Agent

Plan, execute, verify, and review code changes autonomously. LangGraph-powered multi-phase pipeline with human-in-the-loop oversight.

LangGraph Pipeline

Five-Phase Execution

Every instruction flows through a deterministic graph. Each phase has its own model or tool and its own validation gates.

Plan

Reads the codebase and generates a step-by-step execution plan with file targets

GPT-5.4

Execute

Applies edits, creates files, runs tools against the working directory

GPT-5.4 Mini

Verify

Runs typecheck, tests, and lint, and captures stdout/stderr for review

bash

Review

Evaluates diffs and verification output, decides: accept, retry, or escalate

GPT-5.4

Report

Generates human-readable summary of all changes and verification results

GPT-5.4 Mini
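
The routing between the five phases above can be sketched as a small transition function. This is a plain TypeScript illustration, not Shipyard Agent's actual LangGraph definition. The names `Phase`, `ReviewDecision`, `RunState`, `nextPhase`, and `MAX_RETRIES` are hypothetical, and the routing rules (a retry returns to Execute, an escalation or an exhausted retry budget hands off to a person) are assumptions, since the page only names the three review outcomes.

```typescript
// Hypothetical sketch: phase names follow the page; routing rules are assumptions.
type Phase = "plan" | "execute" | "verify" | "review" | "report";
type ReviewDecision = "accept" | "retry" | "escalate";
type Next = Phase | "await_human" | "done";

interface RunState {
  phase: Phase;
  retries: number;           // how many times review has already sent the run back
  decision?: ReviewDecision; // only meaningful while phase === "review"
}

const MAX_RETRIES = 2; // assumed limit, not taken from the product

function nextPhase(state: RunState): Next {
  switch (state.phase) {
    case "plan":
      return "execute";
    case "execute":
      return "verify";
    case "verify":
      return "review";
    case "review":
      if (state.decision === "accept") return "report";
      if (state.decision === "retry" && state.retries < MAX_RETRIES) return "execute";
      // Escalation, a missing decision, or an exhausted retry budget: hand off to a person.
      return "await_human";
    case "report":
      return "done";
  }
}
```

Because the function depends only on the current state, the same state always yields the same next phase, which is what makes the graph deterministic even though the model outputs inside each phase are not.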

Capabilities

Built for Real Codebases

Not a toy. Shipyard Agent handles multi-file edits, test suites, type systems, and CI gates.

Live Edit Feed

Watch file edits stream in real time as the agent works. Every tool call and every diff is visible in the dashboard as it happens.

Plan-then-Confirm

Review the agent's plan before any code is touched. Approve, modify, or reject. Like Cursor's plan review, built into the pipeline.
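
A minimal sketch of how an approve, modify, or reject gate could sit between Plan and Execute. The types `PlanStep`, `PlanDecision`, and `GateResult` and the function `applyPlanDecision` are hypothetical; the page names the three outcomes but does not describe their shape.

```typescript
// Hypothetical types: only the three outcome names come from the page.
interface PlanStep {
  description: string;
  file: string; // target file for this step
}

type PlanDecision =
  | { kind: "approve" }
  | { kind: "modify"; steps: PlanStep[] }
  | { kind: "reject"; reason: string };

type GateResult =
  | { proceed: true; steps: PlanStep[] }
  | { proceed: false; reason: string };

function applyPlanDecision(plan: PlanStep[], decision: PlanDecision): GateResult {
  switch (decision.kind) {
    case "approve":
      return { proceed: true, steps: plan };
    case "modify":
      // An edited plan with no steps left has nothing to execute.
      return decision.steps.length > 0
        ? { proceed: true, steps: decision.steps }
        : { proceed: false, reason: "modified plan is empty" };
    case "reject":
      return { proceed: false, reason: decision.reason };
  }
}
```

Modeling the decision as a tagged union means the execute phase only ever receives a result with `proceed: true`, so no code can be touched without an explicit approval or an edited plan.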

Model Flexibility

Switch between GPT-5.4 Mini for fast edits and GPT-5.4 for deeper reasoning. Per-run model selection.
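
Per-run model selection could look roughly like the sketch below. The per-phase defaults mirror the phase cards on this page. The identifiers `DEFAULT_MODELS`, `RunOptions`, and `modelFor` are hypothetical, and treating the per-run choice as an override for every model-backed phase is an assumption.

```typescript
// Model names come from the page; identifiers and override behavior are assumptions.
type ModelName = "GPT-5.4" | "GPT-5.4 Mini";
type ModelPhase = "plan" | "execute" | "review" | "report"; // verify runs bash, no model

const DEFAULT_MODELS: Record<ModelPhase, ModelName> = {
  plan: "GPT-5.4",
  execute: "GPT-5.4 Mini",
  review: "GPT-5.4",
  report: "GPT-5.4 Mini",
};

interface RunOptions {
  model?: ModelName; // optional per-run choice
}

function modelFor(phase: ModelPhase, options: RunOptions = {}): ModelName {
  // A per-run choice wins; otherwise fall back to the phase default.
  return options.model ?? DEFAULT_MODELS[phase];
}
```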

Benchmark Tracking

Capture snapshots of type safety, test health, security, build speed, and more. Compare the original and refactored codebases with radar and trend charts.
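
As an illustration of comparing two snapshots, the sketch below computes a per-metric delta. The `Snapshot` shape, the metric keys, and the example scores are all hypothetical; the page does not specify how metrics are stored or scaled.

```typescript
// Hypothetical snapshot shape: metric name -> numeric score.
type Snapshot = Record<string, number>;

function compareSnapshots(original: Snapshot, refactored: Snapshot): Record<string, number> {
  const deltas: Record<string, number> = {};
  for (const [metric, before] of Object.entries(original)) {
    const after = refactored[metric];
    if (after === undefined) continue; // only compare metrics present in both snapshots
    deltas[metric] = after - before;   // positive means the refactored score is higher
  }
  return deltas;
}

// Example with made-up scores.
const deltas = compareSnapshots(
  { typeSafety: 60, testHealth: 70, security: 80 },
  { typeSafety: 75, testHealth: 70 },
);
```

Whether a positive delta is an improvement depends on the metric: a higher test-health score is good, while a larger build time is not, so a real comparison would need a direction per metric.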

Architecture

How It Fits Together

A TypeScript runtime with an Express API, WebSocket streaming, and PostgreSQL persistence. The graph engine is LangGraph.

Dashboard UI

REST API

WebSocket

Shipyard Agent

A LangGraph state machine with plan, execute, verify, review, and report nodes. Includes repo map context, tool hooks, and a persistence layer.

File Edits

Verification

Reports

TypeScript runtime

LangGraph state machine

OpenAI GPT-5

Express + WebSocket

PostgreSQL persistence

Vitest + typecheck