Rebuilding hwb2 | Learning

An AI-augmented math-tutoring product I'm actively rebuilding: a modern Next.js rewrite, then a re-architected, human-in-the-loop intake pipeline.

Role: Solo build
Year: 2026
Status: Active (in progress)
Links: Live: hwb2learning.com

Next.js
Claude API
OpenAI Agent Builder
Resend

What it is

hwb2 | Learning is one-on-one math tutoring for grades 6–9, led by a credentialed teacher. AI runs behind the scenes for intake, calibrated practice, and progress tracking. A teacher runs every session, and the student never uses AI during one. For families, that distinction is the point: the child is not talking to a chatbot, and a real teacher is the only one using AI.

The model is three steps (intake, then guided practice, then progress visibility) and the principles stay constant throughout: human-led and AI-augmented, short feedback cycles, progress that families can actually see, and a system rather than a hustle.

The problem, and the insight

One-to-one tutoring does not scale, administrative work eats into teaching time, and feedback between sessions is slow and inconsistent. The insight that shapes the product is that most of the repetitive cognitive work in tutoring happens before and after a session, not during it. So AI belongs at intake, practice generation, and progress tracking, where it buys back the teacher's time, and it stays out of the live session, where human judgment is the entire value.

The first piece I built to prove this was the intake reply pipeline: when a family submits the early-access form, they get a fast, personalized, on-brand reply, and I get a clean read on what they need.

The build: a re-architecture (v1 to v2)

v1: proof on a multi-service stack. The first working version ran on Google and OpenAI. A Google Form fed a Sheet, an Apps Script trigger called a FastAPI service on Cloud Run, that service ran an OpenAI Agent Builder workflow to classify and draft the reply, and Apps Script then sent the email and wrote status back to the Sheet. It worked end to end and proved the concept. The cost was operational weight: five moving services, service-account auth between them, and a deploy story that reached into Google Cloud every time.

v1's reply engine: the classifier and three topic responders, wired together in the OpenAI Agent Builder canvas.

v2: the same product on a leaner stack. I rebuilt the live pipeline to be Vercel-native and swapped the reply engine to the Anthropic Claude API. Now a React form (Formspree) triggers a Next.js Server Action that classifies the request with Claude Haiku, drafts the reply with Claude Sonnet, and sends the draft through Resend to me for review. No Google Cloud, no Sheets, no Apps Script, no OpenAI: one repository, one deploy.

v1 pipeline: a Google Form submission triggers an Apps Script that calls an OpenAI Agent Builder workflow to classify, draft, and assemble a reply, then logs to a Google Sheet and sends the reply through Gmail.

v2 pipeline: a Formspree React form triggers a Next.js Server Action that classifies with Claude Haiku and drafts with Claude Sonnet, then a human reviews and approves before Resend sends the email.

Three decisions drove the rebuild.

Simpler, cheaper infrastructure. Five services collapsed into one Vercel-native path. Routing runs on a small, low-cost model and drafting on a stronger one, so each reply costs a fraction of a cent and the whole pilot runs on free tiers.

Safety by design, because the users are minors and their parents. The pipeline is human-in-the-loop and review-first. Every AI-drafted reply is sent to me to review and send by hand, with the reply address already pointed at the parent, so nothing model-written reaches a family unseen. That is a deliberate product choice for this audience, not a constraint I backed into.
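As a rough sketch of the review-first routing described above (not the production code): the draft is addressed to the reviewer, and the parent appears only as the reply address. The `ReviewEmail` shape, the `buildReviewEmail` helper, and every address below are hypothetical and only loosely mirror a typical mail payload.

```typescript
// Hypothetical envelope for a review-first draft.
interface ReviewEmail {
  from: string;
  to: string;      // always the human reviewer, never the family
  replyTo: string; // the parent, so a hand-sent reply goes to them
  subject: string;
  text: string;
}

interface Draft {
  subject: string;
  body: string;
  call_to_action: string;
}

function buildReviewEmail(
  draft: Draft,
  parentEmail: string,
  reviewerEmail: string,
  fromAddress: string,
): ReviewEmail {
  // Append the call-to-action only when there is one.
  const parts = [draft.body, draft.call_to_action].filter((p) => p.length > 0);
  return {
    from: fromAddress,
    to: reviewerEmail,
    replyTo: parentEmail,
    subject: draft.subject,
    text: parts.join("\n\n"),
  };
}

const review = buildReviewEmail(
  { subject: "Thanks for reaching out", body: "Hello, and thank you.", call_to_action: "" },
  "parent@example.com",
  "teacher@example.com",
  "intake@example.com",
);
```

Because the parent's address never appears in `to`, nothing in this path can deliver a model-written draft to a family directly; the only route to the parent is the reviewer replying by hand.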

A disciplined migration. The response contract, the shape every downstream step reads, stayed identical across the swap, so replacing the engine underneath did not ripple into anything else:

type Route = "standard" | "pricing" | "other" | "no_google";

interface ReplyContract {
  route: Route;            // deterministic gate first, else Haiku classifies
  subject: string;         // assembled in code, never by the model
  body: string;            // the only field the model writes
  call_to_action: string;  // assembled in code; omitted on no_google
  metadata: ReplyMetadata; // raw form values: student name, grade, topic
}

The comments mark the split that kept the swap safe: the model only ever writes body, the subject and call_to_action are assembled in code, and route comes from a deterministic gate or the classifier. Because that interface held, I could change the engine that fills it from OpenAI to Claude without touching how the reply is assembled or sent. Changing a core dependency without breaking the contract is the part I am most pleased with.
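One way to picture why the swap stayed contained is to put the engine behind a narrow interface and assemble everything else in code. The sketch below is illustrative only: `DraftEngine`, `buildReply`, the `ReplyMetadata` field names, and all of the subject and call-to-action copy are assumptions, and the two engines are stand-ins that make no API calls.

```typescript
type Route = "standard" | "pricing" | "other" | "no_google";

// Hypothetical field names; the source only lists student name, grade, topic.
interface ReplyMetadata {
  studentName: string;
  grade: string;
  topic: string;
}

interface ReplyContract {
  route: Route;
  subject: string;
  body: string;
  call_to_action: string;
  metadata: ReplyMetadata;
}

// Hypothetical seam: the one thing a vendor swap has to re-implement.
interface DraftEngine {
  draftBody(route: Route, parentMessage: string): Promise<string>;
}

// Placeholder copy, assembled in code and keyed by route.
const SUBJECTS: Record<Route, string> = {
  standard: "Thanks for reaching out",
  pricing: "About pricing",
  other: "Thanks for your note",
  no_google: "Thanks for reaching out",
};

// Assumption: "omitted" is modelled here as an empty string.
const CALLS_TO_ACTION: Record<Route, string> = {
  standard: "Reply to this email to book a first session.",
  pricing: "Reply to this email and I will send current rates.",
  other: "Reply to this email with any questions.",
  no_google: "",
};

async function buildReply(
  engine: DraftEngine,
  route: Route,
  parentMessage: string,
  metadata: ReplyMetadata,
): Promise<ReplyContract> {
  // The engine contributes the body and nothing else.
  const body = await engine.draftBody(route, parentMessage);
  return {
    route,
    subject: SUBJECTS[route],
    body,
    call_to_action: CALLS_TO_ACTION[route],
    metadata,
  };
}

// Stand-ins for the old and new vendors; real ones would call an API here.
const oldEngine: DraftEngine = {
  draftBody: async (route) => `Draft from the old engine (${route}).`,
};
const newEngine: DraftEngine = {
  draftBody: async (route) => `Draft from the new engine (${route}).`,
};
```

Passing `oldEngine` or `newEngine` to `buildReply` changes only `body`; the subject, call-to-action, and metadata come out identical, which is the property the contract is meant to protect.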

How the reply works now

A deterministic gate handles the clear cases first. Everything else goes to a fast classifier that sorts the request into a small set of routes, and then a stronger model drafts the body within fixed per-route guardrails, paraphrasing what the parent actually wrote rather than echoing it. The subject line and the call-to-action are assembled in code, so the model only ever writes the body. That keeps the personalized part personalized and the fixed strings exact.
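The gate-then-classifier order can be sketched as a list of rules tried before any model call. The actual gate rules are not described here, so `pricingRule` is purely a hypothetical example, as are `resolveRoute`, `parseRoute`, and the choice to fall back to "other" when the classifier returns an unrecognized label.

```typescript
type Route = "standard" | "pricing" | "other" | "no_google";

const ROUTES: readonly Route[] = ["standard", "pricing", "other", "no_google"];

// A gate rule returns a route when it is sure, or null to pass.
type GateRule = (message: string) => Route | null;

// The classifier returns raw model text, which still needs validating.
type Classifier = (message: string) => Promise<string>;

// Hypothetical rule: an explicit pricing keyword short-circuits the model.
const pricingRule: GateRule = (message) =>
  /\b(price|pricing|cost|fee)s?\b/i.test(message) ? "pricing" : null;

// Assumption: anything that is not a known label falls back to "other".
function parseRoute(raw: string): Route {
  const label = raw.trim().toLowerCase();
  const match = ROUTES.find((r) => r === label);
  return match === undefined ? "other" : match;
}

async function resolveRoute(
  message: string,
  rules: GateRule[],
  classify: Classifier,
): Promise<{ route: Route; decidedBy: "gate" | "classifier" }> {
  // Deterministic rules run first and win on the first hit.
  for (const rule of rules) {
    const hit = rule(message);
    if (hit !== null) {
      return { route: hit, decidedBy: "gate" };
    }
  }
  // Only unclear cases reach the model.
  const raw = await classify(message);
  return { route: parseRoute(raw), decidedBy: "classifier" };
}
```

Validating the classifier's output against the fixed route list means a malformed model response degrades to a safe default instead of producing a route that downstream code does not know how to handle.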

Stack

Next.js (App Router) with TypeScript and Tailwind on Vercel. The reply logic is a server-only action, so there is no public endpoint and every secret stays on the server. The Anthropic Claude API handles the classify and draft steps, and Resend sends from a verified domain. The auto-reply is fire-and-forget behind the form, so a model or mailer failure costs at most one reply, never the lead.
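The fire-and-forget behavior can be illustrated with a wrapper that never rejects, so a failed auto-reply is logged and dropped rather than surfacing to the form. This is a sketch under assumptions: `safely` and `onSubmitted` are hypothetical names, and the failing task below stands in for a model or mailer error.

```typescript
// Runs a task and swallows any failure after reporting it.
// The returned promise always resolves, so callers cannot be broken by it.
async function safely(
  task: () => Promise<void>,
  onError: (message: string) => void,
): Promise<void> {
  try {
    await task();
  } catch (err) {
    onError(err instanceof Error ? err.message : String(err));
  }
}

// Hypothetical caller: the submission succeeds whether or not the reply does.
function onSubmitted(sendAutoReply: () => Promise<void>): { ok: true } {
  // Deliberately not awaited: the reply runs in the background.
  void safely(sendAutoReply, (message) => {
    console.error(`auto-reply failed: ${message}`);
  });
  return { ok: true };
}

// Stand-in for a mailer or model outage.
const failingReply = async (): Promise<void> => {
  throw new Error("mailer unavailable");
};

const result = onSubmitted(failingReply);
```

Depending on the runtime, work that is not awaited may be cut short once the response has been returned, so it is worth checking how the deployment target treats background tasks before relying on this pattern.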

Status, and what is next

The review-first pilot is built and tested end to end. The next feature is one-click approve and send, where an approved draft goes out on its own. Further out, the same before-and-after pattern extends to diagnostics, mistake analysis, and progress views for families. The retired v1 pipeline is kept as a reference, the "before" half of this story.