Build Story
13 min read
March 8, 2026
DeepCraft Engineering

We Built GiantAI: Lessons From Shipping an Open-Source AI Agent Platform

The architecture decisions, the mistakes we made, and what we'd do differently: an honest behind-the-scenes look at building the first AI Agent platform live on Roblox, Web2 and Web3, powered by DeepSeek R1 and built on Base.

In late 2023, we made a decision that would consume most of our team's bandwidth for the next ten months: we were going to build an AI agent platform unlike anything in the market, one that lived simultaneously on Roblox, across the open web, and inside Web3 ecosystems. An agent that wasn't just a chatbot, but an AI KOL, market analyst, and fully functional NPC inside decentralised games. Powered by DeepSeek R1. Built on Base and Solana.

The result is GiantAI, the first AI Agent platform simultaneously live on Roblox, Web2 and Web3. Both the platform app (friendly-giant-ai-app) and the smart contracts (friendly-giant-contracts) are open source on GitHub. It works. We're proud of it. And we made every mistake you'd expect a team to make when building something genuinely new in a space evolving faster than we could read the documentation.

This is that story, told as honestly as we can tell it, because the lessons are more useful to the people building AI products today than another polished case study.

What GiantAI Actually Is

GiantAI is not a chatbot. It's a multi-role AI agent platform with a scope that, when we first sketched it on a whiteboard, made at least one team member visibly uncomfortable.

The platform operates across three distinct environments simultaneously:

  • Roblox, as an AI NPC agent that lives inside decentralised game worlds, interacting with players, analysing in-game economies, and acting as a living character with persistent memory and personality
  • Web2, as an AI KOL (Key Opinion Leader) and market analyst, surfacing insights, commentary, and on-chain data to audiences across social and web platforms
  • Web3 / Base & Solana, deeply integrated with both Base and Solana blockchains, with open-source smart contracts and on-chain capabilities that make GiantAI a native participant across multiple decentralised ecosystems

The reasoning backbone is DeepSeek R1, chosen for its exceptional performance on analytical and reasoning tasks, its cost efficiency at scale, and its suitability for the kind of structured, multi-step thinking that an agent acting across three different environments requires.

The core ambition: an AI agent that isn't just deployed in Web3 but understands Base and Solana natively, participates in on-chain activity, and can act autonomously across both ecosystems.

That sounds compelling in a pitch deck. Building it took ten months and a fundamental rethink of how AI agents should be architected when the environment itself is non-deterministic.

The Architecture We Started With (And Why We Threw It Away)

Our first architecture was what you'd call naive-monolith. A single backend service handled everything: model calls, tool execution, memory management, session state, and the embedding API. It worked in development. It fell apart at the first real load test.

The core problem: LLM calls, tool execution, and memory operations have completely different latency and reliability profiles. Bundling them into a single request-response cycle meant that a slow web search (which could take 8–12 seconds) blocked everything downstream — including the streaming response the user was waiting for.

[Architecture diagram: Client Embed / Web → Orchestrator → Planner Agent → LLM Layer (DeepSeek R1) → Tool Runner (Web/API/Code) → Memory Store (Vector + KV) → Response Stream → UI]

GiantAI's decoupled architecture: Orchestrator coordinates LLM calls, tool execution, and memory as independent async services.

The rewrite, which we should have done from the start, separated these into independent async services. The orchestrator dispatches work and assembles results. The LLM service handles model calls with streaming. The tool runner executes web searches, API calls, and code in isolated sandboxes. The memory service manages vector storage and key-value state. Each can scale, fail, and retry independently.
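The dispatch-and-assemble pattern can be sketched roughly as below. This is a minimal illustration, not GiantAI's actual code: the service names (`llmCall`, `toolCall`, `memoryFetch`) are placeholders, and real deployments would add per-service timeouts and retries.

```typescript
// Sketch: an orchestrator that dispatches independent async services
// and assembles whatever succeeded, instead of one blocking
// request-response cycle. All names are illustrative placeholders.

type ServiceResult = { source: string; ok: boolean; data?: string };

async function orchestrate(
  llmCall: () => Promise<string>,
  toolCall: () => Promise<string>,
  memoryFetch: () => Promise<string>,
): Promise<ServiceResult[]> {
  // Dispatch all three services at once; none blocks the others.
  const tasks: [string, Promise<string>][] = [
    ["llm", llmCall()],
    ["tool", toolCall()],
    ["memory", memoryFetch()],
  ];
  const settled = await Promise.allSettled(tasks.map(([, p]) => p));
  // Assemble whatever succeeded: a failed web search degrades the
  // answer instead of killing the whole request.
  return settled.map((s, i) =>
    s.status === "fulfilled"
      ? { source: tasks[i][0], ok: true, data: s.value }
      : { source: tasks[i][0], ok: false },
  );
}
```

The key property is that each service fails independently: a slow or broken tool no longer sits in the critical path of the streaming LLM response.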

The Build Timeline

Months 1–2

Architecture & Core Agent Loop

Defined the agent loop (observe → plan → act → reflect) and built the first working orchestrator. After evaluating multiple models, we selected DeepSeek R1 as the reasoning backbone — its structured reasoning output and cost profile made it the right choice for an agent operating across game environments and on-chain data. First internal demo: an agent answering questions and browsing the web inside a Roblox test environment.
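The observe → plan → act → reflect loop can be reduced to a few lines of pseudocode-like TypeScript. This is a sketch under simplifying assumptions (synchronous environment, stubbed-out planner); the `AgentEnv` interface and `agentLoop` name are illustrative, not GiantAI's API.

```typescript
// Sketch of the observe -> plan -> act -> reflect loop. The planner
// is passed in as a function so any LLM client can be plugged in.

type Observation = string;
type Plan = { done: boolean; action?: string };

interface AgentEnv {
  observe(): Observation;
  act(action: string): string; // returns the action's result
}

async function agentLoop(
  env: AgentEnv,
  plan: (obs: Observation, history: string[]) => Promise<Plan>,
  maxSteps = 8, // hard cap so a confused agent can't loop forever
): Promise<string[]> {
  const history: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const obs = env.observe();                   // observe
    const next = await plan(obs, history);       // plan
    if (next.done || !next.action) break;
    const result = env.act(next.action);         // act
    history.push(`${next.action} -> ${result}`); // reflect: record the outcome
  }
  return history;
}
```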

Months 3–4

Tool Framework & Memory

Built the extensible tool framework — web search, code execution, API calls, file I/O. Added vector-based long-term memory using pgvector. This is where we made our first major mistake (see below).
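An extensible tool framework in this style usually comes down to a registry of uniform interfaces. The sketch below is our best guess at the shape of such a framework, not GiantAI's actual tool API; the names are illustrative.

```typescript
// Sketch of an extensible tool registry: each tool declares a name,
// a short description the planner can read, and an async run().

interface Tool {
  name: string;
  description: string;
  run(input: string): Promise<string>;
}

class ToolRegistry {
  private tools = new Map<string, Tool>();

  register(tool: Tool): void {
    this.tools.set(tool.name, tool);
  }

  // The planner prompt is built from these descriptions, so adding a
  // new tool never requires touching the orchestrator.
  describeAll(): string {
    return [...this.tools.values()]
      .map((t) => `${t.name}: ${t.description}`)
      .join("\n");
  }

  async dispatch(name: string, input: string): Promise<string> {
    const tool = this.tools.get(name);
    if (!tool) return `error: unknown tool "${name}"`;
    try {
      return await tool.run(input);
    } catch (e) {
      // Surface failures as text the LLM can read and recover from.
      return `error: ${(e as Error).message}`;
    }
  }
}
```

Returning errors as readable strings rather than throwing is deliberate: the model can often route around a failed tool if it is told what went wrong.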

Months 5–6

Game & Web Embedding SDK

Built the JavaScript SDK that allows GiantAI to be embedded in any web environment or game. The hardest part: context injection — giving the agent awareness of the environment it's running in without polluting the context window.
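One way to frame the context-injection problem is as a hard character budget: the host environment reports its state, and only the highest-priority fields are serialized into a short system note. The sketch below is a simplified stand-in for whatever GiantAI's SDK actually does; field names and the budget are illustrative.

```typescript
// Sketch of budgeted context injection: serialize environment state
// into a short note instead of dumping everything into the context
// window. Fields are taken in priority order until the budget runs out.

type EnvState = Record<string, string>;

function injectContext(
  state: EnvState,
  priority: string[],
  maxChars = 400,
): string {
  const lines: string[] = [];
  let used = 0;
  for (const key of priority) {
    if (!(key in state)) continue;
    const line = `${key}: ${state[key]}`;
    // Once the budget is exhausted, stop: lower-priority fields are dropped.
    if (used + line.length > maxChars) break;
    lines.push(line);
    used += line.length;
  }
  return `Environment:\n${lines.join("\n")}`;
}
```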

Months 7–8

The Rewrite

Threw away the monolith and rebuilt the backend as independent async microservices. Painful. Necessary. The system that emerged was 10x more reliable in production conditions.

Months 9–10

Hardening, UI, & Open Source Release

Built the management UI, wrote documentation, added rate limiting and cost controls. Open-sourced both the platform app at github.com/ispolink/friendly-giant-ai-app and the smart contracts at github.com/ispolink/friendly-giant-contracts — giving developers full transparency into the AI architecture, on-chain mechanics, and Base integration. Officially launched on Roblox, Web2, and Web3 simultaneously.

Why We Chose DeepSeek R1, Not GPT-4

This is the question we get most often at demos. The AI development world defaults to GPT-4 or Claude for agent reasoning. We went with DeepSeek R1. Here's why.

GiantAI operates in contexts that demand structured, step-by-step reasoning over complex data: on-chain analytics, game economy modelling, multi-step web research. DeepSeek R1 was built specifically for this kind of chain-of-thought reasoning, and in our internal benchmarks it consistently outperformed GPT-4 on tasks that required logical sequencing and analytical precision.

The other factor was economics. An AI agent that lives inside Roblox games and interacts with potentially thousands of players simultaneously has a very different cost profile than an enterprise chatbot handling 50 queries a day. DeepSeek R1's token costs made truly scalable deployment viable in a way that GPT-4 pricing simply wouldn't have allowed at our target scale.

Model selection is a product decision, not just an engineering one. The right model depends on your latency requirements, reasoning complexity, cost per query at scale, and the specific failure modes that matter most for your use case. We evaluated five models before committing to DeepSeek R1. Don't skip this step.

We also valued the alignment with our multi-chain positioning. DeepSeek R1's open architecture made it possible to reason transparently about on-chain data, across both Base and Solana, in ways that matter for a platform where trust and verifiability are core to the user relationship.

The Mistakes (The Useful Part)

Mistake 01

We over-engineered memory from day one. We spent 6 weeks building a sophisticated semantic memory system with automatic consolidation, importance scoring, and temporal decay — before we had a single user. When we finally shipped and watched real sessions, most interactions needed exactly zero long-term memory. Start with session-scoped context. Add persistent memory only when you have evidence that users need it.
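The "start with session-scoped context" version is almost embarrassingly small, which is the point. A minimal sketch, assuming nothing beyond a bounded in-process message window:

```typescript
// Sketch of session-scoped context: a bounded message window per
// session, with no persistence layer at all. No consolidation,
// no importance scoring, no temporal decay.

class SessionContext {
  private messages: string[] = [];

  constructor(private maxMessages = 20) {}

  add(message: string): void {
    this.messages.push(message);
    if (this.messages.length > this.maxMessages) {
      this.messages.shift(); // drop the oldest turn
    }
  }

  window(): string[] {
    return [...this.messages];
  }
}
```

Everything beyond this (vector stores, consolidation, decay) is something you add after real sessions show you need it.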

Mistake 02

We underestimated prompt brittleness. A system prompt that worked perfectly across 200 test queries would fail in specific, hard-to-predict ways under real user input. We didn't build adversarial evaluation into our process until Month 4. By then, we'd shipped two prompt rewrites that broke things we'd already fixed. Evaluation infrastructure should be built before the system it evaluates — not after.
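The evaluation infrastructure the paragraph argues for doesn't have to be elaborate. A minimal regression harness, sketched here with an illustrative shape (a fixed suite of queries, each with a predicate over the answer), already catches the "fixed one case, broke another" failure mode:

```typescript
// Sketch of a prompt-regression harness: a fixed suite of real
// queries, each scored automatically, so a prompt rewrite can't
// silently break a previously-passing case.

type EvalCase = { query: string; check: (answer: string) => boolean };

async function runEvals(
  agent: (query: string) => Promise<string>,
  suite: EvalCase[],
): Promise<{ passed: number; failed: string[] }> {
  const failed: string[] = [];
  for (const c of suite) {
    // Treat agent crashes as failures, not exceptions.
    const answer = await agent(c.query).catch(() => "");
    if (!c.check(answer)) failed.push(c.query);
  }
  return { passed: suite.length - failed.length, failed };
}
```

Run it in CI on every prompt change; the failed-query list is the diff you review.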

Mistake 03

We assumed tool execution would be the easy part. Calling a web search API and getting a result back looks trivial. Making that result reliably useful to the LLM — parsing, cleaning, chunking, scoring relevance, fitting into context — is not. We allocated two weeks to the tool framework. It took eight. Every tool is a small RAG problem in disguise.
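The "small RAG problem" hiding inside each tool looks roughly like this. The sketch uses naive keyword overlap as a stand-in for embedding similarity, and the function names are ours, not GiantAI's:

```typescript
// Sketch: making a raw tool result useful to an LLM means cleaning,
// chunking, scoring relevance, and fitting a context budget.

function cleanAndChunk(raw: string, chunkSize = 200): string[] {
  const text = raw.replace(/\s+/g, " ").trim(); // collapse whitespace noise
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}

// Naive keyword-overlap score; a real system would use embeddings.
function scoreRelevance(chunk: string, query: string): number {
  const terms = query.toLowerCase().split(/\s+/);
  const lower = chunk.toLowerCase();
  return terms.filter((t) => lower.includes(t)).length / terms.length;
}

// Keep only the top-k chunks that fit the tool's share of the context.
function fitToContext(raw: string, query: string, k = 3): string[] {
  return cleanAndChunk(raw)
    .map((c) => ({ c, score: scoreRelevance(c, query) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((x) => x.c);
}
```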

The lesson that mattered most: AI agent development requires a fundamentally different engineering discipline than traditional software. The system's behaviour emerges from the interaction between your code, your prompts, your tools, and your data — and none of these can be tested in isolation. You need end-to-end integration tests, not just unit tests, from day one.

What We'd Do Differently

  • Start with the simplest viable agent loop and add complexity only when you hit a hard limit — not when you anticipate one
  • Build evaluation infrastructure in Week 1 — a test suite of real-world queries with expected outputs, scored automatically
  • Design for observable failure — every agent action should produce a structured log entry that lets you reconstruct exactly what happened and why
  • Decouple from day one — LLM calls, tool execution, and memory operations should never share a request lifecycle
  • Ship to real users in Month 2, not Month 8 — the feedback from 10 real users is worth more than 6 months of internal testing
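The "design for observable failure" point above deserves a concrete shape. One way to do it, sketched with illustrative field names: every agent action emits one structured entry, and a trace can be replayed per session to reconstruct exactly what happened.

```typescript
// Sketch of structured action logging: one entry per agent action,
// replayable per session. Field names are illustrative.

type ActionLog = {
  sessionId: string;
  step: number;
  action: string;
  input: string;
  outcome: "ok" | "error" | "timeout";
  latencyMs: number;
};

class ActionTrace {
  private entries: ActionLog[] = [];

  record(entry: ActionLog): void {
    this.entries.push(entry);
  }

  // Reconstruct what happened in one session, in step order.
  replay(sessionId: string): string[] {
    return this.entries
      .filter((e) => e.sessionId === sessionId)
      .sort((a, b) => a.step - b.step)
      .map((e) => `#${e.step} ${e.action}(${e.input}) -> ${e.outcome} (${e.latencyMs}ms)`);
  }
}
```

In practice each entry would be emitted as a JSON line so traces are greppable without any tooling.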

What GiantAI Taught Us About Building AI Products

The most important thing we learned is that building an AI product requires a team that has actually shipped AI in production before. The failure modes are different. The debugging process is different. The relationship between specification and behaviour is different.

The things that matter in traditional software (clean architecture, test coverage, type safety) still matter. But they're table stakes. What separates good AI products from broken ones is prompt engineering craft, evaluation discipline, and a deep understanding of how LLMs fail.

That's why DeepCraft builds AI products differently from a generic development agency. GiantAI is our proof of work, and everything we learned building it informs how we approach every AI integration and AI MVP we ship for clients.

Both repositories are open source. The platform app (the full AI agent codebase) is at github.com/ispolink/friendly-giant-ai-app. The smart contracts (the Base integration and on-chain mechanics) are at github.com/ispolink/friendly-giant-contracts. The live platform is at giantai.ispolink.com.

FAQ

Frequently asked questions

What is Giant AI?

GiantAI is the first AI Agent platform simultaneously live on Roblox, Web2 and Web3, built on Base and Solana, powered by DeepSeek R1. It functions as an AI KOL, market analyst, and AI NPC in decentralised games. Visit giantai.ispolink.com to see it live.

What is an AI agent platform?

An AI agent platform is infrastructure that lets AI agents (autonomous systems that can reason, plan, use tools, and take actions) be deployed and integrated into products at scale. GiantAI provides this as an embeddable SDK plus backend orchestration layer.

How long does it take to build an AI agent from scratch?

A simple tool-using AI agent can be built in 3–6 weeks. A production-ready agent platform with memory, multi-tool support, and reliable orchestration takes 4–8 months with a dedicated team. Using GiantAI as a base cuts this significantly for most embedding use cases.

Can I use GiantAI in my own product?

Yes, both the platform app and smart contracts are fully open source. Platform app: github.com/ispolink/friendly-giant-ai-app. Smart contracts: github.com/ispolink/friendly-giant-contracts. For teams wanting to build on GiantAI's architecture, get in touch with DeepCraft.

Contact Us

Let's build something great.

Tell us about your project and we'll get back to you within 24 hours.
