Experience

Professional Work & Projects

Designed and built a production-grade AI orchestration framework that automates game feature development using coordinated LLM agents. The system transforms natural language feature requests into fully implemented Godot 4 game assets (GDScript files, scene hierarchies, and resources) through a structured multi-stage pipeline with validation, checkpointing, and human-in-the-loop approval gates.

What I Built
  • Multi-agent orchestration system coordinating OpenAI (GPT-4) and Anthropic (Claude) APIs, routing tasks to the best-suited model based on task type (planning vs. code generation)
  • Multi-stage execution pipeline with input validation, structured output schemas, checkpoint persistence, and evaluation gates between stages (sketched below)
  • Fault-tolerant workflow engine with automatic state persistence, enabling resume-from-checkpoint on failures without re-running completed stages
  • Domain-specific code generation targeting Godot 4's architecture (player controllers, input systems, scene composition) rather than generic code output
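
For illustration, a minimal sketch of what a stage's structured output schema and acceptance check could look like, assuming a pydantic-style model; the names (GeneratedScript, StageOutput, passes_acceptance) are hypothetical, not the framework's actual API:

# Hypothetical schema sketch; field names and checks are illustrative, not the framework's real API.
from pydantic import BaseModel, Field

class GeneratedScript(BaseModel):
    """Structured output describing one GDScript file produced by a stage."""
    path: str = Field(description="Target path inside the Godot project, e.g. scripts/player.gd")
    source: str = Field(description="Complete GDScript source")
    attached_scene: str | None = None  # scene the script is expected to attach to, if any

class StageOutput(BaseModel):
    """Everything a stage emits; validated at the gate before the next stage may run."""
    scripts: list[GeneratedScript]
    notes: str = ""

def passes_acceptance(output: StageOutput) -> bool:
    """Cheap acceptance checks run at the evaluation gate between stages."""
    return all(s.path.endswith(".gd") and s.source.strip() for s in output.scripts)
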
Problems Solved
  • Context limits: Decomposed complex features into smaller, focused LLM calls that fit within token windows
  • Output coherence: Structured coordination across multi-file outputs prevents drift and inconsistency
  • Validation: Built acceptance criteria checks that verify outputs meet specifications before progression
  • Cost efficiency: Checkpoint system eliminates redundant API calls on partial failures
Technical Approach

Treated AI code generation as a pipeline architecture problem, not a prompt engineering problem. Each stage has defined inputs, outputs, and validation criteria. The system maintains execution state across sessions, supports deterministic replay for debugging, and produces human-readable artifacts at every step for full transparency.
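
A condensed sketch of the resume-from-checkpoint behaviour described above; the stage names, file layout, and run_stage stub are assumptions rather than the actual implementation:

# Condensed sketch; stage names, file layout, and run_stage are placeholders.
import json
from pathlib import Path

CHECKPOINT = Path("run_state.json")
STAGES = ["plan", "generate_scripts", "compose_scenes", "review"]

def run_stage(stage: str, request: str, state: dict) -> dict:
    # Placeholder for the real stage logic (LLM calls, validation, artifact writing).
    return {"stage": stage, "ok": True}

def run_pipeline(request: str) -> dict:
    state = json.loads(CHECKPOINT.read_text()) if CHECKPOINT.exists() else {"completed": {}}
    for stage in STAGES:
        if stage in state["completed"]:
            continue  # already finished in a previous run; no repeat LLM calls
        state["completed"][stage] = run_stage(stage, request, state)
        CHECKPOINT.write_text(json.dumps(state, indent=2))  # persist state after every stage
    return state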

Model selection is task-aware: GPT-4 handles planning and structured reasoning; Claude handles longer code generation where context and coherence matter. The abstraction layer enables hot-swapping models per pipeline stage.
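
In code, the routing idea reduces to something like the sketch below; the client setup, model names, and function shape are illustrative, not the framework's actual abstraction layer:

# Illustrative routing sketch; model names are placeholders and the real abstraction layer is richer.
from openai import OpenAI
import anthropic

openai_client = OpenAI()                # reads OPENAI_API_KEY from the environment
claude_client = anthropic.Anthropic()   # reads ANTHROPIC_API_KEY from the environment

def complete(task_type: str, prompt: str) -> str:
    if task_type == "planning":
        # Planning and structured reasoning routed to GPT-4.
        resp = openai_client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content
    # Longer code generation routed to Claude.
    resp = claude_client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model name
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text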

Key Accomplishments
  • Architected multi-agent LLM orchestration framework coordinating multiple AI providers across a staged execution pipeline with checkpoint recovery
  • Built automated task decomposition system transforming feature requests into domain-specific implementation plans with structured validation
  • Designed fault-tolerant pipeline infrastructure with session management and incremental checkpointing for resumable AI workflows
  • Developed end-to-end code generation producing production-ready Godot 4 assets from natural language specifications
  • Reduced iteration time by eliminating redundant LLM calls through intelligent state persistence and partial re-execution
Technologies Used
Python 3.11+, OpenAI API, Anthropic API, GPT-4, Claude, Godot 4, GDScript, LLM Orchestration, Pipeline Architecture

Built and fine-tuned text-to-image AI models powering character asset generation for a blockchain-integrated mobile game. Owned the full ML pipeline, from dataset creation and LoRA fine-tuning to model evaluation and production integration, across 8+ distinct character communities.

What I Built
  • LoRA fine-tuned models for text-to-image generation with character-locked style consistency, identity retention, and prompt-controlled output
  • Dataset pipelines including image preprocessing, mask generation, bounding box normalization, class balancing, and metadata structuring (a preprocessing sketch follows this list)
  • Model evaluation frameworks using controlled seeds, structured prompt templates, and negative prompt filtering to validate consistency and style coherence
  • Internal tooling for dataset processing, batch cropping, image masking workflows, and training reproducibility
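
As referenced above, a simplified example of one preprocessing step (crop to the character bounding box, resize, normalize the box, write metadata); sizes, paths, and the metadata schema are assumptions, not the production pipeline:

# Simplified preprocessing step; resolution, paths, and metadata schema are assumptions.
import json
from pathlib import Path
from PIL import Image

def preprocess(image_path: Path, bbox: tuple[int, int, int, int], out_dir: Path, size: int = 1024) -> dict:
    """Crop to the character bounding box, resize to the training resolution, and record metadata."""
    img = Image.open(image_path).convert("RGB")
    crop = img.crop(bbox).resize((size, size), Image.Resampling.LANCZOS)
    out_path = out_dir / image_path.name
    crop.save(out_path)
    w, h = img.size
    # Normalize the bounding box to [0, 1] so it is resolution-independent.
    norm_bbox = [bbox[0] / w, bbox[1] / h, bbox[2] / w, bbox[3] / h]
    meta = {"file": out_path.name, "bbox": norm_bbox, "source_size": [w, h]}
    (out_dir / (image_path.stem + ".json")).write_text(json.dumps(meta))
    return meta
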
Problems Solved
  • Style consistency: Trained models to maintain brand aesthetics across diverse character poses and scenarios
  • Identity retention: Prevented model drift and style contamination across training iterations
  • Production reliability: Validated model outputs for mobile rendering, catching artifacts before Unity integration
  • Scale: Managed parallel model variants for 8+ character communities with distinct visual identities
Technical Approach

Fine-tuned Flux-based models with LoRA via Hugging Face workflows, including hyperparameter tuning, debug batch generation, and failure mode isolation. Built evaluation protocols that tested identity accuracy, pose consistency, and style coherence across hundreds of generated samples. Integrated AI outputs into the mobile app pipeline, validating avatar rendering, wearables, and NFT-linked assets.
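
A rough sketch of the fixed-seed evaluation idea using Hugging Face diffusers, shown here with an SDXL pipeline purely for illustration (the production models were Flux-based); the base model id, LoRA path, prompts, and seeds are placeholders:

# Fixed-seed evaluation sketch; model id, LoRA path, prompts, and seeds are placeholders.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("path/to/character_lora")  # fine-tuned LoRA weights under test

PROMPTS = [
    "character_token standing, front view, game splash art",
    "character_token running, side view, game splash art",
]
NEGATIVE = "blurry, extra limbs, off-model colors"

for seed in (0, 1, 2):
    for i, prompt in enumerate(PROMPTS):
        # Re-seed per image so grids are directly comparable across model checkpoints.
        generator = torch.Generator("cuda").manual_seed(seed)
        image = pipe(prompt, negative_prompt=NEGATIVE, generator=generator).images[0]
        image.save(f"eval_seed{seed}_prompt{i}.png")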

Key Accomplishments
  • Fine-tuned multiple LoRA text-to-image models for 8+ character communities with consistent style and identity retention
  • Built end-to-end dataset pipelines (preprocessing, masking, normalization, balancing) supporting reliable model training
  • Developed model evaluation frameworks measuring consistency, identity accuracy, and style coherence across generated outputs
  • Created internal tools for dataset processing and batch workflows, improving training reproducibility
  • Validated AI-generated assets for mobile integration, catching rendering issues before production deployment
Technologies Used
LoRA, Flux, Hugging Face, Python, PyTorch, PIL, TensorFlow, NumPy, Go, DynamoDB, Thirdweb, BrowserStack, Protobuf

I worked on fine-tuning a ChatGPT model for Tactician TM, a turn-based game engine. The goal was to make it easier for anyone to describe game rules in plain language, which our NLP model would then tidy up into a clear, standardized format. This was my first time facing a challenge like this; let's just say it was a rough start. To make it more approachable, I focused on the game of Tic Tac Toe (TTT).

The Challenge

Initially, I was overwhelmed. Nonetheless, I embraced the challenge, realizing I had nothing to lose. My first choice was the Python NLP library spaCy. My original plan was to deconstruct each input sentence word by word. However, I soon realized this approach was too time-consuming, given the multitude of variables involved. So, I returned to the drawing board and conducted further research.

Discovery & Approach

During this phase, I discovered fine-tuning and various AI tools that facilitate this process. Notably, I found that OpenAI offered fine-tuning capabilities. Initially, I used the "davinci-002" model for its accessibility. This model required data in a JSONL file, formatted in a prompt-completion structure.

I aimed for a 'waterfall effect' in my model: just as water in a river inevitably flows to a common destination, the model was designed to generalize any input into one of the standard TTT rules. This approach ensures consistency in interpreting diverse inputs and aligning them with established rule sets.
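
A few invented JSONL rows illustrate the idea: differently phrased inputs all "flow" to the same canonical, standardized rule in the completion (these examples are made up for illustration, not my actual training data):

{"prompt": "players swap turns putting their mark in an empty box", "completion": " Rule: Players alternate turns, placing their symbol in an empty cell."}
{"prompt": "whoever lines up three of their symbols in a row is the winner", "completion": " Rule: A player wins by forming a line of three of their symbols (row, column, or diagonal)."}
{"prompt": "if the board fills up and nobody has three in a row it's a tie", "completion": " Rule: The game is a draw when all cells are filled and no player has won."}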

Iteration & Learning

Despite initial efforts, a meeting with my mentor revealed a significant oversight: my model was overfitted to TTT. A guest at that meeting advised me to use newer GPT models and leverage the knowledge they already contain, reducing the need for extensive datasets. He introduced me to "few-shot learning".

This invaluable advice led me to rethink my approach. Instead of creating an extensive dataset from scratch, I explored ways to utilize existing data. Through carefully constructed training examples, I brought the model's training loss down from 1.81 to 0.26. Following this advice, I selected OpenAI's "gpt-3.5-turbo-1106" model for my final fine-tuning.
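
The final step boiled down to uploading a chat-format JSONL file and launching a fine-tuning job; the sketch below uses the current openai Python client, with the file name and contents as placeholders:

# Fine-tuning job sketch; the file name and its example contents are placeholders.
from openai import OpenAI

client = OpenAI()

# Chat-format JSONL: each line holds {"messages": [system, user, assistant]} examples
# mapping free-form rule descriptions to the canonical TTT rules.
training_file = client.files.create(file=open("ttt_rules.jsonl", "rb"), purpose="fine-tune")

job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo-1106",
)
print(job.id, job.status)  # poll later with client.fine_tuning.jobs.retrieve(job.id)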

Key Responsibilities
  • Fine-tuned ChatGPT models for natural language rule interpretation
  • Developed data generation pipelines for training and testing datasets
  • Reduced model training loss from 1.81 to 0.26 through few-shot learning
  • Created documentation for reproducible model training
Technologies Used
Python, OpenAI API, GPT-3.5 Turbo, spaCy, JSONL, Few-Shot Learning