Cássio (Cass) Couto

Forward-Deployed Engineer · Applied AI · Enterprise AI Systems

📍 Brasília, Brazil

11+ years shipping production AI systems. 5+ years embedded with enterprise customers as a Forward-Deployed Engineer. I specialize in turning ambiguous customer problems into reliable agentic LLM systems — from discovery through compliance-aware deployment.

Impact at a Glance

Results shipped across fintech, edtech, healthcare, and government

20+ · Healthcare payers automated via Prior Auth pipeline · Thoughtful AI
~90s · Average exception resolution time · Down from ~15 minutes · Canoe Intelligence
>70% · ETL pipeline runtime reduction · Databricks optimization · BairesDev
~80% · User engagement increase · ActivityLab, GenAI-powered · Voxy
~15x · Search latency improvement · Semantic search platform · RNP
82% → 98% · SLA improvement · IpeaData pipeline · IPEA

What I Do Best

Deep technical capability across the full AI delivery stack

🤖

Agentic LLM Systems

MCP servers, sub-agents, skill libraries, planning / execution / verification / recovery layers, tool orchestration, memory systems, reasoning loops

🚀

Forward-Deployed Engineering

Customer discovery, system boundary design, secure integration contracts, white-glove production deployment in compliance-heavy environments, stakeholder partnership

🔬

AI Evaluation & Observability

Model benchmarking (Claude, GPT, Gemini), scenario test sets, regression checks, structured telemetry, step-level execution tracing, LLM evaluation frameworks

⚙️

Data & ML Infrastructure

FastAPI, Kafka, Spark, Databricks, Airflow, vector databases, RAG pipelines, embedding systems, semantic search, ETL optimization

☁️

Cloud & DevOps

Production deployments on AWS (Lambda, ECS, SageMaker) and GCP. Infrastructure-as-code with Terraform and Docker. Observability with Datadog. Continuous delivery via GitLab CI/CD with zero-downtime release discipline.

🔁

Field-to-Product Feedback Loop

Codifying repeatable deployment playbooks, feeding customer learnings back into product and engineering, mentoring engineers, cross-functional delivery leadership

Professional Experience

11+ years building and deploying production AI systems

Thoughtful AI · Forward-Deployed Engineer
Jun 2025 – Present · Healthcare Automation
  • Embedded with enterprise customers to translate ambiguous business workflows into production-grade agentic LLM architectures (planning, execution, verification, and recovery layers)
  • Delivered production artifacts, including MCP servers, sub-agents, reusable agent skill libraries, and API endpoints, deployed across multiple enterprise clients
  • Partnered with customer engineering and operations teams to run discovery on the payment layer, define system boundaries, and design secure integration contracts for autonomous enterprise automation
  • Built model-selection and evaluation patterns (scenario test sets, regression checks) by benchmarking Claude Opus 4, GPT-4o, and Gemini 1.5 Pro on agent reasoning tasks
  • Implemented structured telemetry and step-level execution tracing, reducing rollback incidents and improving auditability in compliance-heavy environments
  • Codified repeatable deployment playbooks for agent workflows and fed field learnings back into product and engineering teams
Canoe Intelligence · Senior Software Engineer
Jun 2024 – Oct 2025 · Fintech · Alternative Investments
  • Collaborated with PM, operations, and data teams from discovery through delivery to redesign exception management as an event-driven, AI-powered pipeline using FastAPI + Kafka
  • Built an autonomous exception resolution system with an ML classification model predicting likely resolutions; agents execute known workflows when confidence thresholds are met — reduced average resolution time from ~15 minutes to ~90 seconds
  • Partnered with internal stakeholders on exception labeling, pattern analysis, and model validation to ensure AI outputs met operational requirements
  • Evaluated Claude Sonnet 4, GPT-4o, and Gemini 1.5 Flash for document understanding tasks; implemented a structured LLM evaluation framework
  • Implemented unified observability stack (Datadog + Terraform) with distributed tracing for AI workflows, enabling real-time debuggability
  • Shipped weekly zero-downtime releases through standardized CI/CD, mentored junior engineers, and used Claude Code and GitHub Copilot for test generation and code review automation
BairesDev · Senior Software Engineer
Jan 2024 – May 2024 · Consulting · Data Infrastructure
  • Optimized Databricks ETL pipelines for ML training data, reducing runtime by >70% and significantly lowering compute costs through strategic caching, partitioning, and transformation pruning
  • Designed custom Spark and scikit-learn validation layers for data quality, stabilizing training data and improving downstream model robustness and stakeholder trust
Voxy · Tech Lead
Aug 2022 – Jan 2024 · Edtech · Enterprise Language Learning
  • Led roadmap alignment sessions directly with enterprise customers to translate learning goals into GenAI product requirements, balancing competing priorities across engineering, product, and client stakeholders
  • Led a team of 6 engineers to build ActivityLab, a GenAI-powered platform generating personalized multimodal learning activities (listening, speaking, reading, writing); increased user engagement by ~80%
  • Designed an automated reading comprehension generator with article-aware prompts producing difficulty-calibrated questions, validated by heuristics and human-in-the-loop review
  • Built an automated writing feedback system with prompt-driven revision, rubric-aligned grading, and bias mitigation checks
  • Implemented AI listening and speaking assessment integrating ASR and LLM-based scoring for near-real-time pronunciation and fluency feedback
  • Experimented with GPT-4o, Claude Sonnet 4, and early prompt engineering techniques; integrated GitHub Copilot across engineering workflows for team productivity
RNP (National Education and Research Network) · Senior Software Engineer
Mar 2017 – Aug 2022 · National Research Infrastructure
  • Worked directly with external academic institutions and universities, visiting stakeholders on site, to understand research discovery workflows and translate requirements into a hybrid semantic search architecture (MariaDB + Elasticsearch)
  • Built a scientific document embedding and semantic search platform using SciBERT and multilingual embeddings; reduced average search latency by ~15x and increased user engagement by ~30%
  • Implemented hybrid ranking combining BM25, embeddings, citation metrics, recency decay, and author authority using learning-to-rank models, supporting millions of indexed articles
  • Migrated legacy ETL workflows to Apache Airflow as modular, monitored DAGs; increased pipeline efficiency by ~350%
  • Built a centralized security monitoring platform (Grafana + MongoDB) that aggregates access logs and anomaly detection signals, preventing thousands of unauthorized access attempts monthly
Institute for Applied Economic Research (IPEA) · Research Engineer
May 2015 – May 2017 · Public Sector · National Data Infrastructure
  • Built a national Civil Society Organizations platform tracking 820,000+ entities with automated ETL pipelines, validation, and failure recovery
  • Improved IpeaData SLA from 82% to 98% through pipeline optimization, data normalization, and upstream validation

Signature Wins

High-impact systems shipped end-to-end

Healthcare · Thoughtful AI

Enterprise Healthcare AI Automation

Production agentic workflows deployed in compliance-heavy customer environments

Embedded as FDE to design and ship MCP servers, sub-agents, and reusable skill libraries. Established system boundaries, integration contracts, and evaluation gates for autonomous AI workflows operating on sensitive healthcare data.

MCP Servers · LLM Agents · Python · FastAPI · Telemetry
Fintech · Canoe Intelligence

Exception Resolution at Scale

~15 min → ~90 sec average resolution time via autonomous AI pipeline

Redesigned exception management from scratch as an event-driven AI pipeline. An ML classifier predicts likely resolutions, and autonomous agents execute known workflows once a confidence threshold is met, eliminating human intervention for recurring patterns.

FastAPI · Kafka · ML Classification · LLM Agents · Datadog
Edtech · Voxy

ActivityLab — GenAI Learning Platform

~80% user engagement increase across a B2B enterprise client base

Led a 6-engineer team to build a GenAI platform generating personalized multimodal learning content (listening, speaking, reading, writing). Included ASR-based pronunciation scoring, rubric-aligned writing feedback with bias mitigation, and article-aware comprehension generators.

GPT-4o · Claude · ASR · Prompt Engineering · Python
Research Infrastructure · RNP

Scientific Semantic Search Platform

~15x search latency reduction · Millions of indexed articles

Built a hybrid semantic search system using SciBERT and multilingual embeddings. Combined BM25, citation metrics, recency decay, and author authority with learning-to-rank models to serve millions of researchers across Brazil's academic network.

SciBERT · Elasticsearch · Learning-to-Rank · Airflow · Python

Education

Computer Science, University of Brasília (UnB)

Ph.D. Computer Science · 2022

Multi-agent strategic simulation, autonomous decision-making, and adversarial agent behavior with Game Theory for Environmental Simulation

M.Sc. Computer Science · 2014

Rational agent architectures with Belief–Desire–Intention (BDI) frameworks

B.Sc. Computer Science · 2012

Multi-agent systems for environmental management simulation

🎓

Ph.D. research in multi-agent systems and autonomous decision-making — the foundational theory behind modern agentic AI systems

Tech Stack

Technologies I ship production systems with

Languages
Python · TypeScript · Java · C++
AI & LLM
LangChain · MCP Servers · RAG · Agents · Few-shot Prompting · Structured Outputs · LLM Evaluation
Models
Claude Opus 4 · Claude Sonnet 4 · GPT-4o · Gemini 1.5 Pro · Gemini 1.5 Flash
AI Dev Tools
Claude Code · GitHub Copilot · Cursor
Backend & Data
FastAPI · Kafka · Airflow · Spark · Databricks · PostgreSQL · Elasticsearch · Redis · Vector DBs
Cloud & Infra
AWS Lambda · AWS ECS · SageMaker · GCP · Docker · Terraform · Datadog · GitLab CI/CD

Beyond the Terminal

The human behind the agentic pipelines

🎮

Powered by Nintendo since 198x. (That's me in the photo — you can't really see it, but it's a Nintendo 64 t-shirt.)

🧠

Spent 4 years teaching autonomous agents to outsmart each other using Game Theory. Called it a PhD.

🌎

Born in Belém, based in Brasília, fluent in English, native in Python. C2 certified (the accent comes free of charge).

🚀

Favorite side effect of being a Forward-Deployed Engineer: shipping AI in hospitals, banks, and edtechs — sometimes in the same sprint.

Let's build something important together.

I'm looking for Forward-Deployed Engineer and Applied AI roles at ambitious AI companies. I bring production-tested agentic systems experience and a track record of delivering reliably inside complex enterprise environments.