Cássio (Cass) Couto — Forward-Deployed Engineer

Impact at a Glance

Results shipped across fintech, edtech, healthcare, and government

20+ Healthcare payers automated via Prior Auth pipeline · Thoughtful AI

~90s Average exception resolution time Down from ~15 minutes · Canoe Intelligence

>70% ETL pipeline runtime reduction Databricks optimization · BairesDev

~80% User engagement increase ActivityLab, GenAI-powered · Voxy

~15x Search latency improvement Semantic search platform · RNP

82→98% SLA improvement IpeaData pipeline · IPEA

What I Do Best

Deep technical capability across the full AI delivery stack

🤖

Agentic LLM Systems

MCP servers, sub-agents, skill libraries, planning / execution / verification / recovery layers, tool orchestration, memory systems, reasoning loops

🚀

Forward-Deployed Engineering

Customer discovery, system boundary design, secure integration contracts, white-glove production deployment in compliance-heavy environments, stakeholder partnership

🔬

AI Evaluation & Observability

Model benchmarking (Claude, GPT, Gemini), scenario test sets, regression checks, structured telemetry, step-level execution tracing, LLM evaluation frameworks

⚙️

Data & ML Infrastructure

FastAPI, Kafka, Spark, Databricks, Airflow, vector databases, RAG pipelines, embedding systems, semantic search, ETL optimization

☁️

Cloud & DevOps

Production deployments on AWS (Lambda, ECS, SageMaker) and GCP. Infrastructure-as-code with Terraform and Docker. Observability with Datadog. Continuous delivery via GitLab CI/CD with zero-downtime release discipline.

🔁

Field-to-Product Feedback Loop

Codifying repeatable deployment playbooks, feeding customer learnings back into product and engineering, mentoring engineers, cross-functional delivery leadership

Professional Experience

11+ years building and deploying production AI systems

Thoughtful AI · Forward-Deployed Engineer

Jun 2025 – Present · Healthcare Automation

Embedded with enterprise customers to translate ambiguous business workflows into production-grade agentic LLM architectures (planning, execution, verification, and recovery layers)
Delivered production artifacts, including MCP servers, sub-agents, reusable agent skill libraries, and API endpoints, deployed across multiple enterprise clients
Partnered with customer engineering and operations teams to run discovery on the payment layer, define system boundaries, and design secure integration contracts for autonomous enterprise automation
Built model-selection and evaluation patterns (scenario test sets, regression checks) by benchmarking Claude Opus, GPT, and Gemini Pro on agent reasoning tasks
Implemented structured telemetry and step-level execution tracing, reducing rollback incidents and improving auditability in compliance-heavy environments
Codified repeatable deployment playbooks for agent workflows and fed field learnings back into product and engineering teams
Orchestrated durable agentic workflows using Temporal, enabling reliable long-running automation with retry logic, fault tolerance, and full execution history

Canoe Intelligence · Senior Software Engineer

Jun 2024 – Oct 2025 · Fintech · Alternative Investments

Collaborated with PM, operations, and data teams from discovery through delivery to redesign exception management as an event-driven, AI-powered pipeline using FastAPI + Kafka
Built an autonomous exception resolution system in which an ML classification model predicts likely resolutions and agents execute known workflows when confidence thresholds are met, reducing average resolution time from ~15 minutes to ~90 seconds
Partnered with internal stakeholders on exception labeling, pattern analysis, and model validation to ensure AI outputs met operational requirements
Evaluated Claude Sonnet, GPT, and Gemini Flash for document understanding tasks; implemented a structured LLM evaluation framework
Implemented a unified observability stack (Datadog + Terraform) with distributed tracing for AI workflows, enabling real-time debuggability
Shipped weekly zero-downtime releases through standardized CI/CD, mentored junior engineers, and used Claude Code and GitHub Copilot for test generation and code review automation

BairesDev · Senior Software Engineer

Jan 2024 – May 2024 · Consulting · Data Infrastructure

Optimized Databricks ETL pipelines for ML training data, reducing runtime by >70% and significantly lowering compute costs through strategic caching, partitioning, and transformation pruning
Designed custom Spark and scikit-learn validation layers for data quality, stabilizing training data and improving downstream model robustness and stakeholder trust

Voxy · Tech Lead

Aug 2022 – Jan 2024 · Edtech · Enterprise Language Learning

Led roadmap alignment sessions directly with enterprise customers to translate learning goals into GenAI product requirements, balancing competing priorities across engineering, product, and client stakeholders
Led a team of 6 engineers to build ActivityLab, a GenAI-powered platform generating personalized multimodal learning activities (listening, speaking, reading, writing); increased user engagement by ~80%
Designed an automated reading comprehension generator with article-aware prompts producing difficulty-calibrated questions, validated by heuristics and human-in-the-loop review
Built an automated writing feedback system with prompt-driven revision, rubric-aligned grading, and bias mitigation checks
Implemented AI listening and speaking assessment integrating ASR and LLM-based scoring for near-real-time pronunciation and fluency feedback
Experimented with GPT, Claude Sonnet, and early prompt engineering techniques; integrated GitHub Copilot across engineering workflows for team productivity

RNP (National Education and Research Network) · Senior Software Engineer

Mar 2017 – Aug 2022 · National Research Infrastructure

Worked directly with external academic institutions and universities, visiting stakeholders on-site, to understand research discovery workflows and translate requirements into a hybrid semantic search architecture (MariaDB + Elasticsearch)
Built a scientific document embedding and semantic search platform using SciBERT and multilingual embeddings; reduced average search latency by ~15x and increased user engagement by ~30%
Implemented hybrid ranking combining BM25, embeddings, citation metrics, recency decay, and author authority using learning-to-rank models, supporting millions of indexed articles
Migrated legacy ETL workflows to Apache Airflow as modular, monitored DAGs; increased pipeline efficiency by ~350%
Built a centralized security monitoring platform (Grafana + MongoDB) with access-log aggregation and anomaly detection, preventing thousands of unauthorized access attempts monthly

Institute for Applied Economic Research (IPEA) · Research Engineer

May 2015 – May 2017 · Public Sector · National Data Infrastructure

Built a national Civil Society Organizations platform tracking 820,000+ entities with automated ETL pipelines, validation, and failure recovery
Improved IpeaData SLA from 82% to 98% through pipeline optimization, data normalization, and upstream validation

Signature Wins

High-impact systems shipped end-to-end

Healthcare · Thoughtful AI

Enterprise Healthcare AI Automation

Production agentic workflows deployed in compliance-heavy customer environments

Embedded as FDE to design and ship MCP servers, sub-agents, and reusable skill libraries. Established system boundaries, integration contracts, and evaluation gates for autonomous AI workflows operating on sensitive healthcare data.

MCP Servers · LLM Agents · Python · FastAPI · Telemetry

Fintech · Canoe Intelligence

Exception Resolution at Scale

~15 min → ~90 sec average resolution time via autonomous AI pipeline

Redesigned exception management from scratch as an event-driven AI pipeline. An ML classifier predicts resolutions; autonomous agents execute the matching workflows once a confidence threshold is met, eliminating human intervention for recurring patterns.

FastAPI · Kafka · ML Classification · LLM Agents · Datadog
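
A minimal sketch of the confidence-gated routing described above, not the production code. The names `Prediction`, `route_exception`, and the 0.90 cutoff are hypothetical and chosen only for illustration; the real threshold and workflow names are not stated here.

```python
from dataclasses import dataclass

# Hypothetical cutoff for illustration; the actual threshold is not given.
CONFIDENCE_THRESHOLD = 0.90


@dataclass
class Prediction:
    resolution: str    # name of the resolution workflow the classifier predicts
    confidence: float  # classifier probability in [0, 1]


def route_exception(prediction: Prediction, known_workflows: set[str]) -> str:
    """Return "auto" when an agent may run the workflow, else "human_review"."""
    is_known = prediction.resolution in known_workflows
    is_confident = prediction.confidence >= CONFIDENCE_THRESHOLD
    if is_known and is_confident:
        return "auto"
    # Unknown workflows and low-confidence predictions stay with a person.
    return "human_review"
```

Only predictions that are both confident and mapped to a known workflow bypass human review; everything else falls back to the manual path.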

Edtech · Voxy

ActivityLab — GenAI Learning Platform

~80% user engagement increase across a B2B enterprise client base

Led a 6-engineer team to build a GenAI platform generating personalized multimodal learning content (listening, speaking, reading, writing). Included ASR-based pronunciation scoring, rubric-aligned writing feedback with bias mitigation, and article-aware comprehension generators.

GPT · Claude · ASR · Prompt Engineering · Python

Research Infrastructure · RNP

Scientific Semantic Search Platform

~15x search latency reduction · Millions of indexed articles

Built a hybrid semantic search system using SciBERT and multilingual embeddings. Combined BM25, citation metrics, recency decay, and author authority with learning-to-rank models to serve millions of researchers across Brazil's academic network.

SciBERT · Elasticsearch · Learning-to-Rank · Airflow · Python

Open Source

Side projects — where I build to learn and write to share

Python · TypeScript

Trading Crew

Deterministic-first crypto trading system with conditional AI advisory. A fully deterministic pipeline (fetch → analyze → signal → risk → execute) runs without any LLM calls. An UncertaintyScorer activates a CrewAI advisory crew only when market conditions are ambiguous — keeping token costs near zero in calm markets.

CrewAI · FastAPI · Next.js · WebSockets · Backtesting · Docker
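
A minimal sketch of the deterministic-first gating idea, under loud assumptions: the scoring formula, the 0.6 threshold, and the `advisor` callback are hypothetical stand-ins, not the project's actual UncertaintyScorer or CrewAI crew.

```python
from typing import Callable, Optional


def uncertainty_score(signal_strength: float, volatility: float) -> float:
    """Toy score in [0, 1]: weak signals and high volatility both raise it."""
    weakness = 1.0 - min(abs(signal_strength), 1.0)
    vol = min(max(volatility, 0.0), 1.0)
    return 0.5 * weakness + 0.5 * vol


def decide(
    signal_strength: float,
    volatility: float,
    advisor: Optional[Callable[[str], str]] = None,
    threshold: float = 0.6,  # hypothetical cutoff
) -> str:
    """Deterministic decision first; consult the advisor only when ambiguous."""
    if signal_strength > 0:
        action = "buy"
    elif signal_strength < 0:
        action = "sell"
    else:
        action = "hold"
    ambiguous = uncertainty_score(signal_strength, volatility) >= threshold
    if advisor is not None and ambiguous:
        # Stand-in for the (costly) LLM advisory step.
        return advisor(action)
    return action
```

In calm conditions the advisor is never invoked, which is how the deterministic path keeps LLM cost at zero.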

C++

Parallelizing the Critical Path

Companion proof-of-concept for the article "Parallelizing the Critical Path: OpenMP in Latency-Sensitive Systems". Demonstrates OpenMP parallelism on a realistic trading signal pipeline across 2,000 instruments — achieving 8.35× speedup (1900ms → 227ms median) on a multi-stage rolling vol → EMA → DFT → score pipeline.

C++20 · OpenMP · CMake · Performance

Education

Computer Science, University of Brasília (UnB)

Ph.D. Computer Science · 2022

Multi-agent strategic simulation, autonomous decision-making, and adversarial agent behavior, using game theory for environmental simulation

M.Sc. Computer Science · 2014

Rational agent architectures with Belief–Desire–Intention (BDI) frameworks

B.Sc. Computer Science · 2012

Multi-agent systems for environmental management simulation

🎓

Ph.D. research in multi-agent systems and autonomous decision-making — the foundational theory behind modern agentic AI systems

Tech Stack

Technologies I ship production systems with

Languages

Python · TypeScript · C++20 · Java

AI & LLM

LangChain · CrewAI · MCP Servers · RAG · Agents · Few-shot Prompting · Structured Outputs · LLM Evaluation

Models

Claude Opus · Claude Sonnet · GPT · Gemini Pro · Gemini Flash

AI Dev Tools

Claude Code · GitHub Copilot · Cursor

Backend & Data

FastAPI · WebSockets · Temporal · Kafka · Airflow · Spark · Databricks · PostgreSQL · SQLite · Elasticsearch · Redis · Vector DBs

Frontend

Next.js · React · Node.js

Systems & Performance

OpenMP · Parallel Computing · CMake

Cloud & Infra

AWS Lambda · AWS ECS · SageMaker · GCP · Docker · Terraform · Datadog · GitLab CI/CD

Beyond the Terminal

The human behind the agentic pipelines

🎮

Powered by Nintendo since 198x. (That's me in the photo — you can't really see it, but it's a Nintendo 64 t-shirt.)

🧠

Spent 4 years teaching autonomous agents to outsmart each other using Game Theory. Called it a PhD.

🌎

Born in Belém, based in Brasília, fluent in English, native in Python. C2 certified, and the accent comes free of charge.

🚀

Favorite side effect of being a Forward-Deployed Engineer: shipping AI in hospitals, banks, and edtechs — sometimes in the same sprint.

Let's build something important together.

I'm looking for Forward-Deployed Engineer and Applied AI roles at ambitious AI companies. I bring production-tested agentic systems experience and a track record of delivering reliably inside complex enterprise environments.

✉ cassio@cassiocouto.com LinkedIn GitHub Medium