Carlos Carrillo

(He/Him)

AI-Driven Staff Data Engineer & Architect | 20+ yrs | Snowflake · Azure · Python | Built data platforms from zero for multiple companies | Relocating to Spain Aug 2026 · EU timezone | Remote (EN/ES/DE)

Mexico → Spain (Aug 2026) · Remote (Worldwide) · EU timezone

500+ connections

dataqbs

Open to work

Staff Data Engineer, Data Architect, Solutions Architect, Lead Data Engineer, AI Engineer, Cloud Data Consultant roles

About

Senior Data Engineer, AI builder, and Cloud Consultant with 20+ years turning complex data problems into production systems. I architect end-to-end solutions: incremental ETL pipelines, Snowflake and Azure SQL data warehouses, RAG-powered chatbots, physics-based mining simulations, real-time dashboards (34 KPIs across 7 mine sites), and algorithmic trading engines with 50K-trial hyperparameter optimization.

My core stack is deep SQL and Python, extended with AI tooling I build myself rather than just use: LLM evaluation pipelines, vector embedding search, Kalman-filter calibration, and Bellman-Ford graph algorithms. I deliver in high-volume, mission-critical environments where uptime, cost efficiency, and long-term maintainability are a matter of survival. My systems are cloud-native, operationally practical, and designed to evolve beyond prototypes.

I have worked fully remote for years with US, European, and LATAM teams, relying on structured delivery, documentation-driven workflows, and clear technical communication across time zones. Currently based in Mexico, relocating to Spain in August 2026. Available for remote collaboration across European time zones.

Rate: $35/hr USD (€29/hr), negotiable for long-term engagements.

"To live in peace, free from rigid structures — building projects that flow naturally through intelligence and awareness. Technology should serve life, not the other way around."

Experience

Data Integration Lead

Hexaware Technologies Full-time

Mar 2025 — Apr 2026

Mexico · Remote

Led the Snowflake → Azure SQL integration for Freeport-McMoRan mining operations. Deployed incremental sync pipelines, built a regression-testing CLI, optimized Snowflake views, and developed production dashboards and AI chatbots for 7 mining sites.

Snowflake · Azure SQL · ADX/KQL · Azure Functions · App Service · Python · Streamlit · Docker · GitHub Enterprise · Copilot · MERGE/Upsert · CDC/Delta · CTE Refactoring · ETL/ELT · IoT Sensor Data · Mining Analytics · Entra ID/Kerberos · CI/CD

Senior Data Engineer

dataqbs Freelance

Jan 2011 — Present

Guadalajara, Mexico · Remote

Independent consultancy providing BI, data engineering, database solutions, and AI-assisted automation for US and LATAM clients. Built a 28-tool enterprise knowledge capture platform, SSO-authenticated scrapers, a config-driven diagram engine, crypto trading bots, an LLM evaluation engine, and this portfolio site with its RAG chatbot.

Python · SQL Server · PostgreSQL · Snowflake · SSIS/SSRS/SSAS · Tableau · Power BI · Dataiku DSS · Azure Data Factory · Node.js · Playwright · ccxt · pandas · Astro · Svelte · Confluence API · ADO REST API · Microsoft Graph · Miro API

ETL Engineer

SVAM International Inc. Contract

Nov 2022 — Sep 2024

Mexico · Remote

Led migration from on-prem SQL Server and SSIS to Snowflake for student certification analytics.

Snowflake · SQL Server · SSIS · Salesforce API · SharePoint

Certifications

Anthropic

Certificate of Completion: Introduction to Agent Skills

Issued by Anthropic · 2026

Credential ID: t3j2knij9735

Anthropic

Certificate of Completion: AI Fluency Framework & Foundations

Issued by Anthropic · 2026

Credential ID: 3fg9fxpyta6i

Featured Projects

Crypto Arbitrage Scanner

FinTech

Scans 9 exchanges (Binance, Bitget, Bybit, Coinbase, OKX, KuCoin, Kraken, Gate.io, MEXC) for price inefficiencies. Uses Bellman-Ford shortest-path algorithm and triangular arbitrage detection. Includes a Swapper module for executing trades, WebSocket L2 order-book feeds, SDK bootstrapping for native exchange integrations, and a real-time balance monitor.

  • 4,000+ LOC scanner with graph-based arbitrage detection
  • 9 exchange integrations with 4 balance provider backends
  • Live swap executor with dry-run and production modes
  • WebSocket L2 partial orderbook for Binance
  • Portfolio monitor with 1-hop bridge pricing
Python · ccxt · pandas · WebSocket · PyYAML · Binance SDK · ujson
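
The graph-based detection mentioned for this project can be illustrated with a minimal sketch. It is not the scanner's actual code: the function name `find_arbitrage_cycle` and the `rates` dictionary layout are assumptions made for this example, and fees, order-book depth, and slippage are ignored. The idea is to weight each edge by `-log(rate)`, so a cycle whose rates multiply to more than 1 becomes a negative-weight cycle that Bellman-Ford can detect.

```python
import math

def find_arbitrage_cycle(rates):
    """Return a closed list of currencies forming a profitable cycle, or None.

    rates: dict mapping (base, quote) -> units of quote received per unit of base.
    (Hypothetical input layout, chosen for this illustration.)
    """
    nodes = sorted({currency for pair in rates for currency in pair})
    # A rate product above 1 corresponds to a negative sum of -log(rate).
    edges = [(u, v, -math.log(r)) for (u, v), r in rates.items()]
    # Starting every distance at 0 acts like a virtual source linked to all nodes.
    dist = {n: 0.0 for n in nodes}
    pred = {n: None for n in nodes}
    last_relaxed = None
    for _ in range(len(nodes)):
        last_relaxed = None
        for u, v, w in edges:
            if dist[u] + w < dist[v] - 1e-12:
                dist[v] = dist[u] + w
                pred[v] = u
                last_relaxed = v
        if last_relaxed is None:
            return None  # converged, so no negative cycle exists
    # A relaxation on the final pass implies a negative cycle.
    # Walking back len(nodes) steps guarantees we land inside it.
    node = last_relaxed
    for _ in range(len(nodes)):
        node = pred[node]
    cycle = [node]
    cur = pred[node]
    while cur != node:
        cycle.append(cur)
        cur = pred[cur]
    cycle.append(node)
    cycle.reverse()  # predecessors were collected backwards
    return cycle

# Toy rates, invented for the example: 0.9 * 0.9 * 1.3 > 1.
example = {("USD", "EUR"): 0.9, ("EUR", "GBP"): 0.9, ("GBP", "USD"): 1.3}
print(find_arbitrage_cycle(example))
```

A real scanner would also subtract trading fees from each rate before taking the logarithm, since a cycle that is profitable at mid-prices can easily lose money after costs.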

OAI Code Evaluator

AI / ML

YAML-driven evaluation pipeline with rule-based scoring across Instructions, Accuracy, Optimality, Presentation, and Freshness dimensions. Supports regex/substring matching, threshold conditions, ranking normalization, rewrite post-processing, and structured audit metadata output.

  • 6-stage evaluation pipeline (adjust → rules → rank → rewrite → validate → summary)
  • Declarative YAML rules with regex, substring, and threshold conditions
  • 5-dimension scoring with configurable ideals and tolerances
  • Structured JSON/YAML audit output
Python · Rich · PyYAML · jsonschema · Jinja2
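
To make the declarative rule idea concrete, here is a small sketch of how regex, substring, and threshold conditions might be evaluated and summed per dimension. The rule fields (`dimension`, `type`, `pattern`, `value`, `metric`, `min`, `points`) and both function names are hypothetical, chosen only for this illustration. The rules are written as plain Python dictionaries standing in for what a parsed YAML file could contain.

```python
import re

def rule_matches(rule, text, metrics):
    """Evaluate one rule against the submission text and numeric metrics."""
    kind = rule["type"]
    if kind == "regex":
        return re.search(rule["pattern"], text) is not None
    if kind == "substring":
        return rule["value"] in text
    if kind == "threshold":
        return metrics.get(rule["metric"], 0) >= rule["min"]
    raise ValueError(f"unknown rule type: {kind}")

def score(rules, text, metrics):
    """Sum the points of all matching rules, grouped by dimension."""
    totals = {}
    for rule in rules:
        if rule_matches(rule, text, metrics):
            dim = rule["dimension"]
            totals[dim] = totals.get(dim, 0) + rule["points"]
    return totals

# Example rules (invented), one per supported condition type.
RULES = [
    {"dimension": "Presentation", "type": "regex",
     "pattern": r"^def \w+\(", "points": 2},
    {"dimension": "Accuracy", "type": "substring",
     "value": "return", "points": 1},
    {"dimension": "Optimality", "type": "threshold",
     "metric": "runtime_score", "min": 0.8, "points": 3},
]

print(score(RULES, "def add(a, b):\n    return a + b", {"runtime_score": 0.9}))
```

Keeping the rules as data rather than code is what allows scoring behavior to change without touching the evaluator itself.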

Email Collector & Classifier

Automation

Multi-account IMAP collector supporting Gmail, Hotmail (MSAL OAuth device-flow), and Exchange. Classifies emails into Scam/Suspicious/Spam/Clean/Unknown using a weighted scoring engine with 200+ domain rules, URL-shortener detection, phone-pattern matching, and fuzzy deduplication.

  • 5-label classifier with weighted scoring and hard rules
  • 200+ domain classification rules
  • OAuth device-flow for Hotmail/Outlook
  • Fuzzy deduplication with SimHash
Python · imap-tools · MSAL · langdetect · PyYAML
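
As a rough illustration of SimHash-based fuzzy deduplication, the sketch below fingerprints a message by letting each token vote on every bit of a 64-bit hash, then compares fingerprints by Hamming distance. The function names, the whitespace tokenizer, the SHA-256 token hash, and the distance threshold of 3 are all assumptions for this example, not details taken from the project.

```python
import hashlib

def simhash(text, bits=64):
    """Compute a SimHash fingerprint from whitespace-separated tokens."""
    weights = [0] * bits
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # first 64 bits of the hash
        for i in range(bits):
            # Each token votes +1 or -1 on every bit position.
            weights[i] += 1 if (h >> i) & 1 else -1
    # A bit is set in the fingerprint when its votes are net positive.
    return sum(1 << i for i in range(bits) if weights[i] > 0)

def hamming_distance(a, b):
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")

def is_near_duplicate(a, b, max_distance=3):
    """Treat two texts as duplicates if their fingerprints are close."""
    return hamming_distance(simhash(a), simhash(b)) <= max_distance
```

Because similar texts share most tokens, their fingerprints differ in only a few bits, which makes near-duplicate checks much cheaper than pairwise text comparison.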

dataqbs.com Portfolio

AI / ML

This very website — a LinkedIn-style portfolio with RAG-powered AI chatbot, built with Astro + Svelte + Tailwind on Cloudflare Pages.

  • RAG chatbot with vector embeddings + Groq LLM streaming
  • Knowledge pipeline: markdown → 88 chunks with 768-dim embeddings
  • i18n (EN/ES/DE), dark mode, LinkedIn-style layout
  • Cloudflare Pages + Workers AI + KV storage
Astro · Svelte · Tailwind CSS · Cloudflare Workers AI · Groq · TypeScript
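
The retrieval step of a RAG chatbot can be sketched as a cosine-similarity ranking over stored chunk embeddings. This is an illustration only: the site itself is built with TypeScript, the function names and the chunk dictionary layout are hypothetical, and the toy vectors below have 3 dimensions rather than 768. The embedding model call and the LLM prompt assembly are omitted.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k_chunks(query_vec, chunks, k=3):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, c["embedding"]), c["text"])
              for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy knowledge base with invented 3-dimensional embeddings.
CHUNKS = [
    {"text": "Snowflake experience", "embedding": [1.0, 0.0, 0.0]},
    {"text": "Crypto projects", "embedding": [0.0, 1.0, 0.0]},
    {"text": "Data warehouse design", "embedding": [1.0, 1.0, 0.0]},
]

for score, text in top_k_chunks([1.0, 0.0, 0.0], CHUNKS, k=2):
    print(f"{score:.3f}  {text}")
```

The selected chunks would then be placed into the prompt so the language model answers from the portfolio's own content.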

Open Garage

AI / ML

Full-stack SaaS marketplace for neighborhood garage sales, deployed as a Cloudflare Pages application. Features AI-powered item descriptions using Workers AI vision models (Llama 3.2 90B), Google OAuth authentication, multi-tenant KV storage with per-garage isolation, automatic translation to ES/EN/DE, mobile camera capture with auto-compression, WhatsApp sharing integration, admin panel with audit logs, and a superadmin dashboard. 15 API endpoints, 20+ Svelte components, edge-cached image serving.

  • 15 API endpoints with 20+ Svelte components on Cloudflare Pages
  • AI vision item descriptions via Workers AI (Llama 3.2 90B Vision)
  • Multi-tenant KV storage with per-garage namespace isolation
  • Google OAuth + JWT authentication with session management
  • Auto-translation to ES/EN/DE with locale detection
  • Mobile camera capture with client-side compression and chunked upload
  • Admin panel + superadmin dashboard with audit logs and ban system
Astro · Svelte · TypeScript · Cloudflare Workers AI · Cloudflare KV · Google OAuth · Tailwind CSS · Turnstile

Multi-Reach

AI / ML

Internal SaaS tool for dataqbs social media operations. Compose content once with per-channel previews and publish to 6 Meta channels (IG Feed, IG Stories, IG Reels, FB Page, FB Stories, WhatsApp Business) through the Meta Graph API v25.0. Features Google OAuth authentication restricted to @dataqbs.com domain, AES-256-GCM encrypted token storage in Cloudflare KV, scheduled publishing with calendar view, media upload pipeline via R2, and a superadmin role system. Built as a micro-app inside the dataqbs_site Astro monolith with 9 API endpoints, 13 Svelte components, and 8 library modules.

  • 6 Meta channels: IG Feed, IG Stories, IG Reels, FB Page, FB Stories, WhatsApp Business
  • Meta Graph API v25.0 — API-native only, no browser automation
  • AES-256-GCM encrypted token storage via Web Crypto API + Cloudflare KV
  • Google OAuth restricted to @dataqbs.com with HMAC-SHA256 JWT sessions
  • Per-channel content previews with real-time character/media validation
  • Scheduled publishing with calendar view and timezone support
  • 9 API endpoints, 13 Svelte components, 8 library modules
Astro · Svelte · TypeScript · Meta Graph API · Cloudflare KV · Cloudflare R2 · Google OAuth · AES-256-GCM · Tailwind CSS
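
For readers unfamiliar with HMAC-SHA256 JWT sessions, the sketch below shows the general technique using only the Python standard library. It is not the project's implementation, which runs on the Web Crypto API in TypeScript. The function names, the `email` and `exp` claim layout, and the domain check are assumptions made for this example.

```python
import base64
import hashlib
import hmac
import json
import time

def _b64url(data):
    """Base64url-encode bytes without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def _b64url_decode(text):
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def sign_jwt(payload, secret):
    """Create an HS256-signed token from a JSON-serializable payload."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [_b64url(json.dumps(obj, separators=(",", ":")).encode("utf-8"))
             for obj in (header, payload)]
    signing_input = ".".join(parts)
    sig = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(sig)

def verify_jwt(token, secret, allowed_domain="dataqbs.com", now=None):
    """Return the payload if signature, expiry, and domain all check out."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode("ascii"),
                            hashlib.sha256).digest()
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
            return None
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        return None  # malformed token, bad base64, or invalid JSON
    current = time.time() if now is None else now
    if payload.get("exp", 0) <= current:
        return None  # expired
    if not str(payload.get("email", "")).endswith("@" + allowed_domain):
        return None  # outside the allowed domain
    return payload
```

Checking the signature before parsing the payload means untrusted JSON is never interpreted unless the token was issued with the server's secret.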

MEMO-GRID

FinTech

Production grid trading microservice using ccxt with Binance Spot. Features Optuna hyperparameter optimization (50K trials), backtest engine with real fee modeling, attribution analysis (alpha vs beta decomposition), Monte Carlo projections, and 22 analysis tools. Includes FIFO inventory tracking, adaptive step sizing, and systemd deployment support.

  • v7.0.0: ETH/BTC Grid beats HODL BTC — ROI +47,331% vs +1,958% (2017–2026)
  • HPO with 50,000 Optuna trials (TPE sampler) for grid parameters
  • Backtest engine spanning 2017–2026 with maker fee modeling
  • Attribution analysis: alpha vs beta return decomposition
  • Production microservice (RADAR) with dust consolidation and balance monitoring
  • 33 unit tests with full coverage
Python · ccxt · Optuna · pandas · NumPy · PyYAML · pytest
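
FIFO inventory tracking, one of the features listed above, can be sketched in a few lines. This is an illustrative model rather than the microservice's code: the class name `FifoInventory` and its methods are hypothetical, and trading fees are left out for brevity. Each buy opens a lot, and each sell consumes the oldest lots first while accumulating realized profit.

```python
from collections import deque

class FifoInventory:
    """Track purchase lots and realize PnL in first-in, first-out order."""

    def __init__(self):
        self.lots = deque()      # each lot is [quantity, unit_price]
        self.realized_pnl = 0.0

    def position(self):
        """Total quantity currently held."""
        return sum(qty for qty, _ in self.lots)

    def buy(self, qty, price):
        """Open a new lot at the given unit price."""
        self.lots.append([qty, price])

    def sell(self, qty, price):
        """Close quantity against the oldest lots first."""
        if qty > self.position() + 1e-12:
            raise ValueError("sell exceeds inventory")
        remaining = qty
        while remaining > 1e-12:
            lot = self.lots[0]
            used = min(lot[0], remaining)
            self.realized_pnl += used * (price - lot[1])
            lot[0] -= used
            remaining -= used
            if lot[0] <= 1e-12:
                self.lots.popleft()  # lot fully consumed

inv = FifoInventory()
inv.buy(2.0, 100.0)
inv.buy(1.0, 110.0)
inv.sell(2.5, 120.0)
print(inv.realized_pnl, inv.position())
```

In a grid strategy, this kind of lot accounting is what lets realized profit per grid level be separated from unrealized gains on inventory still held.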

VCA PostgreSQL Audits

Data Eng.

Full audit and schema management framework for Azure Database for PostgreSQL at VCA (Veterinary Centers of America), part of Mars Veterinary Health (MVH). Includes per-object DDL export with Nunjucks templates, automated schema discovery, LLM-friendly schema_knowledge.json generation, and 60+ ticket-based database improvements across index optimization, FK remediation, timestamp normalization, and stored procedure reviews. DA-147 (Voyager Health migration evaluation) is a 4,100-line Technical Design Document that evaluates the feasibility of migrating the Voyager Health platform to SQL Managed Instance versus PostgreSQL. It covers Cosmos DB dependencies, 50+ microservices, SignalR, and Azure DevOps CI/CD pipelines.

  • 60+ tickets: index optimization, FK remediation, schema renames, timestamp fixes
  • DA-147 Voyager Health: 4,100-line TDD evaluating SQL MI vs PostgreSQL migration for Mars Veterinary Health
  • Templated per-object DDL exporter (Nunjucks) for CI/CD-friendly snapshots
  • Technical Design Documents for 5+ database systems
  • Automated timesheet generation with Harvest API
PostgreSQL · Node.js · JavaScript · Nunjucks · Azure PostgreSQL

IROC Video Wall Dashboard

Data Eng.

Streamlit-based production monitoring dashboard for IROC operations across 7 Freeport-McMoRan mining sites. Features real-time metrics from Snowflake and Azure Data Explorer (ADX), 34 KPIs covering dig compliance, crusher rates, cycle times, and ROM tonnage. Includes RAG-powered AI chatbot with GitHub Copilot SDK, semantic model with 16 business outcomes per site, and auto-refresh every 60 seconds.

  • 125 KPI queries across 7 mining sites with 100% coverage verified
  • AI chatbot with RAG + GitHub Copilot SDK (zero-cost for enterprise)
  • Semantic model: 16 business outcomes × 7 sites with ADX + Snowflake queries
  • Azure Functions ETL pipeline: SQL Server → Snowflake with Docker + Kerberos auth
  • Multi-environment Azure SQL DDL extraction (DEV/TEST/PROD) with Entra ID
  • 1,369-line architecture document with 20 Mermaid diagrams
  • Docker-ready with Azure Container App deployment
Python · Streamlit · Snowflake · Azure Data Explorer · KQL · GitHub Copilot SDK

Ore Tracing & Stockpile Simulation

Data Eng.

End-to-end ore tracing system that simulates stockpile behavior using 3D block models and tracks mineralogy through the comminution circuit (secondary/tertiary crushers → mills → flotation). Features predictive calibration of industrial belt scales with Kalman filtering, crush-out time estimation, lag-based propagation models, and nowcast simulation for multiple mine sites. Data pipeline reads sensor data at 1-minute resolution from a cloud data warehouse, runs simulations, and writes traced mineral states back for downstream analytics.

  • Physics-based 3D stockpile simulation with block-level mass tracking
  • Mineral composition tracing through crusher → mill → flotation circuits
  • Kalman filter belt-scale correction with inertia weighting
  • Nowcast and crush-out time prediction for operational planning
  • Multi-site deployment with config-driven YAML architecture
Python · Snowflake · Azure ML Pipelines · Dagster · NumPy · SciPy · PyYAML · Dynaconf
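
The Kalman-filter idea behind the belt-scale correction can be shown with a scalar filter. This is a generic textbook sketch, not the calibration used in the project: it assumes a random-walk state model, the function name `kalman_1d` and its parameters are hypothetical, and the inertia weighting mentioned above is not modeled.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Filter a noisy scalar series with a random-walk Kalman filter.

    process_var: how much the true value is expected to drift per step.
    measurement_var: noise variance of the sensor readings.
    """
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged, but uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates

# A constant true value of 10.0 observed 50 times, starting from a guess of 0.
smoothed = kalman_1d([10.0] * 50, process_var=0.01, measurement_var=1.0)
print(round(smoothed[-1], 3))
```

A larger `measurement_var` makes the filter trust the sensor less and respond more slowly, which is the basic trade-off when correcting a drifting or noisy scale.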

Enterprise Knowledge Capture Platform

Data Eng.

Built for NewFire Global's multi-client consulting engagements. 28 Node.js tools (10,000+ LOC) that authenticate through Okta/Microsoft SSO via Playwright, then systematically capture content from Azure DevOps wikis, Confluence spaces, Teams channels/calendar, SharePoint documents, Databricks notebooks, Miro boards, and Google Drive. Includes a config-driven SVG/PNG architecture diagram renderer with swimlanes, embedded logos, and overlap validation. Features a Confluence publishing pipeline for converting captured knowledge into structured wiki pages. OncoHealth sub-client: captured 8.5M+ characters across 16 connected services (302 files, 159MB) for healthcare/oncology knowledge transfer.

  • 28 tools across 6 categories: SSO scrapers, API clients, content extractors, diagram renderers, publishers, diagnostics
  • 10,000+ LOC with modular architecture (shared/ + clients/<name>/)
  • SSO-authenticated scraping: Okta, Microsoft, Databricks, Confluence, SharePoint
  • Config-driven SVG/PNG architecture diagrams with swimlanes and auto-layout
  • Confluence publishing pipeline: Markdown → styled wiki pages
  • OncoHealth: 8.5M+ chars captured from 16 services (ADO 2.1M, Confluence 907K, Teams, Google Drive)
  • Iceberg REST Catalog investigation for Databricks data lakehouse
  • VS Code extension for M365 Calendar & Transcripts integration
Node.js · Playwright · Confluence REST API · ADO REST API · Microsoft Graph · Miro API · exceljs · mammoth · pdf-parse · js-yaml · SVG

Skills

💻 Languages

Python Advanced
SQL Expert
JavaScript / TypeScript Advanced
KQL (Kusto) Advanced
Bash Advanced
Node.js Advanced
PowerShell Intermediate

☁️ Data & Cloud

Snowflake Expert
Azure (SQL, ADF, Functions) Advanced
Azure Data Explorer (ADX) Advanced
Microsoft Fabric Intermediate
Cloudflare (Pages, Workers, AI) Intermediate

🤖 AI & ML

LLM Evaluation & Prompt Eng. Advanced
RAG (Retrieval-Augmented Gen.) Advanced
Snowflake Cortex AI Advanced
Vector Embeddings & Search Advanced
GitHub Copilot SDK Advanced
Optuna (HPO) Advanced
Fine-Tuning (PEFT / LoRA) Intermediate

📦 Libraries & Frameworks

pandas / NumPy Expert
ccxt (crypto exchanges) Expert
Streamlit Advanced
Playwright Advanced
Astro / Svelte Intermediate
Nunjucks / Jinja2 Advanced
Rich / rapidfuzz Advanced

🔧 DevOps & Tools

GitHub Actions CI/CD Advanced
Poetry / pip Expert
ruff / pre-commit / pytest Advanced
Docker Intermediate
QEMU / KVM Intermediate
Linux (Pop!_OS) Advanced

🗄️ Databases

SQL Server Expert
Snowflake Expert
PostgreSQL Expert
Azure SQL / Azure PostgreSQL Advanced
SQLite Advanced

Contact