AI & RAG ENGINEERING

LLM features that survive real users.

Production AI · not demos

I build production AI on top of real products: retrieval-augmented generation (RAG) over your own documents, autonomous and multi-agent systems, and LLM features wired end to end into web and mobile apps. I work across the OpenAI, Anthropic and Gemini APIs with the Vercel AI SDK and LangChain — and I treat reliability, grounding and evaluation as part of the build, not an afterthought.

01§

RAG systems

Chatbots and search that answer from your data: document ingestion, chunking, embeddings, vector search (pgvector, Pinecone) and grounded retrieval with citations.

02§

AI agents

Autonomous and multi-agent systems that take actions, run on a schedule, and integrate with your tools — like the dual-agent pipeline behind TechBlog AI Agent.

03§

LLM integration

Generation, summarization, classification and chat added to an existing app, end to end — data and retrieval, API layer, and the web or mobile UI.

04§

Reliability

Structured outputs, schema validation, guardrails, evaluation sets and human-in-the-loop where it matters — so the feature is trustworthy in front of real users.

SHIPPED

AI I’ve put in production.

Real systems · real users

Clona

2025

B2B platform for AI conversational agents with a vector-search knowledge base and 3D avatars. Implements retrieval-augmented generation (RAG) over each client’s documents, with multichannel delivery across chat, voice, WhatsApp and embeddable web widgets.

Retrieval-Augmented Generation · Vector search · Conversational AI agents · LangChain · OpenAI

TechBlog AI Agent

2026

Autonomous dual-agent system that discovers AI/tech news from 20+ RSS feeds, rewrites it in Spanish, and publishes automatically every three hours. Uses PostgreSQL-backed deduplication and scheduled execution.

Multi-agent system · Autonomous agents · Vercel AI SDK · OpenRouter · Content automation

Living Motions

2026

Cross-platform Enneagram personality app with AI-generated reports and personalized coaching, integrating an LLM to turn assessment data into tailored narrative output.

LLM integration · Report generation · Anthropic

Credit Helper

2026

SaaS credit-analysis platform with an AI chatbot that drafts FCRA dispute letters and explains credit concepts, generating structured, document-ready output.

AI chatbot · Structured generation · OpenAI

ArcaVida

2026

Health and wellness platform with an AI chatbot and a natural-remedies knowledge library, answering user questions grounded in a curated content base.

AI chatbot · Knowledge retrieval · OpenAI

FAQ

Questions people actually ask.

Hiring, AI, MVPs & how I work

01

Can you build a RAG chatbot over my company’s documents?

Yes. Ramón builds retrieval-augmented generation (RAG) systems: ingesting and chunking your documents, generating embeddings, storing them in a vector database (such as pgvector or Pinecone), and retrieving the right context at query time so the model answers from your data instead of guessing. He shipped exactly this in Clona, a B2B platform whose conversational agents answer from a vector-search knowledge base across chat, voice and WhatsApp.
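The chunk, embed, and retrieve steps described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not Clona's code: `chunkText`, `embed`, `cosineSimilarity` and `retrieve` are hypothetical names, the "embedding" is a simple word count over a fixed vocabulary standing in for a real embedding model, and an in-memory array stands in for a vector database such as pgvector or Pinecone.

```typescript
// Toy sketch of retrieval: every name here is hypothetical.
type Chunk = { id: number; text: string; vector: number[] };

// Split text into overlapping windows of `size` words.
function chunkText(text: string, size: number, overlap: number): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const step = Math.max(1, size - overlap);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + size).join(" "));
    if (start + size >= words.length) break; // last window reached the end
  }
  return chunks;
}

// Stand-in embedding (assumption): term counts over a fixed vocabulary.
// A real system would call an embedding model here instead.
function embed(text: string, vocabulary: string[]): number[] {
  const tokens = text.toLowerCase().split(/\W+/);
  return vocabulary.map((term) => tokens.filter((t) => t === term).length);
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // no overlap with the vocabulary
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query.
function retrieve(query: string, chunks: Chunk[], vocabulary: string[], k: number): Chunk[] {
  const queryVector = embed(query, vocabulary);
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

The retrieved chunks would then be placed in the model's prompt together with their ids, which is what makes citations possible.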

02

Which LLM providers and AI tools do you work with?

Ramón works with the OpenAI, Anthropic, and Gemini APIs, and routes across models with OpenRouter. On the application side he uses the Vercel AI SDK and LangChain for orchestration, plus vector stores and embeddings for retrieval. He picks the model and tooling per use case — cost, latency, and quality — rather than defaulting to one provider.

03

Can you build autonomous AI agents or multi-agent systems?

Yes. Ramón has built autonomous and multi-agent systems in production. TechBlog AI Agent is a dual-agent pipeline that discovers news from 20+ RSS feeds, rewrites it in Spanish, and publishes automatically every three hours, with PostgreSQL-backed deduplication and scheduled execution: agents doing real work on a schedule, not a demo.
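As an illustration of the deduplication idea, here is a minimal sketch of filtering feed items against a store of already-seen keys. It is an assumption about how such a step could look, not the TechBlog AI Agent code: `FeedItem`, `dedupKey`, `SeenStore` and `filterNewItems` are hypothetical names, and the in-memory `Set` stands in for a PostgreSQL table, where the same effect could come from a unique constraint combined with `INSERT ... ON CONFLICT DO NOTHING`.

```typescript
// Hypothetical shapes and names, for illustration only.
type FeedItem = { title: string; link: string };

// Normalize a title so trivially different copies map to the same key.
// Assumption: ASCII-only normalization is enough for this sketch.
function dedupKey(item: FeedItem): string {
  return item.title.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// In-memory stand-in for a database table with a unique key column.
class SeenStore {
  private keys = new Set<string>();

  // Records the key and returns true only if it was not seen before.
  insertIfNew(key: string): boolean {
    if (this.keys.has(key)) return false;
    this.keys.add(key);
    return true;
  }
}

// Keep only items whose key has not been recorded yet.
function filterNewItems(items: FeedItem[], store: SeenStore): FeedItem[] {
  return items.filter((item) => store.insertIfNew(dedupKey(item)));
}
```

Because the store persists between runs, a scheduled job can fetch the same feeds repeatedly and still publish each story only once.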

04

How do you keep AI features reliable and avoid hallucinations in production?

The core technique is grounding: RAG so answers come from real sources, structured outputs and schema validation so responses are machine-checkable, and guardrails plus fallbacks for when the model is uncertain. Where it matters, he adds evaluation sets to measure quality across changes and keeps a human in the loop for high-stakes actions. The goal is an AI feature you can trust in front of real users, not just a working prompt.
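To make "machine-checkable" concrete, the sketch below validates a model response against a small schema and falls back to a safe message when the check fails. It rests on assumptions: the `Answer` shape, `validateAnswer`, `answerOrFallback` and the fallback text are all hypothetical, and in a real project a schema library would usually replace the hand-written checks.

```typescript
// Hypothetical response shape: an answer plus the ids of the sources it cites.
type Answer = { answer: string; citations: string[] };
type Validation = { ok: true; value: Answer } | { ok: false; reason: string };

// Check that the raw model output is JSON, has the expected fields,
// and cites only sources that were actually retrieved.
function validateAnswer(raw: string, knownSourceIds: Set<string>): Validation {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, reason: "not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, reason: "not an object" };
  }
  const record = parsed as Record<string, unknown>;
  const answer = record["answer"];
  const citations = record["citations"];
  if (typeof answer !== "string" || answer.trim() === "") {
    return { ok: false, reason: "missing answer" };
  }
  if (!Array.isArray(citations) || citations.length === 0) {
    return { ok: false, reason: "missing citations" };
  }
  const ids: string[] = [];
  for (const citation of citations) {
    if (typeof citation !== "string" || !knownSourceIds.has(citation)) {
      return { ok: false, reason: "unknown citation" };
    }
    ids.push(citation);
  }
  return { ok: true, value: { answer, citations: ids } };
}

const FALLBACK_MESSAGE = "I could not find that in the provided documents.";

// Show the answer only when it passes validation; otherwise fall back.
function answerOrFallback(raw: string, knownSourceIds: Set<string>): string {
  const result = validateAnswer(raw, knownSourceIds);
  return result.ok ? result.value.answer : FALLBACK_MESSAGE;
}
```

The same rejected responses can be logged and added to an evaluation set, so regressions show up when a prompt or model changes.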

05

How much does it cost to add an AI feature to an existing product?

It depends on scope, but a focused AI feature — say a RAG chatbot or a generation flow on top of an existing app — often ships in around 2 to 5 weeks. Pricing is quoted per project once the scope is clear rather than as a fixed rate, so the first step is a short call to define the use case, the data involved, and how reliability will be measured.

06

Can you integrate AI into an existing web or mobile app?

Yes — most AI work Ramón does sits on top of an existing product rather than starting from scratch. Because he works full-stack across React, Next.js, React Native and the backend, he can wire an LLM feature end to end: data and retrieval, the API layer, and the web or mobile UI, without coordinating separate contractors.

CONSULTING
13+ years · Web, Mobile, AI · ES / EN

When you need judgment,
not just code.

More than a decade shipping product across web, mobile and AI left me with something more valuable than a stack: judgment. If your team is stuck on a technical decision, is evaluating a stack, or wants a second opinion before sinking months into a direction, let's talk.

WhatsApp · LinkedIn · Upwork