Recruiter Walkthrough
A guided tour by Jonatas Silva — the problem, the risks, the architecture, and how I'd execute in the first 90 days.

Jonatas Silva · Tech Lead Engineer · Full Stack · Agentic AI & Multi-tenant SaaS
Lexora AI Marketing OS — a production-shaped prototype I designed and built for law-firm AI marketing.
7+ years building production systems · 4+ years leading engineering teams · specialized in tenant isolation, async AI pipelines, and billing at scale.
I treat AI calls as engineered systems, not fetch requests: prompt chaining, fallback logic, output validation, per-client token cost tracking, and observability. This prototype demonstrates those patterns in a compliance-sensitive, multi-tenant context — everything is mock-driven, but the entities, flows and architecture mirror systems I ship in production.
Professional profile
Tech Lead Engineer & Full Stack Developer specialized in multi-tenant SaaS with strict tenant isolation, agentic AI pipelines, async job architectures, and billing systems — owning products end to end from architecture to production.
Multi-tenant SaaS
Strict tenant isolation — scoped queries, middleware resolvers, indexed tenantId, RLS-style policies shipped in production.
Agentic AI systems
Prompt chaining, fallback logic, output validation, and orchestration across Claude, OpenAI, Gemini and Perplexity.
Async job architecture
BullMQ + Redis workers, retries, idempotency, dead-letter handling, and alerting for AI/content pipelines.
Billing & observability
Stripe checkout, webhooks, tier gating, per-client token cost tracking, audit trails, and integration health monitoring.

Jonatas Silva
Tech Lead Engineer
Contact
- jonatasfelipe68@hotmail.com
- +55 14 99116-4027
- Areiópolis/SP, Brazil · Remote
- linkedin.com/in/jonatas-felipe-silva
Languages
Relevant experience
Background that maps directly to this prototype.
Tech Lead Engineer & Full Stack Developer
2018 – Present
Backend ownership and architecture of multi-tenant SaaS products with AI pipelines, async job systems, and billing, leading delivery from data model to production.
- Multi-tenant data models with strict isolation (tenantId scoping + middleware resolver)
- BullMQ + Redis async pipelines with retries, idempotent jobs, and failure handling
- Agentic AI: OpenAI, Gemini, prompt chaining, fallback, structured output validation
- Subscription billing: checkout, tier gating, webhooks, active/overdue/blocked state machines
- Per-tenant audit trails, RBAC, rate limiting, and production observability
- CI/CD on AWS — Docker, PM2, EC2/S3, zero-downtime releases, Prisma migrations
Tech Lead & Systems Engineer
2022 – 2025
Transmaion Transportes
Technical leadership of the engineering team — architecture, mentoring, and production reliability of internal platforms.
- Led multi-developer team: code reviews, onboarding, architectural decisions
- Real-time operational systems with async jobs, observability, and retry/alerting
- Third-party API integrations with per-source health monitoring
Key projects → Lexora screens
Production systems I've shipped that informed this prototype.
UPVEND
Multi-tenant SaaS · BullMQ · AI
ERP/commerce SaaS: per-subdomain tenant isolation, BullMQ workers, plan checkout + webhooks, AI assistant via secured endpoints.
Caixaly
Financial SaaS · OpenAI · Billing
Multi-tenant financial platform with master admin panel, OpenAI assistant with heuristic fallback, plans/tier model. Live in production.
TouchFind
Granular RBAC · Audit · Tier gating
Industrial SaaS: shared-schema isolation, resource:action permissions, full audit trail, subscription state machine enforced via middleware.
Sales Launch
Agentic AI · Prompt chaining · BullMQ
AI sales-training platform: OpenAI + Gemini, dynamic scenarios, streamed responses, provider fallback, output evaluation rubrics.
Core stack
Core
AI & observability
Platform & DevOps
The walkthrough
Ten talking points, each linked to the live screen that proves it.
1 · The product problem
Law firms publish marketing content that must be jurisdiction-compliant, fast and cheap. Today that runs through fragmented N8N flows + an Express/ECS service: hard to version, weak retries, no per-tenant cost visibility, and real legal risk if a bad claim ships. Lexora consolidates this into one observable platform — the same class of problems I've solved across UPVEND, Caixaly, and TouchFind.
See the dashboard
2 · Key technical risks
Non-deterministic AI output (hallucinations, malformed JSON, refusals), runaway token cost, cross-tenant data leakage, provider rate limits, and compliance violations that can't be 'rolled back' once published. Each risk has an explicit mitigation in the system — patterns I've applied in Sales Launch and production AI assistants.
Risk telemetry
3 · Proposed architecture
A Next.js App Router monolith (UI + API + Server Actions) fronting Supabase Postgres with RLS, Redis-backed BullMQ workers for async pipelines, AI observability tracing, Stripe for event-driven billing, and adapters for Claude/Gemini/Perplexity/DALL-E, CourtListener and WordPress.
Architecture map
4 · Replacing N8N with native workers
Every workflow becomes a typed BullMQ job contract: code-reviewed, versioned, retried with backoff, and pushed to a DLQ on exhaustion. Jobs carry tenant_id and emit tokens/cost/trace data — eliminating the visual-flow black box I've replaced in multiple production systems.
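As a rough illustration of what a typed job contract could look like, here is a minimal sketch in plain TypeScript. The job names, payload fields, and the `JobContract` shape are hypothetical and chosen only for this example; they are not the prototype's actual schema, and the queue wiring is left out on purpose.

```typescript
// Hypothetical job names and payload fields, for illustration only.
type JobPayloads = {
  "content.generate": { tenant_id: string; briefId: string; jurisdiction: string };
  "content.publish": { tenant_id: string; postId: string };
};

type JobName = keyof JobPayloads;

// Retry policy lives in the contract, so it is versioned and code-reviewed.
interface RetryPolicy {
  attempts: number;
  backoffMs: number;
}

interface JobContract<N extends JobName> {
  name: N;
  version: number;
  retry: RetryPolicy;
  // Runtime guard: compile-time types do not protect against bad queue data.
  validate(payload: unknown): payload is JobPayloads[N];
}

// True when every listed field exists on the value as a non-empty string.
function hasStringFields(value: unknown, fields: string[]): boolean {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return fields.every((f) => typeof record[f] === "string" && record[f] !== "");
}

const generateContract: JobContract<"content.generate"> = {
  name: "content.generate",
  version: 1,
  retry: { attempts: 5, backoffMs: 1000 },
  validate(payload: unknown): payload is JobPayloads["content.generate"] {
    return hasStringFields(payload, ["tenant_id", "briefId", "jurisdiction"]);
  },
};
```

A worker would call `validate` before doing any work, so a payload without `tenant_id` is rejected at the boundary instead of failing halfway through a pipeline.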
Workers & DLQ
5 · Multi-tenant isolation
Isolation lives in Postgres. Row-level security policies bind every row to auth.jwt() ->> 'tenant_id', and FORCE ROW LEVEL SECURITY applies those policies to the table owner as well. A forgotten WHERE clause can't leak data; cross-tenant attempts return 0 rows and log a critical audit event. It is the same discipline I enforce with tenantId scoping + middleware resolvers.
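The database policy itself is SQL, but the application-side half of the pattern (tenantId scoping plus a resolver) can be sketched in TypeScript. This is an in-memory model for illustration, not the Postgres policy: the claims shape, `selectForTenant`, and the `auditLog` array are all assumed names.

```typescript
// Claims shape is an assumption; only tenant_id comes from the text above.
interface JwtClaims { sub: string; tenant_id?: string }
interface TenantRow { id: string; tenant_id: string }
interface AuditEvent { severity: "critical"; actor: string; attemptedTenant: string }

// Stand-in for a real audit sink.
const auditLog: AuditEvent[] = [];

// Resolver: refuse to run any query without a tenant in the verified claims.
function resolveTenant(claims: JwtClaims): string {
  if (!claims.tenant_id) throw new Error("No tenant_id in claims");
  return claims.tenant_id;
}

// Application-level mirror of the predicate row.tenant_id = claims.tenant_id.
// A request that names another tenant gets zero rows and a critical audit event.
function selectForTenant<T extends TenantRow>(
  claims: JwtClaims,
  rows: T[],
  requestedTenant?: string,
): T[] {
  const tenant = resolveTenant(claims);
  if (requestedTenant !== undefined && requestedTenant !== tenant) {
    auditLog.push({ severity: "critical", actor: claims.sub, attemptedTenant: requestedTenant });
    return [];
  }
  return rows.filter((row) => row.tenant_id === tenant);
}
```

Keeping this check in the application as well as in the database gives two independent layers: the resolver catches mistakes early, and the database policy still holds if the application code is wrong.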
RLS isolation
6 · Per-tenant AI cost tracking
Each LLM call produces a trace with input/output tokens and cost, attributed to the tenant via the metering queue. Budget guardrails warn at 85% and hard-stop at 100%, reconciled with Stripe usage records — modeled after Caixaly and UPVEND billing flows.
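The guardrail thresholds above reduce to a small pure function. The sketch below assumes costs are tracked in USD and that prices are quoted per million tokens; `Price`, `callCostUsd`, and `budgetStatus` are illustrative names, and the prices used in any call are placeholders rather than real provider rates.

```typescript
type BudgetStatus = "ok" | "warn" | "blocked";

// Threshold from the text: warn at 85% of budget, hard-stop at 100%.
const WARN_RATIO = 0.85;

function budgetStatus(spentUsd: number, budgetUsd: number): BudgetStatus {
  if (budgetUsd <= 0) return "blocked"; // no budget configured: fail closed
  const ratio = spentUsd / budgetUsd;
  if (ratio >= 1) return "blocked";
  if (ratio >= WARN_RATIO) return "warn";
  return "ok";
}

// Hypothetical price shape; real prices vary by provider and model.
interface Price { inputPerMTok: number; outputPerMTok: number }

// Cost of one LLM call from its token counts.
function callCostUsd(inputTokens: number, outputTokens: number, price: Price): number {
  return (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) / 1e6;
}
```

A metering worker would add `callCostUsd(...)` to the tenant's running total and check `budgetStatus` before enqueueing the next generation, so a blocked tenant never reaches the provider.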
Billing & usage
7 · Failures & retries
Layered resilience: retry with exponential backoff + jitter → fall back to a secondary provider → fall back to cached research and flag for human review. Exhausted jobs land in the DLQ with full payload and stack trace for replay.
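The layering can be sketched without any queue library. This version assumes "full jitter" (a random delay between zero and the exponential ceiling), which is one common choice and not necessarily the one the prototype uses; `generateWithFallback`, its option names, and the default delays are hypothetical.

```typescript
// Full-jitter backoff: a random delay in [0, min(cap, base * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  baseMs = 500,
  capMs = 30000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

type Outcome<T> = { value: T; source: "primary" | "secondary" | "cache"; needsReview: boolean };

interface ResilienceOptions<T> {
  primary: () => Promise<T>;
  secondary: () => Promise<T>;
  cached: () => T | undefined;
  maxAttempts: number;
  sleep?: (ms: number) => Promise<void>; // injectable so tests do not wait
}

async function generateWithFallback<T>(opts: ResilienceOptions<T>): Promise<Outcome<T>> {
  const sleep = opts.sleep ?? ((ms: number) => new Promise<void>((r) => setTimeout(r, ms)));
  // Layer 1: retry the primary provider with backoff between attempts.
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try {
      return { value: await opts.primary(), source: "primary", needsReview: false };
    } catch {
      if (attempt < opts.maxAttempts - 1) await sleep(backoffDelayMs(attempt));
    }
  }
  // Layer 2: one attempt on the secondary provider.
  try {
    return { value: await opts.secondary(), source: "secondary", needsReview: false };
  } catch {
    // Layer 3: cached result, flagged for human review.
    const value = opts.cached();
    // Nothing left to try: throw so the queue can move the job to the DLQ.
    if (value === undefined) throw new Error("All providers failed and no cached result exists");
    return { value, source: "cache", needsReview: true };
  }
}
```

Returning the `source` alongside the value lets the caller record which layer answered, which is the data needed to alert on a provider that is silently degrading.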
Retry in action
8 · AI quality & compliance monitoring
Beyond logs: structured traces capture evaluation, compliance and hallucination scores per generation. Test suites built for non-deterministic output gate publishing; jurisdiction rule packs block prohibited claims before they ship.
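A jurisdiction rule pack can be as simple as a list of patterns checked before publishing. The pack below is a hypothetical example: the jurisdiction code "XX" and both rules are invented for illustration and do not restate any bar association's actual advertising rules.

```typescript
// Hypothetical rule pack: jurisdiction code and patterns are illustrative only.
interface ComplianceRule { id: string; pattern: RegExp; reason: string }
type RulePack = { jurisdiction: string; rules: ComplianceRule[] };

const examplePack: RulePack = {
  jurisdiction: "XX",
  rules: [
    { id: "no-guarantee", pattern: /\bguarantee(d|s)?\b/i, reason: "Promises a case outcome" },
    { id: "no-superlative", pattern: /\b(best|top-rated)\b/i, reason: "Unverifiable comparison" },
  ],
};

interface Violation { ruleId: string; reason: string; match: string }

// Returns the first match per rule so a reviewer sees what triggered the block.
function checkCompliance(text: string, pack: RulePack): Violation[] {
  const violations: Violation[] = [];
  for (const rule of pack.rules) {
    const found = rule.pattern.exec(text);
    if (found) violations.push({ ruleId: rule.id, reason: rule.reason, match: found[0] });
  }
  return violations;
}

// Publishing gate: any violation blocks the draft.
function canPublish(text: string, pack: RulePack): boolean {
  return checkCompliance(text, pack).length === 0;
}
```

Pattern rules only catch literal phrasing, so in a design like this they would sit in front of the scored evaluations rather than replace them.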
Compliance layer
9 · Migrating without breaking prod
Strangler-fig: run BullMQ alongside N8N, migrate one pipeline at a time behind feature flags, shadow-run for parity, then cut over. RLS and observability land before legacy is removed, so we always have a rollback path.
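The per-pipeline cutover can be modelled as a three-stage flag. In this sketch the stage names, the `pipelineStage` map, and the pipeline names are assumptions; a real setup would read stages from a feature-flag service. The parity check is a plain string comparison, which only suits deterministic steps, since generated text would need a looser comparison.

```typescript
type Engine = "n8n" | "bullmq";
type Stage = "legacy" | "shadow" | "cutover";

// Hypothetical per-pipeline flag store; in practice this would be a flag service.
const pipelineStage: Record<string, Stage> = {
  "content.generate": "shadow",
  "content.publish": "legacy",
};

// Which engine serves the result, and which one (if any) runs silently beside it.
interface RunPlan { serve: Engine; shadow?: Engine }

function planRun(pipeline: string, stages: Record<string, Stage> = pipelineStage): RunPlan {
  const stage = stages[pipeline] ?? "legacy"; // unknown pipelines stay on legacy
  if (stage === "cutover") return { serve: "bullmq" };
  if (stage === "shadow") return { serve: "n8n", shadow: "bullmq" };
  return { serve: "n8n" };
}

// Parity check for deterministic steps: compare after normalising whitespace.
function outputsMatch(legacy: string, candidate: string): boolean {
  const norm = (s: string) => s.replace(/\s+/g, " ").trim();
  return norm(legacy) === norm(candidate);
}
```

Rolling back a pipeline is then a flag change from "cutover" to "legacy", with no deploy involved.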
Migration plan
10 · First 30/60/90 days
Stabilize and instrument first, migrate the riskiest pipelines next, then harden security, billing and compliance — finishing by decommissioning N8N and consolidating ECS into the monolith.
See the plan
30 / 60 / 90 Day Plan
How I'd de-risk and deliver in the first quarter.
First 30 days
Phase 1
1. Map current pipelines & failure modes
2. Identify critical failures and bottlenecks
3. Stand up operational metrics & dashboards
4. Define typed job contracts
5. Build the base RLS layer
First 60 days
Phase 2
1. Migrate highest-risk pipelines to BullMQ
2. Implement DLQ + retry strategy
3. Integrate AI observability tracing
4. Implement billing metering
First 90 days
Phase 3
1. Remove critical N8N dependencies
2. Compliance hardening per jurisdiction
3. Per-tenant monitoring & alerting
4. Executive dashboards
5. Regression testing for AI outputs
Thanks for reviewing.
This prototype was designed and built by Jonatas Silva. The UI exists to make the architecture legible — I'm happy to walk through any pipeline, the retry/DLQ model, the RLS design, or the AI observability schema in depth.