Tolao
Platform Costs
~$160/mo burn rate
| Item | Amount | Date / Freq | Notes |
| --- | --- | --- | --- |
| California LLC Filing | $75 | Mar 26, 2026 | Filed with CA Secretary of State |
| Anthropic API Credits (Claude tokens) | $200 | Mar 2026 | 2x $100 top-ups — powers Lobster voice, chat, reports, Claude Code |
| Spent to date | $275 | | |
| Claude Max | $100/mo | Monthly | claude.ai + Claude Code usage (flat rate) |
| Anthropic API Credits | pay-as-you-go | Monthly | Separate from Claude Max — Lobster voice, chat, reports endpoints |
| Apify Starter Plan | $29/mo | Monthly | CL scraping credits + platform access |
| ivanvs CL Scraper Actor | $25/mo | Monthly | Rented actor for Craigslist vehicle scraping |
| Twilio Phone Number | ~$1/mo | Monthly | +15106163508 — voice + SMS |
| Railway Hosting | ~$5/mo | Monthly | Usage-based — always-on server + auto-deploy |
| Monthly subtotal | ~$160/mo | | |
| CA LLC Annual Franchise Tax | $800/yr | Due 2027 | California minimum franchise tax |
| EIN (IRS) | $0 | Pending | Free — apply at irs.gov after LLC approved |
| Twilio Standard Brand | $4/mo | After LLC | Required for A2P 10DLC SMS compliance |
| Apify Scrape Credits | ~$15/mo | Estimated | 2x daily CL runs across 5 regions |
| Projected after LLC | ~$179/mo + $800/yr franchise tax | | |
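The subtotal and projection in the table can be cross-checked with a few lines of arithmetic (the object keys below are shorthand for this sketch, not names from the codebase):

```javascript
// Cross-check the monthly subtotal and post-LLC projection from the table above.
const monthly = {
  claudeMax: 100,   // Claude Max flat rate
  apifyStarter: 29, // Apify Starter Plan
  clActor: 25,      // ivanvs CL Scraper Actor
  twilio: 1,        // phone number (~$1/mo)
  railway: 5,       // hosting (~$5/mo)
};
const subtotal = Object.values(monthly).reduce((a, b) => a + b, 0);
const projected = subtotal + 4 + 15; // + Twilio Standard Brand + Apify scrape credits

console.log(subtotal);  // 160
console.log(projected); // 179
```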
1. Lead Sources
Where vehicle listings enter the system
OpenClaw Watcher
Retiring
src/watcher.js + src/worker.js
  • What it does: Connects to Chrome via CDP (127.0.0.1:9222), navigates to app.vettx.com/fresh, scans card tiles for vehicle data
  • Extraction: extractTitle() parses year/make/model from tile text, extractPrice() / extractMileage() / extractVin() pull structured fields
  • Claim flow: Finds thumbs-up button on card → clicks → handles confirm dialog → detects channel (FB/CL) → opens listing → captures photos
  • Sets on lead: vehicle.title, vehicle.vin, vehicle.mileage, vehicle.photos[], seller.askingPrice, _listingUrl, _channelAction
  • Dedup: claimedLeads Set tracks fingerprints (price|mileage), VINs, and IDs across restarts
  • One-and-done: Stops after first successful claim per run (state.done = true)
  • Status: Built and functional but retiring — 248 VETTX leads already ingested, moving to Apify FB scraper
  • Run: npm start → node src/run.js → startWorker()
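A minimal sketch of the fingerprint dedup described above — the fallback order (VIN first, then the price|mileage composite) is an assumption for illustration, not the actual watcher code:

```javascript
// Dedup sketch: track price|mileage fingerprints (plus VINs) across claims.
const claimedLeads = new Set();

function fingerprint(lead) {
  // Prefer the VIN when present; otherwise use a price|mileage composite.
  return lead.vin || `${lead.price}|${lead.mileage}`;
}

function alreadyClaimed(lead) {
  const fp = fingerprint(lead);
  if (claimedLeads.has(fp)) return true;
  claimedLeads.add(fp);
  return false;
}

const lead = { price: 8500, mileage: 120000 };
console.log(alreadyClaimed(lead)); // false — first sighting
console.log(alreadyClaimed(lead)); // true — duplicate
```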
Apify FB Marketplace
Not Wired
src/connectors/facebookMarketplace.js + apify.js + fb-actor-input.json
  • Actor config: fb-actor-input.json — queries: "cars for sale", "car for sale private", radius: 150mi from SF Bay Area, $1k–$30k, max 200 items, category: vehicles
  • Normalizer: normalizeFBLead(item, datasetId) in facebookMarketplace.js
  • Output shape: id (fb-timestamp-rand), source: "apify_fb", channel: "facebook", vehicle.{title, year, make, model, mileage, vin, photos[]}, seller.{name, askingPrice, contactInfo.phone}
  • Photos: Comes with item.images[] already populated from Apify — no lobster needed
  • VIN extraction: Regex scans title + description for 17-char VIN
  • Phone extraction: Regex pulls US phone from description, normalizes to E.164
  • Webhook: POST /api/inbound/facebook exists in server.js (line 1472) but actor is not scheduled yet
  • To wire: Schedule Apify actor run → configure webhook to hit /api/inbound/facebook → leads flow automatically
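The VIN and phone extraction steps above can be sketched as two small helpers. Function names and exact patterns are illustrative, not the actual facebookMarketplace.js code:

```javascript
// 17-char VIN: letters/digits excluding I, O, Q (never used in real VINs).
const VIN_RE = /\b[A-HJ-NPR-Z0-9]{17}\b/i;

function extractVin(text) {
  const m = (text || "").match(VIN_RE);
  return m ? m[0].toUpperCase() : null;
}

function extractPhoneE164(text) {
  // Common US formats: (510) 616-3508, 510-616-3508, 5106163508
  const m = (text || "").match(/\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})/);
  return m ? `+1${m[1]}${m[2]}${m[3]}` : null; // normalize to E.164
}

console.log(extractVin("2014 Honda Accord, VIN 1HGCR2F3XEA027200")); // "1HGCR2F3XEA027200"
console.log(extractPhoneE164("Call me at (510) 616-3508"));          // "+15106163508"
```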
Apify CL Scraper
Quota Exceeded
src/connectors/apify.js + apify-actor-input.json
  • Normalizer: normalizeApifyLead(item, source, datasetId)
  • Output shape: id (apify-timestamp-rand), source: "apify", channel: "craigslist", full vehicle + seller schema
  • Photo handling: Converts _50x50c.jpg → _1200x900.jpg, deduplicates by image ID hash, strips remaining thumbnails
  • Dealer detection: isDealerPost() checks description for 3+ dealer signals (dealer, auto sales, financing available, etc.)
  • Webhook: POST /api/inbound/listing (line 1374)
  • Run actor: POST /api/admin/run-scraper triggers Apify actor via API
  • Status: Monthly quota exceeded — CL scraping paused until billing resets or plan upgraded
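The dealer-signal counting in isDealerPost() can be sketched as follows — the signal list here extends the three examples named above with plausible entries and is an assumption, as is the helper body:

```javascript
// Count dealer-speak phrases in a listing description; 3+ hits → likely dealer.
const DEALER_SIGNALS = [
  "dealer", "auto sales", "financing available",   // signals named in this doc
  "trade-in welcome", "warranty included", "our inventory", // assumed extras
];

function isDealerPost(description) {
  const text = (description || "").toLowerCase();
  const hits = DEALER_SIGNALS.filter(sig => text.includes(sig)).length;
  return hits >= 3;
}
```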
2. Ingest Layer
How leads enter the database
POST /api/leads + Webhooks
Live
src/dashboard/server.js + src/services/leadStore.js
  • Ingest routes: POST /api/inbound/listing (CL/Apify), POST /api/inbound/facebook (FB/Apify), saveLead() from worker
  • Auth: Webhooks are public (Apify calls them); direct POST /api/leads requires Bearer token
  • Validation: Normalizer outputs checked for required title, missing fields flagged in _needsReview / _missingFromScraper[]
  • Dedup: findLeadByListingUrl(url) prevents duplicate ingest of same listing
  • Thumbnail strip: All CL photos filtered on ingest: !url.includes("50x50") && !url.includes("_50x") && !url.includes("50c.jpg")
  • Ingest order: _ingestOrder: Date.now() for reliable sort (replaces claimedAt ordering)
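The thumbnail-strip predicate above, pulled out as a standalone filter (the wrapper function name is illustrative; the predicate itself is quoted from the ingest step):

```javascript
// Drop CL thumbnail URLs using the same predicate the ingest route applies.
function stripThumbnails(photos) {
  return (photos || []).filter(
    url => !url.includes("50x50") && !url.includes("_50x") && !url.includes("50c.jpg")
  );
}

console.log(stripThumbnails([
  "https://images.craigslist.org/00A0A_abc_50x50c.jpg",   // thumbnail — dropped
  "https://images.craigslist.org/00A0A_abc_1200x900.jpg", // full size — kept
]));
```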
Lead Schema
Canonical
src/services/leadStore.js — makeLeadRecord()
  • id — unique string (vettx-*, apify-*, fb-*)
  • status — pending_evaluation → evaluating → captured_valuation → offer_approved / rejected / stale / archived
  • source — "vettx" | "apify" | "apify_fb"
  • channel — "craigslist" | "facebook"
  • vehicle.{title, vin, year, make, model, mileage, condition, photos[], primaryPhoto, reconNotes}
  • seller.{name, askingPrice, contactInfo.phone, motivation}
  • _listingUrl — original listing URL (NOT lead.url)
  • _expired — true if listing gone (404/410)
  • _vAutoAppraisal — captured vAuto data (NEVER delete leads with this)
  • _vAutoSubmitted — true if sent to vAuto
  • conversation.{messages[], summary, keyFacts, lastMessageAt, messageCount}
  • evaluation.{vAutoResult, suggestedOffer, maxOffer, approved, approvedAt}
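An illustrative skeleton of the canonical record above — the field defaults and factory signature are assumptions, not the actual makeLeadRecord() code:

```javascript
// Build a lead record with the canonical shape described above.
function makeLeadRecord({ id, source, channel, title, askingPrice, listingUrl }) {
  return {
    id,                               // vettx-*, apify-*, fb-*
    status: "pending_evaluation",     // start of the status pipeline
    source,                           // "vettx" | "apify" | "apify_fb"
    channel,                          // "craigslist" | "facebook"
    vehicle: {
      title, vin: null, year: null, make: null, model: null,
      mileage: null, condition: null, photos: [], primaryPhoto: null,
    },
    seller: { name: null, askingPrice, contactInfo: { phone: null }, motivation: null },
    _listingUrl: listingUrl,          // original listing URL (NOT lead.url)
    _expired: false,
    conversation: { messages: [], summary: "", keyFacts: [], lastMessageAt: null, messageCount: 0 },
    evaluation: { vAutoResult: null, suggestedOffer: null, maxOffer: null, approved: false, approvedAt: null },
  };
}
```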
3. Photo Pipeline
Two paths: Lobster (scrape) vs Apify (pre-loaded)
Lobster Path (Scrape)
Live
scripts/photo-lobster.js
  • 1. Fetch: fetchLeads() — GET /api/leads, filter !vehicle.photos.length && _listingUrl && !_expired, sort FB first (URLs expire), then newest
  • 2. Pre-flight: leadNeedsPhotos(leadId) — re-checks live state before opening browser to avoid races
  • 3. Browser: Playwright chromium.launchPersistentContext() with copied Chrome cookies (avoids SingletonLock). Temp profile in /tmp/photo-lobster-*
  • 4a. CL scraper: scrapeCraigslist(page) — grabs images.craigslist.org URLs, upgrades via clFullSize() to _1200x900.jpg
  •   CL URL guard: Only adds URLs matching /\d+_\d+\.jpg/ or containing _1200x900 — rejects icon sprites that share the CL domain
  • 4b. FB scraper: scrapeFacebook(page) — collects fbcdn.net/scontent imgs, dedupes by fbBaseKey(), keeps largest area, upgrades to _o.jpg
  • 4c. Generic: scrapeGeneric(page) — og:image first, then all large <img> tags
  • 5. Filter: looksLikeCarPhoto(url, w, h) — rejects: JUNK_PATTERNS (thumbnails, icons, logos, trackers, SVGs, tiny images), vertical sprites (h/w > 3), min 200x150, square icons
  • 6. Intercept: attachImageInterceptor(page) — captures response bytes in-flight for fbcdn.net and craigslist.org images > 5KB
  • 7. Dimension check: Before Cloudinary upload, image-size validates buffer: rejects h/w > 2 (portrait sprites), < 200x150 (too small), unreadable buffers
  • 8. Upload: Buffer → Cloudinary upload_stream → tolao/leads/{leadId}/photo-{i}.jpg → permanent res.cloudinary.com URLs
  • 9. Patch: patchLead() via API — sets vehicle.photos[] and vehicle.primaryPhoto
  • 10. Safety fence: safePatch() only allows vehicle.photos, vehicle.primaryPhoto, _expired — throws on anything else
  • Expired detection: HTTP 404/410, or page text: "posting has been deleted/expired", "listing is no longer available", "content isn't available"
  • Run: npm run photo-lobster — processes 20 leads per batch, 3–5s polite delay between each
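The safety fence in step 10 can be sketched as an allowlist check. The three permitted fields are from this doc; treating them as top-level dotted patch keys and the function body are assumptions:

```javascript
// Allowlist fence: the photo lobster may only touch these three fields.
const ALLOWED = new Set(["vehicle.photos", "vehicle.primaryPhoto", "_expired"]);

function safePatch(patch) {
  for (const key of Object.keys(patch)) {
    if (!ALLOWED.has(key)) {
      // Refuse the whole patch rather than silently dropping a field.
      throw new Error(`safePatch: refusing to write field "${key}"`);
    }
  }
  return patch; // safe to send to PATCH /api/leads/:id
}
```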
Apify Path (Pre-loaded)
Ready
src/connectors/apify.js + facebookMarketplace.js
  • CL path: normalizeApifyLead() reads item.pics || item.images || item.photos || item.imageUrls
  • CL transforms: _50x50c.jpg → _1200x900.jpg, dedup by image ID hash, strip remaining thumbnails
  • FB path: normalizeFBLead() reads item.images || item.photos || [item.image], slices to 20
  • Result: vehicle.photos[] populated on ingest — no lobster run needed
  • Trade-off: Apify FB photos are CDN URLs that may expire; lobster captures bytes permanently to Cloudinary
  • Hybrid plan: Ingest with Apify photos for immediate display, lobster backfills to Cloudinary later for permanence
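The CL transforms above (thumbnail upgrade, ID-based dedup, thumbnail strip) can be sketched in one pass. The function name and the exact filename regex are illustrative, not the actual apify.js code:

```javascript
// Upgrade CL thumbnails to full size, then dedup by the image ID in the filename.
function upgradeClPhotos(urls) {
  const seen = new Set();
  const out = [];
  for (const url of urls || []) {
    // _50x50c.jpg thumbnails have a _1200x900.jpg full-size variant.
    const full = url.replace(/_50x50c\.jpg$/i, "_1200x900.jpg");
    if (full.includes("50x50")) continue; // strip any remaining thumbnails
    // CL filenames look like 00A0A_abc123_600x450.jpg — the image ID is
    // everything before the trailing WxH size suffix.
    const m = full.match(/\/([^/]+)_\d+x\d+[a-z]?\.jpg$/i);
    const key = m ? m[1] : full;
    if (seen.has(key)) continue; // same image at another size — skip
    seen.add(key);
    out.push(full);
  }
  return out;
}
```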
4. Database
MongoDB Atlas + local fallback
MongoDB Atlas
Live
src/services/db.js + leadStore.js
  • Cluster: cluster0.kemvuwx.mongodb.net
  • Database: tolao
  • Collection: leads — 248 docs (all source: vettx, pre-archive)
  • User: tolaoaii_db_user (password in .keys)
  • IP restriction: Atlas whitelists Railway IPs only — local scripts always fail with "bad auth"
  • Workaround: Always run DB operations through Railway API, never direct from local
  • ORM: Mongoose 9.x with strict: false schema (LeadPersona, Lead)
  • Fallback: data/leads.json used when DB not connected (fileReadStore())
Key Queries
Updated
src/services/leadStore.js
  • getAllLeads(): Lead.find({ status: { $ne: "archived" } }) — filters archived leads from dashboard (updated this session)
  • getPendingLeads(): Lead.find({ status: "pending_evaluation" })
  • findLeadByListingUrl(url): Dedup check on ingest
  • saveLead(record): Upsert by id
  • updateLead(id, patch): Partial update via $set
  • deleteLead(id): Hard delete (never for leads with _vAutoAppraisal)
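updateLead's $set semantics — a partial update with dotted paths that leaves sibling fields intact — can be illustrated in plain JS (Mongoose applies this server-side; this helper is just a sketch of the behavior):

```javascript
// Apply a MongoDB-$set-style patch: dotted paths walk into nested objects,
// untouched fields survive.
function applySet(doc, patch) {
  const out = structuredClone(doc); // don't mutate the original document
  for (const [path, value] of Object.entries(patch)) {
    const keys = path.split(".");
    let node = out;
    for (const k of keys.slice(0, -1)) {
      node[k] ??= {}; // create intermediate objects as needed
      node = node[k];
    }
    node[keys.at(-1)] = value;
  }
  return out;
}
```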
Backfill Scripts
Used
scripts/
  • clear-bad-photos.mjs: Connects to Atlas, finds leads where vehicle.photos[0] matches cloudinary.*photo-0, sets vehicle.photos: [] and primaryPhoto: null
  • fix-photo-thumbnails.mjs: Strips 50x50 thumbnails from existing leads
  • fix-duplicate-photos.mjs: Removes duplicate hero photos across leads
  • expire-dead-leads.mjs: Marks leads with dead listing URLs as _expired: true
  • Note: All scripts must run through Railway API due to Atlas IP restriction
5. Railway Server
Express API + dashboard + webhooks
Server Config
Live
src/dashboard/server.js
  • Framework: Express 5.x (ESM)
  • Auth: Bearer token (never expires), stored in .keys as RAILWAY_TOKEN
  • Cloudinary: Cloud ddcmjicij, falls back to local public/photos/
  • Prod URL: https://dashboard.tolao.co
  • Deploy: Auto-deploy on git push origin main (~60s)
  • WebSocket: /api/voice/stream for Twilio ConversationRelay
Pages
Live
src/dashboard/*.html
  • GET /dashboard — Lead cards, filters, detail panel, lightbox
  • GET /lobsters — Lobster Command Center (automation tasks)
  • GET /roadmap — This page (platform architecture)
  • GET /login — Login form
  • GET / — Redirects to /dashboard
Photo Endpoints
Live
server.js:1797-1865
  • POST /api/photos/download — Downloads CDN photos to local disk
  • POST /api/photos/upload — Multer → Cloudinary upload_stream
  • GET /api/proxy-image — Proxies external image URLs
  • GET /photos/* — Static serve from public/photos/
| Method | Path | Auth | Purpose |
| --- | --- | --- | --- |
| GET | /api/leads | Bearer | All leads (filters archived) |
| GET | /api/leads/:id | Bearer | Single lead detail |
| PATCH | /api/leads/:id | Bearer | Update lead fields |
| DELETE | /api/leads/:id | Bearer | Hard delete lead |
| POST | /api/inbound/listing | Public | Apify CL webhook intake |
| POST | /api/inbound/facebook | Public | Apify FB webhook intake |
| POST | /api/inbound/sms | Public | Twilio inbound SMS |
| POST | /api/inbound/voice | Public | Twilio voice → ConversationRelay |
| POST | /api/inbound/email | Public | SendGrid inbound parse |
| POST | /api/leads/:id/read-plate | Bearer | PlateToVIN plate reader |
| POST | /api/leads/:id/detect-dealer | Bearer | AI dealer detection |
| POST | /api/leads/:id/approve | Bearer | Manager approval → offer_approved |
| POST | /api/leads/:id/capture-contact | Bearer | Save seller contact info |
| POST | /api/vauto/appraise/:id | Bearer | Trigger Playwright vAuto automation |
| POST | /api/vauto/appraisal | Bridge | Bookmarklet captures vAuto values |
| GET | /api/appraisal-queue | Bearer | Leads pending vAuto appraisal |
| POST | /api/admin/run-scraper | Bearer | Trigger Apify actor run |
| POST | /api/admin/backfill-plates | Bearer | Batch PlateToVIN on all leads |
| POST | /api/admin/purge-stale | Bearer | Archive stale/expired leads |
| GET | /api/system/stats | Public | System health + lead counts |
| GET | /api/roadmap/tasks | Public | Roadmap task list |
| POST | /api/outreach/:id/draft | Bearer | AI-generate outreach message |
| POST | /api/outreach/:id/approve | Bearer | Send approved outreach |
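The Bearer-vs-Public split above can be sketched as one Express-style middleware applied only to protected routes; the function and its body are illustrative, not the actual server.js auth code:

```javascript
// Gate protected routes on a static Bearer token; public webhooks skip this.
function requireBearer(token) {
  return (req, res, next) => {
    const header = req.headers.authorization || "";
    if (header !== `Bearer ${token}`) {
      return res.status(401).json({ error: "unauthorized" });
    }
    next(); // authorized — fall through to the route handler
  };
}
```

In Express this would be mounted per-route, e.g. `app.get("/api/leads", requireBearer(TOKEN), handler)`, while `/api/inbound/*` routes are registered without it.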
6. Worker Pipeline
Assess → Communicate → Close
Pipeline
Built
src/services/pipeline.js
  • Flow: processClaim(record, page) — photos → readiness check → appraisal → opening message
  • Triggered by: Worker after successful claim from VETTX
  • Status: Built, tested, not running (waiting for FB lead flow)
Outreach
Partial
src/services/outreachService.js + messagingAI.js
  • SMS: Twilio from +15106163508 — SMS_DRY_RUN=true until approval
  • Email: SendGrid configured (free tier), not yet wired to outreach
  • AI drafts: Claude generates personalized messages via POST /api/outreach/:id/draft
  • Approval flow: Draft → manager review → POST /api/outreach/:id/approve → send
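The SMS_DRY_RUN guard implies a send wrapper along these lines — `sendSms` is an illustrative name (the real Twilio call is `client.messages.create`):

```javascript
// While SMS_DRY_RUN=true, log instead of sending; nothing leaves Twilio.
function sendSms(client, to, body) {
  if (process.env.SMS_DRY_RUN === "true") {
    console.log(`[dry-run] SMS to ${to}: ${body}`);
    return { dryRun: true };
  }
  // Real send path: Twilio Programmable Messaging from the platform number.
  return client.messages.create({ from: "+15106163508", to, body });
}
```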
Services
Available
src/services/
  • PlateToVIN: ~$75 credit, $0.05/lookup — reads license plates from photos
  • VinAudit: Demo key (VA_DEMO_KEY) — misses some VINs, needs paid upgrade
  • Anthropic: Active — listing analysis, dealer detection, message drafts
  • vAuto: Playwright automation via local Chrome CDP on port 9222
7. Next Steps
In priority order
1. Archive 248 VETTX leads — clear the dashboard
Set status: "archived" on all source:vettx leads so the dashboard starts clean for FB leads. getAllLeads() already filters archived.
curl -s "https://dashboard.tolao.co/api/leads" \
  -H "Authorization: Bearer TOKEN" | \
  node --input-type=module -e "
import fs from 'fs';
const data = JSON.parse(fs.readFileSync('/dev/stdin', 'utf8'));
const leads = Array.isArray(data) ? data : (data.leads || []);
const vettx = leads.filter(l => l.source === 'vettx');
console.log('Archiving', vettx.length, 'VETTX leads...');
let ok = 0;
for (const l of vettx) {
  const id = l.id || l._id;
  const r = await fetch('https://dashboard.tolao.co/api/leads/' + id, {
    method: 'PATCH',
    headers: { 'Authorization': 'Bearer TOKEN', 'Content-Type': 'application/json' },
    body: JSON.stringify({ status: 'archived' })
  });
  if (r.ok) ok++;
}
console.log('Done:', ok, 'archived');
"
2. Wire Apify FB scraper
Schedule the curious_coder/facebook-marketplace actor on Apify with fb-actor-input.json config. Set webhook URL to POST /api/inbound/facebook. Leads flow automatically with photos pre-loaded.
3. Run photo-lobster on remaining ~149 leads
Interceptor is now safe (image-size dimension check, CL URL guard, aspect ratio filter). Run npm run photo-lobster in batches of 20 until all leads have permanent Cloudinary photos.
cd ~/Desktop/vettx-bot && node scripts/photo-lobster.js
4. Enable worker pipeline
Once FB leads are flowing with photos: start src/worker.js to run assess → communicate → close pipeline. Requires Chrome with --remote-debugging-port=9222 for vAuto automation.
cd ~/Desktop/vettx-bot && npm start