An automated real estate acquisition platform we built and operated for a client. It processed more than 2 million properties end-to-end: MLS listing ingestion, cross-vendor schema unification, address and entity normalization, deduplication across relist/withdraw/relist cycles, comp-based after-repair value (ARV), repair estimation, margin analysis, and automatic offer generation with contracts attached. It ran on a 2TB+ PostgreSQL store with sub-second queries and used a custom PyTorch neural network for cross-LLM scoring normalization. The client's platform is no longer running live; the engineering behind it (schema unification across 7+ MLS providers, MLS-to-contract automation, address/entity linking, and the cross-LLM scoring network) is the basis for the real-estate-data work we do today.
Automating the entire flow: MLS listing → signed contract with zero human touch
End-to-end pipeline: MLS → LLM → Comps → ARV → Repairs → Margins → Contracts → Email
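To make the stage-by-stage flow concrete, here is a minimal sketch of the Comps → ARV → Repairs → Margins portion as a chain of small functions. Everything in it is illustrative: the `Deal` record, the stage names, the median price-per-square-foot ARV rule, the flat repair rate, and the positive-margin filter are assumptions for the example, not the production logic.

```python
from dataclasses import dataclass, field
from statistics import median
from typing import Callable, List, Optional


@dataclass
class Deal:
    """Hypothetical record carried from stage to stage."""
    listing_id: str
    list_price: float
    sqft: float
    comp_price_per_sqft: List[float] = field(default_factory=list)
    arv: Optional[float] = None
    repairs: Optional[float] = None
    margin: Optional[float] = None


# A stage returns the (updated) deal, or None to drop it from the pipeline.
Stage = Callable[[Deal], Optional[Deal]]


def estimate_arv(deal: Deal) -> Optional[Deal]:
    # Assumed rule: median comp price per sqft times subject sqft.
    if not deal.comp_price_per_sqft:
        return None  # no comps, so no defensible ARV
    deal.arv = median(deal.comp_price_per_sqft) * deal.sqft
    return deal


def estimate_repairs(deal: Deal) -> Optional[Deal]:
    # Placeholder: a flat, assumed repair cost per square foot.
    deal.repairs = 20.0 * deal.sqft
    return deal


def compute_margin(deal: Deal) -> Optional[Deal]:
    deal.margin = deal.arv - deal.list_price - deal.repairs
    # Assumed filter: only positive-margin deals continue to contracts.
    return deal if deal.margin > 0 else None


def run_pipeline(deal: Deal, stages: List[Stage]) -> Optional[Deal]:
    """Run stages in order; stop as soon as one drops the deal."""
    for stage in stages:
        result = stage(deal)
        if result is None:
            return None
        deal = result
    return deal


STAGES: List[Stage] = [estimate_arv, estimate_repairs, compute_margin]
```

Calling `run_pipeline(Deal("A1", 150_000, 1_000, [200, 220, 240]), STAGES)` returns a deal with its `arv`, `repairs`, and `margin` fields filled in, while a listing without comps is dropped at the first stage. Keeping each stage a plain function makes it easy to add contract generation and email as further stages.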
Managing 2TB of calculated field data with sub-second query performance
PostgreSQL tuning for a 2TB+ store: partitioning, composite indexes, and query optimization
Incompatible scoring between LLMs (GPT-4 vs Claude) AND across different prompts
Novel PyTorch neural network mapping LLM scores onto a common scale across models AND prompts (<100ms inference)
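One way such a mapping network could be structured is sketched below: a small MLP that takes a raw score plus learned embeddings for the model and the prompt, and emits a score on a shared 0 to 1 scale. The class name `ScoreCalibrator`, the layer sizes, and the input layout are all assumptions for illustration; the actual architecture is not described here.

```python
import torch
from torch import nn


class ScoreCalibrator(nn.Module):
    """Hypothetical sketch: map (raw score, model, prompt) to a common scale."""

    def __init__(self, n_models: int, n_prompts: int,
                 emb_dim: int = 8, hidden: int = 32):
        super().__init__()
        # Learned vectors let the network absorb per-model and per-prompt bias.
        self.model_emb = nn.Embedding(n_models, emb_dim)
        self.prompt_emb = nn.Embedding(n_prompts, emb_dim)
        self.mlp = nn.Sequential(
            nn.Linear(1 + 2 * emb_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, raw_score: torch.Tensor, model_id: torch.Tensor,
                prompt_id: torch.Tensor) -> torch.Tensor:
        # raw_score: float tensor of shape (batch,)
        # model_id, prompt_id: integer index tensors of shape (batch,)
        features = torch.cat(
            [raw_score.unsqueeze(-1),
             self.model_emb(model_id),
             self.prompt_emb(prompt_id)],
            dim=-1,
        )
        # Sigmoid keeps the normalized score in the 0..1 range.
        return torch.sigmoid(self.mlp(features)).squeeze(-1)


calibrator = ScoreCalibrator(n_models=2, n_prompts=3)
normalized = calibrator(
    torch.tensor([0.2, 0.9]),   # raw scores
    torch.tensor([0, 1]),       # model indices (e.g. 0 = GPT-4, 1 = Claude)
    torch.tensor([2, 0]),       # prompt indices
)
```

A network of this shape would need training on examples where the same property was scored by several model and prompt combinations, for instance with a mean-squared-error loss against a reference score. A model this small is cheap to evaluate, which is consistent with a tight inference budget.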
Handling 7+ different MLS systems with wildly inconsistent data formats
Unified MLS provider abstraction layer with fallback sweep mechanisms
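The abstraction-plus-fallback idea can be sketched as a common provider interface and a sweep that tries providers in priority order. This reads "fallback sweep" as "move to the next provider when one fails or has no record", which is an interpretation; `MLSProvider`, `StaticProvider`, `ProviderError`, and `fetch_with_fallback` are hypothetical names, and the in-memory provider stands in for real vendor clients.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class ProviderError(Exception):
    """Raised when a provider cannot serve the request at all."""


class MLSProvider(ABC):
    """Common interface: every vendor returns the same unified schema."""

    name: str

    @abstractmethod
    def fetch(self, listing_id: str) -> Optional[Dict[str, object]]:
        """Return the listing in unified form, or None if not found."""


class StaticProvider(MLSProvider):
    """Toy in-memory provider standing in for a real vendor client."""

    def __init__(self, name: str, rows: Dict[str, dict],
                 price_key: str, healthy: bool = True):
        self.name = name
        self.rows = rows
        self.price_key = price_key  # vendor-specific field name
        self.healthy = healthy

    def fetch(self, listing_id: str) -> Optional[Dict[str, object]]:
        if not self.healthy:
            raise ProviderError(f"{self.name} unavailable")
        raw = self.rows.get(listing_id)
        if raw is None:
            return None
        # Map the vendor's field name and type onto the unified schema.
        return {
            "listing_id": listing_id,
            "list_price": float(raw[self.price_key]),
            "source": self.name,
        }


def fetch_with_fallback(providers: List[MLSProvider],
                        listing_id: str) -> Optional[Dict[str, object]]:
    """Sweep providers in priority order; skip failures and misses."""
    for provider in providers:
        try:
            listing = provider.fetch(listing_id)
        except ProviderError:
            continue  # provider is down, try the next one
        if listing is not None:
            return listing
    return None
```

With a down primary and a healthy secondary, `fetch_with_fallback` returns the secondary's record already normalized, so downstream stages never see vendor-specific field names.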