Our Approach to Building 'Home Intelligence'
We are building the world’s best model and system for helping homeowners, buyers, sellers, and agents make decisions about home conditions. The final test is simple: compare our results against the alternatives.
This intelligence is currently being built for two distinct products with proven market demand.
Home intelligence agents. Domalytx provides AI-powered consultation for residential property decisions. The system serves home buyers evaluating purchases, sellers preparing listings, real estate agents advising clients, and homeowners managing their properties. Each of these users faces foundation and structural questions that historically required hiring an engineer or trusting unqualified opinions. We’ve proven people will pay for expert-level guidance: 830 customers currently pay $200 per consultation for access to this intelligence. The fine-tuned model replicates the expertise of our licensed engineers across all agentic flows, making that expertise available at scale without requiring a human engineer on every interaction.
AI-first inspection software. The second application is inspection report writing software for the home inspection industry. This runs on an iPhone, recognizes images of foundation conditions, and generates professional write-ups incorporating location context, client information, property history, and all relevant background data. The system pulls from the same fine-tuned intelligence to produce report language that meets professional engineering standards. The home inspection industry lacks software that combines computer vision, contextual awareness, and expert-level report generation. This becomes the world’s best report writing software for the category, and the market for inspection software across the industry is substantial.
Both products depend on the same underlying capability: AI that reasons about residential foundation and structural conditions the way an experienced Professional Engineer does. That capability doesn’t exist in general-purpose models. We have to build it.
The Training Data Problem
Large language models learn from internet-scale text. The available corpus for residential foundation inspection is dominated by contractor marketing content biased toward selling repairs, DIY forum discussions with unverified information and variable competence, home inspector training materials written for certification rather than engineering rigor, and SEO content optimized for traffic rather than accuracy.
What’s missing from the training data is substantial: peer-reviewed residential foundation engineering literature (academia focuses on commercial and infrastructure projects), regional geotechnical knowledge that exists in practice but was never published, proprietary inspection data from engineering firms, and the judgment patterns developed by licensed Professional Engineers across thousands of inspections.
The models learned what was available. The authoritative information for this domain was never on the public internet. This isn’t a limitation of the models themselves. It’s a limitation of what existed in their training corpus.
Model Instability as Business Risk
Foundation model providers regularly modify their models without notice. GPT-4 degraded noticeably within six months of launch. Users documented tasks that worked perfectly beginning to fail. OpenAI initially denied changes, then acknowledged them. The same pattern has repeated with Claude, Gemini, and other frontier models.
For a consumer chatbot, inconsistency is an annoyance. For a business producing structural engineering assessments, it creates concrete operational problems. Prompts that generated correctly formatted reports suddenly produce different output. Terminology shifts without warning. Quality checks that passed in January fail in March with no changes on our end.
AI labs optimize for their priorities: cost reduction, new capabilities, safety guardrails, competitive positioning. Maintaining exact behavior for a specific vertical application is not among those priorities, and there’s no reason to expect it ever will be. Building on third-party foundations means inheriting their decisions. We need control over ours.
Domain Characteristics Favor Stability Over Innovation
Foundation inspection knowledge has specific properties that distinguish it from domains where constant model improvement provides value. The physics of soil mechanics and structural behavior doesn’t change. Failure modes including settlement, heave, lateral movement, and bearing capacity issues are well understood. Building codes evolve slowly and predictably. Expert reasoning patterns have been refined over decades of practice.
We don’t need a model that can write poetry, generate images, or discuss philosophy. We need consistent application of established engineering principles to specific conditions, producing reports that meet professional standards. Model “improvements” introduce variability without corresponding benefit for our use case. A model that does our specific task correctly every time is more valuable than one that does a thousand tasks with unpredictable quality.
Edge Deployment Requirement
Field inspections often occur in areas with limited connectivity. The inspection software must function on-site where internet service is unreliable or unavailable. The system must run on edge devices, essentially on a mobile phone, without dependence on cloud infrastructure. This requires a smaller, purpose-built model rather than API calls to frontier models.
Proprietary Data Assets
The constraint on fine-tuning has shifted. Two years ago, it required dedicated ML teams and significant compute resources. Now the barrier is having the right data. We have it.
Remote consultation transcripts number approximately 500 sessions where an expert engineer worked directly with clients. Each follows a consistent pattern: the client presents their problem, the expert reviews relevant data sources, explains the diagnostic process and reasoning, and provides recommendations with supporting logic. These transcripts capture the full reasoning chain from problem statement to recommendation.
Foundation inspection reports from BEAR Engineering total 7,000 documents representing ground truth for conditions in the Bay Area. These PDF reports can be converted to labeled datasets with structured condition data. The associated images provide training material for computer vision applications in crack identification, settlement patterns, and condition assessment.
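Converting a parsed report into a supervised training record is mechanically simple. The sketch below is illustrative only: the field names, the example observations, and the prompt template are assumptions, not the actual BEAR Engineering schema.

```python
# Hypothetical sketch: one inspection report, already parsed from PDF,
# becomes one prompt/completion training record. All field names and
# example content are illustrative, not the firm's real schema.
report = {
    "observations": "Hairline diagonal cracking at garage slab corner; "
                    "1/4-inch elevation differential across 20 feet.",
    "context": {"city": "Oakland", "year_built": 1942,
                "foundation": "perimeter concrete"},
    "expert_writeup": "Observed differential settlement consistent with "
                      "expansive soil behavior typical of pre-war "
                      "perimeter foundations.",
}

def to_training_record(report):
    """Map a parsed report to a prompt/completion pair for fine-tuning."""
    ctx = report["context"]
    prompt = (
        f"Property: {ctx['city']}, built {ctx['year_built']}, "
        f"{ctx['foundation']} foundation.\n"
        f"Observations: {report['observations']}\n"
        f"Write the condition assessment."
    )
    return {"prompt": prompt, "completion": report["expert_writeup"]}

record = to_training_record(report)
```

Run across all 7,000 reports, this produces one training record per document, with the expert write-up as the target output.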
On-site engineer discussions are recordings of real-time diagnostic processes including observations, questions, and explanations that don’t appear in formal reports. These capture the investigative process as it happens, not just the conclusions.
Internal Slack channel discussions spanning years of engineering team conversations show how experts work through edge cases, debate interpretations, and arrive at conclusions. This collaborative diagnostic reasoning reveals decision-making processes that formal reports don’t surface.
Client email correspondence contains technical explanations, follow-up questions, and clarifications that expand on inspection findings. This material captures how experts communicate complex issues to non-experts and address the specific concerns homeowners and buyers raise.
Proprietary geospatial data aggregated through Domalytx includes foundation cracking patterns by area, settlement data, groundwater conditions, and other location-specific risk factors. This enables predictions based on property location, not just visible conditions.
Age-based condition patterns provide systematic data on conditions by construction era. What is common for a 1920s foundation differs substantially from a 1980s slab. This institutional knowledge is now systematized and available for training.
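The location and era data described above reduce, at their simplest, to condition priors keyed by area and construction era. The sketch below shows the shape of that lookup; the area names, era buckets, and probabilities are placeholders, not real Domalytx data.

```python
# Illustrative only: baseline condition likelihoods keyed by
# (area, construction era). Numbers are placeholders, not real data.
CONDITION_PRIORS = {
    ("oakland_hills", "pre_1940"):  {"differential_settlement": 0.35,
                                     "drainage": 0.40},
    ("oakland_hills", "post_1980"): {"differential_settlement": 0.10,
                                     "drainage": 0.25},
}

def era_bucket(year_built):
    """Coarse construction-era bucket; cutoffs are illustrative."""
    if year_built < 1940:
        return "pre_1940"
    if year_built >= 1980:
        return "post_1980"
    return "mid_century"

def location_priors(area, year_built):
    """Look up baseline condition likelihoods before any visual evidence."""
    return CONDITION_PRIORS.get((area, era_bucket(year_built)), {})

priors = location_priors("oakland_hills", 1925)
```

These priors give the model a location-aware starting point that observed conditions then confirm or override.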
What Fine-Tuning Actually Does
What fine-tuning accomplishes is widely misunderstood, so it is worth being precise. Fine-tuning does not teach a model new facts. Retrieval systems handle factual knowledge. Fine-tuning teaches behavior: how to structure outputs, which terminology to use, what reasoning patterns to apply, what format professional reports should follow.
This distinction matters because it’s exactly what we need. We’re not trying to give the model new information about soil mechanics. We’re trying to make it produce outputs that match the standards of a licensed engineering firm. That means correct terminology used precisely (“differential settlement” rather than “uneven settling”), proper severity classifications based on observed conditions, standardized report sections that meet professional requirements, and reasoning patterns that reflect engineering judgment rather than contractor sales framing.
Published fine-tuning results show format consistency rising above 90% with just a few hundred examples. We have thousands.
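For chat-model fine-tuning, OpenAI’s documented training format is a JSONL file with one `messages` array per line. The example below shows the behavioral target described above: precise engineering terminology in place of contractor phrasing. The specific system prompt and response text are illustrative.

```python
import json

# One fine-tuning example in OpenAI's chat JSONL format. The content is
# illustrative; the point is the behavioral target: engineering
# terminology and next-step framing, not sales framing.
example = {
    "messages": [
        {"role": "system",
         "content": "You write foundation assessments to licensed-PE standards."},
        {"role": "user",
         "content": "The house looks like it's sinking unevenly on one side."},
        {"role": "assistant",
         "content": "Observed conditions are consistent with differential "
                    "settlement. Recommended next step: a floor-level survey "
                    "to quantify the elevation differential before any "
                    "repair scoping."},
    ]
}
line = json.dumps(example)  # one line per example in the .jsonl training file
```

Each consultation transcript and report yields one or more such examples; the assistant turn is always the expert’s actual language.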
System Architecture
The production system will combine multiple approaches, each handling what it does best.
Fine-tuning handles report style and format, terminology consistency, severity classification patterns, and the voice of professional engineering assessment. These are behavioral patterns that should remain stable once established.
Retrieval-augmented generation handles current building codes, jurisdiction-specific requirements, project-specific documents, and anything that changes or requires source citation. RAG grounds responses in verified documentation rather than relying solely on base model knowledge.
Human review handles final quality assurance, edge cases, and professional liability. This isn’t about replacing engineers. It’s about scaling the expertise that already exists within the firm so that every report reflects fifteen years of accumulated knowledge, regardless of which engineer reviews it.
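The division of labor above can be sketched as a pipeline. In this minimal stand-in, the retriever is a keyword lookup over an in-memory dict rather than a vector store, the generator is a formatting stub rather than the fine-tuned model, and the cited code sections are placeholders; only the shape of the flow is the point.

```python
# Architecture sketch with stubbed components. In production the retriever
# would query a vector store of building codes and the generator would call
# the fine-tuned model; both are replaced here with minimal stand-ins.
CODE_SNIPPETS = {
    "crawlspace ventilation": "CRC R408.1: under-floor ventilation openings required.",
    "foundation drainage": "CRC R405.1: drains required around concrete foundations.",
}

def retrieve(query):
    """Keyword lookup standing in for vector retrieval over code documents."""
    return [text for topic, text in CODE_SNIPPETS.items()
            if topic in query.lower()]

def generate(observations, citations):
    """Stand-in for the fine-tuned model; drafts an assessment with citations."""
    cited = " ".join(citations) if citations else "No code citation retrieved."
    return (f"Assessment: {observations}\n"
            f"Code basis: {cited}\n"
            f"Status: PENDING_PE_REVIEW")

draft = generate("Standing water observed; inadequate foundation drainage.",
                 retrieve("foundation drainage requirements"))
```

Every draft exits the pipeline flagged for engineer review; nothing ships without the human step.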
Economics
The economics of fine-tuning have changed dramatically. OpenAI now offers fine-tuning at $3 per million training tokens. A complete proof-of-concept costs under $3,000. Open-source approaches using techniques like LoRA and QLoRA have reduced costs further. FinGPT demonstrated full financial model fine-tuning for under $300.
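The arithmetic behind the sub-$3,000 figure is worth making explicit. The example counts and tokens-per-example below are assumptions for illustration; only the $3-per-million rate comes from the text above.

```python
# Back-of-envelope training cost at $3 per million training tokens.
# Example count, tokens per example, and epoch count are assumptions.
examples = 500            # reports converted to training records
tokens_per_example = 4_000
epochs = 4
price_per_million = 3.00  # dollars per million training tokens

training_tokens = examples * tokens_per_example * epochs  # 8,000,000
cost = training_tokens / 1_000_000 * price_per_million    # dollars
```

Under these assumptions, training tokens alone cost about $24; the rest of the proof-of-concept budget goes to data preparation, evaluation runs, and inference.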
The barrier is no longer budget or technical complexity. It’s having the right data and the domain expertise to validate outputs. Both of those we have.
Development Plan
The approach is structured and testable. First phase: convert several hundred of the best reports into training data, fine-tune on OpenAI’s platform, and measure whether outputs match quality standards more consistently than base models. The consultation transcripts provide natural test cases since they contain both the inputs (client problems, available data) and the outputs (expert recommendations with reasoning).
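One measurable proxy for “matches quality standards more consistently” is format consistency: does a generated report contain every required professional section? A minimal scoring sketch, with illustrative section names rather than the firm’s actual template:

```python
# Minimal evaluation sketch: score format consistency by checking that a
# generated report contains required sections. Section names are
# illustrative, not the firm's actual report template.
REQUIRED_SECTIONS = ["Observations", "Analysis", "Severity", "Recommendations"]

def format_score(report_text):
    """Fraction of required sections present in the generated report."""
    present = sum(1 for s in REQUIRED_SECTIONS if f"{s}:" in report_text)
    return present / len(REQUIRED_SECTIONS)

sample = ("Observations: ...\nAnalysis: ...\n"
          "Severity: Moderate\nRecommendations: ...")
score = format_score(sample)  # run on base vs fine-tuned outputs and compare
```

Averaged over the held-out consultation test cases, this metric gives a direct base-versus-fine-tuned comparison before any human grading.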
If results validate the approach, build the full production pipeline: hybrid architecture combining fine-tuned generation with retrieval-augmented verification, systematic evaluation frameworks to measure accuracy against expert recommendations, and infrastructure for continuous improvement as additional expert examples are added to the training corpus.
The base models provided a starting point. Our data and expertise will get us the rest of the way.

