
Plan a real-estate AI video tour

Paste a Zillow, Redfin, or Realtor.com listing URL. We'll fetch the photos and listing copy, then plan a 10-clip cinematic tour you can review before any video gets generated.

How it works

  1. Scrape the listing (~10s)

    We pull the address, beds/baths/sqft, price, agent contact, marketing remarks, and the full photo set (often 30–50 images) from your listing URL.

  2. Plan all 4 design philosophies (~30s)

    Our AI director proposes 4 different cinematic plans — Modern Drone, Luxury Cinema, Editorial Calm, Energetic Listing — so you can pick the one that fits the property.

  3. Re-stage + color-grade photos (~2 min)

    Our image-staging AI re-stages and color-grades each photo in parallel, using your selected color grade (none / pop / luxe) and staging style (organic, modern farmhouse, minimalist, etc.).

  4. Generate 10 cinematic clips (~3 min)

    Our video model produces 10 transitions between curated first-frame / last-frame photo pairs. We submit them in parallel and silence the audio at the source.

  5. Compose a property-specific score (~30s)

    Our music AI writes a no-vocals instrumental track tuned to your property (address, beds/baths, staging style, motion vibe) so each tour gets a unique soundtrack.

  6. Splice + bake text overlays (~1 min)

    We splice the 10 clips into a single master, bake the address into the opening shot and the agent card into the closing shot (both inside the 9:16 safe area), and render 16:9, 1:1, and 9:16 outputs.

What you get

  • 16:9 web master

    1920×1080 mp4 ready for your website, YouTube, Facebook, and LinkedIn. Includes the spliced clips, the property-specific score, and baked address / agent text overlays.

  • 1:1 Instagram square

    1080×1080 version derived from the same master. Same cuts, same score, cropped from the center so the hero subject stays in frame on Instagram's square feed.

  • 9:16 Reels / TikTok vertical

    608×1080 vertical master with safe-area-aware text. Plug straight into Reels, TikTok, or YouTube Shorts without re-cutting.

  • Per-platform share pack

    Pre-written captions for Web, YouTube, Facebook, LinkedIn, Instagram, Reels, and TikTok — copy-pasteable straight from the published tour page.
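The three output sizes above are consistent with centered crops of a 1920×1080 frame. The sketch below shows that arithmetic only; it is not our render code. The `center_crop` helper is a hypothetical name, and rounding to an even pixel count is an assumption (many video encoders require even dimensions), which would explain why the vertical width is 608 rather than 607.5.

```python
def center_crop(src_w: int, src_h: int, ratio_w: int, ratio_h: int) -> tuple[int, int, int, int]:
    """Return (x, y, width, height) of the largest centered crop with the given aspect ratio."""

    def even(value: float) -> int:
        # Assumption: dimensions are rounded to the nearest even integer.
        return round(value / 2) * 2

    # First try a crop that uses the full source height.
    width = even(src_h * ratio_w / ratio_h)
    height = src_h
    if width > src_w:
        # Target is wider than the source: use the full width instead.
        width = src_w
        height = even(src_w * ratio_h / ratio_w)

    # Center the crop rectangle inside the source frame.
    return ((src_w - width) // 2, (src_h - height) // 2, width, height)


for name, (rw, rh) in {"16:9": (16, 9), "1:1": (1, 1), "9:16": (9, 16)}.items():
    print(name, center_crop(1920, 1080, rw, rh))
```

Under these assumptions, the 9:16 crop is the 608-pixel-wide column starting 656 pixels from the left edge, and the 1:1 crop starts 420 pixels from the left edge.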

Frequently asked questions

How long does a tour take to generate?

Most tours finish in 5–10 minutes. The slowest stage is video clip generation (~3 minutes for 10 parallel clips); photo re-staging and color grading (~2 minutes for 50 photos) is the second-slowest. You will receive an email when the tour is published.
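As a rough check on that range, the per-stage estimates from "How it works" can be added up. This sketch assumes the six stages run one after another with no queueing or retries, which may not match how the pipeline is actually scheduled; the stage keys are illustrative labels.

```python
# Approximate stage durations in seconds, taken from the "How it works" list.
STAGE_SECONDS = {
    "scrape": 10,
    "plan": 30,
    "restage_and_grade": 120,
    "video_clips": 180,
    "score": 30,
    "splice_and_overlays": 60,
}

# Assumption: stages run sequentially, so the total is a simple sum.
total_seconds = sum(STAGE_SECONDS.values())
minutes, seconds = divmod(total_seconds, 60)

# Rank stages from slowest to fastest.
slowest_first = sorted(STAGE_SECONDS, key=STAGE_SECONDS.get, reverse=True)

print(f"Estimated total: {minutes} min {seconds} s")
print("Slowest stages:", slowest_first[:2])
```

The sum comes to a little over 7 minutes, which sits inside the quoted 5–10 minute range.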

What URLs work?

Zillow, Redfin, and Realtor.com listing URLs work out of the box. We extract photos, marketing remarks, and listing facts. If the first pass finds fewer than 10 photos, we fall back to a deeper agentic scrape that loads the lazy-loaded gallery.
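The two rules in this answer (which sites are accepted, and when the deeper scrape kicks in) could be expressed roughly as follows. This is an illustrative sketch, not our scraper: the function names and the exact domain-matching logic are assumptions, and only the three site names and the 10-photo threshold come from the text above.

```python
from urllib.parse import urlparse

# Sites named above; matching on the registered domain is an assumption.
SUPPORTED_DOMAINS = ("zillow.com", "redfin.com", "realtor.com")
MIN_PHOTOS = 10  # below this, fall back to the deeper scrape


def is_supported_listing_url(url: str) -> bool:
    """True if the URL's host is one of the supported sites or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in SUPPORTED_DOMAINS)


def needs_deep_scrape(photo_count: int) -> bool:
    """True if the first pass found too few photos."""
    return photo_count < MIN_PHOTOS


print(is_supported_listing_url("https://www.redfin.com/example-listing"))
print(needs_deep_scrape(7))
```

Matching on `host.endswith("." + d)` rather than a plain substring keeps look-alike hosts such as `notzillow.com` from passing the check.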

How much does a tour cost?

Costs are dominated by video generation. A 60-second 10-clip tour at 1080p costs roughly $2–$4 in inference; the music score is free during preview. We display the estimated cost on the plan review page before you confirm.
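For a feel of what those figures imply per clip, here is the arithmetic as a small sketch. It assumes the whole $2–$4 estimate is spent on video generation and is split evenly across clips, so the results are upper bounds rather than actual pricing.

```python
TOUR_SECONDS = 60
CLIP_COUNT = 10
COST_RANGE_USD = (2.0, 4.0)  # rough inference cost quoted above

# Average clip length if the tour is divided evenly.
clip_seconds = TOUR_SECONDS / CLIP_COUNT

# Assumption: all cost is video generation, shared equally by the clips.
per_clip = tuple(round(cost / CLIP_COUNT, 2) for cost in COST_RANGE_USD)
per_second = tuple(round(cost / TOUR_SECONDS, 3) for cost in COST_RANGE_USD)

print(f"{clip_seconds:.0f} s per clip")
print(f"${per_clip[0]:.2f} to ${per_clip[1]:.2f} per clip")
print(f"${per_second[0]:.3f} to ${per_second[1]:.3f} per second of video")
```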

What aspect ratios are exported?

Every tour ships in three aspect ratios: 16:9 for web, YouTube, Facebook, and LinkedIn; 1:1 for the Instagram square feed; and 9:16 for Reels and TikTok. All three are derived from the same composed master, so the cuts and the score line up frame-for-frame.

Can I customize the music?

The score is generated automatically by our music AI, parameterized by your property (address, room counts, staging style) and your selected design philosophy (Modern Drone produces a punchier track, Luxury Cinema a slower, more restrained one, etc.). Manual music swap is on the roadmap.

Will the AI invent rooms or features?

The pipeline is designed to prevent that. It pins each clip to two specific source photos as the first and last frames and prompts the video model to interpolate motion only. Re-staging removes furniture and re-decorates within the existing architecture; we instruct the model not to invent fixtures or change geometry.
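One way to picture the pinning is as a list of clip plans, each holding exactly two source photos. The sketch below is hypothetical: `ClipPlan`, `plan_clips`, and the prompt wording are illustrative names, and chaining consecutive photos (so each clip starts where the previous one ended) is an assumption about how pairs might be chosen, not a description of the curation step.

```python
from dataclasses import dataclass

# Illustrative prompt text; the real instruction wording is not shown here.
MOTION_ONLY_PROMPT = "Interpolate camera motion only. Do not add fixtures or change geometry."


@dataclass(frozen=True)
class ClipPlan:
    first_frame: str  # source photo used as the clip's first frame
    last_frame: str   # source photo used as the clip's last frame
    prompt: str


def plan_clips(photos: list[str], clip_count: int = 10) -> list[ClipPlan]:
    """Pair consecutive photos so every clip is anchored to two real images."""
    if len(photos) < clip_count + 1:
        raise ValueError(f"need at least {clip_count + 1} photos, got {len(photos)}")
    return [
        ClipPlan(photos[i], photos[i + 1], MOTION_ONLY_PROMPT)
        for i in range(clip_count)
    ]


photos = [f"photo_{i:02d}.jpg" for i in range(11)]
clips = plan_clips(photos)
print(len(clips), clips[0].first_frame, clips[-1].last_frame)
```

Because both endpoints of every clip are real listing photos, anything the model generates is limited to the motion between them.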

What text overlays appear on the video?

Both overlays are optional: the property address (baked into the opening shot) and the listing agent contact card (baked into the closing shot). Text is constrained to the central 9:16 safe-area column so it never clips on a vertical render.