What This Article Includes

In this article, I’ll share everything you need to recreate the Vision Pro–style 3D interactive website I built.

You’ll get:

  • The exact prompt used to generate the Vision Pro–style 3D interactive web application

  • The .gltf 3D product model with animation used in the demo

  • The full source code of the website, including the 3D scene, hand-tracking logic, and UI components

You’ll be able to study, modify, and reuse each part to showcase any product in a high-fidelity, interactive 3D experience.

The purpose of this app is straightforward:

To showcase any product in a high-fidelity 3D view and allow users to interact with it naturally.

Users can rotate, zoom, and explore a product using hand gestures via their webcam, creating a Vision Pro–style interaction that runs directly in the browser.
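To make the gesture interaction concrete, here is a minimal sketch of how webcam hand data can drive rotation and zoom. MediaPipe Hands reports landmark coordinates normalized to [0, 1]; the function names, ranges, and the 0.3 "fully spread" threshold below are illustrative assumptions, not the article's exact implementation.

```typescript
// Illustrative mapping from normalized MediaPipe hand coordinates to model controls.

interface Point2D {
  x: number;
  y: number;
}

// Map the palm's horizontal position to a Y-axis rotation in radians.
// x = 0.5 (screen center) maps to 0; the screen edges map to ±maxAngle.
function palmXToRotationY(x: number, maxAngle = Math.PI / 2): number {
  const clamped = Math.min(1, Math.max(0, x));
  return (clamped - 0.5) * 2 * maxAngle;
}

// Pinch distance between thumb tip and index tip drives zoom:
// fingers together (small distance) => minZoom, spread apart => maxZoom.
function pinchToZoom(
  thumbTip: Point2D,
  indexTip: Point2D,
  minZoom = 1,
  maxZoom = 3
): number {
  const d = Math.hypot(thumbTip.x - indexTip.x, thumbTip.y - indexTip.y);
  const t = Math.min(1, d / 0.3); // assume ~0.3 normalized units ≈ fully spread
  return minZoom + t * (maxZoom - minZoom);
}
```

In the generated app, values like these would be applied to the model each frame, typically with smoothing (e.g. a lerp) so motion feels premium rather than jittery.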

This tutorial includes:

  • The core prompt that generates ~70% of the application

  • A step-by-step workflow for completing the remaining 30%

  • Notes on adding features, fixing bugs, and improving UI

  • A live preview and test 3D model

Live App Preview

Important Note: The “70% Ready” Reality

The prompt below typically generates an application that is approximately 70% complete.

The remaining work is done by:

  • Prompting the AI to fix bugs

  • Adding additional product features

  • Refining UI and performance

This is expected and normal.

The workflow is iterative, not one-shot.

Step 1: Generate the Core Application (Primary Prompt)

This prompt establishes the full architecture: frontend framework, 3D engine, hand tracking, UI, and interaction logic.

Primary Prompt

Role: Expert Creative Technologist and Frontend Developer.

Task: Create a single-page immersive web application that features a high-fidelity 3D model viewer controlled by hand gestures via the webcam.

Design Aesthetic:

Vibe: Similar to igloo.inc or Apple's product pages; minimalist, premium, smooth motion, and highly responsive.

Background: Deep dark grey/black flexible gradient or blurred ambient lights to make the 3D model pop.

Typography: Clean sans-serif fonts (Inter or SF Pro).

Core Tech Stack:

Framework: React (Next.js App Router preferred).

3D Engine: React Three Fiber (R3F) + Drei.

Styling: Tailwind CSS.

Computer Vision: Google MediaPipe Hands (specifically @mediapipe/tasks-vision) or react-webcam with a hand tracking model.

Functional Requirements:

3D Scene:

Initialize a canvas with a realistic environment map (lighting).

Load a placeholder 3D model (a simple geometry for now, but configured to accept a .glb or .gltf file of an Apple Vision Pro later).

The model should float in the center with a gentle idle animation (sine wave hovering).

Webcam & Hand Tracking:

Ask for camera permissions immediately on load with a sleek UI overlay.

Display a small, stylized video feed in the corner (circular mask) so the user can see their hand.

Detect hand landmarks in real-time.

This prompt produces a strong, structured baseline.
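The "gentle idle animation (sine wave hovering)" the prompt asks for reduces to a small piece of math. In React Three Fiber it would run inside a `useFrame` callback; the sketch below isolates that math as a pure function so the motion is easy to reason about. The parameter names and defaults are assumptions.

```typescript
// Vertical hover offset at time t (seconds): oscillates ±amplitude around baseY.
// In R3F you would apply this each frame, e.g.:
//   useFrame(({ clock }) => { mesh.position.y = idleHoverY(clock.elapsedTime); });
function idleHoverY(
  t: number,
  baseY = 0,
  amplitude = 0.1,
  frequency = 0.5 // cycles per second
): number {
  return baseY + amplitude * Math.sin(2 * Math.PI * frequency * t);
}
```

At 0.5 cycles per second the model completes one full bob every two seconds, which reads as "floating" rather than "bouncing."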

Step 2: Prepare the 3D Product Model

The application currently works best with .gltf or .glb files.

You can convert models using tools such as:

  • Blender

  • Cinema 4D

  • Maya

Ensure:

  • Textures are correctly applied

  • Materials are embedded or referenced properly

  • Scale and orientation are correct
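Scale problems are the most common import issue: a model exported in millimeters can appear enormous next to one exported in meters. One way to handle this, sketched below, is to compute a uniform scale factor from the model's bounding box (in three.js, obtainable via `Box3.getSize`) so the largest dimension fits a target size. The function name and defaults are assumptions for illustration.

```typescript
// Compute one uniform scale factor that fits the model's largest bounding-box
// dimension to targetSize. Applying it in three.js would look like:
//   model.scale.setScalar(fitScale(size));
function fitScale(
  size: { x: number; y: number; z: number },
  targetSize = 1
): number {
  const maxDim = Math.max(size.x, size.y, size.z);
  if (maxDim === 0) return 1; // degenerate model: leave scale unchanged
  return targetSize / maxDim;
}
```

Normalizing scale this way means any product model drops into the same viewer without retuning the camera.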

Test 3D Model (Vision Pro)

Step 3: Upload the 3D Model

I used Gemini 3 Pro on emergent.sh, which allows direct upload of the .gltf file into the project.

This avoids external storage links and simplifies iteration.

Step 4: Extend Functionality Through Prompts

Once the core viewer works, additional features can be added conversationally.

Examples:

  • Sliders to control X / Y / Z position

  • UI panels for product parameters

  • Additional camera behaviors

  • Animation controls

These features can usually be added in seconds with clear prompts.
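For the position sliders, the underlying logic is a simple range mapping from the slider's UI value to a world-space coordinate. The sketch below assumes a 0–100 slider and a ±5 world range; both are illustrative choices, not values from the generated app.

```typescript
// Map a 0–100 UI slider value to a world-space coordinate in [min, max],
// clamping out-of-range input. One such mapping per axis (X / Y / Z).
function sliderToPosition(value: number, min = -5, max = 5): number {
  const clamped = Math.min(100, Math.max(0, value));
  return min + (clamped / 100) * (max - min);
}
```

Wiring three of these to the model's `position.x/y/z` gives the X / Y / Z controls described above.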

Step 5: Build a Dedicated Product Environment

Beyond a simple viewer, the AI can generate:

  • Floor planes

  • Soft shadows and contact shadows

  • HDR environment lighting

  • Ambient backgrounds

This transforms the app into a true product showcase environment.

Step 6: Debug and Refine

When issues occur:

  • Paste the error output

  • Or provide a screenshot of the problem

In most cases, the AI can diagnose and fix issues automatically while preserving structure and performance.

Final Notes

This approach shifts development from writing code to designing interactions and experiences.

You define:

  • How the product should look

  • How users should interact with it

  • How it should feel

The AI handles implementation, iteration, and refinement.