Capture Caddy

When AI Meets the Rhythm of Human Creativity

A 2-week exploratory proof of concept testing whether voice-guided AI coaching could help part-time photographers rebuild posing confidence between shoots—and discovering that preparation, not performance, is where AI creates value.

Overview

Capture Caddy is a Custom GPT prototype designed to help part-time photographers regain posing confidence after weeks or months between shoots. When that much time passes, muscle memory fades—and searching Pinterest boards or reference notes mid-session disrupts presence, connection, and momentum.

The prototype explored whether tailored posing ideas, composition guidance, and camera settings—delivered conversationally and hands-free—could support confidence without interrupting the creative rhythm of a shoot.

The Stakes

This was my first applied prompt-engineering project using Custom GPTs. The goal wasn't to ship a product—it was to understand how AI fits into creative work, when humans are cognitively available to learn, and where technical solutions collide with emotional and social dynamics.

The project answered a critical product question: When does AI enhance creative work vs. interrupt it?

QUICK FACTS

Role: Product Manager & AI Practitioner (Solo)

Duration: 2 weeks (Exploratory POC)

Platform: OpenAI Custom GPT

Validation Method: Real-world shoot testing (desk simulation, live outdoor session, pre-shoot rehearsal)

Key Finding: Timing is a product constraint, not a technical one

Pivot: Real-time coaching → Pre-shoot preparation tool

Status: Proof of concept validated core insight about AI timing in creative workflows

My Role

Product Manager & AI Practitioner | Solo Project
Designed, built, tested, and iterated on a 2-week exploratory proof of concept.



What I Did

Strategy & Prototyping

Defined the riskiest assumption: that voice-guided AI could fit into live creative sessions without disrupting flow. Designed Custom GPT behavior for posing guidance, settings recommendations, and tone calibration—prioritizing trust and low cognitive load over technical novelty.

Real-World Testing & Pivot

Conducted live shoot trials in an authentic creative context. Observed latency-driven rapport breaks and attention-switching costs when subjects had to focus on the AI instead of the photographer. Reframed the product from live coaching to a pre-shoot confidence builder (pose packs, reminders, setup walkthroughs) based on testing evidence.

Human-Centered Design

Prioritized hands-free constraints, subject comfort, and psychological safety. Discovered that preparation support amplifies creative flow while live intervention interrupts it—proving where AI adds value vs. creates friction.

Why This Project Mattered to Me

I wanted to test whether AI could fit into live creative workflows or if timing and context fundamentally shape usefulness. This was my chance to practice prompt engineering in real human-centered scenarios, develop evaluation methodology for AI interruption vs. amplification, and learn when preparation beats real-time assistance—even when real-time seems more innovative.

The Problem

This wasn't an abstract design challenge—it was something I experienced firsthand as a part-time photographer.

The Confidence Gap

As a part-time photographer, weeks or months can pass between shoots, and each new session feels like rebuilding muscle memory from scratch.

That gap sets off a predictable chain reaction:

  1. Long time between shoots → Loss of posing fluency

  2. Loss of fluency → Hesitation and awkward pauses during session

  3. Hesitation → Subjects feel unsure or disconnected

  4. Disconnection → Lower-quality photos and fewer usable images

It wasn't a lack of technical skill—it was timing, confidence, and momentum.

Why Existing Solutions Weren't Enough

I already kept posing boards, Pinterest screenshots, and reference notes—but none of them worked in the moment.

Stopping to reference them broke:

  • Eye contact with subjects

  • Emotional presence and connection

  • Natural rhythm between photographer and subject

Portrait photography depends on that connection. Breaking it to look at references undermines the very thing that makes portraits work.

What I Actually Needed

Not just more posing ideas—but a way to quickly get back up to speed, stay confident, and stay connected throughout the session.

Product Hypothesis

Riskiest Assumption

Voice-guided, hands-free posing support can increase photographer confidence during live sessions without disrupting presence or connection.

If True, Then:

  • It would reduce the likelihood of my mind going blank mid-shoot (coaching fallback available)

  • I could stay fully present and emotionally connected with subjects (hands-free = no visual distraction)

  • Sessions would feel smoother, with fewer pauses or momentum breaks

  • Subjects would remain relaxed, engaged, and comfortable

  • Final photos would include more diverse poses and authentic expressions

  • I'd have more strong images to deliver—giving clients better options

  • Ramp-up time between sporadic shoots would shrink, easing pre-shoot anxiety

This confidence → connection → better photos hypothesis became the foundation for building Capture Caddy.

Product Strategy: Build to Learn

Because this was my first time building a GPT assistant, I chose to build and test directly rather than validate through user research upfront.

Why This Approach

  • First time building with Custom GPTs → Hands-on learning was fastest path to understanding

  • Single user (me) → Could test immediately in authentic context without recruitment overhead

  • Exploratory goal → Iteration speed mattered more than validation rigor

  • Domain expertise → 10 years photography experience meant I could evaluate quality without external validation

Tradeoff: Risk of solving my problem rather than a widespread one—but speed-to-insight was the priority for a learning project.

How I Tested

Rather than run simulated scenarios, I evaluated through real-world use:

Test 1: Desk Simulation

  • Confirmed pose quality, clarity, and voice responsiveness

  • Validated technical viability in controlled environment

Test 2: Live Outdoor Shoot

  • Observed latency impact on rapport and connection

  • Measured attention-switching costs in a real creative context

  • Identified where AI interrupted vs. supported flow

Test 3: Pre-Shoot Rehearsal

  • Tested whether practice (not performance) was the right intervention point

  • Evaluated shared rhythm with subject in low-stakes environment

What I Evaluated

Rather than quantitative metrics, I focused on experiential quality:

  • Did it increase posing confidence? (Could I direct more fluidly?)

  • Did it preserve presence and connection? (Did subjects stay engaged?)

  • Did it support creative flow—or interrupt it? (Was momentum maintained?)

These lived tests surfaced the insight that ultimately reshaped the product direction.

Initial Design

Capture Caddy was intentionally simple—a Custom GPT designed to offer posing support without pulling attention away from the subject.

Core Features

1. Personalized Pose Suggestions

The assistant generates poses tailored to real people, environments, lighting, and mood—not generic templates.

Each pose includes:

  • Visual reference or detailed description

  • Step-by-step positioning guidance

  • Expression and interaction prompts

  • Suggested camera settings (lens, aperture, shutter speed)

Design Philosophy: Context-aware suggestions that feel relevant, not algorithmic.

2. Reverse-Engineer Mode

Upload a reference image and Capture Caddy will:

  • Break down composition, framing, and body angles

  • Recommend lens and exposure choices

  • Guide toward recreating the feeling of the image with your own subjects

Use Case: When you see a pose you love but don't know how to direct it.

3. Voice-Guided Interaction

Using ChatGPT's voice mode, photographers can request ideas hands-free:

Commands:

  • "Give me three poses."

  • "Next."

  • "Repeat."

  • "Slow down."

Intent: Keep hands on the camera and eyes on the subject—not on a phone screen.
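As a toy illustration of the hands-free interaction model, the commands above could be routed like this. A real Custom GPT interprets these in natural language, so this sketch (function and intent names are my own) just makes the small, fixed command surface explicit:

```python
# Hypothetical command router mirroring the voice commands listed above.
# Intent names ("suggest_poses", etc.) are illustrative assumptions.
def route_command(utterance: str) -> str:
    """Map a short spoken command to an intent label."""
    u = utterance.lower().strip().rstrip(".")
    if "three poses" in u:
        return "suggest_poses"   # "Give me three poses."
    if u == "next":
        return "advance_pose"    # move to the next suggestion
    if u == "repeat":
        return "repeat_last"     # restate the previous pose
    if u == "slow down":
        return "reduce_pace"     # shorter replies, longer pauses
    return "clarify"             # anything unrecognized
```

Keeping the command set this small is itself a design choice: fewer options means less to remember mid-shoot.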

4. Prompt Architecture

Built through iterative prompt design—no fine-tuning or external data.

Structured for:

  • Clarity (simple, actionable direction)

  • Emotional sensitivity (reading subject comfort)

  • Low cognitive load (minimal decision-making required)

This was a UX-driven exploration, not a technical showcase.
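To make the prompt structure concrete, here is a rough sketch of the kind of instruction block those three priorities might translate into. This is hypothetical, not the actual Capture Caddy prompt, and the length-check helper is just a toy stand-in for the "low cognitive load" constraint:

```python
# Hypothetical instruction block for a posing-coach Custom GPT.
CAPTURE_CADDY_INSTRUCTIONS = """
You are a posing coach for portrait photographers.
- Give ONE pose at a time, in two sentences or fewer (low cognitive load).
- Describe body position first, then expression, then a camera suggestion.
- Use warm, encouraging language; never critique the subject (emotional sensitivity).
- On "next", move on immediately; on "repeat", restate the last pose verbatim.
- On "slow down", shorten responses and pause between steps.
"""

def violates_length_rule(pose_text: str, max_sentences: int = 2) -> bool:
    """Toy check for the two-sentence rule in the instructions above."""
    sentences = [s for s in pose_text.replace("!", ".").split(".") if s.strip()]
    return len(sentences) > max_sentences
```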

Testing in the Real World

Test 1 — The Desk "Vibe Check"

Before heading into the field, I tested the prototype indoors at my desk.

Results:

✅ Pose suggestions were accurate and creative

✅ Lens recommendations made sense for described scenarios

✅ Voice commands responded smoothly

On paper, everything worked.

But confidence in a controlled environment ≠ connection in a real one.

I needed to test where it actually mattered: during a live shoot.

Test 2 — Live Outdoor Shoot (The Failure Point)

I tested Capture Caddy during a portrait session with my daughter.

What Happened:

Latency that felt acceptable indoors became painfully slow outdoors.

Me: "Hi Capture Caddy."
[pause]
[nothing happens]
[I wait, focused on the AI]
Meanwhile: My daughter waits—unsure, disengaged, no longer in the moment.

The Moment of Truth:

While I focused on the AI, she waited. I had broken the unspoken rule of portrait photography:

Never disconnect from your subject.

The tool meant to support confidence was now creating:

  • Hesitation (technical failure undermining trust)

  • Awkwardness (unclear why we were pausing)

  • Emotional distance (broken connection with subject)

The Critical Observation

I wasn't solving the wrong problem—I was solving the right problem at the wrong time.

This insight reshaped the entire product direction.

Understanding the Real Constraint

The problem wasn't technical—it was cognitive and social.

During a portrait session, attention is fully occupied:

  • Reading body language and emotional state

  • Building and maintaining trust

  • Adjusting composition and light in real-time

  • Keeping subjects comfortable and confident

Trying to learn new poses while managing all that is cognitively unrealistic—and emotionally costly.

Real-time AI competes with presence. Learning can't happen during performance.

Test 3 — Practice, Not Performance (The Breakthrough)

To test the reframed idea, I treated Capture Caddy as a pre-shoot rehearsal tool instead of an on-set coach.

What Changed:

  • Set the assistant to speaker mode (no earbuds, no isolation)

  • Practiced poses with my daughter in our living room

  • Let her hear the prompts too—direction felt shared, not mediated

  • Moved slowly, without time pressure or performance stakes

  • Treated it as rehearsal, so perfection didn't matter

The shift was immediate.

What This Revealed

What mattered most wasn't just the shared rhythm—it was the chance to practice in a low-stress environment.

When I entered the real shoot after practicing:

  • I already knew the pose flow

  • Confidence was rebuilt before performance pressure

  • My daughter knew what to expect

  • We were already connected

Core Insight: AI isn't most valuable during creative performance—it's most valuable before it.

Key Insights

This project wasn't about choosing between creative support and human connection—it was about understanding when learning can actually happen.

1. Latency Exposed the Social Fragility

Even brief delays disrupted flow and rapport.
Each pause created uncertainty—for me and for my subject.
What looked like a tech limitation revealed a human one: Creative work depends on unbroken connection, and even small interruptions compound into emotional distance.

2. Learning Can’t Happen During Shoots

During a portrait session, cognitive capacity is maxed out managing:

  • Technical execution (camera settings, composition, light)
  • Emotional presence (reading subjects, building trust)
  • Creative direction (posing, expression, interaction)

There's no bandwidth left for learning new information.
Trying to absorb new poses during a shoot is like trying to learn piano chords during a concert performance.

3. Timing Is a Product Constraint, Not a Technical One

The constraint wasn't AI capability—it was when humans can actually absorb information.
Real-time AI isn't wrong—it's just the wrong moment.
The right moment is preparation, not performance.

4. Core Product Insight

AI should prepare people for flow, not interrupt it.
Confidence begins before the shoot:
Skill refresh → Reduced anxiety → Smoother direction → Better images
Once Capture Caddy shifted into a pre-shoot practice tool, the experience aligned with how photographers—and humans—actually learn.

The Product Decision

The early prototype proved something unexpected: real-time AI wasn't the wrong solution—it was being used at the wrong moment.

Evidence from Live Testing

  • ✅ Latency broke rapport even when prompts were relevant

  • ✅ Divided attention created uncertainty for both photographer and subject

  • ✅ Learning couldn't happen during performance—cognitive load was already maxed

Decision: Pivot Product Direction

From: Real-time on-set coaching

To: Pre-shoot preparation tool

Rationale: Put AI where learning can actually be absorbed, not where it competes with presence.

New Value Proposition

Build confidence before the shoot, not during it.

Redesigned Product Vision

Before the Shoot (New Intended Use)

  • Explore posing ideas without time pressure

  • Practice directing transitions and expressions

  • Build shared rhythm with subjects ahead of time

  • Reduce ramp-up anxiety after long breaks between sessions

  • Create mental muscle memory that activates during the real shoot

During the Shoot

  • No earbuds

  • No voice commands

  • No divided attention

  • Full presence with the subject

After the Shoot

  • Reflect on what worked

  • Refine approach for next session

  • Continue building confidence through iteration

Instead of competing with the moment, Capture Caddy now prepares photographers to enter it confidently.

Key Outcomes

  • Built and tested my first Custom GPT assistant end-to-end (hands-on AI prototyping experience)

  • Identified that timing, not functionality, was the core constraint (product insight applicable beyond photography)

  • Pivoted from live AI coaching → pre-shoot preparation tool (evidence-based product decision)

  • Developed repeatable approach for evaluating AI in human-centered workflows (methodology for future AI projects)

  • Strengthened posing fluency and reduced pre-shoot anxiety through rehearsal (validated that preparation approach works)

  • Clarified foundational principle: AI should prepare people for flow, not interrupt it (design philosophy for AI products)

What This Taught Me About Building AI Products

1. Context Matters More Than Capability

AI performance in isolation ≠ AI usefulness in real workflows.

  • Learning: The best AI doesn't just work technically—it fits where humans can actually use it. Evaluate in authentic contexts, not controlled demos.

  • Application: Test AI products in messy real-world conditions early. Desk tests lie.

2. Timing Is a Product Constraint

The right solution at the wrong moment is the wrong solution.

  • Learning: Understand the user's cognitive and emotional state at the intervention point. Are they receptive to learning, or fully occupied with performance?

  • Application: Map user workflows to identify when they're cognitively available vs. maxed out. Design AI interventions for available moments.

3. Build-to-Learn Accelerates Understanding

Prototyping revealed insights research alone wouldn't have surfaced.

  • Learning: For exploratory AI projects, building and testing in real contexts generates faster, deeper insights than hypothetical user interviews.

  • Application: When learning goals outweigh shipping goals, bias toward rapid prototyping with real-world testing over extensive upfront validation.

4. Honest Iteration Creates Better Products

The "failure" wasn't a dead end—it was a redirect toward the real solution.

  • Learning: Being willing to pivot based on evidence (rather than defending initial assumptions) unlocks breakthrough insights.

  • Application: Design decision gates into testing. Define what evidence would make you pivot vs. double down. Let data rewrite assumptions.

5. Measure Against Human Constraints, Not Technical Ones

Success isn't "does AI work?"—it's "does AI fit where humans can actually use it?"

  • Learning: Evaluate AI products against human cognitive capacity, emotional state, and social context—not just technical performance.

  • Application: Frame success metrics around human experience (confidence, flow, connection) not just AI performance (latency, accuracy).

What I'd Do Differently

If I were to build Capture Caddy as a real product:

1. Validate Problem Scope with 10-15 Photographers First

Why: I solved my problem, but is this widespread?

What I'd Ask:

  • How often do you shoot (frequency matters for confidence gaps)?

  • What happens when weeks/months pass between shoots?

  • How do you currently prep for shoots?

  • Where does confidence break down during sessions?

Impact: Understand if this is "Gloria's problem" or "part-time photographer's problem" before building.

2. Test Multiple Intervention Points Early

Why: I assumed live coaching was the right moment without testing alternatives.

What I'd Test:

  • Pre-shoot preparation (what I eventually discovered)

  • During shoot (what I initially built)

  • Post-shoot review (reflection and improvement)

How: Run parallel tests of all three timing approaches in week 1.

Impact: Discover optimal intervention point faster, avoid wasted development on wrong timing.

3. Build Decision Gates into Testing

Why: I iterated organically without clear pivot criteria.

What I'd Define Upfront:

  • What results would make me double down on live coaching?

  • What findings would trigger exploration of alternative timings?

  • What evidence would suggest stopping entirely?

Example Gates:

  • If latency < 2 seconds AND subjects stay engaged → keep live coaching

  • If subjects disengage during waits → test pre-shoot preparation

  • If neither timing works → stop and reconsider whether AI is right approach

Impact: Faster, more confident decisions with less attachment to initial assumptions.
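The example gates above can be encoded as a simple decision function. This is a sketch only; the function name, argument names, and the 2-second threshold are illustrative, taken from the example gates rather than any real test harness:

```python
# Hypothetical decision-gate function for the criteria listed above.
def next_step(latency_s: float, subjects_engaged: bool,
              prep_tested: bool = False, prep_worked: bool = False) -> str:
    """Return the product direction implied by the example gates."""
    if latency_s < 2.0 and subjects_engaged:
        return "double down: keep live coaching"
    if not prep_tested:
        return "pivot: test pre-shoot preparation"
    if prep_worked:
        return "pivot: build the pre-shoot preparation tool"
    return "stop: reconsider whether AI is the right approach"
```

Writing the gates down this explicitly, even informally, is the point: the criteria exist before the test, so the result forces a decision instead of a rationalization.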

4. Define Clearer Success Metrics Upfront

Why: I evaluated experientially, which was appropriate for exploration—but lacked quantifiable targets.

What I'd Measure:

Primary Metrics:

  • Photographer confidence (1-10 scale, before/after)

  • Posing fluency (# poses directed smoothly per 30 min)

  • Subject comfort (measured via post-shoot survey)

Secondary Metrics:

  • Ramp-up time (minutes until first strong shot)

  • Image quality (# strong images delivered per session)

  • Usable variety (# different pose types per session)

Impact: Clearer evidence for product decisions; easier to communicate value to stakeholders.
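A minimal sketch of how the primary metrics above could be computed from simple session logs. The `Session` fields are assumptions about what would get recorded, not an existing schema:

```python
from dataclasses import dataclass

@dataclass
class Session:
    """Hypothetical per-shoot log used to compute the metrics above."""
    confidence_before: int   # 1-10 self-rating before the shoot
    confidence_after: int    # 1-10 self-rating after the shoot
    poses_directed: int      # poses directed smoothly during the session
    minutes: float           # session length in minutes

def confidence_delta(s: Session) -> int:
    """Primary metric: before/after confidence change."""
    return s.confidence_after - s.confidence_before

def posing_fluency(s: Session) -> float:
    """Primary metric: poses directed smoothly, normalized per 30 minutes."""
    return s.poses_directed / s.minutes * 30
```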

5. Consider Business Model Implications

Why: Built as learning project without considering monetization or distribution.

What I'd Explore:

Pricing Models:

  • Subscription ($10-20/month for part-time photographers)

  • Freemium (basic poses free, advanced features paid)

  • Integration (partner with photography education platforms)

Distribution:

  • Standalone GPT (current approach)

  • Mobile app (better for on-location use)

  • Integration with existing photography tools (e.g., Adobe Lightroom)

Market Sizing:

  • How many part-time photographers exist?

  • What's their willingness to pay for confidence tools?

  • What alternatives do they currently use (and at what cost)?

Impact: Understand if there's viable business opportunity, or if this should remain a free community tool.

If I Were to Productionize This

Phase 1: Validate Problem Broadly (Month 1)

Goal: Confirm this problem exists beyond me

Activities:

  • Interview 15-20 part-time photographers

  • Survey broader photography communities (Reddit, forums)

  • Identify common confidence gaps and current solutions

  • Test willingness to pay

Success Criteria: 60%+ of part-time photographers report confidence gaps after breaks; no existing solution addresses it well

Phase 2: Build Pre-Shoot Preparation MVP (Month 2-3)

Goal: Validate that preparation approach solves the confidence problem

Features:

  • Pose library by scenario (family, couples, portraits, etc.)

  • Practice mode with voice guidance

  • Camera settings recommendations

  • Simple session prep checklist

What's NOT Included:

  • Live coaching (already know it doesn't work)

  • Advanced editing features

  • Social sharing

  • Complex customization

Success Metrics:

  • 70%+ report increased confidence after using prep tool

  • 50%+ use it before majority of shoots (stickiness)

  • 4+ rating on usefulness (1-5 scale)

Phase 3: Iterate Based on Usage (Month 4-6)

Goal: Refine based on real usage patterns

Instrumentation:

  • Which pose categories used most?

  • How long do practice sessions last?

  • What features are ignored?

  • Where do users drop off?

Potential Additions (data-driven):

  • Pre-built pose sequences for specific scenarios

  • Integration with calendar (reminder to practice before scheduled shoot)

  • Subject-side experience (practice together before real shoot)

Success Metrics

Primary:

  • Photographer confidence increase (before/after scale)

  • Adoption rate among part-time photographers (% who use regularly)

  • Session quality improvement (more strong images delivered)

Secondary:

  • Time to first strong shot (ramp-up speed)

  • Pose variety per session (creative range)

  • Subject satisfaction (comfort and enjoyment)

Long-term:

  • Retention (do photographers keep using it?)

  • Word-of-mouth growth (do they recommend it?)

  • Business viability (if monetized, does revenue sustain development?)

Final Reflection

What This Project Taught Me

This POC wasn't about shipping a product—it was about understanding where AI belongs in creative work.

The insight that emerged—AI should prepare people for flow, not interrupt it—applies far beyond photography:

  • Coaching and mentorship: Prep materials before sessions, not during

  • Presentations: Practice support before stage, not real-time prompts

  • Creative work: Reference and inspiration before execution, not mid-flow

  • Complex tasks: Learning and skill-building separate from performance

The Bigger Principle

Technology shouldn't replace human connection—it should protect and prepare it.

The best AI products don't make experts unnecessary—they help people become more expert themselves.

What I Learned About My Own Product Approach

This project reinforced beliefs that now guide how I build AI products:

1. Start with Real Problems, Not Technology

I had AI capability (Custom GPT) and looked for problems it could solve. Better: start with user problems and evaluate if AI is the right solution.

2. Test in Authentic Context Early

Desk tests validated nothing meaningful. Real-world testing in 20 minutes revealed what hours of simulation couldn't.

3. Be Willing to Kill Your Darlings

The voice-guided live coaching idea was clever—but wrong. Pivoting required letting go of what I thought was cool in favor of what actually worked.

4. Cognitive and Social Constraints Matter More Than Technical Ones

The limitation wasn't AI capability—it was human capacity. Great products respect human constraints, not just technical possibilities.

5. Learning Projects Have Different Success Criteria

This wasn't about user adoption or revenue—it was about understanding AI's role in creative workflows. That clarity is the foundation for better future product decisions.
