Search Ai
Sole UX/UI designer for DIRECTV's conversational AI search experience. Targeting 85%+ task success and content access in under 30 seconds.
"Nobody says 'I'd like to watch a prestige drama with ensemble casting and themes of moral ambiguity.' They say 'something like The Bear but less stressful.' That's a real search query. The old system couldn't touch it."
The Problem
Traditional DIRECTV search was an exact-match system living in a fuzzy-intent world. It required users to know precisely what they wanted and spell it correctly — which is rarely how people decide what to watch.
People search by mood, by association, by half-remembered recommendations. Every time the system failed to meet them there, it was a small door closing. Multiply that across millions of sessions and you see it not as a UX inconvenience but as a strategic problem. We set out to design a conversational AI experience that could close that gap without introducing new ones.
My Role
I was the sole UX/UI designer on the project, leading the full conversational experience: flow design, response structure, interaction model, error states, and guardrail definition. I partnered closely with an IX designer throughout, and was supported by a UX lead, UI lead, lead content designer, lead researcher, and our UX directors. We've completed two rounds of usability testing ahead of the 2027 launch. Both rounds changed something meaningful about how the system behaves.
The Strengths Worth Naming
The system is built on a rich foundation of personalization. DIRECTV's app already knows a lot about you — what you watch, where you left off, which external apps you've connected, your favorite teams and channels. The AI draws on all of that context to make recommendations feel relevant rather than random.
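As a rough illustration only (not the production schema; every name here is hypothetical), the assistant's input can be pictured as a viewer-context object handed to the ranker alongside the query:

```typescript
// Hypothetical sketch of the viewer context an assistant like this might
// consume. Field names are illustrative, not DIRECTV's actual schema.
interface ViewerContext {
  watchHistory: string[];     // recently watched titles
  continueWatching: string[]; // titles with saved playback positions
  connectedApps: string[];    // linked external SVOD services
  favoriteTeams: string[];    // e.g. ["Cowboys"]
  favoriteChannels: string[]; // e.g. ["ESPN"]
}

// Count how many of the viewer's signals a candidate title matches.
function score(title: string, signals: Set<string>): number {
  const t = title.toLowerCase();
  let hits = 0;
  for (const s of signals) {
    if (t.includes(s)) hits += 1;
  }
  return hits;
}

// A ranker that sees this context can weight results toward what the
// viewer already cares about instead of ranking on popularity alone.
function personalize(results: string[], ctx: ViewerContext): string[] {
  const signals = new Set(
    [...ctx.watchHistory, ...ctx.favoriteTeams, ...ctx.favoriteChannels].map(
      (s) => s.toLowerCase(),
    ),
  );
  return [...results].sort((a, b) => score(b, signals) - score(a, signals));
}
```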
The real tradeoff is speed versus accuracy. Existing search generates results instantaneously as you type or speak. The AI assistant may be marginally slower, but it's doing more: interpreting intent, weighing context, ranking results by relevance. The bet is that slightly slower and meaningfully more accurate beats instant and often wrong. Relevance scores of 6.8/7 and 6.6/7 across our two rounds of testing suggest that bet is paying off.
Hallucination risk, while a concern in AI broadly, is significantly mitigated. TiVo search has ingested DIRECTV's full library plus partnered SVOD services, which means the system only ever surfaces content that actually exists and is available.
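A minimal sketch of that grounding pattern, assuming a simple in-memory catalog index; the names and titles here are illustrative, not the actual TiVo integration:

```typescript
// Illustrative grounding step: whatever titles the model proposes, only
// those present in the ingested catalog index are ever shown to the user.
// The catalog contents and function names are assumptions, not a real API.
const catalog = new Set<string>(["The Bear", "Chef's Table", "The Menu"]);

function groundResults(modelSuggestions: string[]): string[] {
  // Drop anything the model "imagined" that isn't in the licensed library.
  return modelSuggestions.filter((title) => catalog.has(title));
}

// A hallucinated title like "The Bear: Chicago Nights" is filtered out
// before it can reach the results screen.
console.log(groundResults(["The Bear", "The Bear: Chicago Nights"]));
// -> ["The Bear"]
```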
Test 1 — November 2025
Five tasks across ten participants. Three tasks reached 100% completion. One reached 90%. One reached 80%. Task success was strong — but the details told us exactly where to focus.
Voice was dominant by a wide margin. Seven out of ten explicitly preferred speaking; six out of ten chose the mic even when typing was equally available. The on-screen keyboard worked but felt slow: five out of ten disliked it, which was unsurprising in a lean-back environment.
Search relevance scored 6.8/7. But participants wanted more context around results, especially for sports — dates, opponents, channels, spoiler controls. Relevant wasn't quite enough. They wanted results that felt intelligent about what they were actually trying to do.
Personalization landed well when it felt earned. Eight out of ten valued the recommendations carousel when it felt tailored. Three out of ten said they'd rather it not exist if recommendations felt generic. A feature that doesn't feel personal is worse than no feature at all.
Eighty percent said they'd be open to replacing current search entirely.
The Autofill Problem — and How We Fixed It
The biggest friction point was autofill. Every participant who used it experienced the same issue: selecting a suggestion immediately submitted the search when they expected it to keep building their query. They were mid-thought. The system finished the sentence for them and moved on.
This revealed a fundamental mismatch. Users wanted a collaborator, not a shortcut.
For Test 2, I redesigned autofill as a prompt constructor. Selecting a suggestion now appends it to the input field rather than submitting — giving users the ability to keep refining, keep thinking. That single change, born from watching people feel interrupted by a product trying to help them, is one of the clearest examples I have of why testing isn't just validation. It's discovery.
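In interaction terms, the change moves suggestion selection from a submit action to an append action. A minimal sketch, with hypothetical handler names, of the behavior before and after:

```typescript
// Sketch of the Test 1 vs Test 2 autofill behavior; names are illustrative.
interface SearchInput {
  value: string;
  submit(query: string): void;
}

// Test 1 behavior: selecting a suggestion fired the search immediately,
// cutting the user off mid-thought.
function onSelectSuggestionV1(input: SearchInput, suggestion: string): void {
  input.submit(suggestion);
}

// Test 2 behavior (prompt constructor): selecting a suggestion appends it
// to the query so the user can keep refining before searching.
function onSelectSuggestionV2(input: SearchInput, suggestion: string): void {
  const current = input.value.trim();
  input.value = current.length ? `${current} ${suggestion}` : suggestion;
  // No submit here; the user stays in control of when the search runs.
}
```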
A Design Disagreement Worth Naming
Partial query display — the behavior of shortening long voice queries on screen after submission — generated internal debate. Most participants were comfortable with it. A minority wanted full visibility to verify or edit their original question.
The lead UX designer felt this was significant enough to address directly. My position was different: what matters most is whether the results reflect what the user intended. If the AI understood the request correctly, the outcome is the verification. If it didn't, the user course-corrects through the next query. We designed accordingly — prioritizing result clarity over query transparency — and Test 2 supported that direction.
Test 2 — February 2026
Improvement where it mattered most. Three tasks hit 100% completion. One reached 90%. The scrolling task came in at 60%, echoing the scroll discoverability friction flagged in Test 1. Search relevance came in at 6.6/7, essentially steady against Test 1's 6.8, with 60% of participants emphasizing top-result prioritization for live and time-sensitive content.
Voice gains continued: eight out of ten preferred voice, up from seven. The prompt constructor fix resolved autofill misfires entirely.
Eighty percent said they would prefer or be comfortable replacing current search with this assistant — holding steady from Test 1. That number isn't a task success metric. It's a signal that people want to live with this thing.
Three Findings, Three Outcomes
Not every research finding becomes a design change. Part of this work is knowing when to iterate, when to hold, and when to trust that the real environment will behave differently than the test environment.
Scroll discoverability was flagged across both rounds; four out of ten participants in Test 2 struggled to find chat history. After review, our UX and IX directors made the call to ship as designed. Participants tested in a desktop browser; on a TV screen, spatial transitions and the peek affordance of previous results are more pronounced. The real environment is expected to resolve what the test environment surfaced.
The recommendations carousel discomfort, raised by three out of ten, also shipped as designed. Carousels exist throughout DIRECTV already. What made this one feel different wasn't the behavior. It was the AI context. At a moment in culture when AI is everywhere and not everyone finds that comforting, we noted it, watched it, and made a considered decision to hold rather than react to a sentiment that may normalize as the product matures.
On naming: Search Ai emerged as the clear winner. It won on transparency — users understood what it was and what it did, which in an AI product is exactly the trust signal worth leading with.
Impact
The system targets a task success rate above 85%, a Customer Effort Score above 6, and time-to-content under 30 seconds. The legacy experience averaged 12 minutes to find and begin playing content; 12 minutes is 720 seconds, so a 30-second target is 24 times faster. Not a feature improvement, but a different relationship with the product entirely.
The most interesting design problems in AI aren't about capability. They're about restraint. Knowing when not to talk, when not to guess, when to wait for the user to finish their thought. The goal was never a chatbot that felt impressive. It was a quiet, accurate thing that got out of the way and let people watch television.