Evaluating a Generative AI Assistant: Usability, Trust, and Adoption
Company: DIRECTV
Role: Senior UX Researcher (Project Lead)
Timeline: 2025-2026
Team: Product, Design
Method: Two-phase iterative moderated evaluation study
Participants: 20 DIRECTV Stream users total (10 per phase)
Platform: PC prototype of the DIRECTV Stream Gen AI Assistant
1. Problem & Context
DIRECTV was developing a Generative AI Assistant to replace its traditional search experience on the DIRECTV Stream platform. The assistant was designed to support voice and text-based search, deliver AI-generated content recommendations, and respond to natural language queries including subscription help.
The team needed rigorous user feedback at two points in the design cycle to de-risk the experience before continued development. I led a two-phase iterative moderated evaluation to assess user perceptions, task performance, and trust across the assistant's search, input, and response experience — and to identify design opportunities to improve clarity, usability, and adoption readiness.

2. Research Objectives & Key Questions
- How easily can users search and navigate results using keyboard, voice, and prompt-based inputs?
- Do recommendations, search history, and results organization meet user expectations and support efficient decision-making?
- How do users perceive the assistant's tone, copy, and communication style, including branding and naming?
- Are users willing to replace traditional search with the Gen AI Assistant, and what factors influence trust across varying levels of AI familiarity?
Phase 2 carried an additional objective: to evaluate whether design updates from Phase 1 had meaningfully improved usability, clarity, and trust.
3. Method & Rationale
We conducted two rounds of moderated prototype evaluation studies, each with 10 DIRECTV Stream customers, using a PC-based prototype. Moderated sessions enabled direct observation of user behavior and nuanced exploration of trust and adoption attitudes — critical for an AI-powered product where user mental models vary significantly. The iterative design allowed Phase 1 findings to directly inform design refinements tested in Phase 2, compressing the feedback loop within an active development window.
Participants across both phases represented a mix of device types, age groups, search frequencies, and AI familiarity levels.
4. Execution & Logistics
- Phase 1 (December 2025)
  - 10 participants completed five tasks covering content search, voice input, subscription help, search history navigation, and new search initiation
  - Collected current usage patterns, task performance data, ease of use and relevance ratings, and user preferences
- Phase 2 (February 2026)
  - A fresh sample of 10 participants completed the same five tasks to enable direct cross-phase comparison
  - Evaluated updated keyboard, voice input, autosuggest, recommendations carousel, scroll-up history, and overall UI changes
5. Synthesis Process
I synthesized moderated session observations, post-task scale ratings, and open-ended feedback across both phases into themes organized around input usability, content and UI expectations, communication style, and trust and adoption. For Phase 2, I conducted a direct cross-phase comparison to identify which design changes had improved the experience, which issues persisted, and where new opportunities had emerged.
Prototype Comparison — Phase 1 vs. Phase 2 AI Assistant Landing Screen
Phase 1: [screenshot redacted]
Phase 2: [screenshot redacted]
6. Key Findings
Phase 1 — Strong foundations, with meaningful friction points
- Voice was dominant: 70% preferred speaking over typing, and 60% chose the mic when input was optional. The on-screen keyboard created friction for half of participants, and autofill caused unintended search submissions.
- Task success was high (80–100%), but scroll-up history was not discoverable: 80% missed it initially, yielding the lowest ease score across tasks (4.9/7).
- Search relevance was consistently strong (6.7–6.9/7), but participants wanted richer sports context. The recommendations carousel was valued only when perceived as personalized.
- Trust was sufficient for TV search, but 50% reacted negatively to the "AI Assistant" label, preferring simpler or DIRECTV-branded naming.
Phase 2 — Improved adoption readiness, persistent navigation challenge
- Adoption sentiment strengthened: 80% said they would prefer or be comfortable replacing the current search, citing voice and "how-to" capabilities as key differentiators.
- Voice preference increased to 80%, with Tasks 1–3 achieving 100% completion and strong ease scores (6.6–6.7/7).
- Scroll-up history remained the single highest-priority friction point: Task 4 success dropped to 60% and ease fell to 4.0/7, as the interaction model remained non-obvious despite design updates.
- Branding resistance softened: naming preferences distributed across SearchAI (50%), Search+ (30%), and Search (20%), indicating reduced friction with AI-adjacent branding.
- A sharper expectation emerged: 50% requested clickable deep links within instructional responses rather than manually navigating multi-step paths.
Phase 1 vs. Phase 2 Task Success Comparison
Phase 1: [chart redacted]
Phase 2: [chart redacted]

7. Recommendations
- Prioritize scroll-up history discoverability through visible affordances, first-time-use education, or a dedicated navigation control.
- Add direct, clickable deep links within instructional AI responses to close the gap between information and action.
- Implement transparent, user-controlled personalization settings to address tracking concerns while preserving recommendation value.
- Prioritize live and time-sensitive content at the top of search results to align with user intent and increase perceived AI intelligence.
- Adopt a functional, DIRECTV-branded naming convention to reduce AI skepticism and reinforce utility.
8. Impact
- Delivered two rounds of iterative research that directly informed design refinements between phases and compressed the feedback loop within an active development window.
- Documented a meaningful increase in adoption sentiment across phases, with 80% of Phase 2 participants actively preferring or comfortable replacing the current search experience.
- Identified scroll-up history discoverability as the highest-priority friction point requiring resolution before launch readiness.
- Translated findings into prioritized How Might We opportunities that guided continued product development and de-risked the experience ahead of release.
9. Confidentiality Note
All visuals and product references in this case study have been anonymized and redacted to respect company confidentiality.