Evaluating a Generative AI Assistant: Usability, Trust, and Adoption
Company: DIRECTV
Role: Senior UX Researcher (Project Lead)
Timeline: 2025-2026
Team: Product, Design
Method: Two-phase iterative moderated evaluation study
Participants: 20 DIRECTV Stream users total (10 per phase)
Platform: PC prototype of the DIRECTV Stream Gen AI Assistant
1. Problem & Context
DIRECTV was developing a Generative AI Assistant to replace its traditional search experience on the DIRECTV Stream platform. The assistant was designed to support voice and text-based search, deliver AI-generated content recommendations, and respond to natural language queries including subscription help.
The team needed rigorous user feedback at two points in the design cycle to de-risk the experience before continued development. I led a two-phase iterative moderated evaluation to assess user perceptions, task performance, and trust across the assistant's search, input, and response experience — and to identify design opportunities to improve clarity, usability, and adoption readiness.

2. Research Objectives & Key Questions
- How easily can users search and navigate results using keyboard, voice, and prompt-based inputs?
- Do recommendations, search history, and results organization meet user expectations and support efficient decision-making?
- How do users perceive the assistant's tone, copy, and communication style, including branding and naming?
- Are users willing to replace traditional search with the Gen AI Assistant, and what factors influence trust across varying levels of AI familiarity?
Phase 2 carried an additional objective: to evaluate whether design updates from Phase 1 had meaningfully improved usability, clarity, and trust.
3. Method & Rationale
We conducted two rounds of moderated prototype evaluation studies, each with 10 DIRECTV Stream customers, using a PC-based prototype. Moderated sessions enabled direct observation of user behavior and nuanced exploration of trust and adoption attitudes — critical for an AI-powered product where user mental models vary significantly. The iterative design allowed Phase 1 findings to directly inform design refinements tested in Phase 2, compressing the feedback loop within an active development window.
Participants across both phases represented a mix of device types, age groups, search frequencies, and AI familiarity levels.
4. Execution & Logistics
- Phase 1 (December 2025)
  - 10 participants completed five tasks covering content search, voice input, subscription help, search history navigation, and new search initiation
  - Collected current usage patterns, task performance data, ease of use and relevance ratings, and user preferences
- Phase 2 (February 2026)
  - A fresh sample of 10 participants completed the same five tasks to enable direct cross-phase comparison
  - Evaluated updated keyboard, voice input, autosuggest, recommendations carousel, scroll-up history, and overall UI changes
5. Synthesis Process
I synthesized moderated session observations, post-task scale ratings, and open-ended feedback across both phases into themes organized around input usability, content and UI expectations, communication style, and trust and adoption. For Phase 2, I conducted a direct cross-phase comparison to identify which design changes had improved the experience, which issues persisted, and where new opportunities had emerged.
Prototype Comparison — Phase 1 vs. Phase 2 AI Assistant Landing Screen
Phase 1: [screenshot redacted]
Phase 2: [screenshot redacted]
6. Key Findings
Phase 1 — Strong foundations, with meaningful friction points
- Voice was dominant: 70% preferred speaking over typing, and 60% chose the mic when input was optional. The on-screen keyboard created friction for half of participants, and autofill caused unintended search submissions.
- Task success was high (80–100%), but scroll-up history was not discoverable: 80% missed it initially, yielding the lowest ease score across tasks (4.9/7).
- Search relevance was consistently strong (6.7–6.9/7), but participants wanted richer sports context. The recommendations carousel was valued only when perceived as personalized.
- Trust was sufficient for TV search, but 50% reacted negatively to the "AI Assistant" label, preferring simpler or DIRECTV-branded naming.
Phase 2 — Improved adoption readiness, persistent navigation challenge
- Adoption sentiment strengthened: 80% said they would prefer or be comfortable replacing the current search, citing voice and "how-to" capabilities as key differentiators.
- Voice preference increased to 80%, with Tasks 1–3 achieving 100% completion and strong ease scores (6.6–6.7/7).
- Scroll-up history remained the single highest-priority friction point: Task 4 success dropped to 60% and ease fell to 4.0/7, as the interaction model remained non-obvious despite design updates.
- Branding resistance softened: naming preferences distributed across SearchAI (50%), Search+ (30%), and Search (20%), indicating reduced friction with AI-adjacent branding.
- A sharper expectation emerged: 50% requested clickable deep links within instructional responses rather than manually navigating multi-step paths.
Phase 1 vs. Phase 2 Task Success Comparison
Phase 1: [chart redacted]
Phase 2: [chart redacted]

7. Recommendations
- Prioritize scroll-up history discoverability through visible affordances, first-time-use education, or a dedicated navigation control.
- Add direct, clickable deep links within instructional AI responses to close the gap between information and action.
- Implement transparent, user-controlled personalization settings to address tracking concerns while preserving recommendation value.
- Prioritize live and time-sensitive content at the top of search results to align with user intent and increase perceived AI intelligence.
- Adopt a functional, DIRECTV-branded naming convention to reduce AI skepticism and reinforce utility.
8. Impact
- Delivered two rounds of iterative research that directly informed design refinements between phases and compressed the feedback loop within an active development window.
- Documented a meaningful increase in adoption sentiment across phases, with 80% of Phase 2 participants actively preferring or comfortable replacing the current search experience.
- Identified scroll-up history discoverability as the highest-priority friction point requiring resolution before launch readiness.
- Translated findings into prioritized How Might We opportunities that guided continued product development and de-risked the experience ahead of release.
9. Confidentiality Note
All visuals and product references in this case study have been anonymized and redacted to respect company confidentiality.