Risk Register -- AI Support Copilot

Engagement: AI Support Copilot Pilot
Owner: Shivani (PM), with risk owners per item
Version: 1.0
Date: 2026-05-01
Framework ref: Doc 03, Section 6

This is a living document. Updated weekly by the PM, reviewed in every sprint demo, and visible to the client. Any POD member can raise a new risk at any time -- no prior approval needed.


Scoring Guide

| Level | Likelihood | Impact |
| --- | --- | --- |
| High | > 60% chance of occurring | Blocks milestone or delivery; requires scope change |
| Medium | 30-60% chance | Delays timeline or degrades quality; recoverable with effort |
| Low | < 30% chance | Minor inconvenience; handled within normal sprint work |

Risk Score = Likelihood x Impact. High/High = Critical. Any Critical risk requires a mitigation plan with specific actions and dates.
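The scoring rule can be sketched as a small function. This is an illustrative sketch only; the level-to-score mapping is inferred from the scores assigned in the risk tables below, and the names are not from any project code.

```typescript
// Likelihood and Impact levels, and the resulting risk score.
type Level = "L" | "M" | "H";
type Score = "Low" | "Medium" | "High" | "Critical";

// High/High is Critical; a High paired with a Medium is High; a High
// paired with a Low, or any Medium pairing, is Medium; Low/Low is Low.
// This matches the scores assigned in the tables in this register.
function riskScore(likelihood: Level, impact: Level): Score {
  if (likelihood === "H" && impact === "H") return "Critical";
  if (likelihood === "H" || impact === "H") {
    return likelihood === "M" || impact === "M" ? "High" : "Medium";
  }
  return likelihood === "M" || impact === "M" ? "Medium" : "Low";
}
```

For example, R-01 (Likelihood M, Impact H) scores High, while R-05 (H, H) scores Critical and therefore requires a dated mitigation plan.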


Active Risks

Technical Risks

| ID | Risk | L | I | Score | Early Signal | Mitigation | Owner | Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| R-01 | Gemini accuracy does not reach 85% on all metrics | M | H | High | Eval results plateau below target after 2 prompt iterations | LLM Gateway enables zero-code swap to GPT-4o or Claude; rapid prompt iteration in Sprint 1; per-step model selection (different model per pipeline step if needed) | Atharva + Amit | Open |
| R-02 | End-to-end latency exceeds 15s hard limit | M | M | Medium | Walking skeleton p95 > 10s | Parallelize classify + retrieve steps; implement streaming for draft step; consider smaller model for classification; add response caching for repeated tickets | Atharva | Open |
| R-03 | Elasticsearch hybrid search (vector + BM25) underperforms on small KB | M | M | Medium | Retrieval accuracy < 75% on golden set despite correct KB articles existing | Tune RRF parameters; adjust BM25 boost weights; increase embedding chunk overlap; consider adding keyword-based fallback | Atharva + Nancy | Open |
| R-04 | LangChain.js introduces abstraction overhead or breaking changes | L | M | Medium | Unexpected behavior in structured output parsing or provider switching | Pin exact versions in package.json; write integration tests for each pipeline step; have fallback to direct Vertex AI SDK calls | Amit | Open |
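The first R-02 mitigation, running the classify and retrieve steps concurrently, can be sketched as below. The step functions here are stand-ins for the pilot's actual pipeline calls, not real project code.

```typescript
// Illustrative sketch of the R-02 mitigation: classification and
// retrieval do not depend on each other, so they can run in parallel.
type Ticket = { id: string; body: string };

async function classify(t: Ticket): Promise<string> {
  return "billing"; // placeholder for an LLM classification call
}

async function retrieve(t: Ticket): Promise<string[]> {
  return ["kb-001"]; // placeholder for a hybrid-search (vector + BM25) call
}

// Sequential latency: classify + retrieve.
// Parallel latency: max(classify, retrieve).
async function classifyAndRetrieve(t: Ticket) {
  const [category, articles] = await Promise.all([classify(t), retrieve(t)]);
  return { category, articles };
}
```

If each step takes ~4s at p95, this one change cuts that portion of the pipeline from ~8s to ~4s, which matters against the 15s hard limit.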

Data Risks

| ID | Risk | L | I | Score | Early Signal | Mitigation | Owner | Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| R-05 | Low dataset diversity (11 unique scenarios from 36 tickets) limits model generalization | H | H | Critical | Classification accuracy varies wildly across categories; overfits to seen patterns | Generate 30-40 diverse golden set cases by May 5; generate 1,000 synthetic eval set by May 14; flag limitation to client explicitly | Nishka + Atharva | Open |
| R-06 | Reporting category has zero training data but is in eval set | M | M | Medium | Reporting tickets consistently misclassified | Nancy adds 2-3 synthetic Reporting tickets to dataset; accept cold-start performance and document as known limitation | Nancy | Open |
| R-07 | KB articles lack sufficient depth for grounded responses | M | H | High | Response faithfulness < 80%; draft responses are generic despite correct retrieval | Review KB content with Prasanna; identify gaps; supplement with additional KB content or adjust expectations for thin categories | Amit + Nancy | Open |
| R-08 | Excel-to-production data shape mismatch | L | M | Medium | Fields in Excel don't map cleanly to Freshdesk API schema | Design data ingestion with abstraction layer; document mapping assumptions; validate schema with Prasanna during Sprint 1 | Nancy | Open |
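The R-08 abstraction layer amounts to funnelling every Excel row through a single mapping function, so a schema mismatch becomes a one-file fix rather than a scattered refactor. The field names below are assumptions for illustration; the real Excel columns and Freshdesk schema must come from Prasanna's validation.

```typescript
// Sketch of the R-08 mitigation: isolate the Excel-to-schema mapping
// behind one function. Field names are hypothetical placeholders.
type ExcelRow = { Subject: string; Description: string; Category: string };
type NormalizedTicket = { subject: string; body: string; category: string };

function fromExcelRow(row: ExcelRow): NormalizedTicket {
  return {
    subject: row.Subject.trim(),
    body: row.Description.trim(),
    category: row.Category.toLowerCase(), // normalize for classification
  };
}
```

When the Freshdesk API schema is confirmed, only this mapper changes; the rest of the ingestion pipeline consumes `NormalizedTicket` and stays untouched.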

Timeline & Organisational Risks

| ID | Risk | L | I | Score | Early Signal | Mitigation | Owner | Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| R-09 | Tight timeline (16 days) with hard deadlines leaves no buffer | H | H | Critical | Any task taking longer than estimated by Day 3 | Ruthless prioritization: cut polish, not core capabilities; daily standup catches blockers within 24 hours; pre-define scope cuts if needed (see Contingency section below) | Shivani + Amit | Open |
| R-10 | Team members pulled to competing priorities | M | H | High | Team member misses standup or deliverable date | Confirm dedicated allocation with all team members at sprint kickoff; escalate to Gyde leadership immediately if competing work appears | Shivani | Open |
| R-11 | Client decision turnaround exceeds 1 business day | L | M | Medium | Prasanna unavailable for > 24 hours during a decision-dependent task | Batch decisions into weekly call agenda; identify backup decision-maker; keep parallel work available when blocked | Shivani | Open |
| R-12 | Knowledge transfer package is deprioritized under time pressure | M | M | Medium | Documentation tasks consistently deferred to "later" | Documentation is a sprint deliverable, not a post-sprint activity; assign specific doc tasks per sprint; Amit owns architecture doc, Shivani owns engagement docs | Amit + Shivani | Open |

Security & Governance Risks

| ID | Risk | L | I | Score | Early Signal | Mitigation | Owner | Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| R-13 | On-prem handover requirements surface late constraints | L | M | Medium | Client's infra team flags incompatible components post-pilot | Document all dependencies and infrastructure requirements early; use only self-hostable components (no managed-only services); share infra requirements doc with Prasanna by Sprint 1 demo | Amit | Open |
| R-14 | GCP Service Account permissions insufficient or misconfigured | L | H | Medium | Vertex AI API calls fail during initial setup | Document exact IAM roles needed; test Service Account permissions before Sprint 1 code starts; have fallback API key auth for dev environment | Amit | Open |
| R-15 | Guardrails fail to catch adversarial inputs | M | M | Medium | Adversarial eval set reveals bypasses in profanity/PII/injection checks | Build adversarial test cases early (by May 10); layer multiple guardrail checks; human review remains mandatory for all responses | Shubham + Nishka | Open |
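The "layer multiple guardrail checks" idea in R-15 can be sketched as a list of independent checks where any single hit blocks the input. The regex rules below are toy placeholders, not the pilot's actual guardrail logic, and they do not replace the mandatory human review.

```typescript
// Illustrative layered-guardrail sketch for R-15. Each check is
// independent; blocking requires only one layer to fire.
type Check = (input: string) => string | null; // reason if it fires, else null

const checks: Check[] = [
  // Toy PII check: US SSN-shaped pattern.
  (s) => (/\b\d{3}-\d{2}-\d{4}\b/.test(s) ? "possible SSN (PII)" : null),
  // Toy prompt-injection check.
  (s) =>
    /ignore (all )?previous instructions/i.test(s) ? "prompt injection" : null,
];

function runGuardrails(input: string): { allowed: boolean; reasons: string[] } {
  const reasons = checks
    .map((c) => c(input))
    .filter((r): r is string => r !== null);
  return { allowed: reasons.length === 0, reasons };
}
```

Keeping the checks as data (an array) makes it cheap to add layers as the adversarial eval set (due May 10) reveals bypasses.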

Commercial Risks

| ID | Risk | L | I | Score | Early Signal | Mitigation | Owner | Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| R-16 | Cost per ticket exceeds client expectations | M | M | Medium | Token usage from M1 estimates projects > $0.50/ticket at production volume | Model tiering (smaller model for classification, larger for drafting); prompt compression; response caching for recurring ticket patterns | Atharva + Amit | Open |
| R-17 | Pilot success creates unrealistic production expectations | L | M | Medium | Client assumes pilot accuracy will hold at 10x volume with real data | Set expectations in every demo: pilot uses curated data; production accuracy depends on KB quality and data diversity; document gaps explicitly in handover package | Shivani + Amit | Open |
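The R-16 early signal depends on a cost projection from M1 token measurements. A back-of-envelope model is sketched below; the token counts and per-token prices are placeholders to be replaced with real M1 measurements and current provider rates.

```typescript
// Back-of-envelope cost model for R-16. All numbers are placeholders.
function costPerTicket(
  inputTokens: number,
  outputTokens: number,
  inputPricePer1M: number, // USD per 1M input tokens
  outputPricePer1M: number, // USD per 1M output tokens
): number {
  return (
    (inputTokens / 1e6) * inputPricePer1M +
    (outputTokens / 1e6) * outputPricePer1M
  );
}

// Hypothetical: 6,000 input + 800 output tokens per ticket,
// at $1.25 / $5.00 per 1M tokens.
const estimate = costPerTicket(6000, 800, 1.25, 5.0); // 0.0115 USD
```

Running this per pipeline step (classification vs. drafting) also quantifies how much the model-tiering mitigation would save.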

Risk Contingency Plan

If R-09 (timeline) materializes, the following scope cuts are pre-approved at POD level (no client approval needed for these specific cuts):

| Priority | What Gets Cut | Impact | When to Cut |
| --- | --- | --- | --- |
| 1 (first cut) | UI polish and visual refinements | Functional but less polished interface | If behind by Day 5 |
| 2 | Synthetic 1,000-question eval (reduce to 500) | Slightly less eval coverage; still statistically meaningful | If behind by Day 8 |
| 3 | Feedback loop (save corrections) | Agents can still use copilot; corrections not stored for learning | If behind by Day 10 |
| 4 | Confidence scoring granularity (binary instead of percentage) | Less nuanced confidence display; still shows high/low | If behind by Day 10 |

Scope cuts that require client approval (per POD Charter, Section 5):

  • Reducing eval thresholds below minimum
  • Dropping guardrails
  • Removing any pipeline step (classify, retrieve, reason, draft)
  • Changing delivery dates

Assumptions Register

Each assumption is a belief the plan rests on. If an assumption proves wrong, it becomes a risk.

| ID | Assumption | Impact if Wrong | Validation Method | Validated? |
| --- | --- | --- | --- | --- |
| A-01 | Excel dataset is representative of production ticket patterns | Pilot accuracy won't predict production accuracy | Compare Excel categories against Freshdesk category distribution (ask Prasanna) | No |
| A-02 | Prasanna is available for decisions within 1 business day | Sprint velocity drops; scope may slip | Confirmed in discovery call; monitor weekly | Yes |
| A-03 | GCP Vertex AI APIs are stable and available throughout the pilot | Need to switch LLM provider mid-sprint | Test API availability during setup; LLM Gateway provides fallback | No |
| A-04 | 85% accuracy is achievable with the provided KB content | May need to renegotiate thresholds or expand KB | Baseline metrics from M2 eval harness will validate | No |
| A-05 | Team members are dedicated to this engagement (no competing priorities) | Deliverables at risk; may miss hard deadlines | Confirm with each team member at sprint kickoff | No |
| A-06 | 12 KB articles cover the primary resolution paths for common ticket types | Retrieval step will have coverage gaps | Map KB articles to ticket categories; identify unmapped categories | No |
| A-07 | Gemini via Vertex AI produces reliable structured JSON output | Pipeline steps fail to parse LLM output | Validate during walking skeleton (M1); LangChain.js structured output parsing | No |
| A-08 | MongoDB and Elasticsearch can run on existing GCP VM without performance issues | Need separate infrastructure or managed services | Load test during Sprint 1 with full pipeline | No |

Dependencies Register

| ID | Dependency | Provider | Needed By | Status |
| --- | --- | --- | --- | --- |
| D-01 | GCP Service Account with Vertex AI permissions | Amit (setup) | May 2 (Sprint 1 Day 2) | Pending |
| D-02 | GCP VM access for MongoDB + Elasticsearch | Amit (existing infra) | May 2 | Pending |
| D-03 | Dataset (Excel file with tickets, KB, escalation rules, eval cases) | Prasanna (provided) | Available now | Done |
| D-04 | Client review of expanded golden dataset (30-40 cases) | Prasanna or support team lead | May 5 | Pending |
| D-05 | Client review of 1,000 synthetic eval results | Prasanna's support team lead | May 15 | Pending |
| D-06 | Decision on target cost-per-ticket for production | Prasanna | Next weekly call | Pending |

Review Cadence

| Activity | Frequency | Participants |
| --- | --- | --- |
| PM reviews and updates register | Weekly (minimum) | Shivani |
| Top-5 risks reviewed in POD standup | Weekly | Full POD |
| Risk summary in client status email | Weekly | Prasanna, Shivani |
| Risk discussion in sprint demo | Per sprint | Prasanna, full POD |
| New risk raised by any team member | Continuous | Anyone |

Change Log

| Date | Change | By |
| --- | --- | --- |
| 2026-05-01 | Initial risk register created with 17 risks, 8 assumptions, 6 dependencies | Shivani + Amit |

The risk register is a living document. "Risks tracked but never mitigated" is an anti-pattern (Doc 03, Section 9): every risk must carry a mitigation plan with specific actions and dates, not just words.