AI Features

Part of the Anaya Care product wiki. See 00-overview.md.

Purpose

Anaya Care embeds LLM-backed generation across the care workflow: agentic background generation of care plans, care-provider tasks, and meals; one-shot generation of engagement activities, assessments, summaries, and quotes; a general-purpose AI chat assistant; rich-text editor AI (Plate); audio transcription; AI image generation; and a tenant-scoped knowledge base with vector (RAG) search. The backend talks directly to OpenAI, Anthropic, and Google Gemini; the web app additionally has its own editor-AI routes that call the Vercel AI Gateway.

This page documents the AI plumbing and each feature's trigger → model → context → output path. Wound photo analysis is covered in 08-health-monitoring.md, the care readiness quiz in 02-assessments.md, and daily motivations in 10-daily-living.md — each is cross-linked briefly below.

Entities & Data Model

`ai_settings` — per-business AI defaults

apps/backend/src/ai-settings/entities/ai-settings.entity.ts

Field	Type	Notes
`business`	ObjectId → Business	required, unique — one settings doc per tenant
`defaultProvider`	string	default `'anthropic'`
`defaultMode`	string	default `'standard'`
`defaultAdditionalPrompt`	string	default `''`

Created lazily via upsert on first read (ai-settings.service.ts:19-30). Read/written through GET|PATCH /ai-settings, both guarded by BusinessPermission.AI_FEATURES (ai-settings.controller.ts:24,33).

`ai_conversations` / `ai_messages` — AI chat sessions

apps/backend/src/ai-chat/entities/ai-conversation.entity.ts, ai-message.entity.ts

Collection	Key fields
`ai_conversations`	`business` (ObjectId, optional — null for independent care providers), `userId` (required), `title?` (auto-generated after first exchange), `lastMessageAt`; compound index `{business, userId, lastMessageAt}`
`ai_messages`	`conversationId`, `business?`, `role` (`user`/`assistant`), `content`, `model?`, `usage?` (`inputTokens`/`outputTokens`/`totalTokens`); index `{conversationId, createdAt}`

`knowledge_base_documents` + Qdrant `knowledge_base_chunks`

apps/backend/src/knowledge-base/entities/knowledge-base-document.entity.ts

Field	Type	Notes
`business`	ObjectId → Business	required, indexed — tenant owner taken from the authenticated user, never the body (`knowledge-base.service.ts:47-66`)
`title`, `description?`, `tags[]`	strings	tags indexed
`file`	embedded	S3 `key`, `url`, `type`, `name`, `size`, optional PDF `thumbnailKey/Url`
`processingStatus`	enum `DocumentProcessingStatus`	`pending → processing → completed/failed`
`uploadedBy`	ObjectId → User
`isDeleted`	boolean	soft delete

Vector store: Qdrant collection knowledge_base_chunks, 1536-dim cosine, payload-indexed on businessId and documentId (knowledge-base/services/qdrant.service.ts:6,63-77). Each point payload carries content, businessId, documentId, chunkIndex (vector-search.service.ts:53-62).
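
The payload and tenant filter described above can be sketched roughly as follows. This is an illustration only: the names `ChunkPayload`, `KNOWLEDGE_BASE_COLLECTION`, and `buildTenantFilter` are hypothetical, and only the field names, collection name, and 1536-dim cosine settings come from this page. The filter shape follows Qdrant's `must` / `match` condition format.

```typescript
// Hypothetical type mirroring the payload fields listed above.
interface ChunkPayload {
  content: string;
  businessId: string;
  documentId: string;
  chunkIndex: number;
}

// Collection settings as described on this page (the constant name is illustrative).
const KNOWLEDGE_BASE_COLLECTION = {
  name: 'knowledge_base_chunks',
  vectorSize: 1536,
  distance: 'Cosine',
  payloadIndexes: ['businessId', 'documentId'],
} as const;

interface MatchCondition {
  key: string;
  match: { value: string } | { any: string[] };
}

interface TenantFilter {
  must: MatchCondition[];
}

// Always pins businessId; optionally narrows the search to specific documents.
function buildTenantFilter(businessId: string, documentIds: string[] = []): TenantFilter {
  const must: MatchCondition[] = [{ key: 'businessId', match: { value: businessId } }];
  if (documentIds.length > 0) {
    must.push({ key: 'documentId', match: { any: documentIds } });
  }
  return { must };
}
```

Putting the `businessId` condition first and making it unconditional means a caller cannot build a filter that omits the tenant scope.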

AI output cached on other entities

Client.aiSummary + Client.aiSummaryUpdatedAt — cached ~3000-char plain-text client profile summary (clients/services/client-ai-summary.service.ts:74-77).
Generated care plans, tasks, meals, engagement activities, trainings, etc. are stored in their own module collections (see sibling pages); AI metadata such as iteration counts and confidence is returned in job results rather than persisted in a dedicated audit collection.

Shared config types (`@anaya/shared`)

packages/shared/src/care-plan/care-plan-generation.ts — AIProvider ('openai' | 'anthropic'), GenerationMode ('fast' | 'standard' | 'thorough'), PROVIDER_MODEL_MAP, GENERATION_MODE_PRESETS, care-plan + task focus-area option lists.
packages/shared/src/meal-generation/meal-generation.ts, packages/shared/src/engagement-activity-generation/engagement-activity-generation.ts — focus areas and mode UI copy (estimated times, "Recommended" badge).
packages/shared/src/ai-chat/ai-chat.ts — chat entity DTOs, AI_CHAT_MODELS whitelist, DEFAULT_AI_CHAT_MODEL = 'claude-sonnet-4-6', starter prompts.
packages/shared/src/ai/ai-settings.ts — AISettingsResponse DTO.

Workflows & State Machines

Shared provider/agent infrastructure

LLMProviderFactory (apps/backend/src/ai/providers/llm-provider.factory.ts) resolves AIProvider + GenerationMode → model via PROVIDER_MODEL_MAP. As of today every mode maps to the same model per provider: OpenAI → gpt-5.5, Anthropic → claude-sonnet-4-6 (packages/shared/src/care-plan/care-plan-generation.ts:22-32). API keys come from OPENAI_API_KEY / ANTHROPIC_API_KEY env.
OpenAIProvider (ai/providers/openai.provider.ts) — openai.responses.create() with strict JSON-schema output, reasoning.effort, 5-min default timeout via AbortController, and one truncation retry doubling max_output_tokens up to 32k. Logs estimated cost ($5/$30/$30 per 1M in/out/reasoning, lines 155-161).
AnthropicProvider (ai/providers/anthropic.provider.ts) — Vercel AI SDK generateText with a forced tool call whose input schema is the desired JSON schema (toolChoice, lines 78-86); one transient-error retry and one truncation retry; reasoning tokens always reported as 0 (line 139); cost $3/$15 per 1M (lines 154-159).
BaseAgent (ai/agents/base/base.agent.ts) — retry wrapper for all agents: up to 2 retries with exponential backoff (1s base), retryability classified by error-message keywords (timeout/rate limit/5xx retryable; invalid/not found/unauthorized/cancelled not), token-usage and timing metrics.
CancellationRegistry (ai/agents/base/cancellation-registry.ts) — Redis-backed (CACHE_MANAGER) key job:cancelled:<jobId> with 2h TTL, bridging the cancel API endpoint and the running BullMQ worker (job.data updates don't propagate to a running processor). Orchestrators call syncCancellation() at every stage/iteration and abort in-flight LLM calls via AbortController (care-plan-orchestrator.service.ts:852-862).
Prompt files: markdown files under apps/backend/src/ai/prompts/*.prompts.md, parsed by prompt-loader.ts using section markers and {{variable}} substitution, and cached in-process.
AIModule is @Global() and provides a raw OpenAI client via DI (ai/ai.module.ts:62-67).
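
The BaseAgent retry behavior described above (up to 2 retries, exponential backoff from a 1s base, keyword-based classification) might look roughly like this. It is a minimal sketch, not the actual `base.agent.ts` code: the keyword lists, the treatment of unclassified errors as non-retryable, and the names `isRetryable` and `withRetry` are assumptions.

```typescript
type Sleep = (ms: number) => Promise<void>;

// Assumed keyword lists based on the categories named on this page.
const NON_RETRYABLE = ['invalid', 'not found', 'unauthorized', 'cancelled'];
const RETRYABLE = ['timeout', 'rate limit', '500', '502', '503', '504'];

function isRetryable(error: unknown): boolean {
  const message = (error instanceof Error ? error.message : String(error)).toLowerCase();
  // Non-retryable keywords win when both kinds appear in one message.
  if (NON_RETRYABLE.some((keyword) => message.includes(keyword))) return false;
  // Assumption: errors matching neither list are not retried.
  return RETRYABLE.some((keyword) => message.includes(keyword));
}

async function withRetry<T>(
  operation: () => Promise<T>,
  maxRetries = 2,
  baseDelayMs = 1000,
  sleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt >= maxRetries || !isRetryable(error)) throw error;
      // Exponential backoff: 1s after the first failure, 2s after the second.
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

The `sleep` parameter is injectable so that the backoff can be replaced with a no-op in tests.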

Generation mode presets (packages/shared/src/care-plan/care-plan-generation.ts:51-82):

Mode	maxIterations	confidence	coverage	reasoningEffort	skipQualityEvaluation
fast	1	0.7	0.75	low	true
standard	1	0.8	0.85	medium	true
thorough	3	0.85	0.9	high	false

Note: fast and standard differ only in thresholds/reasoning effort — both skip the quality-evaluator loop.
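
For reference, the preset table can be written as a typed constant. The values come from the table above; the field names `confidenceThreshold` and `coverageThreshold`, the constant name `PRESETS`, and the helper `modesRunningEvaluator` are assumptions and may not match `GENERATION_MODE_PRESETS` in the shared package.

```typescript
type GenerationMode = 'fast' | 'standard' | 'thorough';

interface ModePreset {
  maxIterations: number;
  confidenceThreshold: number; // assumed field name for the "confidence" column
  coverageThreshold: number;   // assumed field name for the "coverage" column
  reasoningEffort: 'low' | 'medium' | 'high';
  skipQualityEvaluation: boolean;
}

const PRESETS: Record<GenerationMode, ModePreset> = {
  fast: {
    maxIterations: 1,
    confidenceThreshold: 0.7,
    coverageThreshold: 0.75,
    reasoningEffort: 'low',
    skipQualityEvaluation: true,
  },
  standard: {
    maxIterations: 1,
    confidenceThreshold: 0.8,
    coverageThreshold: 0.85,
    reasoningEffort: 'medium',
    skipQualityEvaluation: true,
  },
  thorough: {
    maxIterations: 3,
    confidenceThreshold: 0.85,
    coverageThreshold: 0.9,
    reasoningEffort: 'high',
    skipQualityEvaluation: false,
  },
};

// Lists the modes that actually run the quality-evaluator loop.
function modesRunningEvaluator(): GenerationMode[] {
  return (Object.keys(PRESETS) as GenerationMode[]).filter(
    (mode) => !PRESETS[mode].skipQualityEvaluation,
  );
}
```

With the values above, `modesRunningEvaluator()` yields only `thorough`, which matches the note that fast and standard both skip the evaluator.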

Care plan generation (agentic, BullMQ)

Trigger: admin/care-manager dialog on web; backend dedupes — an active non-cancelled job for the same client returns the existing jobId (client-care-plans.service.ts:455-488). Job options: attempts: 3, exponential backoff 5s (:523-535).
Provider/model: job.data.provider || 'anthropic' through LLMProviderFactory (ai/agents/care-plan/care-plan-orchestrator.service.ts:137-142).
Context: ValidationAgent loads client, medications, schedules, representative profiles and requires client.timezone (:230-237); ContextBuilderAgent builds the formatted narrative from ClientNarrativeService (ai/agents/care-plan/agents/context-builder.agent.ts:117) plus schedule context and the user's additionalPrompt.
Generation: ContentGeneratorAgent calls provider.generateStructuredOutput per care-plan section with prompts from ai/prompts/care-plan.prompts.md (content-generator.agent.ts:750-753) and per-section token caps. The loop's stop conditions are confidence ≥ threshold, coverage ≥ threshold with no critical issues, evaluator says no refinement, or max iterations (care-plan-orchestrator.service.ts:722-782).
Fallback: if generation fails on the first iteration with no draft, a near-empty "emergency draft" with confidence 0.35 is saved so the job completes (:973-1005).
Output: saved via ClientCarePlansService.createCarePlan(..., isCompleted: true); completion emits CARE_PLAN_GENERATION_COMPLETED to the requester's socket room and a push notification (:366-404).
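
The stop conditions listed under Generation can be read as four independent checks, any one of which ends the loop. The sketch below encodes that reading; it is an interpretation of the prose, not the orchestrator's code, and the names `Evaluation`, `LoopConfig`, and `stopReason` are hypothetical. Iterations are assumed to be counted from 1.

```typescript
interface Evaluation {
  confidence: number;
  coverage: number;
  criticalIssues: number;
  needsRefinement: boolean; // the evaluator's verdict
}

interface LoopConfig {
  maxIterations: number;
  confidenceThreshold: number;
  coverageThreshold: number;
}

type StopReason = 'confidence' | 'coverage' | 'no-refinement' | 'max-iterations' | null;

// Returns why the refinement loop should stop, or null to keep iterating.
function stopReason(iteration: number, evaluation: Evaluation, config: LoopConfig): StopReason {
  if (evaluation.confidence >= config.confidenceThreshold) return 'confidence';
  if (evaluation.coverage >= config.coverageThreshold && evaluation.criticalIssues === 0) {
    return 'coverage';
  }
  if (!evaluation.needsRefinement) return 'no-refinement';
  if (iteration >= config.maxIterations) return 'max-iterations';
  return null;
}
```

Under this reading, high coverage alone does not stop the loop while critical issues remain; the loop then continues until another condition fires or the iteration cap is reached.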

Care provider tasks generation (agentic, BullMQ)

Trigger: ClientCareProviderTasksService enqueues generate-care-provider-tasks on queue care-provider-tasks-generation with target schedule IDs, existing-task summaries (for dedup), mode presets, optional provider/focus areas/additional prompt (clients/services/client-care-provider-tasks.service.ts:1548-1586). Same duplicate-job guard + Redis-cancellation check pattern.
Pipeline (ai/agents/care-provider-tasks/care-provider-tasks-orchestrator.service.ts:63-72): Validation → context building → ObjectivesAgent → StructureAgent → TaskGeneratorAgent (parallel per-structure with a semaphore) → TasksQualityEvaluatorAgent → refinement loop → DeduplicationAgent → save. LLM calls go through the same LLMProviderFactory/generateStructuredOutput path (agents/task-generator.agent.ts:338).
Deterministic helpers: MealTaskMapperService and VitalSignsTaskMapperService map existing meal plans / vital-sign requirements into tasks without LLM calls; ContextPrioritizationService, ClinicalGuidelinesService, CognitiveAdaptationService, FewShotExamplesService, SemanticSimilarityService shape prompts and dedup (ai/agents/care-provider-tasks/services/).
Progress: CARE_PROVIDER_TASKS_GENERATION_PROGRESS/COMPLETED/FAILED socket events (packages/shared/src/enums/socket-events.ts:70-72).
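
The "parallel per-structure with a semaphore" step in the pipeline can be illustrated with a small counting semaphore. This is a generic sketch of the technique; `Semaphore` and `mapWithLimit` are hypothetical names, and the concurrency limit actually used by TaskGeneratorAgent is not stated on this page.

```typescript
class Semaphore {
  private waiting: Array<() => void> = [];
  private available: number;

  constructor(permits: number) {
    this.available = permits;
  }

  async acquire(): Promise<void> {
    if (this.available > 0) {
      this.available--;
      return;
    }
    // No permit free: park until release() hands one over.
    await new Promise<void>((resolve) => this.waiting.push(resolve));
  }

  release(): void {
    const next = this.waiting.shift();
    if (next) {
      next(); // pass the permit directly to the next waiter
    } else {
      this.available++;
    }
  }
}

// Runs worker over all items with at most `limit` calls in flight; results keep input order.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const semaphore = new Semaphore(limit);
  return Promise.all(
    items.map(async (item) => {
      await semaphore.acquire();
      try {
        return await worker(item);
      } finally {
        semaphore.release();
      }
    }),
  );
}
```

In the task pipeline, each item would correspond to one structure and the worker to one structured-output LLM call, so the limit bounds concurrent provider requests.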

Meal generation (agentic, BullMQ)

Trigger: ClientMealsService enqueues generate-meals on queue client-meals (client-meals/services/client-meals.service.ts:278); processor (concurrency 3, 15-min lock) delegates to MealOrchestratorService (client-meals/processors/client-meals.processor.ts:38-49).
Pipeline (ai/agents/client-meals/client-meals-orchestrator.service.ts): Validation → MealContextBuilder → MealPlannerAgent → MealGeneratorAgent (structured output via the selected provider, agents/meal-generator.agent.ts:186) → MealQualityEvaluatorAgent (thorough only). Job data carries meal counts per type, preferred/avoid ingredients, restrictions, mode/provider/focus areas (:81-97).
Images: when generateImages is set, the orchestrator generates a photo per meal (:313-323) through AIClientMealsService, which uses Google Gemini image generation (ai/services/ai-client-meals.service.ts:1024).
Variations: a separate generate-variations job re-uses a base meal and calls AIClientMealsService directly (OpenAI Responses API, gpt-5.5, :229,328,399).
Progress: CLIENT_MEALS_GENERATION_* and MEAL_VARIATION_GENERATION_* socket events (socket-events.ts:85-93).

Engagement activity generation (one-shot loop, BullMQ — not agentic)

Trigger: ClientEngagementActivitiesService enqueues generate-engagement-activity on queue client-engagement-activities (clients/services/client-engagement-activities.service.ts:299).
Flow (clients/processors/client-engagement-activities.processor.ts:85-249): build the client's formatted narrative (clientsService.getFormattedNarrative), then loop numberOfActivities times calling AIClientEngagementActivityService.generateActivity — a fixed OpenAI gpt-5.5 Responses-API call (ai/services/ai-client-engagement-activity.service.ts:25-29,139-140) with Montessori theme/type/difficulty/duration and optional mode/focus areas/additional prompt. No provider selection — the shared LLMProviderFactory is not used here, unlike care plans/tasks/meals.
Images: optional, generated per activity via Gemini (1:1, 1K) and uploaded to public S3 (ai-client-engagement-activity.service.ts:51-84).
Progress: CLIENT_ENGAGEMENT_ACTIVITY_GENERATION_* events (socket-events.ts:96-98); cancellation via CancellationRegistry (client-engagement-activities.service.ts:346).
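
The interaction between the per-activity loop and the cancellation registry can be sketched with an in-memory stand-in for the Redis-backed store. The key format and 2h TTL come from this page; the class `InMemoryCancellationRegistry`, the function `generateActivities`, and the choice to stop quietly between activities (rather than throw) are assumptions for illustration.

```typescript
const CANCEL_TTL_MS = 2 * 60 * 60 * 1000; // 2h, as described for the registry

// In-memory stand-in for the Redis-backed registry (hypothetical class).
class InMemoryCancellationRegistry {
  private expiry = new Map<string, number>();

  constructor(private now: () => number = () => Date.now()) {}

  private key(jobId: string): string {
    return `job:cancelled:${jobId}`;
  }

  cancel(jobId: string): void {
    this.expiry.set(this.key(jobId), this.now() + CANCEL_TTL_MS);
  }

  isCancelled(jobId: string): boolean {
    const expiresAt = this.expiry.get(this.key(jobId));
    return expiresAt !== undefined && expiresAt > this.now();
  }

  clear(jobId: string): void {
    this.expiry.delete(this.key(jobId));
  }
}

// Generates activities one at a time, checking for cancellation before each call.
async function generateActivities(
  jobId: string,
  count: number,
  registry: InMemoryCancellationRegistry,
  generateOne: (index: number) => Promise<string>,
): Promise<string[]> {
  const results: string[] = [];
  try {
    for (let i = 0; i < count; i++) {
      if (registry.isCancelled(jobId)) break; // assumed behavior: stop between activities
      results.push(await generateOne(i));
    }
    return results;
  } finally {
    registry.clear(jobId); // mirrors the clear-in-finally pattern noted for care plans
  }
}
```

Checking the flag between iterations only stops future calls; aborting a call already in flight would additionally need an AbortController, as the orchestrators do.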

AI chat (conversational assistant)

Who: any authenticated user; the controller is @SkipBusinessCheck()/@SkipBusinessScope() so independent care providers without a business can use it (ai-chat/ai-chat.controller.ts:33-35). Conversations are owned by userId and scoped to the user's businessId ?? null.
Safety gate: before any LLM call, a regex pre-filter classifies the message as CRISIS / EMERGENCY / SAFE; crisis/emergency messages get canned 988/911 responses and never reach the model (ai-chat-safety.service.ts:7-32, ai-chat.controller.ts:104-129).
Model: user-selectable from the AI_CHAT_MODELS whitelist (GPT-4o/4o-mini/4.1/4.1-mini/4.1-nano/o4-mini, Claude Sonnet 4.6 / Haiku 4.5 / Opus 4.6), default claude-sonnet-4-6 (packages/shared/src/ai-chat/ai-chat.ts:43-102); DTO validates against the whitelist. Titles are auto-generated on the first exchange with gpt-4.1-nano (ai-chat.service.ts:24,229-243).
Context: system prompt composed from generic anaya/base + anaya/chat prompt sections (scope of practice, glossary, safety escalation, identity, platform knowledge, behavioral rules — ai-chat-prompt.service.ts:14-30) plus the last 50 messages of the conversation (ai-chat.service.ts:211-227). The chat has no access to client records, the knowledge base, or any tenant data — it is a pure conversational assistant.
Output: streamed via AI SDK streamText().pipeUIMessageStreamToResponse(res); assistant message + token usage persisted in onFinish (ai-chat.controller.ts:152-192).
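
The safety gate and history window described above can be sketched as a pure function that decides whether a message ever reaches the model. The regex patterns and canned replies below are hypothetical placeholders (the real lists live in `ai-chat-safety.service.ts`), and `classifyMessage` and `gateMessage` are illustrative names; only the three classes, the 988/911 routing, and the 50-message window come from this page.

```typescript
type SafetyClass = 'CRISIS' | 'EMERGENCY' | 'SAFE';

interface ChatMessage {
  role: 'user' | 'assistant';
  content: string;
}

// Hypothetical patterns for illustration only.
const CRISIS_PATTERNS = [/\bsuicid(e|al)\b/i, /\bkill myself\b/i];
const EMERGENCY_PATTERNS = [/\bchest pain\b/i, /\b(can'?t|cannot) breathe\b/i];

// Hypothetical canned replies; the real wording is not shown on this page.
const CANNED_REPLIES: Record<'CRISIS' | 'EMERGENCY', string> = {
  CRISIS: 'If you are in crisis, please call or text 988 right now.',
  EMERGENCY: 'This may be an emergency. Please call 911 immediately.',
};

const HISTORY_LIMIT = 50;

function classifyMessage(text: string): SafetyClass {
  if (CRISIS_PATTERNS.some((pattern) => pattern.test(text))) return 'CRISIS';
  if (EMERGENCY_PATTERNS.some((pattern) => pattern.test(text))) return 'EMERGENCY';
  return 'SAFE';
}

type GateResult =
  | { kind: 'canned'; reply: string }
  | { kind: 'model'; context: ChatMessage[] };

// Crisis/emergency messages short-circuit; safe ones get the trimmed history.
function gateMessage(text: string, history: ChatMessage[]): GateResult {
  const safety = classifyMessage(text);
  if (safety !== 'SAFE') return { kind: 'canned', reply: CANNED_REPLIES[safety] };
  return { kind: 'model', context: history.slice(-HISTORY_LIMIT) };
}
```

Because the gate is a keyword match, a paraphrased message such as "I do not want to be here anymore" would classify as SAFE in this sketch, which is the limitation raised under Open Questions.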

Plate AI (rich-text editor AI — backend module)

plate-ai is the AI service for the Plate.js rich-text editor (generate / edit / comment / copilot-complete / transform commands) — it is not meal-photo analysis.

Endpoints POST /plate-ai/command|complete|transform (JWT-guarded) stream SSE chunks (plate-ai/plate-ai.controller.ts).
Model: gpt-4o-mini by default (command accepts a model in the DTO) via OpenAI chat completions (plate-ai/plate-ai.service.ts:18,33,66,124); system prompts = anaya/context (full platform context) + tool-specific sections of ai/prompts/plate-ai.prompts.md (:9,92-109).
Web reality check: the web app's Plate editor (apps/web/components/editor/plugins/ai-kit.tsx, copilot-kit.tsx) calls the web's own Next.js routes /api/ai/command and /api/ai/copilot, which use the Vercel AI Gateway (createGateway + AI_GATEWAY_API_KEY, apps/web/app/api/ai/command/route.ts, copilot/route.ts) — not the backend module. The web proxy routes to the backend (/api/ai/plate-command, /api/ai/plate-transform) forward no Authorization header while the backend controller requires JWT, and no web component references them. See Open Questions.

Knowledge base (RAG)

Ingest: create document → status pending → async indexDocument(): S3 download → text extraction (text-extraction.service.ts) → chunking (~3200 chars with 1600-char overlap, ≈800/400 tokens — chunking.service.ts:14-17) → OpenAI text-embedding-3-small embeddings (1536-dim, embedding.service.ts:17,47) → Qdrant upsert with rollback on failure (vector-search.service.ts:27-78). Status moves to completed/failed; PDFs get an async thumbnail; POST :id/retry-indexing re-runs failed docs (knowledge-base.service.ts:442-455).
Search: POST /knowledge-base/search embeds the query and runs Qdrant cosine search filtered by businessId, deduplicating to the best chunk per document (knowledge-base.service.ts:347-398).
Permissions: KNOWLEDGE_BASE_MANAGE for create/update/delete/retry, KNOWLEDGE_BASE_VIEW for read/search (knowledge-base.controller.ts).
RAG consumer: AI training generation retrieves KB chunks via vectorSearchService.search(...) scoped to selected document IDs (trainings/services/training-ai-generation.service.ts:113,229) and generates training content with gpt-5.5 (:140); document ownership is asserted before a job can reference KB docs (trainings.service.ts:2684, knowledge-base.service.ts:466-484).
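
Two steps of the pipeline above lend themselves to a small worked example: fixed-window chunking with overlap, and keeping only the best-scoring chunk per document in search results. The sketch assumes plain character windows (3200 chars, 1600 overlap); the real `chunking.service.ts` may split on sentence or paragraph boundaries. The names `chunkText`, `SearchHit`, and `bestChunkPerDocument` are hypothetical.

```typescript
// Splits text into windows of `size` characters that overlap by `overlap` characters.
function chunkText(text: string, size = 3200, overlap = 1600): string[] {
  if (overlap >= size) throw new Error('overlap must be smaller than size');
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // the last window reached the end
  }
  return chunks;
}

interface SearchHit {
  documentId: string;
  chunkIndex: number;
  score: number; // cosine similarity; higher is better
}

// Keeps the highest-scoring chunk for each document, best documents first.
function bestChunkPerDocument(hits: SearchHit[]): SearchHit[] {
  const best = new Map<string, SearchHit>();
  for (const hit of hits) {
    const current = best.get(hit.documentId);
    if (!current || hit.score > current.score) best.set(hit.documentId, hit);
  }
  return Array.from(best.values()).sort((a, b) => b.score - a.score);
}
```

With these defaults an 8000-character document produces four chunks starting at offsets 0, 1600, 3200, and 4800, so every character other than the first and last 1600 appears in two chunks.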

Client AI summary & narrative services

ClientNarrativeService (clients/services/client-narrative.service.ts) is not an LLM service — it deterministically formats client demographics, medical records, medications, schedules, and assessments (RTI, simplified RTI, Montessori, Allen cognitive level, meal assessment) into a long markdown narrative. This narrative is the canonical context input for care-plan context building, engagement activities, meals, and the AI summary below.
ClientAiSummaryService (clients/services/client-ai-summary.service.ts) condenses that narrative into a ~3000-char plain-text caregiver-facing profile via claude-sonnet-4-6 (:29), cached on Client.aiSummary; getSummary() auto-generates on first read (:82-104). Consumed as document context by guided generation (ai/services/ai-guided-generate.service.ts:71-73).

Shift summary generation + content filter

AI narrative: ShiftSummaryService.generateAiSummary produces an audience-tailored structured summary (headline + sections with paragraphs/bullets/severity flags) using AI SDK generateObject with claude-sonnet-4-6 and a Zod schema whose section-id enum is restricted to the audience's allowed sections — the model cannot emit out-of-audience sections, and a second filter re-checks after decode (clients/services/shift-summary.service.ts:1222-1329).
Content filter: ShiftSummaryContentFilterService is a deterministic, non-LLM visibility filter that maps the shared CONTENT_MATRIX (ContentItem × ExportAudience → include/consider/exclude) onto the shift-summary data sections — e.g. family audiences get alarming caregiver notes stripped and only critical threshold warnings (clients/services/shift-summary-content-filter.service.ts).
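
The matrix-driven filter can be illustrated with a toy version. Everything concrete here is hypothetical: the audiences, content items, and matrix entries are invented for the example and do not reflect the real `CONTENT_MATRIX`, and how the service treats `consider` entries is not stated on this page, so it is left as a parameter.

```typescript
type Audience = 'family' | 'clinician'; // hypothetical audiences
type ContentItem = 'vitals' | 'caregiverNotes' | 'thresholdWarnings'; // hypothetical items
type Visibility = 'include' | 'consider' | 'exclude';

// Invented entries; the shared CONTENT_MATRIX defines the real ones.
const CONTENT_MATRIX: Record<ContentItem, Record<Audience, Visibility>> = {
  vitals: { family: 'include', clinician: 'include' },
  caregiverNotes: { family: 'exclude', clinician: 'include' },
  thresholdWarnings: { family: 'consider', clinician: 'include' },
};

// Lists the items an audience may see; `consider` handling is an explicit choice.
function allowedItems(audience: Audience, includeConsider = false): ContentItem[] {
  return (Object.keys(CONTENT_MATRIX) as ContentItem[]).filter((item) => {
    const visibility = CONTENT_MATRIX[item][audience];
    return visibility === 'include' || (includeConsider && visibility === 'consider');
  });
}

// Drops every section the audience is not allowed to see.
function filterSections<T>(
  sections: Partial<Record<ContentItem, T>>,
  audience: Audience,
  includeConsider = false,
): Partial<Record<ContentItem, T>> {
  const result: Partial<Record<ContentItem, T>> = {};
  for (const item of allowedItems(audience, includeConsider)) {
    if (item in sections) result[item] = sections[item];
  }
  return result;
}
```

The same allowed-item list could also feed the schema restriction described for the AI narrative, so that the deterministic filter and the model's permitted section IDs stay in agreement.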

One-shot AI services inventory (`apps/backend/src/ai/services/`)

Service	Model / provider	Purpose & output
`ai-guided-generate.service.ts`	`claude-sonnet-4-6` + AI SDK tools (`ask_questions`, `generate_content`), max 10 questions	Interactive Q&A → field content for client / care-proposal forms; pulls the doc's AI summary as context (`:71-73,176,190-215`)
`ai-guided-narrative.service.ts`	`claude-haiku-4-5` (conversation, summary, extraction), `claude-sonnet-4-6` (synthesis) (`:541,648,788,846,890`)	Guided narrative flow for initial assessments (`POST /ai/guided-narrative/*`)
`ai-text-improve.service.ts`	`gpt-4o-mini` (`:35`)	Improve/rewrite text fields (web "AI improve" popover)
`ai-transcription.service.ts`	`whisper-1` (`:30`)	Voice-to-text; 25 MB upload cap, throttled 20 req/min (`ai-transcription.controller.ts:26-30`)
`ai-client-care-readiness-assessment.service.ts`	`gpt-5.5` Responses API + `care-readiness` prompts (`:57`)	Care readiness quiz questions — see 02-assessments.md; driven by the `care-readiness-assessment` queue in `shifts/`
`ai-client-care-proposal.service.ts`	`claude-sonnet-4-6` (`:79`)	Care proposal content — see 03-care-proposals.md
`ai-client-narrative-report.service.ts`	`gpt-5.5` (`:9`)	Narrative report generation
`ai-home-safety-recommendation.service.ts`	`gpt-4o` (`:26`)	Home-safety recommendations
`ai-client-medication.service.ts`	`gpt-4-turbo` (`:57,114`)	Medication-related generation — see 07-medications.md
`ai-job-posting.service.ts`	`gpt-4-turbo` (`:94,202`)	Job posting copy — see 06-job-postings.md
`ai-client-care-plan.service.ts`	`gpt-5.5` (`:515`)	Legacy single-shot care plan generation — registered in `ai.module.ts` but no callers found outside the module (superseded by the agentic orchestrator)
`ai-client.service.ts` (`AIClientService`)	Hardcoded OpenAI Assistants API IDs (5 `asst_*` constants, `:11-16`), `dall-e-3`, `gpt-4-turbo`	Mostly vestigial; only `generateTitle` is called today (doctor-appointment titles, `clients/services/client-doctors-appointments.service.ts:171`)

Adjacent AI features (documented elsewhere, brief)

Wound photo analysis — BullMQ queue wound-analysis (wound-care/wound-care.module.ts:27), analysis via claude-sonnet-4-6 generateText with a forced tool (wound-care/services/*:263-264). Full coverage: 08-health-monitoring.md.
Care readiness quiz — care-readiness-assessment queue + processor in shifts/ using the gpt-5.5 service above. Full coverage: 02-assessments.md.
Daily motivations — AIDailyMotivationService generates caregiver quote batches via LLMProviderFactory('anthropic', 'fast') structured output, with theme input sanitized against prompt injection (daily-motivations/services/ai-daily-motivation.service.ts:50-80). Full coverage: 10-daily-living.md.
AI image facade — AIImageService (common/services/ai-image/ai-image.service.ts) wraps OpenAI image and Gemini (gemini-3-pro-image-preview) behind one API; default provider from env, per-call override; used by training content generation (trainings/services/training-ai-content.service.ts). GeminiService (common/services/gemini/gemini.service.ts) adds API-key rotation, circuit breaker, and quota handling (default model gemini-2.0-flash, :67-68).
Training generation (training-generation queue) and blog generation (blog-generation queue) are additional LLM consumers in trainings/ and blog/.

Business Rules & Constraints

Provider selection is per-request, not enforced per-business. Generation job data carries provider from the web dialog; orchestrators default to 'anthropic' when absent (care-plan-orchestrator.service.ts:138, client-meals-orchestrator.service.ts:103). The stored ai_settings.defaultProvider/defaultMode/defaultAdditionalPrompt are applied client-side only, as web form defaults via useAISettingsDefaults (apps/web/hooks/use-ai-settings.ts:58-90); the backend never reads AISettingsService during generation.
Model is independent of mode — PROVIDER_MODEL_MAP maps every mode to the same model per provider; mode only changes loop config and reasoningEffort (packages/shared/src/care-plan/care-plan-generation.ts:22-32,51-82).
Mode presets: fast and standard both run a single iteration with quality evaluation skipped; only thorough runs the evaluate/refine loop (up to 3 iterations) (care-plan-generation.ts:61-81, applied at care-plan-orchestrator.service.ts:591-599).
Cancellation: DELETE cancel endpoints (clients.controller.ts:1246, client-care-provider-tasks.controller.ts:109, client-engagement-activities.controller.ts:68, client-meals.controller.ts:95) write to the Redis CancellationRegistry; orchestrators poll it at each stage/iteration and abort in-flight LLM calls via AbortController (care-plan-orchestrator.service.ts:852-862). Keys expire after 2h; the registry entry is cleared in finally (:475).
Duplicate-job guard: one active generation per client per feature — an existing non-cancelled queued/active job short-circuits with the existing jobId (client-care-plans.service.ts:455-488; same pattern for tasks client-care-provider-tasks.service.ts:1530-1546).
BullMQ resilience: jobs attempts: 3 with 5s exponential backoff (client-care-plans.service.ts:526-531); workers run concurrency: 3, lockDuration: 900_000 (15 min), stalledInterval: 30s, maxStalledCount: 1, with best-effort extendLock before each long stage (clients/processors/care-plan.processor.ts:35-40, care-plan-orchestrator.service.ts:868-878).
Cost controls: per-call cost estimates are computed and logged only (openai.provider.ts:144-146, anthropic.provider.ts:143-145); token usage is accumulated into job results (care-plan-orchestrator.service.ts:409-418). No budget caps, per-tenant quotas, or billing aggregation exist in code.
Rate limits: only transcription is throttled (@Throttle 20/min, ai-transcription.controller.ts:26). AI chat, plate-ai, and generation endpoints have no AI-specific throttles (global throttler config aside).
Truncation handling: both providers retry once with doubled output budget (cap 32k) and hard-fail rather than return truncated JSON (openai.provider.ts:114-134, anthropic.provider.ts:107-133).
Permissions: BusinessPermission.AI_FEATURES gates AI settings (ai-settings.controller.ts:24,33) and is described as "Access AI generation tools" (packages/shared/src/constants/permission-groups.ts:534); KB has dedicated KNOWLEDGE_BASE_VIEW/MANAGE permissions.
Chat safety: keyword-regex crisis/emergency interception happens server-side before any model call (ai-chat-safety.service.ts); chat model IDs are whitelist-validated.
Tenant isolation in AI paths: KB writes take business from the JWT, search filters by businessId in Qdrant, and cross-tenant KB references in training jobs are rejected (knowledge-base.service.ts:47-66,347-358,466-484). The single-document findOne/update/remove paths are the exception; see the knowledge-base item under Open Questions & Gaps.
Prompt-injection hygiene exists only piecemeal: daily-motivation theme sanitization (ai-daily-motivation.service.ts:61-63); user additionalPrompt strings flow into generation prompts unsanitized elsewhere.
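
As a worked example of the log-only cost estimates mentioned under Cost controls, the per-million-token rates quoted on this page can be applied as below. The function and type names are hypothetical, and treating reasoning tokens as a separate billed term (rather than part of output tokens) is an assumption based on the "in/out/reasoning" wording.

```typescript
type Provider = 'openai' | 'anthropic';

interface Usage {
  inputTokens: number;
  outputTokens: number;
  reasoningTokens: number;
}

interface Rates {
  inputPerMillion: number;
  outputPerMillion: number;
  reasoningPerMillion: number;
}

// USD per 1M tokens, as quoted for the two providers on this page.
const RATES: Record<Provider, Rates> = {
  openai: { inputPerMillion: 5, outputPerMillion: 30, reasoningPerMillion: 30 },
  // The Anthropic provider reports reasoning tokens as 0, so this rate never applies.
  anthropic: { inputPerMillion: 3, outputPerMillion: 15, reasoningPerMillion: 0 },
};

function estimateCostUsd(provider: Provider, usage: Usage): number {
  const rates = RATES[provider];
  const total =
    usage.inputTokens * rates.inputPerMillion +
    usage.outputTokens * rates.outputPerMillion +
    usage.reasoningTokens * rates.reasoningPerMillion;
  return total / 1_000_000;
}
```

For a call with 12,000 input and 4,000 output tokens, the Anthropic estimate is (12,000 × 3 + 4,000 × 15) / 1,000,000 = $0.096; the same call on OpenAI with 1,000 reasoning tokens comes to (60,000 + 120,000 + 30,000) / 1,000,000 = $0.21.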

Surfaces (Web & Mobile)

Web (`apps/web`)

Generation dialogs (admin/care-manager, per client): generate-care-plan-dialog.tsx, generate-meals-dialog.tsx, generate-engagement-activities-dialog.tsx, generate-ai-tasks-dialog.tsx under app/(app)/(admin)/dashboard/clients/[id]/edit/... — each offers mode (fast/standard/thorough), provider (where supported), focus areas, and additional prompt, pre-filled from business AI settings via useAISettingsDefaults (hooks/use-ai-settings.ts). Provider icons in components/icons/ai-provider-icons.
AI settings: components/ai-settings-dialog.tsx (edit; requires AI_FEATURES) and components/ai-settings-readonly-banner.tsx (non-admin read-only view).
AI chat: app/(app)/(admin)/dashboard/ai-chat/page.tsx.
Plate editor AI: components/editor/plugins/ai-kit.tsx and copilot-kit.tsx + components/ui/ai-menu.tsx → web routes app/api/ai/command/route.ts and app/api/ai/copilot/route.ts (Vercel AI Gateway). Backend-proxy routes app/api/ai/plate-command/route.ts, plate-transform/route.ts exist but appear unused.
Guided narrative / guided generate: dashboard/initial-assessments/[id]/guided-narrative/page.tsx, _components/guided-narrative-flow.tsx, hooks/use-guided-generate.ts, use-guided-narrative.ts.
Text utilities: components/ui/ai-improve-popover.tsx / ai-ask-popover.tsx (hooks/use-text-improve.ts → /ai/text-improve), hooks/use-voice-to-text.ts (→ /ai/transcribe).
Generation progress: dashboards subscribe to the notification WebSocket; orchestrators emit *_GENERATION_PROGRESS/COMPLETED/FAILED events to the user-<userId> room (care-plan-orchestrator.service.ts:817-823; event names in packages/shared/src/enums/socket-events.ts:70-102), plus push notifications on completion/failure.

Mobile (`apps/mobile`)

AI chat: app/(ai-chat)/index.tsx (conversation list) and app/(ai-chat)/[id].tsx (thread); REST client lib/ai-chat-api.ts; streaming via lib/ai-chat-stream.ts, which uses expo/fetch to read the backend's AI SDK UI-message SSE stream (axios/RN fetch can't stream) with the JWT from SecureStore.
No mobile surfaces were found for triggering care-plan/meal/engagement generation — those are web-only.

Cross-Module Dependencies

Module	Relationship
`clients/`	Enqueues care-plan, tasks, engagement, appointment-history jobs (`clients.module.ts:244-248`); owns processors for care-plan/tasks/engagement; supplies `ClientNarrativeService` narrative consumed by nearly every generation context; stores `aiSummary` on Client
`client-meals/`	Owns `client-meals` queue + processor; saves generated meals; meal images via Gemini
`notification/`	`NotificationGateway` (WebSocket progress events) and `NotificationService` (push on complete/fail) injected into every orchestrator/processor
`trainings/`	RAG consumer of `knowledge-base` vector search; `training-generation` + `slides-video-merge` queues; uses `AIImageService`
`shifts/`	`care-readiness-assessment` queue → `AIClientCareReadinessAssessmentService` (02-assessments.md)
`wound-care/`	`wound-analysis` queue, Claude vision-style analysis (08-health-monitoring.md)
`daily-motivations/`	Uses `LLMProviderFactory` directly (10-daily-living.md)
`care-proposals/`	AI proposal content + `CareProposalAiSummaryService` consumed by guided generate (03-care-proposals.md)
`initial-assessments/`	Guided narrative flow (web) backed by `AIGuidedNarrativeService` (02-assessments.md)
`common/`	`GeminiService` (image gen, key rotation/circuit breaker) and `AIImageService` facade
`@anaya/shared`	Single source for providers/modes/presets/focus areas, chat models, socket event names

Open Questions & Gaps

Business AI defaults are advisory only. ai_settings is read exclusively by the web to pre-fill dialogs (apps/web/hooks/use-ai-settings.ts:77-87); backend orchestrators hard-default to 'anthropic'/'standard' when job data omits provider/mode (care-plan-orchestrator.service.ts:134,138). API callers bypassing the web UI never see the business defaults. Intent (server-side enforcement vs. UI convenience) cannot be determined from code.
Two parallel Plate-editor AI stacks. The backend plate-ai module (OpenAI direct, JWT-guarded) coexists with the web's own /api/ai/command + /api/ai/copilot routes (Vercel AI Gateway, AI_GATEWAY_API_KEY). The web's proxy routes to the backend (/api/ai/plate-command, plate-transform) forward no auth header, so they would 401 against JwtAuthGuard — and no web component references them. Which stack is live (and whether the backend module is dead code) cannot be determined from code.
Apparent dead code: AIClientCarePlanService (one-shot gpt-5.5 care plan, only referenced from ai.module.ts) and most of AIClientService (5 hardcoded OpenAI Assistant IDs in source, ai-client.service.ts:11-16; only generateTitle has a caller). Hardcoded assistant IDs are also environment-coupling risks (same IDs across staging/prod).
Engagement activities ignore the provider abstraction: they use a fixed OpenAI gpt-5.5 call while sibling features (care plan, tasks, meals) offer a choice of OpenAI or Anthropic, and engagement job data has no provider field (clients/processors/client-engagement-activities.processor.ts:88-100). Whether this inconsistency is by design or the feature simply lags behind cannot be determined from code.
"Standard" mode skips quality evaluation (GENERATION_MODE_PRESETS.standard.skipQualityEvaluation = true), so fast vs. standard differ only in thresholds (which then only matter in the skipped evaluator) and reasoning effort. The thresholds in fast/standard presets are effectively unused.
Model naming drift: shared option labels say "Claude Sonnet 4.6 / GPT-5.5", but other services pin a wide range of models (gpt-4-turbo, gpt-4o, gpt-4o-mini, gpt-4.1-nano, whisper-1, dall-e-3, Gemini variants). There is no central model registry, so upgrades require touching each service.
PHI flows to three external AI vendors with no redaction layer (visible in code): full client narratives (demographics, diagnoses, medications, assessments) to OpenAI and Anthropic for care plans/tasks/meals/engagement/summaries; meal/activity image prompts to Google Gemini; caregiver audio to OpenAI Whisper; knowledge-base document text to OpenAI embeddings and to the external Qdrant instance (QDRANT_URL/QDRANT_API_KEY). Whether BAAs/zero-retention agreements cover these calls is outside the code.
Knowledge-base single-document endpoints are not tenant-scoped: findOne, update, and remove query by _id only (knowledge-base.service.ts:256-341), and the controller checks permission but not business ownership — a user with KNOWLEDGE_BASE_VIEW/MANAGE in business A who learns a business-B document ID can read/update/delete it. findAll, semanticSearch, and assertDocumentsBelongToBusiness are scoped, so this looks like an oversight rather than a design choice.
Anthropic token accounting under-reports: reasoning tokens are hardcoded to 0 (anthropic.provider.ts:139), and cost estimates are log-only everywhere — no persisted per-tenant usage/cost records despite usage being saved per AI chat message.
Chat safety filter is English-only keyword regex (ai-chat-safety.service.ts:7-20) — paraphrased or non-English crisis messages pass straight to the model; mitigation then depends entirely on prompt instructions.
updateConversationTitle only sets a title when none exists (ai-chat.service.ts:141-149) — there is no user-facing rename; whether that's intended cannot be determined from code.
Emergency fallback care plan (confidence 0.35, mostly empty sections) is saved as a completed care plan when first-iteration generation fails (care-plan-orchestrator.service.ts:566-577,973-1005); the only signals are openQuestions/missingTopics text inside the plan. Reviewers may not notice the plan is a fallback.
