AI Features
Part of the Anaya Care product wiki. See 00-overview.md.
Purpose
Anaya Care embeds LLM-backed generation across the care workflow: agentic background generation of care plans, care-provider tasks, and meals; one-shot generation of engagement activities, assessments, summaries, and quotes; a general-purpose AI chat assistant; rich-text editor AI (Plate); audio transcription; AI image generation; and a tenant-scoped knowledge base with vector (RAG) search. The backend talks directly to OpenAI, Anthropic, and Google Gemini; the web app additionally has its own editor-AI routes that call the Vercel AI Gateway.
This page documents the AI plumbing and each feature's trigger → model → context → output path. Wound photo analysis is covered in 08-health-monitoring.md, the care readiness quiz in 02-assessments.md, and daily motivations in 10-daily-living.md — each is cross-linked briefly below.
Entities & Data Model
ai_settings — per-business AI defaults
apps/backend/src/ai-settings/entities/ai-settings.entity.ts
| Field | Type | Notes |
|---|---|---|
| `business` | ObjectId → Business | required, unique (one settings doc per tenant) |
| `defaultProvider` | string | default `'anthropic'` |
| `defaultMode` | string | default `'standard'` |
| `defaultAdditionalPrompt` | string | default `''` |
Created lazily via upsert on first read (ai-settings.service.ts:19-30). Read/written through GET|PATCH /ai-settings, both guarded by BusinessPermission.AI_FEATURES (ai-settings.controller.ts:24,33).
ai_conversations / ai_messages — AI chat sessions
apps/backend/src/ai-chat/entities/ai-conversation.entity.ts, ai-message.entity.ts
| Collection | Key fields |
|---|---|
| `ai_conversations` | `business` (ObjectId, optional; null for independent care providers), `userId` (required), `title?` (auto-generated after first exchange), `lastMessageAt`; compound index `{business, userId, lastMessageAt}` |
| `ai_messages` | `conversationId`, `business?`, `role` (user/assistant), `content`, `model?`, `usage?` (inputTokens/outputTokens/totalTokens); index `{conversationId, createdAt}` |
knowledge_base_documents + Qdrant knowledge_base_chunks
apps/backend/src/knowledge-base/entities/knowledge-base-document.entity.ts
| Field | Type | Notes |
|---|---|---|
| `business` | ObjectId → Business | required, indexed; tenant owner taken from the authenticated user, never the body (`knowledge-base.service.ts:47-66`) |
| `title`, `description?`, `tags[]` | strings | tags indexed |
| `file` | embedded | S3 key, url, type, name, size, optional PDF thumbnailKey/Url |
| `processingStatus` | enum `DocumentProcessingStatus` | pending → processing → completed/failed |
| `uploadedBy` | ObjectId → User | |
| `isDeleted` | boolean | soft delete |
Vector store: Qdrant collection knowledge_base_chunks, 1536-dim cosine, payload-indexed on businessId and documentId (knowledge-base/services/qdrant.service.ts:6,63-77). Each point payload carries content, businessId, documentId, chunkIndex (vector-search.service.ts:53-62).
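As a rough illustration of how the payload fields and the two payload indexes fit together, the sketch below models a chunk payload and builds a Qdrant-style `must` filter. `ChunkPayload` and `buildTenantFilter` are hypothetical names chosen for this example, not identifiers from the codebase; only the field names come from the description above.

```typescript
// Hypothetical sketch: payload shape and tenant filter for knowledge_base_chunks.
// Field names follow the description above; everything else is an assumption.
interface ChunkPayload {
  content: string;
  businessId: string;
  documentId: string;
  chunkIndex: number;
}

interface MatchValue { key: string; match: { value: string } }
interface MatchAny { key: string; match: { any: string[] } }
interface QdrantFilter { must: Array<MatchValue | MatchAny> }

// Every condition under `must` has to hold, so a businessId condition
// keeps a search inside one tenant regardless of the query vector.
function buildTenantFilter(businessId: string, documentIds?: string[]): QdrantFilter {
  const must: Array<MatchValue | MatchAny> = [
    { key: 'businessId', match: { value: businessId } },
  ];
  if (documentIds && documentIds.length > 0) {
    // Optional narrowing to selected documents (as the training RAG consumer does).
    must.push({ key: 'documentId', match: { any: documentIds } });
  }
  return { must };
}
```

Both filtered keys are the payload-indexed ones, which is what keeps such a filter cheap on the Qdrant side.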
AI output cached on other entities
- `Client.aiSummary` + `Client.aiSummaryUpdatedAt`: cached ~3000-char plain-text client profile summary (`clients/services/client-ai-summary.service.ts:74-77`).
- Generated care plans, tasks, meals, engagement activities, trainings etc. are stored in their own module collections (see sibling pages); AI metadata such as iteration counts/confidence is returned in job results, not persisted as a first-class audit collection.
Shared config types (@anaya/shared)
- `packages/shared/src/care-plan/care-plan-generation.ts`: `AIProvider` (`'openai' | 'anthropic'`), `GenerationMode` (`'fast' | 'standard' | 'thorough'`), `PROVIDER_MODEL_MAP`, `GENERATION_MODE_PRESETS`, care-plan + task focus-area option lists.
- `packages/shared/src/meal-generation/meal-generation.ts`, `packages/shared/src/engagement-activity-generation/engagement-activity-generation.ts`: focus areas and mode UI copy (estimated times, "Recommended" badge).
- `packages/shared/src/ai-chat/ai-chat.ts`: chat entity DTOs, `AI_CHAT_MODELS` whitelist, `DEFAULT_AI_CHAT_MODEL = 'claude-sonnet-4-6'`, starter prompts.
- `packages/shared/src/ai/ai-settings.ts`: `AISettingsResponseDTO`.
Workflows & State Machines
Shared provider/agent infrastructure
- `LLMProviderFactory` (`apps/backend/src/ai/providers/llm-provider.factory.ts`) resolves `AIProvider` + `GenerationMode` → model via `PROVIDER_MODEL_MAP`. Currently every mode maps to the same model per provider: OpenAI → `gpt-5.5`, Anthropic → `claude-sonnet-4-6` (`packages/shared/src/care-plan/care-plan-generation.ts:22-32`). API keys come from the `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` env vars.
- `OpenAIProvider` (`ai/providers/openai.provider.ts`): `openai.responses.create()` with strict JSON-schema output, `reasoning.effort`, 5-min default timeout via `AbortController`, and one truncation retry doubling `max_output_tokens` up to 32k. Logs estimated cost ($5/$30/$30 per 1M in/out/reasoning, lines 155-161).
- `AnthropicProvider` (`ai/providers/anthropic.provider.ts`): Vercel AI SDK `generateText` with a forced tool call whose input schema is the desired JSON schema (`toolChoice`, lines 78-86); one transient-error retry and one truncation retry; reasoning tokens always reported as 0 (line 139); cost $3/$15 per 1M (lines 154-159).
- `BaseAgent` (`ai/agents/base/base.agent.ts`): retry wrapper for all agents. Up to 2 retries with exponential backoff (1s base); retryability is classified by error-message keywords (timeout/rate limit/5xx are retryable; invalid/not found/unauthorized/cancelled are not); records token-usage and timing metrics.
- `CancellationRegistry` (`ai/agents/base/cancellation-registry.ts`): Redis-backed (`CACHE_MANAGER`) key `job:cancelled:<jobId>` with 2h TTL, bridging the cancel API endpoint and the running BullMQ worker (`job.data` updates don't propagate to a running processor). Orchestrators call `syncCancellation()` at every stage/iteration and abort in-flight LLM calls via `AbortController` (`care-plan-orchestrator.service.ts:852-862`).
- Prompt files: markdown files under `apps/backend/src/ai/prompts/*.prompts.md`, parsed by `prompt-loader.ts` using `<!-- @prompt KEY -->` section markers and `{{variable}}` substitution, cached in-process.
- `AIModule` is `@Global()` and provides a raw `OpenAI` client via DI (`ai/ai.module.ts:62-67`).
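To make the prompt-file convention concrete, here is a minimal sketch of a parser for `<!-- @prompt KEY -->` section markers with `{{variable}}` substitution. It is not the real `prompt-loader.ts`: `parsePromptSections` and `renderPrompt` are hypothetical names, the accepted key characters are an assumption, and leaving unknown variables untouched is a choice made for this example.

```typescript
// Hypothetical sketch of a marker-based prompt parser (not the actual prompt-loader.ts).

/** Split a markdown prompt file into sections keyed by their @prompt marker. */
function parsePromptSections(markdown: string): Record<string, string> {
  // Assumption: keys consist of word characters, dots, slashes, and hyphens.
  const marker = /<!--\s*@prompt\s+([\w./-]+)\s*-->/g;
  const found: { key: string; markerStart: number; bodyStart: number }[] = [];
  let m: RegExpExecArray | null;
  while ((m = marker.exec(markdown)) !== null) {
    found.push({ key: m[1], markerStart: m.index, bodyStart: m.index + m[0].length });
  }
  const sections: Record<string, string> = {};
  found.forEach((f, i) => {
    // A section runs until the next marker, or to the end of the file.
    const end = i + 1 < found.length ? found[i + 1].markerStart : markdown.length;
    sections[f.key] = markdown.slice(f.bodyStart, end).trim();
  });
  return sections;
}

/** Replace {{name}} placeholders; unknown names are left as-is (assumption). */
function renderPrompt(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (whole: string, name: string) =>
    Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : whole,
  );
}
```

Parsing once and keeping the resulting object in a module-level variable would give the in-process caching behaviour described above.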
Generation mode presets (packages/shared/src/care-plan/care-plan-generation.ts:51-82):
| Mode | maxIterations | confidence | coverage | reasoningEffort | skipQualityEvaluation |
|---|---|---|---|---|---|
| fast | 1 | 0.7 | 0.75 | low | true |
| standard | 1 | 0.8 | 0.85 | medium | true |
| thorough | 3 | 0.85 | 0.9 | high | false |
Note: fast and standard differ only in thresholds/reasoning effort — both skip the quality-evaluator loop.
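The table can be read as a small lookup keyed by mode. The sketch below mirrors the values above; the property names (`confidenceThreshold`, `coverageThreshold`) and the helper `runsEvaluatorLoop` are assumptions made for illustration and may not match the shared package.

```typescript
// Hypothetical mirror of the preset table; property names are assumptions.
type GenerationMode = 'fast' | 'standard' | 'thorough';

interface ModePreset {
  maxIterations: number;
  confidenceThreshold: number;
  coverageThreshold: number;
  reasoningEffort: 'low' | 'medium' | 'high';
  skipQualityEvaluation: boolean;
}

const PRESETS_SKETCH: Record<GenerationMode, ModePreset> = {
  fast: { maxIterations: 1, confidenceThreshold: 0.7, coverageThreshold: 0.75, reasoningEffort: 'low', skipQualityEvaluation: true },
  standard: { maxIterations: 1, confidenceThreshold: 0.8, coverageThreshold: 0.85, reasoningEffort: 'medium', skipQualityEvaluation: true },
  thorough: { maxIterations: 3, confidenceThreshold: 0.85, coverageThreshold: 0.9, reasoningEffort: 'high', skipQualityEvaluation: false },
};

// Only a mode that keeps the evaluator and allows more than one pass can refine.
function runsEvaluatorLoop(mode: GenerationMode): boolean {
  const preset = PRESETS_SKETCH[mode];
  return !preset.skipQualityEvaluation && preset.maxIterations > 1;
}
```

With these values only `thorough` satisfies `runsEvaluatorLoop`, which is the point the note makes about `fast` and `standard`.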
Care plan generation (agentic, BullMQ)
- Trigger: admin/care-manager dialog on web. The backend dedupes: an active non-cancelled job for the same client returns the existing jobId (`client-care-plans.service.ts:455-488`). Job options: `attempts: 3`, exponential backoff 5s (`:523-535`).
- Provider/model: `job.data.provider || 'anthropic'` through `LLMProviderFactory` (`ai/agents/care-plan/care-plan-orchestrator.service.ts:137-142`).
- Context: ValidationAgent loads client, medications, schedules, representative profiles and requires `client.timezone` (`:230-237`); ContextBuilderAgent builds the formatted narrative from `ClientNarrativeService` (`ai/agents/care-plan/agents/context-builder.agent.ts:117`) plus schedule context and the user's `additionalPrompt`.
- Generation: `ContentGeneratorAgent` calls `provider.generateStructuredOutput` per care-plan section with prompts from `ai/prompts/care-plan.prompts.md` (`content-generator.agent.ts:750-753`) and per-section token caps. The loop's stop conditions are confidence ≥ threshold, coverage ≥ threshold with no critical issues, evaluator says no refinement, or max iterations (`care-plan-orchestrator.service.ts:722-782`).
- Fallback: if generation fails on the first iteration with no draft, a near-empty "emergency draft" with confidence 0.35 is saved so the job completes (`:973-1005`).
- Output: saved via `ClientCarePlansService.createCarePlan(..., isCompleted: true)`; completion emits `CARE_PLAN_GENERATION_COMPLETED` to the requester's socket room and a push notification (`:366-404`).
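The four stop conditions listed under Generation can be sketched as a single decision function. This is an illustration only: the type names, field names, and the order in which the conditions are checked are assumptions, not taken from the orchestrator.

```typescript
// Hypothetical sketch of the refinement loop's stop decision.
interface IterationState {
  iteration: number;        // 1-based count of completed iterations
  confidence: number;
  coverage: number;
  criticalIssues: number;
  needsRefinement: boolean; // evaluator's verdict
}

interface LoopThresholds {
  maxIterations: number;
  confidence: number;
  coverage: number;
}

type StopReason = 'confidence' | 'coverage' | 'no-refinement' | 'max-iterations' | null;

// Returns the first matching stop reason, or null to keep refining.
function stopReason(state: IterationState, limits: LoopThresholds): StopReason {
  if (state.confidence >= limits.confidence) return 'confidence';
  if (state.coverage >= limits.coverage && state.criticalIssues === 0) return 'coverage';
  if (!state.needsRefinement) return 'no-refinement';
  if (state.iteration >= limits.maxIterations) return 'max-iterations';
  return null;
}
```

Note that high coverage alone does not stop the loop in this sketch while critical issues remain, matching the "with no critical issues" qualifier above.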
Care provider tasks generation (agentic, BullMQ)
- Trigger: `ClientCareProviderTasksService` enqueues `generate-care-provider-tasks` on queue `care-provider-tasks-generation` with target schedule IDs, existing-task summaries (for dedup), mode presets, optional provider/focus areas/additional prompt (`clients/services/client-care-provider-tasks.service.ts:1548-1586`). Same duplicate-job guard + Redis-cancellation check pattern.
- Pipeline (`ai/agents/care-provider-tasks/care-provider-tasks-orchestrator.service.ts:63-72`): Validation → context building → ObjectivesAgent → StructureAgent → TaskGeneratorAgent (parallel per-structure with a semaphore) → TasksQualityEvaluatorAgent → refinement loop → DeduplicationAgent → save. LLM calls go through the same `LLMProviderFactory` / `generateStructuredOutput` path (`agents/task-generator.agent.ts:338`).
- Deterministic helpers: `MealTaskMapperService` and `VitalSignsTaskMapperService` map existing meal plans / vital-sign requirements into tasks without LLM calls; `ContextPrioritizationService`, `ClinicalGuidelinesService`, `CognitiveAdaptationService`, `FewShotExamplesService`, `SemanticSimilarityService` shape prompts and dedup (`ai/agents/care-provider-tasks/services/`).
- Progress: `CARE_PROVIDER_TASKS_GENERATION_PROGRESS/COMPLETED/FAILED` socket events (`packages/shared/src/enums/socket-events.ts:70-72`).
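The "parallel per-structure with a semaphore" step in the pipeline can be pictured as a bounded-concurrency map. The sketch below uses a fixed pool of runners, which has the same effect as a counting semaphore; `mapWithLimit` is a hypothetical helper, not the agent's actual implementation, and it assumes `limit` is at least 1.

```typescript
// Hypothetical sketch: run an async worker over items with at most `limit` in flight.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner claims items one at a time until none are left.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const runners: Promise<void>[] = [];
  for (let k = 0; k < Math.min(limit, items.length); k++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results; // same order as `items`
}
```

Results keep the input order even though items finish at different times, which matters if the structures are later matched back to their generated tasks by position.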
Meal generation (agentic, BullMQ)
- Trigger: `ClientMealsService` enqueues `generate-meals` on queue `client-meals` (`client-meals/services/client-meals.service.ts:278`); processor (concurrency 3, 15-min lock) delegates to `MealOrchestratorService` (`client-meals/processors/client-meals.processor.ts:38-49`).
- Pipeline (`ai/agents/client-meals/client-meals-orchestrator.service.ts`): Validation → MealContextBuilder → MealPlannerAgent → MealGeneratorAgent (structured output via the selected provider, `agents/meal-generator.agent.ts:186`) → MealQualityEvaluatorAgent (thorough only). Job data carries meal counts per type, preferred/avoid ingredients, restrictions, mode/provider/focus areas (`:81-97`).
- Images: when `generateImages` is set, the orchestrator generates a photo per meal (`:313-323`) through `AIClientMealsService`, which uses Google Gemini image generation (`ai/services/ai-client-meals.service.ts:1024`).
- Variations: a separate `generate-variations` job re-uses a base meal and calls `AIClientMealsService` directly (OpenAI Responses API, `gpt-5.5`, `:229,328,399`).
- Progress: `CLIENT_MEALS_GENERATION_*` and `MEAL_VARIATION_GENERATION_*` socket events (`socket-events.ts:85-93`).
Engagement activity generation (one-shot loop, BullMQ — not agentic)
- Trigger: `ClientEngagementActivitiesService` enqueues `generate-engagement-activity` on queue `client-engagement-activities` (`clients/services/client-engagement-activities.service.ts:299`).
- Flow (`clients/processors/client-engagement-activities.processor.ts:85-249`): build the client's formatted narrative (`clientsService.getFormattedNarrative`), then loop `numberOfActivities` times calling `AIClientEngagementActivityService.generateActivity`, a fixed OpenAI `gpt-5.5` Responses-API call (`ai/services/ai-client-engagement-activity.service.ts:25-29,139-140`) with Montessori theme/type/difficulty/duration and optional mode/focus areas/additional prompt. No provider selection: the shared `LLMProviderFactory` is not used here, unlike care plans/tasks/meals.
- Images: optional, generated per activity via Gemini (1:1, 1K) and uploaded to public S3 (`ai-client-engagement-activity.service.ts:51-84`).
- Progress: `CLIENT_ENGAGEMENT_ACTIVITY_GENERATION_*` events (`socket-events.ts:96-98`); cancellation via `CancellationRegistry` (`client-engagement-activities.service.ts:346`).
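A one-shot loop combined with registry-based cancellation might look like the sketch below. The in-memory class stands in for the Redis-backed `CancellationRegistry` (only the `job:cancelled:<jobId>` key format and the 2h TTL come from this page); its method names, and the assumption that the processor checks for cancellation before each activity, are illustrative.

```typescript
// Hypothetical in-memory stand-in for the Redis-backed registry.
class InMemoryCancellationRegistry {
  private expiresAt: Record<string, number> = {};
  private ttlMs: number;

  constructor(ttlMs: number = 2 * 60 * 60 * 1000) { // 2h TTL, as described above
    this.ttlMs = ttlMs;
  }

  cancel(jobId: string, now: number = Date.now()): void {
    this.expiresAt['job:cancelled:' + jobId] = now + this.ttlMs;
  }

  isCancelled(jobId: string, now: number = Date.now()): boolean {
    const expiry = this.expiresAt['job:cancelled:' + jobId];
    return expiry !== undefined && expiry > now;
  }
}

// Generate up to `count` activities, stopping early once the job is cancelled.
async function generateActivities(
  jobId: string,
  count: number,
  registry: InMemoryCancellationRegistry,
  generateOne: (index: number) => Promise<string>,
): Promise<string[]> {
  const activities: string[] = [];
  for (let i = 0; i < count; i++) {
    if (registry.isCancelled(jobId)) break; // assumption: checked before each call
    activities.push(await generateOne(i));
  }
  return activities;
}
```

Because the flag lives outside the job payload, a cancel request issued while the worker is mid-loop is seen on the next iteration, which is the gap the registry exists to bridge.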
AI chat (conversational assistant)
- Who: any authenticated user; the controller is `@SkipBusinessCheck()` / `@SkipBusinessScope()` so independent care providers without a business can use it (`ai-chat/ai-chat.controller.ts:33-35`). Conversations are owned by `userId` and scoped to the user's `businessId ?? null`.
- Safety gate: before any LLM call, a regex pre-filter classifies the message as `CRISIS` / `EMERGENCY` / `SAFE`; crisis/emergency messages get canned 988/911 responses and never reach the model (`ai-chat-safety.service.ts:7-32`, `ai-chat.controller.ts:104-129`).
- Model: user-selectable from the `AI_CHAT_MODELS` whitelist (GPT-4o/4o-mini/4.1/4.1-mini/4.1-nano/o4-mini, Claude Sonnet 4.6 / Haiku 4.5 / Opus 4.6), default `claude-sonnet-4-6` (`packages/shared/src/ai-chat/ai-chat.ts:43-102`); the DTO validates against the whitelist. Titles are auto-generated on the first exchange with `gpt-4.1-nano` (`ai-chat.service.ts:24,229-243`).
- Context: system prompt composed from generic `anaya/base` + `anaya/chat` prompt sections (scope of practice, glossary, safety escalation, identity, platform knowledge, behavioral rules; `ai-chat-prompt.service.ts:14-30`) plus the last 50 messages of the conversation (`ai-chat.service.ts:211-227`). The chat has no access to client records, the knowledge base, or any tenant data; it is a pure conversational assistant.
- Output: streamed via AI SDK `streamText().pipeUIMessageStreamToResponse(res)`; assistant message + token usage persisted in `onFinish` (`ai-chat.controller.ts:152-192`).
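The safety gate described above is a classify-then-short-circuit step. A minimal sketch follows; the regex patterns and response wording are invented for illustration (the real lists live in `ai-chat-safety.service.ts`), and pairing crisis with 988 and emergency with 911 is an assumption about how the canned responses are split.

```typescript
// Hypothetical sketch of a regex pre-filter; patterns and wording are illustrative only.
type SafetyLevel = 'CRISIS' | 'EMERGENCY' | 'SAFE';

const CRISIS_PATTERNS: RegExp[] = [/\bsuicid(e|al)\b/i, /\bkill myself\b/i];
const EMERGENCY_PATTERNS: RegExp[] = [/\bchest pain\b/i, /\b(can'?t|cannot) breathe\b/i, /\bunresponsive\b/i];

function classifyMessage(text: string): SafetyLevel {
  if (CRISIS_PATTERNS.some((p) => p.test(text))) return 'CRISIS';
  if (EMERGENCY_PATTERNS.some((p) => p.test(text))) return 'EMERGENCY';
  return 'SAFE';
}

// Returns a canned reply for flagged messages, or null when the model may be called.
function cannedResponse(level: SafetyLevel): string | null {
  if (level === 'CRISIS') return 'If you are in crisis, call or text 988 now.';
  if (level === 'EMERGENCY') return 'This may be an emergency. Call 911 now.';
  return null;
}
```

A controller would call `classifyMessage` first and only start streaming from the model when `cannedResponse` returns `null`. Keyword matching of this kind is fast and deterministic but misses paraphrases, which is the limitation noted in Open Questions.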
Plate AI (rich-text editor AI — backend module)
plate-ai is the AI service for the Plate.js rich-text editor (generate / edit / comment / copilot-complete / transform commands) — it is not meal-photo analysis.
- Endpoints: `POST /plate-ai/command|complete|transform` (JWT-guarded) stream SSE chunks (`plate-ai/plate-ai.controller.ts`).
- Model: `gpt-4o-mini` by default (`command` accepts a model in the DTO) via OpenAI chat completions (`plate-ai/plate-ai.service.ts:18,33,66,124`); system prompts = `anaya/context` (full platform context) + tool-specific sections of `ai/prompts/plate-ai.prompts.md` (`:9,92-109`).
- Web reality check: the web app's Plate editor (`apps/web/components/editor/plugins/ai-kit.tsx`, `copilot-kit.tsx`) calls the web's own Next.js routes `/api/ai/command` and `/api/ai/copilot`, which use the Vercel AI Gateway (`createGateway` + `AI_GATEWAY_API_KEY`, `apps/web/app/api/ai/command/route.ts`, `copilot/route.ts`), not the backend module. The web proxy routes to the backend (`/api/ai/plate-command`, `/api/ai/plate-transform`) forward no `Authorization` header while the backend controller requires JWT, and no web component references them. See Open Questions.
Knowledge base (RAG)
- Ingest: create document → status `pending` → async `indexDocument()`: S3 download → text extraction (`text-extraction.service.ts`) → chunking (~3200 chars with 1600-char overlap, ≈800/400 tokens; `chunking.service.ts:14-17`) → OpenAI `text-embedding-3-small` embeddings (1536-dim, `embedding.service.ts:17,47`) → Qdrant upsert with rollback on failure (`vector-search.service.ts:27-78`). Status moves to `completed` / `failed`; PDFs get an async thumbnail; `POST :id/retry-indexing` re-runs failed docs (`knowledge-base.service.ts:442-455`).
- Search: `POST /knowledge-base/search` embeds the query and runs Qdrant cosine search filtered by `businessId`, deduplicating to the best chunk per document (`knowledge-base.service.ts:347-398`).
- Permissions: `KNOWLEDGE_BASE_MANAGE` for create/update/delete/retry, `KNOWLEDGE_BASE_VIEW` for read/search (`knowledge-base.controller.ts`).
- RAG consumer: AI training generation retrieves KB chunks via `vectorSearchService.search(...)` scoped to selected document IDs (`trainings/services/training-ai-generation.service.ts:113,229`) and generates training content with `gpt-5.5` (`:140`); document ownership is asserted before a job can reference KB docs (`trainings.service.ts:2684`, `knowledge-base.service.ts:466-484`).
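Two steps in the list above are easy to show in code: overlapping chunking at ingest and best-chunk-per-document deduplication at search time. The sketch below uses a plain fixed-size sliding window with the 3200/1600 character figures quoted above; the real `chunking.service.ts` may split on sentence or paragraph boundaries instead, and `chunkText`, `SearchHit`, and `bestChunkPerDocument` are hypothetical names.

```typescript
// Hypothetical sketch: fixed-window chunking with overlap (sizes in characters).
function chunkText(text: string, size: number = 3200, overlap: number = 1600): string[] {
  if (overlap >= size) throw new Error('overlap must be smaller than size');
  const chunks: string[] = [];
  const step = size - overlap; // each window starts `step` chars after the previous one
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window already reached the end
  }
  return chunks;
}

interface SearchHit {
  documentId: string;
  chunkIndex: number;
  score: number; // cosine similarity, higher is better
}

// Keep only the highest-scoring chunk per document, best documents first.
function bestChunkPerDocument(hits: SearchHit[]): SearchHit[] {
  const best: Record<string, SearchHit> = {};
  for (const hit of hits) {
    const current = best[hit.documentId];
    if (!current || hit.score > current.score) best[hit.documentId] = hit;
  }
  return Object.keys(best)
    .map((id) => best[id])
    .sort((a, b) => b.score - a.score);
}
```

With a 50% overlap every character (except near the edges) appears in two chunks, so a passage that straddles a window boundary is still fully contained in at least one embedding.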
Client AI summary & narrative services
- `ClientNarrativeService` (`clients/services/client-narrative.service.ts`) is not an LLM service: it deterministically formats client demographics, medical records, medications, schedules, and assessments (RTI, simplified RTI, Montessori, Allen cognitive level, meal assessment) into a long markdown narrative. This narrative is the canonical context input for care-plan context building, engagement activities, meals, and the AI summary below.
- `ClientAiSummaryService` (`clients/services/client-ai-summary.service.ts`) condenses that narrative into a ~3000-char plain-text caregiver-facing profile via `claude-sonnet-4-6` (`:29`), cached on `Client.aiSummary`; `getSummary()` auto-generates on first read (`:82-104`). Consumed as document context by guided generation (`ai/services/ai-guided-generate.service.ts:71-73`).
Shift summary generation + content filter
- AI narrative: `ShiftSummaryService.generateAiSummary` produces an audience-tailored structured summary (headline + sections with paragraphs/bullets/severity flags) using AI SDK `generateObject` with `claude-sonnet-4-6` and a Zod schema whose section-id enum is restricted to the audience's allowed sections. The model cannot emit out-of-audience sections, and a second filter re-checks after decode (`clients/services/shift-summary.service.ts:1222-1329`).
- Content filter: `ShiftSummaryContentFilterService` is a deterministic, non-LLM visibility filter that maps the shared `CONTENT_MATRIX` (`ContentItem` × `ExportAudience` → include/consider/exclude) onto the shift-summary data sections; e.g. family audiences get alarming caregiver notes stripped and only critical threshold warnings (`clients/services/shift-summary-content-filter.service.ts`).
One-shot AI services inventory (apps/backend/src/ai/services/)
| Service | Model / provider | Purpose & output |
|---|---|---|
| `ai-guided-generate.service.ts` | `claude-sonnet-4-6` + AI SDK tools (`ask_questions`, `generate_content`), max 10 questions | Interactive Q&A → field content for client / care-proposal forms; pulls the doc's AI summary as context (`:71-73,176,190-215`) |
| `ai-guided-narrative.service.ts` | `claude-haiku-4-5` (conversation, summary, extraction), `claude-sonnet-4-6` (synthesis) (`:541,648,788,846,890`) | Guided narrative flow for initial assessments (`POST /ai/guided-narrative/*`) |
| `ai-text-improve.service.ts` | `gpt-4o-mini` (`:35`) | Improve/rewrite text fields (web "AI improve" popover) |
| `ai-transcription.service.ts` | `whisper-1` (`:30`) | Voice-to-text; 25 MB upload cap, throttled 20 req/min (`ai-transcription.controller.ts:26-30`) |
| `ai-client-care-readiness-assessment.service.ts` | `gpt-5.5` Responses API + care-readiness prompts (`:57`) | Care readiness quiz questions (see 02-assessments.md); driven by the `care-readiness-assessment` queue in `shifts/` |
| `ai-client-care-proposal.service.ts` | `claude-sonnet-4-6` (`:79`) | Care proposal content (see 03-care-proposals.md) |
| `ai-client-narrative-report.service.ts` | `gpt-5.5` (`:9`) | Narrative report generation |
| `ai-home-safety-recommendation.service.ts` | `gpt-4o` (`:26`) | Home-safety recommendations |
| `ai-client-medication.service.ts` | `gpt-4-turbo` (`:57,114`) | Medication-related generation (see 07-medications.md) |
| `ai-job-posting.service.ts` | `gpt-4-turbo` (`:94,202`) | Job posting copy (see 06-job-postings.md) |
| `ai-client-care-plan.service.ts` | `gpt-5.5` (`:515`) | Legacy single-shot care plan generation; registered in `ai.module.ts` but no callers found outside the module (superseded by the agentic orchestrator) |
| `ai-client.service.ts` (`AIClientService`) | Hardcoded OpenAI Assistants API IDs (5 `asst_*` constants, `:11-16`), `dall-e-3`, `gpt-4-turbo` | Mostly vestigial; only `generateTitle` is called today (doctor-appointment titles, `clients/services/client-doctors-appointments.service.ts:171`) |
Adjacent AI features (documented elsewhere, brief)
- Wound photo analysis: BullMQ queue `wound-analysis` (`wound-care/wound-care.module.ts:27`), analysis via `claude-sonnet-4-6` `generateText` with a forced tool (`wound-care/services/*:263-264`). Full coverage: 08-health-monitoring.md.
- Care readiness quiz: `care-readiness-assessment` queue + processor in `shifts/` using the gpt-5.5 service above. Full coverage: 02-assessments.md.
- Daily motivations: `AIDailyMotivationService` generates caregiver quote batches via `LLMProviderFactory('anthropic', 'fast')` structured output, with theme input sanitized against prompt injection (`daily-motivations/services/ai-daily-motivation.service.ts:50-80`). Full coverage: 10-daily-living.md.
- AI image facade: `AIImageService` (`common/services/ai-image/ai-image.service.ts`) wraps OpenAI image and Gemini (`gemini-3-pro-image-preview`) behind one API; default provider from env, per-call override; used by training content generation (`trainings/services/training-ai-content.service.ts`). `GeminiService` (`common/services/gemini/gemini.service.ts`) adds API-key rotation, circuit breaker, and quota handling (default model `gemini-2.0-flash`, `:67-68`).
- Training generation (`training-generation` queue) and blog generation (`blog-generation` queue) are additional LLM consumers in `trainings/` and `blog/`.
Business Rules & Constraints
- Provider selection is per-request, not enforced per-business. Generation job data carries `provider` from the web dialog; orchestrators default to `'anthropic'` when absent (`care-plan-orchestrator.service.ts:138`, `client-meals-orchestrator.service.ts:103`). The stored `ai_settings.defaultProvider/defaultMode/defaultAdditionalPrompt` are applied client-side only, as web form defaults via `useAISettingsDefaults` (`apps/web/hooks/use-ai-settings.ts:58-90`); the backend never reads `AISettingsService` during generation.
- Model is independent of mode: `PROVIDER_MODEL_MAP` maps every mode to the same model per provider; mode only changes loop config and `reasoningEffort` (`packages/shared/src/care-plan/care-plan-generation.ts:22-32,51-82`).
- Mode presets: fast and standard both run a single iteration with quality evaluation skipped; only thorough runs the evaluate/refine loop (up to 3 iterations) (`care-plan-generation.ts:61-81`, applied at `care-plan-orchestrator.service.ts:591-599`).
- Cancellation: `DELETE` cancel endpoints (`clients.controller.ts:1246`, `client-care-provider-tasks.controller.ts:109`, `client-engagement-activities.controller.ts:68`, `client-meals.controller.ts:95`) write to the Redis `CancellationRegistry`; orchestrators poll it at each stage/iteration and abort in-flight LLM calls via `AbortController` (`care-plan-orchestrator.service.ts:852-862`). Keys expire after 2h; the registry entry is cleared in `finally` (`:475`).
- Duplicate-job guard: one active generation per client per feature. An existing non-cancelled queued/active job short-circuits with the existing jobId (`client-care-plans.service.ts:455-488`; same pattern for tasks, `client-care-provider-tasks.service.ts:1530-1546`).
- BullMQ resilience: jobs use `attempts: 3` with 5s exponential backoff (`client-care-plans.service.ts:526-531`); workers run `concurrency: 3`, `lockDuration: 900_000` (15 min), `stalledInterval: 30s`, `maxStalledCount: 1`, with best-effort `extendLock` before each long stage (`clients/processors/care-plan.processor.ts:35-40`, `care-plan-orchestrator.service.ts:868-878`).
- Cost controls: per-call cost estimates are computed and logged only (`openai.provider.ts:144-146`, `anthropic.provider.ts:143-145`); token usage is accumulated into job results (`care-plan-orchestrator.service.ts:409-418`). No budget caps, per-tenant quotas, or billing aggregation exist in code.
- Rate limits: only transcription is throttled (`@Throttle` 20/min, `ai-transcription.controller.ts:26`). AI chat, plate-ai, and generation endpoints have no AI-specific throttles (global throttler config aside).
- Truncation handling: both providers retry once with doubled output budget (cap 32k) and hard-fail rather than return truncated JSON (`openai.provider.ts:114-134`, `anthropic.provider.ts:107-133`).
- Permissions: `BusinessPermission.AI_FEATURES` gates AI settings (`ai-settings.controller.ts:24,33`) and is described as "Access AI generation tools" (`packages/shared/src/constants/permission-groups.ts:534`); KB has dedicated `KNOWLEDGE_BASE_VIEW/MANAGE` permissions.
- Chat safety: keyword-regex crisis/emergency interception happens server-side before any model call (`ai-chat-safety.service.ts`); chat model IDs are whitelist-validated.
- Tenant isolation in AI paths: KB writes take `business` from the JWT, search filters by `businessId` in Qdrant, and cross-tenant KB references in training jobs are rejected (`knowledge-base.service.ts:47-66,347-358,466-484`). But see Open Questions #8 for `findOne/update/remove`.
- Prompt-injection hygiene exists only piecemeal: daily-motivation theme sanitization (`ai-daily-motivation.service.ts:61-63`); user `additionalPrompt` strings flow into generation prompts unsanitized elsewhere.
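The truncation-handling rule above (retry once with a doubled output budget, capped, then fail hard) can be sketched independently of either provider SDK. `LlmCall`, `LlmResult`, and `callWithTruncationRetry` are hypothetical names, and treating "32k" as 32000 tokens is an assumption; the real providers may use a different exact cap.

```typescript
// Hypothetical sketch of single-retry truncation handling.
interface LlmResult {
  text: string;
  truncated: boolean; // true when the provider stopped at the output-token limit
}

type LlmCall = (maxOutputTokens: number) => Promise<LlmResult>;

const OUTPUT_TOKEN_CAP = 32000; // assumption: "32k" read as 32000

async function callWithTruncationRetry(call: LlmCall, initialBudget: number): Promise<string> {
  const first = await call(initialBudget);
  if (!first.truncated) return first.text;

  // One retry only, with the budget doubled but never above the cap.
  const retryBudget = Math.min(initialBudget * 2, OUTPUT_TOKEN_CAP);
  const second = await call(retryBudget);
  if (second.truncated) {
    // Truncated JSON is unusable, so fail instead of returning partial output.
    throw new Error('LLM output truncated after retry at ' + retryBudget + ' tokens');
  }
  return second.text;
}
```

Failing hard keeps structured-output consumers from parsing half a JSON document; the BullMQ `attempts: 3` policy then decides whether the whole job is retried.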
Surfaces (Web & Mobile)
Web (apps/web)
- Generation dialogs (admin/care-manager, per client): `generate-care-plan-dialog.tsx`, `generate-meals-dialog.tsx`, `generate-engagement-activities-dialog.tsx`, `generate-ai-tasks-dialog.tsx` under `app/(app)/(admin)/dashboard/clients/[id]/edit/...`. Each offers mode (fast/standard/thorough), provider (where supported), focus areas, and additional prompt, pre-filled from business AI settings via `useAISettingsDefaults` (`hooks/use-ai-settings.ts`). Provider icons in `components/icons/ai-provider-icons`.
- AI settings: `components/ai-settings-dialog.tsx` (edit; requires `AI_FEATURES`) and `components/ai-settings-readonly-banner.tsx` (non-admin read-only view).
- AI chat: `app/(app)/(admin)/dashboard/ai-chat/page.tsx`.
- Plate editor AI: `components/editor/plugins/ai-kit.tsx` and `copilot-kit.tsx` + `components/ui/ai-menu.tsx` → web routes `app/api/ai/command/route.ts` and `app/api/ai/copilot/route.ts` (Vercel AI Gateway). Backend-proxy routes `app/api/ai/plate-command/route.ts`, `plate-transform/route.ts` exist but appear unused.
- Guided narrative / guided generate: `dashboard/initial-assessments/[id]/guided-narrative/page.tsx`, `_components/guided-narrative-flow.tsx`, `hooks/use-guided-generate.ts`, `use-guided-narrative.ts`.
- Text utilities: `components/ui/ai-improve-popover.tsx` / `ai-ask-popover.tsx` (`hooks/use-text-improve.ts` → `/ai/text-improve`), `hooks/use-voice-to-text.ts` (→ `/ai/transcribe`).
- Generation progress: dashboards subscribe to the notification WebSocket; orchestrators emit `*_GENERATION_PROGRESS/COMPLETED/FAILED` events to the `user-<userId>` room (`care-plan-orchestrator.service.ts:817-823`; event names in `packages/shared/src/enums/socket-events.ts:70-102`), plus push notifications on completion/failure.
Mobile (apps/mobile)
- AI chat: `app/(ai-chat)/index.tsx` (conversation list) and `app/(ai-chat)/[id].tsx` (thread); REST client `lib/ai-chat-api.ts`; streaming via `lib/ai-chat-stream.ts`, which uses `expo/fetch` to read the backend's AI SDK UI-message SSE stream (axios/RN fetch can't stream) with the JWT from SecureStore.
- No mobile surfaces were found for triggering care-plan/meal/engagement generation; those are web-only.
Cross-Module Dependencies
| Module | Relationship |
|---|---|
| `clients/` | Enqueues care-plan, tasks, engagement, appointment-history jobs (`clients.module.ts:244-248`); owns processors for care-plan/tasks/engagement; supplies `ClientNarrativeService` narrative consumed by nearly every generation context; stores `aiSummary` on Client |
| `client-meals/` | Owns `client-meals` queue + processor; saves generated meals; meal images via Gemini |
| `notification/` | `NotificationGateway` (WebSocket progress events) and `NotificationService` (push on complete/fail) injected into every orchestrator/processor |
| `trainings/` | RAG consumer of knowledge-base vector search; `training-generation` + `slides-video-merge` queues; uses `AIImageService` |
| `shifts/` | `care-readiness-assessment` queue → `AIClientCareReadinessAssessmentService` (02-assessments.md) |
| `wound-care/` | `wound-analysis` queue, Claude vision-style analysis (08-health-monitoring.md) |
| `daily-motivations/` | Uses `LLMProviderFactory` directly (10-daily-living.md) |
| `care-proposals/` | AI proposal content + `CareProposalAiSummaryService` consumed by guided generate (03-care-proposals.md) |
| `initial-assessments/` | Guided narrative flow (web) backed by `AIGuidedNarrativeService` (02-assessments.md) |
| `common/` | `GeminiService` (image gen, key rotation/circuit breaker) and `AIImageService` facade |
| `@anaya/shared` | Single source for providers/modes/presets/focus areas, chat models, socket event names |
Open Questions & Gaps
1. Business AI defaults are advisory only. `ai_settings` is read exclusively by the web to pre-fill dialogs (`apps/web/hooks/use-ai-settings.ts:77-87`); backend orchestrators hard-default to `'anthropic'` / `'standard'` when job data omits provider/mode (`care-plan-orchestrator.service.ts:134,138`). API callers bypassing the web UI never see the business defaults. Intent (server-side enforcement vs. UI convenience) cannot be determined from code.
2. Two parallel Plate-editor AI stacks. The backend `plate-ai` module (OpenAI direct, JWT-guarded) coexists with the web's own `/api/ai/command` + `/api/ai/copilot` routes (Vercel AI Gateway, `AI_GATEWAY_API_KEY`). The web's proxy routes to the backend (`/api/ai/plate-command`, `plate-transform`) forward no auth header, so they would 401 against `JwtAuthGuard`, and no web component references them. Which stack is live (and whether the backend module is dead code) cannot be determined from code.
3. Apparent dead code: `AIClientCarePlanService` (one-shot gpt-5.5 care plan, only referenced from `ai.module.ts`) and most of `AIClientService` (5 hardcoded OpenAI Assistant IDs in source, `ai-client.service.ts:11-16`; only `generateTitle` has a caller). Hardcoded assistant IDs are also environment-coupling risks (same IDs across staging/prod).
4. Engagement activities ignore the provider abstraction: fixed OpenAI `gpt-5.5` while sibling features (care plan, tasks, meals) offer OpenAI/Anthropic choice; engagement job data has no `provider` field (`clients/processors/client-engagement-activities.processor.ts:88-100`). Inconsistent by design or by lag; cannot determine from code.
5. "Standard" mode skips quality evaluation (`GENERATION_MODE_PRESETS.standard.skipQualityEvaluation = true`), so fast vs. standard differ only in thresholds (which then only matter in the skipped evaluator) and reasoning effort. The thresholds in fast/standard presets are effectively unused.
6. Model naming drift: shared option labels say "Claude Sonnet 4.6 / GPT-5.5" but other services pin a zoo of models (`gpt-4-turbo`, `gpt-4o`, `gpt-4o-mini`, `gpt-4.1-nano`, `whisper-1`, `dall-e-3`, Gemini variants). No central model registry; upgrades require touching each service.
7. PHI flows to three external AI vendors with no redaction layer (visible in code): full client narratives (demographics, diagnoses, medications, assessments) to OpenAI and Anthropic for care plans/tasks/meals/engagement/summaries; meal/activity image prompts to Google Gemini; caregiver audio to OpenAI Whisper; knowledge-base document text to OpenAI embeddings and to the external Qdrant instance (`QDRANT_URL` / `QDRANT_API_KEY`). Whether BAAs/zero-retention agreements cover these calls is outside the code.
8. Knowledge-base single-document endpoints are not tenant-scoped: `findOne`, `update`, and `remove` query by `_id` only (`knowledge-base.service.ts:256-341`), and the controller checks permission but not business ownership. A user with `KNOWLEDGE_BASE_VIEW/MANAGE` in business A who learns a business-B document ID can read/update/delete it. `findAll`, `semanticSearch`, and `assertDocumentsBelongToBusiness` are scoped, so this looks like an oversight rather than a design choice.
9. Anthropic token accounting under-reports: reasoning tokens are hardcoded to 0 (`anthropic.provider.ts:139`), and cost estimates are log-only everywhere. There are no persisted per-tenant usage/cost records despite `usage` being saved per AI chat message.
10. Chat safety filter is English-only keyword regex (`ai-chat-safety.service.ts:7-20`): paraphrased or non-English crisis messages pass straight to the model; mitigation then depends entirely on prompt instructions.
11. `updateConversationTitle` only sets a title when none exists (`ai-chat.service.ts:141-149`). There is no user-facing rename; whether that's intended cannot be determined from code.
12. Emergency fallback care plan (confidence 0.35, mostly empty sections) is saved as a completed care plan when first-iteration generation fails (`care-plan-orchestrator.service.ts:566-577,973-1005`); the only signals are `openQuestions` / `missingTopics` text inside the plan. Reviewers may not notice the plan is a fallback.