🎙️ Echo - Technical Architecture

1. New Entry Creation Flow (Voice Recording & Text Input)

Overview: User records voice OR types text → Entry saved with auto-draft functionality → Background processing begins
graph TD
    Input{Input Method}
    Input -->|Voice| StartVoice([User Clicks Record])
    Input -->|Text| StartText([User Types Text])
    StartVoice --> WS[WebSocket Connection]
    WS --> Record[Recording Audio<br/>State: RECORDING]
    Record --> Stop[User Stops Recording]
    Stop --> Process[State: PROCESSING]
    Process --> Whisper[Whisper Service<br/>Speech-to-Text]
    Whisper --> Trans[State: TRANSCRIBING]
    Trans --> RawText[Raw Text Generated]
    StartText --> DraftSave[Auto-Draft Save<br/>POST /api/drafts/save]
    DraftSave --> TextReady[Text Ready]
    TextReady --> RawText
    RawText --> API[POST /api/entries/create-and-process]
    API --> SmartTag[Smart Tagging Service<br/>Auto-detect patterns: question/idea/todo/decision/etc]
    SmartTag --> DB[(SQLite Database<br/>Save Raw Entry + Smart Tags)]
    DB --> BG[Background Tasks Queue]
    BG --> T1[Task 1: Generate Embedding]
    BG --> T2[Task 2: Memory Extraction]
    BG --> Q1[Queue: Enhanced Processing]
    BG --> Q2[Queue: Structured Processing]
    T1 --> BGE[BGE-small Model<br/>384 dimensions]
    BGE --> SaveEmb[(Save Embedding JSON)]
    T2 --> MemWait[Wait for Enhanced Text<br/>or use Raw Text as fallback]
    MemWait --> MemExtract[Memory Service<br/>LLM Extraction]
    MemExtract --> MemLLM[Extract Memories with LLM<br/>Facts, Preferences, Habits]
    MemLLM --> MemStore[Store Memories]
    MemStore --> MemFallback{LLM Extracted Any?}
    MemFallback -->|No| RuleBased[Rule-based Fallback]
    MemFallback -->|Yes| MemComplete[Memory Extraction Complete]
    RuleBased --> MemComplete
    Q1 --> Worker1[Processing Worker]
    Worker1 --> Ollama1[Ollama LLM<br/>Enhanced Prompt]
    Ollama1 --> Enhanced[Enhanced Text]
    Enhanced --> SaveEnh[(Update Entry<br/>enhanced_text)]
    SaveEnh --> MoodTrigger[Trigger Mood Analysis]
    MoodTrigger --> MoodLLM[Ollama Mood Detection]
    MoodLLM --> SaveMood[(Save Mood Tags)]
    Q2 --> Worker2[Processing Worker]
    Worker2 --> Ollama2[Ollama LLM<br/>Structured Prompt]
    Ollama2 --> Structured[Structured Summary]
    Structured --> SaveStr[(Update Entry<br/>structured_summary)]
    SaveEmb --> Complete[Entry Complete]
    MemComplete --> Complete
    SaveMood --> Complete
    SaveStr --> Complete
    Complete --> WSNotify[WebSocket Notify<br/>Processing Complete]
    WSNotify --> UI[UI Updates]
    style StartVoice fill:#f9f,stroke:#333,stroke-width:4px
    style StartText fill:#f9f,stroke:#333,stroke-width:4px
    style DraftSave fill:#87ceeb,stroke:#333,stroke-width:2px
    style Complete fill:#9f9,stroke:#333,stroke-width:4px
    style Ollama1 fill:#ffd700,stroke:#333,stroke-width:2px
    style Ollama2 fill:#ffd700,stroke:#333,stroke-width:2px
    style MoodLLM fill:#ffd700,stroke:#333,stroke-width:2px
    style BGE fill:#87ceeb,stroke:#333,stroke-width:2px
    style DB fill:#dda0dd,stroke:#333,stroke-width:2px
    style SaveEmb fill:#dda0dd,stroke:#333,stroke-width:2px
    style SaveEnh fill:#dda0dd,stroke:#333,stroke-width:2px
    style SaveStr fill:#dda0dd,stroke:#333,stroke-width:2px
    style SmartTag fill:#ffd700,stroke:#333,stroke-width:2px
    style MemExtract fill:#ffd700,stroke:#333,stroke-width:2px
    style SaveMood fill:#dda0dd,stroke:#333,stroke-width:2px
Legend (node fill colors):
  • Gold (#ffd700): LLM Processing (Ollama)
  • Sky blue (#87ceeb): Embedding Model (BGE)
  • Plum (#dda0dd): Database Operations
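
The Smart Tagging step in the flow above runs before the entry is saved and detects patterns such as question, idea, todo, and decision. A minimal sketch of how that detection could look is shown below. The tag names come from the diagram; the regular expressions, `TAG_PATTERNS`, and `detect_smart_tags` are hypothetical and only illustrate the idea, since the actual rules are not given here.

```python
import re

# Hypothetical rules: the source names the tag categories but not how they are matched.
TAG_PATTERNS = {
    "question": re.compile(
        r"\?\s*$|^\s*(why|how|what|when|where|who)\b",
        re.IGNORECASE | re.MULTILINE,
    ),
    "idea": re.compile(r"\b(idea|what if|maybe we could)\b", re.IGNORECASE),
    "todo": re.compile(r"\b(todo|need to|have to|remember to)\b", re.IGNORECASE),
    "decision": re.compile(r"\b(decided|i will|going with)\b", re.IGNORECASE),
}


def detect_smart_tags(text: str) -> list[str]:
    """Return every tag whose pattern matches the raw entry text."""
    return [tag for tag, pattern in TAG_PATTERNS.items() if pattern.search(text)]


if __name__ == "__main__":
    print(detect_smart_tags("I need to call the dentist tomorrow"))
```

Because the rules are plain pattern matches, this step needs no model call, which fits its position ahead of the background queue.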

2. Voice File Upload Flow

Overview: User uploads audio file → Whisper transcribes → Same processing pipeline as voice recording
graph TD
    Upload([User Uploads File]) --> Validate[Validate File<br/>Formats: .wav/.mp3/.m4a/.aac/.ogg/.flac/.webm/.opus<br/>Max Size: 100MB]
    Validate -->|Invalid| Error[Show Error<br/>Unsupported format or too large]
    Validate -->|Valid| API[POST /api/audio/transcribe]
    API --> TempDir[Create Temporary Directory]
    TempDir --> SaveFile[Save Uploaded File]
    SaveFile --> SizeCheck[Check Actual File Size]
    SizeCheck -->|Too Large| Error
    SizeCheck -->|OK| Convert[Convert Audio to WAV<br/>Using Librosa]
    Convert --> AudioProcess[Audio Processing:<br/>• Load with original sample rate<br/>• Resample to 16kHz if needed<br/>• Convert to mono float32<br/>• Save as WAV]
    AudioProcess --> Metadata[Extract Metadata:<br/>Duration, Sample Rate, Format]
    Metadata --> Whisper[Whisper Service<br/>Transcribe WAV File]
    Whisper --> TransResult[Transcription Result<br/>+ Duration + Confidence + Language]
    TransResult --> Return[Return JSON Response<br/>with transcription & metadata]
    Return --> Display[Display in UI<br/>Allow User Editing]
    Display --> UserEdit[User Edits Text<br/>Optional]
    UserEdit --> Save[User Saves Entry]
    Save --> Create[POST /api/entries/create-and-process]
    Create --> SmartTagging[Smart Tagging Service]
    SmartTagging --> Same[Same Pipeline as<br/>New Entry Creation]
    style Upload fill:#f9f,stroke:#333,stroke-width:4px
    style Convert fill:#87ceeb,stroke:#333,stroke-width:2px
    style Whisper fill:#87ceeb,stroke:#333,stroke-width:2px
    style SmartTagging fill:#ffd700,stroke:#333,stroke-width:2px
    style Same fill:#90ee90,stroke:#333,stroke-width:2px
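
The validation step at the top of this flow checks the file extension and the size limit before any audio work begins. The sketch below shows one way to express that check. The format list and the 100MB limit come from the diagram; the function name `validate_upload`, the error messages, and the reading of 100MB as 100 × 1024 × 1024 bytes are assumptions.

```python
from pathlib import Path

# Formats and size limit as listed in the diagram.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".aac", ".ogg", ".flac", ".webm", ".opus"}
MAX_SIZE_BYTES = 100 * 1024 * 1024  # assumption: "100MB" is treated as 100 MiB


def validate_upload(filename: str, size_bytes: int) -> tuple[bool, str]:
    """Return (is_valid, message) for an uploaded audio file."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"Unsupported format: {ext or '(none)'}"
    if size_bytes > MAX_SIZE_BYTES:
        return False, "File too large"
    return True, "ok"


if __name__ == "__main__":
    print(validate_upload("memo.MP3", 1024))
    print(validate_upload("notes.txt", 10))
```

The diagram checks size twice (once on validation, once after saving), so a helper like this could be reused with the actual on-disk size for the second check.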

3. Talk to Echo (Chat Agent) Flow

Overview: Two-phase process: 1) Tool-calling for search, 2) Response generation with context
graph TD
    Start([User Message]) --> API[POST /api/diary/chat]
    API --> Service[DiaryChatService]
    Service --> Phase1[Phase 1: Tool Selection]
    Phase1 --> Ollama1[ChatOllama with Tools<br/>Model: qwen3:8b]
    Ollama1 --> Tools{Which Tool?}
    Tools -->|Content Search| T1[search_diary_entries]
    T1 --> Embed[Generate Query Embedding<br/>BGE-small]
    Embed --> Sim[Cosine Similarity Search]
    Sim --> Entries[(Find Similar Entries)]
    Tools -->|Date Search| T2[get_entries_by_date]
    T2 --> DateQ[Date Range Query]
    DateQ --> DateEntries[(Find Date Entries)]
    Tools -->|Ideas| T3[extract_ideas_and_concepts]
    T3 --> IdeaQ[Search for Ideas/Concepts]
    IdeaQ --> IdeaEntries[(Find Idea Entries)]
    Tools -->|Actions| T4[extract_action_items]
    T4 --> TodoQ[Search for TODOs]
    TodoQ --> TodoEntries[(Find Action Items)]
    Tools -->|Time Summary| T5[summarize_time_period]
    T5 --> TimeQ[Time Period Query]
    TimeQ --> TimeEntries[(Find Period Entries)]
    Tools -->|Context| T6[get_context_before_after]
    T6 --> ContextQ[Context Window Query]
    ContextQ --> ContextEntries[(Find Context)]
    Tools -->|Add Entry| T7[add_entry_to_diary]
    T7 --> AddEntry[(Create New Entry)]
    AddEntry --> Pipeline[Full Processing Pipeline]
    Tools -->|Conversations| T8[search_conversations]
    T8 --> ConvQ[Search Past Chats]
    ConvQ --> ConvEntries[(Find Conversations)]
    Entries --> Results[Tool Results]
    DateEntries --> Results
    IdeaEntries --> Results
    TodoEntries --> Results
    TimeEntries --> Results
    ContextEntries --> Results
    Pipeline --> Results
    ConvEntries --> Results
    Results --> Phase2[Phase 2: Response Generation]
    Phase2 --> MemoryRetrieval[Retrieve Relevant Memories<br/>Based on User Query]
    MemoryRetrieval --> UserInfo[Get User Information<br/>Name, Display Settings]
    UserInfo --> Context[Build Context:<br/>- Tool Results<br/>- Conversation History<br/>- System Date<br/>- Relevant Memories<br/>- User Name]
    Context --> Ollama2[ChatOllama Generate<br/>Model: qwen3:8b<br/>With Memory & User Context]
    Ollama2 --> Response[AI Response]
    Response --> Clean[Strip Thinking Blocks<br/>Clean Formatting]
    Clean --> Final[Final Response]
    Final --> Check{Voice Enabled?}
    Check -->|Yes| TTS[TTS Service<br/>Piper Engine]
    TTS --> Audio[Audio Stream]
    Audio --> Client[Return to Client]
    Check -->|No| Client
    style Start fill:#f9f,stroke:#333,stroke-width:4px
    style Ollama1 fill:#ffd700,stroke:#333,stroke-width:2px
    style Ollama2 fill:#ffd700,stroke:#333,stroke-width:2px
    style Embed fill:#87ceeb,stroke:#333,stroke-width:2px
    style TTS fill:#90ee90,stroke:#333,stroke-width:2px
Available Tools:
  • search_diary_entries: Semantic search using embeddings
  • get_entries_by_date: Date-specific queries
  • extract_ideas_and_concepts: Find ideas and insights
  • extract_action_items: Find TODOs and tasks
  • summarize_time_period: Summarize a time range
  • get_context_before_after: Get surrounding entries
  • add_entry_to_diary: Create new entry from chat
  • search_conversations: Search past conversations
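
After Phase 2 generates a response, the flow strips thinking blocks and cleans formatting before returning the final text. A minimal sketch of that cleanup follows. It assumes the model wraps its reasoning in `<think>...</think>` tags; the function name `clean_response` and the blank-line collapsing rule are illustrative, not taken from the source.

```python
import re

# Assumption: reasoning is delimited by <think>...</think> in the raw model output.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)


def clean_response(raw: str) -> str:
    """Remove thinking blocks, collapse runs of blank lines, and trim whitespace."""
    text = THINK_BLOCK.sub("", raw)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


if __name__ == "__main__":
    raw = "<think>search first</think>\n\nYou went hiking on Saturday."
    print(clean_response(raw))
```

The non-greedy `.*?` with `re.DOTALL` keeps the match from swallowing text between two separate thinking blocks.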

4. Memory Extraction & Management System

Overview: Memories extracted from entries/conversations → Categorized → User can rate accuracy
graph TD
    Source{Source Type} -->|Entry| Entry[Enhanced Text Available]
    Source -->|Conversation| Conv[Conversation Saved]
    Entry --> Extract1[Memory Service]
    Conv --> Extract2[Memory Service]
    Extract1 --> LLM[Ollama LLM<br/>Memory Extraction Prompt]
    Extract2 --> LLM
    LLM --> Parse[Parse LLM Response<br/>JSON Format]
    Parse --> Validate[Validate & Deduplicate<br/>Check Recent 50 Memories]
    Validate --> Cat{Categorize}
    Cat -->|Type 1| Fact[Personal Facts<br/>Name, Job, Location]
    Cat -->|Type 2| Pref[Preferences<br/>Likes, Dislikes, Style]
    Cat -->|Type 3| Habit[Habits/Patterns<br/>Routines, Behaviors]
    Fact --> ConfidenceScore[LLM Assigns Confidence<br/>0.0 - 1.0 based on clarity]
    Pref --> ConfidenceScore
    Habit --> ConfidenceScore
    ConfidenceScore --> ImportanceCalc[Calculate Importance Score<br/>Confidence × 10 = 1-10 Scale]
    ImportanceCalc --> OllamaScore[Optional: Ollama LLM<br/>Refine Importance Score 1-10]
    OllamaScore --> Store[(Store in DB<br/>agent_memories table<br/>importance_score 1-10)]
    Store --> Display[Display to User<br/>Memory Review UI]
    Display --> Rate{User Rating}
    Rate -->|-3 to -1| Irrelevant[Subtract 3 to 1 from importance]
    Rate -->|0| Neutral[No adjustment]
    Rate -->|+1 to +3| Important[Add 1 to 3 to importance]
    Irrelevant --> Update[(Update Memory)]
    Neutral --> Update
    Important --> Update
    Update --> Future[Available for<br/>Future Chats]
    Future --> Retrieve[Retrieved During Chat<br/>Based on Relevance]
    Retrieve --> Context[Added to Chat Context<br/>Top 10 Relevant]
    style LLM fill:#ffd700,stroke:#333,stroke-width:2px
    style OllamaScore fill:#ffd700,stroke:#333,stroke-width:2px
    style Store fill:#dda0dd,stroke:#333,stroke-width:2px
    style Update fill:#dda0dd,stroke:#333,stroke-width:2px
Memory Lifecycle:
  1. Extraction triggered after enhanced text is ready
  2. LLM identifies facts, preferences, and habits
  3. Deduplication against recent memories
  4. Importance scoring (1-10 scale from confidence × 10)
  5. User can adjust importance (-3 to +3 points)
  6. User-adjusted memories have different decay rates
  7. Access count tracked for relevance
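
Steps 4 and 5 of the lifecycle can be sketched as two small functions. Because a confidence of 0.0 to 1.0 multiplied by 10 gives 0 to 10, the sketch clamps the result to the documented 1-10 scale. The clamping, the rounding, and the choice to keep user-adjusted scores within 1-10 are assumptions; the function names are hypothetical.

```python
def initial_importance(confidence: float) -> int:
    """Map an LLM confidence in [0.0, 1.0] to the 1-10 importance scale."""
    scaled = round(confidence * 10)
    # Assumption: values outside 1-10 are clamped rather than rejected.
    return max(1, min(10, scaled))


def apply_user_rating(importance: int, rating: int) -> int:
    """Adjust importance by a user rating between -3 and +3."""
    if not -3 <= rating <= 3:
        raise ValueError("rating must be between -3 and +3")
    # Assumption: the adjusted score stays on the 1-10 scale.
    return max(1, min(10, importance + rating))


if __name__ == "__main__":
    score = initial_importance(0.9)
    print(score, apply_user_rating(score, 3))
```

The optional LLM refinement step shown in the diagram would sit between these two calls and replace the initial score with a refined value on the same scale.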

5. Background Processing Queue System

Overview: Async job queue with retry logic and status notifications
graph TD
    Job[New Processing Job] --> Queue[Job Queue<br/>deque structure]
    Queue --> Worker[Worker Loop<br/>Async Processing]
    Worker --> Check{Job Status}
    Check -->|Pending| Process[Process Job]
    Check -->|Processing| Skip[Skip]
    Check -->|Complete| Remove[Remove from Queue]
    Process --> GetEntry[Get Entry from Database]
    GetEntry --> Mode{Processing Mode}
    Mode -->|Enhanced| EnhServ[EntryProcessingService<br/>Enhanced Mode]
    Mode -->|Structured| StrServ[EntryProcessingService<br/>Structured Mode]
    EnhServ --> EnhOllama[Ollama Generate<br/>Enhanced Prompt]
    StrServ --> StrOllama[Ollama Generate<br/>Structured Prompt]
    EnhOllama --> EnhResult[Enhanced Text Result]
    StrOllama --> StrResult[Structured Result]
    EnhResult --> UpdateEnh[Update Entry.enhanced_text<br/>+ processing_metadata]
    StrResult --> UpdateStr[Update Entry.structured_summary<br/>+ processing_metadata]
    UpdateEnh --> MemoryCheck{Enhanced Mode?}
    UpdateStr --> SaveSuccess[Save to Database]
    MemoryCheck -->|Yes| ExtractMem[Extract Memories<br/>from Enhanced Text]
    MemoryCheck -->|No| SaveSuccess
    ExtractMem --> SaveSuccess
    SaveSuccess --> Success{Success?}
    Success -->|Yes| Complete[Mark Job Complete<br/>Set completed_at]
    Success -->|No| Retry{Retry Count ≤ 3?}
    Retry -->|Yes| Backoff[Exponential Backoff<br/>2^retry_count seconds]
    Backoff --> Requeue[Requeue Job<br/>Status: PENDING]
    Retry -->|No| Failed[Mark Job Failed<br/>Max Retries Reached]
    Complete --> NotifySuccess[WebSocket Notification<br/>Job Complete]
    Failed --> NotifyFail[WebSocket Notification<br/>Job Failed]
    NotifySuccess --> UI[UI Updates]
    NotifyFail --> UI
    Requeue --> Queue
    style Queue fill:#87ceeb,stroke:#333,stroke-width:2px
    style EnhOllama fill:#ffd700,stroke:#333,stroke-width:2px
    style StrOllama fill:#ffd700,stroke:#333,stroke-width:2px
    style UpdateEnh fill:#dda0dd,stroke:#333,stroke-width:2px
    style UpdateStr fill:#dda0dd,stroke:#333,stroke-width:2px
    style ExtractMem fill:#90ee90,stroke:#333,stroke-width:2px
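
The retry branch of this flow can be sketched as a small failure handler over a `deque`. The retry limit of 3 and the `2^retry_count` second backoff come from the diagram. The `Job` fields, the `handle_failure` name, and the choice to increment the counter before comparing it to the limit are assumptions about how the real worker behaves.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional

MAX_RETRIES = 3  # from the "Retry Count ≤ 3?" decision in the diagram


@dataclass
class Job:
    entry_id: int
    mode: str  # "enhanced" or "structured"
    status: str = "pending"
    retry_count: int = 0


def handle_failure(job: Job, queue: deque) -> Optional[float]:
    """Requeue a failed job and return its backoff delay in seconds.

    Returns None once the retry limit is exceeded and the job is marked failed.
    """
    job.retry_count += 1  # assumption: count is incremented before the limit check
    if job.retry_count <= MAX_RETRIES:
        job.status = "pending"
        queue.append(job)
        return float(2 ** job.retry_count)
    job.status = "failed"
    return None


if __name__ == "__main__":
    q: deque = deque()
    job = Job(entry_id=1, mode="enhanced")
    print(handle_failure(job, q), job.status, len(q))
```

In an async worker loop, the returned delay would typically be passed to `asyncio.sleep` before the job is picked up again, and a `None` result would trigger the "Job Failed" WebSocket notification.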

6. Pattern Insights & Mood Analysis System

Overview: Advanced analysis, available to all users, covering pattern detection across time periods and mood analysis for emotional insights
graph TD
    Trigger{Analysis Trigger}
    Trigger -->|User Requests Analysis| PatternAnalyze[POST /api/patterns/analyze<br/>Trigger Pattern Detection]
    Trigger -->|Enhanced Text Available| MoodAnalysis[Mood Analysis Trigger<br/>POST /api/entries/analyze-mood]
    PatternAnalyze --> PatternService[Pattern Detector Service]
    PatternService --> PatternLLM[Analyze Entry Corpus<br/>Detect patterns across time]
    PatternLLM --> PatternTypes{Pattern Categories}
    PatternTypes -->|Mood| MoodPatterns[Mood Patterns<br/>Emotional cycles and triggers]
    PatternTypes -->|Topic| TopicPatterns[Topic Patterns<br/>Recurring themes and interests]
    PatternTypes -->|Behavior| BehaviorPatterns[Behavior Patterns<br/>Habits and routine changes]
    PatternTypes -->|Temporal| TemporalPatterns[Temporal Patterns<br/>Time-based correlations]
    MoodPatterns --> PatternStore[(Store in patterns table<br/>with confidence and frequency)]
    TopicPatterns --> PatternStore
    BehaviorPatterns --> PatternStore
    TemporalPatterns --> PatternStore
    PatternStore --> PatternUI[Pattern Insights UI<br/>Diamond icon in sidebar]
    MoodAnalysis --> MoodService[Mood Analysis Service]
    MoodService --> MoodLLM[Ollama LLM<br/>Mood Detection Prompt]
    MoodLLM --> MoodParse[Parse JSON Response<br/>Extract 1-5 mood tags]
    MoodParse --> MoodValidate[Validate Mood Vocabulary<br/>happy/stressed/excited/etc]
    MoodValidate --> MoodSave[(Update Entry<br/>mood_tags column)]
    MoodSave --> MoodTrends[Mood Trend Analysis<br/>Available in UI]
    PatternUI --> Insights[User Views Insights<br/>Patterns and Correlations]
    MoodTrends --> Insights
    style PatternLLM fill:#ffd700,stroke:#333,stroke-width:2px
    style MoodLLM fill:#ffd700,stroke:#333,stroke-width:2px
    style PatternStore fill:#dda0dd,stroke:#333,stroke-width:2px
    style MoodSave fill:#dda0dd,stroke:#333,stroke-width:2px
    style PatternUI fill:#90ee90,stroke:#333,stroke-width:2px
    style Insights fill:#90ee90,stroke:#333,stroke-width:2px
Pattern Detection Features:
  • Four Pattern Types: Mood, Topic, Behavior, and Temporal patterns
  • Confidence Scoring: Statistical confidence for detected patterns
  • Entry Correlation: Links patterns to specific journal entries
  • Mood Analysis: Real-time emotional tagging with 1-5 mood tags per entry
  • Trend Visualization: Mood and pattern trends over time
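
The mood branch parses a JSON response from the LLM, validates the tags against a fixed vocabulary, and keeps at most five. A minimal sketch of that parse-and-validate step is below. Only "happy", "stressed", and "excited" appear in the source; the other vocabulary words, the `mood_tags` response key, and the function name `parse_mood_tags` are assumptions.

```python
import json

# Only the first three words appear in the source; the rest are hypothetical.
MOOD_VOCABULARY = {"happy", "stressed", "excited", "sad", "calm", "anxious"}
MAX_MOOD_TAGS = 5


def parse_mood_tags(llm_response: str) -> list[str]:
    """Extract up to five valid, de-duplicated mood tags from a JSON response."""
    try:
        data = json.loads(llm_response)
    except json.JSONDecodeError:
        return []
    # Assumption: the model replies with an object like {"mood_tags": [...]}.
    raw_tags = data.get("mood_tags", []) if isinstance(data, dict) else []
    if not isinstance(raw_tags, list):
        return []
    tags: list[str] = []
    for tag in raw_tags:
        if not isinstance(tag, str):
            continue
        normalized = tag.strip().lower()
        if normalized in MOOD_VOCABULARY and normalized not in tags:
            tags.append(normalized)
    return tags[:MAX_MOOD_TAGS]


if __name__ == "__main__":
    print(parse_mood_tags('{"mood_tags": ["Happy", "sleepy", "excited"]}'))
```

Returning an empty list on malformed output lets the caller decide whether to skip the `mood_tags` update or retry the prompt.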

7. Embedding Generation & Hybrid Search System

Overview: BGE-small model for embeddings → Hybrid search combining semantic similarity with keyword-match score boosts
graph TD
    Text[Input Text] --> Type{Text Type}
    Type -->|Document| DocFormat[No special prefix for documents]
    Type -->|Query| QueryFormat[Add 'Represent this sentence for searching relevant passages:' prefix]
    DocFormat --> BGE[BGE-small-en-v1.5<br/>SentenceTransformer]
    QueryFormat --> BGE
    BGE --> Vector[384-dimension vector]
    Vector --> Norm[L2 Normalization]
    Norm --> Store[(Store as JSON<br/>in embeddings column)]
    Search[Search Query] --> QueryEmb[Generate Query Embedding]
    QueryEmb --> Load[Load All Entry Embeddings]
    Load --> Cosine[Cosine Similarity<br/>numpy.dot product]
    Cosine --> Sort[Sort by Score]
    Sort --> Filter[Apply Threshold<br/>Default: 0.3]
    Filter --> Candidates[Get 2x Candidates<br/>for Hybrid Reranking]
    Candidates --> HybridSearch[Hybrid Search Service]
    HybridSearch --> ExactMatch{Exact Match Found?}
    HybridSearch --> PartialMatch{Partial Words Match?}
    ExactMatch -->|Yes| ExactBoost[+0.2 Boost<br/>Whole query in text]
    ExactMatch -->|No| CheckPartial[Check Word Overlap]
    CheckPartial --> PartialMatch
    PartialMatch -->|Yes| PartialBoost[+0.1 × Match Ratio<br/>Based on word overlap]
    PartialMatch -->|No| NoBoost[No Text Boost]
    ExactBoost --> Combine[Combine Scores<br/>Semantic + Text Boosts]
    PartialBoost --> Combine
    NoBoost --> Combine
    Combine --> Cap[Cap Final Score at 1.0]
    Cap --> Rerank[Rerank by Hybrid Score]
    Rerank --> Context[Extract Search Context<br/>Around Keyword Matches]
    Context --> FinalResults[Top K Results<br/>with Enhanced Scoring]
    FinalResults --> Return[Return to User]
    style BGE fill:#87ceeb,stroke:#333,stroke-width:2px
    style Store fill:#dda0dd,stroke:#333,stroke-width:2px
    style HybridSearch fill:#ffd700,stroke:#333,stroke-width:2px
    style ExactBoost fill:#90ee90,stroke:#333,stroke-width:2px
    style PartialBoost fill:#90ee90,stroke:#333,stroke-width:2px
Hybrid Search Features:
  • Semantic Similarity: Base BGE embedding cosine similarity
  • Exact Match Boost: +0.2 added to the similarity score for entries containing the full query
  • Partial Match Boost: +0.1 × (matched words / total query words) added to the similarity score
  • Context Extraction: Smart snippets around keyword matches
  • Score Capping: Final scores capped at 100% (1.0) for consistency
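
The scoring rules above can be sketched as a single function that takes the semantic similarity already computed from the embeddings and adds the text boost. The boost values and the 1.0 cap come from this section. The function name `hybrid_score`, the word tokenization, and the choice to apply the partial boost only when there is no exact match (one reading of the diagram) are assumptions.

```python
import re

EXACT_BOOST = 0.2    # whole query found in the entry text
PARTIAL_BOOST = 0.1  # scaled by the fraction of query words found


def hybrid_score(semantic: float, query: str, text: str) -> float:
    """Combine a cosine similarity with keyword boosts, capped at 1.0."""
    query_l = query.lower().strip()
    text_l = text.lower()
    boost = 0.0
    if query_l and query_l in text_l:
        boost = EXACT_BOOST
    else:
        # Assumption: words are compared case-insensitively as whole tokens.
        query_words = re.findall(r"\w+", query_l)
        text_words = set(re.findall(r"\w+", text_l))
        if query_words:
            matched = sum(1 for word in query_words if word in text_words)
            boost = PARTIAL_BOOST * matched / len(query_words)
    return min(1.0, semantic + boost)


if __name__ == "__main__":
    print(hybrid_score(0.5, "morning run", "I went for a morning run today"))
```

Reranking then amounts to sorting the 2x candidate set by this score and keeping the top K, so entries with a modest semantic score but a literal keyword hit can move ahead of purely semantic matches.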