🎙️ Echo - Technical Architecture
1. New Entry Creation Flow (Voice Recording & Text Input)
Overview: User records voice or types text (text input is auto-saved as a draft) → Entry saved → Background processing begins
graph TD
Input{Input Method}
Input -->|Voice| StartVoice([User Clicks Record])
Input -->|Text| StartText([User Types Text])
StartVoice --> WS[WebSocket Connection]
WS --> Record[Recording Audio
State: RECORDING]
Record --> Stop[User Stops Recording]
Stop --> Process[State: PROCESSING]
Process --> Whisper[Whisper Service
Speech-to-Text]
Whisper --> Trans[State: TRANSCRIBING]
Trans --> RawText[Raw Text Generated]
StartText --> DraftSave[Auto-Draft Save
POST /api/drafts/save]
DraftSave --> TextReady[Text Ready]
TextReady --> RawText
RawText --> API[POST /api/entries/create-and-process]
API --> SmartTag[Smart Tagging Service
Auto-detect patterns: question/idea/todo/decision/etc]
SmartTag --> DB[(SQLite Database
Save Raw Entry + Smart Tags)]
DB --> BG[Background Tasks Queue]
BG --> T1[Task 1: Generate Embedding]
BG --> T2[Task 2: Memory Extraction]
BG --> Q1[Queue: Enhanced Processing]
BG --> Q2[Queue: Structured Processing]
T1 --> BGE[BGE-small Model
384 dimensions]
BGE --> SaveEmb[(Save Embedding JSON)]
T2 --> MemWait[Wait for Enhanced Text
or use Raw Text as fallback]
MemWait --> MemExtract[Memory Service
LLM Extraction]
MemExtract --> MemLLM[Extract Memories with LLM
Facts, Preferences, Habits]
MemLLM --> MemStore[Store Memories]
MemStore --> MemFallback{LLM Extracted Any?}
MemFallback -->|No| RuleBased[Rule-based Fallback]
MemFallback -->|Yes| MemComplete[Memory Extraction Complete]
RuleBased --> MemComplete
Q1 --> Worker1[Processing Worker]
Worker1 --> Ollama1[Ollama LLM
Enhanced Prompt]
Ollama1 --> Enhanced[Enhanced Text]
Enhanced --> SaveEnh[(Update Entry
enhanced_text)]
SaveEnh --> MoodTrigger[Trigger Mood Analysis]
MoodTrigger --> MoodLLM[Ollama Mood Detection]
MoodLLM --> SaveMood[(Save Mood Tags)]
Q2 --> Worker2[Processing Worker]
Worker2 --> Ollama2[Ollama LLM
Structured Prompt]
Ollama2 --> Structured[Structured Summary]
Structured --> SaveStr[(Update Entry
structured_summary)]
SaveEmb --> Complete[Entry Complete]
MemComplete --> Complete
SaveMood --> Complete
SaveStr --> Complete
Complete --> WSNotify[WebSocket Notify
Processing Complete]
WSNotify --> UI[UI Updates]
style StartVoice fill:#f9f,stroke:#333,stroke-width:4px
style StartText fill:#f9f,stroke:#333,stroke-width:4px
style DraftSave fill:#87ceeb,stroke:#333,stroke-width:2px
style Complete fill:#9f9,stroke:#333,stroke-width:4px
style Ollama1 fill:#ffd700,stroke:#333,stroke-width:2px
style Ollama2 fill:#ffd700,stroke:#333,stroke-width:2px
style MoodLLM fill:#ffd700,stroke:#333,stroke-width:2px
style BGE fill:#87ceeb,stroke:#333,stroke-width:2px
style DB fill:#dda0dd,stroke:#333,stroke-width:2px
style SaveEmb fill:#dda0dd,stroke:#333,stroke-width:2px
style SaveEnh fill:#dda0dd,stroke:#333,stroke-width:2px
style SaveStr fill:#dda0dd,stroke:#333,stroke-width:2px
style SmartTag fill:#ffd700,stroke:#333,stroke-width:2px
style MemExtract fill:#ffd700,stroke:#333,stroke-width:2px
style SaveMood fill:#dda0dd,stroke:#333,stroke-width:2px
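To make the fan-out after the raw entry is saved more concrete, here is a minimal sketch of what the `POST /api/entries/create-and-process` step could look like, assuming a FastAPI backend. The helper names (`smart_tag`, `save_entry`, `generate_embedding`, `extract_memories`, `processing_queue`) are hypothetical stand-ins for the actual services, not the real implementation.

```python
# Hypothetical sketch of the create-and-process fan-out shown above.
# Helper names are illustrative placeholders, not the actual service code.
from fastapi import APIRouter, BackgroundTasks
from pydantic import BaseModel

router = APIRouter()

class EntryIn(BaseModel):
    raw_text: str

@router.post("/api/entries/create-and-process")
async def create_and_process(entry: EntryIn, background: BackgroundTasks):
    # Synchronous part: smart-tag the raw text and persist it immediately.
    tags = smart_tag(entry.raw_text)               # e.g. ["idea", "todo"]
    entry_id = save_entry(entry.raw_text, tags)    # SQLite insert

    # Asynchronous part: embedding and memory extraction run as background
    # tasks, while enhanced/structured processing goes to the job queue.
    background.add_task(generate_embedding, entry_id)        # Task 1
    background.add_task(extract_memories, entry_id)          # Task 2
    processing_queue.enqueue(entry_id, mode="enhanced")       # Queue 1
    processing_queue.enqueue(entry_id, mode="structured")     # Queue 2

    return {"id": entry_id, "smart_tags": tags, "status": "processing"}
```

The entry is visible to the user as soon as the synchronous part returns; the enhanced text, structured summary, mood tags, and embedding arrive later via the WebSocket completion notification.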
2. Voice File Upload Flow
Overview: User uploads audio file → Whisper transcribes → Same processing pipeline as voice recording
graph TD
Upload([User Uploads File]) --> Validate[Validate File
Formats: .wav/.mp3/.m4a/.aac/.ogg/.flac/.webm/.opus
Max Size: 100MB]
Validate -->|Invalid| Error[Show Error
Unsupported format or too large]
Validate -->|Valid| API[POST /api/audio/transcribe]
API --> TempDir[Create Temporary Directory]
TempDir --> SaveFile[Save Uploaded File]
SaveFile --> SizeCheck[Check Actual File Size]
SizeCheck -->|Too Large| Error
SizeCheck -->|OK| Convert[Convert Audio to WAV
Using Librosa]
Convert --> AudioProcess[Audio Processing:
• Load with original sample rate
• Resample to 16kHz if needed
• Convert to mono float32
• Save as WAV]
AudioProcess --> Metadata[Extract Metadata:
Duration, Sample Rate, Format]
Metadata --> Whisper[Whisper Service
Transcribe WAV File]
Whisper --> TransResult[Transcription Result
+ Duration + Confidence + Language]
TransResult --> Return[Return JSON Response
with transcription & metadata]
Return --> Display[Display in UI
Allow User Editing]
Display --> UserEdit[User Edits Text
Optional]
UserEdit --> Save[User Saves Entry]
Save --> Create[POST /api/entries/create-and-process]
Create --> SmartTagging[Smart Tagging Service]
SmartTagging --> Same[Same Pipeline as
New Entry Creation]
style Upload fill:#f9f,stroke:#333,stroke-width:4px
style Convert fill:#87ceeb,stroke:#333,stroke-width:2px
style Whisper fill:#87ceeb,stroke:#333,stroke-width:2px
style SmartTagging fill:#ffd700,stroke:#333,stroke-width:2px
style Same fill:#90ee90,stroke:#333,stroke-width:2px
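The conversion step above maps closely onto the public librosa and soundfile APIs; this is a minimal sketch under that assumption, with illustrative function and path names rather than the actual service code.

```python
# Illustrative sketch of the audio normalization step: load at the original
# sample rate, resample to 16 kHz if needed, force mono float32, write a WAV
# for Whisper, and return the metadata included in the response.
import librosa
import soundfile as sf

TARGET_SR = 16000

def convert_to_wav(src_path: str, dst_path: str) -> dict:
    # mono=True averages channels; librosa returns float32 by default.
    audio, sr = librosa.load(src_path, sr=None, mono=True)

    if sr != TARGET_SR:
        audio = librosa.resample(audio, orig_sr=sr, target_sr=TARGET_SR)
        sr = TARGET_SR

    sf.write(dst_path, audio, sr)  # 16 kHz mono WAV

    return {
        "duration_seconds": round(len(audio) / sr, 2),
        "sample_rate": sr,
        "format": "wav",
    }
```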
3. Talk to Echo (Chat Agent) Flow
Overview: Two-phase process: 1) Tool calling to search the diary, 2) Response generation with the retrieved context
graph TD
Start([User Message]) --> API[POST /api/diary/chat]
API --> Service[DiaryChatService]
Service --> Phase1[Phase 1: Tool Selection]
Phase1 --> Ollama1[ChatOllama with Tools
Model: qwen3:8b]
Ollama1 --> Tools{Which Tool?}
Tools -->|Content Search| T1[search_diary_entries]
T1 --> Embed[Generate Query Embedding
BGE-small]
Embed --> Sim[Cosine Similarity Search]
Sim --> Entries[(Find Similar Entries)]
Tools -->|Date Search| T2[get_entries_by_date]
T2 --> DateQ[Date Range Query]
DateQ --> DateEntries[(Find Date Entries)]
Tools -->|Ideas| T3[extract_ideas_and_concepts]
T3 --> IdeaQ[Search for Ideas/Concepts]
IdeaQ --> IdeaEntries[(Find Idea Entries)]
Tools -->|Actions| T4[extract_action_items]
T4 --> TodoQ[Search for TODOs]
TodoQ --> TodoEntries[(Find Action Items)]
Tools -->|Time Summary| T5[summarize_time_period]
T5 --> TimeQ[Time Period Query]
TimeQ --> TimeEntries[(Find Period Entries)]
Tools -->|Context| T6[get_context_before_after]
T6 --> ContextQ[Context Window Query]
ContextQ --> ContextEntries[(Find Context)]
Tools -->|Add Entry| T7[add_entry_to_diary]
T7 --> AddEntry[(Create New Entry)]
AddEntry --> Pipeline[Full Processing Pipeline]
Tools -->|Conversations| T8[search_conversations]
T8 --> ConvQ[Search Past Chats]
ConvQ --> ConvEntries[(Find Conversations)]
Entries --> Results[Tool Results]
DateEntries --> Results
IdeaEntries --> Results
TodoEntries --> Results
TimeEntries --> Results
ContextEntries --> Results
Pipeline --> Results
ConvEntries --> Results
Results --> Phase2[Phase 2: Response Generation]
Phase2 --> MemoryRetrieval[Retrieve Relevant Memories
Based on User Query]
MemoryRetrieval --> UserInfo[Get User Information
Name, Display Settings]
UserInfo --> Context[Build Context:
- Tool Results
- Conversation History
- System Date
- Relevant Memories
- User Name]
Context --> Ollama2[ChatOllama Generate
Model: qwen3:8b
With Memory & User Context]
Ollama2 --> Response[AI Response]
Response --> Clean[Strip Thinking Blocks
Clean Formatting]
Clean --> Final[Final Response]
Final --> Check{Voice Enabled?}
Check -->|Yes| TTS[TTS Service
Piper Engine]
TTS --> Audio[Audio Stream]
Audio --> Client[Return to Client]
Check -->|No| Client
style Start fill:#f9f,stroke:#333,stroke-width:4px
style Ollama1 fill:#ffd700,stroke:#333,stroke-width:2px
style Ollama2 fill:#ffd700,stroke:#333,stroke-width:2px
style Embed fill:#87ceeb,stroke:#333,stroke-width:2px
style TTS fill:#90ee90,stroke:#333,stroke-width:2px
Available Tools:
- search_diary_entries: Semantic search using embeddings
- get_entries_by_date: Date-specific queries
- extract_ideas_and_concepts: Find ideas and insights
- extract_action_items: Find TODOs and tasks
- summarize_time_period: Summarize a time range
- get_context_before_after: Get surrounding entries
- add_entry_to_diary: Create new entry from chat
- search_conversations: Search past conversations
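Assuming the `ChatOllama with Tools` node refers to LangChain-style tool binding, Phase 1 could be wired up roughly as follows; the two stubbed tools and their helper functions are hypothetical placeholders for the real implementations listed above.

```python
# Sketch of Phase 1 (tool selection) with LangChain + Ollama. Only two of the
# eight tools are stubbed; semantic_search and entries_between are hypothetical.
from langchain_ollama import ChatOllama
from langchain_core.tools import tool

@tool
def search_diary_entries(query: str) -> list:
    """Semantic search over diary entries using embeddings."""
    return semantic_search(query)

@tool
def get_entries_by_date(start: str, end: str) -> list:
    """Return entries created between the start and end dates (ISO format)."""
    return entries_between(start, end)

llm = ChatOllama(model="qwen3:8b")
llm_with_tools = llm.bind_tools([search_diary_entries, get_entries_by_date])

def select_tools(user_message: str) -> list:
    response = llm_with_tools.invoke(user_message)
    # Each tool call carries the chosen tool name and its arguments; Phase 2
    # executes them and feeds the results back into the model as context.
    return response.tool_calls
```

In Phase 2, the executed tool results are merged with retrieved memories, the user's name, and the conversation history; the reasoning (`<think>`) blocks that qwen3 typically emits are stripped before the final response, and optional Piper TTS audio, is returned.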
4. Memory Extraction & Management System
Overview: Memories extracted from entries/conversations → Categorized → User can rate their importance
graph TD
Source{Source Type} -->|Entry| Entry[Enhanced Text Available]
Source -->|Conversation| Conv[Conversation Saved]
Entry --> Extract1[Memory Service]
Conv --> Extract2[Memory Service]
Extract1 --> LLM[Ollama LLM
Memory Extraction Prompt]
Extract2 --> LLM
LLM --> Parse[Parse LLM Response
JSON Format]
Parse --> Validate[Validate & Deduplicate
Check Recent 50 Memories]
Validate --> Cat{Categorize}
Cat -->|Type 1| Fact[Personal Facts
Name, Job, Location]
Cat -->|Type 2| Pref[Preferences
Likes, Dislikes, Style]
Cat -->|Type 3| Habit[Habits/Patterns
Routines, Behaviors]
Fact --> ConfidenceScore[LLM Assigns Confidence
0.0 - 1.0 based on clarity]
Pref --> ConfidenceScore
Habit --> ConfidenceScore
ConfidenceScore --> ImportanceCalc[Calculate Importance Score
Confidence × 10 = 1-10 Scale]
ImportanceCalc --> OllamaScore[Optional: Ollama LLM
Refine Importance Score 1-10]
OllamaScore --> Store[(Store in DB
agent_memories table
importance_score 1-10)]
Store --> Display[Display to User
Memory Review UI]
Display --> Rate{User Rating}
Rate -->|-3 to -1| Irrelevant[Subtract 1 to 3 from importance]
Rate -->|0| Neutral[No adjustment]
Rate -->|+1 to +3| Important[Add 1 to 3 to importance]
Irrelevant --> Update[(Update Memory)]
Neutral --> Update
Important --> Update
Update --> Future[Available for
Future Chats]
Future --> Retrieve[Retrieved During Chat
Based on Relevance]
Retrieve --> Context[Added to Chat Context
Top 10 Relevant]
style LLM fill:#ffd700,stroke:#333,stroke-width:2px
style OllamaScore fill:#ffd700,stroke:#333,stroke-width:2px
style Store fill:#dda0dd,stroke:#333,stroke-width:2px
style Update fill:#dda0dd,stroke:#333,stroke-width:2px
Memory Lifecycle:
- Extraction triggered after enhanced text is ready
- LLM identifies facts, preferences, and habits
- Deduplication against recent memories
- Importance scoring (1-10 scale from confidence × 10)
- User can adjust importance (-3 to +3 points)
- User-adjusted memories have different decay rates
- Access count tracked for relevance
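The scoring arithmetic in this lifecycle is simple enough to show directly. This sketch assumes the 1-10 bound is enforced by clamping and leaves out the decay-rate handling and access-count tracking.

```python
# Sketch of the importance math described above. Clamping to 1-10 is an
# assumption; decay rates and access-count tracking are omitted.

def importance_from_confidence(confidence: float) -> int:
    """Map an LLM confidence in [0.0, 1.0] onto the 1-10 importance scale."""
    return max(1, min(10, round(confidence * 10)))

def apply_user_rating(importance: int, rating: int) -> int:
    """Apply a user rating in [-3, +3]; 0 means no adjustment."""
    if not -3 <= rating <= 3:
        raise ValueError("rating must be between -3 and +3")
    return max(1, min(10, importance + rating))

# Example: a preference extracted with 0.8 confidence, later rated +2.
score = importance_from_confidence(0.8)   # 8
score = apply_user_rating(score, +2)      # 10
```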
5. Background Processing Queue System
Overview: Async job queue with retry logic and status notifications
graph TD
Job[New Processing Job] --> Queue[Job Queue
deque structure]
Queue --> Worker[Worker Loop
Async Processing]
Worker --> Check{Job Status}
Check -->|Pending| Process[Process Job]
Check -->|Processing| Skip[Skip]
Check -->|Complete| Remove[Remove from Queue]
Process --> GetEntry[Get Entry from Database]
GetEntry --> Mode{Processing Mode}
Mode -->|Enhanced| EnhServ[EntryProcessingService
Enhanced Mode]
Mode -->|Structured| StrServ[EntryProcessingService
Structured Mode]
EnhServ --> EnhOllama[Ollama Generate
Enhanced Prompt]
StrServ --> StrOllama[Ollama Generate
Structured Prompt]
EnhOllama --> EnhResult[Enhanced Text Result]
StrOllama --> StrResult[Structured Result]
EnhResult --> UpdateEnh[Update Entry.enhanced_text
+ processing_metadata]
StrResult --> UpdateStr[Update Entry.structured_summary
+ processing_metadata]
UpdateEnh --> MemoryCheck{Enhanced Mode?}
UpdateStr --> SaveSuccess[Save to Database]
MemoryCheck -->|Yes| ExtractMem[Extract Memories
from Enhanced Text]
MemoryCheck -->|No| SaveSuccess
ExtractMem --> SaveSuccess
SaveSuccess --> Success{Success?}
Success -->|Yes| Complete[Mark Job Complete
Set completed_at]
Success -->|No| Retry{Retry Count ≤ 3?}
Retry -->|Yes| Backoff[Exponential Backoff
2^retry_count seconds]
Backoff --> Requeue[Requeue Job
Status: PENDING]
Retry -->|No| Failed[Mark Job Failed
Max Retries Reached]
Complete --> NotifySuccess[WebSocket Notification
Job Complete]
Failed --> NotifyFail[WebSocket Notification
Job Failed]
NotifySuccess --> UI[UI Updates]
NotifyFail --> UI
Requeue --> Queue
style Queue fill:#87ceeb,stroke:#333,stroke-width:2px
style EnhOllama fill:#ffd700,stroke:#333,stroke-width:2px
style StrOllama fill:#ffd700,stroke:#333,stroke-width:2px
style UpdateEnh fill:#dda0dd,stroke:#333,stroke-width:2px
style UpdateStr fill:#dda0dd,stroke:#333,stroke-width:2px
style ExtractMem fill:#90ee90,stroke:#333,stroke-width:2px
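A condensed sketch of the worker loop's retry behavior, assuming an asyncio loop over a deque as in the diagram; `process_job` and `notify` are placeholders for the processing service and the WebSocket layer.

```python
# Condensed sketch of the queue worker. The deque, the 2**retries backoff, and
# the max-3-retries limit follow the flow above; process_job and notify are
# hypothetical stand-ins for the real processing and notification services.
import asyncio
from collections import deque

queue: deque = deque()
MAX_RETRIES = 3

async def worker_loop():
    while True:
        if not queue:
            await asyncio.sleep(0.5)                      # idle poll
            continue
        job = queue.popleft()
        try:
            await process_job(job)                        # enhanced or structured mode
            await notify(job, status="complete")
        except Exception:
            if job["retries"] < MAX_RETRIES:
                job["retries"] += 1
                await asyncio.sleep(2 ** job["retries"])  # exponential backoff
                queue.append(job)                         # requeue as PENDING
            else:
                await notify(job, status="failed")
```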
6. Pattern Insights & Mood Analysis System
Overview: Advanced analysis available to all users, combining pattern detection across time periods with mood analysis for emotional insights
graph TD
Trigger{Analysis Trigger}
Trigger -->|User Requests Analysis| PatternAnalyze[POST /api/patterns/analyze
Trigger Pattern Detection]
Trigger -->|Enhanced Text Available| MoodAnalysis[Mood Analysis Trigger
POST /api/entries/analyze-mood]
PatternAnalyze --> PatternService[Pattern Detector Service]
PatternService --> PatternLLM[Analyze Entry Corpus
Detect patterns across time]
PatternLLM --> PatternTypes{Pattern Categories}
PatternTypes -->|Mood| MoodPatterns[Mood Patterns
Emotional cycles and triggers]
PatternTypes -->|Topic| TopicPatterns[Topic Patterns
Recurring themes and interests]
PatternTypes -->|Behavior| BehaviorPatterns[Behavior Patterns
Habits and routine changes]
PatternTypes -->|Temporal| TemporalPatterns[Temporal Patterns
Time-based correlations]
MoodPatterns --> PatternStore[(Store in patterns table
with confidence and frequency)]
TopicPatterns --> PatternStore
BehaviorPatterns --> PatternStore
TemporalPatterns --> PatternStore
PatternStore --> PatternUI[Pattern Insights UI
Diamond icon in sidebar]
MoodAnalysis --> MoodService[Mood Analysis Service]
MoodService --> MoodLLM[Ollama LLM
Mood Detection Prompt]
MoodLLM --> MoodParse[Parse JSON Response
Extract 1-5 mood tags]
MoodParse --> MoodValidate[Validate Mood Vocabulary
happy/stressed/excited/etc]
MoodValidate --> MoodSave[(Update Entry
mood_tags column)]
MoodSave --> MoodTrends[Mood Trend Analysis
Available in UI]
PatternUI --> Insights[User Views Insights
Patterns and Correlations]
MoodTrends --> Insights
style PatternLLM fill:#ffd700,stroke:#333,stroke-width:2px
style MoodLLM fill:#ffd700,stroke:#333,stroke-width:2px
style PatternStore fill:#dda0dd,stroke:#333,stroke-width:2px
style MoodSave fill:#dda0dd,stroke:#333,stroke-width:2px
style PatternUI fill:#90ee90,stroke:#333,stroke-width:2px
style Insights fill:#90ee90,stroke:#333,stroke-width:2px
Pattern Detection Features:
- Four Pattern Types: Mood, Topic, Behavior, and Temporal patterns
- Confidence Scoring: Statistical confidence for detected patterns
- Entry Correlation: Links patterns to specific journal entries
- Mood Analysis: Real-time emotional tagging with 1-5 mood tags per entry
- Trend Visualization: Mood and pattern trends over time
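As a small illustration of the mood-analysis branch, this is how the JSON parsing and vocabulary validation step could look; the `moods` key and the vocabulary shown are assumptions, since the diagram only names a few of the allowed tags.

```python
# Sketch of parsing and validating the mood-detection LLM output. The "moods"
# key and the vocabulary below are illustrative assumptions.
import json

MOOD_VOCABULARY = {"happy", "sad", "stressed", "excited", "calm", "anxious", "grateful"}

def parse_mood_tags(llm_response: str, max_tags: int = 5) -> list:
    """Extract 1-5 valid mood tags from a JSON LLM response."""
    try:
        data = json.loads(llm_response)
    except json.JSONDecodeError:
        return []                                  # unparseable, save nothing
    raw_tags = data.get("moods", []) if isinstance(data, dict) else []
    valid = [t.lower() for t in raw_tags
             if isinstance(t, str) and t.lower() in MOOD_VOCABULARY]
    return valid[:max_tags]

# parse_mood_tags('{"moods": ["Happy", "stressed", "unknown"]}') -> ["happy", "stressed"]
```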
7. Embedding Generation & Hybrid Search System
Overview: BGE-small model for embeddings → Hybrid search combining semantic similarity with keyword-matching boosts
graph TD
Text[Input Text] --> Type{Text Type}
Type -->|Document| DocFormat[No special prefix for documents]
Type -->|Query| QueryFormat[Add 'Represent this sentence for searching relevant passages:' prefix]
DocFormat --> BGE[BGE-small-en-v1.5
SentenceTransformer]
QueryFormat --> BGE
BGE --> Vector[384-dimension vector]
Vector --> Norm[L2 Normalization]
Norm --> Store[(Store as JSON
in embeddings column)]
Search[Search Query] --> QueryEmb[Generate Query Embedding]
QueryEmb --> Load[Load All Entry Embeddings]
Load --> Cosine[Cosine Similarity
numpy.dot product]
Cosine --> Sort[Sort by Score]
Sort --> Filter[Apply Threshold
Default: 0.3]
Filter --> Candidates[Get 2x Candidates
for Hybrid Reranking]
Candidates --> HybridSearch[Hybrid Search Service]
HybridSearch --> ExactMatch{Exact Match Found?}
HybridSearch --> PartialMatch{Partial Words Match?}
ExactMatch -->|Yes| ExactBoost[+0.2 Boost
Whole query in text]
ExactMatch -->|No| CheckPartial[Check Word Overlap]
CheckPartial --> PartialMatch
PartialMatch -->|Yes| PartialBoost[+0.1 × Match Ratio
Based on word overlap]
PartialMatch -->|No| NoBoost[No Text Boost]
ExactBoost --> Combine[Combine Scores
Semantic + Text Boosts]
PartialBoost --> Combine
NoBoost --> Combine
Combine --> Cap[Cap Final Score at 1.0]
Cap --> Rerank[Rerank by Hybrid Score]
Rerank --> Context[Extract Search Context
Around Keyword Matches]
Context --> FinalResults[Top K Results
with Enhanced Scoring]
FinalResults --> Return[Return to User]
style BGE fill:#87ceeb,stroke:#333,stroke-width:2px
style Store fill:#dda0dd,stroke:#333,stroke-width:2px
style HybridSearch fill:#ffd700,stroke:#333,stroke-width:2px
style ExactBoost fill:#90ee90,stroke:#333,stroke-width:2px
style PartialBoost fill:#90ee90,stroke:#333,stroke-width:2px
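The embedding path at the top of this diagram maps almost directly onto the sentence-transformers API. A minimal sketch, assuming the standard `BAAI/bge-small-en-v1.5` checkpoint and JSON storage as described:

```python
# Sketch of document/query embedding with BGE-small: queries get the BGE
# retrieval prefix, documents do not, and vectors are L2-normalized before
# being serialized to JSON for the embeddings column.
import json
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-small-en-v1.5")  # 384-dimension output
QUERY_PREFIX = "Represent this sentence for searching relevant passages: "

def embed(text: str, is_query: bool = False) -> list:
    if is_query:
        text = QUERY_PREFIX + text
    vector = model.encode(text, normalize_embeddings=True)  # L2-normalized
    return vector.tolist()

# Stored alongside the entry:
embedding_json = json.dumps(embed("Planned the garden layout today."))
```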
Hybrid Search Features:
- Semantic Similarity: Base BGE embedding cosine similarity
- Exact Match Boost: +20% for entries containing the full query
- Partial Match Boost: +10% × (matched words / total query words)
- Context Extraction: Smart snippets around keyword matches
- Score Capping: Final scores capped at 100% (1.0) for consistency
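The boost arithmetic above is concrete enough to sketch directly. Assuming both vectors are already L2-normalized (so the dot product is the cosine similarity), the per-entry hybrid score could be computed like this; candidate selection, the 0.3 threshold, and context extraction are left out.

```python
# Sketch of the hybrid reranking score: semantic cosine similarity plus an
# exact-match boost (+0.2) or a partial word-overlap boost (+0.1 * ratio),
# with the final score capped at 1.0.
import numpy as np

def hybrid_score(query: str, entry_text: str,
                 query_vec: np.ndarray, entry_vec: np.ndarray) -> float:
    semantic = float(np.dot(query_vec, entry_vec))    # cosine, vectors pre-normalized

    text, q = entry_text.lower(), query.lower()
    if q in text:                                     # whole query appears verbatim
        boost = 0.2
    else:
        words = q.split()
        matched = sum(1 for w in words if w in text)
        boost = 0.1 * (matched / len(words)) if words else 0.0

    return min(1.0, semantic + boost)
```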