unified_dividends rows: 787k+
🟢
Grok / X API — Pay-Per-Use Active — Migrated to X API 2025 pay-per-use billing. Monthly credit exhaustion no longer possible. Ensemble weights restored. grok_x_dividend_service.py upgraded to v2.0: search_ticker_signals() for any ticker · XMCP Server hook ready · New MCP tool get_x_dividend_signals added (tool #19). Model: grok-3-fast
🟢
ML API Healthy — heydividend-unified-ml v4.0.0 running on port 9001 with 7 models loaded. HeyDividend AI backend on port 8001 active. Claude Intelligence Layer v1.2.0 — 8 services registered.
🟢
Accuracy & Performance Upgrades Active — Accuracy log batch INSERT (1 background thread vs 24 sequential), DividendScreener stale-connection retry (pool recycle on OperationalError), Two-layer LLM Response Cache deployed.
🟢
DB Integrity Guard — 3-Layer Protection Active — CHK_unified_dividends_amount_sanity SQL CHECK constraint deployed (amount > 0 AND amount ≤ $500). Code-level batch-uniformity guard + moat cleanup endpoint. 787,787 clean rows · 0 over $500 · 0 zeroes.
🟢
VM Cron Health Monitoring Active — vm_training_cron.sh permissions fixed. training_health_check.py deployed on LLM VM — runs every 6h, validates all training tables + unified_dividends integrity. GitHub Actions ML training pipeline fixed: 6-model pipeline runs daily 2 AM UTC; models 1, 5 & 6 train on synthetic data when DB unreachable.
🟢
Cross-API Verification Layer Active — multi_source_verifier.py fires 5 parallel API calls (FMP · Finnhub · EODHD · Alpha Vantage · HeyDividend AI DB) on every ticker preflight. Median consensus for price, yield, annual dividend, payout ratio & last dividend is injected into the LLM system prompt before any response is generated. Confidence ratings: HIGH✅ ≥3 sources agree · MEDIUM⚡ 2 sources · LOW⚠️ single source. 5-minute TTL cache per ticker.
🟢
Second Brain Deployed — Knowledge Graph + Weekly Review — Entity knowledge graph (heydividend_memory_edges) adds backlinks between memories; retrieve_connected() expands 1 hop along edges to surface linked prior thinking, not just frequency hits. Weekly Claude Sonnet 4 knowledge review, graph-driven Map of Content, and explicit memory tiering (fleeting → literature → permanent). New endpoints GET /api/v1/memory/graph, /review/latest, POST /review/run live on the LLM VM.
🖥️System Vitals
Host: Azure VM — 20.81.210.213
OS: Ubuntu (miniconda3 llm env)
Uptime: ~24h (refreshed via deploy)
Load avg: 0.75 / 0.79 / 0.81
RAM: 2.8 GB used / 27 GB total
Disk: 148 GB / 248 GB (60%)
⚡Active Processes
| Process | Port | CPU | MEM | Status |
| HeyDividend AI Main API | 8001 | 7.3% | 5.3% | LIVE |
| HeyDividend AI ML API | 9001 | 0.9% | 1.2% | LIVE |
| Internal API (9000) | 9000 | 0.0% | 1.2% | IDLE |
| Gemini Scheduler | — | 0.0% | 0.0% | RUNNING |
| Service Watchdog | — | 0.0% | 0.0% | ACTIVE |
| Port 8000 (shadow) | 8000 | 189% | 0.6% | RESTARTING |
📊Training Database — Row Counts
📦Disk Breakdown
🧬Dividend Neural Engine — 7 Modules (Boris Banushev Framework)
🌊 Fourier Denoiser
FFT-based denoising of dividend history. Detects special dividends as 2σ outliers. Isolates true trend vs noise.
active 215 lines
📊 ARIMA Feature
Auto-selects best (p,d,q) via AIC. Forecasts next dividend as ML input feature. Handles short series gracefully.
active 212 lines
🔲 Stacked Autoencoder
PyTorch 3-layer encoder/decoder. Encodes 12 dividend features into a 16-dim latent vector for LSTM input.
PyTorch 473 lines
🔮 LSTM Predictor
2-layer LSTM + attention. MC Dropout (50 passes) for confidence intervals. Predicts next dividend amount.
PyTorch 445 lines
🎲 GAN Engine
LSTM Generator + CNN Discriminator. Generates 100 synthetic dividend futures. cut_probability → 8% ensemble vote.
ensemble voter 432 lines
🗺️ SOM Anomaly Detector
10×10 Self-Organizing Map. Detects unusual payout/FCF trajectories. Returns similar tickers by BMU proximity.
minisom 487 lines
🔗 Eigen / Contagion
PCA on dividend growth matrix. Dividend contagion risk: if sector peer cuts, who follows? 24h TTL cache.
sklearn PCA 256 lines
🏭HeyDividend AI Unified ML API — Port 9001
Status: ✓ healthy
Service: heydividend-unified-ml
Version: 4.0.0
Models loaded: 7
Host: 0.0.0.0:9001 (public) + nginx proxy
Endpoints: /score/symbol · /predict/yield · /predict/cut-risk · /predict/payout-rating · /health
CPU usage: 0.9%
Memory: 1.2% of 27 GB ≈ 330 MB
⚖️Ensemble Weight Distribution
Claude Sonnet 4 (Verification): 26%
HeyDividend AI GPT-5 (Azure OAI): 13%
Market Signals (CDS + IV): 3%
🛡️ML-Powered Moat Features
FeatureExtractionService
Extracts 12 proprietary dividend features from DB + market data for ML input pipeline.
active
DividendQualityScorer
ML scoring of dividend quality across 6 dimensions. sklearn-based (offline on LLM VM).
sklearn missing (dev)
CutRiskAnalyzer
XGBoost dividend cut risk model. Rule-based fallback active when model unavailable.
xgboost missing (dev)
HeyDividend AI Score Service
Pre-computed scores for top 500 tickers daily at 2 AM. Composite dividend health score.
active
NAV Avoidance ML Screen
6 warning signals, erosion trend, distribution trap detection across 50+ securities.
active
Claim Verification Gate
Mandatory fact-check before AI response delivery. Cross-references ground truth tables. Upgraded with apply_multi_source_consensus() — cross-checks LLM output against median consensus, upgrades to VERIFIED within tolerance, marks CONFLICT on deviation.
active
🔬Cross-API Verification Layer — v1.0
multi_source_verifier.py
active
804 lines
5-min TTL cache
Fires 5 parallel API calls on every ticker preflight. Computes median consensus across all responding sources. Injects a VerificationResult.context_block() into the LLM system prompt before any response is generated — ensuring HeyDividend AI always reasons from ground-truth consensus rather than stale or hallucinated figures.
Data Sources
FMP (Financial Modeling Prep)
Finnhub
EODHD
Alpha Vantage
HeyDividend AI DB (unified_dividends)
Verified Metrics
📈 Price
💰 Dividend Yield
📅 Annual Dividend
⚖️ Payout Ratio
🕐 Last Dividend
Discrepancy Thresholds
Price: 2%
Dividend: 5%
Yield: 15%
Payout ratio: 20%
Confidence Levels
HIGH ✅ ≥3 sources agree
MEDIUM ⚡ 2 sources agree
LOW ⚠️ single source
Conflicts flagged & surfaced to LLM
Pipeline flow: ticker preflight → parallel fetch (asyncio.gather) → median consensus → context_block() → LLM system prompt injection → response generation → apply_multi_source_consensus() post-check → VERIFIED / CONFLICT label appended
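The consensus step in the flow above can be sketched in a few lines. This is a minimal illustration, not the code in multi_source_verifier.py: the function name `consensus`, the `NONE` level for an empty result, and the rule that a source "agrees" when it lies within the listed discrepancy threshold of the median are all assumptions made for the example.

```python
from statistics import median

# Relative tolerances copied from the discrepancy thresholds listed above.
THRESHOLDS = {"price": 0.02, "dividend": 0.05, "yield": 0.15, "payout_ratio": 0.20}


def consensus(metric, readings):
    """Median consensus for one metric across whichever sources responded.

    `readings` maps a source name to its reported value.
    """
    if not readings:
        # Assumption: no responding source yields no consensus at all.
        return {"metric": metric, "value": None, "confidence": "NONE", "conflicts": []}

    mid = median(readings.values())
    tol = THRESHOLDS[metric] * abs(mid)
    # Assumption: a source agrees when it sits within tolerance of the median.
    agree = [src for src, val in readings.items() if abs(val - mid) <= tol]
    conflicts = sorted(set(readings) - set(agree))

    if len(agree) >= 3:
        level = "HIGH"
    elif len(agree) == 2:
        level = "MEDIUM"
    else:
        level = "LOW"
    return {"metric": metric, "value": mid, "confidence": level, "conflicts": conflicts}
```

Called with four price readings where one source is far off, the outlier lands in `conflicts` while the remaining three still produce a HIGH rating, which is the behaviour the median is chosen for.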
📐Quantium Library Stack — 20+ Libraries (Dividend Intelligence Pipeline)
edgartools
SEC EDGAR access — 8-K dividend declarations, 10-K XBRL cash flows, Item 5 policy text extraction
Stage 1
exchange-calendars
NYSE/NASDAQ/LSE trading day validation for ex-date confirmation and growth streak counting
Stage 2
pandas-market-calendars
Market-aware date arithmetic — fiscal year boundary detection, pay-gap widening (liquidity stress signal)
Stage 2
arch (GARCH)
GARCH(1,1) volatility model on dividend yield series. Conditional variance = cut risk. Persistence α+β.
Stage 3
pmdarima
Auto-ARIMA with AIC/BIC order search. Dividend payment forecasting with confidence intervals.
Stage 4
prophet
Facebook Prophet with quarterly seasonality decomposition. 2nd member of 3-model ensemble for 4-quarter forecast.
Stage 4
scikit-learn Ridge
Ridge regression (L2, α=1.0) with lag-4 windowed features and StandardScaler. 3rd ensemble member. Dynamic inverse-MAE weighting. CI: ±1.28×val_MAE (80%).
Stage 4 · inverse-MAE weights
tsfresh
EfficientFCParameters — 800+ statistical features from payment time series. 20-feature curated subset.
Stage 5
empyrical-reloaded
Income Sharpe, income Sortino, max income drawdown. Treats dividends as the "returns" series.
Stage 6
ffn
Financial functions — yield-on-cost CAGR, income consistency score, drawdown analytics.
Stage 6
PyPortfolioOpt
Max-yield portfolio optimization where yield replaces expected returns in the efficient frontier.
Stage 7
Riskfolio-Lib
HRP (Hierarchical Risk Parity) across dividend growth correlations for robust position sizing.
Stage 7
skfolio
Portfolio optimization toolkit — additional constraint handling for income-focused allocation problems.
Stage 7
alphalens-reloaded
Factor IC / ICIR back-testing of HeyDividend AI Safety Score, yield screen, DGR screen as alpha signals.
Stage 8
ta · finta
Technical analysis indicators applied to dividend yield time series for regime context signals.
supporting
financedatabase
Sector/industry classification database — sector peer grouping for contagion and factor analysis.
supporting
pypme · rateslib
Public Market Equivalent benchmarking and rates/bond analytics for yield spread context.
supporting
🔧Core Intelligence Services
Intelligent Query Service
19 intent types · multi-model fallback chain · learning loop
Hallucination Prevention
Ticker Preflight Gate · Schema fingerprint · Claim verification
Dividend Safety Ensemble
7-voter weighted system · VERY_SAFE → CRITICAL scale
NAV Erosion Service
50 securities · 6 warning signals · distribution trap detection
Chart Generator Service
7 chart types · matplotlib · server-side PNG generation · includes Forecast Fan (bull/base/bear · FMP consensus + ±1.65σ fallback)
PDF Research Report Gen
4 report types · ReportLab · institutional-grade output
SEC Filing RAG Service
EDGAR retrieval · vector search · 10-K/Q synthesis
Stock Card Service
Data-driven per-ticker summaries · ML integration · 30+ stopwords
🌐Real-Time Data Services
FMP Integration (80+ endpoints)
Financial Modeling Prep · market data · fundamentals · estimates
Finnhub Service
Real-time quotes · news · earnings · analyst ratings
X.com Monitoring
Grok agent tools · circuit breaker active · X API 2025 pay-per-use
FRED Economic Data
Federal Reserve economic indicators · interest rate feeds
Unified Dividends Pipeline
Multi-source dividend data · FMP + Finnhub + EDGAR → Azure SQL
Multi-Source Backfill
Auto data discovery · gap filling · Backfilled_Dividends/Prices tables
🚀v4.2 Feature Layer
Semantic Vector Memory
Persistent user conversation memory with semantic retrieval
ElevenLabs TTS
Text-to-speech audio generation for research summaries
Agentic Tool-Calling Loop
Multi-step tool orchestration with self-correction
Code Interpreter (Sandboxed)
Safe Python execution for user-defined financial calculations
SSE Token Streaming
Real-time token-by-token response streaming via Server-Sent Events
WebSocket Alert Push
Real-time dividend alert delivery via persistent WebSocket
User Profile Injection
Personalized context from persistent user profiles into LLM prompts
Auto Fine-Tuning Pipeline
Fully automated: Sunday 1 AM UTC · 3 sources (50/35/15%) · min 200 pairs · gpt-4o-mini-2024-07-18 · 6h polls · zero manual steps
HeyDividend AI v4.0 Training Agent
29 seed pairs across 6 categories · 15 LLM variations per run (every 2h) · ~180 pairs/day into InvestmentResearcherTraining (35% fine-tune weight)
🧠Second Brain — Knowledge Graph & Weekly Review NEW
Entity Knowledge Graph
Backlinks layer over block memory: typed edges between entities that co-occur in a stored response — in_sector, has_metric, compared_with, risk_flag. retrieve_connected() expands 1 hop ordered by weight × confidence; stale edges decay by confidence and are pruned during consolidation.
active · heydividend_memory_edges · block_memory_service.py
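A one-hop expansion ordered by weight × confidence could look like the sketch below. The `Edge` tuple, the `limit` parameter, and the choice to follow edges in both directions (treating them as backlinks) are assumptions for illustration; the real retrieve_connected() in block_memory_service.py may differ.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    src: str
    dst: str
    relation: str      # e.g. in_sector, has_metric, compared_with, risk_flag
    weight: float
    confidence: float


def retrieve_connected(seeds, edges, limit=5):
    """Return entities one hop away from `seeds`, best score first."""
    seeds = set(seeds)
    best = {}  # neighbor -> (score, relation)
    for e in edges:
        # Assumption: edges are followed in either direction.
        if e.src in seeds and e.dst not in seeds:
            neighbor = e.dst
        elif e.dst in seeds and e.src not in seeds:
            neighbor = e.src
        else:
            continue
        score = e.weight * e.confidence
        if neighbor not in best or score > best[neighbor][0]:
            best[neighbor] = (score, e.relation)
    ranked = sorted(best.items(), key=lambda kv: kv[1][0], reverse=True)
    return [(name, rel, score) for name, (score, rel) in ranked[:limit]]
```

Keeping only the strongest edge per neighbor avoids listing the same entity twice when it is linked to a seed through several relations.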
Weekly Claude Knowledge Review
Per-user Claude Sonnet 4 digest: Themes · Risk watch · Gaps. Reads the user's recent block-memory entities and relationship edges from the last 7 days (including compared_with and risk_flag edges) and writes a short structured review. Persisted to heydividend_knowledge_reviews for retrieval/audit.
Claude Sonnet 4 · knowledge_review_service.py · weekly job
Map of Content (MOC)
Living "what I've researched" map driven by the graph — new topic:"moc" on the mind-map service: hubs ranked by occurrence, children via graph neighbors labeled by relation. Reuses the existing node schema (frontend unchanged) and fixes the prior bug that left the "Recent Focus (Memory)" pillar empty.
active · mind_map_service.py · POST /api/v1/reports/mind-map
Memory Tiering
Explicit lifecycle on each memory: fleeting (seen once) → literature (corroborated, embedded, ≥1 edge) → permanent (meets the fine-tune threshold). Prompt injection prioritizes permanent → literature → fleeting; fine-tune export filters on permanent. Promotion happens inside consolidate().
active · tier column · consolidate()
Memory API Surface
New additive endpoints: GET /api/v1/memory/graph (raw nodes + edges), GET /api/v1/memory/review/latest, POST /api/v1/memory/review/run (manual trigger). Registered before the /{user_id} catch-all so they aren't shadowed. Verified live on the LLM VM.
deployed · memory_routes.py · 3 endpoints
🆕Recent Platform Upgrades
Chat History API
Full conversation persistence to Azure SQL. 4 endpoints: GET /api/v1/chat/conversations (list recent), GET /conversations/{id}/messages (restore thread), DELETE /conversations/{id}, POST /conversations/new. Streams persist via _history_save_wrapper. User/conversation IDs via x-user-id/x-conversation-id headers.
active · chat_history_routes.py · 4 endpoints
Analysis Depth Overhaul v4.1
Every single-ticker dividend card now calls Claude Sonnet for 5 mandatory analysis sections: Dividend Sustainability (VERY SAFE/SAFE/WATCH/AT RISK/DANGER), Growth Trajectory, Yield in Context (vs sector/S&P/T-bill/peers), Business Quality, and HeyDividend AI's Verdict (BUY/HOLD/WATCH/AVOID). max_tokens raised to 2800. Banned-phrase enforcement. prefer_groq=False.
Claude Sonnet 4 · ai_sdk_routes.py · 5 sections · 2800 tokens
Two-Layer LLM Response Cache
Semantic response cache with an in-memory dictionary (L1) backed by a persistent Azure SQL table (L2). Reduces repeated-query token cost by 30–50%. Semantic similarity matching lets rephrased queries hit the cache instead of missing. LRU eviction on L1 overflow.
active · L1 in-memory · L2 Azure SQL
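The L1/L2 split with LRU eviction can be sketched as follows. This is a simplified stand-in: the class name `TwoLayerCache` is hypothetical, a plain dict plays the role of the Azure SQL table, and semantic similarity matching is replaced by whitespace and case normalization, so only the layering and eviction logic is shown.

```python
from collections import OrderedDict


class TwoLayerCache:
    """L1: bounded in-memory LRU. L2: persistent store (a dict stand-in here)."""

    def __init__(self, l1_capacity=128, l2_store=None):
        self.l1 = OrderedDict()
        self.l1_capacity = l1_capacity
        self.l2 = {} if l2_store is None else l2_store

    @staticmethod
    def _key(query):
        # Assumption: normalization stands in for semantic matching.
        return " ".join(query.lower().split())

    def _put_l1(self, key, value):
        self.l1[key] = value
        self.l1.move_to_end(key)              # mark as most recently used
        while len(self.l1) > self.l1_capacity:
            self.l1.popitem(last=False)       # evict least recently used

    def get(self, query):
        key = self._key(query)
        if key in self.l1:
            self.l1.move_to_end(key)
            return self.l1[key]
        if key in self.l2:
            self._put_l1(key, self.l2[key])   # promote an L2 hit into L1
            return self.l2[key]
        return None

    def put(self, query, response):
        key = self._key(query)
        self.l2[key] = response
        self._put_l1(key, response)
```

An entry evicted from L1 is still served from L2 on the next lookup and promoted back, so eviction costs one slower read rather than a fresh LLM call.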
99.999% Accuracy Enforcement (v1.0)
Multi-layer accuracy framework: middleware interception, claim verification gate, stale data refresh trigger, multi-source cross-validation. Batch INSERT logging (1 background thread replaces 24 sequential DB calls — ~4.2s → 200ms). HeyDividend AI accuracy log table auto-ensured once per process.
active · claim_verification_gate.py · batch INSERT
Broker Connect via Plaid (REST + MCP)
5 REST endpoints at /api/plaid/* plus 5 MCP tools. 3-step flow: create link token → connect account → sync portfolio. CUSIP (100%), ISIN (95%), ticker (70%) institutional-grade security matching. Dividend-only filtering, portfolio upsert, ownership-verified disconnect.
active · /api/plaid/* · PLAID_CLIENT_ID + PLAID_SECRET
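One way to read the CUSIP / ISIN / ticker percentages is as a match-confidence cascade: try the strongest identifier first and fall back to weaker ones. The sketch below assumes that interpretation; `match_security`, the index layout, and the placeholder identifiers are hypothetical.

```python
# Identifier types in priority order with the confidence quoted above.
MATCH_ORDER = (("cusip", 1.00), ("isin", 0.95), ("ticker", 0.70))


def match_security(holding, index):
    """Resolve a brokerage holding to an internal symbol.

    `holding` is a dict that may carry cusip / isin / ticker keys.
    `index` maps identifier type -> {identifier: symbol}.
    Returns (symbol, confidence), or (None, 0.0) when nothing matches.
    """
    for field, confidence in MATCH_ORDER:
        identifier = holding.get(field)
        if identifier and identifier in index.get(field, {}):
            return index[field][identifier], confidence
    return None, 0.0
```

Returning the confidence alongside the symbol lets the sync step decide whether a ticker-only match is strong enough to upsert or should be held for review.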
DividendScreener Stale-Connection Retry
2-attempt retry loop in request_handler.py. On OperationalError at attempt 0 calls engine.dispose() to recycle the pymssql connection pool, then retries. Graceful fallback to LLM narrative if second attempt also fails. Eliminates "connection closed" errors after DB idle.
active · request_handler.py:637 · pool recycle
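The retry described above follows a common pattern, sketched here without any database dependency. The helper name `run_with_pool_recycle` and its parameters are illustrative; in the real handler `retryable` would presumably be SQLAlchemy's OperationalError, and `engine` only needs to expose `dispose()`.

```python
def run_with_pool_recycle(engine, query_fn, fallback_fn,
                          retryable=(Exception,), attempts=2):
    """Run query_fn; on the first retryable failure recycle the pool and retry.

    If every attempt fails, return fallback_fn() (for example an LLM
    narrative answer) instead of raising.
    """
    for attempt in range(attempts):
        try:
            return query_fn()
        except retryable:
            if attempt == 0:
                # Drop pooled connections so the retry opens a fresh one.
                engine.dispose()
    return fallback_fn()
```

Disposing only on the first failure keeps the second attempt cheap: if a brand-new connection also fails, the problem is unlikely to be a stale pool, so the function falls back instead of looping.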
Fix-It Agent v2.1
Hourly scan of conversation logs for failed responses. 19 fix strategies (7 new in v2.1: TAX_EFFICIENCY, ETF_ANALYSIS, DRIP_COMPOUNDING, RETIREMENT_INCOME, COMPARISON_ANALYSIS, POSITION_SIZING, DGI_STRATEGY). 45 bad-response detection patterns (12 new: knowledge-cutoff deflections, clarification-fishing, tool-error leakage, hedging, truncated/disclaimer-only responses, AI identity deflections, info-begging, accuracy disclaimers, hallucination confidence hedges, stall phrases).
active · 19 strategies · 45 patterns · v2.1
DIVIDEND_SCREENER Training Data v2
86x failure-pattern fix for SCREENER_LIST intent. dividend_intelligence_training.py: 12→20 scenarios, 8 new query pools (PRICE_CONSTRAINED, FCF_COVERAGE, TICKER_SEEDED, RECOVERY_DIVIDEND, NEGATIVE_SCREEN, INCOME_TARGET_REVERSE, COMPARATIVE_BETTER, VAGUE_INTENT). comprehensive_investment_trainer.py: +36 templates across 6 gap-pattern categories. LLM override prompt for screener intents — requires concrete ticker list with yield, safety rating, and rationale.
active · 20 scenarios · SCREENER_LIST · 86x fix
🧠HeyDividend AI v4.0 Standalone Capabilities
General Knowledge Handler
Detects educational/definitional queries ("what is inflation", "explain DCF") via prefix matching + capability triggers. Routes before ticker extraction — HeyDividend AI answers as CIO-level financial educator.
active · heydividend_persona.py
Capability Registry
34 structured capabilities across 8 categories: Dividend Analysis, Portfolio Strategy, ETF Intelligence, Equity Research, Market/Macro, Screeners, Education, Advanced Tools. Exposed via GET /api/v1/capabilities.
active · heydividend_capabilities.py
Agentic Reflect Loop
Fires after CHAT PATH LLM stream completes — checks if critical content is missing and appends a supplement. Only activates for non-low-latency queries with parsed tickers. Capped at 3s via SIGALRM.
active · _reflect_and_supplement()
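A hard time cap via SIGALRM, as mentioned for the reflect loop, can be sketched like this. The helper `run_with_deadline` is hypothetical and is not the body of _reflect_and_supplement(); it works only on Unix and only in the main thread, which is a general limitation of SIGALRM.

```python
import signal


class _Deadline(Exception):
    """Raised by the alarm handler when the time budget is spent."""


def run_with_deadline(fn, seconds, default=None):
    """Run fn(); return `default` if it takes longer than `seconds`."""
    def _on_alarm(signum, frame):
        raise _Deadline()

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return fn()
    except _Deadline:
        return default
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)    # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)    # restore the old handler
```

With `seconds=3.0` and `default=""`, a slow supplement is simply dropped and the already-streamed answer stands on its own.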
Graceful Degradation
50 pre-built Q&A answers for common finance topics. Activates on full provider failure. Health exposed via GET /api/v1/health — 6 provider statuses (Azure OAI, Claude, Gemini, OpenAI, Groq, graceful_degradation).
active · 50 answers · graceful_degradation.py
Unified Persona Contract
Single source of truth for HeyDividend AI's CIO/Chief Strategist identity. get_system_prompt(mode="full"|"compact"|"fast") assembles the right persona for each call site. Imported by llm_providers.py and intelligent_query_service.py.
active · heydividend_persona.py
📊StructuredKPIService — Perplexity Finance-Style Breakouts
Auto-appended to every single-ticker CHAT PATH response. Auto-detects security type via symbol sets + FMP profile. Fetches data in parallel via ThreadPoolExecutor (5s hard cap). Renders clean markdown panels.
BLOCK TYPE — stock
Business segments, valuation ratios, P&L summary for general equities.
BLOCK TYPE — dividend_stock
Yield, FCF payout, ML cut risk, 7-model safety score, growth streak years.
BLOCK TYPE — reit
FFO proxy, AFFO payout ratio, debt/equity, ML cut risk score.
BLOCK TYPE — etf
Top 10 holdings, sector allocation, expense ratio, AUM.
BLOCK TYPE — dividend_etf
Weighted yield, NAV erosion flag, distribution schedule.
BLOCK TYPE — bank
Net Interest Margin, ROE, ROA, efficiency ratio, P/Book.
BLOCK TYPE — mlp
DCF coverage ratio, distribution yield, debt/EBITDA, commodity exposure.
🏗️HarvestEngine Platform — 8 Modules
Backtesting Engine
6 strategies · DRIP, Growth, High-Yield, Capture, Aristocrat, Covered Call
Portfolio Optimizer
Dividend income optimization · Markowitz-inspired allocation
Dividend Risk Analyzer
NAV erosion, cut risk, FCF coverage, sector contagion
Income Impact Simulator
Dividend income impact projection with DRIP compounding
Dividend Calendar
Ex-date tracking · declaration date patterns · 25 securities
NAV Avoidance Screener
50 securities · 20 profiles · 6 warning signals
Multi-Model Safety Ensemble
Orchestration layer · 7 voters · weight redistribution on failure
HarvestEngine Continuous Training
Expert Q&A pair generation · 13,120 rows in DB · ThreadPoolExecutor
🧪Dividend Intelligence Pipeline — 8 Services (Quantium Library Stack)
Full ML pipeline orchestrated by dividend_intelligence_pipeline.py, with all 8 stages reporting available: true. Returns a unified DividendIntelligenceReport with HeyDividend AI composite score & plain-language verdict.
STAGE 1 — EDGAR Dividend Service
edgar_dividend_service.py
edgartools pulls 8-K declared dividends + 10-K XBRL cash flow + Item 5 policy text. Policy type classifier (consistent / growth / variable / suspended).
edgartools · SEC EDGAR · XBRL
STAGE 2 — Trading Calendar Service
trading_calendar_service.py
exchange_calendars validates ex-dates, counts consecutive growth years respecting fiscal years, detects pay-gap widening (liquidity stress), estimates next ex-date.
exchange-calendars · pandas-market-calendars
STAGE 3 — Yield Volatility Service
yield_volatility_service.py
ARCH/GARCH(1,1) fit on dividend yield time series → conditional variance = cut risk signal. 0–100 score, regime (stable / elevated / crisis), persistence alpha+beta.
arch/GARCH · cut risk 0–100
STAGE 4 — Dividend Forecast Service
dividend_forecast_service.py
Three-model ensemble: Auto-ARIMA (pmdarima) + Facebook Prophet + Ridge regression (L2, α=1.0, lag-4 windowed features, StandardScaler). Dynamic inverse-MAE weighting — weight ∝ 1/val_MAE, equal-weight fallback if <2 valid. Next 4 payment forecasts with 80% CI. Predictability score 0–100.
pmdarima · prophet · Ridge · 4-quarter CI · inverse-MAE weights
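The inverse-MAE weighting and the 80% interval can be written out directly. This is a sketch under assumptions: the function names are hypothetical, a model with a missing or non-positive validation MAE is treated as invalid and given zero weight, and the equal-weight fallback spreads weight over all listed models.

```python
def inverse_mae_weights(val_mae):
    """Weight each model in proportion to 1 / validation MAE.

    `val_mae` maps model name -> MAE, or None when validation failed.
    """
    valid = {m: e for m, e in val_mae.items() if e is not None and e > 0}
    if len(valid) < 2:
        # Equal-weight fallback when fewer than two models are valid.
        return {m: 1.0 / len(val_mae) for m in val_mae}
    inverse = {m: 1.0 / e for m, e in valid.items()}
    total = sum(inverse.values())
    return {m: inverse.get(m, 0.0) / total for m in val_mae}


def ci80(point, val_mae):
    """80% interval as point ± 1.28 × validation MAE."""
    half_width = 1.28 * val_mae
    return point - half_width, point + half_width


def blend(predictions, weights):
    """Weighted average of per-model point forecasts."""
    return sum(weights[m] * predictions[m] for m in predictions)
```

For validation MAEs of 0.1, 0.2 and 0.2 the weights come out as 0.5, 0.25 and 0.25, so the most accurate model contributes half of the blended forecast.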
STAGE 5 — Dividend Feature Extractor
dividend_feature_extractor.py
tsfresh EfficientFCParameters extracts 800+ features from payment time series. Curated 20-feature subset + stability score 0–100. Powers the tsfresh Cut-Risk Classifier: GradientBoostingClassifier (n_estimators=200, max_depth=4, lr=0.05, subsample=0.8) — upgraded from RandomForest. Time-ordered train/test split with 10% embargo gap prevents look-ahead bias. Returns probability + top-5 feature drivers with direction signals. Monthly auto-retraining.
tsfresh · 800+ features · stability 0–100 · GradientBoosting · monthly retrain
STAGE 6 — Income Analytics Service
income_analytics_service.py
empyrical-reloaded + ffn compute income Sharpe, income Sortino, yield-on-cost CAGR, max income drawdown, income consistency score 0–100.
empyrical · ffn · YOC CAGR
STAGE 7 — Portfolio Income Optimizer
portfolio_income_optimizer.py
PyPortfolioOpt max-yield optimization (yield replaces expected returns), HRP across dividend growth correlations, Kelly position sizing (win_prob = 1 − cut_prob).
PyPortfolioOpt · Riskfolio-Lib · HRP · Kelly
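Kelly sizing with win_prob = 1 − cut_prob reduces to a short formula. The sketch uses the standard Kelly fraction f = p − (1 − p) / b; the payoff ratio `b`, the position cap, and the function name are assumptions, since the source only specifies how the win probability is derived.

```python
def kelly_fraction(cut_prob, payoff_ratio=1.0, cap=0.25):
    """Kelly position size for a dividend holding.

    cut_prob      probability that the dividend is cut (0..1)
    payoff_ratio  gain-to-loss ratio b (assumed, not given in the source)
    cap           upper bound on any single position (assumed)
    """
    if not 0.0 <= cut_prob <= 1.0:
        raise ValueError("cut_prob must be within [0, 1]")
    if payoff_ratio <= 0:
        raise ValueError("payoff_ratio must be positive")
    win_prob = 1.0 - cut_prob
    raw = win_prob - (1.0 - win_prob) / payoff_ratio
    # Never short a position on this signal and never exceed the cap.
    return max(0.0, min(raw, cap))
```

Raw Kelly fractions are often large for low cut probabilities, which is why a cap (or a fractional-Kelly multiplier) is a common safeguard.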
STAGE 8 — Dividend Factor Analyzer
dividend_factor_analyzer.py
alphalens-reloaded back-tests HeyDividend AI Safety Score, yield screen, DGR screen as alpha factors. Returns IC / ICIR metrics proving or refuting each screen's predictive power.
alphalens · IC · ICIR · alpha factor
🔗MCP Server v3.1 — 23 Financial Intelligence Tools · 5 Capability Groups
OAuth 2.1 one-click connection · stdio (Claude Desktop) + HTTP/SSE at /api/mcp/sse · MCPGuard prompt injection + rate limiting + SHA-256 audit log
📡Group 1 — Data (7 tools)
| Tool | Description | Group | Rate limit |
| get_dividend_history | Full dividend payment history for any ticker from Azure SQL | data | 60/min |
| get_stock_price | Real-time price + yield from FMP integration | data | 60/min |
| get_dividend_calendar | Upcoming ex-dates, pay dates, declaration dates | data | 60/min |
| get_company_fundamentals | FCF, payout ratio, debt-to-equity, sector from DB views | data | 60/min |
| get_sec_filings | SEC EDGAR 10-K/10-Q/8-K structured extraction with 7-day cache | data | 30/min |
| get_earnings_transcript | FMP earnings call transcripts — dividend guidance + guidance language extraction | data | 30/min |
| get_market_data | Macro indicators, sector ETF flows, FRED interest rate data | data | 60/min |
🔍Group 2 — Screening (3 tools)
| Tool | Description | Group | Rate limit |
| screen_dividends | Filter securities by yield, DGR, payout ratio, streak, sector | screening | 60/min |
| screen_dividend_aristocrats | Filter by consecutive growth years — Aristocrats (25yr+), Kings (50yr+), Champions | screening | 60/min |
| screen_nav_safe_etfs | NAV-erosion-free ETF screening — 6 warning signals, 50 securities tracked | screening | 30/min |
📊Group 3 — Analytics (4 tools)
| Tool | Description | Group | Rate limit |
| analyze_dividend_safety | Full 7-voter ensemble safety score: VERY SAFE / SAFE / WATCH / AT RISK / DANGER | analytics | 20/min |
| compare_dividends | Side-by-side multi-ticker dividend comparison matrix with safety ratings | analytics | 60/min |
| forecast_dividend | Three-model ensemble forecast (ARIMA + Prophet + Ridge) — next 4 quarters with 80% CI | analytics | 20/min |
| optimize_portfolio | HarvestEngine max-yield portfolio optimization with HRP position sizing | analytics | 10/min |
🧠Group 4 — Intelligence (4 tools)
| Tool | Description | Group | Rate limit |
| ask_heydividend | Natural language query to HeyDividend AI's full AI pipeline. Prompt injection & manipulation detection via MCPGuard. | intelligence | 10/min · 50/hr (guarded) |
| generate_research_report | Claude Opus 4.5 deep research — IoC, earnings, sector, dividend sustainability, risk | intelligence | 5/min |
| get_heydividend_capabilities | List HeyDividend AI's 34 capabilities across 8 categories for agent discovery | intelligence | no limit |
| get_durability_graph | Composite durability score with 6 explainable sub-scores, stress scenarios, historical trends | intelligence | 20/min |
🏦Group 5 — Brokerage / Plaid Connect (5 tools)
| Tool | Description | Group | Rate limit |
| create_brokerage_link_token | Step 1: Generate Plaid Link token for OAuth brokerage connection flow | brokerage | 30/min |
| connect_brokerage_account | Step 2: Exchange public token → access token; persist to DB. CUSIP/ISIN/ticker matching. | brokerage | 30/min |
| sync_brokerage_portfolio | Step 3: Pull holdings → enrich with dividend data → upsert portfolio. Dividend-only filter. | brokerage | 10/min |
| get_brokerage_portfolio | Retrieve synced portfolio positions with HeyDividend AI safety scores and income projections | brokerage | 60/min |
| disconnect_brokerage | Revoke Plaid access token + purge portfolio data. Ownership-verified delete. | brokerage | 10/min |
Version: MCP Server v3.1 · MCP SDK v1.26.0
Transports: stdio (Claude Desktop) + HTTP/SSE at /api/mcp/sse
Auth: OAuth 2.1 one-click connection · API key for REST consumers
Security: MCPGuard — prompt injection (16 override + 7 manipulation patterns) + sliding-window rate limiter + SHA-256 audit log → dbo.mcp_audit_log
Brokerage matching: CUSIP 100% · ISIN 95% · Ticker 70% — institutional-grade security resolution
Health: /api/mcp/health · /api/mcp/tools
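The per-tool limits listed for each group (for example 60/min) fit a sliding-window limiter. The sketch below shows the general technique only; `SlidingWindowLimiter` and its injectable `clock` are illustrative and are not taken from MCPGuard.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per key within the last `window_s` seconds."""

    def __init__(self, limit, window_s=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock                    # injectable for testing
        self._hits = defaultdict(deque)       # key -> timestamps of allowed calls

    def allow(self, key):
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A tool with two limits, such as 10/min and 50/hr, can be guarded by two limiter instances that must both return True, keyed by something like (user, tool).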
🤖Claude Intelligence Layer v1.2.0 — 8 Services
ClaudeClient (Async HTTP Core)
httpx async wrapper for Anthropic API — sonnet/opus/haiku. Lazy import, fails gracefully if ANTHROPIC_API_KEY absent. No SDK dependency. Shared across all 7 other Claude services.
Sonnet 4 / Opus 4.5 · httpx · no SDK
Premium Query Router
39-signal frozenset routes complex queries to Claude Sonnet 4 with 6,000-token budget. Triggers: initiation of coverage, passive income plan, retirement portfolio, deep dive, investment thesis. OAI fallback on failure.
Sonnet 4 · 39 signals · request_handler.py
Deep Research Agent
Opus 4.5 for institutional-grade equity research. 5 report types: initiation-of-coverage, earnings deep dive, sector comparative, dividend sustainability, risk assessment. Multi-section structured output.
Opus 4.5 · POST /api/v1/claude/research/report
Training Quality Reviewer
Scores Q&A training pairs across 5 rubric dimensions. Returns verdict (pass/review/fail), dimension scores, improvement notes. Batch up to 50 concurrent. Raises overall fine-tune dataset quality.
Sonnet 4 · POST /api/v1/claude/training/review · batch 50
Self-Improvement Engine
Queries heydividend_query_log + heydividend_feedback DB tables. Outputs SelfImprovementReport: persona health score 0–100, knowledge gaps, prompt refinements, training recommendations. Scheduled analysis.
Sonnet 4 · POST /api/v1/claude/improve/analyze
Safety Ensemble Vote
Claude Sonnet 4 as the primary weighted voter (26%) in the 8-model dividend safety ensemble. Logical consistency check + hallucination detection + quality anchoring. Dynamic weight redistribution on model failure.
Sonnet 4 · 26% weight · POST /api/v1/claude/ensemble/safety-score
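Dynamic weight redistribution on model failure can be as simple as renormalizing the surviving voters. Proportional renormalization is an assumption here (the source does not say how the freed weight is shared), and the function names are hypothetical.

```python
def redistribute_weights(weights, failed):
    """Drop failed voters and rescale the rest so the weights sum to 1."""
    alive = {name: w for name, w in weights.items() if name not in failed}
    total = sum(alive.values())
    if total <= 0:
        raise ValueError("no healthy voters left")
    return {name: w / total for name, w in alive.items()}


def ensemble_score(scores, weights):
    """Weighted safety score using only the voters that returned a score."""
    failed = set(weights) - set(scores)
    live = redistribute_weights(weights, failed)
    return sum(live[name] * scores[name] for name in live)
```

Because each survivor keeps its share relative to the others, a failed voter changes the scale of the weights but not the ranking of the remaining models.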
Financial Formula Engine
43+ deterministic formulas across 8 categories with Claude explanations and Excel syntax. Prevents hallucination on quantitative calculations. Gordon Growth Model, Yield on Cost, AFFO Payout, DCF and more.
Sonnet 4 · 43 formulas · GET /api/excel/formulas/list
ClaudeDeepResearch (Direct Methods)
Three callable research primitives: generate_initiation_report() — full IoC with valuation + dividend thesis; generate_dividend_deep_dive() — payout safety + growth trajectory + BUY/HOLD/WATCH/AVOID verdict; generate_sector_comparison() — peer ranking with dividend yield matrix.
Opus 4.5 · 3 research primitives · POST /api/v1/claude/research/report
Status endpoint: GET /api/v1/claude/status
Client: httpx async (no Anthropic SDK) — lazy import, fails gracefully if ANTHROPIC_API_KEY absent
Models: claude-sonnet-4-20250514 (default) + claude-opus-4-20250514 (deep research only)
𝕏X Real-Time Signal Service — v2.0 (X API 2025)
grok_x_dividend_service.py
pay-per-use
XMCP-ready
grok-3-fast
MCP tool #19
X is the most real-time data platform on earth. HeyDividend AI now queries it via Grok x_search for any ticker — not just monitored ETF accounts. Pay-per-use billing (X API 2025) means no monthly credit exhaustion. XMCP Server hook will upgrade to native X MCP context when configured.
Signal Types
📣 Dividend announcements
⚠️ Cut / suspension warnings
📊 Earnings reactions
🏦 Analyst calls
👤 Insider activity
🔴 Breaking news
Coverage
Any public ticker
10 monitored ETF accounts
All public X posts
Image understanding
1–30 day lookback
Sentiment classification
New Endpoints
GET /api/x/signals/{ticker}
GET /api/x/signals/{ticker}/dividend
GET /api/x/status
GET /api/x/xmcp/status
MCP Tool #19
get_x_dividend_signals
Args: ticker, days_back
Returns: signals[], sentiment_summary, top_signal
Claude/Cursor can now ask:
"What is X saying about $T?"
Grok Responses API /v1/responses
x_search tool
no monthly limits
XMCP fallback active
env: X_XMCP_SERVER_URL
🛡️Accuracy & Trust Layer
Inline Source Citations (Task #82)
CitationRegistry accumulates per-metric citations during every request. Unicode superscripts (¹²³…) are injected onto metric values in both the resolved-profile context block and cross-API verified metrics. System prompt instructs the LLM to preserve them verbatim and label model-estimated values [est.]. displaySourceAttribution tool call is enriched with per-provider sources, overall confidence, and a full citations array for downstream audit.
audit-grade · app/services/citation_registry.py
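The superscript injection can be illustrated with a small registry. This is a guess at the shape of the idea, not the contents of app/services/citation_registry.py: the method names `cite`, `annotate`, and `citations` are hypothetical.

```python
SUPERSCRIPT_DIGITS = "⁰¹²³⁴⁵⁶⁷⁸⁹"


def superscript(n):
    """Render a positive integer with Unicode superscript digits."""
    return "".join(SUPERSCRIPT_DIGITS[int(d)] for d in str(n))


class CitationRegistry:
    """Accumulates sources for one request and numbers them on first use."""

    def __init__(self):
        self._sources = []

    def cite(self, source):
        if source not in self._sources:
            self._sources.append(source)
        return superscript(self._sources.index(source) + 1)

    def annotate(self, value, source):
        """Attach the source's superscript marker to a rendered metric."""
        return f"{value}{self.cite(source)}"

    def citations(self):
        """Numbered source list for downstream audit."""
        return [{"index": i + 1, "source": s} for i, s in enumerate(self._sources)]
```

Reusing the same number when a source is cited again keeps the citations array short and lets one footnote cover several metrics.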
Claim Verification Gate
Extracts financial claims from LLM output and verifies them against the cross-API consensus before the answer is returned. Hard gate at the end of the Phase-5 ensemble assembly. Drops or flags any value that disagrees with the median consensus.
hard gate · claim_verification_gate.py
SHAP Explainability Layer
Audit-ready XAI for dividend safety verdicts. Domain-linear Shapley decomposition for LLM voters + shap.TreeExplainer for sklearn ML models. Returns per-feature contributions that add up to the final safety score.
XAI · shap.TreeExplainer
Query Distillation Layer
Zero-latency rule-based classifier intercepts high-frequency simple queries (single-ticker price, last dividend, basic yield) and routes them to a cost-effective LLM before the full Phase-5 pipeline runs. Cuts cost on the long tail of trivial queries.
zero-latency · cost optimizer
Skill Loader Service
Injects up to two relevant financial methodology blocks into the system prompt based on the user's query. 68 financial skills across 11 categories live under skills/. House style (frontmatter + H2 sections) documented in skills/README.md. New skills authored via vendored Agent Skill Open Standard kit at tools/agent-skill-creator/.
68 skills · 11 categories · TRIGGER_MAP
Memory & Session Layer + Fix-It Agent
Conversation memory, block-memory entities, user profiles — injected into the system prompt and persisted. Fix-It Agent scans conversation logs for failed responses and regenerates answers offline.
block memory · offline regen
Multi-Source Verifier (5 APIs)
multi_source_verifier.py fires 5 parallel API calls (FMP · Finnhub · EODHD · Alpha Vantage · HeyDividend AI DB) on every ticker preflight. Median consensus for price, yield, annual dividend, payout ratio, and last dividend is injected into the LLM system prompt before any response is generated. 5-minute TTL cache per ticker.
5 sources · median consensus · HIGH/MED/LOW confidence
Finance Benchmark Harness
Runs FinanceBench and FinQA across all ensemble voters and derives router weights from measured accuracy + latency. Weights are not hand-tuned — they are earned.
FinanceBench + FinQA · data-driven weights
📊Reports, Charts & Token Accounting
Slide Deck Generator
6-slide institutional PDF reports via ReportLab: holdings overview, dividend metrics, 3-year income projection, risk heatmap, Claude AI recommendation. Served from /static/reports/{user_id}/ with 24h auto-cleanup.
POST /api/v1/reports/slide-deck · ReportLab
Mind Map Generator
Nested JSON tree for D3.js / React Flow rendering. Topics: strategy (income/growth/safety/international pillars), sectors (GICS grouping), portfolio (per-ticker metric breakdown). Enriched with block-memory entities.
POST /api/v1/reports/mind-map · D3 / React Flow
Forecast Fan Chart — 4-View
Three-scenario (bull / base / bear) charts for price, dividend, yield, and NAV rendered with matplotlib. One endpoint, four views per symbol.
GET /api/v1/charts/forecast-fan/{symbol} · matplotlib
HarvestEngine Platform
Institutional-grade dividend strategy backtesting and portfolio optimization — Aristocrat rotation, max-yield with HRP sizing, sector rebalance, drawdown control.
/api/harvest/* · HRP · backtest
Token Usage & Cost Tracking
Real-time per-provider LLM cost tracking with real token capture, user identification, and spending alerts. Daily aggregates persisted to dbo.llm_token_usage. REST: GET /api/v1/tokens/today · /summary · /historical · /users · /alerts · /alert-status · /pricing + POST /api/v1/tokens/flush.
FinOps · dbo.llm_token_usage
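The per-provider accounting described above could be sketched like this. The provider names, the per-million-token prices, the alert threshold, and the `TokenLedger` class are placeholders invented for the example; persistence to dbo.llm_token_usage is left out.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical prices in USD per 1M tokens; real rates come from each provider.
PRICING = {
    "provider_a": {"input": 3.00, "output": 15.00},
    "provider_b": {"input": 0.50, "output": 1.50},
}


@dataclass
class TokenLedger:
    daily_alert_usd: float = 50.0  # assumed spending-alert threshold
    totals: dict = field(default_factory=lambda: defaultdict(float))

    def record(self, provider, user_id, input_tokens, output_tokens):
        """Add one call's cost to the running total and return that cost in USD."""
        price = PRICING[provider]
        cost = (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000
        self.totals[(provider, user_id)] += cost
        return cost

    def by_provider(self):
        """Aggregate the day's spend per provider."""
        out = defaultdict(float)
        for (provider, _user), cost in self.totals.items():
            out[provider] += cost
        return dict(out)

    def over_budget(self):
        """True once total spend reaches the alert threshold."""
        return sum(self.totals.values()) >= self.daily_alert_usd
```

Keying the totals by (provider, user) lets the same ledger answer both the per-provider and per-user questions that the endpoints above expose.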
International Dividend Intelligence
Analysis across 9 international markets with withholding-tax adjustment and FX conversion (Frankfurter / ExchangeRate-API). After-tax yield, treaty rate awareness, base-currency normalization.
9 markets · withholding · FX
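The withholding-tax adjustment reduces to a short calculation, sketched below. The function names and the example rate are illustrative; real treaty rates and FX quotes would come from the service's own data.

```python
def after_tax_yield(annual_dividend, price, withholding_rate):
    """Dividend yield after foreign withholding tax.

    annual_dividend and price must be in the same (local) currency, so the
    ratio itself needs no FX conversion. withholding_rate is a fraction,
    e.g. 0.15 for a 15% rate (example value, not a statement about any treaty).
    """
    if price <= 0:
        raise ValueError("price must be positive")
    gross_yield = annual_dividend / price
    return gross_yield * (1.0 - withholding_rate)


def to_base_currency(amount_local, fx_rate):
    """Convert a local-currency cash amount using base-per-local fx_rate."""
    return amount_local * fx_rate


def after_tax_income_base(annual_dividend, shares, withholding_rate, fx_rate):
    """Net annual dividend income for a position, expressed in the base currency."""
    net_local = annual_dividend * shares * (1.0 - withholding_rate)
    return to_base_currency(net_local, fx_rate)
```

Yield is a ratio and so is currency-independent; the FX rate only matters once cash amounts are compared across markets.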
ChatGPT GPT Actions Integration
ChatGPT-ready OpenAPI 3.1 spec with 9 Actions covering price, dividend safety, dividend history, fundamentals, screener, neural prediction, scenarios, FMP history, chat.
9 Actions · OpenAPI 3.1
Notebook Dashboard Spec
Comprehensive frontend integration spec at docs/NOTEBOOK_DASHBOARD_FRONTEND_SPEC.md — covers Tasks #58–#61 endpoints, block-chain schema, SmartSheet data contract, tool-call UI component contracts, tier gating, WebSocket protocol.
frontend contract · tasks #58–#61
🗂️ API Routes (112 files)
advanced_analytics · advisor · agent_tool · ai_sdk · alert_push · claude_intelligence · code_interpreter · comprehensive_training · curated_list · dashboard · database_ml · deep_research · deepseek_training · dividend_aristocrats · dividend_intelligence · dividend_lists · dividend_neural · dividend_pipeline · document_learning · education_training · external_ml_api · file_processing · finetuning · fingpt · finrobot · fmp · fmp_training · general_investment · hallucination_prevention · harvest · harvest_training · hashtag · investment_strategy · market_intelligence · mcp · ml_prediction · moat · multi_source_training · notebook · perplexity · perplexity_training · quantlibs_training · recommendation · recommendation_training · researcher_agent · rlm · roundtable · security_comparison · semantic · sentiment · social_media · strategic · streaming_chat · trading_intelligence · trading_strategies_training · training_management · tts · ultimate_pack · unified_data_lake · user_profile · video · video_training · x_dividend · x_dividend_training
🌐 Network & Ports
8001: HeyDividend AI Main API (public)
9001: ML API (public)
9000: Internal API (localhost)
8000: Shadow / restarting
443: Nginx → llm.heydividend.ai (llm.theheydividend.ai legacy alias)
🗄️ Database
Engine: Azure SQL Server
Server: hey-dividend-sql-server.database.windows.net
Database: HeyDividend-Main-DB
Driver: pymssql (native FreeTDS)
Views: vSecurities, vDividendsEnhanced, vSchedules, vSignals, vPredictions
unified_dividends: 787,787 rows · canonical primary source
Data guard: CHK_unified_dividends_amount_sanity · amount > 0 AND amount ≤ $500 · 0 bad rows
Guard layers: ① code batch-uniformity check ② SQL CHECK constraint ③ moat cleanup endpoint
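A sketch of what the code-level layer might check before rows reach the SQL constraint. The amount bounds mirror the CHECK constraint above. The meaning of "batch-uniformity" is not spelled out, so the interpretation here (a batch in which many different tickers share one identical amount is treated as a corrupt feed) is an assumption, as are the row shape and the `min_rows_for_uniformity` parameter.

```python
MAX_AMOUNT = 500.0  # mirrors the SQL CHECK: amount > 0 AND amount <= 500


def is_sane_amount(amount):
    """True if amount is a real number inside the allowed range."""
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        return False
    return 0 < amount <= MAX_AMOUNT


def validate_batch(rows, min_rows_for_uniformity=5):
    """Split rows into (clean, rejected) and refuse suspiciously uniform batches.

    rows is a list of dicts with "ticker" and "amount" keys (assumed shape).
    """
    clean, rejected = [], []
    for row in rows:
        (clean if is_sane_amount(row.get("amount")) else rejected).append(row)

    # Assumed uniformity rule: several tickers all paying the exact same
    # amount in one batch is more likely a parsing bug than real data.
    amounts = {row["amount"] for row in clean}
    tickers = {row["ticker"] for row in clean}
    if len(clean) >= min_rows_for_uniformity and len(amounts) == 1 and len(tickers) > 1:
        raise ValueError("uniform batch rejected: every row has the same amount")
    return clean, rejected
```

Rejecting in code first gives a readable error per batch, while the database constraint remains the backstop for anything that slips through.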
⚙️ Architecture Enhancement Layer
Model Telemetry System
active
Cost-Aware Router (4-tier)
active
ML-Based Intent Classifier
active
Shared Cache (LRU in-mem)
active
Async DB Pool (ThreadPoolExecutor)
active
Parallel Source Fan-Out
active
RAG Retrieval Reranking
active
Circuit Breaker (ML API)
active
ML Health Monitor (30s interval)
active
🔑 Active External Integrations
Azure OpenAI
Endpoint: htmltojson-parser-openai-a1a8.openai.azure.com
Deployment: HarveyGPT-5
Status: ✓ active
xAI Grok (X API 2025)
Purpose: X.com monitoring, real-time social sentiment, ensemble voter
Model: grok-3-fast · /v1/responses · x_search tool
Billing: ✓ pay-per-use · no monthly credit exhaustion
Status: ✓ active · MCP tool #19 get_x_dividend_signals
Google Gemini 2.5 Pro
Purpose: Ensemble (31%) + market intel
Status: ✓ active
Anthropic Claude
Models: Sonnet 4 + Opus 4.5
Purpose: Ensemble + deep research
Status: ✓ active (httpx, no SDK)
Perplexity Sonar
Purpose: Ensemble (21%) + research training
Status: ✓ active
ElevenLabs TTS
Purpose: Audio generation for research
Key: ELEVENLABS_API_KEY ✓
Status: ✓ configured
FMP (Financial Modeling Prep)
Endpoints: 80+ API endpoints
Status: ✓ active
Helicone LLM Observability
Purpose: Token tracking, latency, cost
Status: ✓ active (proxy layer)
🔁 GitHub Actions · ML Training Pipeline
Workflow: train-ml-models.yml · daily 2 AM UTC + manual dispatch
Models: 6 total (Dividend Growth, Cut Predictor, Anomaly Detection, ESG Scorer, Payout Rating, Portfolio Optimization)
No-DB models (always train): ① Growth Forecaster ⑤ Payout Rating ⑥ Portfolio Optimizer · use synthetic data when DB unreachable
DB-dependent models: ② Cut Predictor ③ Anomaly Detection ④ ESG Scorer · train when Azure SQL reachable
Upload: Azure Blob Storage → ml-models container · versioned archive per run
Status: ✓ Fixed · exit code 2 (pip cache / MSSQL install) and TRAINED=0 (wrong args) resolved
🩺 VM Cron Health · LLM VM (20.81.210.213)
vm_training_cron.sh: ✓ execute permission fixed (was chmod -x, silent 30-min fail)
training_health_check.py: ✓ deployed · runs every 6h via cron, validates all training tables + unified_dividends integrity
Health checks: Row counts per training table · unified_dividends 787k+ · amount sanity · zero / over-$500 scan
📋 Log Health (Current Session)
Dominant error source: Grok 429 storm resolved · migrated to X API 2025 pay-per-use · ensemble weights restored
Residual errors: DB timeouts (transient), yfinance fallback, config misses
Log file size: ~205 MB (accumulated; rotate monthly)
Live counts: See Live Logs tab → ERROR filter for real-time totals
Accuracy log: Batch INSERT active · 1 background thread per verify_all_claims() call (~200ms vs ~4.2s sequential)
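The batching idea in the last entry can be sketched as below: collect every accuracy row produced by one verification pass, then hand the whole list to a single background thread that performs one bulk write. The function name `log_accuracy_batch` and the `insert_many` callable are hypothetical; in practice `insert_many` would wrap something like a DB-API `cursor.executemany(...)` followed by a commit.

```python
import threading


def log_accuracy_batch(rows, insert_many):
    """Persist all rows with one bulk call on a single background thread.

    rows        : iterable of tuples, one per verified claim (assumed shape)
    insert_many : caller-supplied function that writes a list of rows in one
                  round trip (hypothetical; e.g. a wrapper around executemany)
    Returns the started thread, or None when there is nothing to write.
    """
    batch = list(rows)  # snapshot, so the caller may keep mutating its own list
    if not batch:
        return None
    worker = threading.Thread(target=insert_many, args=(batch,), daemon=True)
    worker.start()
    return worker
```

Compared with starting one thread and one INSERT per claim, this keeps the request path free of database latency and cuts the number of round trips to one per call. The caller can `join()` the returned thread in tests or at shutdown to make sure the write completed.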