HeyDividend AI — Engine Insights

Ensemble Voters

Training Agents

283

Service Modules

112

API Route Files

787k+

unified_dividends Rows

112

Scheduled Jobs

19.1k

Log Events (24h)

ML Models Loaded

—

Response Accuracy

—

Queries (24h)

—

Fix-It Pending

—

Auto-Fix Rate

—

Avg Confidence

Verification Sources

🟢

Grok / X API — Pay-Per-Use Active — Migrated to X API 2025 pay-per-use billing. Monthly credit exhaustion no longer possible. Ensemble weights restored. grok_x_dividend_service.py upgraded to v2.0: search_ticker_signals() for any ticker · XMCP Server hook ready · New MCP tool get_x_dividend_signals added (tool #19). Model: grok-3-fast

🟢

ML API Healthy — heydividend-unified-ml v4.0.0 running on port 9001 with 7 models loaded. HeyDividend AI backend on port 8001 active. Claude Intelligence Layer v1.2.0 — 8 services registered.

🟢

Accuracy & Performance Upgrades Active — Accuracy log batch INSERT (1 background thread vs 24 sequential), DividendScreener stale-connection retry (pool recycle on OperationalError), Two-layer LLM Response Cache deployed.

🟢

DB Integrity Guard — 3-Layer Protection Active — CHK_unified_dividends_amount_sanity SQL CHECK constraint deployed (amount > 0 AND amount ≤ $500). Code-level batch-uniformity guard + moat cleanup endpoint. 787,787 clean rows · 0 over $500 · 0 zeroes.

🟢

VM Cron Health Monitoring Active — vm_training_cron.sh permissions fixed. training_health_check.py deployed on LLM VM — runs every 6h, validates all training tables + unified_dividends integrity. GitHub Actions ML training pipeline fixed: 6-model pipeline runs daily 2 AM UTC; models 1, 5 & 6 train on synthetic data when DB unreachable.

🟢

Cross-API Verification Layer Active — multi_source_verifier.py fires 5 parallel API calls (FMP · Finnhub · EODHD · Alpha Vantage · HeyDividend AI DB) on every ticker preflight. Median consensus for price, yield, annual dividend, payout ratio & last dividend is injected into the LLM system prompt before any response is generated. Confidence ratings: HIGH✅ ≥3 sources agree · MEDIUM⚡ 2 sources · LOW⚠️ single source. 5-minute TTL cache per ticker.

🟢

Second Brain Deployed — Knowledge Graph + Weekly Review — Entity knowledge graph (heydividend_memory_edges) adds backlinks between memories; retrieve_connected() expands 1 hop along edges to surface linked prior thinking, not just frequency hits. Weekly Claude Sonnet 4 knowledge review, graph-driven Map of Content, and explicit memory tiering (fleeting → literature → permanent). New endpoints GET /api/v1/memory/graph, /review/latest, POST /review/run live on the LLM VM.

🖥️System Vitals

Host: Azure VM — 20.81.210.213

OS: Ubuntu (miniconda3 llm env)

Uptime: ~24h (refreshed via deploy)

Load avg: 0.75 / 0.79 / 0.81

RAM: 2.8 GB used / 27 GB total

Disk: 148 GB / 248 GB (60%)

RAM usage: 10.4%

Disk usage: 60%

⚡Active Processes

Process	Port	CPU	MEM	Status
HeyDividend AI Main API	8001	7.3%	5.3%	LIVE
HeyDividend AI ML API	9001	0.9%	1.2%	LIVE
Internal API (9000)	9000	0.0%	1.2%	IDLE
Gemini Scheduler	—	0.0%	0.0%	RUNNING
Service Watchdog	—	0.0%	0.0%	ACTIVE
Port 8000 (shadow)	8000	189%	0.6%	RESTARTING

📦Disk Breakdown

ml_training/: 966 MB

logs/: 205 MB

app/: 24 MB

⚖️Dividend Safety Ensemble — 7 Weighted Voters

🟣 Google Gemini 2.5 Pro

Sustainability + fundamentals analyst

31%

active · primary voter · REST API

Evaluates FCF coverage, payout sustainability, dividend growth trajectory

🔵 Perplexity Sonar

Real-time data + news disclosures

21%

active · sonar-pro · web-grounded

Grounds analysis in recent earnings, news, SEC filings, and analyst reports

🟢 HeyDividend AI GPT-5

Azure OpenAI — complex reasoning

17%

active · HarveyGPT-5 · Azure hosted

Payout coverage analysis, DCF reasoning, complex multi-step financial logic

🔶 DeepSeek-R1

Quantitative — DCF & FCF math

13%

active · quantitative · chain-of-thought

Precise DCF, FCF payout calculations and quantitative stress-testing

🌐 GAN Scenarios

Dividend Neural Engine — stress testing

active · CNN Discriminator · synthetic futures

Generates 100 synthetic dividend sequences, cut_probability drives this vote

📰 BERT News Sentiment

FinBERT — positive news signal

active · FinBERT · sentiment layer

Scores news sentiment; positive coverage boosts safety score

🎭 Claude Sonnet 4

Anthropic — verification gate

active · Opus 4.5 for research · hallucination check

Logical consistency check, hallucination detection, claim verification

🔌Other Active Models (Outside Ensemble)

⚡ Grok (grok-3-fast)

xAI — real-time X.com monitoring via X API 2025

PAY-PER-USE

Migrated to X API 2025 pay-per-use billing — no monthly credit exhaustion. Ensemble weights restored. Powers x_search on /v1/responses, MCP tool #19 get_x_dividend_signals, and the X Dividend Training Agent.

⚡ Groq (Llama 3.3 70B)

Ultra-fast inference (~300ms)

active

Used for rapid lightweight queries. Async streaming enabled.

🏦 FinGPT

Finance-specialized open-source LLM

specialized

Domain-specific fine-tuned on financial corpus. Used for technical analysis queries.

📊 FinRobot

Multi-agent financial reasoning

specialized

Institutional-grade multi-agent financial analysis framework.

🔍 Claude Deep Research

Opus 4.5 — institutional reports

premium

5 report types: IoC, earnings, sector, dividend sustainability, risk. Opus 4.5 only.

📐 Financial Formula Engine

43 deterministic formulas + Claude explanations

active

Deterministic math layer prevents AI hallucination on quantitative calculations.

🔄Core Training Agents — Continuous

Deep Research Training Agent

Every 1 hour · 500 examples/hr · running

Investment Researcher Agent

Every 30 min · 100 questions · running

Perplexity Research Training

Every 1 hour · 30 Q&A pairs · running

HarvestEngine Training Agent

Every 2 hours · 160 pairs/batch · running

Investor Roundtable Training

Every 2 hours · ~9,600 pairs · running

FMP Comprehensive Training

Every 2 hours · 60 pairs (720/day) · running

Market Intelligence Training

Every 2 hours · 30 pairs · running

DeepSeek Quantitative Training

Every 2 hours · 20 pairs · running

HeyDividend AI Advisor Platform Training

Every 3 hours · 120 pairs/batch · running

NAV Avoidance Training Agent

Every 2 hours · 15 pairs (180/day) · running

X Dividend Training Agent

Every 2 hours · 30 questions · running (pay-per-use)

Video Training Agent (@heydividedtv)

Every 6 hours · sync YouTube · running

Data Enrichment Service

Every 6 hours · AFFO, REIT, streaks · running

HeyDividend AI Score Pre-computation

Daily @ 2 AM · Top 500 tickers · scheduled

Dividend Frequency Fallback Sync

Every 1 hour · DB sync · running

HeyDividend AI v4.0 Standalone Training Agent

Every 2 hours · 15 variations/run · ~180/day · running

Auto Fine-Tuning Submission (Scheduler)

Sunday 1 AM UTC · poll 6h · gpt-4o-mini-2024-07-18 · scheduled

DIVIDEND_SCREENER Training v2 (86x fix)

Every 2 hours · 20 scenarios · 36 new templates · v2 — SCREENER_LIST fix

Chat History Conversation Training

On conversation end · DB-persisted turns · running

📈Trading Intelligence — 13 Specialist Agents (Every 2 Hours)

Investment Thesis Generator

20 pairs/batch • 240/day

active

Technical Analysis Agent

15 pairs/batch • 180/day

active

Sentiment Aggregator

15 pairs/batch • 180/day

active

Crypto Market Agent

15 pairs/batch • 180/day

active

ETF Composition Analyzer

15 pairs/batch • 180/day

active

Volatility Risk Scorer

15 pairs/batch • 180/day

active

Macro Regime Detector

15 pairs/batch • 180/day

active

Earnings Surprise Predictor

15 pairs/batch • 180/day

active

Rotation Strategy Agent

15 pairs/batch • 180/day

active

Multi-Agent Trading Desk

15 pairs/batch • 180/day

active

Options Strategy Analyzer

15 pairs/batch • 180/day

active

Fund Flow Analyzer

15 pairs/batch • 180/day

active

Crypto Correlation Monitor

15 pairs/batch • 180/day

active

⏱️Cron Schedule (VM)

Schedule	Job	Log
*/30 * * * *	vm_training_cron.sh — comprehensive training batch	/var/log/heydividend/comprehensive_training.log
15 6,18 * * *	Dividend Intelligence 8-model training	dividend_intelligence_cron.log
0 4,16 * * *	Investor Profile Trainer	investor_profile_training.log
0 5,17 * * *	Portfolio Blueprint Trainer (15 questions)	portfolio_blueprint_training.log
0 1 * * 0	Auto Fine-Tuning Submission (Sunday 1 AM UTC) + 6h status polls	scheduler_service.py (in-process)
0 2 * * *	HeyDividend AI Sanity Check v3 (email alert)	sanity_cron.log
0 * * * *	PII Monitor — production log scan	pii_monitor.log

📉Projected Training Volume

Agent	Rate	Per Day	Per Week
Deep Research Training	500/hr	12,000	84,000
Investment Researcher	100/30min	4,800	33,600
Investor Roundtable	~9,600/2hr	4,800	33,600
FMP Comprehensive	60/2hr	720	5,040
Investment Strategy	15/2hr	180	1,260
13 Trading Agents	15/2hr each	2,340	16,380
DI 8-Model (daily incr.)	20/model×8	160	960
HeyDividend AI v4.0 Training Agent	15/2hr	180	1,260
Total		~25,180	~176,100

🧬Dividend Neural Engine — 7 Modules (Boris Banushev Framework)

🌊 Fourier Denoiser

FFT-based denoising of dividend history. Detects special dividends as 2σ outliers. Isolates true trend vs noise.

active 215 lines
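A minimal sketch of the idea behind the denoiser, not the module itself: low-pass the payment series in the frequency domain, then flag payments that sit more than 2σ above the smoothed trend. A naive DFT stands in for an FFT library call so the example has no dependencies; the function names, the `keep` cutoff and the one-sided test are all assumptions.

```python
import cmath
from statistics import pstdev


def lowpass_trend(series, keep=2):
    """Smooth a series by keeping only the `keep` lowest DFT frequencies.

    Uses a naive O(n^2) DFT so the sketch needs no third-party packages.
    """
    n = len(series)
    spectrum = [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(series))
        for k in range(n)
    ]
    # Zero every bin whose frequency index (counted from either end) exceeds `keep`.
    kept = [c if min(k, n - k) <= keep else 0 for k, c in enumerate(spectrum)]
    return [
        (sum(c * cmath.exp(2j * cmath.pi * k * t / n) for k, c in enumerate(kept)) / n).real
        for t in range(n)
    ]


def special_dividends(series, keep=2, n_sigma=2.0):
    """Indices whose payment is more than n_sigma residual std devs above trend."""
    trend = lowpass_trend(series, keep)
    residuals = [x - t for x, t in zip(series, trend)]
    sigma = pstdev(residuals)
    if sigma < 1e-9:  # flat series: nothing but floating-point noise
        return []
    return [i for i, r in enumerate(residuals) if r > n_sigma * sigma]
```

For example, sixteen payments of 0.50 with a single 2.00 payment at position 7 would have only that position flagged.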

📊 ARIMA Feature

Auto-selects best (p,d,q) via AIC. Forecasts next dividend as ML input feature. Handles short series gracefully.

active 212 lines

🔲 Stacked Autoencoder

PyTorch 3-layer encoder/decoder. Compresses 12 dividend features → 16-dim latent vector for LSTM input.

PyTorch 473 lines

🔮 LSTM Predictor

2-layer LSTM + attention. MC Dropout (50 passes) for confidence intervals. Predicts next dividend amount.

PyTorch 445 lines

🎲 GAN Engine

LSTM Generator + CNN Discriminator. Generates 100 synthetic dividend futures. cut_probability → 8% ensemble vote.

ensemble voter 432 lines

🗺️ SOM Anomaly Detector

10×10 Self-Organizing Map. Detects unusual payout/FCF trajectories. Returns similar tickers by BMU proximity.

minisom 487 lines

🔗 Eigen / Contagion

PCA on dividend growth matrix. Dividend contagion risk: if sector peer cuts, who follows? 24h TTL cache.

sklearn PCA 256 lines

🏭HeyDividend AI Unified ML API — Port 9001

Status: ✓ healthy

Service: heydividend-unified-ml

Version: 4.0.0

Models loaded: 7

Host: 0.0.0.0:9001 (public) + nginx proxy

Endpoints: /score/symbol · /predict/yield · /predict/cut-risk · /predict/payout-rating · /health

CPU usage: 0.9%

Memory: 1.2% of 27 GB ≈ 330 MB

⚖️Ensemble Weight Distribution

Claude Sonnet 4 (Verification): 26%

Gemini 2.5 Pro: 22%

Perplexity Sonar: 18%

HeyDividend AI GPT-5 (Azure OAI): 13%

DeepSeek-R1: 10%

GAN Scenarios (Neural): 6%

Market Signals (CDS + IV): 3%

BERT News Sentiment: 2%
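One way the weighted vote and the "weight redistribution on failure" behaviour could work is sketched below. The weights are copied from the distribution above. The voter keys, the 0–100 score scale and the proportional renormalization rule are assumptions, not the ensemble's confirmed logic.

```python
# Weights copied from the distribution listed above (percent).
WEIGHTS = {
    "claude_sonnet_4": 26,
    "gemini_2_5_pro": 22,
    "perplexity_sonar": 18,
    "gpt_5_azure": 13,
    "deepseek_r1": 10,
    "gan_scenarios": 6,
    "market_signals": 3,
    "bert_news_sentiment": 2,
}


def ensemble_score(votes, weights=WEIGHTS):
    """Weighted average of voter scores (0-100), skipping failed voters.

    `votes` maps voter name -> score, or None when that voter failed.
    Assumption: a failed voter's weight is spread proportionally over
    the voters that did respond, i.e. the live weights are renormalized.
    """
    live = {name: score for name, score in votes.items() if score is not None}
    total = sum(weights[name] for name in live)
    if total == 0:
        raise ValueError("no voters responded")
    return sum(weights[name] * score for name, score in live.items()) / total
```

Renormalizing keeps the final score on the same scale no matter how many voters drop out, which is why it is used here instead of treating a missing vote as zero.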

🛡️ML-Powered Moat Features

FeatureExtractionService

Extracts 12 proprietary dividend features from DB + market data for ML input pipeline.

active

DividendQualityScorer

ML scoring of dividend quality across 6 dimensions. sklearn-based (offline on LLM VM).

sklearn missing (dev)

CutRiskAnalyzer

XGBoost dividend cut risk model. Rule-based fallback active when model unavailable.

xgboost missing (dev)

HeyDividend AI Score Service

Pre-computed scores for top 500 tickers daily at 2 AM. Composite dividend health score.

active

NAV Avoidance ML Screen

6 warning signals, erosion trend, distribution trap detection across 50+ securities.

active

Claim Verification Gate

Mandatory fact-check before AI response delivery. Cross-references ground truth tables. Upgraded with apply_multi_source_consensus() — cross-checks LLM output against median consensus, upgrades to VERIFIED within tolerance, marks CONFLICT on deviation.

active

🔬Cross-API Verification Layer — v1.0

multi_source_verifier.py · active · 804 lines · 5-min TTL cache

Fires 5 parallel API calls on every ticker preflight. Computes median consensus across all responding sources. Injects a VerificationResult.context_block() into the LLM system prompt before any response is generated — ensuring HeyDividend AI always reasons from ground-truth consensus rather than stale or hallucinated figures.

Data Sources

FMP (Financial Modeling Prep)
Finnhub
EODHD
Alpha Vantage
HeyDividend AI DB (unified_dividends)

Verified Metrics

📈 Price
💰 Dividend Yield
📅 Annual Dividend
⚖️ Payout Ratio
🕐 Last Dividend

Discrepancy Thresholds

Price:         2%
Dividend:     5%
Yield:        15%
Payout ratio: 20%

Confidence Levels

HIGH ✅ ≥3 sources agree
MEDIUM ⚡ 2 sources agree
LOW ⚠️ single source

Conflicts flagged & surfaced to LLM

Pipeline flow: ticker preflight → parallel fetch (asyncio.gather) → median consensus → context_block() → LLM system prompt injection → response generation → apply_multi_source_consensus() post-check → VERIFIED / CONFLICT label appended
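The fetch and consensus steps of the flow above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the contents of multi_source_verifier.py: `fake_source` is a hypothetical stand-in for a provider call, the threshold keys are invented names, and a source is taken to "agree" when it lies within the metric's discrepancy threshold of the median.

```python
import asyncio
from statistics import median

# Discrepancy thresholds from the dashboard (fraction of the median).
THRESHOLDS = {"price": 0.02, "dividend": 0.05, "yield": 0.15, "payout_ratio": 0.20}


async def fake_source(name, value, delay=0.0):
    """Hypothetical stand-in for one provider call (FMP, Finnhub, ...)."""
    await asyncio.sleep(delay)
    return name, value


def consensus(metric, readings):
    """Median consensus, confidence label and conflicting sources for one metric.

    Assumption: a source "agrees" when it is within the metric's
    threshold of the median of all responding sources.
    """
    present = {s: v for s, v in readings.items() if v is not None}
    if not present:
        return None, "NONE", []
    mid = median(present.values())
    tol = THRESHOLDS[metric] * abs(mid)
    agreeing = [s for s, v in present.items() if abs(v - mid) <= tol]
    conflicts = [s for s, v in present.items() if abs(v - mid) > tol]
    if len(agreeing) >= 3:
        label = "HIGH"
    elif len(agreeing) == 2:
        label = "MEDIUM"
    else:
        label = "LOW"
    return mid, label, conflicts


async def verify_price(quotes):
    # Fire all provider calls in parallel; sources that raise are dropped.
    results = await asyncio.gather(
        *(fake_source(n, v) for n, v in quotes.items()), return_exceptions=True
    )
    readings = {r[0]: r[1] for r in results if not isinstance(r, Exception)}
    return consensus("price", readings)
```

Using the median rather than the mean means a single bad feed cannot drag the consensus value, and that feed then shows up in the conflict list instead.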

📐Quantium Library Stack — 20+ Libraries (Dividend Intelligence Pipeline)

edgartools

SEC EDGAR access — 8-K dividend declarations, 10-K XBRL cash flows, Item 5 policy text extraction

Stage 1

exchange-calendars

NYSE/NASDAQ/LSE trading day validation for ex-date confirmation and growth streak counting

Stage 2

pandas-market-calendars

Market-aware date arithmetic — fiscal year boundary detection, pay-gap widening (liquidity stress signal)

Stage 2

arch (GARCH)

GARCH(1,1) volatility model on dividend yield series. Conditional variance = cut risk. Persistence α+β.

Stage 3

pmdarima

Auto-ARIMA with AIC/BIC order search. Dividend payment forecasting with confidence intervals.

Stage 4

prophet

Facebook Prophet with quarterly seasonality decomposition. 2nd member of 3-model ensemble for 4-quarter forecast.

Stage 4

scikit-learn Ridge

Ridge regression (L2, α=1.0) with lag-4 windowed features and StandardScaler. 3rd ensemble member. Dynamic inverse-MAE weighting. CI: ±1.28×val_MAE (80%).

Stage 4 · inverse-MAE weights

tsfresh

EfficientFCParameters — 800+ statistical features from payment time series. 20-feature curated subset.

Stage 5

empyrical-reloaded

Income Sharpe, income Sortino, max income drawdown. Treats dividends as the "returns" series.

Stage 6

ffn

Financial functions — yield-on-cost CAGR, income consistency score, drawdown analytics.

Stage 6

PyPortfolioOpt

Max-yield portfolio optimization where yield replaces expected returns in the efficient frontier.

Stage 7

Riskfolio-Lib

HRP (Hierarchical Risk Parity) across dividend growth correlations for robust position sizing.

Stage 7

skfolio

Portfolio optimization toolkit — additional constraint handling for income-focused allocation problems.

Stage 7

alphalens-reloaded

Factor IC / ICIR back-testing of HeyDividend AI Safety Score, yield screen, DGR screen as alpha signals.

Stage 8

ta · finta

Technical analysis indicators applied to dividend yield time series for regime context signals.

supporting

financedatabase

Sector/industry classification database — sector peer grouping for contagion and factor analysis.

supporting

pypme · rateslib

Public Market Equivalent benchmarking and rates/bond analytics for yield spread context.

supporting

🔧Core Intelligence Services

Intelligent Query Service

19 intent types · multi-model fallback chain · learning loop

Hallucination Prevention

Ticker Preflight Gate · Schema fingerprint · Claim verification

Dividend Safety Ensemble

7-voter weighted system · VERY_SAFE → CRITICAL scale

NAV Erosion Service

50 securities · 6 warning signals · distribution trap detection

Chart Generator Service

7 chart types · matplotlib · server-side PNG generation · includes Forecast Fan (bull/base/bear · FMP consensus + ±1.65σ fallback)

PDF Research Report Gen

4 report types · ReportLab · institutional-grade output

SEC Filing RAG Service

EDGAR retrieval · vector search · 10-K/Q synthesis

Stock Card Service

Data-driven per-ticker summaries · ML integration · 30+ stopwords

🌐Real-Time Data Services

FMP Integration (80+ endpoints)

Financial Modeling Prep · market data · fundamentals · estimates

Finnhub Service

Real-time quotes · news · earnings · analyst ratings

X.com Monitoring

Grok agent tools · circuit breaker active · X API 2025 pay-per-use

FRED Economic Data

Federal Reserve economic indicators · interest rate feeds

Unified Dividends Pipeline

Multi-source dividend data · FMP + Finnhub + EDGAR → Azure SQL

Multi-Source Backfill

Auto data discovery · gap filling · Backfilled_Dividends/Prices tables

🚀v4.2 Feature Layer

Semantic Vector Memory

Persistent user conversation memory with semantic retrieval

ElevenLabs TTS

Text-to-speech audio generation for research summaries

Agentic Tool-Calling Loop

Multi-step tool orchestration with self-correction

Code Interpreter (Sandboxed)

Safe Python execution for user-defined financial calculations

SSE Token Streaming

Real-time token-by-token response streaming via Server-Sent Events

WebSocket Alert Push

Real-time dividend alert delivery via persistent WebSocket

User Profile Injection

Personalized context from persistent user profiles into LLM prompts

Auto Fine-Tuning Pipeline

Fully automated: Sunday 1 AM UTC · 3 sources (50/35/15%) · min 200 pairs · gpt-4o-mini-2024-07-18 · 6h polls · zero manual steps

HeyDividend AI v4.0 Training Agent

29 seed pairs across 6 categories · 15 LLM variations per run (every 2h) · ~180 pairs/day into InvestmentResearcherTraining (35% fine-tune weight)

🧠Second Brain — Knowledge Graph & Weekly Review NEW

Entity Knowledge Graph

Backlinks layer over block memory: typed edges between entities that co-occur in a stored response — in_sector, has_metric, compared_with, risk_flag. retrieve_connected() expands 1 hop ordered by weight × confidence; stale edges decay by confidence and are pruned during consolidation.

active · heydividend_memory_edges · block_memory_service.py
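A one-hop expansion ranked by weight × confidence might look like the sketch below. The tuple layout is a hypothetical in-memory view of heydividend_memory_edges, and treating edges as undirected is an assumption; the real retrieve_connected() signature is not shown on this page.

```python
def retrieve_connected(edges, seeds, limit=5):
    """Expand one hop from `seeds`, ranked by weight * confidence.

    `edges` is a list of (source, relation, target, weight, confidence)
    tuples, a hypothetical in-memory view of heydividend_memory_edges.
    """
    seeds = set(seeds)
    best = {}
    for src, relation, dst, weight, confidence in edges:
        # Treat edges as undirected backlinks: follow them from either end.
        if src in seeds and dst not in seeds:
            neighbor = dst
        elif dst in seeds and src not in seeds:
            neighbor = src
        else:
            continue
        score = weight * confidence
        # Keep only the strongest edge per neighbor.
        if score > best.get(neighbor, (0.0, None))[0]:
            best[neighbor] = (score, relation)
    ranked = sorted(best.items(), key=lambda item: item[1][0], reverse=True)
    return [(name, rel, score) for name, (score, rel) in ranked[:limit]]
```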

Weekly Claude Knowledge Review

Per-user Claude Sonnet 4 digest: Themes · Risk watch · Gaps. Reads the user's recent block-memory entities and relationship edges from the last 7 days (including compared_with and risk_flag edges) and writes a short structured review. Persisted to heydividend_knowledge_reviews for retrieval/audit.

Claude Sonnet 4 · knowledge_review_service.py · weekly job

Map of Content (MOC)

Living "what I've researched" map driven by the graph — new topic:"moc" on the mind-map service: hubs ranked by occurrence, children via graph neighbors labeled by relation. Reuses the existing node schema (frontend unchanged) and fixes the prior bug that left the "Recent Focus (Memory)" pillar empty.

active · mind_map_service.py · POST /api/v1/reports/mind-map

Memory Tiering

Explicit lifecycle on each memory: fleeting (seen once) → literature (corroborated, embedded, ≥1 edge) → permanent (meets the fine-tune threshold). Prompt injection prioritizes permanent → literature → fleeting; fine-tune export filters on permanent. Promotion happens inside consolidate().

active · tier column · consolidate()
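The tier lifecycle could be expressed as a small promotion rule, sketched here under assumptions: the field names (`seen_count`, `embedded`, `edge_count`) and the numeric fine-tune threshold are hypothetical, since the page only describes the tiers in words.

```python
def promote(memory, fine_tune_threshold=3):
    """Return the tier a memory should hold after consolidation.

    `memory` is a dict with hypothetical fields: `seen_count`,
    `embedded` (bool) and `edge_count`. The threshold value is an
    assumption; the dashboard does not state it.
    """
    corroborated = memory["seen_count"] >= 2
    if corroborated and memory["embedded"] and memory["edge_count"] >= 1:
        if memory["seen_count"] >= fine_tune_threshold:
            return "permanent"
        return "literature"
    return "fleeting"


TIER_PRIORITY = {"permanent": 0, "literature": 1, "fleeting": 2}


def order_for_prompt(memories):
    """Sort memories so permanent ones are injected first."""
    return sorted(memories, key=lambda m: TIER_PRIORITY[m["tier"]])
```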

Memory API Surface

New additive endpoints: GET /api/v1/memory/graph (raw nodes + edges), GET /api/v1/memory/review/latest, POST /api/v1/memory/review/run (manual trigger). Registered before the /{user_id} catch-all so they aren't shadowed. Verified live on the LLM VM.

deployed · memory_routes.py · 3 endpoints

🆕Recent Platform Upgrades

Chat History API

Full conversation persistence to Azure SQL. 4 endpoints: GET /api/v1/chat/conversations (list recent), GET /conversations/{id}/messages (restore thread), DELETE /conversations/{id}, POST /conversations/new. Streams persist via _history_save_wrapper. User/conversation IDs via x-user-id/x-conversation-id headers.

active · chat_history_routes.py · 4 endpoints

Analysis Depth Overhaul v4.1

Every single-ticker dividend card now calls Claude Sonnet for 5 mandatory analysis sections: Dividend Sustainability (VERY SAFE/SAFE/WATCH/AT RISK/DANGER), Growth Trajectory, Yield in Context (vs sector/S&P/T-bill/peers), Business Quality, and HeyDividend AI's Verdict (BUY/HOLD/WATCH/AVOID). max_tokens raised to 2800. Banned-phrase enforcement. prefer_groq=False.

Claude Sonnet 4 · ai_sdk_routes.py · 5 sections · 2800 tokens

Two-Layer LLM Response Cache

Semantic response cache with in-memory dictionary (L1) + Azure SQL persistent table (L2). Reduces repeated-query token cost by 30–50%. Semantic similarity matching prevents stale cache misses. LRU eviction on L1 overflow.

active · L1 in-memory · L2 Azure SQL
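The L1/L2 layering with LRU eviction can be illustrated with a small sketch. It makes two loud simplifications: sqlite3 stands in for the Azure SQL table, and lookups use a normalized exact-match key instead of semantic similarity. The class and method names are hypothetical.

```python
import sqlite3
from collections import OrderedDict


class TwoLayerCache:
    """L1: in-memory LRU dict. L2: SQL table (sqlite3 stands in for Azure SQL)."""

    def __init__(self, l1_size=2):
        self.l1 = OrderedDict()
        self.l1_size = l1_size
        self.l2 = sqlite3.connect(":memory:")
        self.l2.execute("CREATE TABLE cache (k TEXT PRIMARY KEY, v TEXT)")

    @staticmethod
    def _key(query):
        # Simplification: normalized exact match, not semantic similarity.
        return " ".join(query.lower().split())

    def _remember(self, key, value):
        self.l1[key] = value
        self.l1.move_to_end(key)
        if len(self.l1) > self.l1_size:
            self.l1.popitem(last=False)  # evict least recently used

    def get(self, query):
        key = self._key(query)
        if key in self.l1:
            self.l1.move_to_end(key)
            return self.l1[key]
        row = self.l2.execute("SELECT v FROM cache WHERE k = ?", (key,)).fetchone()
        if row is None:
            return None
        self._remember(key, row[0])  # promote the L2 hit into L1
        return row[0]

    def put(self, query, response):
        key = self._key(query)
        self._remember(key, response)
        self.l2.execute("INSERT OR REPLACE INTO cache VALUES (?, ?)", (key, response))
```

An entry evicted from L1 is still served from L2 on the next request and promoted back into L1, which is what lets the in-memory layer stay small.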

99.999% Accuracy Enforcement (v1.0)

Multi-layer accuracy framework: middleware interception, claim verification gate, stale data refresh trigger, multi-source cross-validation. Batch INSERT logging (1 background thread replaces 24 sequential DB calls — ~4.2s → 200ms). HeyDividend AI accuracy log table auto-ensured once per process.

active · claim_verification_gate.py · batch INSERT

Broker Connect via Plaid (REST + MCP)

5 REST endpoints at /api/plaid/* plus 5 MCP tools. 3-step flow: create link token → connect account → sync portfolio. CUSIP (100%), ISIN (95%), ticker (70%) institutional-grade security matching. Dividend-only filtering, portfolio upsert, ownership-verified disconnect.

active · /api/plaid/* · PLAID_CLIENT_ID + PLAID_SECRET
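The tiered identifier matching could be sketched as a strongest-first lookup. The confidence values come from the text above; the dict keys and the function name are hypothetical, and the identifiers used in any call are placeholders rather than real security IDs.

```python
# Identifier types in descending order of match confidence (values from the dashboard).
MATCH_CONFIDENCE = (("cusip", 1.00), ("isin", 0.95), ("ticker", 0.70))


def match_security(holding, securities):
    """Resolve a brokerage holding to a known security.

    Tries identifiers from strongest to weakest and returns
    (security, confidence), or (None, 0.0) if nothing matches.
    `holding` and `securities` are plain dicts with hypothetical keys.
    """
    for field, confidence in MATCH_CONFIDENCE:
        value = holding.get(field)
        if not value:
            continue
        for security in securities:
            if security.get(field) == value:
                return security, confidence
    return None, 0.0
```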

DividendScreener Stale-Connection Retry

2-attempt retry loop in request_handler.py. On OperationalError at attempt 0 calls engine.dispose() to recycle the pymssql connection pool, then retries. Graceful fallback to LLM narrative if second attempt also fails. Eliminates "connection closed" errors after DB idle.

active · request_handler.py:637 · pool recycle
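The retry pattern described above is sketched below so that it runs without a database. `OperationalError` and `FakeEngine` are local stand-ins; in a SQLAlchemy setup they would presumably be `sqlalchemy.exc.OperationalError` and a real `Engine`, whose `dispose()` method discards pooled connections. The function and callback names are hypothetical.

```python
class OperationalError(Exception):
    """Stand-in for sqlalchemy.exc.OperationalError so the sketch runs without a DB."""


def run_with_pool_recycle(engine, query_fn, fallback_fn, attempts=2):
    """Try the query; on the first OperationalError recycle the pool and retry."""
    for attempt in range(attempts):
        try:
            return query_fn(engine)
        except OperationalError:
            if attempt == 0:
                engine.dispose()  # drop stale pooled connections before retrying
    # Every attempt failed: degrade gracefully (e.g. to an LLM narrative).
    return fallback_fn()


class FakeEngine:
    """Hypothetical engine that fails until its pool has been disposed."""

    def __init__(self):
        self.disposed = False

    def dispose(self):
        self.disposed = True


def query(engine):
    if not engine.disposed:
        raise OperationalError("connection closed")
    return ["KO", "PG"]
```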

Fix-It Agent v2.1

Hourly scan of conversation logs for failed responses. 19 fix strategies (7 new in v2.1: TAX_EFFICIENCY, ETF_ANALYSIS, DRIP_COMPOUNDING, RETIREMENT_INCOME, COMPARISON_ANALYSIS, POSITION_SIZING, DGI_STRATEGY). 45 bad-response detection patterns (12 new: knowledge-cutoff deflections, clarification-fishing, tool-error leakage, hedging, truncated/disclaimer-only responses, AI identity deflections, info-begging, accuracy disclaimers, hallucination confidence hedges, stall phrases).

active · 19 strategies · 45 patterns · v2.1

DIVIDEND_SCREENER Training Data v2

86x failure-pattern fix for SCREENER_LIST intent. dividend_intelligence_training.py: 12→20 scenarios, 8 new query pools (PRICE_CONSTRAINED, FCF_COVERAGE, TICKER_SEEDED, RECOVERY_DIVIDEND, NEGATIVE_SCREEN, INCOME_TARGET_REVERSE, COMPARATIVE_BETTER, VAGUE_INTENT). comprehensive_investment_trainer.py: +36 templates across 6 gap-pattern categories. LLM override prompt for screener intents — requires concrete ticker list with yield, safety rating, and rationale.

active · 20 scenarios · SCREENER_LIST · 86x fix

🧠HeyDividend AI v4.0 Standalone Capabilities

General Knowledge Handler

Detects educational/definitional queries ("what is inflation", "explain DCF") via prefix matching + capability triggers. Routes before ticker extraction — HeyDividend AI answers as CIO-level financial educator.

active · heydividend_persona.py

Capability Registry

34 structured capabilities across 8 categories: Dividend Analysis, Portfolio Strategy, ETF Intelligence, Equity Research, Market/Macro, Screeners, Education, Advanced Tools. Exposed via GET /api/v1/capabilities.

active · heydividend_capabilities.py

Agentic Reflect Loop

Fires after CHAT PATH LLM stream completes — checks if critical content is missing and appends a supplement. Only activates for non-low-latency queries with parsed tickers. Capped at 3s via SIGALRM.

active · _reflect_and_supplement()

Graceful Degradation

50 pre-built Q&A answers for common finance topics. Activates on full provider failure. Health exposed via GET /api/v1/health — 6 provider statuses (Azure OAI, Claude, Gemini, OpenAI, Groq, graceful_degradation).

active · 50 answers · graceful_degradation.py

Unified Persona Contract

Single source of truth for HeyDividend AI's CIO/Chief Strategist identity. get_system_prompt(mode="full"|"compact"|"fast") assembles the right persona for each call site. Imported by llm_providers.py and intelligent_query_service.py.

active · heydividend_persona.py

📊StructuredKPIService — Perplexity Finance-Style Breakouts

Auto-appended to every single-ticker CHAT PATH response. Auto-detects security type via symbol sets + FMP profile. Fetches data in parallel via ThreadPoolExecutor (5s hard cap). Renders clean markdown panels.
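A parallel fetch with a hard time cap might be structured like the sketch below. The fetcher names and the decision to drop slow or failing panels silently are assumptions; `cancel_futures` requires Python 3.9 or later.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def fetch_panels(fetchers, hard_cap=5.0):
    """Run all fetchers in parallel; keep whatever finishes within the cap.

    `fetchers` maps a panel name to a zero-argument callable.
    """
    pool = ThreadPoolExecutor(max_workers=len(fetchers) or 1)
    futures = {pool.submit(fn): name for name, fn in fetchers.items()}
    done, _not_done = wait(futures, timeout=hard_cap)
    # Do not block on stragglers; unstarted work is cancelled (Python 3.9+).
    pool.shutdown(wait=False, cancel_futures=True)
    panels = {}
    for fut in done:
        if fut.exception() is None:  # skip fetchers that raised
            panels[futures[fut]] = fut.result()
    return panels
```

The cap applies to the whole batch, so one slow provider costs at most `hard_cap` seconds and only its own panel goes missing.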

BLOCK TYPE — stock

Business segments, valuation ratios, P&L summary for general equities.

BLOCK TYPE — dividend_stock

Yield, FCF payout, ML cut risk, 7-model safety score, growth streak years.

BLOCK TYPE — reit

FFO proxy, AFFO payout ratio, debt/equity, ML cut risk score.

BLOCK TYPE — etf

Top 10 holdings, sector allocation, expense ratio, AUM.

BLOCK TYPE — dividend_etf

Weighted yield, NAV erosion flag, distribution schedule.

BLOCK TYPE — bank

Net Interest Margin, ROE, ROA, efficiency ratio, P/Book.

BLOCK TYPE — mlp

DCF coverage ratio, distribution yield, debt/EBITDA, commodity exposure.

🏗️HarvestEngine Platform — 8 Modules

Backtesting Engine

6 strategies · DRIP, Growth, High-Yield, Capture, Aristocrat, Covered Call

Portfolio Optimizer

Dividend income optimization · Markowitz-inspired allocation

Dividend Risk Analyzer

NAV erosion, cut risk, FCF coverage, sector contagion

Income Impact Simulator

Dividend income impact projection with DRIP compounding

Dividend Calendar

Ex-date tracking · declaration date patterns · 25 securities

NAV Avoidance Screener

50 securities · 20 profiles · 6 warning signals

Multi-Model Safety Ensemble

Orchestration layer · 7 voters · weight redistribution on failure

HarvestEngine Continuous Training

Expert Q&A pair generation · 13,120 rows in DB · ThreadPoolExecutor

🧪Dividend Intelligence Pipeline — 8 Services (Quantium Library Stack)

Full ML pipeline orchestrated by dividend_intelligence_pipeline.py — all 8 stages confirmed available: true. Returns unified DividendIntelligenceReport with HeyDividend AI composite score & plain-language verdict.

STAGE 1 — EDGAR Dividend Service

edgar_dividend_service.py

edgartools pulls 8-K declared dividends + 10-K XBRL cash flow + Item 5 policy text. Policy type classifier (consistent / growth / variable / suspended).

edgartools · SEC EDGAR · XBRL

STAGE 2 — Trading Calendar Service

trading_calendar_service.py

exchange_calendars validates ex-dates, counts consecutive growth years respecting fiscal years, detects pay-gap widening (liquidity stress), estimates next ex-date.

exchange-calendars · pandas-market-calendars

STAGE 3 — Yield Volatility Service

yield_volatility_service.py

ARCH/GARCH(1,1) fit on dividend yield time series → conditional variance = cut risk signal. 0–100 score, regime (stable / elevated / crisis), persistence alpha+beta.

arch/GARCH · cut risk 0–100

STAGE 4 — Dividend Forecast Service

dividend_forecast_service.py

Three-model ensemble: Auto-ARIMA (pmdarima) + Facebook Prophet + Ridge regression (L2, α=1.0, lag-4 windowed features, StandardScaler). Dynamic inverse-MAE weighting — weight ∝ 1/val_MAE, equal-weight fallback if <2 valid. Next 4 payment forecasts with 80% CI. Predictability score 0–100.

pmdarima · prophet · Ridge · 4-quarter CI · inverse-MAE weights
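The inverse-MAE weighting rule (weight ∝ 1/val_MAE, equal weights when fewer than two models are valid) and the ±1.28 × val_MAE interval can be sketched as follows. The function names are hypothetical, and spreading the fallback weight equally over every model passed in is one possible reading of the rule.

```python
def inverse_mae_weights(val_mae):
    """Weights proportional to 1/val_MAE over models with a valid MAE.

    `val_mae` maps model name -> validation MAE, or None if the model
    failed to fit. With fewer than two valid models, fall back to equal
    weights across all models (one reading of the dashboard's rule).
    """
    valid = {m: mae for m, mae in val_mae.items() if mae is not None and mae > 0}
    if len(valid) < 2:
        return {m: 1.0 / len(val_mae) for m in val_mae}
    inv = {m: 1.0 / mae for m, mae in valid.items()}
    total = sum(inv.values())
    # Models without a valid MAE get zero weight.
    return {m: inv.get(m, 0.0) / total for m in val_mae}


def interval_80(point, val_mae, z=1.28):
    """80% interval as point +/- z * validation MAE."""
    return point - z * val_mae, point + z * val_mae
```

With validation MAEs of 1.0, 2.0 and 2.0, the first model receives half the weight and the other two a quarter each.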

STAGE 5 — Dividend Feature Extractor

dividend_feature_extractor.py

tsfresh EfficientFCParameters extracts 800+ features from payment time series. Curated 20-feature subset + stability score 0–100. Powers the tsfresh Cut-Risk Classifier: GradientBoostingClassifier (n_estimators=200, max_depth=4, lr=0.05, subsample=0.8) — upgraded from RandomForest. Time-ordered train/test split with 10% embargo gap prevents look-ahead bias. Returns probability + top-5 feature drivers with direction signals. Monthly auto-retraining.

tsfresh · 800+ features · stability 0–100 · GradientBoosting · monthly retrain
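A time-ordered split with an embargo gap can be sketched with plain index ranges. The 10% embargo comes from the text above. The test fraction, the function name and the choice to measure both fractions against the full sample are assumptions.

```python
def embargo_split(n_samples, test_frac=0.2, embargo_frac=0.10):
    """Index ranges for a time-ordered split with an embargo gap.

    Rows are assumed sorted oldest to newest. The newest `test_frac`
    of rows form the test set, and the `embargo_frac` of rows just
    before it are dropped so lagged features cannot leak across the
    boundary. Both fractions are of the full sample (an assumption;
    the dashboard only states a 10% embargo).
    """
    n_test = int(n_samples * test_frac)
    n_gap = int(n_samples * embargo_frac)
    train_end = n_samples - n_test - n_gap
    if train_end <= 0:
        raise ValueError("not enough samples for this split")
    train = range(0, train_end)
    test = range(n_samples - n_test, n_samples)
    return train, test
```

With 100 rows and the defaults, training uses rows 0–69, rows 70–79 are discarded as the embargo, and rows 80–99 are held out for testing.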

STAGE 6 — Income Analytics Service

income_analytics_service.py

empyrical-reloaded + ffn compute income Sharpe, income Sortino, yield-on-cost CAGR, max income drawdown, income consistency score 0–100.

empyrical · ffn · YOC CAGR

STAGE 7 — Portfolio Income Optimizer

portfolio_income_optimizer.py

PyPortfolioOpt max-yield optimization (yield replaces expected returns), HRP across dividend growth correlations, Kelly position sizing (win_prob = 1 − cut_prob).

PyPortfolioOpt · Riskfolio-Lib · HRP · Kelly
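The Kelly sizing step can be illustrated with the classic formula f* = p − (1 − p)/b, taking p = 1 − cut_prob as stated above. The payoff ratio b and the position cap are hypothetical parameters that this page does not specify.

```python
def kelly_fraction(cut_prob, payoff_ratio=1.0, cap=0.25):
    """Kelly position size with win_prob = 1 - cut_prob.

    Uses the classic formula f* = p - (1 - p) / b, where b is the
    gain-to-loss ratio. `payoff_ratio` and `cap` are hypothetical
    parameters; the dashboard only states how win_prob is derived.
    """
    if not 0.0 <= cut_prob <= 1.0:
        raise ValueError("cut_prob must be in [0, 1]")
    if payoff_ratio <= 0:
        raise ValueError("payoff_ratio must be positive")
    win_prob = 1.0 - cut_prob
    raw = win_prob - (1.0 - win_prob) / payoff_ratio
    # Never short on a negative edge, and never exceed the cap.
    return max(0.0, min(raw, cap))
```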

STAGE 8 — Dividend Factor Analyzer

dividend_factor_analyzer.py

alphalens-reloaded back-tests HeyDividend AI Safety Score, yield screen, DGR screen as alpha factors. Returns IC / ICIR metrics proving or refuting each screen's predictive power.

alphalens · IC · ICIR · alpha factor

🔗MCP Server v3.1 — 23 Financial Intelligence Tools · 5 Capability Groups

OAuth 2.1 one-click connection · stdio (Claude Desktop) + HTTP/SSE at /api/mcp/sse · MCPGuard prompt injection + rate limiting + SHA-256 audit log

📡Group 1 — Data (7 tools)

get_dividend_history

Full dividend payment history for any ticker from Azure SQL

data · 60/min

get_stock_price

Real-time price + yield from FMP integration

data · 60/min

get_dividend_calendar

Upcoming ex-dates, pay dates, declaration dates

data · 60/min

get_company_fundamentals

FCF, payout ratio, debt-to-equity, sector from DB views

data · 60/min

get_sec_filings

SEC EDGAR 10-K/10-Q/8-K structured extraction with 7-day cache

data · 30/min

get_earnings_transcript

FMP earnings call transcripts — dividend guidance + guidance language extraction

data · 30/min

get_market_data

Macro indicators, sector ETF flows, FRED interest rate data

data · 60/min

🔍Group 2 — Screening (3 tools)

screen_dividends

Filter securities by yield, DGR, payout ratio, streak, sector

screening · 60/min

screen_dividend_aristocrats

Filter by consecutive growth years — Aristocrats (25yr+), Kings (50yr+), Champions

screening · 60/min

screen_nav_safe_etfs

NAV-erosion-free ETF screening — 6 warning signals, 50 securities tracked

screening · 30/min

📊Group 3 — Analytics (4 tools)

analyze_dividend_safety

Full 7-voter ensemble safety score: VERY SAFE / SAFE / WATCH / AT RISK / DANGER

analytics · 20/min

compare_dividends

Side-by-side multi-ticker dividend comparison matrix with safety ratings

analytics · 60/min

forecast_dividend

Three-model ensemble forecast (ARIMA + Prophet + Ridge) — next 4 quarters with 80% CI

analytics · 20/min

optimize_portfolio

HarvestEngine max-yield portfolio optimization with HRP position sizing

analytics · 10/min

🧠Group 4 — Intelligence (4 tools)

ask_heydividend

Natural language query to HeyDividend AI's full AI pipeline. Prompt injection & manipulation detection via MCPGuard.

intelligence · 10/min · 50/hr · guarded

generate_research_report

Claude Opus 4.5 deep research — IoC, earnings, sector, dividend sustainability, risk

intelligence · 5/min

get_heydividend_capabilities

List HeyDividend AI's 34 capabilities across 8 categories for agent discovery

intelligence · no limit

get_durability_graph

Composite durability score with 6 explainable sub-scores, stress scenarios, historical trends

intelligence · 20/min

🏦Group 5 — Brokerage / Plaid Connect (5 tools)

create_brokerage_link_token

Step 1: Generate Plaid Link token for OAuth brokerage connection flow

brokerage · 30/min

connect_brokerage_account

Step 2: Exchange public token → access token; persist to DB. CUSIP/ISIN/ticker matching.

brokerage · 30/min

sync_brokerage_portfolio

Step 3: Pull holdings → enrich with dividend data → upsert portfolio. Dividend-only filter.

brokerage · 10/min

get_brokerage_portfolio

Retrieve synced portfolio positions with HeyDividend AI safety scores and income projections

brokerage · 60/min

disconnect_brokerage

Revoke Plaid access token + purge portfolio data. Ownership-verified delete.

brokerage · 10/min

Version: MCP Server v3.1 · MCP SDK v1.26.0

Transports: stdio (Claude Desktop) + HTTP/SSE at /api/mcp/sse

Auth: OAuth 2.1 one-click connection · API key for REST consumers

Security: MCPGuard — prompt injection (16 override + 7 manipulation patterns) + sliding-window rate limiter + SHA-256 audit log → dbo.mcp_audit_log

Brokerage matching: CUSIP 100% · ISIN 95% · Ticker 70% — institutional-grade security resolution

Health: /api/mcp/health · /api/mcp/tools
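Two of the MCPGuard pieces named above, the sliding-window rate limiter and the SHA-256 audit hash, can be sketched with the standard library. This is an illustration of the general techniques only; the class, the function and the record layout are hypothetical and are not taken from MCPGuard.

```python
import hashlib
import json
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per key within any `window_seconds` span."""

    def __init__(self, limit, window_seconds=60.0):
        self.limit = limit
        self.window = window_seconds
        self.calls = defaultdict(deque)

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        q = self.calls[key]
        # Drop timestamps that have slid out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True


def audit_digest(tool, user_id, arguments):
    """SHA-256 hex digest of a canonical JSON record of one tool call."""
    payload = json.dumps({"tool": tool, "user": user_id, "args": arguments}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Sorting the JSON keys makes the digest independent of argument order, so the same call always produces the same audit hash.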

🤖Claude Intelligence Layer v1.2.0 — 8 Services

ClaudeClient (Async HTTP Core)

httpx async wrapper for Anthropic API — sonnet/opus/haiku. Lazy import, fails gracefully if ANTHROPIC_API_KEY absent. No SDK dependency. Shared across all 7 other Claude services.

Sonnet 4 / Opus 4.5 · httpx · no SDK

Premium Query Router

39-signal frozenset routes complex queries to Claude Sonnet 4 with 6,000-token budget. Triggers: initiation of coverage, passive income plan, retirement portfolio, deep dive, investment thesis. OAI fallback on failure.

Sonnet 4 · 39 signals · request_handler.py
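
A frozenset-based router of the kind described above might look like the following sketch. Only the five trigger phrases named in the description are included (the page mentions 39 signals in total), and the function names and the "default" provider label are hypothetical.

```python
# The five triggers named above; the remaining signals are not listed on this page.
PREMIUM_SIGNALS = frozenset({
    "initiation of coverage",
    "passive income plan",
    "retirement portfolio",
    "deep dive",
    "investment thesis",
})

PREMIUM_TOKEN_BUDGET = 6000  # token budget stated for premium queries


def is_premium_query(query: str) -> bool:
    """True when any premium signal phrase appears in the query."""
    text = query.lower()
    return any(signal in text for signal in PREMIUM_SIGNALS)


def route(query: str) -> dict:
    """Pick a provider; 'default' stands in for the normal (non-premium) path."""
    if is_premium_query(query):
        return {"provider": "claude-sonnet-4", "max_tokens": PREMIUM_TOKEN_BUDGET}
    return {"provider": "default", "max_tokens": None}
```

Substring matching keeps the check cheap; a production router would also need the fallback-on-failure path mentioned above, which is omitted here.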

Deep Research Agent

Opus 4.5 for institutional-grade equity research. 5 report types: initiation-of-coverage, earnings deep dive, sector comparative, dividend sustainability, risk assessment. Multi-section structured output.

Opus 4.5 · POST /api/v1/claude/research/report

Training Quality Reviewer

Scores Q&A training pairs across 5 rubric dimensions. Returns verdict (pass/review/fail), dimension scores, improvement notes. Batch up to 50 concurrent. Raises overall fine-tune dataset quality.

Sonnet 4 · POST /api/v1/claude/training/review · batch 50

Self-Improvement Engine

Queries heydividend_query_log + heydividend_feedback DB tables. Outputs SelfImprovementReport: persona health score 0–100, knowledge gaps, prompt refinements, training recommendations. Scheduled analysis.

Sonnet 4 · POST /api/v1/claude/improve/analyze

Safety Ensemble Vote

Claude Sonnet 4 as the primary weighted voter (26%) in the 8-model dividend safety ensemble. Logical consistency check + hallucination detection + quality anchoring. Dynamic weight redistribution on model failure.

Sonnet 4 · 26% weight · POST /api/v1/claude/ensemble/safety-score

Financial Formula Engine

43+ deterministic formulas across 8 categories with Claude explanations and Excel syntax. Prevents hallucination on quantitative calculations. Gordon Growth Model, Yield on Cost, AFFO Payout, DCF and more.

Sonnet 4 · 43 formulas · GET /api/excel/formulas/list
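
Two of the formulas named above, written as deterministic functions. This is a sketch of the general idea (compute in code, let the model only explain); the function names and the Excel strings in the docstrings are illustrative and are not taken from the engine itself.

```python
def gordon_growth_value(next_dividend: float, required_return: float, growth_rate: float) -> float:
    """Gordon Growth Model: P = D1 / (r - g). Excel-style: =D1/(r-g)."""
    if required_return <= growth_rate:
        raise ValueError("required_return must exceed growth_rate")
    return next_dividend / (required_return - growth_rate)


def yield_on_cost(annual_dividend: float, cost_basis: float) -> float:
    """Yield on Cost = current annual dividend per share / original cost per share."""
    if cost_basis <= 0:
        raise ValueError("cost_basis must be positive")
    return annual_dividend / cost_basis


# Illustrative inputs only: $2.00 next-year dividend, 8% required return, 3% growth.
fair_value = gordon_growth_value(2.00, 0.08, 0.03)
yoc = yield_on_cost(2.00, 50.00)
```

Because the arithmetic never passes through the language model, the numeric result is reproducible for the same inputs.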

ClaudeDeepResearch (Direct Methods)

Three callable research primitives: generate_initiation_report() — full IoC with valuation + dividend thesis; generate_dividend_deep_dive() — payout safety + growth trajectory + BUY/HOLD/WATCH/AVOID verdict; generate_sector_comparison() — peer ranking with dividend yield matrix.

Opus 4.5 · 3 research primitives · POST /api/v1/claude/research/report

Status endpoint: GET /api/v1/claude/status

Client: httpx async (no Anthropic SDK), lazy import, fails gracefully if ANTHROPIC_API_KEY absent

Models: claude-sonnet-4-20250514 (default) + claude-opus-4-20250514 (deep research only)

𝕏X Real-Time Signal Service — v2.0 (X API 2025)

grok_x_dividend_service.py · pay-per-use · XMCP-ready · grok-3-fast · MCP tool #19

X is a high-velocity source of real-time market commentary. HeyDividend AI now queries it via Grok x_search for any ticker, not just the monitored ETF accounts. Pay-per-use billing (X API 2025) means monthly credit exhaustion is no longer possible. The XMCP Server hook will upgrade to native X MCP context when configured.

Signal Types

📣 Dividend announcements
⚠️ Cut / suspension warnings
📊 Earnings reactions
🏦 Analyst calls
👤 Insider activity
🔴 Breaking news

Coverage

Any public ticker
10 monitored ETF accounts
All public X posts
Image understanding
1–30 day lookback
Sentiment classification

New Endpoints

GET /api/x/signals/{ticker}
GET /api/x/signals/{ticker}/dividend
GET /api/x/status
GET /api/x/xmcp/status

MCP Tool #19

get_x_dividend_signals
Args: ticker, days_back
Returns: signals[], sentiment_summary, top_signal

Claude/Cursor can now ask:
"What is X saying about $T?"

Grok Responses API /v1/responses · x_search tool · no monthly limits · XMCP fallback active · env: X_XMCP_SERVER_URL

🛡️Accuracy & Trust Layer

Inline Source Citations (Task #82)

CitationRegistry accumulates per-metric citations during every request. Unicode superscripts (¹²³…) are injected onto metric values in both the resolved-profile context block and cross-API verified metrics. System prompt instructs the LLM to preserve them verbatim and label model-estimated values [est.]. displaySourceAttribution tool call is enriched with per-provider sources, overall confidence, and a full citations array for downstream audit.

audit-grade · app/services/citation_registry.py
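
The superscript injection described above can be sketched as follows. MiniCitationRegistry is a hypothetical stand-in written for this example; the real CitationRegistry's interface is not shown on this page.

```python
# Map ASCII digits to their Unicode superscript forms.
_SUPERSCRIPTS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")


def superscript(n: int) -> str:
    """Render a citation number as Unicode superscript digits."""
    return str(n).translate(_SUPERSCRIPTS)


class MiniCitationRegistry:
    """Hypothetical stand-in: numbers each source once, in first-seen order."""

    def __init__(self):
        self._sources = []

    def annotate(self, value: str, source: str) -> str:
        """Append the source's superscript marker to a metric value."""
        if source not in self._sources:
            self._sources.append(source)
        return value + superscript(self._sources.index(source) + 1)

    def citations(self) -> list:
        """Numbered source list for downstream audit."""
        return [{"n": i + 1, "source": s} for i, s in enumerate(self._sources)]


registry = MiniCitationRegistry()
yield_text = registry.annotate("3.1%", "FMP")
price_text = registry.annotate("$62.10", "Finnhub")
```

Reusing the same number for a repeated source keeps the marker stable across every metric that came from that provider.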

Claim Verification Gate

Extracts financial claims from LLM output and verifies them against the cross-API consensus before the answer is returned. Hard gate at the end of the Phase-5 ensemble assembly. Drops or flags any value that disagrees with the median consensus.

hard gate · claim_verification_gate.py
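
A minimal sketch of the comparison step of such a gate, assuming claims have already been extracted into a metric-to-value mapping. The function name, the verdict labels, and the 2% relative tolerance are assumptions; the page does not state how disagreement is measured.

```python
def verify_claims(claims: dict, consensus: dict, rel_tol: float = 0.02) -> dict:
    """Label each claimed metric 'verified', 'flagged', or 'unverifiable'.

    claims    -- metric name -> value stated in the LLM output
    consensus -- metric name -> median value from the cross-API check
    rel_tol   -- allowed relative deviation from the consensus (assumed 2%)
    """
    verdicts = {}
    for metric, claimed in claims.items():
        reference = consensus.get(metric)
        if reference is None:
            verdicts[metric] = "unverifiable"  # no consensus value to compare against
        elif abs(claimed - reference) <= rel_tol * abs(reference):
            verdicts[metric] = "verified"
        else:
            verdicts[metric] = "flagged"  # disagrees with the median consensus
    return verdicts


verdicts = verify_claims(
    claims={"yield_pct": 3.10, "payout_ratio_pct": 80.0, "price": 62.0},
    consensus={"yield_pct": 3.12, "payout_ratio_pct": 65.0},
)
```

A hard gate would then drop or rewrite every value whose verdict is "flagged" before the answer is returned.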

SHAP Explainability Layer

Audit-ready XAI for dividend safety verdicts. Domain-linear Shapley decomposition for LLM voters + shap.TreeExplainer for sklearn ML models. Returns per-feature contribution that adds up to the final safety score.

XAI · shap.TreeExplainer
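
The additivity property mentioned above (contributions that sum to the score) is easiest to see for a linear scorer, where the Shapley value of feature i reduces to w_i · (x_i − baseline_i) when features are treated as independent. The sketch below assumes that is roughly what "domain-linear" refers to; the feature names, weights, and baseline values are made up for illustration.

```python
def linear_score(weights: dict, features: dict, bias: float = 0.0) -> float:
    """Score of a linear model: bias + sum of w_i * x_i."""
    return bias + sum(weights[name] * features[name] for name in weights)


def linear_shapley(weights: dict, features: dict, baseline: dict) -> dict:
    """Per-feature contribution relative to a baseline: w_i * (x_i - baseline_i)."""
    return {name: weights[name] * (features[name] - baseline[name]) for name in weights}


# Illustrative values only.
weights = {"payout_ratio": -0.5, "fcf_coverage": 0.3}
baseline = {"payout_ratio": 0.6, "fcf_coverage": 1.0}   # e.g. a universe average
features = {"payout_ratio": 0.8, "fcf_coverage": 1.5}

contributions = linear_shapley(weights, features, baseline)
base_score = linear_score(weights, baseline)
final_score = linear_score(weights, features)
# Additivity: base_score + sum(contributions) equals final_score (up to rounding).
```

For tree models the same additivity is what shap.TreeExplainer provides; that library call is not reproduced here.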

Query Distillation Layer

Zero-latency rule-based classifier intercepts high-frequency simple queries (single-ticker price, last dividend, basic yield) and routes them to a cost-effective LLM before the full Phase-5 pipeline runs. Cuts cost on the long tail of trivial queries.

zero-latency · cost optimizer

Skill Loader Service

Injects up to two relevant financial methodology blocks into the system prompt based on the user's query. 68 financial skills across 11 categories live under skills/. House style (frontmatter + H2 sections) documented in skills/README.md. New skills authored via vendored Agent Skill Open Standard kit at tools/agent-skill-creator/.

68 skills · 11 categories · TRIGGER_MAP

Memory & Session Layer + Fix-It Agent

Conversation memory, block-memory entities, user profiles — injected into the system prompt and persisted. Fix-It Agent scans conversation logs for failed responses and regenerates answers offline.

block memory · offline regen

Multi-Source Verifier (5 APIs)

multi_source_verifier.py fires 5 parallel API calls (FMP · Finnhub · EODHD · Alpha Vantage · HeyDividend AI DB) on every ticker preflight. Median consensus for price, yield, annual dividend, payout ratio, and last dividend is injected into the LLM system prompt before any response is generated. 5-minute TTL cache per ticker.

5 sources · median consensus · HIGH/MED/LOW confidence
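
Median consensus with a confidence rating can be sketched as below, using the thresholds given in the dashboard's status summary (HIGH when three or more sources agree, MEDIUM for two, LOW for a single source). What counts as "agreeing" is not specified, so the 1% relative tolerance around the median is an assumption, as are the function name and the source keys.

```python
from statistics import median


def consensus(readings: dict, rel_tol: float = 0.01):
    """Return (median value, confidence rating) for one metric.

    readings -- source name -> value, or None when that source failed
    rel_tol  -- how close to the median a value must be to count as agreeing (assumed)
    """
    values = [v for v in readings.values() if v is not None]
    if not values:
        return None, "LOW"
    mid = median(values)
    agreeing = sum(1 for v in values if abs(v - mid) <= rel_tol * abs(mid))
    if agreeing >= 3:
        rating = "HIGH"
    elif agreeing == 2:
        rating = "MEDIUM"
    else:
        rating = "LOW"
    return mid, rating


value, rating = consensus({
    "fmp": 3.10, "finnhub": 3.11, "eodhd": 3.10,
    "alpha_vantage": None,   # source timed out
    "db": 4.00,              # outlier, ignored by the median
})
```

The median makes a single bad provider harmless, which is why one outlier above still leaves a HIGH rating.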

Finance Benchmark Harness

Runs FinanceBench and FinQA across all ensemble voters and derives router weights from measured accuracy + latency. Weights are not hand-tuned — they are earned.

FinanceBench + FinQA · data-driven weights

📊Reports, Charts & Token Accounting

Slide Deck Generator

6-slide institutional PDF reports via ReportLab: holdings overview, dividend metrics, 3-year income projection, risk heatmap, Claude AI recommendation. Served from /static/reports/{user_id}/ with 24h auto-cleanup.

POST /api/v1/reports/slide-deck · ReportLab

Mind Map Generator

Nested JSON tree for D3.js / React Flow rendering. Topics: strategy (income/growth/safety/international pillars), sectors (GICS grouping), portfolio (per-ticker metric breakdown). Enriched with block-memory entities.

POST /api/v1/reports/mind-map · D3 / React Flow

Forecast Fan Chart — 4-View

Three-scenario (bull / base / bear) charts for price, dividend, yield, and NAV rendered with matplotlib. One endpoint, four views per symbol.

GET /api/v1/charts/forecast-fan/{symbol} · matplotlib

HarvestEngine Platform

Institutional-grade dividend strategy backtesting and portfolio optimization — Aristocrat rotation, max-yield with HRP sizing, sector rebalance, drawdown control.

/api/harvest/* · HRP · backtest

Token Usage & Cost Tracking

Real-time per-provider LLM cost tracking with real token capture, user identification, and spending alerts. Daily aggregates persisted to dbo.llm_token_usage. REST: GET /api/v1/tokens/today · /summary · /historical · /users · /alerts · /alert-status · /pricing + POST /api/v1/tokens/flush.

FinOps · dbo.llm_token_usage

International Dividend Intelligence

Analysis across 9 international markets with withholding-tax adjustment and FX conversion (Frankfurter / ExchangeRate-API). After-tax yield, treaty rate awareness, base-currency normalization.

9 markets · withholding · FX
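
The after-tax yield adjustment described above is simple arithmetic: gross yield scaled by one minus the withholding rate, with amounts converted to the base currency by an FX rate. In this sketch the function names are hypothetical and the 15% rate and 1.08 FX rate are placeholder inputs, not actual treaty or market values.

```python
def after_tax_yield(gross_yield_pct: float, withholding_rate: float) -> float:
    """Yield after foreign withholding tax: gross * (1 - rate)."""
    if not 0.0 <= withholding_rate <= 1.0:
        raise ValueError("withholding_rate must be between 0 and 1")
    return gross_yield_pct * (1.0 - withholding_rate)


def to_base_currency(amount: float, fx_rate: float) -> float:
    """Convert a local-currency amount using a rate quoted as base units per local unit."""
    return amount * fx_rate


# Placeholder inputs for illustration.
net_yield = after_tax_yield(gross_yield_pct=4.0, withholding_rate=0.15)
dividend_in_base = to_base_currency(amount=1.50, fx_rate=1.08)
```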

ChatGPT GPT Actions Integration

ChatGPT-ready OpenAPI 3.1 spec with 9 Actions covering price, dividend safety, dividend history, fundamentals, screener, neural prediction, scenarios, FMP history, chat.

9 Actions · OpenAPI 3.1

Notebook Dashboard Spec

Comprehensive frontend integration spec at docs/NOTEBOOK_DASHBOARD_FRONTEND_SPEC.md — covers Tasks #58–#61 endpoints, block-chain schema, SmartSheet data contract, tool-call UI component contracts, tier gating, WebSocket protocol.

frontend contract · tasks #58–#61

🗂️API Routes (112 files)

advanced_analytics · advisor · agent_tool · ai_sdk · alert_push · claude_intelligence · code_interpreter · comprehensive_training · curated_list · dashboard · database_ml · deep_research · deepseek_training · dividend_aristocrats · dividend_intelligence · dividend_lists · dividend_neural · dividend_pipeline · document_learning · education_training · external_ml_api · file_processing · finetuning · fingpt · finrobot · fmp · fmp_training · general_investment · hallucination_prevention · harvest · harvest_training · hashtag · investment_strategy · market_intelligence · mcp · ml_prediction · moat · multi_source_training · notebook · perplexity · perplexity_training · quantlibs_training · recommendation · recommendation_training · researcher_agent · rlm · roundtable · security_comparison · semantic · sentiment · social_media · strategic · streaming_chat · trading_intelligence · trading_strategies_training · training_management · tts · ultimate_pack · unified_data_lake · user_profile · video · video_training · x_dividend · x_dividend_training

🌐Network & Ports

8001 · HeyDividend AI Main API (public)

9001 · ML API (public)

9000 · Internal API (localhost)

8000 · Shadow / restarting

443 · Nginx → llm.heydividend.ai (llm.theheydividend.ai legacy alias)

🗄️Database

Engine: Azure SQL Server

Server: hey-dividend-sql-server.database.windows.net

Database: HeyDividend-Main-DB

Driver: pymssql (native FreeTDS)

Views: vSecurities, vDividendsEnhanced, vSchedules, vSignals, vPredictions

unified_dividends: 787,787 rows · canonical primary source

Data guard: CHK_unified_dividends_amount_sanity · amount > 0 AND ≤ $500 · 0 bad rows

Guard layers: ① code batch-uniformity check ② SQL CHECK constraint ③ moat cleanup endpoint
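
A code-level mirror of the SQL CHECK constraint (amount > 0 AND amount ≤ 500) could pre-filter rows before they reach the database. The sketch below is an assumption about how such a pre-filter might look; the row shape and function name are hypothetical, and the batch-uniformity rule is not reproduced because its definition is not given here.

```python
MAX_AMOUNT = 500.0  # upper bound from CHK_unified_dividends_amount_sanity


def split_batch(rows: list) -> tuple:
    """Split rows into (clean, rejected) using the same bounds as the CHECK constraint."""
    clean, rejected = [], []
    for row in rows:
        amount = row.get("amount")
        if isinstance(amount, (int, float)) and 0 < amount <= MAX_AMOUNT:
            clean.append(row)
        else:
            rejected.append(row)  # zero, negative, missing, or over $500
    return clean, rejected


clean, rejected = split_batch([
    {"ticker": "AAA", "amount": 0.52},
    {"ticker": "BBB", "amount": 0},
    {"ticker": "CCC", "amount": 500},
    {"ticker": "DDD", "amount": 500.01},
    {"ticker": "EEE", "amount": None},
])
```

Filtering in code first means a single bad row does not abort a whole batch INSERT when the constraint fires.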

⚙️Architecture Enhancement Layer

Model Telemetry System · active

Cost-Aware Router (4-tier) · active

ML-Based Intent Classifier · active

Shared Cache (LRU in-mem) · active

Async DB Pool (ThreadPoolExecutor) · active

Response Gating · active

Parallel Source Fan-Out · active

RAG Retrieval Reranking · active

Circuit Breaker (ML API) · active

ML Health Monitor (30s interval) · active

🔑Active External Integrations

Azure OpenAI

Endpoint: htmltojson-parser-openai-a1a8.openai.azure.com

Deployment: HarveyGPT-5

Status: ✓ active

xAI Grok (X API 2025)

Purpose: X.com monitoring, real-time social sentiment, ensemble voter

Model: grok-3-fast · /v1/responses · x_search tool

Billing: ✓ pay-per-use, no monthly credit exhaustion

Status: ✓ active · MCP tool #19 get_x_dividend_signals

Google Gemini 2.5 Pro

Purpose: Ensemble (31%) + market intel

Status: ✓ active

Anthropic Claude

Models: Sonnet 4 + Opus 4.5

Purpose: Ensemble + deep research

Status: ✓ active (httpx, no SDK)

Perplexity Sonar

Purpose: Ensemble (21%) + research training

Status: ✓ active

ElevenLabs TTS

Purpose: Audio generation for research

Key: ELEVENLABS_API_KEY ✓

Status: ✓ configured

FMP (Financial Modeling Prep)

Endpoints: 80+ API endpoints

Status: ✓ active

Helicone LLM Observability

Purpose: Token tracking, latency, cost

Status: ✓ active (proxy layer)

🔁GitHub Actions — ML Training Pipeline

Workflow: train-ml-models.yml · daily 2 AM UTC + manual dispatch

Models: 6 total (Dividend Growth, Cut Predictor, Anomaly Detection, ESG Scorer, Payout Rating, Portfolio Optimization)

No-DB models (always train): ① Growth Forecaster ⑤ Payout Rating ⑥ Portfolio Optimizer; use synthetic data when DB unreachable

DB-dependent models: ② Cut Predictor ③ Anomaly Detection ④ ESG Scorer; train when Azure SQL reachable

Upload: Azure Blob Storage → ml-models container · versioned archive per run

Status: ✓ Fixed. Exit code 2 (pip cache / MSSQL install) and TRAINED=0 (wrong args) resolved

🩺VM Cron Health — LLM VM (20.81.210.213)

vm_training_cron.sh: ✓ execute permission fixed (was chmod -x, silent 30-min fail)

training_health_check.py: ✓ deployed; runs every 6h via cron, validates all training tables + unified_dividends integrity

Health checks: Row counts per training table · unified_dividends 787k+ · amount sanity · zero / over-$500 scan

📋Log Health (Current Session)

Dominant error source: Grok 429 storm resolved; migrated to X API 2025 pay-per-use; ensemble weights restored

Residual errors: DB timeouts (transient), yfinance fallback, config misses

Log file size: ~205 MB (accumulated; rotate monthly)

Live counts: See Live Logs tab → ERROR filter for real-time totals

Accuracy log: Batch INSERT active, 1 background thread per verify_all_claims() call (~200ms vs ~4.2s sequential)

HeyDividend AI Backend · Replit dev env · auto-refreshes every 10s

Client Layer

🌐 Next.js Frontend

llm.heydividend.ai · Dark blue UI · Vercel AI SDK · REST consumers · RIA / broker-dealer clients

↓

API Gateway

🛡️ Nginx Reverse Proxy

TLS termination · Port 443 → 8001 · /api/internal/ml → 9001

⚡ FastAPI :8001

67 route files · API key auth · Rate limiting · ASGI streaming middleware

📊 Helicone

LLM observability · Token tracking · Latency · Cost per model

🔌 Vercel AI SDK Layer

/api/ai-sdk · Tool-call streaming protocol · SSE streams · 4,400+ lines · Next.js native

↓

Real-Time & Output Layer

🔔 WebSocket Alerts

/ws/alerts/{user_id} · Real-time dividend cut-risk push · Async scanner · Connected user tracking

🎙️ ElevenLabs TTS

/api/v1/tts/synthesize · HeyDividend AI responses → MP3 audio · Voice presets · ELEVENLABS_API_KEY

📊 Dashboard Builder

/api/v1/dashboards · Custom ticker dashboards · Min 3 tickers · Widget config · Persist to DB

↓

Intelligence Routing

🗂️ Query Router

27+ query types · Intent classification · Asset class detection (6 classes)

🎭 Claude Premium Router

39 premium signals · IoC reports · Investment thesis · Retirement plans · Deep dives → Sonnet 4 (6K tokens) · OAI fallback

🎯 ML Intent Classifier

TF-IDF + ensemble scoring · 19 intent types · Semantic routing

🔍 Ticker Preflight Gate

Real-time ticker validation · Stopword filter · $TICKER prefix bypass

↓

Unified Intelligence System — Dividend Safety Ensemble

⚖️ 8-Voter Weighted Ensemble · Dynamic weight redistribution on failure

🎭 Claude Sonnet 4 · 26% · Verification gate

🟣 Gemini 2.5 Pro · 22% · Fundamentals

🔵 Perplexity Sonar · 18% · Real-time news

🟢 HeyDividend AI GPT-5 · 13% · Complex reasoning

🔶 DeepSeek-R1 · 10% · DCF / FCF math

🌐 GAN Scenarios · 6% · Stress-test futures

📡 Market Signals · 3% · CDS + IV

📰 BERT Sentiment · 2% · FinBERT news
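
One plausible reading of "dynamic weight redistribution on failure" is proportional renormalization: drop the failed voters and rescale the surviving weights so they again sum to 1. The sketch below uses the eight weights listed above; the dictionary keys, the function name, and the choice of proportional rescaling are assumptions.

```python
# Voter weights as listed above (they sum to 1.00).
WEIGHTS = {
    "claude_sonnet_4": 0.26,
    "gemini_2_5_pro": 0.22,
    "perplexity_sonar": 0.18,
    "gpt_5": 0.13,
    "deepseek_r1": 0.10,
    "gan_scenarios": 0.06,
    "market_signals": 0.03,
    "bert_sentiment": 0.02,
}


def ensemble_score(scores: dict) -> float:
    """Weighted average over voters that returned a score.

    scores maps voter -> safety score, or None when that voter failed.
    Surviving weights are rescaled in proportion so they sum to 1.
    """
    live = {voter: s for voter, s in scores.items() if s is not None}
    total = sum(WEIGHTS[voter] for voter in live)
    if total == 0:
        raise ValueError("no voter returned a score")
    return sum(WEIGHTS[voter] / total * s for voter, s in live.items())


# Two voters down: the remaining six share the full weight.
scores = {voter: 50.0 for voter in WEIGHTS}
scores["claude_sonnet_4"] = None
scores["gemini_2_5_pro"] = None
degraded = ensemble_score(scores)
```

With this rule a unanimous score is unchanged by failures, and a voter's relative influence among the survivors stays the same.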

↓

Claude Intelligence Layer v1.2.0 — 8 Services · /api/v1/claude/*

🔌 ClaudeClient

httpx async core

Sonnet 4 / Opus 4.5

No SDK · lazy import

Fail graceful

📋 Deep Research Agent

Opus 4.5 · 5 report types

IoC · Earnings deep dive

Sector · Dividend · Risk

🎓 Training Reviewer

5-rubric QA scoring

pass/review/fail verdict

Batch up to 50

🔄 Self-Improvement

Queries query_log + feedback

Knowledge gap detection

Health score 0-100

📐 Formula Engine

43 deterministic formulas

8 categories · Excel syntax

DCF · DDM · YOC · AFFO

🎯 Premium Router

39 signals → Sonnet 4

7 complex intent types

6K token budget

✅ Safety Vote

26% ensemble weight

Logical consistency

Hallucination check

🔬 DeepResearch Methods

generate_initiation_report

generate_dividend_deep_dive

generate_sector_comparison

↓

Analysis & Safety Layer

🔬 Deep-Dive Framework

10 institutional research templates · IoC · Earnings · Sector · Risk

🛡️ Hallucination Prevention

Claim verification · Schema fingerprint · Multi-angle fact-check · Response gating

🤖 Agentic Tool-Calling

/api/v1/agent/query · Multi-step financial reasoning · Tool loop · List available tools

🐍 Code Interpreter

/api/v1/interpret/run · Sandboxed Python execution · Financial modelling · DCF sandbox

📚 SkillLoader Pipeline

34 financial skills · 250+ trigger keywords · Auto-injects methodology context into system prompt

🔄 Two-Layer LLM Cache

L1 in-memory (LRU) + L2 Azure SQL · semantic similarity · 30-50% token cost reduction

📜 Chat History API

4 endpoints · Azure SQL persistence · x-user-id / x-conversation-id headers · streaming wrapper

🎯 Accuracy Enforcement

99.999% target · claim gate · batch INSERT logging · 1 bg thread · multi-source cross-validation

↙↓↓↘

Data · Neural Engine · Training · HarvestEngine

📡 Real-Time Data

FMP (80+ endpoints)

Finnhub · yFinance

EDGAR SEC · FRED

X.com (Grok) · Alpha Vantage

🧬 Dividend Neural Engine

Fourier Denoiser

ARIMA Feature · LSTM Predictor

Stacked Autoencoder

GAN · SOM Anomaly · Eigen

🎓 Training Pipeline

33 agents · ~25k pairs/day

Cron (*/30 min + 2×daily)

256+ screener templates

Roundtable · Trading (13 modules)

🌾 HarvestEngine

Backtesting (6 strategies)

Portfolio Optimizer

NAV Avoidance Screener

Dividend Calendar · DRIP Sim

↓

8-Stage Dividend Intelligence Pipeline — /api/v1/dividend-pipeline/* — Quantium Library Stack

① EDGAR 8-K dividends · 10-K XBRL · Policy classifier

② Calendar Exchange validation · Fiscal year · Pay-gap stress

③ GARCH Yield volatility · Cut-risk score · Crisis regime

④ Forecast ARIMA + Prophet + Ridge · Dynamic MAE weights · 80% CI

⑤ tsfresh 800+ features · GBRT cut-risk · Top-5 explainer

⑥ Income Sharpe · Sortino · YOC CAGR · Max drawdown

⑦ Optimizer PyPortfolioOpt · HRP · Kelly sizing

⑧ Alphalens Factor IC/ICIR · Safety Score backtest · Alpha signals
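
Stage ④ mentions dynamic MAE weights across ARIMA, Prophet, and Ridge. A common way to do this is inverse-error weighting, where each model's weight is proportional to 1/MAE on a recent validation window. The sketch below assumes that scheme; the function names and the sample MAE and forecast values are illustrative.

```python
def inverse_mae_weights(mae: dict, eps: float = 1e-9) -> dict:
    """Weight each model by 1/MAE, normalized to sum to 1 (eps guards against MAE = 0)."""
    inverse = {model: 1.0 / max(err, eps) for model, err in mae.items()}
    total = sum(inverse.values())
    return {model: w / total for model, w in inverse.items()}


def blend(forecasts: dict, mae: dict) -> float:
    """Combine per-model point forecasts using inverse-MAE weights."""
    weights = inverse_mae_weights(mae)
    return sum(weights[model] * forecasts[model] for model in forecasts)


# Illustrative validation errors and next-period dividend forecasts.
recent_mae = {"arima": 1.0, "prophet": 2.0, "ridge": 2.0}
forecasts = {"arima": 1.00, "prophet": 2.00, "ridge": 3.00}
weights = inverse_mae_weights(recent_mae)
combined = blend(forecasts, recent_mae)
```

Recomputing the weights whenever the validation window moves is what would make them "dynamic"; the model with half the error gets twice the weight.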

↓

Azure OpenAI Fine-Tuning Pipeline

3-Source Consolidation → Content-Hash Dedup → JSONL Upload → Azure Fine-Tune Job

💬 heydividend_query_memory · 50% budget · confidence ≥ 0.80 · live conversations

🤖 InvestmentResearcherTraining · 35% budget · 256+ synthetic Q&A templates

📚 heydividend_training_data · 15% budget · quality ≥ 0.75 · ingestion service

☁️ Azure OAI Fine-Tune · AZURE_OPENAI_FINETUNE_MODEL · 3 epochs · heydividend_finetuning_jobs
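
The consolidation and dedup steps can be sketched as follows: each source gets a share of the example budget (50/35/15 as listed above), and a SHA-256 hash of the normalized question and answer removes duplicates across sources. The pair shape, the normalization (strip + lowercase), and the function names are assumptions for illustration.

```python
import hashlib

# Budget shares in percent, in priority order, as listed above.
BUDGET_PERCENT = {
    "heydividend_query_memory": 50,
    "InvestmentResearcherTraining": 35,
    "heydividend_training_data": 15,
}


def content_hash(pair: dict) -> str:
    """SHA-256 over the normalized question and answer text."""
    text = pair["question"].strip().lower() + "\n" + pair["answer"].strip().lower()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def consolidate(sources: dict, total_budget: int) -> list:
    """Take up to each source's quota, skipping pairs already seen in any source."""
    seen, merged = set(), []
    for name, percent in BUDGET_PERCENT.items():
        quota = total_budget * percent // 100
        taken = 0
        for pair in sources.get(name, []):
            if taken >= quota:
                break
            digest = content_hash(pair)
            if digest in seen:
                continue  # duplicate content, possibly from another source
            seen.add(digest)
            merged.append(pair)
            taken += 1
    return merged
```

The merged list would then be written out one JSON object per line before upload; that step is omitted here.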

↓

HeyDividend AI ML API (Internal)

🏭 heydividend-unified-ml v4.0.0 — Port 9001 — 7 Models Loaded

/score/symbol · /predict/yield · /predict/cut-risk (GBRT + time-ordered CV) · /predict/payout-rating · /health · conda env: llm (PyTorch + miniconda3) · Circuit Breaker (CLOSED/OPEN/HALF-OPEN) · rate-limit protection
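
The three circuit-breaker states named above (CLOSED/OPEN/HALF-OPEN) follow a standard pattern, sketched here. The failure threshold, the reset timeout, and the class interface are assumptions; the values used by the ML API client are not given on this page.

```python
import time


class CircuitBreaker:
    """Minimal three-state breaker (thresholds are illustrative assumptions)."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None
        self.state = "CLOSED"

    def allow_request(self) -> bool:
        """CLOSED and HALF-OPEN let calls through; OPEN blocks until the timeout elapses."""
        if self.state == "OPEN":
            if self._clock() - self._opened_at >= self.reset_timeout:
                self.state = "HALF-OPEN"  # start probing the ML API again
                return True
            return False
        return True

    def record_success(self) -> None:
        self._failures = 0
        self.state = "CLOSED"

    def record_failure(self) -> None:
        self._failures += 1
        # A failed probe, or too many consecutive failures, opens the circuit.
        if self.state == "HALF-OPEN" or self._failures >= self.failure_threshold:
            self.state = "OPEN"
            self._opened_at = self._clock()
```

A caller would check allow_request() before each ML API call and fall back to a cached or default score while the circuit is OPEN.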

↓

Data Persistence

🗄️ Azure SQL — HeyDividend-Main-DB

hey-dividend-sql-server.database.windows.net · pymssql · 20+ training tables · unified_dividends · vSecurities · vDividendsEnhanced · heydividend_finetuning_jobs · mcp_audit_log · 13 heydividend_* synonyms for backward compat

💾 LRU Cache

5,000 entries in-memory · 10,800s TTL · Async DB pool via ThreadPoolExecutor
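
An in-memory LRU cache with the capacity and TTL quoted above (5,000 entries, 10,800 s) can be built on collections.OrderedDict. This is a single-threaded sketch under those two stated parameters; the class name and interface are hypothetical, and a shared cache used from a thread pool would also need a lock.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """LRU eviction plus per-entry expiry (not thread-safe in this sketch)."""

    def __init__(self, max_entries: int = 5000, ttl_seconds: float = 10800.0, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._data = OrderedDict()  # key -> (value, expires_at), oldest first

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value) -> None:
        self._data[key] = (value, self._clock() + self.ttl_seconds)
        self._data.move_to_end(key)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict the least recently used entry
```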

📁 ml_training/

966 MB on VM · saved_models/ · tsfresh_cut_risk_classifier.pkl · dividend_lstm.pt · dividend_gan.pt · dividend_som.pkl

↓

Infrastructure

☁️ Azure VM

20.81.210.213 · Ubuntu · 27 GB RAM · 248 GB disk

🐍 miniconda3 llm

Python 3.11 · PyTorch · uvicorn · conda env

⏱️ Cron + Scheduler

112 jobs · */30 min training · GBRT monthly retrain

🔑 Secrets (28)

Azure OAI · Grok · Gemini · Claude · Perplexity · FMP · Finnhub · ElevenLabs · FRED · DeepSeek

📡 MCP Server v3.1

23 tools · 5 groups (data/screening/analytics/intelligence/brokerage) · OAuth 2.1 · Claude Desktop · SSE /api/mcp/sse · MCPGuard · Audit log

🧠 What HeyDividend AI Has Learned

Training knowledge base — 8 domains, 33+ agents, continuous learning

💬 Conversations & Training

LLM-generated investor debates — 6 personas, genuine back-and-forth, Claude-powered

Individual investor persona questions extracted from roundtable training

Auto-detected failed HeyDividend AI responses with LLM-generated corrections

Token Usage & Cost

Auto-refreshes every 60s

Spend Thresholds

Set via DAILY_SPEND_ALERT_USD / DAILY_SPEND_CRITICAL_USD env vars

🌊 Fourier Denoiser

📊 ARIMA Feature

🔲 Stacked Autoencoder

🔮 LSTM Predictor

🎲 GAN Engine

🗺️ SOM Anomaly Detector

🔗 Eigen / Contagion

edgartools

exchange-calendars

pandas-market-calendars

arch (GARCH)

pmdarima

prophet

scikit-learn Ridge

tsfresh

empyrical-reloaded

ffn

PyPortfolioOpt

Riskfolio-Lib

skfolio

alphalens-reloaded

ta · finta

financedatabase

pypme · rateslib
