AIN Asset Management — AI Signal Intelligence

v15 Pipeline · v15_minrisk Canonical · SHIFT=1 · Updated 2026-04-07
Pipeline Flow
Prompt Injection
AI Learning
Strategies
Backtest Methodology
Thematic Purity
Data Sources

Weekly Prediction Pipeline — End-to-End Flow

The system runs automatically every Monday 05:00 UTC via cron. It covers 15 semiconductor tickers across 6 steps. Each ticker's analysis is self-contained and uses no future data.

Step A–C: Weekly Data Collection (auto, non-fatal)
Thesis Diffs ETL: Extract stance changes from prior-week reports
Sector Intel: Generate sector-level macro summary
News Collection: Company + sector news, all 15 tickers
Step 1: Data Loading (per ticker, data_loader.py)
Street Consensus: 90-day PDF extraction lookback; avg EPS + Target Price per bank
Research Reports: Current week + 3-week context; PDF → GPT-4.1 extraction
News Filter: Company news → GPT-4.1 filter; relevance scoring, top-N kept
Price Context: Week open/close/range; 4-week trend, 52-week position
Earnings Calendar: Upcoming earnings date; TTM PE, EPS actuals
Step 2: Memory & Knowledge Injection (per ticker)
4-Week Judgment History: Prior signals, EPS/PE deviations, reasoning
Narrative Knowledge Base: Persistent analyst memory — company-specific insights
Narrative Experiences: Anchored past mistakes + lessons by narrative type
Few-Shot Lessons: Rule-based corrections from prior prediction errors
Consensus Anchor: Percentile-normalised EPS posture (p25/p50/p75)
Step 3: Prompt Assembly
system.md: Anchored Deviation framework
+ 11 decision rules
+
weekly_delta_v18.md: 7 data sections + Step 0
+ Q1–Q5 narrative framework
+ 3 calibration steps
Step 4: LLM Judgment (GPT-4.1)
JSON Output: eps_deviation_pct · pe_adjustment_pct
narrative_reasoning · key_insight · memory_update
Step 5: Code Computes (deterministic)
AI_EPS = Street_EPS × (1 + eps_dev%/100), cap ±15%
AI_PE = Forward_PE × (1 + pe_adj%/100), cap ±20%
Target Price = AI_EPS × AI_PE
Expected Return = (Target / Current Price − 1) × 100, cap ≤ 30%
Signal + Conviction: Derived from EPS deviation + PE adjustment (NOT output by LLM)
Step 6: Store + Backtest Refresh
weekly_analyst_journal: All signals, prices, EPS/PE saved
Backfill Missing Prices: Auto-fills week_close + weekly_return for past weeks
Backtest Snapshot: All 16 strategies computed + saved to weekly_portfolio_snapshot

15 Tickers Covered

Universe: NVDA, AMD, TSM, QCOM, AVGO, TXN, INTC, MU, AMAT, LRCX, KLAC, ADI, MCHP, ON, WOLF

Coverage spans AI/data center accelerators, memory, EDA, analog, power, and foundry subsectors.

SHIFT=1: No Look-Ahead Bias

Signal from Week N → Trade in Week N+1. The pipeline runs Monday 05:00 UTC, after Friday's close and before the US market opens, so it uses only Week N data. Positions are entered at Monday open of the following week and held until Friday close.

This eliminates any use of future data in signal generation.
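The SHIFT=1 pairing can be sketched in a few lines of Python (the rows and their layout are illustrative, not the production schema):

```python
# Illustrative weekly rows: (week, signal generated from that week's data,
# realised return during that week).
rows = [
    (1, "buy", 2.0),
    (2, "hold", -1.0),
    (3, "sell", 3.0),
    (4, "buy", 0.5),
]

# SHIFT=1: the signal from Week N is traded during Week N+1, so each week's
# realised return is paired with the PREVIOUS week's signal.
strategy_returns = []
prev_signal = None                     # Week 1 has no prior signal -> flat
for week, signal, weekly_return_pct in rows:
    earned = weekly_return_pct if prev_signal == "buy" else 0.0
    strategy_returns.append(earned)
    prev_signal = signal               # this signal becomes live next week
```

Week 1 earns nothing (no prior signal); Week 2 trades Week 1's "buy" and earns that week's −1.0%. No week's return is ever paired with a signal computed from that same week's data.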

Automatic Weekly Operations

Cron runs every Monday 05:00 UTC, executing data collection → pipeline → backfill → snapshot refresh automatically. No manual intervention required for normal weeks.

Trade Timing — SHIFT=1 Illustration

Event | Timing | Detail
Signal Generation | Monday 05:00 UTC (Week N+1) | Pipeline runs using Week N's price close. Output = buy/hold/sell signal for each ticker.
Trade Entry | Monday Open (Week N+1) | entry_price = Monday open of the trade week. This is the actual simulated buy price.
Trade Exit | Friday Close (Week N+1) | week_close_price = Friday close. weekly_return = (Friday close / prev Friday close − 1) × 100.
Actual Return (UI) | Week N+1's weekly_return | The "Actual Return" shown in the UI for a Week N signal = the return of the following week's trade.

Full Prompt Assembly — Injection Order (weekly_delta_v18.md)

Every week, the prompt is rebuilt from scratch by injecting fresh data into each section in order. The diagram below shows every section in injection sequence, with its template variable and data source. The LLM only sees the final assembled text — it never accesses the database directly.

SYSTEM (static)
MEMORY / LEARNING
VALUATION ANCHORS
RESEARCH / NEWS
PRICE / MARKET
INSTRUCTIONS
LLM OUTPUT
SYSTEM PROMPT — injected as system role (static, never changes)
SYS

system.md — Anchored Deviation Framework

Static file

Analyst identity: Buy-side fundamental equity analyst, semiconductor sector. Strong prior that consensus is usually right.

Valuation formula: AI_EPS = Consensus_EPS × (1 + eps_deviation_pct/100)  |  AI_PE = Forward_PE × (1 + pe_adjustment_pct/100)  |  PT = AI_EPS × AI_PE

Hard caps: EPS ±15%, PE ±20%, ER ≤ 30%. Signal/conviction computed by code — NOT output by LLM.

11 decision rules (default=no change, sanity-check data, price-run = already priced in, >40% PT divergence = reassess) + Sector knowledge (cycle dynamics, cross-company signals, sub-sector PE ranges, EPS principles)

WEEKLY DELTA PROMPT (weekly_delta_v18.md) — injected as user role, rebuilt each week
1

Section 1 — Past 4-Week Judgment History

{judgment_history_4w} · Source: weekly_analyst_journal

Per-week row: EPS forecast, PE, PT, signal, conviction, actual close, actual weekly_return, key_insight. Includes direction-correct count and bullish-bias pattern summary. Lets the AI audit its own recent track record before making a new call.

Prompt instruction: "Review your pattern: Are you consistently biased in one direction?"

2

Section 2 — Past Analytical Lessons (Few-Shot)

{few_shot_lessons} · Source: learning_review_outcomes

Up to 3 event-matched correction examples. Format: "W{n} ({error_type}): Your prediction={signal}, ER={pct}% → Actual={pct}% → Root cause → Lesson." Filtered by event type (earnings lessons only on earnings weeks, non-earnings lessons otherwise).

3

Section 3 — Valuation Anchors

Source: report_extractions + stock_prices

Consensus (Street View):

{consensus_eps} NTM Consensus EPS — Next-Twelve-Months blend (fiscal year weights), averaged over the last 90 days of reports, deduped per bank
{street_avg_tp} Street Avg Target Price  |  {num_banks_bullish} / {num_banks_bearish} bank count

PE Context:

{trailing_pe} Forward PE = Price / NTM Consensus EPS — this is the pe_adjustment_pct anchor
{ttm_eps} Trailing PE (TTM) = Price / trailing 4Q actuals — reference only, not the anchor
{historical_valuation_context} Historical PE range, EPS beat/miss history, analyst consensus PE
3b

Section 3b — Consensus Anchor (Percentile)

Source: historical consensus data
{consensus_anchor_pt} Consensus PT used as anchor  |  {consensus_anchor_dynamic_er_pct} Implied upside vs price
{consensus_anchor_dynamic_er_percentile_pct} Historical percentile  |  {consensus_anchor_posture} Street posture label
p25 {consensus_anchor_hist_p25_pct} / median {consensus_anchor_hist_p50_pct} / p75 {consensus_anchor_hist_p75_pct}  |  {consensus_anchor_takeaway} Auto-generated takeaway

Usage rules injected: ≥80th pct → street already optimistic, require NEW evidence for bullish; ≤20th pct → street conservative, modest bullish more defensible; 20–80% → normal. Primarily affects MAGNITUDE not DIRECTION. Used for pe_adjustment_pct, NOT eps_deviation_pct.

3c

Section 3c — Narrative Knowledge Base

{narrative_kb_section} · Source: research_os.narrative_kb

Persistent company-specific analyst memory. Each card: narrative_id, stability (STRUCTURAL/CYCLICAL/TACTICAL), claim, confirm_signal, falsifier, state (ACTIVE/STABLE/WEAKENED). These are the narratives the AI evaluates in Q1–Q4. Updated via memory_update after each run.

3d

Section 3d — Narrative-Anchored Experiences

{narrative_experiences_section} · Source: narrative_experience_links

Historical outcomes grouped by narrative_id. Format: "[W{n} {net_direction} | reinforced/challenged] reason text." Shows the AI how each specific narrative has played out historically — if "AI demand narrative" was reinforced but the stock fell 3 times, the AI must factor that in.

4

Section 4 — Research Reports (This Week + Recent Context)

{report_summary} · Source: research.extractions

"This Week" block: New reports published this week — parsed by GPT-4.1 into structured JSON (bank, EPS estimates, target price, rating, key thesis, catalysts, risks). This is the ONLY source for Step 0 analyst_claims extraction.

"Recent Context" block: Prior 3 weeks of reports — background only. NOT used for analyst_claims. Used in Q2/Q3 for trend assessment.

5

Section 5 — Research Experience (Settlement Log)

{experience_summary} · Source: analyst_experience_rules

Running audit trail: past AI predictions vs settled actual outcomes. Format: prediction made → result observed → error type → root cause summary. Creates accountability for past calls and forces re-examination of persistent thesis errors.

6

Section 6 — News Digest

{news_summary} ({n_news} of {total_news} items) · Source: weekly_news_raw → GPT-4.1 filter

Raw news filtered by GPT-4.1 for relevance to this ticker's investment thesis. Shows {n_news} filtered items from {total_news} total collected. Only material news passes. Lower weight than research reports — can flip a marginal call but cannot override clear report signals.

7

Section 7 — Price Context + Earnings Calendar

Source: stock_prices + earnings_calendar
{week_close} Friday close  |  {price_4w_trend} 4-week trend  |  {vs_52w_high} vs 52-week high/low
{vs_nasdaq_4w} vs NASDAQ (relative performance)  |  {earnings_context} upcoming earnings date / is_earnings_week flag

Earnings week triggers discipline rules: higher uncertainty → default toward 0/0 deviations. Price trend used by Rules 8–10 (rally = already priced in, decline = market sees risk).


REASONING INSTRUCTIONS — in-prompt task sequence (static template text)
S0

Step 0 — Analyst Claim Ledger (MANDATORY, before Q1–Q5)

Instruction

Extract every EPS/PT/rating from Section 4 "This Week" reports. One entry per bank × narrative pair — if one bank addresses two narratives, output two entries. Map each claim to the closest narrative_id from Section 3c.

Self-check: Count banks in "This Week". If count > 0, analyst_claims must be non-empty. analyst_claims = [] only if zero new reports this week.

Q1

Q1 — Which narratives are REINFORCED this week?

Instruction

For each narrative in Section 3c: does this week's data match its confirm_signal? Cite specific report/news. No vague claims.

Q2

Q2 — Which narratives are CHALLENGED or WEAKENING?

Instruction

Does data trigger any narrative's falsifier? CYCLICAL/TACTICAL narrative un-reinforced for multiple weeks = weakening. Export controls present for weeks = priced in, only flag NEW escalation.

Q3

Q3 — Is any STRUCTURAL narrative changing fundamentally?

Instruction

STRUCTURAL narratives rarely change. Require multi-source evidence of genuine shift — never from a single data point. Single-week news alone is insufficient.

Q4

Q4 — NET narrative direction this week?

Instruction

Aggregate: STRUCTURAL reinforced = +1.5 weight; TACTICAL challenged = +0.5 weight toward bearish. Rules: reinforced > 2× challenged → bullish; challenged > 2× reinforced → bearish; within 1 → neutral. Result: net_direction field.

Q5

Q5 — How do narratives translate to eps_deviation and pe_adjustment?

Instruction

CYCLICAL narrative reinforced → justify above-consensus EPS? lean_bullish → modest +1% to +3% bias, not forced to zero. STRUCTURAL intact + below historical PE → PE expansion defensible? Section 3b percentile modulates SIZE.

CAL

3 Calibration Steps

Instruction
Step 1: EPS Calibration
Baseline = historical beat/miss avg. Narrative reinforced → add. Challenged → subtract. Cap ±15%. Do NOT move EPS based on PE signals.
Step 2: PE Calibration
Current PE vs historical median + consensus PE gap. Section 3b percentile → modulate magnitude. Cap ±20%.
Step 3: Uncertainty Check
Wide estimate spread → eps closer to 0. Multi-source contradiction → revise. Genuinely uncertain → 0%/0% is correct.

LLM OUTPUT — returned as JSON, no markdown
OUT

Required JSON Output

→ code computes signal/conviction
// analyst_claims: Step 0 output — per-bank × narrative
"analyst_claims": [{ bank, narrative_id, stance, key_claim }]
// Q1–Q4 output
"narrative_reasoning": { reinforced[], challenged[], structural_shift, net_direction, narrative_summary }
"conflict_arbitration": { stance, key_conflicts[], resolution }
// Calibration output
"eps_integration": { bear_eps, base_eps, bull_eps, bear_prob, base_prob, bull_prob, weighted_eps_deviation_pct }
"pe_assessment": { fundamental_pe_range, sentiment_adjustment_pct, final_pe_adjustment_pct, reasoning }
// KEY: the two numbers that drive everything downstream
"eps_deviation_pct": ±15% max    "pe_adjustment_pct": ±20% max
"eps_deviation_reason", "pe_adjustment_reason"
"weekly_summary", "key_insight"
"memory_update": "feeds narrative_kb (Section 3c next week)"
CODE

Code Computes (deterministic, post-LLM)

pipeline.py
AI_EPS
Consensus_EPS × (1 + eps_deviation_pct/100)
AI_PE
Forward_PE × (1 + pe_adjustment_pct/100)
Target Price
AI_EPS × AI_PE
Expected Return
(PT / prev_close − 1) × 100, cap 30%
Signal
ER > 5% → buy | ER < −5% → sell | else → hold
Conviction
Derived from magnitude of combined EPS+PE deviation

system.md — Analyst Identity, Framework & Rules

The system prompt defines the analyst's identity, valuation methodology, and behavioral rules. It does NOT change week to week.

Analyst Identity

Buy-side fundamental equity analyst covering the semiconductor sector. The AI must behave like an institutional analyst: disciplined, skeptical of hype, anchored to consensus data, with a strong prior that consensus is usually right.

Valuation Framework: Anchored Deviation

The AI does NOT produce a target price directly. It predicts deviations from market anchors:

  • eps_deviation_pct: How much AI thinks EPS will differ from Street consensus (±15% cap)
  • pe_adjustment_pct: How much AI thinks the market will re-rate PE vs forward PE (±20% cap)

Code then computes: AI_EPS = Consensus_EPS × (1 + dev/100)

AI_PE = Forward_PE × (1 + adj/100)

Target Price = AI_EPS × AI_PE

Expected Return = (Target / Current Price − 1) × 100%

Hard Caps (Enforced by Code)

  • EPS deviation: maximum ±15% from consensus
  • PE adjustment: maximum ±20% from forward PE
  • Expected Return: capped at 30% (prevent over-concentration)

Identity Property: When both deviations = 0%, Target Price = Current Price → Expected Return = 0% → Signal = hold. This is the correct default.
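A minimal sketch of this chain, including the hard caps and the identity property (function and variable names are illustrative, not the production pipeline.py code):

```python
def compute_target(consensus_eps: float, forward_pe: float, current_price: float,
                   eps_deviation_pct: float, pe_adjustment_pct: float):
    # Enforce the hard caps before applying deviations.
    eps_dev = max(-15.0, min(15.0, eps_deviation_pct))
    pe_adj = max(-20.0, min(20.0, pe_adjustment_pct))

    ai_eps = consensus_eps * (1 + eps_dev / 100)
    ai_pe = forward_pe * (1 + pe_adj / 100)
    target = ai_eps * ai_pe

    er = (target / current_price - 1) * 100
    er = min(er, 30.0)                 # Expected Return capped at 30%
    return target, er

# Identity property: when both deviations are 0% and Forward_PE is
# Price / Consensus_EPS, the target reproduces the current price exactly.
price, eps = 100.0, 5.0
target, er = compute_target(eps, price / eps, price, 0.0, 0.0)
```

With `(0, 0)` inputs the expected return is 0%, which maps to a hold signal downstream; any out-of-range LLM output is clamped before it can move the target.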

11 Decision Rules

  • Rule 1: Default is no change — output 0%/0% unless new information materially shifts expectations
  • Rule 2: Only adjust EPS when new earnings evidence exists (guidance, demand shift, design wins/losses)
  • Rule 3: Only adjust PE when market's willingness to pay changes (macro, rotation, growth shift)
  • Rule 4: Distinguish signal from noise — most headlines are noise
  • Rule 5: Street consensus is an important reference; your deviation is your differentiated view
  • Rule 6: Use incremental changes — >5% EPS or >10% PE in one week requires exceptional evidence
  • Rule 7: Sanity-check all input data — negative EPS for profitable companies, PE < 1x or > 200x, EPS surprises > 200% are likely errors. Never blindly trust data.
  • Rule 8: Sustained rally (>15% in 4 weeks) = good news already priced in → be conservative on positive deviations
  • Rule 9: Sustained decline (>15% in 4 weeks) = market sees risk → investigate before maintaining bullish estimate
  • Rule 10: If your implied Target Price diverges >40% from current price, reassess — the market prices information not yet in reports
  • Rule 11: Signal and conviction are computed by code from your two numbers — you do NOT output them

System Prompt — Semiconductor Sector Knowledge

The system prompt also provides structural industry knowledge that the AI uses to contextualise weekly data.

Cycle Dynamics

  • Semiconductor cycles last 2–4 years; inventory corrections (down-cycle) last 12–18 months
  • Leading indicators: TSMC monthly revenue (demand proxy 6–8 weeks ahead), equipment book-to-bill, DRAM spot prices
  • Inventory normalisation precedes recovery: watch DIO returning to historical range

Cross-Company Signal Transmission

  • TSMC revenue → AMD/NVDA/AVGO revenue (1 quarter lag) — TSMC is the "canary in the datacenter coal mine"
  • AMAT/LRCX equipment orders → TSMC capex → chip supply (9–12 month lag)
  • NVIDIA data center → AMD MI-series competitive pressure → AMD pricing
  • Memory (MU) leads logic cycle by 6–12 months — memory restocking precedes logic upcycle

Valuation Ranges by Sub-Sector

  • AI/DC GPU (NVDA, AMD): 30–60x PE in upcycle, 20–30x in maturity
  • Fabless (QCOM, AVGO): 20–40x PE; premium for recurring software/royalty revenue
  • Memory (MU): 10–20x PE trough, 15–25x peak — highly cyclical
  • Equipment (AMAT, LRCX): 20–35x PE
  • Analog (ADI, TXN, MCHP, ON): 20–30x PE; industrial exposure adds cycle lag
  • Foundry (TSM): 15–25x PE; premium for leading-edge share

EPS Forecast Principles

  • Datacenter/AI: 2–3 quarters visibility via hyperscaler capex guidance
  • Auto/Industrial: long design win cycles (3–5 years); near-term driven by end-demand
  • Memory: ASP × volume model — ASP can swing ±30% in one quarter
  • Beat-and-raise is the bullish pattern; guide-down is the bearish signal that moves stocks most

Weekly Delta Prompt (v18) — 7 Data Sections + Reasoning Framework

The weekly delta prompt is re-assembled each week with fresh data injected into each section. Template variables are shown in {curly_braces}.

1

Historical Judgment Context (4 Weeks)

Injects the analyst's own prior signals, EPS/PE deviations, actual outcomes, and reflection notes for the past 4 weeks. Allows the AI to learn from recent history within a single prompt.

{judgment_history_4w}
Source: weekly_analyst_journal → formatted as structured table per week
2

Past Analytical Lessons (Few-Shot Learning)

Curated examples of past analytical mistakes and corrections, formatted as "Situation → Error → Correction" rules. These are company-specific rule-based nudges derived from prediction tracking.

{few_shot_lessons}
Source: analyst_experience_rules table, filtered by ticker
3

Valuation Anchors

Provides the market anchors that the AI will deviate from. These are live-computed values:

  • NTM Consensus EPS: Next-Twelve-Months blended EPS using fiscal year weights — the industry standard anchor. Averaged from all bank reports in past 90 days, deduplicated per bank. {consensus_eps}
  • Street Average Target Price: Consensus of bank target prices {street_avg_tp}
  • Banks: Bullish/bearish count {num_banks_bullish} | {num_banks_bearish}

PE Context (critical distinction):

  • Forward PE (this week's anchor): = Price / NTM Consensus EPS. This is what pe_adjustment_pct adjusts from. {trailing_pe}
  • Trailing PE (TTM reference only): = Price / trailing 4-quarter actual EPS. Not used as the adjustment anchor. {ttm_eps}
  • Historical Valuation Context: Sub-sector PE ranges, historical median, analyst consensus PE {historical_valuation_context}
Source: load_street_consensus() + load_trailing_pe() from report_extractions + earnings_calendar tables
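The NTM blend can be sketched as follows. This is the standard fiscal-year-weighted construction, which may differ in detail from the production load_street_consensus() implementation; the helper name is illustrative:

```python
def ntm_eps(fy1_eps: float, fy2_eps: float, months_left_fy1: float) -> float:
    """Blend FY1/FY2 consensus EPS into a Next-Twelve-Months figure.

    months_left_fy1: months remaining in the current fiscal year (0-12).
    The NTM window covers the rest of FY1 plus the start of FY2.
    """
    w = months_left_fy1 / 12.0
    return w * fy1_eps + (1 - w) * fy2_eps

# Forward PE uses NTM consensus EPS; trailing PE uses 4Q actuals (reference only).
eps = ntm_eps(fy1_eps=4.00, fy2_eps=6.00, months_left_fy1=9)
forward_pe = 120.0 / eps
```

With 9 months of FY1 left, the blend is 0.75 × FY1 + 0.25 × FY2 = 4.50, so a $120 stock carries a forward PE of about 26.7x — the anchor that pe_adjustment_pct moves from.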
3b

Consensus Anchor (Percentile Context)

Shows where the current implied upside (consensus PT vs price) sits within this stock's own historical distribution — using p25, p50, p75 percentiles. Injected variables:

{consensus_anchor_pt} {consensus_anchor_dynamic_er_pct} {consensus_anchor_dynamic_er_percentile_pct} {consensus_anchor_posture} {consensus_anchor_takeaway}

Usage Rules (injected into prompt):

  • Percentile ≥ 80%: Street is already unusually optimistic for this stock → require specific NEW evidence before bullish deviation, and size it conservatively
  • Percentile ≤ 20%: Street is unusually conservative → if this week's evidence is stable-to-better, a modest bullish deviation is more defensible
  • Percentile 20–80%: Normal → rely mainly on this week's reports/news
  • Affects MAGNITUDE more than DIRECTION — doesn't flip signals, but sizes them
  • Primarily for pe_adjustment_pct, NOT eps_deviation_pct — do NOT move EPS solely because the percentile is high or low
Source: historical consensus data, percentile computed weekly from last N weeks of observations
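A minimal sketch of the percentile and posture computation, assuming a simple rank-based percentile over the stock's own history (the exact method and the helper name are assumptions):

```python
from bisect import bisect_left

def anchor_posture(current_er: float, history: list[float]) -> tuple[float, str]:
    """Percentile of current implied upside within this stock's own history."""
    ranked = sorted(history)
    pct = 100.0 * bisect_left(ranked, current_er) / len(ranked)
    if pct >= 80:
        label = "street already optimistic"   # require NEW evidence for bullish
    elif pct <= 20:
        label = "street conservative"         # modest bullish more defensible
    else:
        label = "normal"
    return pct, label

history = [2, 4, 5, 6, 8, 9, 10, 12, 15, 20]  # past implied-upside observations (%)
pct, label = anchor_posture(18.0, history)
```

An 18% implied upside lands at the 90th percentile of this illustrative history, so the posture label tells the model the street is already optimistic and bullish PE adjustments should be sized down.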
3c

Narrative Knowledge Base

Persistent, company-specific analyst memory. Contains crystallised knowledge about this company's business model, typical patterns, and recurring themes. Updated by the AI after each prediction cycle.

{narrative_kb_section}
Source: analyst_narrative_kb table, company-specific
3d

Narrative-Anchored Experiences

Past prediction mistakes grouped by narrative type (e.g. "AI demand narrative", "memory cycle recovery narrative"). Each entry shows: what narrative was used, what the prediction was, what actually happened, and what to adjust next time.

{narrative_experiences_section}
Source: analyst_experience_rules with narrative_type grouping
4

Research Reports — Analytical Views

Structured extraction of analyst research reports, split into two blocks:

  • "This Week" block: NEW reports published this week — these are the primary source for Step 0's analyst claim ledger. Each report is parsed by GPT-4.1 into: bank, EPS estimates, target price, rating, key thesis, catalysts, risks.
  • "Recent Context" block: Reports from the prior 3 weeks — background only. NOT used for analyst_claims extraction.
{report_summary}
Source: report_extractions (PDF → GPT-4.1 → structured JSON), load_week_reports()
5

Research Experience (Settlement Log)

Running log of past AI predictions vs actual outcomes, at the point when results were known. Includes: what was predicted, what happened, how large the error was, and whether the AI's reasoning held up. This creates a self-audit trail.

{experience_summary}
Source: analyst_experience_rules + journal settlement data
6

News Digest

Filtered company news for the current week. Raw news is first filtered by GPT-4.1 for relevance to this ticker's investment thesis. Only material news is passed to the main prompt. The total count and filtered count are both shown to the AI.

{news_summary}
Source: weekly_news_raw → filter_news() (GPT-4.1) → top-N relevant items
7

Price Context + Earnings Calendar

Current week's price data: open, close, weekly change, 4-week trend, 52-week high/low position. Also includes the upcoming earnings date if within 2 weeks, which triggers earnings-week discipline rules.

Source: stock_prices + earnings_calendar tables

Reasoning Framework — Step 0 + Q1–Q5 + 3 Calibration Steps

After the data sections, the prompt instructs the AI to follow a structured reasoning sequence before outputting any numbers.

Step 0: Analyst Claim Ledger (COMPLETE BEFORE Q1–Q5)

Before any reasoning, the AI must extract raw bank-by-bank evidence from Section 4's "This Week" reports into analyst_claims. This is separate from the narrative synthesis in Q1–Q5.

  • One entry per bank × narrative pair. If one bank addresses two distinct narratives, output two entries.
  • Each claim must be mapped to the closest narrative_id from Section 3c's active narratives.
  • analyst_claims may be [] ONLY if "This Week" shows zero new reports.
  • Do NOT bury bank evidence only inside narrative_reasoning — the full claim must appear in analyst_claims.
  • Self-check: Count distinct banks in "This Week". If count > 0, analyst_claims must be non-empty.
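The self-check reduces to a small validation over the parsed output. A sketch, with field names taken from the JSON structure described in this document and the helper name an assumption:

```python
def validate_analyst_claims(this_week_banks: set, analyst_claims: list) -> None:
    """Step 0 self-check: claims must be non-empty whenever new reports exist."""
    if this_week_banks and not analyst_claims:
        raise ValueError("'This Week' has reports but analyst_claims is empty")
    # One entry per bank x narrative pair: the same bank may appear more than
    # once, but every claiming bank must be in this week's report set.
    for claim in analyst_claims:
        if claim["bank"] not in this_week_banks:
            raise ValueError(f"claim from {claim['bank']} has no matching report")

claims = [
    {"bank": "BankA", "narrative_id": "ai_demand", "stance": "bullish", "key_claim": "..."},
    {"bank": "BankA", "narrative_id": "memory_cycle", "stance": "neutral", "key_claim": "..."},
]
validate_analyst_claims({"BankA"}, claims)   # passes: one bank, two narratives
```

Running this after JSON parsing catches the most common Step 0 failure — the model burying bank evidence inside narrative_reasoning and leaving analyst_claims empty.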

Q1: Which narratives are REINFORCED this week?

Review each active narrative from Section 3c. For each STABLE or WEAKENED narrative: does this week's data (reports, news) provide confirming evidence matching that narrative's confirm_signal? If yes, mark it as reinforced. Must cite which report/news said what — no vague claims.

Q2: Which narratives are CHALLENGED or WEAKENING?

For each narrative: does this week's data trigger the falsifier? Is a CYCLICAL or TACTICAL narrative fading due to lack of confirming evidence? A narrative that goes un-reinforced for multiple weeks should be flagged as weakening. Specific evidence required.

Q3: Is any STRUCTURAL narrative changing fundamentally?

STRUCTURAL narratives (competitive moat, technology position) rarely change. Only flag if there is multi-source evidence of a genuine structural shift — not a single week's data point.

Q4: What is the NET narrative direction this week?

Aggregate reinforced vs challenged narratives using weighted scoring:

Condition | net_direction
STRUCTURAL reinforced AND ≥50% of remaining reinforced | bullish
Reinforced count > 2× challenged | bullish
Reinforced count > challenged | lean_bullish
Reinforced ≈ challenged (within 1) | neutral
Challenged > reinforced | lean_bearish
Challenged > 2× reinforced | bearish

Weighting rules: STRUCTURAL reinforced = +1.5 weight. TACTICAL challenged = only +0.5 weight toward bearish. TACTICAL challenges should NOT drag direction bearish if STRUCTURAL + CYCLICAL are intact. Export controls/regulatory risks present for multiple weeks are considered "priced in" — only flag if NEW escalation.
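The weighted aggregation can be sketched as below. The weights come from the rules above; the tie-breaking and the STRUCTURAL-plus-majority row of the table are simplified, so treat this as an illustration rather than the production scoring:

```python
WEIGHTS = {
    ("STRUCTURAL", "reinforced"): 1.5,  # structural confirmation counts extra
    ("TACTICAL", "challenged"): 0.5,    # tactical challenges count less
}

def net_direction(reinforced: list, challenged: list) -> str:
    """Inputs: stability class per narrative (STRUCTURAL/CYCLICAL/TACTICAL)."""
    r = sum(WEIGHTS.get((s, "reinforced"), 1.0) for s in reinforced)
    c = sum(WEIGHTS.get((s, "challenged"), 1.0) for s in challenged)
    if abs(r - c) <= 1:
        return "neutral"
    if r > 2 * c:
        return "bullish"
    if c > 2 * r:
        return "bearish"
    return "lean_bullish" if r > c else "lean_bearish"

d = net_direction(["STRUCTURAL", "CYCLICAL"], ["TACTICAL"])
```

A reinforced STRUCTURAL plus CYCLICAL narrative (weight 2.5) against one challenged TACTICAL narrative (weight 0.5) scores bullish — the down-weighting keeps tactical noise from dragging an intact thesis bearish.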

Q5: How do active narratives translate to eps_deviation and pe_adjustment?

  • Narrative-driven EPS: If a CYCLICAL narrative about demand/pricing is reinforced, does it justify above-consensus EPS? By how much?
  • When net_direction = lean_bullish: eps_deviation should have a modest positive bias (+1% to +3%), NOT forced to zero just because it's not fully bullish
  • Narrative-driven PE: If STRUCTURAL narrative is intact and the stock trades below historical PE (per Section 3 PE Context), does narrative confidence justify PE expansion?
  • Consensus Anchor (Section 3b) modulates SIZE — high percentile → smaller magnitude even if bullish narrative

3 Calibration Steps (Before Final Output)

Step 1: EPS Calibration (anchor: consensus EPS)

  • Structural baseline: Check historical EPS beat/miss pattern. If company beats by avg X%, starting eps_deviation ≈ +X%
  • Narrative adjustment: If demand/pricing narrative is reinforced → add to baseline. If challenged → subtract.
  • Compute final eps_deviation_pct as NET adjustment vs consensus (±15% max)
  • Do NOT move EPS based on PE/valuation signals — EPS requires earnings or guidance evidence

Step 2: PE Calibration (anchor: forward PE)

  • Historical PE range: Is current PE below historical median? → positive adjustment likely justified
  • Analyst consensus gap: Street analysts price at consensus_pe. Market at forward PE. Intact narrative → convergence warranted.
  • Consensus Anchor percentile (Section 3b): High percentile → smaller positive adjustment
  • Compute final pe_adjustment_pct FROM forward PE anchor (±20% max)

Step 3: Uncertainty Check

  • Wide earnings spread → eps_deviation closer to 0
  • Narrative directly contradicted by multiple independent sources → revise, don't just flag
  • Genuinely uncertain → output 0% (valid and expected answer)

LLM Output JSON Structure

The AI outputs a structured JSON object. Signal and conviction are NOT in this output — they are computed by code from the numeric deviations.

// LLM output JSON (signal/conviction computed by code afterward)
{
  "analyst_claims": [{ bank, narrative_id, stance, key_claim }], // Step 0: per-bank × narrative

  "narrative_reasoning": {
    "reinforced": ["narrative_id: why"],
    "challenged": ["narrative_id: why"],
    "structural_shift": "none|flag",
    "net_direction": "bullish|lean_bullish|neutral|lean_bearish|bearish|mixed",
    "narrative_summary": "2-3 sentence thesis"
  },

  "conflict_arbitration": {
    "stance": "lean_bullish|lean_bearish|neutral|mixed",
    "key_conflicts": ["..."],
    "resolution": "how resolved"
  },

  "eps_integration": {
    "bear_eps": number, "base_eps": number, "bull_eps": number,
    "bear_prob": number, "base_prob": number, "bull_prob": number,
    "weighted_eps_deviation_pct": number
  }, // 3-scenario probabilistic EPS

  "pe_assessment": {
    "fundamental_pe_range": [low, high],
    "sentiment_adjustment_pct": number,
    "final_pe_adjustment_pct": number,
    "reasoning": "string"
  }, // structured PE reasoning

  "eps_deviation_pct": 5.2, // KEY OUTPUT: ±15% cap
  "pe_adjustment_pct": -3.0, // KEY OUTPUT: ±20% cap
  "eps_deviation_reason": "Short text justification...",
  "pe_adjustment_reason": "Short text justification...",
  "weekly_summary": "2-3 sentence summary for UI",
  "key_insight": "Single most important takeaway",
  "memory_update": "Key info to carry forward" // feeds AI Learning (narrative KB)
}
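The eps_integration block is a probability-weighted blend of three scenarios. A sketch of how weighted_eps_deviation_pct could follow from it — the exact production formula is not shown in this document, so this is an assumed reading:

```python
def weighted_eps_deviation_pct(consensus_eps: float, s: dict) -> float:
    """Blend bear/base/bull EPS scenarios into one deviation vs consensus."""
    probs = (s["bear_prob"], s["base_prob"], s["bull_prob"])
    assert abs(sum(probs) - 1.0) < 1e-9, "scenario probabilities must sum to 1"
    expected_eps = (s["bear_eps"] * probs[0]
                    + s["base_eps"] * probs[1]
                    + s["bull_eps"] * probs[2])
    dev = (expected_eps / consensus_eps - 1) * 100
    return max(-15.0, min(15.0, dev))   # same ±15% cap as eps_deviation_pct

dev = weighted_eps_deviation_pct(
    5.00, {"bear_eps": 4.50, "base_eps": 5.00, "bull_eps": 5.80,
           "bear_prob": 0.2, "base_prob": 0.5, "bull_prob": 0.3})
```

Here the probability-weighted EPS is 5.14 vs a 5.00 consensus, a +2.8% deviation — forcing the model to commit scenario probabilities keeps the headline number from being an unanchored guess.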

Signal Derivation (Code, Not LLM)

Signal (strong_buy / buy / hold / sell / strong_sell) is computed by code based on eps_deviation_pct and pe_adjustment_pct thresholds. Conviction (high/medium/low) is derived from the magnitude of the combined deviation. This ensures consistent, rule-based signal generation that cannot be "talked into" by narrative.

AI Learning System — Three Layers of Memory

The system maintains three distinct memory layers that persist across weeks, allowing the AI to improve its analytical accuracy over time without retraining the model.

Layer 1: Narrative Knowledge Base (KB)

Company-specific, persistent knowledge that accumulates over time. Contains crystallised insights about a company's business model, typical PE ranges by cycle phase, management communication patterns, and recurring analytical pitfalls.

Updated by: LLM's memory_update output field after each prediction

Injected via: Section 3c of weekly delta prompt

analyst_narrative_kb

Layer 2: Prediction Experience Rules

When a prediction outcome is known (week N+1 actual return available), the system runs a settlement step: compare prediction vs outcome, classify error type, and write a corrective rule. These rules are formatted as few-shot examples for future prompts.

Updated by: Settlement job (runs after results available)

Injected via: Section 2 (few-shot lessons) + Section 5 (experience summary)

analyst_experience_rules

Layer 3: Narrative-Anchored Experiences

Same as Layer 2 but grouped by narrative type. If the AI used "AI demand narrative" to justify a bullish EPS call 5 times and was wrong 4 times, Section 3d will explicitly show this pattern, forcing the AI to discount that narrative class.

Updated by: Same settlement job, with narrative_type tagging

Injected via: Section 3d of weekly delta prompt

analyst_experience_rules (narrative_type)

Weekly Learning Cycle — Predict → Observe → Settle → Update

Week N: PredictAI outputs eps_deviation_pct
pe_adjustment_pct + memory_update
Week N+1: ObserveActual price return known
weekly_return filled in
Settlement JobCompare prediction vs actual
Classify error type
Memory UpdateWrite corrective rules
Update narrative KB
Week N+2: Next PredictionMemory + lessons + narrative experiences all injected → AI "remembers" past mistakes

Settlement Process — What Happens When Results Arrive

1. Outcome Detection

When Week N+1's week_close_price and weekly_return are filled (by the backfill job), the system knows the trade week result for Week N's signals.

2. Error Classification

Compare prediction direction (bullish/bearish/neutral) vs actual return direction (up/down). Classify: correct, directionally wrong, or magnitude error. Record error_magnitude.

3. Rule Generation

For significant errors, generate a corrective rule in natural language. Example: "When NVDA quarterly guidance is in-line but forward PE is above 45x, avoid PE expansion assumptions — the market typically doesn't re-rate further."

4. Memory Update Application

The LLM's own memory_update field from Week N's output is applied to the narrative KB. This allows the AI to proactively update its knowledge from new information, not just from errors.
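Steps 1 and 2 of the settlement process amount to a small classifier. A sketch, with the magnitude tolerance as an illustrative assumption:

```python
def classify_outcome(predicted_er_pct: float, actual_return_pct: float,
                     magnitude_tol: float = 5.0) -> dict:
    """Compare a settled prediction against the realised weekly return."""
    pred_dir = ("bullish" if predicted_er_pct > 0
                else "bearish" if predicted_er_pct < 0 else "neutral")
    actual_dir = ("up" if actual_return_pct > 0
                  else "down" if actual_return_pct < 0 else "flat")
    error_magnitude = abs(predicted_er_pct - actual_return_pct)

    same_direction = (pred_dir, actual_dir) in {
        ("bullish", "up"), ("bearish", "down"), ("neutral", "flat")}
    if not same_direction:
        error_type = "directionally_wrong"
    elif error_magnitude > magnitude_tol:
        error_type = "magnitude_error"
    else:
        error_type = "correct"
    return {"error_type": error_type, "error_magnitude": error_magnitude}

result = classify_outcome(predicted_er_pct=8.0, actual_return_pct=-3.0)
```

A bullish +8% call against a −3% actual is classified directionally_wrong with an 11-point error — exactly the kind of record that Step 3 turns into a corrective rule.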

What the AI Learns Over Time

Pattern Recognition by Narrative

Over 57+ weeks, the system accumulates patterns like: "Bullish AI demand narratives for NVDA tend to over-predict EPS by +8% on average. Discount AI-demand-driven EPS calls by at least half."

Company-Specific PE Behavior

The KB learns that NVDA's PE compresses aggressively when guidance disappoints, while TXN's PE is historically stable ±5% even in weak quarters. Each company gets custom calibration.

Cross-Company Signal Reliability

The system tracks whether cross-company signals (TSMC beat → NVDA bullish) actually predicted well historically. Unreliable cross-company signals are downweighted in Section 5 experience rules.

Earnings Week Discipline

The system learns to reduce deviation confidence in the week before earnings (high uncertainty → default to 0/0). This prevents over-confident pre-earnings calls that have historically been wrong.

16 Portfolio Strategies — Backtest Universe

All strategies are evaluated simultaneously on the same 15-ticker universe, same weekly signals, SHIFT=1 methodology. No strategy has access to future data. Performance is based on 55+ weeks of live signals starting March 2025.

AI Signal Strategies

Top 5 AI Primary Strategy

Select: Top 5 tickers by adjusted_er (Conviction-adjusted Expected Return)
Weight: Equal — each selected ticker = 20%
Condition: Only include if adjusted_er > 0 (positive upside required)
If <5 positive ER tickers: hold fewer positions; if 0: go to cash

The primary strategy the investment process is built around. adjusted_er = expected_return × conviction_multiplier × risk_multiplier. Because conviction and risk multipliers feed the ranking, higher-confidence calls are more likely to make the top 5, while equal weighting avoids extreme concentration.

Benchmark for all other strategy comparisons
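A minimal sketch of the selection rule above, assuming the "hold fewer positions" case keeps 20% per name with the remainder in cash (one reading of the rule, not the production code):

```python
def top5_weights(adjusted_er: dict[str, float]) -> dict[str, float]:
    """Top 5 tickers by adjusted_er, equal-weighted at 20% each.
    Fewer than 5 positive-ER tickers -> fewer positions; none -> all cash."""
    ranked = sorted((t for t, er in adjusted_er.items() if er > 0),
                    key=lambda t: adjusted_er[t], reverse=True)
    return {t: 0.20 for t in ranked[:5]}   # empty dict = 100% cash
```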

Signal Weighted

Universe: Only tickers with signal = buy or strong_buy
Weight ∝ signal_score (strong_buy > buy, high conviction > medium > low)
Normalize: weights sum to 1.0 across selected tickers

Proportional to signal strength. Avoids over-weighting marginal buy signals. Goes to cash if no buy signals exist.

ER Equal Weight

Universe: All tickers with expected_return > 0
Weight: 1 / N_positive — equal weight among positive ER tickers
Condition: ER must be > 0; hold-signal tickers excluded

Broader diversification than Top 5 — all positive-ER tickers equally weighted. Lower concentration risk, smoother returns.

ER Proportional Best Performer

Universe: All tickers with expected_return > 0
Weight ∝ expected_return (higher ER = higher allocation)
w(t) = ER(t) / Σ ER(all positive tickers)

The consistently best-performing strategy. Higher-upside signals get proportionally larger allocations, creating natural concentration toward the strongest calls without arbitrary cutoffs.

Historically highest total return
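The w(t) = ER(t) / Σ ER rule above can be sketched directly (illustrative, not the production code):

```python
def er_proportional_weights(expected_return: dict[str, float]) -> dict[str, float]:
    """Weight each positive-ER ticker proportionally to its expected return."""
    positive = {t: er for t, er in expected_return.items() if er > 0}
    total = sum(positive.values())
    return {t: er / total for t, er in positive.items()} if total else {}
```

A ticker with twice the expected return of another gets twice the allocation; negative-ER names drop out entirely.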

ER Prop + 20% Cap

Same as ER Proportional, then:
Cap any single ticker at 20% maximum weight
Excess weight redistributed proportionally to remaining positions

Adds a concentration guard. Prevents a single dominant call (e.g. NVDA at 80%) from making the portfolio too dependent on one name.
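The cap-and-redistribute step needs a loop, because redistributed excess can push another name over the cap. A sketch under the assumption that, when every name ends up capped, the residual stays in cash:

```python
def cap_weights(weights: dict[str, float], cap: float = 0.20,
                tol: float = 1e-12) -> dict[str, float]:
    """Cap each ticker at `cap`, redistributing excess proportionally
    among uncapped names until no name exceeds the cap."""
    w = dict(weights)
    capped: set[str] = set()
    while True:
        over = [t for t in w if t not in capped and w[t] > cap + tol]
        if not over:
            return w
        excess = sum(w[t] - cap for t in over)
        for t in over:
            w[t] = cap
            capped.add(t)
        free = {t: w[t] for t in w if t not in capped}
        free_total = sum(free.values())
        if free_total <= tol:
            return w  # everything capped; residual weight stays in cash
        for t in free:
            w[t] += excess * w[t] / free_total
```

For example, an 80% NVDA weight is cut to 20% and the 60% excess flows to the remaining names in proportion to their existing weights.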

ER Prop + 7% Hurdle

Universe: Only tickers with expected_return ≥ 7%
Weight ∝ expected_return (same as ER Prop)
Logic: <7% upside = "already priced in", not worth holding

Quality filter on top of ER Prop. Removes low-conviction marginal buys. Typical effect: fewer positions but higher average conviction.

ER Prop + NASDAQ Guard

Base: ER Proportional weights
Guard: If sector average vs_nasdaq_4w < 0 (sector lagging NASDAQ for 4 weeks):
  → halve all weights (50% cash buffer)
Rationale: reduce exposure in sector downtrends

Regime-aware strategy. Adds a defensive half-position when the semiconductor sector has been collectively underperforming the NASDAQ for 4 consecutive weeks, signaling potential sector headwinds.

Quality-Adjusted ER (QA-ER)

adj_er = expected_return × conviction_mult × regime_mult × data_quality_mult
Weight ∝ adj_er (only positive adj_er tickers)
Cap: 20% per ticker
conviction_mult: high=1.2, medium=1.0, low=0.8

Most sophisticated AI-based weighting. Incorporates conviction quality, sector regime (bull/neutral/bear), and data quality (how many reports available). Positions with thin data or low conviction are systematically downweighted.

Bloomberg / Bank Consensus Strategies (Benchmarks)

Bank Consensus

Weight ∝ bullish_ratio = num_banks_bullish / (bullish + bearish)
Higher bullish ratio = higher allocation
Default ratio = 0.5 (neutral) if no data

Uses the internal bank report data. Tickers with more bullish than bearish bank views get higher weight. Pure consensus signal with no AI adjustment.

Consensus baseline

BBG Top 5

Select: Top 5 tickers by Bloomberg consensus upside
upside = (BBG_target_price / current_price − 1) × 100
Weight: Equal — each = 20%
Source: bloomberg_consensus table (NOT in pipeline — comparison only)

Mimics AI Top 5 strategy but using Bloomberg sell-side consensus target prices. Used as a baseline to compare AI performance vs professional street consensus.

Street consensus comparison

BBG Proportional

Universe: All tickers with BBG upside > 0
Weight ∝ Bloomberg consensus upside
Source: bloomberg_consensus table

Bloomberg's equivalent of ER Proportional. Direct apples-to-apples comparison of AI's proportional allocation vs Bloomberg consensus allocation.

BBG Equal Weight

Universe: All tickers where BBG consensus target > current price
Weight: 1 / N_positive (equal weight)
Source: bloomberg_consensus table

Bloomberg's equivalent of ER Equal Weight. Holds any stock with positive Bloomberg upside equally.

Passive / Benchmark Strategies

Equal Weight (Passive)

Weight: 1/15 = 6.67% for all 15 tickers, every week
Rebalanced weekly back to equal weight
No signals used

Pure passive benchmark within the semiconductor universe. Eliminates all stock selection — any outperformance vs this baseline reflects the value of the AI signals.

Buy & Hold

Start: Equal weight (1/15 each)
Weights drift with actual price changes each week
No rebalancing — position sizes reflect cumulative price performance

Simulates holding the initial 15-stock basket without rebalancing. Winners naturally grow larger. Good for measuring how the universe performs in a "set and forget" approach.
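The weekly weight drift in Buy & Hold follows mechanically from each name's return (a sketch; the production backtest tracks dollar values rather than renormalized weights, which is equivalent):

```python
def drift_weights(weights: dict[str, float],
                  weekly_return_pct: dict[str, float]) -> dict[str, float]:
    """No rebalancing: each weight grows by its own return, then weights
    are renormalized so they still sum to 1."""
    grown = {t: w * (1 + weekly_return_pct[t] / 100) for t, w in weights.items()}
    total = sum(grown.values())
    return {t: g / total for t, g in grown.items()}
```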

Market Cap Weighted

Weight: Fixed proportional to initial market capitalization
Rebalanced back to market-cap weights each week
NVDA typically dominates (~30-40% weight)

Simulates an index-like approach within the semiconductor universe. Naturally overweights NVDA, TSM, QCOM. Good for checking whether AI adds value vs passive market-cap exposure.

MVO (Mean-Variance Optimization)

Expected returns (μ): from AI expected_return per ticker
Covariance matrix (Σ): estimated from 52-week price history with Ledoit-Wolf shrinkage
Objective: Maximize Sharpe Ratio (PyPortfolioOpt)
Constraint: long-only, weights sum to 1

Quantitative portfolio construction using Modern Portfolio Theory. Uses AI signals as return inputs but optimizes the portfolio for risk-adjusted returns using historical covariance. Compare to see if quant optimization adds value vs simpler ER-proportional allocation.
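The long-only max-Sharpe optimization can be sketched with scipy as a minimal stand-in for the PyPortfolioOpt call described above (same objective and constraints: rf = 0, long-only, weights sum to 1; the Ledoit-Wolf shrinkage step is assumed to have already produced `cov`):

```python
import numpy as np
from scipy.optimize import minimize

def max_sharpe_weights(mu: np.ndarray, cov: np.ndarray) -> np.ndarray:
    """Maximize mu'w / sqrt(w' cov w) subject to w >= 0, sum(w) = 1."""
    n = len(mu)

    def neg_sharpe(w):
        return -(w @ mu) / np.sqrt(w @ cov @ w)

    res = minimize(neg_sharpe, np.full(n, 1 / n),
                   bounds=[(0, 1)] * n,
                   constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1}])
    return res.x
```

With an identity covariance the tangency portfolio is simply proportional to the expected returns, so the higher-ER asset dominates.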

Backtest Design Principles

The backtest is designed to be as realistic as possible for a long-only weekly rebalancing strategy. All methodology decisions are conservative — we prefer to undercount performance rather than overcount.

SHIFT=1: Zero Look-Ahead Bias

Signal from Week N is applied to trade Week N+1. The pipeline runs after Friday close (Monday 05:00 UTC), after Week N prices are final. No future price information is ever used in signal generation.

This is the most important design decision.

Realistic Entry Price

entry_price = Monday open of the trade week. This is the actual price at which a trader would have executed. We do not use week N's Friday close as the entry — that would be slightly optimistic.

Weekly Return Definition

weekly_return = (this week's Friday close / previous Friday close − 1) × 100

This is close-to-close, not open-to-close. The "actual return" in the UI uses this for the benchmark comparison. Trade return (open-to-close) is separately tracked via entry_price vs week_close_price.

No Transaction Costs

The backtest does not subtract transaction costs, slippage, or market impact. For a 15-ticker weekly-rebalancing strategy with institutional-size trades, these are non-trivial. Actual returns would be lower by approximately 10–30bps per week depending on implementation.

Return Calculation — Exact Formulas

Weekly Portfolio Return

R_portfolio(week) =
  Σ_{ticker} weight(ticker) × weekly_return(ticker)

weekly_return(ticker, week N+1) =
  (close_price[N+1_Friday] / close_price[N_Friday] − 1) × 100

Cumulative Return

cumulative = 1.0 # start at $1
For each week:
  cumulative *= (1 + R_portfolio / 100)

Total Return % = (cumulative − 1) × 100

Sharpe Ratio (Annualized)

Sharpe = (avg_weekly_return / std_weekly_return)
          × sqrt(52)

Risk-free rate: 0 (simplification)
Uses all completed trade weeks
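The cumulative-return and Sharpe formulas above compose into a short runnable sketch (population standard deviation is assumed here; the production code may use the sample version):

```python
import math

def backtest_stats(weekly_returns_pct: list[float]) -> tuple[float, float]:
    """Compound weekly % returns into total return %, and annualize the
    Sharpe ratio with sqrt(52) at a zero risk-free rate."""
    cumulative = 1.0
    for r in weekly_returns_pct:
        cumulative *= 1 + r / 100
    total_return_pct = (cumulative - 1) * 100

    n = len(weekly_returns_pct)
    mean = sum(weekly_returns_pct) / n
    var = sum((r - mean) ** 2 for r in weekly_returns_pct) / n
    sharpe = mean / math.sqrt(var) * math.sqrt(52)
    return total_return_pct, sharpe
```

For example, weeks of +10% and -5% compound to 1.10 × 0.95 = 1.045, i.e. +4.5% total, not +5%.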

QQQ Benchmark

QQQ weekly return =
  (QQQ_trade_week_close / QQQ_trade_week_open) − 1

Alpha = portfolio_cumulative − QQQ_cumulative
Note: QQQ uses open→close (same as trade week)

Data Integrity Safeguards

Auto-Backfill of Missing Prices

When the pipeline runs, it first checks for any past weeks where week_close_price or weekly_return is NULL (e.g. from a previous pipeline run that couldn't fetch prices). These gaps are automatically filled from stock_prices before computing the current week's signals.

backfill_missing_prices()

Accuracy Backfill (Weekly)

After each pipeline run, backfill_accuracy computes prediction accuracy metrics: did the signal direction match the actual return direction? Win rate, error magnitude, and EPS accuracy vs actual reported EPS are all tracked.

backfill_accuracy.py

Current Week Preview

The most recent signal week is shown in the UI even before its trade week closes. It appears as a "Preview" row with 0 return placeholders, so portfolio managers can see current holdings and signals. The preview row is automatically replaced once real returns are available.

is_preview: true in snapshot

Snapshot Freshness

The backtest snapshot (weekly_portfolio_snapshot) is refreshed every Monday after the pipeline runs. Any missing price data that was later filled by the data provider is automatically incorporated in the next snapshot refresh.

weekly_portfolio_snapshot

Strategy Performance Summary (As of Last Backtest Run)

Performance based on approximately 55 completed trade weeks starting March 2025. All strategies use SHIFT=1 methodology on the same 15-ticker universe.

Strategy | Signal Source | Allocation Logic | Key Characteristic
ER Proportional | AI Expected Return | Weight ∝ ER (positive only) | Historically strongest total return
QA-ER | AI (conviction-adjusted) | Weight ∝ adj_er, 20% cap | Best risk-adjusted return
ER Prop + Hurdle | AI Expected Return | Weight ∝ ER, ER ≥ 7% only | Fewer but higher-quality positions
Top 5 AI | AI adjusted_er | Equal 20% each, top 5 | Primary display strategy
ER Equal Weight | AI Expected Return | Equal weight, ER > 0 universe | Broader diversification
Signal Weighted | AI signal score | Weight ∝ signal strength | Buy/strong_buy only
BBG Proportional | Bloomberg consensus | Weight ∝ BBG upside | Street consensus benchmark
BBG Top 5 | Bloomberg consensus | Equal 20%, top 5 BBG upside | AI vs BBG comparison
Bank Consensus | PDF-extracted bank ratings | Weight ∝ bullish ratio | Internal consensus baseline
Equal Weight | None (passive) | 6.67% each, 15 tickers | Passive benchmark
Market Cap Weighted | None (passive) | Proportional to market cap | Index-like benchmark
QQQ | None | Single ETF | Market benchmark

Important Caveat

Past performance over ~55 weeks is a short track record. The strategy has operated through one semiconductor sector cycle (2025 up-cycle). Performance in a sustained downturn or high-volatility macro environment has not been fully stress-tested. Transaction cost assumptions are zero — real implementation returns will be lower.

Thematic Purity V3 — Revenue Disaggregation + AI Alignment Pipeline

Based on MSCI/Bloomberg methodology + ASC 606/IFRS 15 revenue disaggregation + AI-driven alignment scoring. Data is theme-agnostic (extract once, score for any theme). → View Live Dashboard

84+ Companies · 625+ Sub-Segments · 84 Business Types · 6 Pipeline Steps · 3 Themes · ~$0.80 Total AI Cost

Phase 1 Data Extraction — Theme-agnostic · Run once · All themes reuse

STEP 1

Segment Extraction

ASC 280 / IFRS 8
  • Annual report PDF → Gemini 2.5 Flash
  • Extract L1 reportable segments
  • Revenue %, amounts, description
  • Key products & end markets
Output Tables
annual_report_extractions
extracted_segments_v2
🤖 Gemini Flash × N
STEP 2 ⭐

Revenue Disaggregation

ASC 606 / IFRS 15
  • Same PDF, second Gemini call
  • End market breakdown
  • Product type breakdown
  • Geography breakdown
  • Cross-check with L1 (±5%)
Output Table
extracted_sub_segments
🤖 Gemini Flash × N
STEP 2.1 🆕

Financial Notes Extraction

ASC 360 / ASC 718 / ASC 350 / IAS 38
  • Same PDF, third Gemini call
  • PP&E breakdown — land, buildings, equipment, CIP
  • SBC allocation — R&D vs Sales vs G&A
  • Intangible assets — patents, technology, goodwill, customer relationships
  • Net amounts, useful lives, validation status
Output Tables
extracted_ppe_breakdown
extracted_sbc_allocation
extracted_intangibles_breakdown
🤖 Gemini Flash × N
STEP 3

Business Type Normalization

GICS-like classification
  • Classify L2 sub-segments (priority)
  • Fallback to L1 if no L2 data
  • Standard industry categories
  • e.g. "AI & HPC Semiconductors"
Output
business_type field
🤖 GPT × 1 call

Phase 2 Theme Scoring — Theme-specific · Per theme · AI-driven alignment

STEP 4 ⭐ CHANGED

AI Per-Segment Alignment Scoring

Gemini 2.5 Flash · Per company · With financial context
  • NEW: AI scores EACH segment's alignment (0-100%)
  • Input: segments + business_types + financial notes context (PP&E, SBC, Intangibles)
  • AI determines: layer (CORE/ENABLER/ADJACENT) + alignment_pct (0-100)
  • Replaces fixed factors (1.0/0.5/0.25) with granular per-segment scoring
  • Financial notes provide context for alignment judgment
Output
ai_layer + alignment_pct per segment
stored on extracted_segments_v2 / extracted_sub_segments
🤖 Gemini Flash × N companies
STEP 5 CHANGED

Score Calculation

Mechanical · Uses AI alignment_pct
  • If L2 exists → use L2 granularity
  • If no L2 → fallback to L1
  • NEW formula: Score = Σ(revenue_pct × AI alignment_pct / 100)
  • OLD formula: Score = Σ(revenue_pct × fixed layer_factor)
  • Score range: 0~100
  • MSCI comparison (human_score) for gap analysis
Output Table
thematic_scores (ai_score + human_score + gap)
🟢 Mechanical (post AI alignment)

Scoring Formula — V3 AI Alignment

Thematic_Score(company, theme) =
  Σ_segment ( revenue_pct × AI_alignment_pct / 100 )

Where: alignment_pct = AI judgment per segment (0-100), not fixed factors
Example: NVDA — "Data Center" 89.7% × 95/100 + "Auto" 1.1% × 85/100 + "Gaming" 7.4% × 0/100 = 86.2
Score 0 = zero connection, Score 100 = pure-play
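The formula and the NVDA example above reduce to a one-line sum (a sketch; segment inputs are taken verbatim from the example):

```python
def thematic_score(segments: list[tuple[float, float]]) -> float:
    """Score = sum of revenue_pct x alignment_pct / 100 over segments."""
    return sum(rev * align / 100 for rev, align in segments)

# NVDA example from the text (revenue %, AI alignment %):
nvda = [(89.7, 95), (1.1, 85), (7.4, 0)]  # Data Center, Auto, Gaming
```

thematic_score(nvda) gives 86.15, which displays as the 86.2 in the example above after rounding.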

V2 → V3 Change

V2 (old): Fixed factors per layer — CORE=1.0, ENABLER=0.5, ADJACENT=0.25. Same factor for all segments in a layer.
V3 (new): AI scores each segment individually with alignment_pct (0-100%). A CORE segment might get 95% or 70% depending on actual relevance. Financial notes (PP&E, SBC, Intangibles) provide additional context for AI judgment.

Financial Evidence Integration (Step 2.1)

Financial notes data provides supporting evidence for thematic alignment. Extracted from annual report and displayed as insight cards on the company detail page.

PP&E — Capital Investment

ASC 360
  • Land, buildings, equipment, CIP
  • Gross & net amounts
  • Auto-insight: manufacturing vs fabless
UI Dimension
Capital Commitment KPI card

SBC — Talent Allocation

ASC 718
  • R&D vs Sales vs G&A breakdown
  • % allocation per department
  • Auto-insight: innovation vs commercial culture
UI Dimension
R&D Intensity KPI card

Intangibles — IP Moat

ASC 350 / IAS 38
  • Patents, technology, goodwill
  • Customer relationships, licenses
  • Auto-insight: organic IP vs acquisition growth
UI Dimension
IP Portfolio KPI card

Database Schema

annual_report_extractions

security, fiscal_year, report_url,
segment_classification, total_revenue,
extraction_model, extracted_at

extracted_segments_v2

extraction_id → FK,
segment_name, revenue_pct,
revenue_amount, description,
business_type, ai_layer, alignment_pct

extracted_sub_segments

extraction_id → FK, parent_segment,
sub_segment_name, revenue_pct,
breakdown_type, business_type,
ai_layer, alignment_pct

thematic_scores

security, run_id → FK,
ai_score, human_score,
scoring_granularity (L1/L2),
scoring_rationale, status

Primary Data Sources — Weekly Prediction Pipeline

All data used in the weekly signal pipeline is pulled from the DEV database or collected by automated ETL scripts. Bloomberg data is used only for the backtest comparison UI, not in signal generation.

Table / Source | Content | Used In | Update Frequency
report_extractions | Structured extraction of research report PDFs: bank name, EPS estimates, target prices, ratings, key thesis, catalysts, risks | Street consensus anchors, report_summary section, analyst claim ledger | Triggered when new PDF added
weekly_news_raw | Raw news articles collected by news_collector ETL. Company + sector news. Fields: headline, body, source, published_at, ticker | News filter (GPT-4.1) → news_summary injected in Section 6 | Weekly (Monday cron, Step C)
stock_prices | Daily OHLCV prices for all 15 tickers + QQQ. Fields: ticker, date, open, high, low, close, volume | Price context, entry_price, week_close_price, weekly_return, 4-week trend, 52-week range | Daily (data provider feed)
earnings_calendar | Upcoming earnings dates, analyst EPS estimates, actual EPS results, EPS surprise %. Fields: ticker, report_date, eps_estimate, eps_actual | Earnings week detection, EPS surprise backfill, earnings-week discipline | Updated as earnings reported
weekly_analyst_journal | All pipeline outputs per ticker per week: signal, conviction, eps/pe deviations, expected_return, summary, week_close_price, weekly_return, prediction accuracy | Historical context (Section 1), experience rules, backtest engine | Weekly (pipeline output)
analyst_narrative_kb | Persistent company-specific knowledge: business model insights, PE ranges, management patterns, recurring analytical errors | Section 3c (narrative KB injection) | Updated after each pipeline run via memory_update
analyst_experience_rules | Corrective rules generated from past prediction errors. Fields: ticker, narrative_type, situation, error, correction, created_at | Section 2 (few-shot lessons), Section 3d (narrative experiences), Section 5 (experience summary) | Weekly (settlement job post-result)
sector_intel (ETL) | Weekly sector-level macro summary: SOX trend, demand signals, supply chain updates, cross-company signals | Available as context for Q5 (cross-company signals) | Weekly (Step B)
bloomberg_consensus | Bloomberg consensus target prices and ratings for all 15 tickers. Used ONLY for comparison — never in signal generation | BBG strategy benchmarks in backtest UI only | Weekly (separate Bloomberg feed)

Data Flow — From Source to Signal

Research Reports (PDF)

Bank research PDFs are uploaded to the system. A GPT-4.1 extraction job parses each PDF into structured JSON: analyst name, bank, date, EPS estimates (FY1/FY2/FY3), target price, rating, key thesis, catalysts, and risks. Each report is stored in report_extractions. The pipeline uses the past 90 days of reports per ticker, deduplicated to latest per bank.

News (Company + Sector)

Company news is collected weekly for all 15 tickers. A two-stage process: (1) bulk collection of all relevant news from the past week, (2) GPT-4.1 relevance filter that scores each article for relevance to the ticker's investment thesis. Only materially relevant articles pass through to the weekly prompt.

Price Data

Daily stock prices from a market data provider. Used for: week open/close (entry/exit prices), 4-week price trend, 52-week position (range context), QQQ benchmark, and historical covariance estimation for the MVO strategy. Backfill jobs automatically detect and fill gaps when data is delayed.

Earnings Calendar

Earnings dates are pre-loaded and updated when actuals are reported. The pipeline checks earnings_calendar to determine if the upcoming week is an earnings week. If so, the AI's discipline rules apply: higher uncertainty → default toward 0/0 deviations unless there is very strong conviction from the claim ledger.

What Bloomberg Data Is NOT Used For

Bloomberg Is Comparison-Only

Bloomberg consensus target prices and ratings are never injected into the weekly prediction prompt. They are available in the bloomberg_consensus table and are LEFT JOIN'd into the backtest API purely for the purpose of showing AI performance vs street consensus in the UI.

Street consensus in the prompt (Section 3) = average of internal PDF-extracted bank reports, not Bloomberg. This means the AI's valuation anchors are derived from the same research reports that investors read, not from a black-box Bloomberg aggregate.