Document Analysis
SEC Filing Analyzer
Compare risk factors across companies and years—hours of review compressed to minutes.
The Business Case
Equity research analysts covering 15-20 companies face a scaling problem during earnings season: reading every 10-K filing (150+ pages each), comparing year-over-year risk factors, and flagging material changes.
The manual process involves hours of Ctrl+F and scrolling. The opportunity cost is high—time spent validating text is time taken away from thesis generation.
The Solution
A targeted compliance engine that answers: “What material changes occurred in risk disclosures since last year?”
It surfaces changes with citations for verification. The mantra: “False positives are fine. False negatives are not. When in doubt, show me more.”
Key Strategic Decisions
1. Comparison-aware search instead of structural diff
The client initially wanted a side-by-side document comparison: the old version on the left, the new version on the right. I explored this, then realized the approach would be misleading: AI finds conceptually similar passages, not structurally aligned ones. When companies reorganize their risk factors year over year, matching by similarity results in false “MODIFIED” tags.
The solution was to search both years, let the AI synthesize what actually changed, and cite everything so the analyst can verify. Same outcome, more honest about what the system can and can’t do.
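The approach can be sketched as follows. This is a minimal illustration, not the production code: all function and field names are hypothetical, and a toy keyword-overlap score stands in for embedding-based retrieval.

```python
# Comparison-aware search sketch: retrieve passages from each fiscal year
# independently, attach a citation to every excerpt, and hand both sets to
# an LLM for synthesis (prompt not shown). Keyword overlap is a stand-in
# for embedding similarity.

def score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query terms present in the text."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms)

def search_year(query, chunks, year, k=3, min_score=0.25):
    """Return the top-k chunks for one fiscal year, each with a citation."""
    hits = [
        {
            "excerpt": c["text"],
            "citation": f"{c['company']} 10-K {year}, p.{c['page']}",
            "score": score(query, c["text"]),
        }
        for c in chunks
        if c["year"] == year
    ]
    hits = [h for h in hits if h["score"] >= min_score]
    return sorted(hits, key=lambda h: h["score"], reverse=True)[:k]

def compare_years(query, chunks, old_year, new_year):
    """Search both years separately rather than diffing documents; the
    synthesis step must cite every claim back to these excerpts."""
    return {
        old_year: search_year(query, chunks, old_year),
        new_year: search_year(query, chunks, new_year),
    }

chunks = [
    {"company": "META", "year": 2023, "page": 14,
     "text": "Competition in artificial intelligence could harm our business"},
    {"company": "META", "year": 2024, "page": 12,
     "text": "Evolving artificial intelligence regulation may impose significant compliance costs"},
]
result = compare_years("artificial intelligence regulation", chunks, 2023, 2024)
```

Because each year is searched on its own terms, nothing is forced into a misleading "MODIFIED" pairing; the synthesis layer describes the change and the citations let the analyst verify it.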
2. Hit counts as a trend signal
A persistent challenge in trend analysis across 10-Ks is parsing natural language to see how attention to a topic shifts over time and across companies. The product makes this legible by displaying hit counts as a trend signal ahead of the cited excerpts.
For example, “AI regulation, FY2020: 1 result → FY2024: 12 results.”
That trend line tells a story before the analyst reads a single word, framing the cited excerpts at a glance.
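The trend signal itself is simple to compute once the corpus is chunked with year metadata. A minimal sketch, with illustrative names and a plain substring match standing in for semantic retrieval:

```python
# Hit-count trend sketch: count matching chunks per fiscal year before
# showing any excerpts. The jump in counts is itself the signal.
from collections import Counter

def hit_trend(query: str, chunks: list) -> dict:
    """Map fiscal year -> number of chunks mentioning the query topic."""
    counts = Counter(
        c["year"] for c in chunks if query.lower() in c["text"].lower()
    )
    years = sorted({c["year"] for c in chunks})
    return {y: counts.get(y, 0) for y in years}

chunks = [
    {"year": 2020, "text": "AI regulation is an emerging area"},
    {"year": 2024, "text": "AI regulation may restrict our products"},
    {"year": 2024, "text": "New AI regulation in the EU increases compliance cost"},
    {"year": 2024, "text": "Supply chain disruption"},
]
trend = hit_trend("AI regulation", chunks)
```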
3. Recall over precision
Most AI search tools try to return only the “best” matches. This client needed the opposite: a tool that doesn’t miss anything, even if it surfaces some noise. I tuned the system to show more potential matches with confidence scores, letting the human filter rather than the algorithm.
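Tuning for recall mostly means choosing a permissive cutoff and labeling confidence instead of hiding low scores. A sketch under assumed thresholds (the bands and cutoff here are illustrative, not the production values):

```python
# Recall-over-precision sketch: keep the similarity cutoff low and attach a
# confidence band, so the human filters instead of the algorithm.

def label(score: float) -> str:
    """Bucket a similarity score into a confidence band (illustrative)."""
    if score >= 0.8:
        return "high"
    if score >= 0.5:
        return "medium"
    return "low"

def recall_first(scored_hits, cutoff=0.3):
    """Surface everything above a permissive cutoff, tagged with confidence."""
    return [
        {"text": t, "score": s, "confidence": label(s)}
        for t, s in scored_hits
        if s >= cutoff
    ]

hits = [
    ("litigation risk", 0.91),
    ("regulatory exposure", 0.55),
    ("pandemic impact", 0.34),
    ("office leases", 0.12),
]
shown = recall_first(hits)  # three results survive; only the clear miss is dropped
```

A precision-first tool would show only the 0.91 hit; this configuration also surfaces the medium- and low-confidence matches so the analyst decides what is noise.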
How It Works
Three modes built for the analyst workflow:
- Search: Find relevant risk disclosures across five companies and five years using natural language queries, with AI-generated summaries and citations
- Compare Years: See what changed between FY2023 and FY2024 for the same company
- Compare Companies: Compare how Meta and Microsoft discuss AI risk in the same filing year
All outputs include verbatim excerpts with source citations for verification. The system can ingest new 10-Ks, pulling directly from SEC sources.
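Keeping every output citable starts at ingestion: each filing is split into overlapping chunks that carry company and year metadata. A minimal sketch (chunk and overlap sizes are illustrative; production systems tune these empirically):

```python
# Ingestion sketch: split a 10-K's Risk Factors text into overlapping,
# metadata-tagged chunks so every retrieved excerpt remains citable.

def chunk_filing(text: str, company: str, year: int,
                 size: int = 80, overlap: int = 20):
    """Split on word boundaries with overlap so no sentence is lost at a seam."""
    words = text.split()
    chunks, start = [], 0
    while start < len(words):
        piece = " ".join(words[start:start + size])
        chunks.append({"company": company, "year": year, "text": piece})
        start += size - overlap
    return chunks

# A 200-word filing yields four overlapping chunks at these settings.
sample = " ".join(f"w{i}" for i in range(200))
chunks = chunk_filing(sample, "META", 2024)
```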
Results
- 5 companies indexed: META, AAPL, GOOG, MSFT, AMZN (FY2020-2024)
- ~2,500 document chunks searchable across the corpus
- Estimated time savings: 4-6 hours → 30 minutes per company for YoY risk comparison
What I Learned
Requirements gathering is the actual skill. The simulated client conversation surfaced requirements I wouldn’t have anticipated on my own: collapsible summaries, trend signals, and confidence indicators. Building is the straightforward part. Knowing what to build is harder.
The value is in the workflow, not the AI. The core technologies here (semantic search and language models) are accessible to anyone. The value is in understanding how it fits a specific job: what to index, how to present comparisons, when to surface uncertainty versus synthesize an answer.
Interested in something similar?
Let's explore how systems like this could work for your team.