Stack
The non-default product choices in the system. Commodity primitives (S3, EventBridge, IAM) omitted by design — the picks that explain a design decision are the picks worth listing.
Foundation
- Python
Primary language across all eight repos. Lambda runs 3.12; local development on 3.13.
- AWS
Step Functions orchestrate the pipelines, Lambda runs the stateless work, EC2 hosts the stateful executor and batch jobs, S3 is the inter-module data bus, CloudFormation owns the infrastructure. Per-service breakdown in *Compute & orchestration* below.
- Anthropic Claude
Tiered usage: Haiku for per-ticker quant, qual, and peer-review (~12 calls per Saturday run); Sonnet only for synthesis (macro economist, CIO batch evaluation, nuanced judge cases). Sonnet costs ~5× as much as Haiku, so reserving it for synthesis keeps per-run cost stable.
- Streamlit
Live data console at live.nousergon.ai and the private interview-demo console at console.nousergon.ai. Pragmatic pick for read-only monitoring: a custom React dashboard would be weeks of work for the same outcome.
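The tiered Claude usage described above amounts to a small routing rule. The sketch below is illustrative only: the task labels and tier names are hypothetical, and the cost units assume every call has a similar token footprint, so the ~5× ratio can be applied per call.

```python
# Hypothetical task labels; only the split "synthesis vs. everything else"
# comes from the text above.
SYNTHESIS_TASKS = {"macro_economist", "cio_batch_evaluation", "nuanced_judge"}

# Relative cost per call in Haiku units, using the ~5x ratio quoted above.
# Assumption: calls are of comparable size, so a per-call ratio is meaningful.
RELATIVE_COST = {"haiku": 1.0, "sonnet": 5.0}


def pick_tier(task: str) -> str:
    """Route synthesis work to Sonnet and all other work to Haiku."""
    return "sonnet" if task in SYNTHESIS_TASKS else "haiku"


def relative_run_cost(tasks: list[str]) -> float:
    """Sum the relative cost of one run, expressed in Haiku-call units."""
    return sum(RELATIVE_COST[pick_tier(task)] for task in tasks)
```

Under these assumptions, adding one more synthesis call moves the run cost as much as five per-ticker calls, which is why the synthesis set is kept small.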
Agentic layer
- LangGraph
Multi-agent orchestration with `Send()` fan-out semantics and custom dict-keyed reducers. Six sector teams + macro economist + CIO + LLM-as-judge run as graph nodes with typed state. Vanilla LangChain doesn't compose for parallel multi-team coordination with state merging.
- LangSmith
Auto-tracing on every production LLM call. The trajectory validator polls LangSmith for graph-correctness invariants (required nodes present, sector_team_node appears exactly 6 times) post-`graph.invoke()`. Free tier covers a personal-scale workload.
- Pydantic
Typed agent outputs (`with_structured_output(..., include_raw=True)` with strict-mode parse-error contract across every LLM-output site).
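The trajectory invariants mentioned under LangSmith (required nodes present, sector_team_node appearing exactly 6 times) reduce to a check over the list of node names seen in a run. The sketch below assumes that list has already been fetched; the required node names other than `sector_team_node` are hypothetical, and the LangSmith polling itself is not shown.

```python
from collections import Counter

# Hypothetical node names. Only sector_team_node and the expected count of 6
# are taken from the description above.
REQUIRED_NODES = {"macro_economist_node", "cio_node", "judge_node"}
EXPECTED_SECTOR_RUNS = 6


def check_trajectory(node_names: list[str]) -> list[str]:
    """Return invariant violations; an empty list means the run is valid."""
    counts = Counter(node_names)
    violations = []
    # Invariant 1: every required node appears at least once.
    for node in sorted(REQUIRED_NODES):
        if counts[node] == 0:
            violations.append(f"missing required node: {node}")
    # Invariant 2: the sector fan-out ran exactly the expected number of times.
    sector_runs = counts["sector_team_node"]
    if sector_runs != EXPECTED_SECTOR_RUNS:
        violations.append(
            f"sector_team_node ran {sector_runs} times, "
            f"expected {EXPECTED_SECTOR_RUNS}"
        )
    return violations
```

Returning the full list of violations, instead of raising on the first one, lets a single alert describe everything that was wrong with the run.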
Data & retrieval
- ArcticDB (over parquet on S3)
Feature store for ~50 features × ~900 tickers × 10y. Migration was driven by a real performance bottleneck: per-feature parquet scans on S3 were slow on the read side and OOMed c5.large training instances. ArcticDB's symbol-keyed range queries pull only the slice each consumer needs.
- Neon — serverless Postgres + pgvector
RAG corpus for SEC filings, 8-Ks, earnings transcripts, and rolling investment theses. HNSW indexing for fast vector search; serverless tier means no idle compute cost when the weekly research run isn't firing.
- Voyage `voyage-3-lite`
512-dimensional embeddings tuned for financial-domain text. Cheaper than OpenAI's `text-embedding-3-*` tier with better fit on the SEC-filing corpus the qual agents retrieve over.
- Polygon.io
Primary market-data source for adjusted OHLCV and intraday data; yfinance retained as cross-validation only, not as a silent fallback (silent-fallback patterns have been a repeat offender for masking real failures and were retired across the data layer).
- FRED + FMP
Macro indicators and supplemental fundamentals.
- VectorBT
Historical portfolio simulation in the backtester.
- SQLite
Per-module local audit trails: `research.db` (signal history, rolling theses, macro snapshots) and `trades.db` (every order, fill, retry, rationale). Both backed up to S3 after each run; S3 is canonical, SQLite is the queryable working copy. Embedded over a hosted DB because the workload is single-writer, the queries are bounded, and embedded survives a network partition.
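The "cross-validation only, not a silent fallback" rule described for Polygon.io and yfinance can be sketched as a function that always returns the primary value or fails loudly. The function name, the exception type, and the 0.5% tolerance are assumptions made for illustration, not details taken from the system.

```python
from typing import Optional


class PriceMismatchError(RuntimeError):
    """Raised when the primary source is missing or the sources disagree."""


def cross_validate_close(
    primary: Optional[float],
    secondary: Optional[float],
    rel_tol: float = 0.005,  # hypothetical tolerance: 0.5% relative drift
) -> float:
    """Return the primary close, using the secondary source only as a check."""
    if primary is None or primary <= 0:
        # No silent fallback: a missing or invalid primary value is a failure
        # even when the secondary source has data.
        raise PriceMismatchError("primary source returned no usable value")
    if secondary is not None:
        drift = abs(primary - secondary) / primary
        if drift > rel_tol:
            raise PriceMismatchError(
                f"sources disagree by {drift:.2%} (tolerance {rel_tol:.2%})"
            )
    # The secondary value is never returned, only compared against.
    return primary
```

The secondary source can only turn a success into a failure, never the reverse, so an outage in the primary feed cannot be masked.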
ML tools
- LightGBM
Layer-1 gradient-boosted models in the predictor meta-ensemble (momentum + volatility).
- pandas + numpy
Feature engineering, signal scoring, metric computation.
Compute & orchestration
- AWS Step Functions (over Airflow / Dagster / Prefect)
Three pipelines (Saturday weekly research, weekday morning, EOD post-close) orchestrated as state machines. Native to the AWS account that already runs Lambda + EC2 + S3, so no new infrastructure to operate. Execution history is itself the audit trail.
- AWS Lambda
Stateless agent calls, predictor inference, LLM-as-judge, rationale clustering, replay-concordance, replay-counterfactual. Container-image deploys via ECR for the Python 3.12 research Lambda.
- AWS EC2
`t3.small` for the executor daemon (stateful, holds the IB Gateway connection), `t3.micro` for the dashboard, `c5.large` spot for batch (predictor training + backtester). Right-sized per workload: when `predictor_data_prep` OOMed on c5.large, the response was a pandas refactor (~1.1 GB → ~91 MB resident), not a bump to xlarge.
- CloudFormation
Infrastructure-as-Code for Step Functions, Lambda functions, IAM roles, EventBridge rules. Drift detector compares CloudFormation stamps against live AWS state weekly.
- Telegram (over PagerDuty / Opsgenie)
Dual-channel ops alerts (Telegram + SNS) via the `alpha_engine_lib.alerts.publish` primitive: every fail-loud surface posts to both channels independently, dedup-keyed at cadence-window granularity. Telegram delivers the operator-visible signal; SNS keeps the audit trail. PagerDuty-tier escalation is overkill for a single-operator system; the cost asymmetry is the design choice.
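One way to read "posts to both channels independently, dedup-keyed at cadence-window granularity" is sketched below. This is not the `alpha_engine_lib.alerts.publish` implementation: the class, its parameters, and the in-memory dedup set are hypothetical, and the channel callables stand in for the real Telegram and SNS clients.

```python
from typing import Callable

Channel = Callable[[str], None]  # stand-in for a Telegram or SNS sender


class AlertPublisher:
    """Hypothetical dual-channel publisher with per-window deduplication."""

    def __init__(self, channels: dict[str, Channel], window_seconds: int) -> None:
        self.channels = channels
        self.window_seconds = window_seconds
        self._sent: set[tuple[str, str, int]] = set()

    def publish(self, alert_name: str, message: str, now: float) -> dict[str, str]:
        # All timestamps inside one cadence window share the same bucket.
        bucket = int(now // self.window_seconds)
        results: dict[str, str] = {}
        for channel_name, send in self.channels.items():
            key = (channel_name, alert_name, bucket)
            if key in self._sent:
                results[channel_name] = "deduplicated"
                continue
            try:
                send(message)
            except Exception as exc:
                # One channel failing must not block the other, and the key
                # is not recorded, so a retry in the same window can succeed.
                results[channel_name] = f"failed: {exc}"
                continue
            self._sent.add(key)
            results[channel_name] = "sent"
        return results
```

Keying the dedup set per channel is the design choice that keeps the channels independent: a delivered Telegram message is suppressed on repeat, while a failed SNS post stays eligible for retry.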
Surfaces
- Astro (this site)
Static-site generator for the apex marketing surface — zero JS shipped by default, fast TTFB, build-time SEO. The harness identity benefits from a marketing surface that loads as instantly as the instrument it describes.
- Cloudflare (Pages + Access)
Cloudflare Pages serves the apex marketing site; Cloudflare Access gates the private operator console at console.nousergon.ai. Zero-trust SSO without standing up a separate auth layer; the public live data console at live.nousergon.ai stays accessible without auth so the system is inspectable end-to-end.
- IB Gateway (gnzsnz Docker + TOTP)
Paper-account broker connection runs in `gnzsnz/ib-gateway-docker` on the trading EC2 with TOTP-based 2FA (secret in AWS Secrets Manager). Docker isolates the Gateway's X-server requirement and survives session-reset edge cases that the local `IBC + Xvfb` setup hit mid-2026. Hard runtime check refuses to connect to non-paper accounts (account ID must start with "D"), a defense against accidental live-account wiring.
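The paper-account guard described above can be as small as a prefix check that runs before any connection is opened. The function and exception names below are hypothetical; only the rule that the account ID must start with "D" comes from the text.

```python
class LiveAccountError(RuntimeError):
    """Raised when the configured account does not look like a paper account."""


PAPER_ACCOUNT_PREFIX = "D"  # rule stated above: paper account IDs start with "D"


def assert_paper_account(account_id: str) -> str:
    """Return the account ID unchanged, or raise if it is not a paper account."""
    if not account_id.startswith(PAPER_ACCOUNT_PREFIX):
        # Fail before any broker session exists, so no order can be sent.
        raise LiveAccountError(
            f"refusing to connect: account ID must start with "
            f"{PAPER_ACCOUNT_PREFIX!r}"
        )
    return account_id
```

An empty or missing account ID also fails the check, so a misconfigured environment variable is rejected the same way a live account would be.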