Stack

The non-default product choices in the system. Commodity primitives (S3, EventBridge, IAM) are omitted by design: the picks that explain a design decision are the ones worth listing.

Foundation

Python

Primary language across all eight repos. Lambda runs 3.12; local development on 3.13.
AWS

Step Functions orchestrate the pipelines, Lambda runs the stateless work, EC2 hosts the stateful executor and batch jobs, S3 is the inter-module data bus, CloudFormation owns the infrastructure. Per-service breakdown in *Compute & orchestration* below.
Anthropic Claude

Tiered usage: Haiku handles per-ticker quant, qual, and peer-review work (~12 calls per Saturday run); Sonnet is reserved for synthesis (macro economist, CIO batch evaluation, nuanced judge cases). Sonnet costs roughly 5× as much as Haiku, so reserving it for synthesis keeps per-run cost stable.
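
The tiering rule can be sketched as a small routing table. This is a minimal illustration, not the system's actual code: the task names, the tier labels, and the `route_model` helper are all hypothetical, and real model IDs are left out on purpose.

```python
# Hypothetical task names; the text only states which kinds of work go to which tier.
HAIKU_TASKS = {"quant", "qual", "peer_review"}
SONNET_TASKS = {"macro_economist", "cio_batch_evaluation", "judge_nuanced"}


def route_model(task: str) -> str:
    """Return the model tier for a task, failing loudly on unknown tasks."""
    if task in HAIKU_TASKS:
        return "haiku"
    if task in SONNET_TASKS:
        return "sonnet"
    # No default tier: an unrouted task is a configuration bug, not a Sonnet call.
    raise ValueError(f"no model tier configured for task {task!r}")
```

Refusing to default to either tier means a newly added task cannot quietly land on the expensive model.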
Streamlit

Live data console at live.nousergon.ai and the private interview-demo console at console.nousergon.ai. Pragmatic pick for read-only monitoring: a custom React dashboard would be weeks of work for the same outcome.

Agentic layer

LangGraph

Multi-agent orchestration with Send() fan-out semantics and custom dict-keyed reducers. Six sector teams + macro economist + CIO + LLM-as-judge run as graph nodes with typed state. Vanilla LangChain doesn't compose for parallel multi-team coordination with state merging.
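
A dict-keyed reducer of the kind described here might look like the sketch below. LangGraph reads the second argument of `Annotated` on a state key as the function used to merge parallel updates; the `merge_by_key` function, the `ResearchState` class, the `team_outputs` key, and the choice to raise on duplicate keys are assumptions made for illustration. The block uses only the standard library, so it shows the state declaration without building a graph.

```python
from typing import Annotated, TypedDict


def merge_by_key(left: dict, right: dict) -> dict:
    """Merge two partial updates keyed by team name.

    Assumption: two branches writing the same key is treated as a bug,
    so the reducer raises instead of letting one branch overwrite another.
    """
    overlap = left.keys() & right.keys()
    if overlap:
        raise ValueError(f"duplicate team outputs: {sorted(overlap)}")
    return {**left, **right}


class ResearchState(TypedDict):
    # The reducer is attached to the key through Annotated metadata.
    team_outputs: Annotated[dict, merge_by_key]
```

Each fanned-out branch would return an update such as `{"team_outputs": {"tech": {...}}}`, and the reducer would fold the branches into one dict regardless of completion order.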
LangSmith

Auto-tracing on every production LLM call. The trajectory validator polls LangSmith for graph-correctness invariants (required nodes present, sector_team_node appears exactly 6 times) post-graph.invoke(). Free tier covers a personal-scale workload.
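
The invariants named above can be checked over a plain list of executed node names. The sketch below assumes that list has already been extracted from a trace; the LangSmith polling step is omitted, and every node name other than `sector_team_node` is hypothetical.

```python
from collections import Counter

# Hypothetical names for the required nodes; only sector_team_node comes from the text.
REQUIRED_NODES = {"macro_economist_node", "cio_node", "judge_node"}
EXPECTED_SECTOR_RUNS = 6


def check_trajectory(node_names: list) -> list:
    """Return a list of invariant violations; an empty list means the run is valid."""
    counts = Counter(node_names)
    violations = []
    for node in sorted(REQUIRED_NODES):
        if counts[node] == 0:
            violations.append(f"missing required node: {node}")
    sector_runs = counts["sector_team_node"]
    if sector_runs != EXPECTED_SECTOR_RUNS:
        violations.append(
            f"sector_team_node ran {sector_runs} times, expected {EXPECTED_SECTOR_RUNS}"
        )
    return violations
```

Returning every violation at once, rather than raising on the first, gives the operator the full picture of a malformed run in a single alert.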
Pydantic

Typed agent outputs: every LLM-output site calls with_structured_output(..., include_raw=True) under a strict-mode parse-error contract.
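
With `include_raw=True`, LangChain's `with_structured_output` returns a dict with the keys `raw`, `parsed`, and `parsing_error` instead of raising on a bad parse. One way to enforce a strict contract on top of that is sketched below; the `unwrap_structured` helper and the `StructuredOutputError` class are hypothetical names, and the system's actual contract may differ.

```python
class StructuredOutputError(RuntimeError):
    """Raised when an LLM response cannot be parsed into the expected schema."""


def unwrap_structured(result: dict):
    """Return the parsed object or raise; never pass on a missing or failed parse."""
    error = result.get("parsing_error")
    if error is not None:
        raise StructuredOutputError(f"structured output failed to parse: {error}")
    parsed = result.get("parsed")
    if parsed is None:
        raise StructuredOutputError("model returned no parsed output")
    return parsed
```

Keeping the raw message alongside the parsed object means a parse failure can be logged with the exact model output that caused it.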

Data & retrieval

ArcticDB (over parquet on S3)

Feature store for ~50 features × ~900 tickers × 10y. Migration was driven by a real performance bottleneck: per-feature parquet scans on S3 were slow on the read side and OOMed c5.large training instances. ArcticDB's symbol-keyed range queries pull only the slice each consumer needs.
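
A rough back-of-envelope shows why slicing matters at this scale. The figures below rest on assumptions the text does not state: daily bars, 252 trading days per year, float64 values, and one ArcticDB symbol per ticker. They cover raw values only and ignore index and DataFrame overhead.

```python
FEATURES = 50
TICKERS = 900
YEARS = 10
TRADING_DAYS_PER_YEAR = 252  # assumption: daily bars
BYTES_PER_VALUE = 8          # assumption: float64

rows_per_ticker = YEARS * TRADING_DAYS_PER_YEAR

# Raw size of the whole store versus the slice for a single ticker.
full_store_bytes = FEATURES * TICKERS * rows_per_ticker * BYTES_PER_VALUE
one_ticker_bytes = FEATURES * rows_per_ticker * BYTES_PER_VALUE

print(f"full store: {full_store_bytes / 1e6:.1f} MB")
print(f"one ticker: {one_ticker_bytes / 1e6:.3f} MB")
```

Under these assumptions the raw values come to about 907 MB for the full store and about 1 MB per ticker, so a consumer that needs a handful of tickers reads a small fraction of what a full scan would load.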
Neon — serverless Postgres + pgvector

RAG corpus for SEC filings, 8-Ks, earnings transcripts, and rolling investment theses. HNSW indexing for fast vector search; serverless tier means no idle compute cost when the weekly research run isn't firing.
Voyage `voyage-3-lite`

512-dimensional embeddings tuned for financial-domain text. Cheaper than OpenAI's text-embedding-3-* tier with better fit on the SEC-filing corpus the qual agents retrieve over.
Polygon.io

Primary market-data source for adjusted OHLCV and intraday data. yfinance is retained for cross-validation only, not as a silent fallback: silent-fallback patterns repeatedly masked real failures and were retired across the data layer.
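
The difference between cross-validation and a silent fallback can be sketched in a few lines. The function name, the exception class, and the 0.5% tolerance are hypothetical; the sketch takes two already-fetched prices and makes no vendor API calls.

```python
class PriceMismatchError(RuntimeError):
    """Raised when the primary and cross-check sources disagree beyond tolerance."""


def cross_validate_close(primary: float, reference: float, tolerance: float = 0.005) -> float:
    """Return the primary close if the reference agrees within a relative tolerance.

    The reference is only a check: on disagreement this raises instead of
    substituting the reference value for the primary one.
    """
    if primary <= 0 or reference <= 0:
        raise ValueError("prices must be positive")
    relative_gap = abs(primary - reference) / reference
    if relative_gap > tolerance:
        raise PriceMismatchError(
            f"primary={primary} reference={reference} gap={relative_gap:.4%}"
        )
    return primary
```

The function only ever returns the primary value, so the secondary source can confirm or veto a price but can never replace it.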
FRED + FMP

Macro indicators and supplemental fundamentals.
VectorBT

Historical portfolio simulation in the backtester.
SQLite

Per-module local audit trails — research.db (signal history, rolling theses, macro snapshots) and trades.db (every order, fill, retry, rationale). Both backed up to S3 after each run; S3 is canonical, SQLite is the queryable working copy. Embedded over a hosted DB because the workload is single-writer, the queries are bounded, and embedded survives a network partition.
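
An embedded, single-writer audit trail of this kind needs very little code. The sketch below uses Python's standard `sqlite3` module; the `order_events` table, its columns, and both helper functions are hypothetical and do not reflect the actual trades.db schema.

```python
import sqlite3


def open_trades_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) an audit database with an append-only events table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS order_events (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            recorded_at TEXT NOT NULL DEFAULT (datetime('now')),
            ticker TEXT NOT NULL,
            event_type TEXT NOT NULL CHECK (event_type IN ('order', 'fill', 'retry')),
            rationale TEXT NOT NULL
        )
        """
    )
    return conn


def record_event(conn: sqlite3.Connection, ticker: str, event_type: str, rationale: str) -> int:
    """Insert one audit row and return its id."""
    with conn:  # commits on success, rolls back if the insert fails
        cursor = conn.execute(
            "INSERT INTO order_events (ticker, event_type, rationale) VALUES (?, ?, ?)",
            (ticker, event_type, rationale),
        )
    return cursor.lastrowid
```

Because the database is a single file, backing it up after a run amounts to uploading that one file to S3.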

ML tools

LightGBM

Layer-1 gradient-boosted models in the predictor meta-ensemble (momentum + volatility).
pandas + numpy

Feature engineering, signal scoring, metric computation.

Compute & orchestration

AWS Step Functions (over Airflow / Dagster / Prefect)

Three pipelines (Saturday weekly research, weekday morning, EOD post-close) orchestrated as state machines. Native to the AWS account that already runs Lambda + EC2 + S3, so no new infrastructure to operate. Execution history is itself the audit trail.
AWS Lambda

Stateless agent calls, predictor inference, LLM-as-judge, rationale clustering, replay-concordance, replay-counterfactual. Container-image deploys via ECR for the Python 3.12 research Lambda.
AWS EC2

t3.small for the executor daemon (stateful, holds the IB Gateway connection), t3.micro for the dashboard, c5.large spot for batch (predictor training + backtester). Right-sized per workload — when predictor_data_prep OOMed on c5.large, the response was a pandas refactor (~1.1 GB → ~91 MB resident), not a bump to xlarge.
CloudFormation

Infrastructure-as-Code for Step Functions, Lambda functions, IAM roles, EventBridge rules. Drift detector compares CloudFormation stamps against live AWS state weekly.
Telegram (over PagerDuty / Opsgenie)

Dual-channel ops alerts (Telegram + SNS) via the alpha_engine_lib.alerts.publish primitive — every fail-loud surface posts to both channels independently, dedup-keyed at cadence-window granularity. Telegram delivers the operator-visible signal; SNS keeps the audit trail. PagerDuty-tier escalation is overkill for a single-operator system; the cost asymmetry is the design choice.
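
Window-granular dedup with independent delivery can be sketched as follows. This is not the `alpha_engine_lib.alerts.publish` implementation: `dedup_key`, `publish_dual`, the in-memory `seen` set, and the channel callables are hypothetical stand-ins, and no Telegram or SNS client is called.

```python
import time
from typing import Callable, Dict, Optional, Set


def dedup_key(alert_name: str, window_seconds: int, now: Optional[float] = None) -> str:
    """Bucket the timestamp so repeats inside one cadence window share a key."""
    timestamp = time.time() if now is None else now
    return f"{alert_name}:{int(timestamp // window_seconds)}"


def publish_dual(
    message: str,
    key: str,
    channels: Dict[str, Callable[[str], None]],
    seen: Set[str],
) -> Dict[str, bool]:
    """Send to every channel independently and report per-channel delivery."""
    if key in seen:
        return {}  # already alerted in this cadence window
    results = {}
    for name, send in channels.items():
        try:
            send(message)
            results[name] = True
        except Exception:
            # A failure on one channel must not stop delivery on the other.
            results[name] = False
    if any(results.values()):
        seen.add(key)  # only suppress repeats once something was delivered
    return results
```

The caller receives the per-channel outcome, so a channel that is down shows up in the result instead of being swallowed.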

Surfaces

Astro (this site)

Static-site generator for the apex marketing surface: zero JS shipped by default, fast TTFB, build-time SEO. The harness identity benefits from a marketing surface that loads as quickly as the instrument it describes.
Cloudflare (Pages + Access)

Cloudflare Pages serves the apex marketing site; Cloudflare Access gates the private operator console at console.nousergon.ai. Zero-trust SSO without standing up a separate auth layer; the public live data console at live.nousergon.ai stays accessible without auth so the system is inspectable end-to-end.
IB Gateway (gnzsnz Docker + TOTP)

Paper-account broker connection runs in gnzsnz/ib-gateway-docker on the trading EC2 with TOTP-based 2FA (secret in AWS Secrets Manager). Docker isolates the Gateway's X-server requirement and survives session-reset edge cases that the local IBC + Xvfb setup hit in mid-2026. A hard runtime check refuses to connect to non-paper accounts (the account ID must start with "D"), as a defense against accidental live-account wiring.
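
The paper-account guard reduces to a prefix check that runs before any connection is made. In the sketch below only the "D" prefix rule comes from the text; the `assert_paper_account` helper, the `LiveAccountError` class, and the account IDs shown in the usage note are hypothetical.

```python
class LiveAccountError(RuntimeError):
    """Raised when the configured account does not look like a paper account."""


def assert_paper_account(account_id: str) -> str:
    """Return the account ID if it passes the paper-account check, else raise."""
    if not account_id.startswith("D"):
        raise LiveAccountError(
            f"refusing to connect: account {account_id!r} is not a paper account"
        )
    return account_id
```

Calling `assert_paper_account("DU0000000")` returns the ID unchanged, while `assert_paper_account("U0000000")` raises before the executor ever opens a session.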