Seven tools, pipelines, and data products I built using AI as core infrastructure — not as a feature bolted on afterward.
Each one solved a real operational problem at a global nonprofit accelerator managing a portfolio of 4,486 startups
across healthcare, climate, fintech, food systems, and security.
A searchable, filterable portfolio tool for 4,486 startups. Staff search by name or pitch text,
filter across 8 dimensions (sector, sub-sector, funding stage, geography, demographics, confidence score,
exit status, program year), toggle between card and table views, star companies, and export filtered
results to CSV. Each company card shows funding, an active/closed confidence score from a composite
survival model, last funded date, and Crunchbase rank.
Designed with an agentic news research feature — the company detail modal includes an
on-demand button that triggers a Claude API call with web search tool use to autonomously find and
summarize recent press coverage for any alumni company.
Stack: Vanilla HTML/JS/CSS + JSON data layer. Python/pandas for data processing. Static deploy.
Data: 4,486 Crunchbase-matched startups. 3 MB JSON. Confidence scoring via cohort-adjusted composite model.
Deploy: Netlify (drag-and-drop, two-file static site). Password-protected for internal use.
AI role: On-demand agentic research (web search + summarization per company). Data processing pipeline. Full UI generation.
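The search, filter, and CSV export behavior described above can be sketched in a few lines. This is a minimal illustration in Python, not the tool's actual browser code: the sample records, the field names (`name`, `sector`, `stage`, `year`, `pitch`), and the helpers `search` and `to_csv` are all assumptions made up for the example.

```python
import csv
import io

# Hypothetical records; names and fields are invented, not the real schema.
COMPANIES = [
    {"name": "Acme Health", "sector": "Healthcare", "stage": "Seed",
     "year": 2019, "pitch": "Remote patient monitoring"},
    {"name": "GreenGrid", "sector": "Climate", "stage": "Series A",
     "year": 2021, "pitch": "Grid-scale battery analytics"},
    {"name": "PayLoop", "sector": "Fintech", "stage": "Seed",
     "year": 2019, "pitch": "Payments for clinics"},
]


def search(companies, text="", **filters):
    """Return companies whose name or pitch contains `text`
    and whose fields equal every keyword filter."""
    needle = text.lower()
    results = []
    for company in companies:
        haystack = f"{company['name']} {company['pitch']}".lower()
        if needle and needle not in haystack:
            continue
        if all(company.get(key) == value for key, value in filters.items()):
            results.append(company)
    return results


def to_csv(rows):
    """Serialize the filtered rows to CSV text, header first."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    seed_2019 = search(COMPANIES, stage="Seed", year=2019)
    print(to_csv(seed_2019))
```

Combining a free-text match with exact-match filters keeps each dimension independent, so adding a ninth filter would only mean passing one more keyword.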
Designed and ran an LLM classification pipeline to categorize 4,486 startups into 5 strategic sectors
and 5 healthcare sub-sectors. The system feeds elevator pitches and Crunchbase descriptions through
Claude Sonnet in batches of 30–40, with structured output parsing, retry logic, and progress tracking.
It replaced a keyword-matching approach that had a 26% error rate on reclassifications.
The classified dataset became the analytical foundation for a published LinkedIn article series,
internal dashboards, partnership memos, and event planning — every downstream tool in this portfolio
depends on it.
Stack: React artifact (browser-side Anthropic API calls). Python/pandas for validation and merge.
Scale: 4,486 companies × 2 classification tasks (sector + healthcare sub-sector). ~150 API batches.
Validation: Manual spot-check of 50 reclassified companies. Cross-tabulation against existing labels. Error rate analysis.
AI role: Core: the LLM performs the classification. Prompt engineering for edge cases (dental, pest control, dual-use tech).
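The batching, structured output parsing, retry, and progress tracking steps can be sketched as follows. This is an illustrative Python version under stated assumptions, not the original React artifact: `fake_model` stands in for the real API call, the JSON response shape (company id mapped to sector) is assumed, and the sector names are taken from the five areas listed in the introduction.

```python
import json
import time

# Sector labels as named in the introduction; exact spellings are an assumption.
SECTORS = {"Healthcare", "Climate", "Fintech", "Food Systems", "Security"}


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def parse_labels(raw, batch):
    """Parse a JSON object mapping company id -> sector; reject anything else."""
    labels = json.loads(raw)
    if not isinstance(labels, dict) or set(labels) != {c["id"] for c in batch}:
        raise ValueError("response ids do not match the batch")
    unknown = set(labels.values()) - SECTORS
    if unknown:
        raise ValueError(f"unknown sectors: {unknown}")
    return labels


def classify_all(companies, call_model, batch_size=35, max_retries=3, backoff=1.0):
    """Classify every company, retrying a batch when its output fails to parse."""
    results = {}
    batches = list(chunked(companies, batch_size))
    for index, batch in enumerate(batches, start=1):
        for attempt in range(1, max_retries + 1):
            try:
                results.update(parse_labels(call_model(batch), batch))
                break
            except (ValueError, TypeError) as err:
                if attempt == max_retries:
                    raise RuntimeError(f"batch {index} failed") from err
                time.sleep(backoff * attempt)  # simple linear backoff
        print(f"batch {index}/{len(batches)} done")  # progress tracking
    return results


def fake_model(batch):
    """Hypothetical stand-in for the LLM call: labels everything Healthcare."""
    return json.dumps({c["id"]: "Healthcare" for c in batch})
```

Validating the returned ids against the batch before merging means a truncated or reordered response triggers a retry instead of silently dropping companies.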
03 · Data Product · 21 Charts · Deployed
Portfolio Intelligence Dashboard
A 6-tab interactive dashboard analyzing 15 years of startup outcomes — 4,486 companies, $27.1B in funding,
21 interactive charts. Tabs cover Portfolio Growth, Challenge Areas, Funding Pipeline, Survival & Benchmarks,
Demographics, and Cohort Composition. Built as a public-facing data product for VC, corporate innovation, and
ecosystem audiences — and as a companion to a published LinkedIn article series.
Key analytical features include a cohort-adjusted survival model calibrated against BLS and
Stripe benchmarks (showing why Crunchbase's raw 87% active rate is closer to ~59% once adjusted), a funding
pipeline funnel (4,486 → 300 Series A+ → 12 IPO), a power law concentration curve (top 1% of companies =
45% of funding), and a funding parity analysis by gender and race across all 5 Challenge Areas
and 5 Healthcare & Life Sciences (HC&LS) subsectors. Animated KPI counters, scroll-reveal animations, and
methodology explainer panels appear throughout.
Stack: Chart.js (CDN), vanilla HTML/CSS/JS, Google Fonts. Single self-contained HTML file: no framework, no backend, no build step.
Data: 4,486 startups, 102 columns merged from Crunchbase + internal records. All numbers audited against published articles for consistency.
Scale: 21 interactive charts, 6 analytical tabs, 4 KPI cards, animated counters, methodology panels, "How This Was Built" section.
AI role: Dataset analysis, chart design, data auditing, cross-referencing past articles for number alignment, editorial decisions on methodology transparency.
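The concentration and funnel figures are simple to reproduce from a list of funding totals. The sketch below shows one plausible way to compute them in Python. The function names and the rounding rule for "top 1%" are assumptions, and only the funnel counts (4,486, 300, 12) come from the text above.

```python
def top_share(funding, fraction=0.01):
    """Share of total funding held by the top `fraction` of companies.

    The group size is rounded to the nearest whole company, with a
    minimum of one (an assumption; the dashboard's rule is not stated).
    """
    ranked = sorted(funding, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


def concentration_curve(funding, points=(0.01, 0.05, 0.10, 0.25, 0.50, 1.0)):
    """Points for a concentration curve: (fraction of companies, share of funding)."""
    return [(p, top_share(funding, p)) for p in points]


def funnel_rates(stages):
    """Step-to-step conversion rate for an ordered list of (label, count)."""
    return [
        (label, count / previous_count)
        for (_, previous_count), (label, count) in zip(stages, stages[1:])
    ]


if __name__ == "__main__":
    funnel = [("Portfolio", 4486), ("Series A+", 300), ("IPO", 12)]
    for label, rate in funnel_rates(funnel):
        print(f"{label}: {rate:.1%} of previous stage")
```

With the counts quoted above, about 6.7% of the portfolio reached Series A or later, and 4% of those went on to IPO.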
LinkedIn Article Series — Data-Driven Startup Analysis
A 5-part published series analyzing 15 years of startup portfolio data. Each article follows an
end-to-end AI-assisted production pipeline: Python/pandas analysis of the classified dataset →
live web research on every named company (current revenue, customers, pivots, milestones) →
2,000–2,500 word article draft → publication-ready React/Recharts chart artifacts → two LinkedIn
post variants → 10-slide carousel copy for AIcarousels.com.
Articles 1–4 published. Topics: portfolio-wide trends, financial outcomes, Healthcare & Life Sciences
deep dive (including a disability tech analysis with personal context), and Climate & Environment.
10+ custom data visualizations with a consistent design system.
Stack: Python/pandas for analysis. React/Recharts for charts. CairoSVG for static PNG exports.
Output: 4 published articles, 10+ charts, 8 LinkedIn posts, 4 carousels. ~10,000 words total.
Rigor: Fact-check protocol: every named company researched live before drafting. Crunchbase data cross-referenced with press releases.
AI role: Research, drafting, chart code generation, post/carousel copywriting. Human editorial judgment on framing and fact-checking.
A web application where staff enter information about a startup, mentor, donor, or partner and receive
generated marketing copy across four channels: LinkedIn post, blog copy, case study draft, and content
angles. Features conditional form fields by subject type, quick-revision buttons ("shorter," "lead with
metric," "stronger CTA," "warmer tone"), and per-output copy-to-clipboard.
The agentic research layer is the differentiator: when a user enters a company name, the
system autonomously triggers a Claude API call with web search tool use to gather recent press, milestones,
and funding data. It simultaneously cross-references an internal alumni database (alumni_slim.json hosted
on the same Netlify instance) to pull cohort year, sector, program history, and prize data. This research
context is injected into the generation prompt — so the marketing copy is informed by real, current data
rather than whatever the staff member remembers.
Stack: HTML front end + Netlify serverless function (Node.js) proxying the Anthropic API. API key secured in environment variables.
Agent: Two-mode serverless function: "research" (web search + DB lookup → structured context) and "generate" (context-informed copy).
Security: API key lives in Netlify environment variables only, never exposed in the browser. Staff see no setup.
AI role: Autonomous research (web search tool use + database cross-reference), content generation, iterative revision.
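The two-mode flow (research first, then generation with the research context injected into the prompt) can be sketched as a small dispatcher. The real function is Node.js on Netlify; this Python version is only an illustration. The `ALUMNI` record, the `web_search` callable, the request shape, and the prompt wording are all hypothetical, and the "generate" branch returns the assembled prompt instead of calling a model.

```python
# Hypothetical slice of the internal alumni data (alumni_slim.json in the text).
ALUMNI = {
    "acme health": {"cohort_year": 2019, "sector": "Healthcare", "prizes": ["Gold"]},
}


def lookup_alumni(name, db=ALUMNI):
    """Case-insensitive lookup of a company in the alumni data."""
    return db.get(name.strip().lower())


def research(name, web_search):
    """Mode 'research': combine web findings with the internal record."""
    return {
        "company": name,
        "alumni": lookup_alumni(name),
        "press": list(web_search(name)),
    }


def build_prompt(context, channel):
    """Mode 'generate': inject the research context into the generation prompt."""
    lines = [f"Write a {channel} about {context['company']}."]
    alumni = context.get("alumni")
    if alumni:
        lines.append(
            f"Cohort year: {alumni['cohort_year']}. Sector: {alumni['sector']}."
        )
    for item in context.get("press", []):
        lines.append(f"Recent press: {item}")
    lines.append("Use only the facts above; do not invent metrics.")
    return "\n".join(lines)


def handle(request, web_search=None):
    """Dispatch on the request's mode, mirroring the two-mode design."""
    mode = request.get("mode")
    if mode == "research":
        return research(request["company"], web_search)
    if mode == "generate":
        return {"prompt": build_prompt(request["context"], request["channel"])}
    raise ValueError(f"unknown mode: {mode!r}")
```

Splitting research from generation lets the front end show the gathered context to staff before any copy is written, and lets revisions reuse the same context without a second search.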
An interactive HTML walkthrough that teaches a 5×5×4 content system (5 Challenge Areas × 5 personas ×
4 funnel stages = 100 content combinations) to non-marketing staff. Step-by-step: key term definitions →
select a persona and Challenge Area → see the content journey for that combination → walk through a full
10-email nurture sequence in that persona's shoes, with subject lines, asset types, CTAs, and send timing.
Companion to a 4-tab Excel workbook (content matrix, 12-month editorial calendar, persona-specific nurture
stream templates, and a board-ready summary view). The HTML tool makes the framework accessible to program
staff who don't live in spreadsheets.
Stack: Self-contained HTML/CSS/JS. No dependencies. Opens in any browser. Also deployed to Netlify.
Companion: 4-tab Excel workbook (openpyxl) with full content matrix, editorial calendar, nurture templates, board summary.
Scale: 100 content cells (5×5×4), 50 nurture emails (10 per persona × 5 personas for HC&LS), 8 key term definitions.
AI role: Framework design, content cell copy, nurture email writing, full UI code generation, Excel workbook generation.
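The 5×5×4 arithmetic is a plain Cartesian product, which a short Python sketch makes concrete. The persona and funnel stage names below are placeholders invented for illustration, and using the five sectors from the introduction as the Challenge Areas is an assumption.

```python
from itertools import product

# Assumed to match the five sectors named in the introduction.
CHALLENGE_AREAS = ["Healthcare", "Climate", "Fintech", "Food Systems", "Security"]
# Placeholder persona and stage names; the real framework's labels are not given.
PERSONAS = ["Founder", "Mentor", "Donor", "Corporate Partner", "Investor"]
FUNNEL_STAGES = ["Awareness", "Consideration", "Decision", "Retention"]


def content_matrix():
    """Enumerate every (Challenge Area, persona, funnel stage) content cell."""
    return [
        {"area": area, "persona": persona, "stage": stage}
        for area, persona, stage in product(CHALLENGE_AREAS, PERSONAS, FUNNEL_STAGES)
    ]


def journey(cells, area, persona):
    """The content journey for one combination: its cells in funnel order."""
    return [c for c in cells if c["area"] == area and c["persona"] == persona]


if __name__ == "__main__":
    cells = content_matrix()
    print(len(cells))  # 5 * 5 * 4 content combinations
    for cell in journey(cells, "Climate", "Donor"):
        print(cell["stage"])
```

Selecting a persona and a Challenge Area, as the walkthrough does, amounts to slicing this matrix down to the four funnel stages for that pair.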
Analyzed 4,486 companies to surface a 50-company shortlist of alumni who went through a program 3+ years
before reaching a major funding milestone (Series C, IPO, or equivalent) — the "first believer" narrative
for an annual gala. The tool filters by Challenge Area, founder demographics (female, BIPOC), tier
(confirmed Series C+ vs. strong Series B), and press coverage. Each card shows the gap between program
participation and the breakout event, total funding, and a research-backed description.
Also delivered as a 7-sheet color-coded Excel workbook with sector-specific tabs, a summary dashboard,
and a methodology sheet.
Stack: Self-contained HTML (all 50 companies baked in, no external data dependency). Also: openpyxl for Excel.
Method: Filtered by funding_stage + ipo_status, calculated program-to-event gap, verified via live web research per company.
Research: Web search on all 50 companies for press coverage, recognition (TIME Best Inventions, Forbes 30U30), and current status.
AI role: Data analysis, company research, description writing, full UI generation, Excel workbook creation.
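The filter and gap calculation in the Method row can be sketched as below. Only `funding_stage` and `ipo_status` are field names from the text. The `program_year` and `milestone_year` fields, the stage labels in `LATE_STAGES`, the `"public"` status value, and the sample rows are assumptions for illustration.

```python
# Assumed labels for "Series C or later"; the real stage vocabulary is not given.
LATE_STAGES = {"Series C", "Series D", "Series E"}


def shortlist(companies, min_gap=3):
    """Keep companies that reached a major milestone at least `min_gap`
    years after their program year, sorted by the longest gap first."""
    picked = []
    for company in companies:
        reached = (
            company["funding_stage"] in LATE_STAGES
            or company["ipo_status"] == "public"
        )
        gap = company["milestone_year"] - company["program_year"]
        if reached and gap >= min_gap:
            picked.append({**company, "gap_years": gap})
    return sorted(picked, key=lambda c: c["gap_years"], reverse=True)


# Made-up sample rows, not real portfolio companies.
SAMPLE = [
    {"name": "A", "program_year": 2014, "funding_stage": "Series C",
     "ipo_status": "private", "milestone_year": 2020},
    {"name": "B", "program_year": 2019, "funding_stage": "Series C",
     "ipo_status": "private", "milestone_year": 2020},
    {"name": "C", "program_year": 2012, "funding_stage": "Series A",
     "ipo_status": "public", "milestone_year": 2021},
    {"name": "D", "program_year": 2013, "funding_stage": "Seed",
     "ipo_status": "private", "milestone_year": 2018},
]

if __name__ == "__main__":
    for company in shortlist(SAMPLE):
        print(company["name"], company["gap_years"])
```

A filter like this only produces candidates; the per-company web research described above is still what confirms each entry.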