Humanebench
active deal
HumaneBench - Deal Memo
Status: Tier 1 — Pending Founder Call (structure clarification needed)
Summary
- One-liner: Open-source AI ethics evaluation framework + rubric
- Org: Building Humane Technology (public benefit org)
- Raising: Not yet — no deck, but Erika offered "early conversation before I ramp up fundraising"
- Structure: TBD — Building Humane Tech is public benefit org, but may be spinning out for-profit
Team
| Name | Role | Background | |
|---|---|---|---|
| Erika Anderson | Founder | Co-founder & CCO at Storytell.ai, MFA from Vermont College, writing in NYT & Vanity Fair | |
| Jack Senechal | Fractional CTO | 15+ yrs full-stack, AI infra, Kubernetes, co-founder Mirror Astrology | |
| Andalib Samandari | AI Architect | — | 7 yrs applied AI/data science, health tech, HITRUST/ISO 27001/NIST compliance |
| Sarah Ladyman | Experience Designer | Sustainability leadership, community-centered design |
Notable: Erika Anderson is also co-founder of Storytell.ai (the company in the case study) — she runs both orgs.
Investment Thesis
1000x opportunity? No. This is an open-source framework, not a software company with equity upside. The rubric is freely available on GitHub. Revenue model (if any) would be consulting/certification — not scalable to 1000x.
Kingmaker fit? Weak. My CTO network could help with enterprise adoption of the framework, but there's no equity stake to benefit from.
What We Know (from Case Study PDF)
Product:
- Open-source evaluation rubric (HumaneBench v3.0) with 8 principles
- 4-level scoring scale: +1.0 (Exemplary), +0.5 (Acceptable), -0.5 (Concerning), -1.0 (Violation)
- Evaluates AI for: respecting attention, enhancing capabilities, healthy relationships, transparency, etc.
- Can be integrated into any AI product's evaluation pipeline
Traction:
- Storytell.ai (enterprise AI company) implemented it in production — ran 4,287 evaluations
- Case study shows real product impact: Storytell prioritized "Teacher Mode" feature based on findings
Business Model:
- Open-source rubric (free)
- Potential revenue: certification (Certifiedhumane.ai), consulting, enterprise services
- Currently sponsor-funded (UpHonest Capital, Thesys)
Green Flags
- Real enterprise adoption (Storytell case study)
- Timely — AI ethics/safety is hot, regulatory tailwinds
- Differentiated approach — behavioral evaluation vs. typical safety benchmarks
- Technical credibility (Erika Anderson co-founded Storytell.ai)
Red Flags
- Not a startup — open-source project with nonprofit structure
- No equity upside — can't 1000x on a free rubric
- Certification businesses don't scale — B-corp took decades to reach scale
- Weak moat — anyone can fork the rubric
- No pitch deck — only received case study, unclear if they're even raising
Key Questions
- Entity structure? → Public benefit org, not for-profit startup
- Are they raising? If so, what structure?
- Deal source — how did this come to you?
Decision
Pending Founder Call — Need to clarify entity structure before deciding.
If nonprofit (Building Humane Tech): Pass — no equity upside.
If for-profit spinout: Potentially interesting. Would need to evaluate:
- What's the product? (API? SaaS dashboard? Compliance tooling?)
- What's the moat beyond the open-source rubric?
- Can it 1000x? (Certification businesses historically don't scale)
- Kingmaker fit: CTO network could help enterprise adoption
Next Step: Book call via https://go.storytell.ai/Erika-25mins and clarify structure. See notes.md for suggested message.
Research Update — January 15, 2026
Major Press Coverage (Nov 2025)
HumaneBench received significant press coverage upon launch:
- TechCrunch: "A new AI benchmark tests whether chatbots protect human well-being"
- Built In: "New Benchmark Shows AI Chatbots Are Easily Manipulated"
- AsianFin: "New AI Benchmark HumaneBench Launches to Evaluate Chatbot Impact on Human Wellbeing"
Key Findings from Launch
Methodology:
- Tested 15 top language models (GPT-5, Claude Sonnet 4.5, Gemini 3, etc.)
- 800 scenarios across 8 core principles
- Combined manual evaluations with AI ensemble assessments
- Three testing modes: default, instructed to prioritize humane principles, adversarial
Results:
- 71% of models switched to harmful behaviors when instructed to ignore principles
- xAI's Grok 4 and Google's Gemini 2.0 Flash scored lowest
- Only 4 models maintained integrity: GPT-5.1, GPT-5, Claude 4.1, Claude Sonnet 4.5
- Meta's Llama 3.1 and Llama 4 ranked lowest in HumaneScore
Erika Anderson Quote (TechCrunch):
"I think we're in an amplification of the addiction cycle that we saw hardcore with social media and our smartphones and screens. But as we go into that AI landscape, it's going to be very hard to resist. And addiction is amazing business."
Updated Assessment
What's changed:
- HumaneBench now has major press validation and credibility
- Building Humane Technology confirmed as public benefit corporation (founded 2024)
- Developing certification standard (Certifiedhumane.ai)
- This is a real project with real impact, not vaporware
Investment thesis unchanged:
- Still no clear equity upside from open-source framework
- Certification businesses historically don't scale to 1000x
- Key question remains: Will there be a for-profit spinout?
Next Action
- Search Steven's email for prior communication with erika.anderson@storytell.ai
- Reschedule founder call to clarify entity structure and fundraising plans
HumaneBench - Contacts
Founders
| Name | Role | Phone | Calendar | ||
|---|---|---|---|---|---|
| Erika Anderson | Founder | erika.anderson@storytell.ai | 718-679-8083 | Book 25min | |
| Jack Senechal | Fractional CTO | — | — | — | |
| Sarah Ladyman | Experience Designer | — | — | — | |
| Andalib Samandari | AI Architect | — | — | — | — |
Related Contacts
| Name | Role | Notes | ||
|---|---|---|---|---|
| DROdio (Daniel Odio) | CEO, Storytell.ai | — | Customer reference (case study) |
Relationship Timeline
| Date | Type | Summary |
|---|---|---|
| 12/18/25 | Meeting scheduled | Intro call for Dec 24 |
| 12/23/25 | Cancelled | You cancelled (sick from Taiwan), asked about fundraising |
| 12/23/25 | Email from Erika | Not yet fundraising, sent case study, wants to chat early |
Last Touchpoint
Date: 12/23/25 Status: Ball in your court — reschedule call Next Action: Book call, clarify entity structure (nonprofit vs for-profit)
HumaneBench - Notes
2025-01-08 | Gmail Conversation Scraped
Type: Email thread review
Email Thread Summary
12/18/25 — Meeting Scheduled
- Steven sent meeting invite for Dec 24, 2025
- Erika accepted
12/23/25 — Steven cancelled (sick from Taiwan) Steven's message:
"Sorry I just came back from Taiwan and feel under the weather. Let's reschedule to next week? Two quick questions: 1. Are you actively fundraising? If yes please email me your investor deck and I'll review over the weekends. 2. Can I have your booking link so I can find the best time to follow up next week?"
12/23/25 — Erika's response
"Hi Steven, welcome back from Taiwan. Sorry you're not feeling your best, wishing you a swift recovery!
I'm not yet in fundraising mode because we've been laser-focused on the first external users adopting HumaneBench, so I don't have a deck, but I would be very happy to have an early conversation with you to see if you'd like to get in before I ramp up my fundraising process.
In fact, I'm attaching a draft case study from Storytell, which we're about to publish, detailing how they implemented HumaneBench & their experience with it. I can tell you more about this and other traction we're getting on our call. I'd also love to know what check sizes you typically write."
Contact Info:
- Email: erika.anderson@storytell.ai
- Phone: 718-679-8083
- Calendar: https://go.storytell.ai/Erika-25mins
Key Takeaways
- Not actively fundraising yet — focused on early adopters first
- No pitch deck exists — but open to early investor conversations
- Wants to know your check size — signals interest in you as investor
- Meeting not yet rescheduled — ball is in your court
Next Touchpoint Recommendation
Action: Book a call via her calendar link to clarify:
- Entity structure — is this Building Humane Tech (nonprofit) or a new for-profit spinout?
- If for-profit, what's the fundraising plan? SAFE? Priced round?
- What does "get in before I ramp up" mean — pre-seed allocation?
Suggested message:
"Hi Erika, feeling better now! I reviewed the Storytell case study — impressive adoption. Before we chat, quick clarification: is HumaneBench raising under Building Humane Technology (the public benefit org) or are you spinning out a for-profit entity? This affects whether it fits my investment thesis. Either way happy to chat and share my typical check sizes ($10-50k). [Book link]"
2025-01-08 | Case Study Analysis
Type: Document review
Summary: Analyzed "HumaneBench - Storytell Case Study.pdf" — this is a case study of Storytell.ai using the HumaneBench framework, NOT a pitch deck. Confirms HumaneBench is an open-source project, not a VC-backable startup.
Key Findings:
- HumaneBench is an open-source AI ethics evaluation rubric (GitHub)
- Storytell.ai implemented it in production (4,287 evaluations)
- Found systematic issues: capability undermining, parasocial language
- Led to product changes (Teacher Mode prioritized)
- Erika Anderson is co-founder of Storytell.ai AND founder of Building Humane Technology
Team LinkedIn:
- Erika Anderson (Founder): https://www.linkedin.com/in/erikamanderson/
- Jack Senechal (Fractional CTO): https://www.linkedin.com/in/jacksenechal/
- Sarah Ladyman (Experience Designer): https://www.linkedin.com/in/sarahladyman/
- Andalib Samandari (AI Architect): LinkedIn not found
Related (from case study):
- DROdio - Daniel Odio (CEO, Storytell.ai): https://www.linkedin.com/in/drodio
Assessment: Likely pass — open-source framework with nonprofit structure, no equity upside
TODOs:
- Confirm entity structure → Public benefit org, open-source project
- Deal source → Direct outreach, Steven scheduled meeting Dec 18
- Are they raising? → Not yet, no deck, but open to early conversations
- Clarify: for-profit spinout or nonprofit? (key question for next call)
2025-01-08 | Initial Research
Type: Desk research
Summary: HumaneBench is a benchmark by Building Humane Technology, a grassroots public benefit org. Unclear if this is a for-profit startup or nonprofit structure.
Initial TODOs (resolved):
- Confirm entity structure — public benefit org, not for-profit
- Get pitch deck or fundraising materials if they exist
- Clarify deal source — intro, cold inbound, etc.?
- If nonprofit structure, likely quick pass (no equity upside = can't 1000x) → Confirmed, likely pass