The Seven Sources of Harm Nobody Warned You About
Beyond “The problem is just that the data is biased.”
Welcome to this week’s edition, where I crack open the myth of “objective AI” and call out the seven flavors of bias hiding in plain sight (spoiler: fixing the spreadsheet won’t save you).
Scroll down for the best reads of the week (don’t miss the one roasting Product Managers).
As always, you’ll find enough spicy links to keep your inner contrarian well-fed.
Best Reads of the Week
Due to the number of links, I kept my comments short!
The End of Human Product Management (and It IS for the Reasons you Think) - Look, everyone’s obsessed with using AI to replace someone; why not start with Product Managers?
The Unspoken State of PM as of mid-2025 - Seems PMs are getting roasted this week.
Why Romanticizing PLG is Dangerous - Read with the article below
The nasty side effects of bad PLG - Read with the article above
Getting People to Listen is a Skill — Here’s Where to Start - Good stuff if your go-to solution for getting people to listen is “screaming”
AI Office Tender - European or not, you should keep an eye on the EU’s artificial intelligence law and the myriad reactions to it
The Seven Sources of Harm Nobody Warned You About
“The problem is just that the data is biased.”
This throwaway line gets tossed around so often it might as well be a knock-knock joke. Machine learning systems aren’t magic boxes. They are messy pipelines, full of human choices, historical baggage, technical trade-offs and social context.
Bias and harm can sneak in at every stage, from the moment someone decides what data to collect to the deployment of a system making split-second decisions that impact lives.
We see this playing out in the real world, from hiring tools to predictive policing.
By understanding the seven key sources of harm throughout the ML lifecycle, we can stop blindly blaming “the data” and actually start fixing what’s broken. Let’s get specific about where bias comes from, what it looks like in practice, and what you can do about it.
1. Historical Bias: The Ghosts in the Data
Historical bias is the kind of bias you can’t escape, even if you collect “perfect” data. Why? Because sometimes the world itself is unfair, and if your system learns from the world as it is (or was), it will reflect and perpetuate those inequalities.

Suppose you train an AI on employment records from the past 50 years. If women were systematically excluded from certain roles, your AI will “learn” that engineers are mostly men. That’s not an artifact of bad data collection; it’s a faithful reflection of historical exclusion.
Think about predictive policing, where an AI recommends which neighborhoods police should patrol. If decades of over-policing certain neighborhoods have filled the crime database, the AI will keep sending police to those same places, locking in a feedback loop.
So what?
Historical bias is tough to fix by “collecting more data.” What you actually need is a willingness to question whether automating a biased process is ever appropriate or if your AI should sometimes break with the past.
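The feedback loop in the predictive policing example is easy to see in a toy simulation. Here’s a minimal sketch with made-up numbers (not a model of any real system): two neighborhoods have identical true crime rates, but one starts with more recorded incidents because it was patrolled more heavily in the past.

```python
import numpy as np

rng = np.random.default_rng(0)
true_crime_rate = np.array([0.05, 0.05])   # two neighborhoods, identical underlying rates
recorded = np.array([500.0, 100.0])        # historical records skewed by past over-policing
total_patrols = 100

for year in range(10):
    # Patrols are allocated in proportion to what the database says
    patrols = total_patrols * recorded / recorded.sum()
    # You only record crime where you look: new records scale with patrols
    recorded += rng.poisson(patrols * true_crime_rate * 20)

print(recorded / recorded.sum())  # the initial skew never corrects itself, despite equal true rates
```

No amount of extra data collected through the same patrol pattern fixes this; the collection process itself is the problem.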
2. Representation Bias: Who Gets Seen, Who Gets Ignored
Representation bias creeps in when your dataset doesn’t actually reflect everyone it’s supposed to serve. If your data underrepresents a group, your model will fail them, sometimes in dramatic ways.

ImageNet, the backbone of countless computer vision breakthroughs, is notoriously Western-centric. Almost half its photos come from the US, while massive populations in Asia and Africa barely show up. Researchers found that ImageNet-trained models performed significantly worse on photos from India and Pakistan, missing subtle (and not-so-subtle) differences in dress, culture, and context.
So what?
Diversify your data. On purpose, not by accident. Audit who’s included and who’s invisible. Sometimes, building targeted datasets or oversampling underrepresented groups is the only fix.
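As a starting point for that audit, here’s a minimal sketch assuming a pandas DataFrame `df` with a hypothetical `region` column: count who’s in the data, then oversample the smaller groups up to the size of the largest one.

```python
import pandas as pd

# Audit: who is visible, who is invisible
counts = df["region"].value_counts()
print(counts / counts.sum())

# Naive rebalancing: oversample every group up to the size of the largest one
target = counts.max()
balanced = (
    df.groupby("region", group_keys=False)
      .apply(lambda g: g.sample(target, replace=True, random_state=0))
)
```

Oversampling only duplicates the rows you already have; when a group is barely present at all, collecting genuinely new data for it is usually the better fix.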
3. Measurement Bias: When the Ruler’s Off
Measurement bias pops up when the features or labels you use to “measure” reality are flawed, especially if they mean different things for different groups.

Imagine trying to build an AI that predicts “workplace performance.” If you use “number of errors caught” as your metric, but some employees are more closely monitored than others (think new hires vs. veterans, or one department vs. another), your AI will learn more about who’s being watched than who’s actually doing good work.
Another scenario: Diagnosed cases of a disease are often used as a label for “has the disease.” But systemic underdiagnosis in certain groups (e.g., women and heart disease, or minority populations and pain management) means the label itself is less accurate for those groups, leading the model astray.
So what?
Always question your proxies. What are you really measuring? Is that label equally meaningful for everyone in your data? Sometimes, it’s better to combine multiple proxies or use qualitative data to supplement quantitative labels.
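One practical way to question a proxy is to hand-review a small sample and check how well the proxy label agrees with that reviewed “ground truth” in each group. A minimal sketch, assuming a DataFrame `audit` with hypothetical columns `group`, `proxy_label`, and `true_label`:

```python
import pandas as pd

# Per-group agreement between the proxy label and the manually reviewed label
audit["match"] = audit["proxy_label"] == audit["true_label"]
agreement = audit.groupby("group")["match"].mean()
print(agreement)  # if the proxy is much less reliable for one group, the label itself is biased
```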
4. Aggregation Bias: The Danger of One-Size-Fits-All Models
Aggregation bias happens when you try to shoehorn everyone into a single model, even though subgroups might behave differently or need different treatment altogether.

Suppose you build a health risk prediction tool that works “pretty well” for most patients. But the underlying risk factors for men and women differ, or a symptom that’s a red flag for one group is perfectly normal for another. By lumping everyone together, your model gives bad advice to everyone who isn’t in the majority.
So what?
Build subgroup-specific models, or at least include interaction terms and domain expertise. Sometimes, a little complexity in modeling goes a long way towards fairness.
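One quick diagnostic is to compare a single pooled model against per-group models. A minimal sketch with scikit-learn, assuming a feature DataFrame `X`, labels `y`, and a `group` Series (all hypothetical):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# One model for everyone
pooled = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()

# One model per subgroup
per_group = {}
for g in group.unique():
    mask = group == g
    per_group[g] = cross_val_score(
        LogisticRegression(max_iter=1000), X[mask], y[mask], cv=5
    ).mean()

print(pooled, per_group)  # a subgroup that does much better with its own model is being hurt by aggregation
```

Separate models aren’t always the answer (small subgroups may not have enough data to stand alone), but the gap tells you whether the one-size-fits-all assumption actually holds.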
5. Learning Bias: When the Algorithm’s Goals Miss the Point
Learning bias arises from the choices you make about how your AI “learns.” Sometimes, optimizing for accuracy or speed can unintentionally make things worse for certain groups.

Most machine learning models optimize for a global objective like overall accuracy or minimizing average error. But if your dataset is imbalanced, the model will do great on the majority and poorly on minorities, because that’s what makes the numbers look best.
So what?
Don’t just chase the highest accuracy. Regularly audit model performance by subgroup, and use multi-objective optimization where possible, balancing overall accuracy with fairness or coverage.
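On the imbalance point, most libraries let you change what the model optimizes for. A minimal sketch with scikit-learn (hypothetical `X_train`, `y_train`, `X_test`, `y_test`), comparing the default objective against one that reweights the minority class:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score

default = LogisticRegression(max_iter=1000).fit(X_train, y_train)
reweighted = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X_train, y_train)

for name, model in [("default", default), ("reweighted", reweighted)]:
    # Balanced accuracy weighs each class equally instead of rewarding majority-class wins
    print(name, balanced_accuracy_score(y_test, model.predict(X_test)))
```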
6. Evaluation Bias: The Myth of the Universal Benchmark
Evaluation bias creeps in when the test datasets (benchmarks) you use to measure model performance don’t reflect the people your model will actually serve.

AI teams love benchmarks: UCI datasets, ImageNet, standard speech datasets. But if your benchmark doesn’t include enough examples of a certain group, your model might look great in the lab and fail spectacularly in production.
So what?
Pick or build benchmarks that are representative of your user base. Report performance not just in aggregate, but by subgroup, ideally intersectionally.
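Reporting by subgroup takes only a few lines once your test set carries the relevant attributes. A minimal sketch, assuming a test DataFrame `test` with hypothetical columns `y_true`, `y_pred`, `gender`, and `region`:

```python
import pandas as pd

test["correct"] = test["y_true"] == test["y_pred"]

print(test["correct"].mean())                                # aggregate: what the leaderboard shows
print(test.groupby("gender")["correct"].mean())              # one axis at a time
print(test.groupby(["gender", "region"])["correct"].mean())  # intersectional slices
```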
7. Deployment Bias: When Real Life Isn’t a Lab
Deployment bias is the most “human” of all: it shows up when your model gets used (or misused) in ways you didn’t plan for, or when the real-world context introduces new variables you didn’t train for.

Your AI is designed as a “decision aid,” but people start treating it as a final authority. Or, you build a model to predict who might benefit from a medical intervention, but doctors use it to decide who doesn’t get care.
So what?
Design for transparency, adaptability, and recourse. Make it clear how your model should (and shouldn’t) be used, and build in feedback mechanisms to catch problems quickly.
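One cheap feedback mechanism is to watch whether the inputs your model sees in production still look like the data it was trained on. A minimal sketch using a two-sample Kolmogorov-Smirnov test on a single hypothetical feature (`age`), with hypothetical `train_df` and `live_df` DataFrames:

```python
from scipy.stats import ks_2samp

stat, p_value = ks_2samp(train_df["age"], live_df["age"])
if p_value < 0.01:
    # The live population no longer matches the training population
    print("Input drift detected: re-check how (and on whom) the model is being used")
```

Distribution checks won’t catch humans treating a decision aid as a final authority, which is why clear documentation of intended use matters just as much.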
Wrapping Up: What Does This Mean for You?
Whether you’re building models, buying AI products, or just living in a world shaped by algorithms, understanding these seven sources of harm is important.
For AI builders:
Don’t settle for “fixing the data.” Audit your whole pipeline.
Test and report performance by subgroup, not just aggregate numbers.
Collaborate with domain experts and affected communities to define what “harm” really means in context.
Assume your model will be used in unexpected ways, and plan for it.
For tech leaders and product managers:
Ask hard questions: Who is missing from the dataset? Are we auditing for fairness? How will this model be used in the wild?
Build in processes for ongoing monitoring, not just one-off audits.
Push for impact assessments before AND after deployment.
Bias isn’t just “in the data.” It can sneak in anywhere. Recognizing each source gives us a fighting chance to make AI more fair, reliable, and aligned with our values. The next time someone blames “biased data,” ask yourself: which kind?
Closing Loop
If you found value here, forward this newsletter to a colleague who still thinks “the data is just the data.”
As an extra for reading all the way through, here’s a visual exploration of the difficulty of fixing “hidden bias”, by Google PAIR.
And if you have a story of bias gone wrong (or right), I want to hear it; your battle scars make us all smarter.
Keep rolling and see you next week.
Share your thoughts:
How did you like today’s newsletter?
You can share your thoughts at [email protected] or share the newsletter using this link.