Chapter 2: AI Reflects Human Choices and Perspectives

AI Reflects Human Choices and Perspectives 🪞

Discovering who “teaches” AI and how human choices shape technology.

Imagine you are creating a brand-new custom character for your favorite video game. You get to choose their outfit, their skills, their strengths, and their weaknesses. You decide if they are going to be a stealthy ninja or a heavy-armor knight. If you make them super fast but terrible at jumping, that is exactly how they will behave in the game. The character does not make those choices; you do. They simply act out the rules you programmed for them.

Artificial Intelligence is remarkably similar. Because AI systems run on computers, it is incredibly easy to assume they are perfectly logical, neutral, and objective. We are used to computers being literal. If you type 2 + 2 into a calculator, it will always say 4. Because of this, we tend to think of all machines as being free from human flaws, emotions, and prejudices.

But AI is not a simple calculator, and it is not truly independent. Behind every single AI system is a massive team of humans making thousands of invisible choices. Humans decide what the AI will be used for, what rules it will follow, what problems it is trying to solve, and most importantly, what information it will use to learn.

If AI is a mirror reflecting the world, humans are the ones holding the mirror and choosing which exact part of the world it points at. If you point a mirror at a messy room, the mirror shows a mess. To truly understand AI, we have to look behind the screen and realize that AI does not invent its own knowledge out of thin air. It reflects our human choices, our unique perspectives, and sometimes, our deeply rooted human flaws.

2.1 — The Data Diet: What Does AI “Eat”? 🍽️

In Chapter 1, we learned that AI systems are powered by algorithms that look for patterns in massive amounts of data. But where exactly does all that data come from?

Think of data as an AI’s diet. In computer science, there is a famous saying: Garbage in, garbage out. This means that if you feed a computer bad information, it will give you bad answers. If an AI is going to learn how to write essays, translate languages, or generate cool images of astronauts riding skateboards, it has to consume a massive “diet” of information first. This is called training data.

For most modern AI tools, the primary source of training data is the open internet. Developers use automated programs to scrape or download massive chunks of the web. They feed the AI billions of websites, digital books, Wikipedia articles, news stories, online forums, and public social media posts. To put that in perspective, it would take a human being thousands of lifetimes to read the amount of text an AI consumes in just a few weeks. If you have ever posted a public comment on a YouTube video, uploaded a picture to a public profile, or written a review for a video game, there is a very good chance your words and images have been scooped up to become part of an AI’s training data.

But it is not just old internet posts. AI systems also learn from real-world, physical data collected by digital sensors. For example, GPS apps track how fast thousands of cell phones are moving on a highway to learn where traffic jams are. Smartwatches track heart rates and sleep patterns.

Furthermore, AI systems are constantly gathering new data from you in real time to adjust their behavior. Every single time you click “like” on a funny TikTok, replay a specific part of a song on Spotify, or skip past an advertisement before it finishes, the AI algorithm is feeding on that exact data point. It uses your everyday digital choices to adjust its mathematical predictions and serve you more of what it thinks you want to keep your attention on the screen.

2.2 — The Invisible Human Workforce 👷

Because computers process data so quickly, we often talk about AI “learning” to recognize a dog or “learning” to block a spam email as if it happens by magic. But AI does not learn these things entirely on its own. It requires a massive, often invisible, amount of human labor.

Before an AI algorithm can even begin to recognize patterns, humans have to organize, categorize, and label the training data. Imagine showing a toddler a picture book and pointing out, “This is a fire truck, this is a banana, this is a dog.” AI developers have to do this exact same thing, but on a massive, global scale. To teach a self-driving car algorithm to recognize a stop sign, humans must look at thousands and thousands of photos of streets and manually draw digital boxes around the stop signs. They have to label edge cases, too — like what a stop sign looks like when it is covered in snow, or blocked by a tree branch.

You’ve Helped Train AI! 🤖

You have probably even helped label data for AI yourself! Have you ever tried to log into a website and had to prove you were not a robot by clicking all the pictures containing a crosswalk, a bicycle, or a traffic light? That is called a CAPTCHA. While it does help prove you are a human, it is also a clever way for tech companies to use your brainpower to label images for free. Each time you click the crosswalks, your answers can be used to help teach an AI system how to identify them in the real world.

Ghost Workers 👻

Beyond labeling images, humans also work as content moderators and AI raters. Sometimes called “ghost workers,” these are thousands of people around the world whose full-time job is to review content, chat with AI systems, and grade their answers. They have to review thousands of AI outputs to tell the system, “This is a helpful, safe answer,” or “No, this answer is toxic, rude, and harmful.” The AI depends heavily on the personal judgment, beliefs, and cultural backgrounds of the specific humans doing the labeling. If the human reviewers make a mistake, or if they only view the world from one specific cultural perspective, the AI will learn those mistakes and limits as if they were absolute facts.

2.3 — The Problem with the Mirror: Algorithmic Bias ⚠️

Because AI learns entirely from human-created data and human-labeled categories, it inevitably learns human prejudices. This is known as algorithmic bias. Bias occurs when an AI system produces unfair, inaccurate, or unequal outcomes for certain groups of people based on flaws in its training data or how it was designed.

Facial Recognition Failures

Let’s look at a real-world example that middle and high schoolers experience all the time: facial recognition and camera filters. Have you ever used a filter on Snapchat or Instagram that puts a funny hat on your head, gives you dog ears, or changes the shape of your face? To do this, the AI has to accurately track exactly where your eyes, nose, and mouth are in the video.

However, a few years ago, computer scientists discovered a massive problem: many popular facial recognition AIs were terrible at recognizing the faces of people with darker skin tones. Some systems wouldn’t even register that a person was in the photo at all. Why did this happen? It wasn’t because the computer was intentionally trying to be racist or mean. It happened because of the AI’s data diet. The human developers who built the system trained it using millions of photos, but the vast majority of those photos were of people with lighter skin. The AI became an expert at recognizing lighter faces and failed miserably at recognizing darker ones, simply because it was not given enough examples to learn from.

This might seem like a small issue for a funny photo filter, but it becomes a massive problem when police departments use flawed facial recognition to identify suspects, or when a student with dark skin gets locked out of their own school tablet because the camera doesn’t recognize them.
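
To see how a lopsided data diet causes this kind of failure, here is a minimal Python sketch. Everything in it is invented for illustration: each “photo” is boiled down to a single made-up number, group A and group B are pretend groups, and the “detector” is a simple rule that treats any value close to what it saw in training as a face. Real facial recognition systems are far more complicated, but the same logic applies.

```python
from statistics import mean, stdev

def make_photos(start, count):
    """Make `count` pretend photos whose values run from start to start + 19."""
    return [start + (i % 20) for i in range(count)]

def train_detector(training_values):
    """Learn the 'normal' range of values from the training photos."""
    center = mean(training_values)
    spread = stdev(training_values)
    # Anything within two standard deviations of the average counts as a face.
    return (center - 2 * spread, center + 2 * spread)

def detection_rate(detector, test_values):
    """Fraction of test photos that fall inside the learned range."""
    low, high = detector
    hits = sum(1 for value in test_values if low <= value <= high)
    return hits / len(test_values)

# Made-up test photos: group A values are 60-79, group B values are 20-39.
group_a_test = make_photos(60, 20)
group_b_test = make_photos(20, 20)

# Skewed diet: 95 training photos of group A, only 5 of group B.
skewed = train_detector(make_photos(60, 95) + make_photos(20, 5))
# Balanced diet: 50 training photos of each group.
balanced = train_detector(make_photos(60, 50) + make_photos(20, 50))

for name, detector in [("skewed", skewed), ("balanced", balanced)]:
    print(name,
          "| group A found:", detection_rate(detector, group_a_test),
          "| group B found:", detection_rate(detector, group_b_test))
```

With the skewed diet, this toy detector finds every group A photo and none of the group B photos. With the balanced diet, it finds both. Nothing about the detector's rule changed between the two runs. Only the training data did.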

Bias Is Everywhere

Bias can show up anywhere, not just in pictures:

Where Bias Shows Up	What Goes Wrong
Voice assistants (Siri, Alexa)	Struggle to understand certain accents, fast speakers, or kids with speech impediments because training data heavily features standard adult American or British accents
Language translation	If trained on 1950s data where most doctors were men and nurses were women, it may automatically translate “doctor” as “he” and “nurse” as “she”
Essay grading AI	If trained only on essays from one region or school type, it might unfairly penalize students who use different cultural slang or regional phrases
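
The translation row in the table can be sketched as a tiny counting model. This is a deliberately simplified illustration, not how real translation software works: the “training data” below is a made-up list of ten sentences per job, and the hypothetical `guess_pronoun` function simply picks whichever pronoun it saw most often with that job.

```python
from collections import Counter, defaultdict

def train_pronoun_model(examples):
    """Count which pronoun appears with each job in the training examples."""
    counts = defaultdict(Counter)
    for job, pronoun in examples:
        counts[job][pronoun] += 1
    return counts

def guess_pronoun(model, job):
    """Pick the pronoun seen most often with this job in training."""
    if job not in model:
        return "they"  # the model never saw this job, so it has no pattern to copy
    return model[job].most_common(1)[0][0]

# Made-up, old-fashioned training data: 9 out of 10 doctors are "he",
# and 9 out of 10 nurses are "she".
old_examples = (
    [("doctor", "he")] * 9 + [("doctor", "she")] * 1
    + [("nurse", "she")] * 9 + [("nurse", "he")] * 1
)

model = train_pronoun_model(old_examples)
print("doctor ->", guess_pronoun(model, "doctor"))
print("nurse ->", guess_pronoun(model, "nurse"))
```

The model answers “he” for doctor and “she” for nurse every single time, even though one in ten examples said otherwise. A pattern that was only “common” in the data becomes a rule in the output.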

2.4 — Fixing the Mirror 🔧

Bias inherently exists in AI because bias inherently exists in human society. If the internet is full of stereotypes, rumors, and mean comments, an AI trained on the internet will act out those stereotypes. It is just holding the mirror up to us.

However, because humans create these systems, humans also have the power to fix them. AI developers have a serious ethical responsibility to act as “fairness detectives.” Before releasing an AI tool into the real world, they must test it extensively to see exactly who it benefits and who it might accidentally harm. They do this through a process called red teaming, where developers intentionally try to break the AI or trick it into saying something biased, just to see where its weak spots are so they can be patched. They can also fix algorithmic bias by intentionally feeding the AI more diverse training data, making sure photos, voices, languages, and perspectives from all different types of people, ages, and cultures around the world are included in the data diet.

As a user of AI, you also have an incredibly important role to play. Whenever you use a chatbot to help brainstorm for a project, or when an algorithm recommends a news story or a video to you, you must remember that the AI is not a perfect, objective genius. It is simply a reflection of the data humans chose to give it.

2.5 — You Are What You Click: The Engagement Loop 📲

Every time you open TikTok, Instagram, or YouTube, an AI is watching you very carefully. Not in a creepy-spy way — but in a mathematical, data-collection way. Every single thing you do sends a signal:

❤️ You liked a video → strong positive signal
⏩ You scrolled past a video in less than 2 seconds → strong negative signal
🔁 You rewatched a clip three times → extremely strong positive signal
💬 You left an angry comment → positive signal (engagement is engagement!)
📤 You shared a video with your group chat → the strongest signal of all

These signals feed into what is called the engagement loop. The loop works like this:

You take an action → The algorithm collects data → The pattern model updates → You see more targeted content → You react more strongly → More data is collected → Repeat

This is the basic idea behind TikTok’s “For You” page. When you first create an account, TikTok shows you a mix of random videos. As soon as you start interacting, the algorithm begins building a model of you: your interests, your humor, your emotions, the time of day you watch certain types of content. Within the first hour or two of use, the algorithm can already have a surprisingly accurate picture of what you like. Within a few days, it can predict what video will keep you watching with unsettling accuracy.
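
The engagement loop can be sketched as a tiny simulation. This is a toy model built on assumptions, not TikTok’s actual algorithm: the signal weights are invented numbers that only follow the order of the list above (a share counts most, a skip counts against), and the three topics and the pretend user’s reactions are made up.

```python
# Invented weights: bigger number = stronger signal. Note that an angry
# comment still counts as a positive signal, just like in the list above.
SIGNAL_WEIGHTS = {"like": 2, "skip": -2, "rewatch": 3, "comment": 1, "share": 4}

def update_scores(scores, topic, action):
    """The algorithm collects one data point and updates its model of you."""
    scores[topic] = scores.get(topic, 0) + SIGNAL_WEIGHTS[action]
    return scores

def pick_next_topic(scores):
    """Show whatever topic currently has the highest score."""
    return max(scores, key=scores.get)

def run_loop(scores, reactions, rounds):
    """Repeat the loop: show a video, watch the reaction, update, repeat."""
    shown = []
    for _ in range(rounds):
        topic = pick_next_topic(scores)
        shown.append(topic)
        update_scores(scores, topic, reactions[topic])
    return shown

# A brand-new account: the algorithm knows nothing yet.
scores = {"cooking": 0, "gaming": 0, "news": 0}
# A pretend user who rewatches gaming clips and skips everything else.
reactions = {"cooking": "skip", "gaming": "rewatch", "news": "skip"}

print(run_loop(scores, reactions, rounds=5))
print(scores)
```

After a single skipped cooking video, the feed in this sketch switches to gaming and never leaves. The news topic is never shown at all, which is a small-scale version of how a rabbit hole forms.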

The Dark Side: Engagement Bait 🪤

Here is the uncomfortable truth: the engagement loop does not optimize for what is good for you. It optimizes for what keeps you on the app the longest. This is why you sometimes fall into rabbit holes of content that makes you feel anxious, angry, or upset — but you can’t stop watching.

Content creators and marketers have figured this out and deliberately create engagement bait — content specifically designed to trigger strong emotional reactions like outrage, shock, fear, or morbid curiosity. A video titled “You won’t BELIEVE what this school did to this student 😡” is not designed to inform you. It is designed to get your heart rate up so you click, share, and comment — all of which train the algorithm to push it to millions more people.

2.6 — The Cost of Convenience: The Privacy Trade-Off 🔓

Here is a question nobody asks when they download a free app: “If this app is free, how does the company make money?”

The answer, almost always, is your data. The implicit bargain of the internet age is this: you get a free service (maps, email, social media, music), and in exchange, the company gets to collect detailed information about your behavior, preferences, location, and habits. They then sell access to this data (or directly sell ads targeted at you) to generate billions of dollars in revenue. You are not the customer — you are the product.

The Cambridge Analytica Scandal

In 2018, the world discovered just how far this data economy could go. A political consulting firm called Cambridge Analytica obtained detailed personal data on approximately 87 million Facebook users — without those users’ knowledge or meaningful consent. They got this data through a seemingly innocent third-party quiz app that asked users to share not just their own profile, but also the profiles of all their friends.

Cambridge Analytica then used this data to build detailed psychological profiles of voters and send them highly personalized political advertisements: different ads targeted at different people based on their fears, values, and political leanings. The goal was to influence the outcome of elections, including the 2016 US presidential election. The firm was also linked to the Brexit campaign in the UK, although how much work it actually did there is disputed.

Why does this matter? In a democracy, we believe that citizens should form their own political opinions based on honest debate and shared facts. When an AI can secretly profile 87 million people and push targeted propaganda to each one individually, it becomes nearly impossible to have that honest debate. Different voters were seeing completely different — and sometimes completely false — versions of political reality, engineered specifically to manipulate their vote.

You Think You’re Sharing	What the App Actually Collects
Just your username and profile photo	Your precise GPS location, down to the building you’re in
Your posts and photos	Every other app on your phone and how often you use each one
Your messages to friends	Metadata: who you message, when, and for how long
Nothing (the app is just in the background)	Microphone access patterns, contact lists, browsing history
Your age and birthday	Inferred income, health status, political views, and relationship status based on behavior

2.7 — Around the World: Different Rules for Data 🌍

Not everyone agrees on how much power companies should have over your data. Different countries have taken very different approaches to protecting (or not protecting) their citizens.

In 2016, the European Union adopted the GDPR (General Data Protection Regulation), which took effect in May 2018. It is one of the strongest data privacy laws in history. Key features include:

Consent first: Companies need a clear legal reason to collect your data, and when that reason is your consent, they must ask for it plainly. No more buried checkboxes.
Right to explanation: If an AI makes a major decision about you entirely on its own (like rejecting your job application), you have the right to ask for a human to review it and for meaningful information about how the decision was made.
Right to be forgotten: In many situations, you can ask a company that holds your data to permanently delete it, as if you never existed in their system. For example, you can ask Google to delete your search history, or ask an app to erase your account completely, not just deactivate it.
Massive fines: Companies that violate GDPR can be fined up to €20 million or 4% of their global annual revenue, whichever is higher. For a company like Google, that could be billions of dollars.

The United States: A Different Approach 🇺🇸

The US takes a very different approach. Instead of one comprehensive federal law, the US has a patchwork of industry-specific rules. The general philosophy is opt-out rather than opt-in: companies can collect your data by default, and it is your responsibility to find the settings buried deep in menus and turn off data collection if you don’t want it. Most people never do.

Some states (like California, with its CCPA) have stronger protections. But at the national level, US tech companies largely operate under a system of voluntary guidelines and self-regulation.

Why It Matters to You

The rules your country has about data directly affect your daily life. In Europe, when you visit a website, you see a clear “Accept or Reject All” cookie banner because the law requires genuine choice. In the US, you might just see a banner that says “By using this site, you agree to our terms” — no real choice at all.

As an AI-literate citizen, understanding these differences helps you advocate for the protections you deserve — no matter where you live.

Can an Algorithm Decide Someone’s Future?

In courtrooms across the United States, judges have used a software tool called COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) to help decide how likely a criminal defendant is to commit another crime in the future. This “recidivism score” influences decisions about bail, sentencing, and parole.

In 2016, the investigative journalism organization ProPublica conducted an in-depth analysis of COMPAS predictions and found a deeply troubling pattern: Black defendants who did not go on to reoffend were nearly twice as likely as white defendants to be incorrectly labeled high-risk. At the same time, white defendants who did go on to reoffend were more likely to have been incorrectly labeled low-risk.

The algorithm was not “choosing” to be racist. But it was trained on historical criminal justice data — data that itself reflects decades of systemic racial inequity in policing and prosecution. The algorithm learned those historical patterns and reproduced them, stamping them with the false authority of mathematical certainty.

This is a documented, real-world example of algorithmic bias causing direct harm in a high-stakes setting. People’s freedom was influenced by a biased computer score — and most defendants didn’t even know the algorithm was being used.

This is exactly why AI literacy, transparency, and independent auditing are not just academic topics. They are matters of justice.
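
An audit like this often starts with a simple calculation called the false positive rate: out of the people who did not reoffend, what fraction did the algorithm wrongly label high-risk? The Python sketch below shows the arithmetic. The numbers are invented for illustration and are not ProPublica’s actual figures, and the two groups and the helper `make_records` are hypothetical.

```python
def false_positive_rate(records):
    """Of the people who did NOT reoffend, what fraction were labeled high-risk?"""
    did_not_reoffend = [r for r in records if not r["reoffended"]]
    if not did_not_reoffend:
        return 0.0  # nobody to measure, so avoid dividing by zero
    wrongly_flagged = sum(1 for r in did_not_reoffend if r["labeled_high_risk"])
    return wrongly_flagged / len(did_not_reoffend)

def make_records(wrongly_flagged, correctly_cleared):
    """Build made-up records for people who did not reoffend."""
    flagged = [{"labeled_high_risk": True, "reoffended": False}] * wrongly_flagged
    cleared = [{"labeled_high_risk": False, "reoffended": False}] * correctly_cleared
    return flagged + cleared

# Invented numbers: 100 people in each group, none of whom reoffended.
group_1 = make_records(wrongly_flagged=40, correctly_cleared=60)
group_2 = make_records(wrongly_flagged=20, correctly_cleared=80)

rate_1 = false_positive_rate(group_1)
rate_2 = false_positive_rate(group_2)
print("Group 1 false positive rate:", rate_1)
print("Group 2 false positive rate:", rate_2)
print("Group 1 is wrongly flagged", rate_1 / rate_2, "times as often")
```

In this made-up example, the score is wrong about group 1 twice as often as it is wrong about group 2. Comparing error rates group by group is one way an auditor can spot unfairness that a single overall accuracy number would hide.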

Chapter Activity: Fairness Detectives 🔍

Let’s put your detective skills to the test! Imagine your school district is building a brand-new AI system to automatically select which student gets the “Student of the Month” award. The developers plan to train the AI using data from the past 10 years of school records to figure out what a “good student” looks like.

Step 1: Investigate the Data Diet

What specific kind of data do you think the AI will look at? Think about:

📊 Grades and test scores
📋 Attendance records
🏈 Sports statistics
🎭 Club memberships
⚠️ Detention records
📝 Teacher notes and recommendations

Step 2: Spot the Bias

How could using 10 years of historical data be unfair to current students?

What if a new club (like an Esports team or a Robotics club) was only created this year? Would the AI know to value it as much as the 50-year-old football team?
What if certain teachers in the past were stricter graders than others?
What if the definition of a “good student” has changed over the last decade?

Step 3: Fix the System

If you were the developer in charge of this project, what new rules or data would you add to make sure the AI is fair to every type of student, not just the ones who get straight A’s or play traditional sports?

Part 2: My Data Diet Audit 📱

Time to investigate your own digital life! For this activity, list 5 apps you use regularly and think carefully about what data each one might be collecting from you. Then honestly rate how comfortable you are with that collection.

For each app, fill in the table below:

App Name	Likely Data Collected	Comfort Rating (1-5)
Example: TikTok	Location, watch time, device info, contacts, clipboard	2 — I didn’t realize it tracked my location
Your App 1
Your App 2
Your App 3
Your App 4
Your App 5

Rating Scale: 1 = Very uncomfortable, 3 = Neutral, 5 = Completely fine with it

Discussion Questions:

Were there any apps where you were surprised by what they collect?
Did any app collect data you weren’t aware of before this audit?
Are there any apps you might delete or limit based on what you found?
What is the difference between data collection that helps you (like a maps app knowing your location) vs. data collection that profits from you?