Nathan Newman

Racial and Economic Profiling in Google Ads: A Preliminary Investigation (Updated)

Posted: 09/20/11 03:17 PM ET

Back in July, I wrote a series, The Cost of Lost Privacy: How Google and Datamining Drive Economic Inequality in Our Nation, about how advertisers are increasingly able to use users' demographic data and online behavior to target ads at specific vulnerable groups.

With Google's Chairman Eric Schmidt testifying before the Senate on Wednesday, I'm hoping the senators will raise questions on what kinds of contextual and behavioral targeting Google allows in its advertising and what steps it has taken to stop racial and economic profiling that harms vulnerable groups. Given the billions of dollars Google made from subprime mortgage lenders advertising on its site, and the revenue it raised from similarly shady advertisers, as the recent pharma ad scandal revealed, there are legitimate questions about Google making its advertising services available to unethical advertisers.

There is a large body of research showing that employers, financial lenders, car salesmen and other merchants continue to charge black and Hispanic customers more for the same services when they can identify them. The classic test for showing this phenomenon has been to pair white and black buyers or applicants for the same product or job and see whether the "testers" were treated the same. The Urban Institute found non-white homeowners received less favorable financial terms from mortgage lending institutions. Another study submitted nearly identical resumes to help-wanted ads, finding that "white sounding" names were 50 percent more likely than "black sounding" names to get an interview.

The question is how and whether ads are being served up to users in similarly racialized ways in online advertising. The reality is that Google and advertisers have a whole battery of data-mining tools to profile users precisely based on both the context of their search terms and their long-term online behavior, so the ability to profile is clearly there.

The Experiment: Based on these questions, I conducted a small experiment to begin to see the extent to which online advertisers engage in such targeting. The following shows the results of this preliminary investigation of racial and economic profiling through Google AdWords. Given the relatively small sample size, the results cannot be treated as definitive, but they do highlight where racial profiling may be occurring, and they raise questions that policymakers should be asking of search engine operators like Google to ensure that their services do not allow such profiling. The results also highlight where public agencies might conduct larger-scale statistical investigations of racial discrimination in the industry. If nothing else, these results should reinforce the fact that different users often pull up very different ads on the same terms, something that not all policymakers or members of the public recognize.

As a proxy for race, the experiment used nine names and associated each of them with a number of simple terms. For each of three racial/ethnic groups (white, black and Hispanic), I chose two male names and one female name strongly associated with that group. I also chose one black name with a Muslim derivation to see how that would affect the results. (See here and here for a few sources on picking such racially coded names.) As the Urban Institute studies as well as related ones show, companies often use names themselves as a proxy for racial profiling, so names are a useful first pass on the topic, bearing in mind that online advertisers actually have a barrage of additional datamining tools to further refine profiling based on user demographics and search behavior.

Google's insistence that users use their real names for its Google Plus service also raises the importance of how Google and advertisers may be using names as a proxy for profiling. Google Chairman Schmidt himself has explained in an interview that the real names policy is about better targeting ads, saying "we can have slightly better search results if I know a little bit about who you are." And "better searches" for Google are searches that please their advertisers, so the importance of identifying users by name for those advertisers can be assumed.

The experiment used ads that show up in Gmail, putting names and any associated terms in the subject line. This was done for ease of both producing and displaying the results. (For a fuller explanation of the methodology, so you can replicate the experiment on your own computer, see the Notes on Methodology below.)

The Results: First, some results seem to show little difference between names. That is to be expected: while some advertisers, or Google itself, may be using demographic profiling for certain products, others do not. That racism still exists in our commercial life doesn't mean every merchant engages in it. Second, there is no doubt inherent randomness in the ads delivered across Google products, which is one reason why, given the sample size, any inferences can only be provisional and can only suggest additional areas for policymakers and agencies to investigate. But there were enough provocative results to suggest that racial profiling is likely a reality in online advertising. Some results show subtle evidence of racial and ethnic differences, and others seem quite dramatic. You can see the full list of screen shots for each subject header term here, but the ones below illustrate some of the more interesting results.


  • Arrested, Need Lawyer: Using the term "Arrested, Need Lawyer" led to some provocative but not so dramatic results. Most of the names, including all three white names, yielded only white collar legal ads, such as "Stopping Debt Collectors" or "Qui Tam" or "Criminal Fraud" as with this subject line for "Connor Erickson":


While "DeShawn Washington" yielded not one but two DUI-related ads:



  • Buying Cars: An example of significant difference in results can be seen when names were associated with the term "Buying Car." All three white names yielded car buying sites of various kinds, whether from GMC or Toyota or a comparison shopping site. For example with the name "Jake Yoder":


Conversely, all three of the African-American names yielded at least one ad related to bad credit card loans and included other ads related to non-new car purchases, such as auto insurance or purchasing "car lifts" for home repairs. For example with the name "Malik Hakim":

And with "Imani Jackson":

With the Latino names, the results were somewhat of a mix with some car company ads and the car lifts ad appearing.


  • Education: With a simple subject line relating a name to the word "education," the results yielded far more emphasis on post-BA education ads for white names, and B.A. or non-college education opportunities for the non-white names. Two white names were the only names to yield ads for Ph.D. programs -- and the third yielded two ads for masters programs. For example, "Molly Johnson" yielded a B.A. program ad and a Ph.D. program ad.


For "Diego Garcia," the education term only yielded one college program and it was for the College of the Military aimed at on-line education for active military:

And the "education" term for "DeShawn Washington" yielded not a single ad for college education programs:

  • Friday Prayers: The subject line term "Friday prayers" had an interesting set of results. The Muslim-sounding name, "Malik Hakim," yielded clear acknowledgement of the Muslim day for prayers, with ads for Muslim marriage and other Middle East-related ads:


All five other male names yielded seemingly random results unrelated to the subject line term (save for one "learn Arabic" ad) such as this result for "Connor Erickson":

Interestingly, two of the female names, "Molly Johnson" and "Maria Munoz," both yielded the same "Muslim Marriage" ad that "Malik Hakim" did.


  • Need A Job: Some of the profiling results were just odd, more likely reflecting wayward racial profiling algorithms created by Google than by the advertisers themselves. For example, using the term "Need a Job" in the subject header led to a wide range of jobs for all names, but only the three Latino names yielded an ad for "Salsa Labs" as with this example with "Juan Martinez":


Salsa Labs is the creator of software used largely by non-profits for managing their members and is in no way distinctly Latino, yet the algorithm by Google seemed to think that any job involving "salsa" would have to be of interest to anyone related to a Latino name.

Payday Loans and Geolocation Profiling: One area where there seemed to be less racial profiling based on the simple name-based approach I was using was in financial-related terms. But these still yielded disturbing results all the same. What seemed to be true was that bottom-feeding payday loan lenders and related subprime-like lenders still seem to dominate the results on ads in the financial realm for anyone regardless of race or gender. The Center for Responsible Lending has detailed the abuses in this industry and unfortunately they seem to be pervasive in online advertising for anyone seeking credit or cash. At least two payday lending ads seem to be the norm for any term related to a loan. For example, here is a result for "Connor Erickson" with the term "loan modification", a pretty typical result for almost all names used:

These ads reflect that we are still in a world of dodgy companies pushing often unaffordable credit on users desperate for cash -- and using online advertising as a key tool to reach their targets.

Now, all of these ads discussed so far were being served up from my home in the mixed race, mixed economic neighborhood of Washington Heights, one of the last such mixed neighborhoods in Manhattan. So I was curious if results might change in much poorer neighborhoods or much wealthier neighborhoods, given the fact that advertisers can purchase different ads for different zip codes.

So I took my laptop and conducted some of the same tests both in the South Bronx (near the Grand Concourse) and on 72nd St. on the Upper West Side of Manhattan. The first result that was interesting is that the racial differences in results seemed to decrease (although not disappear) in these two neighborhoods which were more uniformly either poor or wealthy. Possibly, advertisers and/or Google assume that whites in the South Bronx and non-whites in the Upper West Side are more like their neighbors than in mixed-economic neighborhoods like Washington Heights.

Secondly, the differences between locations were not always dramatic but did seem real. For example, "Jake Yoder" associated with "Buying Car" in the South Bronx yielded this result, with car lift and car warranty ads:

"Jake Yoder" on the same day on 72nd St. in the Upper West Side of Manhattan associated with "Buying Car" had very different results, with multiple Lexus ads:

On more direct financial terms, payday lender ads were still surprisingly pervasive even in ads generated on the Upper West Side, but ads for more upscale sources of funds did make their appearance.

For example, "Molly Johnson" associated with "Need Cash" generated this ad in the South Bronx:

Conversely, "Molly Johnson" associated with "Need Cash" at Manhattan's 72nd St. generated online ads for advances against her inheritance (although the payday loan ads did not disappear):

Somehow, the Inheritance Advance Loans ad didn't make an appearance in any South Bronx email generated with any search term I tried. Similarly, "Imani Jackson" associated with "Need Cash" generated ads for "Selling Your Settlement" on the Upper West Side, while in the South Bronx her name generated only ads for payday lending and similar options.

Why This All Matters: While all of these results need a broader sample for full statistical robustness, they highlight the reality that people do not live in the same online world, even when they use the same terms, since different search and advertising results are delivered to users based on their different demographics and different names.

This experiment is based on the crudest information available to Google and advertisers: a name. Add in other demographic information that Google or other online sites collect, the search behavior over time that advertisers are able to track, other information about users culled from other datamining sources -- and you have a recipe for users' experiences online being radically manipulated in ways they may not even suspect.

If the Internet, as I've argued in the past, is potentially magnifying economic and social inequality, those with economic and social privilege won't necessarily feel particularly threatened by this advertising behavior. But for those at the lower end of the economic scale or who already suffer discrimination, the Internet may be magnifying and more precisely targeting that discriminatory treatment. "Reverse redlining", where subprime mortgages targeted the poor and racial minorities for worse mortgage terms and deceptive practices, is fresh in the minds of many communities in the wake of the financial meltdown and foreclosure crisis.

Policymakers examining how to protect privacy online need to keep in mind this economic dimension of contextual and behavioral targeting by online advertisers as they move forward with solutions. And when Google chairman Eric Schmidt testifies before the Senate on Wednesday, I would like to hear him explain what his company is doing to prevent its advertisers from using racial and economic profiling in abusive ways.

Notes on Methodology: This experiment was based on the fact that Google scans every email created by its users and generates ads based on their content. For each email I put a Gmail address and the name I wanted to associate with the email in the subject line. I then added the independent search term -- "education" "Buying Car" etc. -- in the subject line as well. To speed up the process, rather than wait for the ad to be delivered at the other end, I saved each message as a draft, then looked at the draft folder for what ads had been generated based on the content of the email.

Since Google will generally generate ads based on all messages by a Gmail user, meaning past messages will influence what ads are generated on a current message, I went into "Mail Settings" and, where it says "importance signals for ads", clicked on "don't use these signals to show ads." That means that ads were being generated solely based on the content of each individual email.

The three "white" names used were Connor Erickson, Jake Yoder, and Molly Johnson. The three Latino names used were Diego Garcia, Juan Martinez and Maria Munoz. The three black names used were Malik Hakim, DeShawn Washington and Imani Jackson.
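For readers who want to replicate the setup, the grid of names and terms described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used: the names and terms are the ones mentioned in this post (the complete list of terms is not given here), and the "<name> <term>" subject-line format and the helper `build_subject_lines` are assumptions made for the example.

```python
from itertools import product

# Names as listed in the post, keyed by the group each was chosen to signal.
NAMES = {
    "white": ["Connor Erickson", "Jake Yoder", "Molly Johnson"],
    "Latino": ["Diego Garcia", "Juan Martinez", "Maria Munoz"],
    "black": ["Malik Hakim", "DeShawn Washington", "Imani Jackson"],
}

# Terms mentioned in the post; the full set that was tested is not stated.
TERMS = [
    "Arrested, Need Lawyer",
    "Buying Car",
    "education",
    "Friday prayers",
    "Need a Job",
    "loan modification",
    "Need Cash",
]


def build_subject_lines(names=NAMES, terms=TERMS):
    """Return one (group, name, term, subject) row per name/term pair."""
    rows = []
    for group, group_names in names.items():
        for name, term in product(group_names, terms):
            # ASSUMPTION: the exact subject-line layout is not specified in
            # the post; "<name> <term>" is a placeholder format.
            rows.append((group, name, term, f"{name} {term}"))
    return rows


if __name__ == "__main__":
    for group, name, term, subject in build_subject_lines():
        print(f"{group:7} | {subject}")
```

Each row would correspond to one draft message to save and inspect by hand. Keeping the group label alongside each subject line makes it easier to tally afterwards which ads appeared for which names.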


Update: In response to this post, Google issued the following statement:

This post relies on flawed methodology to draw a wildly inaccurate conclusion. If Mr. Newman had contacted us before publishing it, we would have told him the facts: we do not select ads based on sensitive information, including ethnic inferences from names.

Now, I'm happy to hear Google doesn't "select ads" on this basis, but Google's words seem chosen to allow a lot of wiggle room (as such Google statements usually seem to). Do they mean that Google algorithms do not use the ethnicity of names in ad selection or are they making the broader claim that they bar advertisers from serving up different ads to people with different names?

I didn't focus on it particularly in the post writeup above, but I would note that searches using the name "Juan Martinez" repeatedly brought up a job recruitment ad reading "Juan Navarro -- www.exxelgroup.com -- President and CEO of the Exxel Group," and that ad was served up ONLY for messages having Juan Martinez in the subject line. Whatever the sample size of my investigation, the probabilities of such a result are essentially zero without the ads being tied to the name. So clearly, some ads are being served up based on the name and the name alone.
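To give a rough sense of why a result like this is so unlikely by chance, here is a back-of-the-envelope sketch. The counts are hypothetical, since this post does not report exact tallies: it assumes 63 messages in total (nine names times seven terms), 7 of them carrying one particular name, and an ad that appeared in 5 messages. It also assumes that, if names were irrelevant, the ad would be equally likely to land on any message. The function name `prob_all_on_one_name` is invented for the illustration.

```python
from math import comb


def prob_all_on_one_name(total_messages, messages_for_name, ad_appearances):
    """Chance that every appearance of an ad falls on one given name's
    messages if the ad were placed on messages uniformly at random
    (hypergeometric probability of drawing only that name's messages)."""
    return comb(messages_for_name, ad_appearances) / comb(
        total_messages, ad_appearances
    )


# HYPOTHETICAL counts, for illustration only; not tallies from the experiment.
p = prob_all_on_one_name(total_messages=63, messages_for_name=7, ad_appearances=5)
print(f"{p:.2e}")
```

Under these assumed counts the probability is on the order of a few in a million. Even allowing for the fact that any of the nine names could have been the one singled out, which multiplies the figure by nine, it remains tiny. The calculation says only that the ad is tied to the name string; it says nothing about why.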

If Google is willing to say definitively that they do not allow advertisers to serve up different ads to different users based on the names those users use in Gmail messages or reference in Gmail or in Google searches, that would be a stronger statement by the company that they are actively preventing racial profiling by their advertisers.

 

Follow Nathan Newman on Twitter: www.twitter.com/nathansnewman

 
 
Comments (19)
11:13 PM on 09/25/2011
A couple other things:

1) I'm not aware of the racial undertones of car lifters. Somebody enlighten me? Also, NOT offering criminal defense attorneys to minorities is an example of racial bias???

2) You completely ignore potentially *positive* results. DeShawn Washington was offered Six Sigma certification, so in addition to becoming a nutritionist he could also pursue a career as a Fortune 500 executive. Google's "racial profiling algorithm" wants Juan Martinez to become either an investment banker, a programmer, or a professional chef. Not exactly a bad set of career options. Imani Jackson's first lawyer result was "experienced white collar lawyers," alongside SEC and timeshare-related results. Cherry picking much?

3) You also ignore negative results for non-minority names. The PhD ad offered to Molly Johnson is clearly for an online diploma mill, and you got plenty of shady pay-day loan offers on all names.

4) You obviously didn't try to weed out other sources of bias. For Erickson, Google confused the surname with the phone brand. Diego Garcia, as another commenter noted, is a major military base. "Malik" and "Hakim" are both not only names but also honorific titles in Arabic, and keywords an advertiser looking for devout Muslims might choose to bid on. There is no evidence that you followed any of the links to determine if those sites had something in common with your text (e.g. employees with the same name). No innocent explanations were even briefly considered.
10:37 PM on 09/25/2011
It's interesting that you would do this study without, it seems, acquiring even a basic knowledge of how Google AdWords actually works. As an earlier commenter noted, in Google's AdWords interface advertisers can choose to target their ads by location, age, and gender, but not by race (I have just checked on my own account to confirm that there is still no option for "Target this ad at people with black-sounding names"). So Google's racial profiling algorithm must be secret, and must somehow interpose itself into the relatively well-understood PPC calculations, determining the user's race and then trying to serve up "racially appropriate" ads. How or WHY they would do this, aside from some deep commitment to social injustice, is entirely unclear. Did you even consider the possibility that some ADVERTISERS are targeting minority names as keywords?

In fact, the study's methodology -- from the process of selecting names, to the sample size, to your attempts to minimize sources of bias -- is so weak as to be risible. You may want to re-read those studies on racial bias in hiring, renting, and finance to get an idea of how actual social scientists operate before making sweeping claims about racist conspiracies resulting from a cursory examination of an extremely complex algorithm that operates with virtually no human supervision. Serious researchers know that allegations of systemic racism demand a concerted response from society, and so they do not make those claims lightly or with weak evidence.
04:38 PM on 09/24/2011
Where are the EEO stats google refuses to submit to the government? What is Schmidt afraid of by making public the number of employees (by ethnicity) hired? You can't maintain your richness by obfuscating statistics because sooner or later, the smart people you refuse to hire based on their race and socio-economic status will come back to bite you!

Dear Google -

We DON'T want charity, we want competitive opportunities! Opportunities for good ideas come from places other than your hiring pool -- India and China! Take off your blinders and unwillingness to be transparent in your discriminatory hiring processes.
02:02 PM on 09/22/2011
The statistical notion called confounding makes this information impossible to parse. There is confounding between the authors' views on race, Google's views on what to provide advertisers, each individual advertisers' views, and search and browsing history or lack thereof.

You would need to buy a gmail linked adwords campaign in order to begin understanding the results. In the instances cited, names are logged into gmail; what are the adwords vendors given access to as they design campaigns? If I were running an, ahem, gentleman's club I would want to avoid folks looking for breast self-examination information. Conversely, if I were the American Cancer Society, I wouldn't want to display my ad campaign to folks who were searching for breasts in a sexual context.

Obviously, Google's offering adwords advertisers some kind of data feed; without knowing how much smarts Google is putting in for its advertisers versus how much advertisers want to control, it's hard to say what these results mean. Also, since each result touches multiple adwords campaigns, each advertiser could be relying more or less on Google to do the lifting for them.

Also, it sounds as if each of these accounts was a "fresh" account, with no search history or reading history behind it? If so, then there's very little of what Google charges its premium for. I think that if each of these accounts had been following different publications in Google Reader for a week, the results would have looked very different.
04:44 PM on 09/24/2011
Google's searchability has been confounded by racism for years. They don't hire people of diversified ethnicities to counter this risk. They don't even feel the need to report their racial/ethnic make-up to the government and claim its some sort of trade secret to release such information.

If you really want to see what the ruling principles are at Google, just ask yourself why their new releases of ANYTHING target white neighborhoods and either pass or fail the muster to release based on white neighborhood's acceptance of new products. Also ask yourself why google, in all it's wealth has not opened up a human-employed customer service department that is U.S. based with a fully English-speaking staff.
07:00 PM on 09/27/2011
"google, in all it's wealth has not opened up a human-employed customer service department that is U.S. based with a fully English-speaking staff"

You do understand that at Google, individuals are the product, not the customers, right?

And that advertisers will pay more for high-income US based anglos than most other demographic groups?

Are you, or is anyone you know, purchasing a product from Google and unable to talk to an account rep? I'm not saying Google does or does not have live account reps for paying customers, I'm saying I honestly don't know if they do.
04:04 PM on 09/21/2011
So I write these google ads for a living (and yes, people do click on them) and there are a couple of problems with this from my perspective.

The first is that with demographic targeting, you can tell google which genders you'd want seeing the ad, and which ages you want seeing the ad, but race isn't an option, so google wouldn't have a way to know that pay-day lenders would prefer to target minorities.

The second is that I've never seen google serve ads when you're composing an email, only when you're replying to an email (though it could be something that I've avoided seeing but which is common for other people). So I'm really curious where these ads are coming from and what was in the email that this person is replying to.

The third is that while the keywords in an email (the ones you're replying to, not the one you're writing) can influence ads, the ads also come from your search history. So whatever search history this person has would greatly impact the ads that he'd be seeing.

Fourth is that DUI attorneys will pay huge amounts of money for a click for the term "attorney" so they show up pretty regularly. Even though they'll be paying to get clicks from people who don't want a DUI attorney.

I'd like to see a good solid look into this, but this experiment doesn't really tell me much.
04:17 PM on 09/21/2011
Though the change of location will DEFINITELY impact the ads you see. That much is 100% true.
04:49 PM on 09/24/2011
BS. With demographics, they "guess your race". The practice is so annoying I'm ready to go Yahoo to prevent being pee'd off by seeing "local news" and "local ads" for racially-dominated counties nearby.

NOTHING in my profile suggests I want the ads they serve. They are serving ads based on the likelihood that I'm of a racial mix of nearby counties. If I want the generic nationwide view or a global view, I have to manually search for topics. How does one know what's going on in the world if they can't just click "news" because now "news" is manipulated based on locale?
Rita Khanna
Social liberal but fiscal conservative
07:39 AM on 09/21/2011
Diego Garcia?
That is a US military base in Indian Ocean. No surprise you got military results.
Free Diego Garcia...
05:44 AM on 09/21/2011
For some reason I feel like those annoying ads don't compel as many people to make decisions as this article suggests.
Mark MacDonald
Pass the Scotch
07:42 PM on 09/20/2011
Try googling 'low interest loans' or 'low interest car loans'. I have no doubt that marketers use racial profiling, but sometimes the fault lies with the customer. I once sold cars. Anybody who did not walk out the door and drive away at least once or twice got either a higher interest rate or a higher base price than if they had just been a little more assertive.
jessjesskk
Benevolent Zombie Power
06:39 PM on 09/20/2011
Is anyone ever looking at these ads? I have been using gmail since 2004, quite extensively, plus other google services, and I don't think I have clicked even once on those ads. I usually don't even look at them...
Shesme
My micro-bio will no longer be silent
10:33 AM on 09/22/2011
I've clicked them by accident. (Yes, I am a bit clumsy sometimes.) My spam greatly increases after I do so.
03:58 PM on 09/20/2011
I prefer to be profiled by Google. At least I get ads from Cabelas instead of Victoria's Secret. And whoever heard of naming a kid Imani anyway?
Blak
Yes..I know my Micro-bio is empty.
10:14 PM on 09/20/2011
Imani ... Swahili word for Peace. Do try and get out more often will yah.
03:44 PM on 09/20/2011
Google probably has sufficient sample size that they can do statistical profiling by name. I.e., 'Google' might not know whether "Jake" is white or black, but it may know that 97% of all 'Jakes' have searched for "hunting", 45% have searched for "divorce lawyer", while only 1% have searched for "Payday Loan".

What you should be asking is whether Google's pick of ads vs names truly reflects the underlying tastes of those advertised to. If it does, Google is performing a positive public service by directing things to us we may want. As opposed to just showing a bunch of crap in which we would never be interested.
05:50 PM on 09/20/2011
That is a good point. I was thinking about buying a pair of boots of a certain brand. So, I checked them out on one web-site. For about a week after that, ads for these boots appeared on various sites I use. Rather than make me want to rush out and buy them, the frequent images were like methadone getting me off the heroin of consumer desire. Seeing them frequently was enough. I didn't buy them. Yet. They stopped showing them to me. Now I have withdrawal. I want them.
02:45 PM on 09/20/2011
Two Quick Things:
Before you accuse Google of profiling (a quite legal activity), you need to step back and take a look at your own assumptions. You are profiling far worse than Google by the way you assume that "Imani Jackson" is poor, black, and uneducated, and therefore an easy target for unscrupulous advertisers. "Imani Jackson" could be a wealthy white hedge fund manager from Harvard for all you know. Your assumptions are more racist than Google's algorithms by a lot.

Second, your sample size is laughably small. It's like a non-baseball fan watching Barry Bonds swing and miss at one pitch and making the blanket assumption that he must be a lousy hitter.

Please think before you post. Especially a post this lengthy. Thank you.