This past summer it was widely reported that hackers from China stole information on more than 21 million people from databases at the Office of Personnel Management (OPM). Included in this theft was detailed personal information on nearly every person who had completed a U.S. government background check in the last 15 years. The scope of information stolen was sweeping, including not only Social Security numbers and addresses, but also health and financial data and information about spouses and friends.
For millions of federal employees and contractors, the question is: What could the Chinese do with the data, and how will the theft affect their lives? Of course the most obvious use is espionage, enabling the PRC intelligence apparatus to target those with access to classified information. But beyond this cyber spying coup, the unprecedented OPM breach also calls us to examine more closely the nature of identity in the 21st century and how the Chinese might use big-data skills to exploit the information for political purposes beyond spying.
Identity in the digital age transcends analog ID cards and protected account numbers. Our identity now encompasses not only documentary proof of our names and addresses, but also a composite profile of our lives over time, pinpointing our behavior: where we live, when we move, what we buy, what we look like, who we love, who we vote for and our most personal preferences and proclivities. This data-based composite picture of identity largely resides in government and corporate databases, with additional context in social media.
Circling back to OPM, most foreign hackers steal American identities for fraud, combining multiple hacked and open-source datasets to mimic their victims. After probing a few online accounts, they strike, withdrawing funds from their target's bank account or, more often, simply selling bulk identities at commoditized prices. For China, however, emptying our checking accounts has been secondary to espionage. But what if China were to move beyond cyber spying and begin to exploit its troves of identity big data to influence our opinions in the way that political campaigns do here in the U.S.?
Data scientists working on campaigns have developed the tools and computing power needed to build a nuanced view of each citizen in a large population. Modern political campaigns now target you individually, tracking how likely you are to vote and to convince your friends to vote. Beijing has already announced that it is collaborating with Moscow to shape international dialogue through new media propaganda. If the two governments want to shape American views through new media campaigns, the first step is developing a rich citizen-by-citizen database that combines personal, professional and community personas. We are seeing ISIL exploit social media to target individuals in its campaigns, so why would it not be logical for China to adopt a similar approach on a much larger scale with a broader political agenda?
Two decades ago, analyzing a whole population meant extracting statistical trends from the top down. It was rare, even in the 1990s, to look at anyone individually unless the effort would yield a very high value (e.g., large political donors, suspected terrorists, or ultra-frequent fliers). Big Data analytics and political campaigns have evolved in parallel as technology costs have come down, and microtargeting has become feasible in many more business and political scenarios.
FICO scores were one of the first attempts at microtargeting each person within the U.S. population. Introduced in 1989, FICO took a "person-centric" view, cutting across multiple datasets to convert each consumer's creditworthiness into a single number. These scores could take months to update, drawing from a massive web of data shared across retailers, credit providers and credit bureaus. Consumers soon saw the day-to-day impact of a crude, numeric judgement of how responsible they were.
The 2008 Obama presidential campaign demonstrated, nearly two decades later, how powerful microtargeting had become. Rather than look at demographics and population trends, the Obama campaign's data scientists built an electronic dossier on every single Democratic voter. The campaign didn't just predict how many voters lived in a district or how liberal they were. It collected names, voting patterns, and influential friends. It calculated multiple dynamic scores for every voter and updated that information in real time. This was the first public glimpse of a set of powerful tools changing the relationship between ordinary citizens and big data.
Two election cycles later, our citizens create eight times more personal data every day. During this period, Chinese computer science has become world class, and the resources available to the Chinese government dwarf those of even an American political campaign. But unlike American corporations and federal agencies, the Chinese government is not constrained by legislation, FOIA, or public relations. It can augment its electronic dossiers with every tweet, hack, leak, and data breach. It is reported that both the Russians and the Chinese operate large numbers of "sock puppet" accounts to amplify and suppress key messages on social media platforms.
As citizens in an open democracy, Americans take to social media to debate and decide how the U.S. should approach foreign policy. On any major controversial topic, Beijing and Moscow broadcast their positions domestically through the CCTV and RT networks, but what could they do with your composite identity details? As we debate with our friends, foreign governments are investing in new media that can influence the discussion using "sock puppets," paid content and micro-targeted ads. They can agitate certain views and dampen others, identifying key speakers through machine learning.
During the 2016 election cycle, dozens of political organizations will profile you, your family, and your friends. Each of them will seek to simulate you hundreds of times per day, projecting how likely you are to donate and to vote. Those Big Data insights will help them microtarget you for ads, phone calls, and conversations. We have come to accept (and to a certain degree regulate) the season of political ad campaigns that is ramping up now.
As we consider the implications of the OPM breach, we should ask: Could the PRC be among the data-driven political groups operating this season? The Chinese government has a bigger war chest than Hillary Clinton, a richer database than Jeb Bush, and more wealth at stake than Donald Trump. The OPM hack put it even further ahead by identifying 21 million American adults who have applied to work for the federal government. No doubt espionage will remain China's primary use for this data, but just as we update our view of identity in the 21st century, so too might the PRC update its plans for the use of such data.