What Lies Beneath China’s Live-Streaming Apps?

Today, the Citizen Lab is releasing a new report entitled “Harmonized Histories? A year of fragmented censorship across Chinese live streaming platforms.”  The report is part of our NetAlert series and can be found here.

Live-streaming media apps are extraordinarily popular in mainland China, used by millions.  Similar in functionality to Periscope, the US-based, Twitter-owned streaming app (which is banned in China), China-based apps like YY, 9158, and Sina Show have become a major Internet craze.  Users of these apps share everything from karaoke and live poker matches to pop culture commentary and voyeuristic peeks into their private lives.  For example, Zhou Xiaohu, a 30-year-old construction worker from Inner Mongolia, films himself eating dinner and watching TV, while another live-streamer earns thousands of yuan taking viewers on tours of Japan’s red-light districts.

The apps also present big business opportunities, for both users and the companies that operate them.  Popular streamers receive virtual gifts from their fans, who can number in the hundreds of thousands for the most widely viewed.  The streamers can exchange these virtual gifts for cash, and some have become millionaires as a result.  The platforms themselves are highly lucrative, attracting venture capital and advertising revenues.

Chinese authorities have taken notice of the exploding live-streaming universe, which is not surprising considering their strict controls on free expression.  Occasionally streams veer into taboo topics, such as politics or pornography, which has brought greater scrutiny, fines, takedowns, and increased censorship.

To better understand how censorship on the platforms takes place, our researchers downloaded three of the most popular applications (YY, 9158, and Sina Show) and systematically reverse engineered them.  Doing so allowed us to extract the banned keyword lists hidden in the clients, which are regularly updated.  Between February 2015 and October 2016, we collected 19,464 unique keywords that triggered censorship on the chats associated with each application, which we then translated, analyzed, and categorized.
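The client-side filtering described above boils down to checking each chat message against a locally stored list of banned keywords.  As a minimal sketch (the keyword entries and function names here are hypothetical illustrations, not the apps’ actual code or wire format):

```python
# Hypothetical sketch of client-side keyword filtering: each outgoing
# chat message is checked against a banned-keyword list bundled with
# the client. Real apps update these lists regularly.
BANNED_KEYWORDS = {"example_banned_term", "another_term"}  # placeholder entries

def is_censored(message: str) -> bool:
    """Return True if the message contains any banned keyword."""
    return any(keyword in message for keyword in BANNED_KEYWORDS)
```

Because the matching happens on the device, the full keyword list has to ship inside the client, which is precisely what makes it recoverable through reverse engineering.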

What we found is interesting for several reasons, and runs counter to claims put forth in a widely-read study on China’s Internet censorship system authored by Gary King et al and published in the American Political Science Review.  In that study, King and his colleagues conclude that China’s censors are not concerned with “posts with negative, even vitriolic, criticism of the state, its leaders, and its policies” and instead focus predominantly on “curtailing collective action by silencing comments that represent, reinforce, or spur social mobilization, regardless of content.”  Their analysis gives the impression of a centralized and monolithic censorship system to which all Internet providers and companies strictly conform.

We found, on the other hand, that there is significant variation in blocking across the platforms.  This variation means that while the Chinese authorities may set general expectations of taboo or controversial topics to be avoided, what, exactly, to filter is left to the discretion of the companies themselves to implement.

We also found, contrary to King et al., that content they suggested was tolerated was in fact routinely censored by the live-streaming companies, albeit inconsistently across platforms.  We also found many keywords targeted for filtering that had nothing to do with political directives, including posts related to the companies’ own business competitors.

In other words, our research shows that the social media ecosystem in China — though definitely restricted for users — is more decentralized, variable, and chaotic than what King and his colleagues claim. It confirms the role of intermediary liability in China that Rebecca MacKinnon has put forward, known as “self discipline,” whereby companies are expected to police themselves and their users to ensure a “harmonious and healthy Internet.”  Ironically, that self-discipline often results in entirely different implementations of censorship on individual platforms, and a less than “harmonious” Internet experience as a result.

Our reverse engineering also discovered that YY — the most popular of the live-streaming apps, with over 844 million registered users — undertakes surveillance of users’ chats. When a censored keyword is entered by a user, a message is sent back to YY’s servers that includes the username of who sent the message, the username of who received the message, the keyword that triggered censorship, and the entire triggering message. Nearly a billion unwitting users’ chats are subject to hidden keyword surveillance!  Recall that in China companies are required to share user information with security agencies upon request, and Chinese citizens have been arrested based entirely on their online actions.  Recently, for example, one user posted an image of a police report of a person under investigation for downloading a VPN on his or her mobile phone.
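The surveillance report we observed carries exactly the four fields listed above.  A minimal sketch of such a payload follows; the field names and function are hypothetical illustrations of the structure, not YY’s actual wire format:

```python
# Hypothetical sketch of the report a client could send home when a
# banned keyword is typed, mirroring the four fields observed in YY's
# traffic: sender, receiver, triggering keyword, and the full message.
def build_censorship_report(sender: str, receiver: str,
                            keyword: str, message: str) -> dict:
    """Assemble a keyword-trigger report with the observed fields."""
    return {
        "sender": sender,            # username of who sent the message
        "receiver": receiver,        # username of who received it
        "trigger_keyword": keyword,  # the keyword that triggered censorship
        "full_message": message,     # the entire triggering message
    }
```

The point is that the client reports not just the trigger but the whole message and both parties to it, which turns a filtering mechanism into a surveillance one.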

On a more technical level, our research shows the value of careful reverse engineering for revealing information controls hidden from the view of the typical user.  The keyword lists we extracted and are publishing reveal exactly what content triggers censorship and surveillance, something that is known only to the decision makers within the companies themselves.  We see this type of research as critical to informing users of the ecosystem within which they communicate.

Sometimes what we find also runs counter to conventional wisdom.  You don’t know what’s being censored if you can’t see the list of banned keywords. Opening these applications up allows us to see them from the inside out in a truly unbiased way that other, more impressionistic scans can only infer.