We Chat (But Not about Everything)

Imagine if your favourite social media application silently censored your posts, but gave you no information about what topics are censored.

Imagine if everything seemed fine as you posted message after message and image after image, for days on end with no issues, but then occasionally one of your posts would simply not appear without explanation.

And what if the messages or images you were prevented from posting sometimes seemed connected to a controversial political issue, but other times not? Perhaps it’s deliberate, you might guess. Or perhaps it’s just you and your bad Internet connection? Who can say for sure?

Unfortunately, this Kafkaesque situation is the reality for well over a billion users of WeChat and Sina Weibo, two of China’s largest social media applications and among the largest in the world.

Our new report provides detailed evidence from systematic experiments we have been performing on WeChat and Sina Weibo to uncover censorship on each application. As with our prior reports, we are interested in enumerating censored topics, a difficult task since neither company is transparent about what it blocks.

For our latest research, we focused on censorship of discussions about the so-called “709 Crackdown.” The name refers to the nationwide targeting by China’s police of nearly 250 human rights lawyers and activists, as well as some of their staff and family members, since July 9, 2015, when lawyers Wang Yu (王宇) and her husband Bao Longjun (包龙军) were forcibly “disappeared.” The 709 Crackdown is considered one of the harshest systematic measures of repression against civil society undertaken by China since 1989, and is the subject of much ongoing international media and human rights discussion.

Unfortunately, as our experiments show, a good portion of that discussion fails to reach Chinese users of WeChat and Weibo. Our research shows that certain combinations of keywords are censored when sent together in a text message, but not when sent alone. So, for example, if one were to text 中国大陆 (Mainland China) or 王全璋的妻子 (Wang Quanzhang’s Wife) or 家属的打压 (Harassment of Relatives) individually, each message would get through. Sent together in a single message, however, they would be censored. The Citizen Lab’s Andrew Hilts has created a visualization showing these keyword combinations here: https://citizenlab.org/709crackdownviz
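
To make the behaviour concrete, the sketch below (in Python, and purely illustrative, since the real server-side rules are not public) shows the kind of rule our observations are consistent with: a message is filtered only when it contains every keyword in some blacklisted combination, never when one of those keywords appears on its own.

```python
# Illustrative sketch of combination-based filtering; not code from either platform.

BLOCKED_COMBINATIONS = [
    # Hypothetical rule built from the example keywords discussed above.
    {"中国大陆", "王全璋的妻子", "家属的打压"},
]


def is_censored(message: str) -> bool:
    """Return True only if the message contains every keyword of some blocked combination."""
    return any(
        all(keyword in message for keyword in combination)
        for combination in BLOCKED_COMBINATIONS
    )


# Each keyword alone passes; all three together in one message are filtered.
assert not is_censored("中国大陆")
assert not is_censored("王全璋的妻子")
assert is_censored("中国大陆 王全璋的妻子 家属的打压")
```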

In addition to the large number of censored keyword combinations our tests unearthed, we also discovered 58 images related to the 709 Crackdown that were censored on WeChat Moments for accounts registered with a mainland China phone number. (For accounts registered with a non-mainland China phone number, on the other hand, the same images and keyword combinations went through fine.) This is the first time we have documented censorship of images on a social media platform, and we are continuing to investigate the exact mechanism by which it takes place.

The purpose of Citizen Lab’s research on applications like WeChat and Weibo is to better understand and bring transparency to restrictions such as these. We live in a world in which our choices and decisions are increasingly determined by algorithms buried in the applications we use.  What websites we visit, with whom we communicate, and what we say and do online are all increasingly determined by these code-based rules.  Whether those algorithms are fair or not, whether they respect human rights, whether they make mistakes or not, are all questions that can only be answered if the algorithms can be properly examined.

Unfortunately, many social media companies hide their algorithms, either for proprietary and financial reasons (they want to protect the “secret sauce” that earns them money) or for political reasons (their algorithms are used to enforce restrictions on speech, and they don’t want their customers to know about it). Our research aims to break through that obfuscation and hold such algorithms to account.

Generally speaking, the algorithms that drive social media censorship or surveillance can operate in one of two ways: either on the client side — meaning, inside the application on your device; or on the server side — meaning, inside one of the company’s computers that runs the service.  Typically, to investigate the former, we rip the application apart — “reverse engineer” it — and subject it to various tests to determine what the algorithm does beneath the surface.
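
As a rough illustration of that difference, here is a minimal Python sketch of where the filtering decision sits in each model. It is not code from either application, and the names (LOCAL_BLOCKLIST, upload_to_server) are placeholders invented for the example.

```python
# Illustrative sketch only: two places a censorship check can live.

LOCAL_BLOCKLIST = {"example_keyword"}  # hypothetical list shipped inside the app


def upload_to_server(message: str) -> None:
    """Placeholder for the network call; its details do not matter for the contrast."""


def client_side_send(message: str) -> None:
    """Client-side model: the check runs on the device before anything is sent,
    so the rules can be recovered by reverse engineering the app itself."""
    if any(word in message for word in LOCAL_BLOCKLIST):
        return  # silently dropped on the device
    upload_to_server(message)


def server_side_send(message: str) -> None:
    """Server-side model: the message always leaves the device, and the decision
    to deliver or drop it is made on infrastructure we cannot inspect directly."""
    upload_to_server(message)  # any filtering happens remotely and invisibly
```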

For server-side rules, on the other hand, whatever censorship or surveillance is going on happens inside the company’s infrastructure, making the rules much harder to interrogate. Both WeChat and Weibo perform censorship and surveillance on the server side, so we had to undertake detailed experiments, systematically feeding combinations of keywords and images drawn from news stories into the applications to zero in on what is filtered. You can read about these experiments in the full report here: https://citizenlab.org/2017/04/we-cant-chat-709-crackdown-discussions-blocked-on-weibo-and-wechat/
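
The loop below is a simplified sketch of what that kind of black-box experiment looks like, not our actual test harness. The is_delivered callable stands in for the real procedure of posting a message from one test account and checking whether a second account receives it.

```python
# Simplified sketch of a black-box keyword-combination experiment.
from itertools import combinations
from typing import Callable, Iterable


def find_censored_combinations(
    keywords: Iterable[str],
    is_delivered: Callable[[str], bool],
    max_size: int = 3,
) -> list[tuple[str, ...]]:
    """Send keyword combinations of increasing size through the platform
    (via the caller-supplied harness) and record which ones never arrive."""
    keywords = list(keywords)
    censored = []
    for size in range(1, max_size + 1):
        for combination in combinations(keywords, size):
            message = " ".join(combination)
            if not is_delivered(message):  # harness posts the message and checks receipt
                censored.append(combination)
    return censored
```

A single failed delivery could just as easily be a dropped connection, so in practice each result has to be repeated before it is counted as censorship.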

Our report serves as a reminder that, for a large portion of the world, social media act as gatekeepers of what people can read, say, and see. When they operate in a repressive environment like China, social media can end up surreptitiously preventing important political topics from being discussed. Our finding that WeChat is now systematically censoring images as well as text opens up the daunting prospect of multimedia censorship and surveillance on social media.