The myth of privacy and End-to-End Encryption in Zoom

Zoom is a service to host meetings without the need to install and configure complicated applications: it is one of the many solutions for those who, during quarantine, want to keep in touch with friends and relatives.

In this last period of lockdown, Zoom is becoming popular because of how simple it is to host meetings, without a complicated setup. You share a link, and all the participants are connected. This has allowed many professors to implement distance learning using the Zoom.us platform.

However, a debate is emerging: while it is true that this platform allows millions of people to stay in touch with friends, family and colleagues, there are many doubts about how effectively it protects users’ personal data, and therefore their privacy.

My intention is to investigate in a technical way what data is actually sent to the Zoom servers. The first part of the analysis will take into consideration the privacy policy applied before March 29th and then move on to a technical assessment of whether or not this policy has been respected.

This article won’t be the only one in which I deal with this topic: it will be followed by a second part in which I will explain how the client works and what happens when creating a meeting.

Zoom.us Privacy Policy

Why should we start this analysis with the privacy policy? Essentially because it is an example of brand protection policy that, through a series of legal terms, aims to armor its reputation in the face of the doubts that arose after the case of the vulnerability found on the Zoom client for the iOS platform. It is also interesting to see how this policy has changed in a time window of about 10 days.

In fact, comparing the version archived on web.archive.org, dated March 18, with the version currently in force, the differences are clear. Through an overt play on words and the replacement of some key terms such as “third parties”, the current version appears very “privacy compliant”:

We do not use data we obtain from your use of our services, including your meetings, for any advertising.

This paragraph, however, is somewhat contradicted by this:

To personalize marketing communications and website content based on your preferences, such as in response to your request for specific information on products and services that may be of interest.

In summary, Zoom states that it guarantees privacy by not exploiting any user data for business purposes. Unfortunately, this statement does not reassure and does not actually clarify whether and how user data is handled. So the question is, can my privacy be considered guaranteed if Zoom does not sell my data?

In order to resolve any doubts, all that remains is to technically analyze what happens during a meeting with Zoom and confirm or refute “their version of the facts”.

Analysis: inspect requests

For the technical part I used ZAP, a web application scanner along the lines of the much better-known Burp, which allowed me to capture individual HTTPS requests and WebSocket sessions.

Once you have joined a meeting on Zoom, you will notice that the first request sent by the client is addressed to zoom.us/j/config. This request lets the client obtain all the details of the meeting, including the list of endpoints and the parameters needed to send and receive data through the Zoom infrastructure. Like any HTTP request, the data sent is divided into two parts: the header and the body.

In the header, we can find:

  • a cookie containing a unique fingerprint ID named _ZMMTG_TRACK_ID;
  • user-agent: Zoom client version, operating system and architecture (x86 or x64);
  • ZM-CAP: number
  • ZM-PROP: hard-coded string per client, for Windows it’s Win.Zoom, for MacOS it’s Mac.Zoom
  • ZM-RF: 1
  • ZM-ORIGIN: US (it could be US, CN, JP, DE, FR, PT, ES, KR based on the selected language)
  • ZM-LOCALE: “DEF” for all the countries except for China (the value will be “CN”)
  • ZM-CID: a base64-encoded string that contains the device_id, salted and hashed with the MD5 algorithm
  • ZM-NSGN: concatenation of the salted device_id (probably used to match the User Agent to the device type) and the timestamp of the request

In the body, we can find:

  • username with which you participate in the meeting
  • _ZMMTG_TRACK_ID which must be the same as the one sent in header
  • client version used
  • “client” if you join the meeting, “server” if you are the host
  • the meeting ID
  • meeting password (if the meeting is not available via public URL)
  • source (usually the referral source, it could be via “link” or via “dashboard”)
  • timestamp
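
To make the two lists above more concrete, here is a minimal Python sketch that assembles a request of the same shape. It is an illustration only: the body field names (userName, trackId, role and so on), the salt, the User-Agent string, the ZM-CAP value and the exact derivation of ZM-CID and ZM-NSGN (Base64 of MD5(salt + device_id), followed by the timestamp) are assumptions of mine, not values recovered from the client.

```python
import base64
import hashlib
import time


def salted_md5_b64(device_id: str, salt: str) -> str:
    """Base64 of MD5(salt + device_id). The real scheme is an assumption."""
    digest = hashlib.md5((salt + device_id).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_config_request(username, meeting_id, track_id, device_id, salt,
                         password=None, is_host=False, now=None):
    """Return (headers, body) shaped like the request described above."""
    ts = int(time.time() if now is None else now)
    cid = salted_md5_b64(device_id, salt)

    headers = {
        # Fingerprint cookie; the same value is repeated in the body.
        "Cookie": f"_ZMMTG_TRACK_ID={track_id}",
        # Placeholder string: client version, OS and architecture.
        "User-Agent": "ExampleClient/0.0 (Windows; x64)",
        "ZM-CAP": "0",          # placeholder number
        "ZM-PROP": "Win.Zoom",  # Mac.Zoom on macOS
        "ZM-RF": "1",
        "ZM-ORIGIN": "US",
        "ZM-LOCALE": "DEF",
        "ZM-CID": cid,
        # Salted device_id concatenated with the request timestamp.
        "ZM-NSGN": f"{cid}{ts}",
    }

    # Hypothetical field names; only their meaning comes from the capture.
    body = {
        "userName": username,
        "trackId": track_id,
        "clientVersion": "0.0",
        "role": "server" if is_host else "client",
        "meetingId": meeting_id,
        "source": "link",
        "timestamp": ts,
    }
    if password is not None:
        body["password"] = password
    return headers, body


if __name__ == "__main__":
    h, b = build_config_request("alice", "123456789", "track-abc",
                                "device-1", "example-salt")
    print(h["ZM-CID"], b["role"])
```

The point of the sketch is that, before any audio or video is exchanged, a single request already ties together a persistent tracking ID, a device identifier, the user name and the meeting ID.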

Once the list of endpoints is obtained, the connection is switched to end-to-end (or peer-to-peer) mode through the WebSocket exposed by the Zoom infrastructure. The connection is established in the same way on Microsoft, Apple and Linux platforms.

Android app: a particular case

On Android the behaviour is different. In fact, three pieces of information are sent first:

  • hardwareInfo
  • meeting-id
  • personal-zoom-id

What is surprising, and also intriguing, is that the getHardwareInfo() function differs from the implementation we find in the SDK provided by the company. As shown in the screenshot below, this function builds a kind of diagnostic report on the device in use, including the brand, the model, the operating system, the country and whether or not the device has been rooted.

The code snippet of Zoom Android Application

Going further into the analysis of the APK, there is another section that checks for the presence of the “su” binary, which is used to acquire the privileges of another user, in particular root.
This check is unusual, especially for an application used exclusively for online meetings.

Screenshot of the root section of Zoom App
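
A check of this kind usually boils down to looking for a file named su in a handful of well-known directories. The Python sketch below shows the idea only: the directory list and the function name find_su_binary are my assumptions, and this is not the code found in the APK.

```python
import os

# Directories where an "su" binary is commonly placed on rooted devices.
# This list is illustrative, not taken from the Zoom application.
COMMON_SU_DIRS = (
    "/system/bin",
    "/system/xbin",
    "/sbin",
    "/su/bin",
)


def find_su_binary(search_dirs=COMMON_SU_DIRS):
    """Return the path of the first "su" file found, or None if there is none."""
    for directory in search_dirs:
        candidate = os.path.join(directory, "su")
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    path = find_su_binary()
    print("rooted (su found at %s)" % path if path else "no su binary found")
```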

The end-to-end encryption myth

Returning to the topic of privacy, it is very interesting to look at how the much-acclaimed “end-to-end” communication really works. According to the screenshot below, Zoom claims that every session between two or more hosts would be encrypted. I use the conditional because, as a Zoom spokesperson also stated:

“Currently, it is not possible to enable E2E encryption for Zoom video meetings. Zoom video meetings use a combination of TCP and UDP. TCP connections are made using TLS and UDP connections are encrypted with AES using a key negotiated over a TLS connection.”

Zoom statement about end-to-end encryption

Although end-to-end encryption is advertised as supported, sessions are not encrypted as the user expects. End-to-end encryption means that any information exchanged between two or more hosts is encrypted and can be decrypted only by the sender and the recipient, making it impossible for a third party to step in and retrieve the information.

In the case of Zoom, the third party is precisely the server infrastructure, which in an end-to-end scheme should only relay the message to the recipient without reading or manipulating it. The recipient, in this model, simply connects to the Zoom infrastructure, retrieves the encrypted message sent by the sender and decrypts it with the key in their possession.

However, this does not seem to be the case with Zoom, since both the Desktop client and the Web client use TLS technology to communicate with the server.

TLS is a communication protocol that encrypts traffic with symmetric keys. These keys are generated uniquely for each connection and are derived from a “secret” negotiated at the beginning of the session (the handshake). However, the server communicating with the client must also know the “secret”.
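
The difference between the two models can be illustrated with a small Python sketch. It does not use TLS or AES: the cipher is a toy SHA-256 keystream XOR chosen only so that the example runs with the standard library, and the Relay class, the key names and the two demo functions are hypothetical. What matters is who holds the keys, and therefore what the relay is able to read.

```python
import hashlib
import os


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256 counter keystream. NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    # zip() stops at len(data), so extra keystream bytes are ignored.
    return bytes(a ^ b for a, b in zip(data, stream))


class Relay:
    """Stands in for the server that forwards traffic between participants."""

    def __init__(self):
        self.session_keys = {}  # one key per client connection
        self.seen = []          # everything the relay was able to observe

    def forward_transport(self, sender, recipient, ciphertext):
        # Transport model: the relay holds each connection's key, so it
        # decrypts the sender's traffic and re-encrypts it for the recipient.
        plaintext = keystream_xor(self.session_keys[sender], ciphertext)
        self.seen.append(plaintext)
        return keystream_xor(self.session_keys[recipient], plaintext)

    def forward_e2e(self, ciphertext):
        # End-to-end model: the relay has no key and can only pass bytes on.
        self.seen.append(ciphertext)
        return ciphertext


def demo_transport(message: bytes):
    relay = Relay()
    relay.session_keys = {"alice": os.urandom(32), "bob": os.urandom(32)}
    wire = keystream_xor(relay.session_keys["alice"], message)
    out = relay.forward_transport("alice", "bob", wire)
    delivered = keystream_xor(relay.session_keys["bob"], out)
    return delivered, relay.seen[-1]


def demo_e2e(message: bytes):
    relay = Relay()
    shared_key = os.urandom(32)  # known only to the two participants
    wire = keystream_xor(shared_key, message)
    delivered = keystream_xor(shared_key, relay.forward_e2e(wire))
    return delivered, relay.seen[-1]


if __name__ == "__main__":
    msg = b"hello from Alice"
    print("transport: relay saw", demo_transport(msg)[1])
    print("e2e:       relay saw", demo_e2e(msg)[1].hex())
```

In both cases the recipient receives the original message, but only in the first case does the relay see it in the clear.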

This implies that Zoom can access any content on its servers; this shows how the word “end-to-end” is only added to enhance “the corporate image”. In fact, Zoom’s documentation only mentions the possibility of end-to-end encryption of meeting chats, provided you subscribe to a paid plan. “If it’s free, you’re the product”.

Conclusions

Summing up, Zoom is certainly a versatile solution, easy to use and accessible to everyone on any platform commonly found at home today. However, there remains a deep uncertainty about how the privacy of the end user, and of the device they use, is guaranteed and protected.

As a researcher, but also as an astute user, I expect the company to point out this sort of foggy situation to me and make it absolutely transparent.

Although Zoom can, to date, be considered the most successful platform, there are other, more open solutions available, including Jitsi, which is perhaps the most viable alternative.

Jitsi is a set of open source projects that allow you to easily create and deploy secure and private video conferencing solutions. At the heart of Jitsi are Jitsi Videobridge and Jitsi Meet, which allow you to hold conferences on the Internet. The biggest advantage is its “self-hosted” nature, which allows users to install Jitsi on their own server and manage their data more directly.