This newfound vulnerability highlights a potential avenue for cyber-attacks, especially in the context of video conferencing platforms like Zoom, which have experienced widespread adoption. With the prevalence of devices equipped with built-in microphones, the risk of sound-based cyber threats has escalated.
Advanced decoding techniques
The system the researchers devised uses machine learning algorithms to identify which keys are being pressed by analyzing the sounds produced during typing. A similar approach has been applied in recent years to the WW2 cipher device Enigma.
To train the system, the researchers meticulously recorded the sounds generated by pressing each of the 36 keys on a MacBook Pro multiple times, using various fingers and degrees of pressure. These sound recordings were captured both via a Zoom call and a smartphone positioned in close proximity to the keyboard.
The researchers then fed a subset of this recorded data into their machine learning system, which gradually learned to recognize the distinct acoustic pattern associated with each key. It remains unclear exactly which cues the system relies on, but it was suggested that a key's proximity to the edge of the keyboard might play an important role in producing a distinguishable sound.
Subsequent testing on the remaining data showed that the system assigned the correct key to a sound in 95% of cases when the recording came from the nearby smartphone, and in 93% of cases when it came from the Zoom call.
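The train-then-test procedure described above can be illustrated with a toy example. The sketch below is not the researchers' model: it assumes synthetic "keystrokes" (damped tones whose pitch depends on the key), a hypothetical set of eight keys instead of 36, hand-picked probe frequencies, and a simple nearest-centroid classifier in place of a real machine learning system. All names and parameters are illustrative assumptions.

```python
import math
import random

SAMPLE_RATE = 8000                           # Hz (assumed)
CLIP_LEN = 256                               # samples per keystroke clip (assumed)
KEYS = "abcdefgh"                            # small stand-in for the 36 keys
BANDS = [450 + 150 * i for i in range(17)]   # probe frequencies in Hz (assumed)


def synth_keystroke(key_index, rng):
    """Synthetic stand-in for a recording: a damped tone whose pitch depends
    on the key, with random loudness (finger pressure) and background noise."""
    freq = 600.0 + 300.0 * key_index + rng.gauss(0.0, 10.0)
    loudness = rng.uniform(0.5, 1.0)
    return [
        loudness * math.exp(-n / 80.0) * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
        + rng.gauss(0.0, 0.03)
        for n in range(CLIP_LEN)
    ]


def features(clip):
    """Log energy at each probe frequency, shifted to zero mean so that
    overall loudness matters less."""
    logs = []
    for f in BANDS:
        w = 2 * math.pi * f / SAMPLE_RATE
        re = sum(x * math.cos(w * n) for n, x in enumerate(clip))
        im = sum(x * math.sin(w * n) for n, x in enumerate(clip))
        logs.append(math.log(re * re + im * im + 1e-12))
    mean = sum(logs) / len(logs)
    return [v - mean for v in logs]


def train(samples):
    """Nearest-centroid 'training': average the feature vectors for each key."""
    centroids = {}
    for key in KEYS:
        vecs = [vec for k, vec in samples if k == key]
        centroids[key] = [sum(col) / len(vecs) for col in zip(*vecs)]
    return centroids


def predict(centroids, vec):
    """Return the key whose centroid is closest to the feature vector."""
    return min(
        centroids,
        key=lambda k: sum((a - b) ** 2 for a, b in zip(centroids[k], vec)),
    )


def run_experiment(presses_per_key=20, train_fraction=0.75, seed=0):
    """Record each key several times, train on a subset, test on the rest."""
    rng = random.Random(seed)
    train_set, test_set = [], []
    n_train = int(presses_per_key * train_fraction)
    for i, key in enumerate(KEYS):
        clips = [(key, features(synth_keystroke(i, rng))) for _ in range(presses_per_key)]
        train_set += clips[:n_train]
        test_set += clips[n_train:]
    centroids = train(train_set)
    hits = sum(predict(centroids, vec) == key for key, vec in test_set)
    return hits / len(test_set)


if __name__ == "__main__":
    print(f"held-out accuracy: {run_experiment():.0%}")
```

Because the synthetic tones are cleanly separated, this toy classifier has a far easier job than a system working from real recordings. It only shows the shape of the experiment: record each key repeatedly, learn from a subset, and measure accuracy on presses the system has not seen.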
The study serves primarily as a proof of principle, and the technique has not been used to crack passwords in practical scenarios. Even so, it underscores the need for vigilance in safeguarding sensitive information. The researchers emphasize that laptops, which are frequently used in public spaces, face heightened risks due to their common keyboard design. Nevertheless, similar eavesdropping techniques could potentially target any type of keyboard.
To mitigate the risks posed by such acoustic “side channel attacks,” the researchers propose several strategies. These include opting for biometric logins when feasible or enabling two-step verification. Alternatively, they recommend using passwords that mix upper- and lower-case letters, numbers, and symbols, because it is difficult to tell from sound alone when the shift key is released. In addition, one should avoid typing passwords and sensitive messages during calls.
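As an illustration of the mixed-character recommendation, the following minimal sketch generates a random password containing at least one lower-case letter, one upper-case letter, one digit, and one symbol. The function name and the default length are assumptions made for the example; the researchers did not publish such a tool.

```python
import secrets
import string


def make_password(length=16):
    """Random password with at least one character from each class."""
    pools = [
        string.ascii_lowercase,
        string.ascii_uppercase,
        string.digits,
        string.punctuation,
    ]
    if length < len(pools):
        raise ValueError("length must be at least 4")
    # One guaranteed character per class, then fill from the full alphabet.
    chars = [secrets.choice(pool) for pool in pools]
    alphabet = "".join(pools)
    chars += [secrets.choice(alphabet) for _ in range(length - len(pools))]
    # Shuffle so the guaranteed characters do not sit in fixed positions.
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)


if __name__ == "__main__":
    print(make_password())
```

A password manager that fills in such a password automatically would also avoid typing it at all, which fits the advice not to type passwords during calls.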
The authors of the study note that this is not the first attempt to identify passwords by sound. However, they say their method is among the most advanced and has shown the greatest accuracy. The team expects such models to continue to improve. In light of that, they stress the urgency of public discussion on the governance of AI, given the increasing prevalence of smart devices with microphones in households.
The study was published as a part of the IEEE European Symposium on Security and Privacy Workshops.