Tue Sep 01, 2020 11:14 pm by entropy
Just skimmed the paper. I can't tell whether they tried it on real audio. Relevant sentence from the paper: "As clicks occur from real-world acoustic signals as depicted in Figure 3(a), we simulate such click timeseries for all possible victim keys in the pool, and obtain their corresponding set of candidate keys". So "real-world acoustic signals" certainly sounds like they used real audio. But this is in a section titled "Simulation Setup and Implementation" so... not real audio? And they tested it on 330,424 different keys. I doubt they physically cut that many keys. And if they had actually tried it experimentally, I'd expect them to list the make and model of the phone in the excruciating detail typical of such papers.
Also this: "SpiKey is able to provide 5.10 candidate keys guaranteeing inclusion of the correct victim key from a total of 330,424 keys, with 3 candidate keys being the most frequent case" (emphasis theirs). Inclusion of the correct key could only be guaranteed if it was a simulation.
In the abstract: "In this paper, we propose SpiKey, a novel attack that ..." So they are only proposing it? Also: "As a proof-of-concept, we provide a simulation, based on real-world recordings". So is it a simulation or real-world?
If they tried it on real audio and it worked, then I'm astonished. When you are inserting a key in a lock there is all sorts of metal flying all over. There is the passing of each key ridge across each pin, which is what they are after. But also you are essentially bumping the lock when inserting the key. There will be pins jumping up and bouncing back down at indeterminate times. And metal tends to ring like a bell, at least to some extent. So you have one event still emitting sound when the next event happens. The part they talk about in the paper, working out where you expect the sounds to occur, is the easy part. The hard part is the signal processing to find the clicks in the mess of audio.
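To make the ringing point concrete, here is a toy sketch. It is not anything from the paper: the click model (an exponentially decaying sinusoid), the naive energy-threshold detector, and every number in it (sample rate, tone, decay times, threshold) are made up purely for illustration. It only shows how a slow decay can smear separate clicks into one detected event.

```python
import math


def synth_clicks(click_times, decay_s, sample_rate=8000, duration_s=0.1,
                 tone_hz=2000.0):
    """Sum of exponentially decaying sinusoids, one per click.

    A crude, assumed model of metal "ringing"; real lock audio is far messier.
    """
    n = int(sample_rate * duration_s)
    signal = [0.0] * n
    for t0 in click_times:
        start = round(t0 * sample_rate)
        for i in range(start, n):
            dt = (i - start) / sample_rate
            signal[i] += math.exp(-dt / decay_s) * math.cos(
                2 * math.pi * tone_hz * dt)
    return signal


def detect_onsets(signal, sample_rate=8000, frame=16, threshold=0.1):
    """Naive detector: report a time whenever the mean energy of a frame
    rises from below the threshold to at or above it."""
    onsets = []
    above = False
    for start in range(0, len(signal) - frame + 1, frame):
        chunk = signal[start:start + frame]
        energy = sum(x * x for x in chunk) / frame
        if energy >= threshold and not above:
            onsets.append(start / sample_rate)
        above = energy >= threshold
    return onsets


if __name__ == "__main__":
    clicks = [0.010, 0.030, 0.050]  # three clicks, 20 ms apart (hypothetical)
    # Clicks that die out in about 1 ms versus clicks that ring for about 50 ms.
    crisp = detect_onsets(synth_clicks(clicks, decay_s=0.001))
    ringing = detect_onsets(synth_clicks(clicks, decay_s=0.05))
    print("fast decay:", len(crisp), "onsets found")
    print("slow decay:", len(ringing), "onsets found")
```

With these invented parameters the fast-decay signal yields all three clicks, while the slow-decay one yields only the first, because the energy never drops back below the threshold before the next click arrives. A real attack would need something much smarter than this, and that is before adding bouncing pins and background noise.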