As we grow accustomed to voice assistants and they appear in more and more devices, we see more situations like the one last week, when televisions triggered mass orders on the Amazon Echo. But apparently there are also hidden commands that only voice assistants understand, while to us humans they sound more like voices from beyond the grave.
That is what the researchers behind the paper "Hidden Voice Commands" have shown. What they accomplished is that, through certain modifications, a voice command becomes barely intelligible to us, while voice assistants recognize it without problems and obey the request.
When the "beyond" is only three meters away
The researchers developed two sets of hidden commands, depending on the type of victim: one for Google Assistant and another for an open-source voice recognition program (the CMU Sphinx speech recognition system), available at this link (referred to as the black-box and white-box attacks). Comparing them, we see slight differences in the audio from one to the other, but in both cases they are not very intelligible to human beings.
That said, as the researchers explain, knowing what to listen for can predispose and condition us, making the commands easier to understand once we know what we are dealing with; that would not happen if someone played these sounds in an everyday situation (riding the bus, sitting in a restaurant, etc.). For that reason they gave no details to the human testers in their study, who were unable to "translate" the commands (only 25% managed to write down at least half a sentence).
There are also differences depending on the command in question. According to the paper, the phrase "OK, Google" was understood 90% of the time, but things change when it comes to the order itself: human testers understood it only 20% of the time, compared with 95% for Google Assistant.
The differences in understanding, depending on the command and the receiver (human or machine).
How did they create commands as effective as they are chilling? By resorting to complex algorithms that iteratively refined the audio until they obtained orders that the human ear could not understand well but that machines could. In addition, creating commands that fool Google Assistant posed an extra challenge, since Google does not publicly disclose how it processes spoken orders.
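The iterative idea described above can be sketched as a toy search loop. This is not the paper's actual algorithm: the scoring functions below are invented stand-ins (a "recognizer" that only sees coarse segment averages, and a "human" score that is hurt by sample-level noise), chosen so that mutations preserving the coarse features degrade human intelligibility without affecting the machine.

```python
import random

random.seed(0)

SEG = 10  # samples per segment; the toy recognizer only sees segment means

def machine_score(audio, template):
    """Toy recognizer confidence, based on per-segment means only
    (a stand-in for the acoustic features a real system extracts)."""
    err = 0.0
    for i in range(0, len(audio), SEG):
        seg_mean = sum(audio[i:i + SEG]) / SEG
        err += abs(seg_mean - template[i // SEG])
    return max(0.0, 1.0 - err)

def human_score(audio):
    """Toy intelligibility: humans are hurt by sample-to-sample
    roughness (a stand-in for the paper's listener tests)."""
    rough = sum(abs(audio[i + 1] - audio[i]) for i in range(len(audio) - 1))
    return max(0.0, 1.0 - rough / len(audio))

# Start from a "clean" command: smooth audio that matches the template.
template = [0.2, -0.1, 0.3, 0.0]
audio = [m for m in template for _ in range(SEG)]

for _ in range(2000):
    candidate = list(audio)
    i = random.randrange(len(audio) // SEG) * SEG
    j, k = random.sample(range(SEG), 2)
    delta = random.uniform(-0.05, 0.05)
    candidate[i + j] += delta  # zero-sum edit inside one segment, so the
    candidate[i + k] -= delta  # segment mean (machine feature) is unchanged
    if machine_score(candidate, template) > 0.9 and \
       human_score(candidate) < human_score(audio):
        audio = candidate  # keep edits that machines accept and humans lose

print(round(machine_score(audio, template), 2))  # stays high
print(round(human_score(audio), 2))              # driven down
```

The real black-box attack against Google's system had to work with far less information, treating the recognizer as an oracle, which is exactly why the authors describe it as the harder case.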
Luckily, distance is one of the limiting factors. In the video we see that the speaker emitting the voice is placed about 3 meters away, and the researchers specify that the commands become ineffective beyond about 3.6 meters. Even so, a radius of 3 meters is enough for these commands to be used discreetly, especially considering that they remain effective even with background noise. It also matters whether the phone hears the phrase directly or through YouTube (which is why testing it at home may yield a lower success rate).
Are we helpless against these commands from beyond the grave?
Some features of voice assistants are a boon for usability (even more so for those who have not yet worked them into their routine), such as "always-on" systems that let us call on them at any moment, but at the same time they mean we are always exposed. The Atlantic, discussing these hidden commands, also mentioned Apple's new earphones, the famous AirPods, which likewise provide constant access to Siri.
This, together with the growing number of sensors across the industry, makes it "easier" to fool our phones. In other words, the more sensors a device has (microphones, cameras, etc.), the more likely it is that someone else can take control of it, something known in security research as "increasing the attack surface".
So what threat do these external commands pose to our devices? Beyond pranks like sending embarrassing messages, what we see is that the assistant fully understands and obeys the command when asked to open a website, so one of the dangers is that it could be ordered to open a site serving malware, or one that causes failures like those we saw months ago on iPhones.
In this vein, as is usual in this kind of work, after testing the attack the researchers tried to devise ways to harden these systems against hidden commands. They determined that a notification is not enough (it can be ignored, or go unheard in some environments), nor is the confirmation we mentioned before, and they also note that speaker recognition (accepting only one person's voice) does not always work well.
What solution do they propose, then? Machine learning, applied so that devices can tell a human voice apart from these processed voices. They also propose audio filters, so that processed commands fail validation while human ones pass, although this makes commands somewhat harder for the assistant to understand, and manufacturers might resist incorporating them.
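A minimal sketch of that first defense, under invented assumptions: suppose processed commands carry more sample-to-sample noise than natural speech (a toy proxy, not the features the authors actually used), and learn a single roughness threshold separating the two classes.

```python
import random

random.seed(1)

def roughness(audio):
    """Average sample-to-sample jump; the toy assumption is that
    processed commands are noisier than natural speech."""
    return sum(abs(audio[i + 1] - audio[i])
               for i in range(len(audio) - 1)) / len(audio)

def make_human():
    # Smooth, slowly varying signal standing in for natural speech.
    x, out = 0.0, []
    for _ in range(200):
        x += random.uniform(-0.02, 0.02)
        out.append(x)
    return out

def make_processed():
    # The same kind of signal plus heavy sample-level noise,
    # standing in for an obfuscated hidden command.
    return [s + random.uniform(-0.3, 0.3) for s in make_human()]

# "Train": place the threshold halfway between the class averages.
human_r = [roughness(make_human()) for _ in range(50)]
proc_r = [roughness(make_processed()) for _ in range(50)]
threshold = (sum(human_r) / 50 + sum(proc_r) / 50) / 2

def is_human(audio):
    return roughness(audio) < threshold

print(is_human(make_human()))      # expected: True
print(is_human(make_processed()))  # expected: False
```

A real detector would of course be trained on genuine obfuscated samples and richer acoustic features, but the shape of the defense the researchers describe is the same: learn what separates processed audio from a live human voice and reject the former.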