Smart assistants in Russia: market overview, trends and prospects

May 18, 2021, 12:49 (UTC+3)


At the end of April, SberDevices opened beta testing of the Visper platform for creating a virtual presenter that can read a text like a live speaker. This is not the first year that Sberbank has worked with digital avatars: Nika, a presence robot, was launched in 2018; Elena, a virtual news anchor, was launched in 2019; and in 2020 the company obtained a patent for a technology that generates human facial expressions from text. Last year Mail.ru Group introduced its own platform with digital presenters.

As smart assistants develop, a trend is emerging to move voice assistants beyond the voice only approach, that is, to build assistants that are not limited to a voice interface.

ICT.Moscow spoke with key players in this market in Russia and with foreign industry representatives to understand what is happening in the digital assistant industry now and what the main trends will be in the near future. The opinions of 17 experts have formed a complex picture of the industry.

The main trends are: the development of multimodality of smart assistants; experimenting with device formats and user interaction mechanics; the growing expectation of secure and convenient voice commerce; hopes and concerns related to voice identification of users; the increased use of smart assistants in business. Here are some aspects of smart assistants that ICT.Moscow discussed with the representatives of the industry:

How the use of digital assistants is expanding

Who makes smart assistants in Russia

2021: the year of experiments

From speakers to avatars

Digital assistants as a workforce

Areas of application for digital assistants

The rise of voice commerce

Smart assistants for smart cities

Biometrics for working with digital assistants

Legal framework for smart assistants

How the use of digital assistants is expanding

According to Strategy Analytics, in 2020, the global market for smart speakers surpassed the mark of 150 million units sold. At the same time, the share of smart screens reached 26%. According to Just AI, by the end of 2023 there will be 640 million smart speakers in the world. Juniper Research experts expect there will be 8.4 billion voice assistant devices in use by 2024.

According to Just AI’s estimates, the number of voice assistant users in Russia reached 52 million in 2020. The most popular assistants in the country are Alice (45 million users), Google Assistant (11 million) and Siri (6 million); some users rely on several assistants at once. A Just AI survey of smartphone users showed that more and more people use smart assistants: in 2019, 71% of respondents had interacted with such services, and in 2020 this figure reached 77%. In 2020, 32% of respondents in Russia used voice assistants every day, compared with 29% in 2019.

Kirill Petrov, managing director of Just AI, explained that 2020 had become a turning point for smart assistants, and in 2021 their popularity will continue to grow.

The demand for smart speakers in Russia is also increasing. Sales of speakers with a voice assistant increased sevenfold in a year. According to M.Video-Eldorado’s estimates, devices with Alice accounted for the vast majority of sales in January-July 2020. In March, Yandex announced that it had sold over 1.3 million speakers with its voice assistant in the three years since launch. Nevertheless, smart speakers have not yet become the main channel of interaction between a person and a smart assistant. Mail.ru Group claims that smartphones are the leading category of devices with voice assistants.

We estimate that 90 to 95% of voice assistant users use phone assistants. We are observing this trend and are working in the voice first model (a multimodal format that allows voice interaction) rather than voice only. The voice only format has several limitations, which are most evident in scenarios that involve choosing, searching for or studying information.

Anatoliy Kulbatskiy

Marusya Product Director at Mail.ru Group

Pavel Gvay, CEO and co-founder of Fabble.io, a tool for designing dialogues, also mentions the limitations of the voice only format.

Voice only will develop along with neural networks and voice recognition technologies. However, the potential of this format will always be limited to tasks that do not require visual contact. In this regard, the voice first format has almost limitless potential, inheriting the strengths of both the graphical and the voice interface.

Pavel Gvay

CEO and co-founder of Fabble.io, a tool for designing dialogues

Who makes smart assistants in Russia

Natural language processing (NLP) is the fourth largest area of work in Russia in the field of artificial intelligence (AI): according to the creators of the “Artificial Intelligence map of Russia” (as of April 29, 2021), 52 companies out of approximately 480 work in this area. The top 15 Russian companies who are developing NLP are Yandex, Speech Technology Center, ABBYY, Mail.ru Group, Just AI, Tinkoff, Sberbank, etc. (the list was compiled by the authors of “AI Almanac No. 2. AI Report — NLP” based on an expert survey).

As Anatoliy Kulbatskiy, Marusya Product Director at Mail.ru Group, notes, “ecosystems are the key players in the market of general-purpose assistants”. These players are, first of all, Yandex with its Alice, Mail.ru Group with Marusya and Sber with the Salut virtual assistants. Together with the development of voice assistants, these companies have created their own devices — Station (Yandex), Capsule (Mail.ru Group) and Portal (Sberbank). The latter is currently the only Russian smart screen similar to Google’s Nest Hub or Amazon’s Echo Show.

MTS is also working on its “speaker-assistant” pair. Last summer, the device was handed out to users for testing, but in early 2021 the media reported that the project had “come to a halt”. Tinkoff also has its own voice assistant “Oleg”; its core functionality is financial management, but it is also capable, for example, of answering incoming phone calls (when using Tinkoff Mobile).

During the past year we saw activity, investment and development in specialized assistants for solving specific problems: bank assistants, assistants for ordering services, answering user questions. I think that this year or next year the formation of players in the market for general-purpose assistants will finally be completed, and assistants for solving specialized tasks will more actively come to assistant platforms. This step will make it possible to reach a large audience and give people the experience of interacting with a specialized assistant.

Anatoliy Kulbatskiy

Marusya Product Director at Mail.ru Group

Stepan Mitaki, head of the My Moscow mobile application, agrees that one of the current trends is the emergence of narrowly specialized voice assistants, each aimed at solving specific user problems. One example is “Oleg”, which was described as “a voice assistant for financial and lifestyle services”. In a recent Clubhouse discussion, experts suggested that over time companies will be less likely to create their own independent smart assistants and will focus more on specialized skills within open platforms. For example, Pavel Kaplya, the head of the Yandex.Dialogues service, noted that “businesses should not set the task of making their own assistant; they need to think about how to effectively and concisely enter other general-purpose assistants”.

Another industry trend (which, however, the participants of the discussion called controversial) is the opening of platforms to third-party developers who create new skills for smart assistants, in other words, a model somewhat similar to the open source principle. Alice’s skills, Sber’s smartapps (applications that make it possible to promote goods and services on smart devices with the built-in Salut voice assistants) and Marusya’s skills are created on this model. Experts compare it to writing apps for the App Store and Google Play and predict that over time this area will gather pace and the mechanisms for creating skills will become simpler. At the same time, they do not claim unequivocally that the industry will develop exactly according to this scenario.

2021 — the year of experiments

The experts with whom ICT.Moscow discussed the trends in the development of digital assistants do not expect drastic changes in 2021, but they do expect new mechanics of user interaction with smart assistants to emerge and foresee experiments with digital avatars and various devices.

I don’t think 2021 will be a turning point in the development of voice tech. The peak of expectations is over, now companies are more likely to experiment with screens, avatars, Emotional AI and wearable devices. These experiments may well ensure the development and change of the market structure in the coming years.

Pavel Gvay

CEO and co-founder of Fabble.io, a tool for designing dialogues

Mikhail Burtsev, head of the Neural Networks and Deep Learning Laboratory at MIPT, notes that assistants will become cross-platform and points out that Alice is already available in speakers, TVs and cars. Speech Technology Center CEO Dmitry Dyrmovsky also speaks about the experiments. He notes that “banks and financial institutions traditionally give preference to modern AI solutions to improve user experience, they have already realized their effectiveness and will continue to conduct experiments”.

The advantage of the voice only format will remain, and we will also observe the transition from voice only to combined devices. We do not predict exponential growth in 2021; it will happen in the next three years.

Dmitry Dyrmovsky

CEO of Speech Technology Center

Co-founder and COO of Neuro.net Alexander Kuznetsov believes that “the potential of voice assistants has not yet been exhausted and there is definitely space for growth”. “It is possible that new formats will emerge, and the prerequisites for this are already appearing in the market”, he adds.

More and more different life scenarios involve voice interaction, and this creates space for the introduction of virtual assistants and the possibilities for omnichannel interaction with users, switching between devices and formats (voice-to-text) while maintaining context.

Arkady Sandler

expert in conversational interfaces and voice technologies

Pavel Gvay, co-founder of Fabble.io, says that the potential of the voice only format will always be limited to tasks that do not require visual contact. “The voice first format has almost limitless potential in this regard, inheriting the strengths of both the graphical and the voice interface,” the expert claims. Holger G. Weiss, head of German Autolabs, also highlights the limitations of voice only assistants, especially when it comes to interacting with lists. “That is why we are convinced,” he says, “that a combination [of formats] will win — at least for more complex use cases. Smart speakers will still be great for playing music and turning the lights on”.

Big changes await us, and the key changes lie in multimodality, when interaction with an assistant takes place using both voice and visual elements. As for smart screens, assistants integrated into smart TVs will be widely used.

Roman Doronin

CEO of EORA

Kirill Petrov, Managing Director of Just AI, recalls that at the end of last year, sales of smart displays started in Russia. According to the expert, “smart screens give more expressiveness and open up new opportunities, for example, video shopping”. At the same time, Roman Doronin from EORA does not expect a large demand for such devices and believes that smart speakers with a screen in 2021 will remain “devices for experts”. Denis Filippov, CTO of SberDevices, believes that the range of devices with virtual assistants will be actively increasing in the near future: any home appliance from a refrigerator to a TV is a place where an assistant can be installed.

Smart display architecture embodies a multimodal approach — a synergy of visual, voice and touch interfaces. The trend towards multimodality will continue to grow and will gradually reorient the market from voice only to voice first, although in 2021 the concept of voice only will remain the mainstream.

Kirill Petrov

Managing director of Just AI

Igor Kalinin, founder of TWIN, a company that develops an automated communications platform, is convinced that, in terms of technology, the turning point for voice systems has already come and that the next step is scaling, including in the Russian market.

From speakers to avatars

SberDevices, the same division that released the first smart screen in Russia, is involved in creating digital avatars at Sberbank together with other units. The company notes that an avatar helps businesses deliver content to an audience without having to find and hire live speakers, which makes the process faster and cheaper. Mail.ru Group described its virtual presenter in the same way. When it presented the service, the company predicted that by 2022 online video would account for 79% of Internet traffic in Russia.

In our view, virtual personalities are unlikely to be able to massively replace real people in any sphere in the near future, but they are able to supplement the interfaces for the client’s interaction with information.

Denis Filippov

CTO of SberDevices

During a Clubhouse conversation with industry experts, Fabble CEO Pavel Gvay spoke about the possibilities of multimodality and noted that, probably, “in the future we will be able not only to hear the assistant, but also to see its avatar with facial expressions”.

Another division of Sber, AR/VR Lab, is also developing digital avatars: in February it opened free alpha access to a service that creates facial animation of a 3D character from an audio recording of a person’s speech. Holger Weiss, founder and CEO of the German Autolabs company, which develops voice assistants for the logistics sector, also points to the prospects for combining augmented and virtual reality technologies with smart assistants.

Voice assistants will have new AR and VR cases, for example, in the service and manufacturing areas.

Holger Weiss

Founder and CEO of German Autolabs

There are already examples of digital avatars being used instead of TV presenters. For example, in the fall of last year, MBN, a Korean TV channel, began to use this technology. Journalists believe that a virtual presenter can be especially useful in covering emergencies when no suitable specialist is available. But replacing presenters or announcers with smart assistants is not always received positively: recently, the Moscow Department of Transport received applications submitted on behalf of the Alice and Salut assistants in its competition for Metro announcer, but still chose living people.

Alexander Kuznetsov from Neuro.net notes the increasing availability of these technologies, including for medium and small businesses, and also speaks of the trend of introducing smart assistants into user interfaces. Denis Filippov from SberDevices emphasizes that digital avatar technologies can significantly diversify the video content market while reducing production costs. But the question of successful business models for such solutions remains open, and the search for new applications continues.

Digital assistants as a workforce

Voice tech developers claim that smart assistants, performing part of the functions of people, will not replace human workers.

We are against the idea of firing employees and replacing them with digital assistants. Our technologies allow you to release a person from routine tasks, monotonous work that can turn a person into a robot. Our digital agents can handle about 80% of standard cases.

Alexander Kuznetsov

co-founder and COO of Neuro.net

MegaFon representatives believe that new professions emerge with the development of technology. For example, the development team of Elena, a virtual assistant, includes configurators and dialogue designers, but five years ago there were no such jobs on the Russian market.

Even if voice assistants do not replace humans, they will have a great influence on human labor. Analysts at Gartner at the end of last year included the increase in labor productivity due to the use of speech technology in the top 10 strategic forecasts. They estimate that by 2025, 75% of all conversations at work will be recorded and analyzed, including through smart speakers. Gartner also sees hyper-automation as one of the global technology trends, which includes the use of AI and virtual assistants.

Areas of application for digital assistants

The experts agree that digital assistants are most actively introduced in the banking sector. At the same time, cases of using voice assistants, chatbots and smart avatars can be found not only in banking, but also in medicine, customer support, transport, city services, education, culture and media.

The financial industry, primarily large banks, will remain the driving force. Call center automation will remain the main area of application. The general trend for the next few years is the introduction of assistants in areas where there is a lot of interaction with customers, for example, in online stores.

Mikhail Burtsev

head of the Neural Networks and Deep Learning Laboratory at MIPT

Alexander Kuznetsov, co-founder and COO of Neuro.net, calls the banking and financial industries and telecom the most active in the implementation of voice assistants. He expects that these industries will be joined by major players in retail, e-commerce and services.

First of all, these are banks and retail — and this is obvious: banks and financial institutions traditionally give preference to modern AI solutions to improve user experience, they have already realized the effectiveness of these solutions and will continue to experiment. The key driver of growth is freeing employees from routine tasks, automating standard queries, searching for a relevant user response in a short period of time in order to save the client’s time.

Dmitry Dyrmovsky

CEO of Speech Technology Center

Fabble.io co-founder Pavel Gvay says that the banking sector, medicine and vehicles are the most promising areas. According to him, medicine and banking involve collecting a lot of information and answering the same types of questions: how to make an appointment with a doctor, or what test needs to be done before an appointment. But highly qualified specialists, such as doctors and consultants, are unlikely to be replaced by digital assistants in the near future, the expert adds.

Just AI experts also note that smart assistants were originally created by IT companies, and say that Internet companies (Yandex, Mail.ru), large banks and financial institutions will drive the development of voice assistants in 2021.

In the medium and small-sized business segment, integrated voice assistants will become popular in 2022-2023. Meanwhile, the most popular conversational AI technologies in these companies are relatively simple scenarios, such as robotic calls and informing companies’ customers.

Kirill Petrov

Managing director of Just AI

Igor Kalinin from TWIN company says that over time, bots will emerge in all B2C industries. The only problem is that the Russian consumer is not yet used to communicating with bots.

This is similar to the situation with self-checkout counters: people often prefer to stand in line rather than scan and pay for goods on their own. They are afraid of doing something wrong. The same applies to bots: many people think that the dialogue will be unsuccessful and that AI will not solve the problem. At the same time, it is becoming more and more difficult to distinguish a virtual specialist from a real operator. You may already have communicated with a bot without knowing it.

Igor Kalinin

founder of TWIN

Managing director of Just AI Kirill Petrov says that the voice search for goods in an electronic catalog is among the new scenarios that are gaining popularity. “This trend partly explains the fact that in the US more than 45% of users would like to be able to interact with mobile applications by voice”, he explains. “In addition, we will see more smart devices in commercial organizations, for example, in hotel rooms”.

Stepan Mitaki, head of the “My Moscow” mobile application, also mentions the fact that smart devices go beyond apartments. According to him, “in the West, you can now find voice assistants in hypermarkets or in various service institutions. And people are not afraid to talk to them”.

CEO of EORA Roman Doronin also pays attention to the efficiency shown by projects where different technologies are combined, for example, natural language processing and computer vision.

Another example of combining technologies is digital avatars, which pair speech technologies with realistic video generation. They primarily target industries that use audiovisual content, such as media.

The rise of voice commerce

Making purchases using a voice assistant is one of the basic functionalities that were announced during the presentation of both Alice and Salut. So far, however, commerce is not on the list of the main user scenarios for interacting with virtual assistants. Just AI polls show that in Russia, voice assistants are most frequently used to search the Internet, to navigate, to find out the weather forecast, to call, set an alarm or turn on the music. Dmitry Dyrmovsky, CEO of Speech Technology Center, states that so far most of the skills of voice assistants have a clear entertainment priority, and business orientation is only gaining momentum.

In 2018, experts from OC&C Strategy Consultants optimistically predicted that by 2022 the voice commerce market in the United States would reach $40 billion and that this sales channel would change retail. According to them, 36% of smart speaker owners had already used these devices for shopping (other analysts gave lower figures: 22% according to Edison Research and 23% according to Voicebot). In November last year, experts from Juniper Research predicted that over the next five years the number of voice purchases made through smart home devices would grow by 630%, and that about 20% of all purchases would be made using smart screens and smart TVs. By 2025, the value of voice transactions on smart home devices is expected to reach $164 billion.

Roman Doronin from EORA agrees that 2021 will be a breakthrough year for the commercialization of voice assistants. According to him, “the trend for this is set by Sber with the Salut assistants ecosystem and the ability to integrate payments into different types of applications”.

For now, music monetization via subscriptions has been implemented for all assistants on the market. The next stage is the development of payments for digital goods (games, audio content, other subscription services) on smart devices and for non-digital goods on the platform in mobile applications, as well as the introduction of assistants to simplify scenarios where users need to enter data, re-order or order a specific product.

Anatoliy Kulbatskiy

Marusya Product Director at Mail.ru Group

At the same time, Anatoliy Kulbatskiy from Mail.ru Group draws attention to the existing restrictions on the commercialization of both digital content and non-digital goods in Russia. Kulbatskiy points to a relatively small market of devices for digital goods (about 1.5 million devices in the Russian Federation) compared to the market of smartphones, PCs and TVs. Since “the dominant category of use of voice assistants are smartphones, the sale of digital goods via assistants falls under the regulation of sales on Apple and Google platforms”, he emphasizes. On the other hand, payment with voice confirmation is at an early stage, and users do not have a “buy using voice” pattern. But the expert expects a number of new and interesting solutions for purchasing goods, paying for services and payments to appear on the market this year.

We believe that 2022 will be a breakthrough year, but this year will also be important for customized voice assistants. Apart from banks, retailers will have similar assistants this year. Voice commerce in Russia will develop in line with global trends. The consumer pattern in terms of voice shopping is largely shaped by smart speakers and screens.

Kirill Petrov

Managing director of Just AI

The expectations of other experts are more restrained. For example, Arkady Sandler, an expert in the field of conversational interfaces and voice technologies (he was the CEO of “Nanosemantics”, a chatbot development company, and supervised the creation of “Marvin” smart speaker and voice assistant at MTS), believes that we will not see a boom in voice commerce this year, although he expects that experiments will be conducted in this area.

In 2021, more than a quarter of which has already passed, the process of users getting more accustomed to the product and the way of interaction will take place. This will lead to an increase in the user base, coverage by class of products. As for monetization, I do not think that massive monetization will begin in the area of general-purpose assistants this year, but there can be some experiments.

But assistants for special purposes are created and will be created in order to provide some kind of business model, optimize the business process, etc. Actually, such assistants started to be created long before general-purpose assistants. The very existence of special purpose assistants is the proof of economic feasibility.

Arkady Sandler

expert in conversational interfaces and voice technologies

Alexander Kuznetsov, co-founder of Neuro.net, is convinced that data confidentiality is the main limitation for the commercialization of smart assistants. He says that the participants in this fast-growing market need to pay a lot of attention to this issue.

Denis Filippov, CTO of SberDevices, points out that at present smart assistants practically do not bring profit.

While virtual assistants are practically not monetized, for now it looks more like an investment. The local market has not yet reached maturity, but now companies are gradually discovering commercial models in which assistants act as guides between the customer and the purchase.

Denis Filippov

CTO of SberDevices

Nikita Murenky, VUI Team Lead of the TORTU conversational product design and development team, discusses the difficulties of another type of commercialization: payment for individual assistant skills, rather than purchases made through assistants. In his opinion, in Russia the problems with commercialization are the same as in the rest of the world: “firstly, it is difficult to find the right skills in assistants, although Amazon and Google platforms are doing a lot to change this; secondly, the use cases are either of little value, or the user is simply not ready to pay for them yet”. Today, the culture of using smart devices in Russia and the world is only taking shape, the expert emphasizes.

A user who has bought a device and regularly pays for a subscription to services believes that he has already paid for an assistant — for him, this is part of the product. He does not perceive the assistant skills as a separate product and does not understand why he has to pay for something else.

Nikita Murenky

VUI Team Lead of KODE’s TORTU conversational product design and development team

Another factor holding back the growth of the segment of smart speakers and other devices with smart assistants is the availability of electronic components to manufacturers. A representative of MTS draws attention to this. “There is an acute shortage of AI chips all over the world, and there are very few companies that already have ready-made chips and products based on them”, he says. “We estimate that the AI chip market will grow by an average of 25% annually”. The expert added that, to address this problem, the company had invested $10 million in Kneron, a startup that makes AI chips.

Smart assistants for smart cities

Over the past few years, smart assistants have begun to be used to simplify access to various social and other services. For example, a beta version of a digital assistant is available on the federal portal of public services, and smart chatbots are used in various Moscow services.

In our experience, Moscow and the Moscow Region initiate projects, and then the successful experience is scaled in the regions. For example, our joint project with the Ministry of Health of the Moscow Region is indicative here: in December we launched digital operators to make an appointment with a doctor, first on the hotline of the Moscow Region governor, and then the case was introduced in several more regions.

Alexander Kuznetsov

co-founder and COO of Neuro.net

Dmitry Dyrmovsky says that the Speech Technology Center receives more and more requests for intelligent dialogue systems, which serve as a convenient mediator between the city and its residents. As an example, he mentions the “Alexandra” chatbot, created jointly with the Moscow Metro team, which answers 88% of passengers’ questions without transferring them to an operator. Mikhail Burtsev, head of the Neural Networks and Deep Learning Laboratory at MIPT, says that “Lilia”, an intelligent assistant for public services, was developed and deployed in Tatarstan on the basis of the open library DeepPavlov. She can answer questions about COVID-19, register users for vaccinations and take meter readings.

Arkady Sandler says that one of the most frequent ways of implementing cognitive automation technologies is to create chatbots for specific subject areas, which in effect means developing specialized virtual assistants. The main focus of governments in voice tech is introducing AI into hotlines, summarizes Nikita Murenky from TORTU, adding that in Russia this is already happening at the level of regional MFCs.

A great advantage of working with voice assistants in this area lies in a typical case: a user of public services, as a rule, knows what they need but does not know how to formulate a request for a service. This is the perfect case for AI, and voice speeds up the interaction. The main barriers to implementation are the underdevelopment of the market and a lack of experience.

Nikita Murenky

VUI Team Lead of KODE’s TORTU conversational product design and development team

Boris Mayatsky, a representative of the “Citywide contact center” product of the IT Department of the Moscow Government, considers it more promising for urban tasks to develop individual solutions that take information security measures into account, although some services will be implemented using the skills of voice assistants, for example, Alice or Salut. Stepan Mitaki, head of the “My Moscow” mobile app, speaks in favor of a combined approach. In some situations a dedicated solution copes better with the user’s task and people trust it more. In others, it is possible to help a person through integration; this is most relevant for obtaining reference information.

Experts from MegaFon see strong interest in voice assistants from the state and say that it has grown especially during the pandemic. The press service of the telecom operator adds that in government agencies, voice assistants are most often used to optimize the costs of routine processes: providing reference information, collecting data on metering devices, etc.

But there is also an opposite point of view: Oleg Kovpak, product director of ID R&D, does not yet see much interest from government agencies. “Despite the fact that such services would make it possible to automate the titanic volumes of requests from citizens, such implementations are still rare in Russia”, he explains.

Biometrics for working with digital assistants

The use of digital assistants is impossible without reliable protection systems. The industry is now exploring the strengths and weaknesses of one option for such protection: voice biometrics (identification and authentication of users by voice). In mid-April, the government’s intention to restart the collection of citizens’ biometric data, including voice samples, for the Unified Biometric System (UBS) became known. Experts see voice biometrics as a key to new business models for smart assistants, but they are cautious about the timing of widespread adoption. Security remains the central issue, and the prospects for interaction between business and the UBS are not yet clear.

The ability to personalize services and provide them only to authenticated users will enable businesses to scale up the use of voice assistants faster, and this is the area where voice biometrics will be widely applied.

Oleg Kovpak

Product Director of ID R&D

Arkady Sandler emphasizes that using voice biometrics in sensitive operations requires sufficient legal protection: either regulation, or a clear explanation to users that they are acting at their own risk.

Oleg Kovpak, Product Director of ID R&D, lists the requirements for accurate voice authentication: it should work on sufficiently short phrases, should not depend on the text of the phrase, and should be protected from possible attacks (for example, replaying a command recorded on a dictaphone or using a synthesized voice).

According to the expert, such technologies already exist. The UBS does not yet support such scenarios, although the legislative obstacles were removed at the end of last year, Oleg Kovpak says. In addition, some of these scenarios may be tied to voice processing on the device, rather than in the cloud. “I believe that the widespread use of biometrics depends not on the number of samples in the UBS or Sberbank database, but on the availability of services demanded by end customers,” the expert says. “The UBS and Sberbank have an excellent base for providing biometrics as a service to other companies, but it is not yet clear whether they will develop this potential”.

The introduction of the UBS will certainly contribute to the spread of voice biometrics, especially in the banking sector. A voice sample left with one bank can be used by other banks, which will simplify access to banking products.

Nikita Murenky

VUI Team Lead of KODE’s TORTU conversational product design and development team

Nikita Murenky believes that it is better to combine voice biometrics with more familiar authentication methods. He explains this by the fact that “the biometric accuracy of the voice is in a fairly wide range of 90-99%”. In addition, using voice is inconvenient in crowded and noisy places, especially when confidential data is involved, and a voice sample can be stolen, which telephone fraudsters are already doing.

Mail.ru Group told ICT.Moscow that it will consider integration with the UBS if it is useful for users, but the company is also focusing on developing its own technologies and solutions. Neuro.net co-founder Alexander Kuznetsov believes that the participation of the state and large players can accelerate the adoption of the technology, but expects that it will be actively used no earlier than next year.

The citywide contact center does not plan to introduce voice identification or voice payments in city services. “Within the framework of the city contact center, neither the applicants nor the legal and regulatory framework are ready for this yet”, explains Boris Mayatsky, a representative of the “Citywide contact center” product of the IT Department of Moscow Government. “Calling up the payment service by voice within the mobile app is, of course, a simple function, but the identification and acceptance of the payment will still be carried out using the usual methods”.

Voice biometrics is a very interesting and promising technology, but it is still not sufficiently developed. For example, it is not entirely clear what level of security it can provide. This is especially true given the growth of deepfake and voice synthesis technologies. We are studying and testing this technology, but there is still a long way to go before it can be fully implemented.

Stepan Mitaki

head of the “My Moscow” mobile application

Roman Doronin from EORA emphasizes that voice biometrics systems must be resistant to different types of attacks. “And this complexity lies not in the amount of data for training models, but in the logic of the security system and the mechanics of human validation. Attackers do not even use deepfakes now; they simply record phrases while they are talking to you and can then feed them to the model’s input”, he explains. Dmitry Dyrmovsky, CEO of the Speech Technology Center group of companies, sees prospects in combining voice and facial biometrics. In his opinion, this will be not only convenient but also safe.

Alexander Kuznetsov from Neuro.net, on the other hand, says that using the so-called “voice fingerprint” makes it possible to effectively combat fraud and spoofing (voice substitution or synthesis) and to build a database of fraudsters’ voices.

Voice identification is not only a path to new services, but also a way to improve existing ones. For example, Anatoliy Kulbatskiy, product director of Marusya at Mail.ru Group, believes that there are a number of scenarios in which it is important to determine whether a child or an adult is talking to an assistant in order to select the right set of content.

We know that a home device is often used by several family members and sometimes by guests. The introduction of biometrics should address the personalization of content. Users will be able to listen to their own music and return to the point in a game where they themselves, not a relative, left off.

Anatoliy Kulbatskiy

Marusya Product Director at Mail.ru Group

Biometrics will continue to develop and will help distinguish users when they access sensitive data such as payments, mail, and correspondence on social networks, adds Kulbatskiy. This is a normal evolutionary development of the assistant’s functionality. Dmitry Dyrmovsky, CEO of the Speech Technology Center, also speaks about the ability of smart assistants to distinguish family members and differentiate access rights, forming relevant offers. But he emphasizes that the main thing is to restrict financially significant transactions to a strictly defined circle of people.

Legal framework for smart assistants

Experts from one of the Russian IT companies, during a discussion about voice tech in Clubhouse in February, argued that domestic voice systems are in many ways more developed than foreign ones due to the limitations faced by developers in other countries. Experts with whom ICT.Moscow discussed this issue partly agree with this statement, although there is no complete unanimity on this point.

Today it is a little easier to work with personal data in Russia than in Europe, but we understand that regulation will be developed. Whether this is good or bad depends on how the regulation is introduced and on the extent to which real scenarios and the interests of all stakeholders, including business, are taken into account.

Kirill Petrov

Managing director of Just AI

Arkady Sandler notes that his colleagues in other countries do not feel constrained when they comply with very predictable laws. “Where there is no clear regulation (not necessarily prohibitive, by the way), there is freedom of interpretation, and the tradition of interpretations by law enforcement agencies in the Russian Federation, to put it mildly, is opportunistically motivated and prone to bias”, the expert adds.

In Europe, where the GDPR applies, and in the United States (California), the requirements are much stricter; we know this from our international projects, among other things. But even when working in Russia, we still have many questions about regulatory controls. For example, all companies that use our service must obtain their customers’ consent to advertising calls, and when they use our service in the cloud, they must also obtain consent to the transfer of their customers’ anonymized personal data to us. This applies even though we do not actually have access to this data; it is only stored temporarily in our cloud storage.

Alexander Kuznetsov

co-founder and COO of Neuro.net

In April, the European Commission prepared rules for regulating artificial intelligence systems. In particular, the rules classify chatbots as “moderate risk” and require that users be clearly informed that they are not interacting with a person. Remote biometric identification systems are classified as “high risk”, which imposes even more restrictions and requirements on them.

Oleg Kovpak from ID R&D is convinced that restrictions in Russia are rather tight, especially for biometric personal data, and that the latest changes signed by the president at the end of last year tighten them even further.

In my opinion, this does not in any way improve the ability of businesses to use biometrics, especially in already regulated sectors such as banking. The cost of ensuring information security when working with biometrics puts such projects on the brink of economic feasibility, and in some cases makes them simply technically impracticable, for example, because cryptographic protection of the required class is not available on the target platforms or devices. If the regulatory trend continues, the fears are not unfounded: things may become more difficult for both the commercial biometrics industry and the state sector represented by the UBS, and this will ultimately hit ordinary users.

Oleg Kovpak

Product Director of ID R&D

A representative of MTS speaks about the need to refine the existing regulations. The company considers it important “to make targeted adjustments to the legislation on personal data so that companies have the opportunity to process pre-anonymised data, including data accumulated by the state, in a manner regulated by law”, and to “simplify, at the legislative level, the procedure for converting personal data into depersonalized information and allow the use of such information”.

The successful development of the market for smart assistants based on AI technologies requires an increase in the amount of available high-quality data and the creation of a supportive environment for its use.

Alexey Merkutov

press secretary of the MTS Group

Igor Kalinin from TWIN takes the opposite view. He believes that bots in Russia are still minimally restricted by regulators, which gives developers more freedom. But the lack of legislation also indicates a lack of recognition. In his opinion, voice technologies do not yet seem to be a priority area for the government. Moreover, building cooperation with state-owned companies requires overcoming many restrictions. At the same time, he notes that the Ministry of Digital Development intends to provide public services in a dialogue mode with a smart assistant, and, according to the expert, this plan can be implemented in the next few years.
