DROTR VOICE (under development)

Before starting development of DROTR VOICE, we thoroughly analyzed all the speech recognition services currently available worldwide and found the following limitations:


- Google (no HTTP interface for recognizing a continuous voice stream is available, so the user has to press the speech recognition activation button before every phrase / the original speech cannot be conveyed to the other party because Google takes over the microphone / it cannot be used in our application on the iOS platform);

- Nuance (the business model is not suited to everyday use of the service: the user tariff package is limited to a mere 40 transactions per day, which is unacceptable to our clients, or else requires a per-transaction fee, the option we currently have to use but which is very expensive: 0.7 cents per transaction plus a 30% market commission, bringing our cost to as much as 1 cent per transaction / no trial period / a long bureaucratic application moderation procedure (instead of the stated 5-7 days for version moderation, a response takes about 25 days on average) / poor recognition quality and a poor VAD (voice activity detection) algorithm in the proposed SDK; noise suppression is virtually unavailable / as a temporary solution, in DROTR-Local we integrated our own algorithms, which improved the recognition quality, and introduced our own improved VAD / it cannot be used for clients with Android or Windows devices because of the high price of the service);

- iOS 7 (the recognizer launch button cannot be reassigned, so the recognizer can be activated only by a button on the keyboard / no HTTP interface is available / too few languages are recognized).
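
The per-transaction figure quoted for Nuance can be reproduced with a short calculation. The sketch below assumes that the 30% market commission is deducted from the price the user pays, so the price has to be raised until the remaining 70% covers the 0.7-cent fee. This reading of the commission is an assumption, and the function and constant names are illustrative only.

```python
from fractions import Fraction


def gross_price_cents(fee_cents: Fraction, commission: Fraction) -> Fraction:
    """Price that still leaves `fee_cents` after the market keeps `commission` of it."""
    return fee_cents / (1 - commission)


NUANCE_FEE_CENTS = Fraction(7, 10)   # 0.7 cents per transaction (figure from the text)
MARKET_COMMISSION = Fraction(3, 10)  # 30% market commission (figure from the text)

price = gross_price_cents(NUANCE_FEE_CENTS, MARKET_COMMISSION)
print(float(price))  # 1.0 (cents per transaction)
```

Exact fractions are used instead of floats so that the result comes out as exactly 1 cent rather than a rounded approximation.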

Implementing our own speech recognition technologies will allow us to provide the following:

- Users on all platforms will no longer have to press a button before each phrase. The principle will be «call and speak»;

- The original speech will be conveyed (even if the other person's language is unknown, it is important to hear the emotions, intonation and pauses);

- Convenient tariff packages offering either minutes of communication with translation or a fixed subscription for a set period of time;

- Trial period;

- A B2B service as a channel for promotion and revenue generation (Nuance's annual income from sales of speech recognition services amounts to USD 1.2 bln, and it is effectively the global monopolist in HTTP-interface recognition services for many languages);

- Independence of the service from Google and Nuance;

- The recognition feature may be offered free of charge (or at a large discount) in the DROTR application, funded by sales of B2B services to other developers and manufacturers, which would make it possible to build a large client base in a short period of time.
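
The «call and speak» principle depends on detecting automatically where each phrase starts and ends, which is the job of the VAD mentioned above. The following is a minimal energy-threshold sketch of that idea, not the DROTR algorithm: the frame length, threshold and hangover values, and all function names, are illustrative assumptions.

```python
import math
from typing import List, Sequence, Tuple


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_phrases(samples: Sequence[float], frame_len: int = 160,
                   threshold: float = 0.1, hangover: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of spans judged to be speech.

    A phrase opens at the first frame whose RMS exceeds `threshold` and closes
    once more than `hangover` consecutive quiet frames follow it, so short
    pauses inside a phrase do not split it.
    """
    phrases = []
    start = None        # first sample of the phrase in progress, if any
    last_loud_end = 0   # end index of the most recent loud frame
    quiet = 0           # consecutive quiet frames since the last loud one
    for pos in range(0, len(samples) - frame_len + 1, frame_len):
        loud = frame_rms(samples[pos:pos + frame_len]) > threshold
        if loud:
            if start is None:
                start = pos
            last_loud_end = pos + frame_len
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet > hangover:
                phrases.append((start, last_loud_end))
                start = None
    if start is not None:
        phrases.append((start, last_loud_end))
    return phrases


# Synthetic input: silence, a 4-frame "phrase", a long pause, a 2-frame "phrase".
silence = [0.0] * 160
tone = [0.5, -0.5] * 80  # square wave with an RMS of exactly 0.5
signal = silence * 3 + tone * 4 + silence * 5 + tone * 2 + silence * 1

print(detect_phrases(signal))  # [(480, 1120), (1920, 2240)]
```

A detector of this kind is what replaces the button press: audio is captured continuously, and only the detected spans are passed on to recognition. A production VAD would also need to cope with background noise, which a fixed energy threshold does not.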

© Pavel Naumenko / Library and Information Science Resources 2014