This morning Google Drive surprised me: it offered to let me use voice for some basic commands instead of selecting them or, in my case, using a shortcut.
A few months ago I built an experiment that combined the shiny SoHo Interface with a few open-source JavaScript libraries for voice and gesture control that work well.
I knew that some companies were experimenting with this, but maybe because I was too busy with other projects and day-to-day routines, I hadn’t realized that its time had come.
I am sure that Google’s experiment, which for now seems useless from a user’s point of view, will evolve into something more usable that can save the end user a lot of time.
Pros:
- It’s fun – you can shout commands to your website and it will respond with an action.
- Sometimes you can do something useful – like control your HTML5 game or even log in to your favorite website.
- Brings apps to people who can’t write (yet), but can talk – this is something huge.
- Widens the horizons of developers and companies – think of it as one more usability and user experience layer.
- It is super exciting and it is evolving well.
Cons:
- There are some technological ones, but I don’t want to be a hater this time :) Yay!
- The other one is: what happens with all of the data collected by the mic? Some devices are known for listening all the time for our precious voices. Should we start ripping the batteries out of our laptops and tablets like we do with our mobile phones?
How to get started?
See my demo here – there is a video of a voice- and gesture-controlled UI. This is what a modern app should look like – you can use your voice and also listen to the spoken answer sent back to you, and if you feel like moving things around, you can use your webcam to do it.
More links:
- I am using Annyang for the voice commands
- Gest.JS for the gestures
- and this JS library to interact with the Google TTS engine
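To give an idea of how little code the voice part needs, here is a minimal sketch of wiring phrases to actions with Annyang. It is not the code from my demo: the phrases, the `actionLog` array and the action names are made up for illustration, and it assumes Annyang was loaded with a script tag so that it is available as a global. Only `addCommands()` and `start()` are Annyang calls.

```typescript
// A voice command is a callback; annyang passes captured words as arguments.
type Command = (...args: string[]) => void;

// Hypothetical stand-in for real UI actions (a real page would touch the DOM).
const actionLog: string[] = [];

// Phrase-to-callback map in the shape annyang's addCommands() accepts.
// "*term" is annyang's splat syntax: it captures the rest of the utterance.
const commands: Record<string, Command> = {
  "open menu": () => { actionLog.push("menu:open"); },
  "close menu": () => { actionLog.push("menu:close"); },
  "search for *term": (term: string) => { actionLog.push(`search:${term}`); },
};

// Assumption: annyang is a global. It is null in browsers without
// speech recognition, and undefined outside the browser, so guard it.
const voice = (globalThis as any).annyang;
if (voice) {
  voice.addCommands(commands); // register the phrases
  voice.start();               // ask for mic access and start listening
}
```

Because the commands are plain functions, you can also call them directly (for example from a button click or a test), which keeps the voice layer optional on devices without a microphone.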
What is the future?
Bright – pretty soon we’ll be seeing more and more startups combining voice with the millions of APIs that exist to build even interfaceless applications. These will work well alongside today’s apps at the beginning and will then replace most of the ones we use these days.
What do you think?