Friday, May 16, 2008
Mac OS X Speech Synthesis
Since the introduction of the Macintosh in 1984, the Mac OS has had the ability to convert text into speech. Even eight-bit computers like the Commodore 64 had SAM, an early voice synthesizer, but as I bemoaned several months ago, there has been relatively little progress in speech recognition and synthesis in the intervening decades. For the more than 45 million Americans with literacy problems, this is especially important. Despite the lack of exceptional progress, OS X does offer text-to-speech options that may interest users regardless of their literacy level.
Here are some uses for speech synthesis that you may not have thought of. Anyone who writes, even if it's only an occasional professional email, can benefit from text-to-speech. While spell checkers are great at finding egregious errors, more subtle problems are harder to spot. Writers often inadvertently use the wrong word or leave stray words in their text. For example, how often have you accidentally typed "you" in place of "your"? One easy way to catch these problems is to listen to someone read what you wrote, and OS X can do that for you.
Like the Dictionary application, speech synthesis is integrated throughout the modern Mac operating system. Any highlighted text, whether in a web browser or an e-mail message, can be read aloud by the computer. In many applications, such as word processors, just bring up the context menu by right-clicking or control-clicking, choose "Speech", and then "Start speaking". If the option is not in the context menu, it is still available in the Services menu: click the name of the application in the menu bar and go to "Services/Speech/Start speaking". It is also possible to assign a shortcut key for this feature. Go to System Preferences, open the Speech preference pane, check "Speak selected text when the key is pressed" in the "Text to speech" tab, and then push the "Set key" button. Now just highlight text in any application, and your computer will read it to you at the touch of a button.
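If you prefer scripting to clicking through menus, the same speech engine is also reachable from the command line through the say utility. Here is a minimal sketch in Python, calling say via subprocess, that reads a draft aloud so you can proofread it by ear; the file name and voice below are placeholders for whatever you have on hand.
#!/usr/bin/env python
# Read a text file aloud with OS X's built-in "say" command.
# A minimal sketch: "draft.txt" and the voice name are placeholders,
# and it assumes the standard say utility that ships with OS X.
import subprocess
import sys

def speak_file(path, voice=None):
    # Build the command; "-f" tells say to read from a file.
    cmd = ["say", "-f", path]
    if voice:
        cmd[1:1] = ["-v", voice]  # e.g. "Alex" on Leopard
    subprocess.call(cmd)

if __name__ == "__main__":
    # Proofread a draft by ear: pass the file name on the command line.
    speak_file(sys.argv[1] if len(sys.argv) > 1 else "draft.txt")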
Another speech feature can be useful to many people. When working on the computer, it's easy to lose track of time; sometimes hours go by before I realize it. To avoid this, OS X can announce the time for you. The option lives in the Date & Time settings, which can be reached in several ways: there is a button in the aforementioned "Text to speech" pane, you can click the time in the menu bar and choose "Open Date & Time...", or you can open Date & Time from the main System Preferences window. Once there, simply click "Announce the time" in the Clock tab, choose how often, and click "Customized voice" if you wish to set specific voice options.
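For those who would rather schedule their own announcements (from a cron job, for instance, or at the end of a long script), here is a rough Python stand-in for the same idea. It is not how the built-in announcer works internally; it simply formats the current clock time and hands it to say.
#!/usr/bin/env python
# Speak the current time aloud; a hedged stand-in for the built-in
# time announcement, not the mechanism OS X itself uses.
import subprocess
import time

def announce_time():
    now = time.strftime("%I:%M %p").lstrip("0")  # e.g. "3:05 PM"
    subprocess.call(["say", "It's " + now])

if __name__ == "__main__":
    announce_time()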
Some users like me, who keep their Dock hidden, may not always notice applications bouncing their icons in the Dock when they need attention. This can be addressed by having OS X speak to you when a program needs attention. This option is also in the "Text to speech" tab of the Speech System Preferences. Just check "Announce when an application requires your attention". The computer is even very polite, saying, "Excuse me. Application X needs your attention."
What if you are dissatisfied with the standard computer voices? Without doing an exhaustive search, I found two companies that offer commercial voice packs for OS X, both with fairly realistic voices. You can hear many samples or download demos at the InfoVox and Cepstral web sites. Unfortunately, they're rather pricey: the InfoVox voices are $100 for the American English pack, whereas Cepstral voices are sold individually for $29 each.
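A quick way to see which voices are actually installed, including any commercial packs you add, is to ask say itself. The sketch below assumes say -v ? lists one voice per line, which is how it behaves on the systems I have tried.
#!/usr/bin/env python
# List the voices installed on this Mac by asking "say -v ?".
# Assumes one voice per line of output; multi-word voice names
# (like "Bad News") would need smarter parsing than split() provides.
import subprocess

def installed_voices():
    output = subprocess.Popen(["say", "-v", "?"],
                              stdout=subprocess.PIPE).communicate()[0]
    text = output.decode("utf-8")
    return [line.split()[0] for line in text.splitlines() if line.strip()]

if __name__ == "__main__":
    for voice in installed_voices():
        print(voice)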
While it would be hard to say that speech synthesis has come a long way on the Mac, the combination of universally integrated speech options and high-quality commercial voices is compelling. For those who prefer to have text read to them, or who just want spoken system alerts, text-to-speech can be a useful and important component of the operating system.
For more great information on the Services menu, see this web site.
Labels: assistive technology, OS X, user interfaces
Sunday, April 13, 2008
Why-Mac Part One: Window Management
Why-Mac will be a series of articles explaining in detail how I have found Mac OS X to be the best in usability, productivity, and aesthetics. Much has been written about switching to the Mac or intricately tweaking OS X, but most of that information is either very basic or too technical. These articles will span the middle ground. For readers who are familiar with computers and MS Windows, recent switchers, or those considering a Mac, they will present details about how Macs are different and how those differences can make you more productive. Hopefully even longtime Mac users will find some tips and tricks and come to understand their computers better.
First, a bit of background on what qualifies me to write these articles. I started using personal computers at the age of 11 on a Texas Instruments TI-99/4A. My parents wouldn't buy any game cartridges for it, so my brother and I learned to program in BASIC. Later, I became a fan of Atari computers. The Atari ST used the GEM interface, a knock-off of the Macintosh OS, but it offered more "Power Without the Price". In high school, the local newspaper published a letter to the editor in which I argued against the purchase of Macs for our school (infuriating our computer teacher). After high school, I worked at a couple of PC clone stores, selling, building, and repairing computers, and learned the workings of DOS and Windows. Microsoft's promises for each revision of Windows would excite and then disappoint me. In 1995, I became an internet programmer and later learned Java. My experience with Macs began shortly after OS X was released. Having tinkered with Linux off and on for years, I was drawn to the stability of Unix coupled with a nice user interface. I got my first Mac in 2001, spent a couple of months learning OS 9.2 in order to understand some history, then plunged into OS X and never looked back. While I don't like to consider myself a "fanboy", as my friend said on the matter, "There is no fervor like that of the converted." Without further ado, here then is part one of Why-Mac.
One of the primary differences between Windows and OS X that is often overlooked is the basic way applications are run and their windows are handled. The Unix world uses the concept of a window manager, a program that decides how to arrange and display the individual windows of running applications. Though MS Windows and OS X lack a true window manager program, for ease of discussion I will use the terminology anyway.
The OS X window manager offers many usability and productivity advantages over Windows. As almost anyone who has used both a PC and a Mac knows, the running application in OS X displays its menu options (File, Edit, et cetera) at the very top of the screen. Windows, on the other hand, puts these options within the window of the program. Ergonomics experts talk about Fitts's Law, which models the time it takes to move a pointer to a target based on the target's distance and size. It has been shown that placing common options along a screen edge makes them easier and faster to hit, because the cursor stops at the edge and cannot overshoot the target.
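To make the Fitts's Law point concrete, here is a small Python sketch of the formula T = a + b * log2(1 + D/W). The constants a and b are made up for illustration (real values are measured for a particular pointing device and user), and the pixel figures are ballpark numbers; the point is simply that a menu pinned to the screen edge, where the cursor cannot overshoot, behaves like a much deeper target and so takes less time to hit.
#!/usr/bin/env python
# Fitts's Law: T = a + b * log2(1 + D / W)
# The constants a and b below are illustrative only; real values come
# from measuring a given pointing device and user.
from math import log

def movement_time(distance, width, a=0.1, b=0.15):
    # Predicted time in seconds to hit a target of size "width"
    # that is "distance" away.
    return a + b * log(1 + distance / float(width), 2)

if __name__ == "__main__":
    # A 20-pixel-tall menu bar inside a window, 400 pixels away.
    in_window = movement_time(400, 20)
    # The same menu at the screen edge: the cursor stops there, so the
    # target is effectively far deeper -- model it as 200 pixels.
    at_edge = movement_time(400, 200)
    print("in-window: %.2f s, at edge: %.2f s" % (in_window, at_edge))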
The differentiation between windows and applications provides still more benefits. Pressing Command-W on a Mac will consistently close only the current document window, while Command-Q will quit the entire application and close all of its documents. In MS Windows, it tends to be a crap shoot whether Alt-F4 (the shortcut for closing a window) will exit just that document or the entire program. In addition, only OS X lets a program keep running with no open documents. At first this seems nonsensical and confusing: if you close all of a program's documents, it remains running with its menu bar at the top of the screen but nothing below. An obvious use for this is loading a program like Photoshop and leaving it running even when no images are currently being edited. Photoshop has many plug-ins and takes a long time to load, so being able to leave it open this way is a real productivity boost.
Leopard, the newest OS X release, also gives the window manager the option of placing programs on various virtual desktops. This feature, called Spaces, provides a simple way to segregate your work into separate domains, a further option that eliminates the clutter of running many applications and makes accessing information faster and easier.
The final area of window management in which OS X excels is maximizing windows. In the Microsoft world, maximizing a window means making it fill the entire screen regardless of how much information it actually presents. In most OS X applications, documents are smart enough to resize only as much as needed. For example, when zooming in and out on images in Photoshop, a maximized image window will grow just enough to fit the image on screen as long as there is available real estate, rather than covering additional space with a blank window.
This concludes part one of my Why-Mac series. Understanding window management is key to maximizing productive computer use. Mac OS X facilitates efficiency by providing the aforementioned means of organizing, viewing, and switching between applications. The rest of this series will look at more ways Macs enable a more pleasant and productive computing experience.
Labels: Apple, OS X, user interfaces
Thursday, January 24, 2008
Before Touch Screens, Multitouch Mice
As much as I would like there to be a sub-$1000, tablet-like, touch screen Mac, the economics of it just don't work yet. The company Axiotron previewed its Modbook over a year ago and has only just (supposedly) started shipping them. Still, the price of $2,300-2,500 is prohibitive. A Wacom 12.1" touch LCD runs a grand and weighs over four pounds. Unfortunately, I don't think there is enough magic at Apple Labs to deliver the product I crave; however, an intermediate step may be entirely plausible and could ship soon. Imagine grafting together a slightly rounder, flatter Mighty Mouse, a MacBook Air trackpad, and the guts of a Wii controller.
The ideal device that I envision is admittedly a bit ambitious and futuristic, but there are variations on the theme that keep it more practical. First, imagine an iMac G3 "puck mouse" (shudder) without the cord or button. Overlay on this surface the multi-touch, gesture-sensitive trackpad that debuted recently on the Air. For just moving the cursor around, it is much more convenient to have something physically moving than to try to rub a trackpad just the right way; that is where the mouse nature comes into play. Because of its roundness, it would be convenient if the mouse were inertially sensitive rather than relying on optical movement over a surface; that is where the Wii-like internals would be used. Orientation wouldn't affect the direction of cursor movement: you could move the mouse around without worrying about which way it is facing, avoiding the annoying way the puck mouse would turn in your hand. Eventually, this could lead to hand-held devices being moved in 3D space, though at that point gestures would have to be handled differently.
For the current iteration, however, the surface of the mouse would register taps (mimicking the behavior of standard mouse buttons) but would also allow the use of iPhone gestures: swiping side to side, pinching and expanding, or rotating. Since these gestures are based more on what is currently selected than on the mouse position, it makes sense for that sensitivity to be layered on top of the means of moving the cursor rather than coupled with it.
If an inertially sensitive, orientation-independent version is too ambitious for now, it would be equally plausible to base the design on a slightly flattened Mighty Mouse rather than the puck mouse. This would maintain standard mouse directionality, and the device could be corded or wireless. It would also eliminate the need for the hardware and software to handle Wii-like position sensing. The basic idea of overlaying the gesture sensitivity would be the same.
It may look a little clunky, but the multitouch mouse would provide a new level of interactivity to the Mac interface. It would also leverage the work done on the iPhone and Touch interface and get users used to the "standard" Apple gestures. Until we can get fully touch sensitive notebook or tablet screens, the multitouch mouse would be a welcome step forward.
Labels: future, user interfaces
Wednesday, January 9, 2008
Hello, Computer?? Apple can you hear me?
There is a classic (among geeks at least) scene in Star Trek IV where the crew of the Enterprise has traveled back in time to the late 20th century. Chief Engineer Scott sits down in front of a Mac and says, "Computer. Computer?" Getting no response, Dr. McCoy helpfully hands Scott the mouse, which he holds like a microphone: "Hello, Computer??" The befuddled 1980s Earthling standing by finally tells him to just use the keyboard ("How quaint"). [Watch on YouTube] While the Mac was not able to respond to voice input, it did bring the mouse-driven graphical user interface revolution to the masses.

Saturday, January 5, 2008
Nintendo Wii: Good but Not Too Good
Sales reports have consistently shown the Nintendo Wii to be leading the pack among current-generation game consoles. Back when the Wii was just the conceptual "Revolution", I predicted and hoped that it would indeed revolutionize gaming with its user interface innovations. The Wii has been successful because it is good but not too good.
The most obvious interpretation of this statement involves the price/performance trade-offs that console manufacturers face. While Sony and Microsoft chose to keep escalating the technical specifications of their hardware, Nintendo took a middle-ground approach. The Wii tops out at 480p rather than true high-definition resolutions. It has a DVD drive but doesn't play movies (let alone Blu-ray or HD DVD). In every respect the system has less power than the competition, but by choosing lower hardware requirements, Nintendo was able to deliver a smaller, more affordable console.
A less apparent application of "good but not too good" is an aspect of human nature that I believe will foretell near-term advances in virtual reality (VR). In the commentary for one of the early CGI movies (it may have been Shrek, but I don't recall), the animators talk about a phenomenon whereby people started to dislike the characters as they became too close to real. It seems the human mind is happy to place itself in a state of suspended disbelief when what it is experiencing is clearly unbelievable. We don't watch a Road Runner cartoon and complain that there is no way the coyote could survive that fall. The problem for movie makers arose as animated characters started approaching reality. At that point people would look at them and know that something was "not right" but not necessarily be able to put their finger on it. The computer graphics had passed the threshold of being obviously fake but had not yet reached the point of being believable. They were too good for their own good and actually had to be made less realistic.
The same logic applies to virtual reality and the Wii. Nobody would claim that waving around a remote control truly gives you the same experience as swinging a tennis racket at a ball or slicing a goblin with a sword. Yesterday I was reading about haptic interfaces. The Webopedia article states, "For example, in a virtual reality environment, a user can pick up a virtual tennis ball using a data glove. The computer senses the movement and moves the virtual ball on the display. However, because of the nature of a haptic interface, the user will feel the tennis ball in his hand through tactile sensations that the computer sends through the data glove, mimicking the feel of the tennis ball in the user's hand." This is certainly far beyond what the Wii's controller offers. Will this be the next generation of gaming? I don't think so, and the reason is that it defies the good-but-not-too-good philosophy. When games start to mimic tactile sensations, they butt up against the "close to reality but just not right" barrier. I'm sure such a device would be interesting to try, but in order to lose ourselves in the experience of a game, just as with a movie, we either need to be in a clearly non-real environment or so totally immersed that it is difficult to distinguish what is and is not real.
It's been over a decade since Pixar introduced us to full-length CGI animation with Toy Story, and movies are only now approaching the use of fully realistic human characters. While VR has also been in development for decades, the Wii gaming console is easily the largest real-world application of virtual reality concepts. Before the next level of immersive VR is achieved, the industry will have to overcome the problem of being too close to reality without being close enough. In my opinion this will likely take the next ten years. In the meantime there is plenty of opportunity, using the current technology, for unbelievable games to be incredibly fun.
Labels: future, Nintendo Wii, user interfaces, virtual reality