Showing posts with label OS X. Show all posts

Friday, May 16, 2008

Mac OS X Speech Synthesis

Since the introduction of the Macintosh in 1984 Mac OS has had the ability to convert text into speech. Even eight-bit computers like the Commodore 64 had SAM, an early voice synthesizer, but as I bemoaned several months ago, there has been relatively little progress in speech recognition and synthesis in the intervening decades. For the more than 45 million Americans with literacy problems this is especially important. Despite the lack of exceptional progress, OS X does offer options for text-to-speech that may be of interest to users regardless of their literacy level.

Here are some uses for speech synthesis that you may not have thought of. Anyone who writes, even if it's only an occasional professional email, can benefit from text-to-speech. While spell checkers are great for finding egregious errors, more subtle problems are harder to spot. Often writers inadvertently use the wrong word or add extra words to their text. For example, how often have you seen "you" accidentally used in place of "your"? One easy way to find these problems is to listen to someone read what you wrote. OS X can do that for you.

Like the Dictionary application, speech synthesis has been integrated into the modern Mac operating system. Any highlighted text, whether it be in a web browser or an e-mail, can be read aloud by the computer. In many applications, such as word processors, the user just needs to bring up the context menu by right-clicking or Control-clicking and choose the "Speech" option, then "Start speaking". If the option is not in the context menu, it is still available in the Services menu. Click on the name of the application in the menu bar and then go to "Services/Speech/Start speaking". It is also possible to create a shortcut key for this option. Simply go to System Preferences and open the Speech preference pane. In the "Text to speech" tab, check "Speak selected text when the key is pressed" and then push the "Set key" button. Now just highlight text in any application, and your computer will read it to you at the touch of a button.
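For readers comfortable with a little scripting, the same speech engine can also be reached from the command line through the `say` tool that ships with OS X. The following is only a minimal sketch, assuming Python is installed; the function names `build_say_command` and `speak` are made up for illustration, and the voice name passed in must be one that is actually installed on your Mac.

```python
import shutil
import subprocess


def build_say_command(text, voice=None):
    """Return the argument list for the OS X `say` tool (hypothetical helper)."""
    command = ["say"]
    if voice:
        # -v selects one of the voices installed on the system.
        command += ["-v", voice]
    command.append(text)
    return command


def speak(text, voice=None):
    """Speak the text aloud if `say` is available; return True when it ran."""
    if shutil.which("say") is None:
        # Not on a Mac (or `say` is missing), so do nothing.
        return False
    subprocess.run(build_say_command(text, voice), check=True)
    return True
```

Calling `speak("Read this sentence back to me")` on a Mac should read the sentence in the default system voice; on other systems it simply returns `False`.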

Another speech feature can be useful to many people. When working on the computer it's easy to lose track of time. Sometimes hours go by before I realize it. To avoid this, OS X can announce the time for you. The option is available in the Date and Time settings. These can be accessed in several ways. There is a button in the aforementioned "Text to speech" pane, or you may click on the time in the menu bar and choose the "Open Date & Time..." option. Date and Time is also a choice from the main System Preferences menu. Once there, simply click "Announce the time", in the Clock tab, choose how often, and click "Customized voice" if you wish to set specific voice options.

Some users like me, who keep their Dock hidden, may not always notice applications bouncing their icons in the Dock when they need attention. This can be addressed by having OS X speak to you when a program needs attention. This option is also in the "Text to speech" tab of the Speech System Preferences. Just check "Announce when an application requires your attention". The computer is even very polite, saying, "Excuse me. Application X needs your attention."

What if you are dissatisfied with the standard computer voices? Without doing an exhaustive search I found two companies that offer commercial voice packs for OS X. Both have fairly realistic voices. You can hear many samples or download demos at the InfoVox and Cepstral web sites. Unfortunately, they're rather pricey. The InfoVox voices are $100 for the American English pack, whereas Cepstral voices are sold individually for $29 each.

While it would be hard to say that speech synthesis has come a long way on the Mac, the availability of universally integrated speech options and high-quality commercial voices does make a compelling combination. For those who prefer to have text read to them or just simple system alerts, text-to-speech can be a useful and important component of the operating system.

For more great information on the Services menu, see this web site.

Thursday, May 15, 2008

Universal Access Options for Everyone


Like most operating systems of the last decade or so, Mac OS X contains accessibility options for people with disabilities. What users may not realize is that some of these options are extremely useful to anyone.

OS X puts these functions in the Universal Access pane of System Preferences. This article will concentrate on a few options in the Seeing and Keyboard tabs.

The first item that is certainly of use is the zoom feature. Pressing Command-Option-8 zooms in on the screen around the mouse cursor. By default graphics are smoothed after the zooming takes place, so images that would otherwise appear pixelated still look decent. I often use this feature when watching low-quality web video. Rather than putting up with a tiny postage-stamp-sized video, I simply press the shortcut key and watch it much closer to full screen. Once in zoom mode, the magnification can be adjusted by pressing Command-Option-minus or Command-Option-equals.


The Keyboard tab has features designed for people who have difficulty typing. However, one of the options in Sticky Keys is very useful for people creating screencasts. With Sticky Keys turned on and the option "Display pressed keys on screen" checked, the symbols for the modifier keys (Command, Option, Control, and Shift) appear on the screen when they're pressed. In tutorial situations and with new users, this provides a useful visual cue to go along with the name of the key being used.


Finally, the option "Enable access for assistive devices" appears at the bottom of the Universal Access pane all the time. This choice needs to be selected in order for tools like text expanders to work. It allows applications to access the keyboard buffer as you are typing.

For people with no challenges using a computer the Universal Access pane may be the last place they would look to add useful functionality to OS X. As you can see, there are some options, however, that can improve the computing experience for anyone. Hopefully people will be inspired to explore further.

Sunday, April 13, 2008

Why-Mac Part One: Window Management

Apple stock compared to the Nasdaq and Dow Jones.

Until recently, there were no real contenders to Microsoft's OS monopoly. Since the release of OS X and the iPod, however, Apple has steadily begun to challenge that dominance. Apple has over 19 billion dollars in cash stashed away. Their stock price, despite recent declines due to economic fears, has increased over 350% since 2005. Studies have shown 40% of incoming freshmen at some universities using Macs, and Apple has garnered a 25% market share by revenue for laptops sold by all manufacturers for February 2008.

Why-Mac will be a series of articles explaining in detail how I have found Mac OS X to be the best in usability, productivity, and aesthetics. Much has been written about switching to Mac or intricately tweaking OS X, but most of this information is either very basic or too technical. These articles will span the middle ground. For readers who are familiar with computer usage and MS Windows, whether recent switchers or those considering a Mac, they will present details about how Macs are different and how those differences can make you more productive. Hopefully even longtime Mac users will find some tips and tricks and come to understand their computers better.

First, a bit of background on what qualifies me to be writing these articles. I started using personal computers at the age of 11 on a Texas Instruments TI-99/4A. My parents wouldn't buy any game cartridges for it, so my brother and I learned to program in BASIC. Later, I became a fan of Atari computers. The Atari ST used the GEM interface, which was a knock-off of the Macintosh OS, but it offered more "Power Without the Price". In high school, the local newspaper published a letter to the editor in which I argued against the purchase of Macs for our school (infuriating our computer teacher). After high school, I worked at a couple of PC clone stores, selling, building, and repairing computers. I learned the workings of DOS and Windows. The promises of Microsoft for each revision of Windows would excite and then disappoint me. In 1995, I became an internet programmer and later learned Java. My experience with Macs began shortly after OS X was released. I had tinkered with Linux off and on for years, so the stability of Unix coupled with a nice user interface appealed to me. I got my first Mac in 2001, spent a couple of months learning OS 9.2 in order to understand some history, then plunged into OS X and never looked back. While I don't like to consider myself a "fanboy", as my friend said on the matter, "There is no fervor like that of the converted." Without further ado, here then is part one of Why-Mac.

One of the primary differences between Windows and OS X that is often overlooked is the basic way applications are run and windows handled. The Unix world uses the concept of a window manager. It decides how to arrange and display the individual windows of running applications. Though MS Windows and OS X lack a true window manager program, for ease of discussion I will nonetheless use this terminology.

The OS X window manager offers many usability and productivity advantages over Windows. As most anyone who has used a PC and a Mac knows, the running application in OS X displays its menu options, File, Edit, et cetera, at the very top of the screen. Windows, on the other hand, puts these options within the window of the program. Ergonomics experts talk about Fitts's Law, which predicts how long it takes to reach a target, for example with a mouse pointer, from the target's distance and size. Because the pointer stops at the edge of the screen, a menu on that border behaves like a much larger target, and it has been shown that this makes these common options easier and faster to access.
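To make the Fitts's Law argument a little more concrete, here is a rough sketch using the common Shannon form of the law, MT = a + b * log2(D/W + 1), where D is the distance to the target and W is its size along the direction of travel. The constants `a` and `b` below are placeholder values chosen purely for illustration (real ones are measured per device and user), and the pixel figures are assumptions, not measurements.

```python
import math


def fitts_time(distance, width, a=0.1, b=0.15):
    """Estimate pointing time in seconds with the Shannon form of Fitts's Law.

    a and b are illustrative placeholders, not measured constants.
    """
    index_of_difficulty = math.log2(distance / width + 1)
    return a + b * index_of_difficulty


# A menu inside a window: a 20-pixel-tall strip the pointer can overshoot.
in_window = fitts_time(distance=300, width=20)

# A menu at the screen edge: the pointer cannot overshoot, so the target
# acts as if it were much deeper (200 pixels is an assumed stand-in).
at_edge = fitts_time(distance=300, width=200)
```

With these assumed numbers `at_edge` comes out smaller than `in_window`, which is the whole point: the same travel distance costs less time when the target is effectively bigger.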
Safari windows revealed by Exposé.
The next OS X feature that is often overlooked is how multiple documents within one program are handled. Unlike Windows, Mac OS distinguishes between an application and its separate documents. This enables several advantageous usage scenarios. Take the Safari web browser, for example. If several separate windows are open, you can switch between them quickly by pressing Command and ~, the tilde key (i.e. Apple-~). To view the open windows graphically, press F10 to activate what Apple calls Exposé, which also gives the ability to click on a desired document. If you want to switch to a different program altogether, say going to iTunes to change playlists, pressing and holding Command-Tab shows the current apps. Sensibly, they are shown only once, not once for each open document. Similarly, the Dock shows running applications, not their individual windows.
Command-Tab reveals running applications.
There is even more granularity available, though. Minimizing a document by pressing the yellow minus sign removes it from this internal list, so it no longer appears in Exposé or when switching with Command-~. This is useful, for example, when there is a website I want to read but not right at the moment. A tiny screenshot of the minimized window appears in the Dock, complete with the icon from its parent application to make distinguishing it easier.
Safari windows minimized in the Dock.
OS X has also retained the Macintosh feature of hiding an application. Pressing Command-H makes a program hide. Its minimized windows are removed from the Dock (though the program's icon remains), and Exposé no longer shows any of its documents. The program can be unhidden by selecting it with Command-Tab or clicking on the Dock icon.

The differentiation between windows and applications provides still more benefits. Pressing Command-W on a Mac will consistently close only the current document window. Pressing Command-Q will quit the entire application and close all of its documents. In MS Windows it tends to be a crapshoot whether Alt-F4 (the shortcut for closing a window) will exit just that document or the entire program. In addition, an option available only in OS X is running a program with no open documents. At first this seems nonsensical and confusing. If you close all of a program's documents, it remains running with its menu bar at the top of the screen but nothing below. An obvious use for this functionality is loading a program like Photoshop and leaving it running even when no images are currently being edited. Photoshop has many plug-ins and takes a long time to load. Being able to leave it open in this way is a real productivity boost.

The window manager in Leopard, the newest version of OS X, also gives the option of placing programs on various virtual desktops. This feature is called Spaces. It provides a simple way to segregate your work into separate domains, a further option that eliminates the clutter of running many applications and makes accessing information faster and easier.

The final area of window management in which OS X excels is maximizing windows. In the Microsoft world, maximizing a window means making it take up the entire screen regardless of how much information it actually presents. In most OS X applications the documents are smart enough to resize only as much as needed. For example, when zooming in and out on images in Photoshop, a maximized image window will fit the size of the image on screen as long as there is available real estate and not cover additional space with a blank window.

This concludes part one of my Why-Mac series. Understanding window management is key to maximizing productive computer use. Mac OS X facilitates efficiency by providing the aforementioned means of organizing, viewing, and switching between applications. The rest of this series will look at more ways Macs enable a more pleasant and productive computing experience.

Monday, January 28, 2008

Text Expansion: Wasting time trying to save time

Perhaps nothing is more irritating than trying to set up some time-saving software, having problems, and wasting lots of time resolving them. That's one reason why I found OS X so compelling when I first started using it; for the most part things just worked. Rather than fighting with the computer just to get the proper tools in place, I could actually get things done.

For reasons I'll go into some other time I have been searching for quite a while for ways to speed up my text input. My most recent endeavor was based on the idea of using text expansion to minimize the number of keystrokes I have to enter. As a special education teacher I had worked with the application Co:Writer by Don Johnston software, which does a fine job of text prediction as letters and sentences are typed. Unfortunately, a single license is $325. Since that is far too rich for my blood, I decided to set up a system of abbreviations myself. That can't be too hard, right? Guess again.

I found three programs that work as text expanders for OS X: Typinator, TypeIt4Me, and TextExpander. All are available as free trials, with full licenses costing 19.95 Euros, $27, and $29.95, respectively.

All three programs work the same way. They run in the background watching your keystrokes. When you type a space, punctuation, or other defined key, the programs compare the keyboard buffer to the list of abbreviations you have defined. If there is a match, they backspace over what you have just typed, copy the expansion onto the clipboard, and paste its contents. A sound can also play when this happens.

To give you an example, I have "ty" defined as a shortcut for "thank you". When I start a new word with "ty" (typing it right after a space or other delimiter) and then type, say, a period, the text expansion program backspaces three times, deleting the delimiter and the "ty". Then it copies the expansion and the period onto the clipboard and pastes them into place, effectively replacing the "ty." with "thank you.". It may sound complicated, but it's really not.
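The matching logic described above can be sketched in a few lines. This is only an illustration of the idea, not how any of the three programs is actually written: it models the net effect on the text rather than the real backspacing and clipboard pasting, and the function name `expand` and the choice of delimiters (any whitespace or punctuation) are my own assumptions.

```python
import string

# Assumed delimiter set: any whitespace or ASCII punctuation character.
DELIMITERS = set(string.whitespace + string.punctuation)


def expand(typed, abbreviations):
    """Return the text that results after abbreviations are expanded.

    A word is expanded only once a delimiter is typed after it.
    """
    output = []
    word = []
    for char in typed:
        if char in DELIMITERS:
            candidate = "".join(word)
            # Replace the word if it is a known abbreviation, else keep it.
            output.append(abbreviations.get(candidate, candidate))
            output.append(char)
            word = []
        else:
            word.append(char)
    # A trailing word with no delimiter after it is left as typed.
    output.append("".join(word))
    return "".join(output)
```

For example, `expand("ty.", {"ty": "thank you"})` gives `"thank you."`, while `expand("ty", {"ty": "thank you"})` leaves the text alone because no delimiter has been typed yet.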

The product pages for each program tend to emphasize the use of abbreviations for larger snippets of repetitive text like form letters. My usage goal was a little different. I wanted to use very short abbreviations for very common, but sometimes also very short, words. According to teacher school, if you take the 100 most common words in the English language, you can read (or write) 50% of all elementary text. One of the popular lists of words by frequency is Fry's First 100, named for its creator, Edward Fry. I figured that would be a good starting place for my abbreviations. Of course that is where the trouble started.

The Fry 100 Word List

I began simply enough. I had a text file of Fry's List with one word per line. The programs all had options for importing text files, so I started typing abbreviations after each word, with a comma in between. If a word was only one letter (like "I") or not easily abbreviated (like "in"), I deleted it from the list. For very common and short words I used one-letter abbreviations ("t" for "the", "n" for "and", etc.)

Unbeknownst to me, there were several problems with this. First of all, the programs would accept tab-delimited but not comma-delimited text. I had to search and replace all my commas with tabs, which was not too big of a deal. Next, however, I discovered that I had put the abbreviation and the expanded text in the wrong order. I didn't want to retype all of that (though it probably would have been faster in the long run), so I found a simple Java program that read in a comma-delimited file and wrote it out differently, and I modified it to fit my needs. Unfortunately, after all of this I still had a problem with the encoding of the text. The text expansion programs would not accept Unicode, so I had to resave the file.
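For anyone facing the same conversion, the column swap is small enough to sketch here. This is not the Java program I used; it is a minimal Python stand-in, assuming each input line holds the word followed by its abbreviation and that the import wants the abbreviation first, separated by a tab. The function name `swap_and_tab_delimit` is made up for the example, and the encoding question is left out.

```python
import csv
import io


def swap_and_tab_delimit(comma_text):
    """Turn 'word,abbreviation' lines into 'abbreviation<TAB>word' lines."""
    reader = csv.reader(io.StringIO(comma_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        if len(row) != 2:
            continue  # skip blank or malformed lines
        word, abbreviation = (field.strip() for field in row)
        writer.writerow([abbreviation, word])
    return out.getvalue()
```

Feeding it the two lines `the,t` and `and,n` produces `t<TAB>the` and `n<TAB>and`, which is the order the import expected.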

After all this conversion I finally had my abbreviations loaded into TextExpander. The program installs itself as a System Preferences pane and has a nice interface with some advanced features. You can decide on a "snippet" by snippet basis whether to type the delimiter and how to treat upper case. I started using the program while doing emails and blogging. As I encountered a new word that I use a lot, I would add a snippet if there was not one already. It was gratifying to hear the little beeps as I typed, knowing that I was saving keystrokes each time the sound played.

But my troubles were not over. For some reason my one-letter abbreviations were not working. It turns out that TextExpander and Typinator set a minimum of two letters for an abbreviation. While TextExpander correctly highlighted my snippets in red if I accidentally created duplicates, it did not flag the one-letter snippets. This limitation eliminated much of the benefit of text expansion as I was using it. Fortunately TypeIt4Me allows single-letter abbreviations, but changing programs led to another problem.

I had used TextExpander for a while and added some 50 new expansions. Once again I had new abbreviations that I had to transfer into a program. TypeIt4Me's "Open File..." would allow me to choose the TextExpander file, but no new words would appear. I took a look at the two programs' abbreviation files, both plain text XML. Both are standard Apple plists, even using the same name for most attributes. However, TypeIt4Me capitalized the first letter of each, while TextExpander did not, and XML is case-sensitive. In this case close did not count.

In my stubborn refusal to do data entry when something is already in a computer, I ended up with another time-consuming solution. I took the TextExpander XML file and used XSLT to parse out each abbreviation and expansion and write them to a tab-delimited text file for import into TypeIt4Me. I'll try to be an optimist and imagine that maybe somewhere all this foolishness of mine will be useful to someone else.
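In case it is useful, here is roughly what that extraction looks like when done with Python's standard `plistlib` module instead of XSLT. Treat it as a sketch under loud assumptions: the key names `abbreviation` and `expansion` and the layout (a top-level array of dictionaries) are placeholders I invented for the example, not the real TextExpander format, so check the names and capitalization in your own file first.

```python
import plistlib

# Made-up sample data; the real file's keys and layout may differ.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<array>
  <dict>
    <key>abbreviation</key><string>ty</string>
    <key>expansion</key><string>thank you</string>
  </dict>
</array>
</plist>
"""


def snippets_to_tsv(plist_bytes, abbr_key="abbreviation", text_key="expansion"):
    """Convert a plist array of snippet dicts to tab-delimited lines.

    abbr_key and text_key are hypothetical names; key lookups are
    case-sensitive, just as the XML itself is.
    """
    snippets = plistlib.loads(plist_bytes)
    lines = [f"{snippet[abbr_key]}\t{snippet[text_key]}" for snippet in snippets]
    return "\n".join(lines) + "\n"
```

Running `snippets_to_tsv(SAMPLE)` yields a single line with `ty`, a tab, and `thank you`, ready to be saved as a text file for import.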

I have since gotten TypeIt4Me set up to my liking. I have a shortcut key to toggle it on and off, and another to add a new abbreviation. My abbreviation file has grown to over 200 items. I have also learned not to type too fast after a replacement is triggered, or sometimes I end up typing in the middle of the copy and paste.

TypeIt4Me has a nice feature where it tracks the number of expansions done and keystrokes saved. As you can see below, it will be a while before I make up the hours spent mucking around with these programs, but I did get to polish up my Java and XML knowledge and eventually solve my problems.
TypeIt4Me shows how many keystrokes have been saved.

Wednesday, January 9, 2008

Hello, Computer?? Apple can you hear me?

There is a classic (among geeks at least) scene in Star Trek IV where the crew of the Enterprise has traveled back in time to the late 20th century. Chief Engineer Scott sits down in front of a Mac and says, "Computer. Computer?" When he gets no response, Dr. McCoy helpfully hands him the mouse, which Scott holds like a microphone: "Hello, Computer??" The befuddled 1980s Earthling standing by finally tells him to just use the keyboard ("How quaint"). [Watch on YouTube] While the Mac was not able to respond to voice input, it did bring the mouse-driven graphical user interface revolution to the masses.


Moreover, when Steve Jobs introduced the Macintosh over 20 years ago, his demonstration, in part, "let the computer do the talking" via a synthesized voice reading a short speech. This was quite a feat in 1984. However, very little has changed in speech synthesis between then and now. Why has there been so little advancement in voice-based computer interfaces in over two decades? Are the factors finally in place for the next interface revolution to truly put the Personal in PCs? The answer may be yes, and the company poised to lead that change is once again Apple.

The answers to the first question are many. Primarily, voice-based interfaces have stagnated not due to technology constraints but because of a lack of demand. The niche market has been served by software vendors like Dragon Systems (now owned by Nuance), who have been able to do voice recognition since the days when the 486 processor ruled. Current products showing the feasibility of voice recognition include voice dialing, available on many cell phones, even the most inexpensive, and the Sync system for making phone calls and controlling digital music players in Ford cars. The lack of demand on desktop systems in the past is largely due to the fact that the majority of computer use took place in the cubicle farms of the American office. Voice interfaces would not fit very effectively into that environment.

As more people spend more time online at home, speech-based interactions make more sense. In addition, many people compose numerous emails, and blog, chat, or Twitter daily. All of these applications would be well served by dictation software. Further, the generation of people having grown up with computers continues to grow. While older people, as a general rule, may be less comfortable with technology, kids and young adults have no aversion to talking to a machine. Perhaps the time is finally right for someone to take this seemingly logical next step in computer interfaces.

If the time is now, the company may be Apple. Buoyed by an amazing chain of products since Steve Jobs regained the helm, Apple has shown a repeated ability to take existing technologies and polish and package them in a user-friendly way that brings them to more people. The iPod and iTunes have done it for digital music, OS X for Unix, and now the iPhone/iPod Touch for mobile computing. History has shown Apple to have an interest in improving the user experience. Other major advantages are its control of the hardware and software environment and a commitment to open source. Apple has long included built-in microphones on its laptops and all-in-ones. Tweaking these for noise reduction or other speech enhancements would be fairly easy. If they set their engineers to the task, speech could become an intrinsic part of Mac OS.

This is the key to my argument. I don't purport to have done an exhaustive review of the available add-ons that can make a computer voice-activated. Far from it. But that is because this technology should not be an add-on. If I can edit a photo, listen to digital music, browse the internet, and write formatted text using a stock installation of an operating system, I should just as easily be able to search for a file, cue up a song, navigate to a web site, or dictate text without using the keyboard. I'm not saying that the computer should understand complex natural language or that the mouse and keyboard would be replaced entirely. I would be happy to follow a set format for commands, enunciate clearly, and separate each word from the others.

Mac OS X even includes some support out of the box for voice recognition and computer speech (my computer tells me the time every half hour). The problem is these features are not highlighted as the way to interact with the computer. Until there is a keynote where Steve Jobs uses spoken commands in a demonstration or there is an Apple ad campaign that shows users talking to their computers, consumers (and therefore developers) won't take speech seriously. But if it were suddenly put forward as part of human interface guidelines a whole new breed of more usable applications could take hold, and the next generation of computer interface could develop. If only Apple is listening.