Sound Poetry for Screen Readers

Last year, as part of my 101 X 101 Words project, I made a few experiments with creating sound poems specially for screen readers. I thought it might be interesting to put the built-in screen reader technology available in most modern computers and devices to artistic use, with the intention of embracing the varying sound results rather than relying on fixed sound files for audio.

Text-to-Speech has long been a feature on Macs, and has grown in sophistication in recent years. Opening System Preferences and selecting “System Voice” under the Dictation & Speech panel provides a “Customize…” option. Here, a number of extra voices covering a wide range of countries can be downloaded, with an added option to download high-quality versions of certain voices.

The fun part has been experimenting with how to coax the screen reader into making sounds that are not necessarily the words of everyday speech. One trick, for example, is to include a vowel in a crazy combination of consonants so that the reader doesn’t spell out the letters individually. One of the difficulties is that, even when restricted to the specific range of Mac voices, the way they interpret the text is by no means consistent. Some of the voices from older versions of the OS read certain combinations of letters individually, whereas the more recent voices (including all of those on iOS) do their best to combine them into a single word.
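For anyone who wants to compare how different voices handle the same letter combination, here is a minimal sketch. It assumes the macOS `say` command-line tool and its `-v` flag for choosing a voice. The helper name `say_commands`, the nonsense word, and the choice of voices are all illustrative assumptions; the voices named may not be installed on a given machine. The sketch only builds and prints the commands rather than running them.

```python
import shlex


def say_commands(text, voices):
    """Return one `say` argument list per voice, all speaking the same text."""
    return [["say", "-v", voice, text] for voice in voices]


# Hypothetical test word: a consonant cluster with one vowel slipped in.
# The voice names are assumptions; swap in whichever voices are installed.
for argv in say_commands("krzmptalf", ["Alex", "Samantha", "Tessa"]):
    # Print a shell-safe command line that can be pasted into Terminal.
    print(shlex.join(argv))
```

Pasting the printed lines into Terminal on a Mac, one after the other, should make it easy to hear which voices spell the letters out and which try to pronounce a word.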

Some of my first experiments can be found on my post for day 63 of the project. (Despite the intention of an open-ended sound result, I’ve since added recordings to the posts as a form of documentation and point of reference.)

Screen Reader Pieces 1–5

Another tricky thing is getting the reader to make pauses, an important aspect should one want to play with the musical aspects of a text. New lines, even with the inclusion of periods, are interpreted differently depending on how the speech is activated. Activating via a context click results in a quicker reading than triggering the same selection via a keyboard shortcut.
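One possible way to get more predictable pauses, sketched here under assumptions, is to replace line breaks with an explicit silence. Apple's speech synthesizer has historically recognised embedded commands such as `[[slnc 400]]` (a silence given in milliseconds), but whether a particular voice or OS version honours them is not guaranteed and would need to be checked by ear. The helper name `with_pauses` and the sample syllables are hypothetical and are not taken from the pieces themselves.

```python
def with_pauses(lines, pause_ms=400):
    """Join non-empty lines with an embedded silence command between them."""
    if pause_ms < 0:
        raise ValueError("pause_ms must not be negative")
    # Assumption: the voice in use honours the [[slnc N]] embedded command.
    separator = f" [[slnc {pause_ms}]] "
    cleaned = [line.strip() for line in lines if line.strip()]
    return separator.join(cleaned)


# Hypothetical three-line text with a blank line that should be skipped.
print(with_pauses(["tak tak", "", "mmo", "tak"], pause_ms=250))
```

The resulting string could then be handed to the speech synthesizer in place of the original multi-line text, so that the length of each pause no longer depends on how the speech was triggered.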

Here’s an excerpt from the fifth piece from the above-mentioned post activated first via context click and then keyboard shortcut.

context-click
keyboard-shortcut

Another example could be the screen-reading of the text for the 62nd day of the project, which is based on sentences constructed from 3-letter-words printed by Erik Spiekermann at the p98a letterpress workshop in Berlin.

context-click
keyboard-shortcut

The intonation of the text, again a crucial “musical” aspect, also depends on how much text is selected. A word spoken within the context of an entire paragraph will be intoned differently from the same word selected and spoken on its own.

I also experimented with some pieces created specifically for the characteristics of speech on iOS. Since the release of iOS 9 a screen reader can be conveniently activated by either swiping down with two fingers from the top of the screen or selecting text and choosing “Speak” from the pop-up menu. This needs to be enabled in Settings → General → Accessibility → Speech. The first option on the Accessibility Speech screen is “Speak Selection”, which I prefer since one can be quite specific with what one would like to have spoken.

“Speak Selection”

When using “Speak Selection” iOS sometimes chooses the nationality of the voice depending on the text selected. This can make for some quite specific sounds, as with this little Nocturne created for the Hungarian voice Ezster on day 79.

“Speak Screen” is the second option on the Accessibility Speech screen. Swiping down with two fingers from the top of the screen brings up a control bar with the option to skip to the next section of the page as well as adjust the speed, pause playback, or dismiss it altogether.

“Speak Screen”

There’s also a third option to “Highlight Content”, which highlights the word on the screen that is currently being spoken. Unfortunately this only works with the screen in portrait orientation. In landscape you can amuse yourself with vertical blocks of blue moving across the screen as if it were still in portrait.

I was surprised to find a specifically South African voice amongst both the Mac and iOS voices, and since I happened to be in South Africa during the last month of the project I thought it might be fun to include a little piece specifically for that voice using typically South African turns of phrase. In this case the character of the language is not enough to trigger an automatic change of voice and you’ll have to navigate (in iOS) to Settings → General → Accessibility → Speech → Voices and select “Tessa”. The timings are not ideal, and I might include my own reading at some point, but not bad “for a computer”.

* * *

In some of my 101 word posts I used the term VoiceOver to describe the built-in text-to-speech feature on the Mac and iOS. That’s probably not entirely correct, since the VoiceOver feature makes it possible to control the computer through gestures or the keyboard, based on spoken descriptions of items on the computer screen. It might be interesting to create something specifically for that at some point.



Rudiger Meyer is a composer interested in the play between traditional concert music and new media.