Ensuring Audio CAPTCHA Cross-Device Compatibility

August 2013

The audio Captcha alternative is a necessary component of making Captcha protection accessible. However, from a strictly accessibility-focused perspective, generating Captcha sounds alone is completely useless if the visitor can't play them in their browser of choice. And despite numerous and frequent improvements in web technology, playing background audio on a web page is still an action that requires careful handling when cross-device and cross-browser compatibility is important...

HTML5 <audio> CAPTCHA Sound Playback

In the best case, the visitor's browser is modern enough to support Html5 audio – this means we can use full sound functionality, and the user doesn't have to worry about installing any sound player browser plugins. When we detect an Html5 audio-capable browser, we can use JavaScript to add an invisible <audio> element to the page when the sound icon is clicked, give it the Captcha sound Url, turn autoplay on and leave looping off, and we're done – the sound will play in the page background as soon as it is loaded from the server.

<audio src="BotDetectCaptcha?get=sound&..."></audio>
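As a rough sketch of the steps just described (not BotDetect's actual code), a click handler could build the element along these lines. The function name `createCaptchaAudio`, the `doc` parameter and the inline hiding style are assumptions made for illustration; the sound Url is supplied by the caller.

```javascript
// Hypothetical helper: builds a hidden, auto-playing <audio> element.
// `doc` is the page's document object, passed in so the function is easy to test.
function createCaptchaAudio(doc, soundUrl) {
  const audio = doc.createElement('audio');
  audio.src = soundUrl;
  audio.autoplay = true;         // start as soon as enough data has loaded
  audio.loop = false;            // pronounce the code once only
  audio.style.display = 'none';  // keep the player out of sight
  doc.body.appendChild(audio);
  return audio;
}
```

In a page this would be called as `createCaptchaAudio(document, soundUrl)` from the sound icon's click handler. Setting the `autoplay` and `loop` properties from script avoids a markup pitfall: both are boolean attributes, so writing `loop="false"` in Html would actually switch looping on.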

The promise of Html5 audio is that the browser alone will be capable enough to handle sound playback, and this is mostly true. But only mostly, because Html5 is still in a creative-destructive state of flux and mild "Browser Wars 2.0"-induced disorientation. So we have to tread carefully, and can't place absolute trust in browsers happily confirming their Html5 audio capabilities...

Childhood Maladies of HTML5 Multimedia Reproduction

For example, certain older versions of Firefox (v3.6.x) will cheerfully claim support when asked canPlayType("audio/wav"), and will even play the sound – but with a significant portion of its ending cut off, due to a bug that seemingly confuses the regular and gzipped audio content lengths. Fortunately, Firefox 4.0 and newer fixed this issue.

Yes, Firefox is currently at version 22.0 and obscure version 3.6.x bugs might seem irrelevant – but the fact that newer browsers can handle audio Captcha functionality flawlessly doesn't mean we can just hope visitors will always upgrade to the latest version. If we want to support the largest audience possible, we need to take care of users of older versions as well.

And yes, "users of older versions as well" means that BotDetect audio Captcha fully supports IE 6.0 :)

In another corner case, Windows versions of Safari accept Html5 audio – but can't play it without also installing QuickTime (despite the "standalone playback" premise of Html5 audio). Since we want to keep such corner cases (the "corners" involved having sharp edges) well away from visitors' sensitive ears, BotDetect classifies such browsers as incompatible with Html5 audio – unfortunately, QA test results must be trusted over vendor feature advertisements.
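One way to combine feature detection with such QA-derived exceptions is sketched below. This is only an illustration: the function name and the user agent patterns are assumptions, and the two exclusions simply mirror the Firefox 3.6.x and Safari-on-Windows cases described above rather than any real BotDetect rule list.

```javascript
// Hypothetical classifier: feature detection plus a list of known-bad browsers.
// `canPlayWav` is the string returned by audio.canPlayType('audio/wav'):
// '' (no support), 'maybe' or 'probably'. Some early implementations returned 'no'.
function isHtml5AudioUsable(userAgent, canPlayWav) {
  if (!canPlayWav || canPlayWav === 'no') return false;

  // Firefox 3.6.x claims WAV support but cuts off the end of the sound.
  if (/Firefox\/3\.6/.test(userAgent)) return false;

  // Safari on Windows needs QuickTime. Chrome's user agent also contains
  // "Safari", so it has to be ruled out explicitly.
  const isSafari = /Safari\//.test(userAgent) && !/Chrome\//.test(userAgent);
  if (isSafari && /Windows/.test(userAgent)) return false;

  return true;
}
```

In a page, the second argument would typically come from `document.createElement('audio').canPlayType('audio/wav')`, guarded by a check that the element has a `canPlayType` method at all.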

Graceful Degradation of CAPTCHA Audio in Legacy Browsers

Classifying certain browsers as Html5 incompatible is all well and good – but how do we handle audio Captcha functionality once they have been classified as such? As it turns out, a completely different implementation is required...

XHTML <object> + Proprietary <embed> Playback of CAPTCHA Audio

While more limited in capabilities, older browsers which don't support Html5 audio can still play Captcha sounds in the background using a combination of <object> and <embed> Html elements. This approach to web page sound playing is well known from the previous era of web multimedia, before Html5 was introduced or conceived of at all.

<object classid="clsid:22D6F312-B0F6-11D0-94AB-0080C74C7E95" height="0" width="0">
  <param name="AutoStart" value="1">
  <param name="Volume" value="0">
  <param name="PlayCount" value="1">
  <param name="FileName" value="BotDetectCaptcha?get=sound&...">
  <embed src="BotDetectCaptcha?get=sound&..." autoplay="true" hidden="true" 
         volume="100" type="audio/x-wav"/>
</object>

Unfortunately, this backup approach has a notable disadvantage: it requires that a sound player plugin is installed and configured in the browser. Similar to how Flash content can't be seen by visitors who don't have the Flash player installed, sound content embedded on web pages loaded in legacy browsers can't be played by users who don't have QuickTime (Firefox, Chrome, Safari, Opera) or Windows Media Player (IE) browser plugins installed and enabled.
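Since the fallback elements are also created from script, the markup shown earlier has to be generated for a given sound Url. The following is a minimal sketch of that step, assuming hypothetical helper names (`escapeAttr`, `buildLegacyPlayerMarkup`); it is written with `var` and string concatenation because it targets legacy browsers.

```javascript
// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: returns the <object> + <embed> fallback markup
// for one Captcha sound Url.
function buildLegacyPlayerMarkup(soundUrl) {
  var url = escapeAttr(soundUrl);
  return [
    '<object classid="clsid:22D6F312-B0F6-11D0-94AB-0080C74C7E95" height="0" width="0">',
    '  <param name="AutoStart" value="1">',
    '  <param name="Volume" value="0">',
    '  <param name="PlayCount" value="1">',
    '  <param name="FileName" value="' + url + '">',
    '  <embed src="' + url + '" autoplay="true" hidden="true"',
    '         volume="100" type="audio/x-wav"/>',
    '</object>'
  ].join('\n');
}
```

The returned string could then be assigned to the `innerHTML` of a placeholder element when the sound icon is clicked, which makes the installed plugin start playback.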

JavaScript-Independent, Downloadable CAPTCHA Sound Files

Both of the above sound playback approaches depend on JavaScript – the required audio container elements are only created on the client-side after the user requests Captcha audio. This makes sense since generating and pre-loading Captcha audio on page load for all visitors would put a significant extra load on the server, while only a minority of those visitors would actually use the audio Captcha functionality.

Obviously, if the visitor has JavaScript disabled, or uses a browser that doesn't support it at all, Captcha audio needs to be delivered in a different manner. To support this use case, the BotDetect sound Captcha icon is also a plain link to the audio Captcha sound file, which starts a download of the file only when JavaScript is disabled. When users download the sound file to their device, they can use whatever sound playing software is available to play it.
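This kind of progressive enhancement can be sketched as follows. The names `enhanceSoundLink` and `playSound` are hypothetical: `link` stands for the anchor element wrapping the sound icon, and `playSound` for whichever in-page playback routine applies to the browser.

```javascript
// Hypothetical wiring: `link` is the <a href="...sound Url..."> around the icon.
function enhanceSoundLink(link, playSound) {
  link.onclick = function (event) {
    if (event && event.preventDefault) {
      event.preventDefault(); // don't follow the link to the sound file
    }
    playSound(link.href);     // play in the page background instead
    return false;             // also cancels the default action in older browsers
  };
}
```

When JavaScript is disabled this function never runs, no handler is attached, and clicking the icon simply follows the link to the sound file.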

Depending on browser configuration and the sound playing software available on the device, this won't necessarily support the ideal use case of playing the sound in the background. But it will at least allow the user to hear the Captcha code pronounced, and give them a fair chance of passing the Captcha challenge.

Audio CAPTCHA Compatibility with Mobile Browsers

The good thing about mobile browsers is that they are modern enough to support Html5 audio out of the box. The less good thing about mobile browsers is that they also often have peculiar multimedia requirements, significantly different from those of their desktop equivalents. And while it's true that blind or low-sighted people who need audio Captcha the most rarely sport fashionable iPhones, mobile browser compatibility is still important for accessible audio Captcha implementations.

Handling iOS Audio CAPTCHA Range Requests

For example, the iOS browser (Mobile Safari) only accepts sound files delivered using Http range requests. BotDetect usually sends the Accept-Ranges: none Http header with audio Captcha responses, to instruct clients not to use range requests. The reasoning behind this default behavior is that Captcha sounds are fully dynamic and generated on-the-fly, with each Http request for audio Captcha content generating a new and random pronunciation of the Captcha code.

But since iOS simply didn't play anything when given a non-range Http response, we've had to implement an iOS-specific workaround supporting audio Captcha delivery using range requests. Of course, this means that the generated sound file must be stored by the server between requests – triggering issues with the (in)famous statelessness of the Http protocol – until all byte ranges are delivered to the client, consuming server memory and possibly affecting server performance when large numbers of iOS clients are connected.
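The server-side idea can be illustrated with a small sketch. It is written in JavaScript for consistency with the other examples here and is not BotDetect's implementation; the function names, the cache keyed by a Captcha identifier, and the decision to answer unusable ranges with the full file (a stricter server would reply 416) are all assumptions.

```javascript
// Generate the sound once per Captcha and keep it until all ranges are served,
// so every range request sees the same bytes.
function getCachedSound(cache, captchaId, generate) {
  if (!cache.has(captchaId)) cache.set(captchaId, generate());
  return cache.get(captchaId);
}

// Parse a single-range header such as "bytes=0-1", "bytes=100-" or "bytes=-500".
// Returns { start, end } (inclusive), or null if absent, malformed or out of bounds.
function parseRange(header, size) {
  const match = /^bytes=(\d*)-(\d*)$/.exec(header || '');
  if (!match || (match[1] === '' && match[2] === '')) return null;

  let start, end;
  if (match[1] === '') {
    const n = Number(match[2]);          // suffix form: the last n bytes
    if (n === 0) return null;
    start = Math.max(size - n, 0);
    end = size - 1;
  } else {
    start = Number(match[1]);
    end = match[2] === '' ? size - 1 : Math.min(Number(match[2]), size - 1);
  }
  return start <= end && start < size ? { start, end } : null;
}

// Build status, headers and body for one request; `sound` is a Uint8Array.
function buildSoundResponse(sound, rangeHeader) {
  const range = parseRange(rangeHeader, sound.length);
  if (!range) {
    return {
      status: 200,
      headers: { 'Accept-Ranges': 'bytes', 'Content-Length': String(sound.length) },
      body: sound
    };
  }
  const body = sound.subarray(range.start, range.end + 1);
  return {
    status: 206,
    headers: {
      'Accept-Ranges': 'bytes',
      'Content-Range': `bytes ${range.start}-${range.end}/${sound.length}`,
      'Content-Length': String(body.length)
    },
    body
  };
}
```

The cache is what introduces the state mentioned above: entries need to be evicted once the sound has been delivered or after a timeout, otherwise memory use grows with the number of connected clients.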

Audio CAPTCHA Starting Delay and Mobile Browser Restrictions

Another mobile-specific audio Captcha implementation issue was related to the BotDetect feature of configurable Captcha starting sound delay. This functionality is very important for audio Captcha accessibility, since it allows webmasters to delay Captcha sound playback by an amount that suits the length of the sound icon label.

When screen reader software such as JAWS reads the page for a blind user, it will pronounce the sound icon label ("Speak the CAPTCHA Code" by default) first, and give the user an option to activate it. If there is no delay between reaching the sound icon and starting audio Captcha playback, the label pronunciation could be simultaneous with Captcha code pronunciation, making both of them incomprehensible. If a starting delay is configured, BotDetect JavaScript code will delay playing the sound using appropriate setTimeout() calls.

Mobile Browser JavaScript Execution Limitations

However, some recent versions of mobile browsers (both iOS and Android) require multimedia playback to only happen directly after user input (the JavaScript execution context started by the onclick handler of the sound icon, in this case). And since setTimeout() calls push code execution to a different execution context, delayed audio Captcha playback could get blocked by those browsers.

BotDetect deals with this by respecting the browser limitation and creating the <audio> element in immediate response to the user action – but also immediately pausing the playback, and only resuming it after the desired amount of time has elapsed. This workaround is perhaps somewhat less than pretty, but also necessary to implement proper audio Captcha functionality compatible with mobile browsers.
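The play, pause and resume sequence can be sketched roughly as below. This is an illustration under assumptions, not BotDetect's code: the names `safePlay` and `startWithDelay` are hypothetical, and the injectable `schedule` parameter exists only to make the function testable.

```javascript
// Call play() and swallow the rejection that newer browsers report through
// the returned promise when playback is interrupted or blocked.
function safePlay(audio) {
  const result = audio.play();
  if (result && typeof result.catch === 'function') {
    result.catch(() => {});
  }
}

// Hypothetical helper, meant to be called synchronously from the sound icon's
// click handler. `audio` is an <audio> element (anything with play() and pause()).
function startWithDelay(audio, delayMs, schedule = setTimeout) {
  safePlay(audio);   // start inside the user-initiated call, so it is permitted
  audio.pause();     // pause at once, so nothing is audible yet

  // Resume after the screen reader has had time to finish reading the label.
  schedule(() => safePlay(audio), delayMs);
}
```

The important detail is that the first `play()` call happens in the same execution context as the click; only the resume is deferred through `setTimeout()`.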

While we are hopeful that future browser and device improvements will make such workarounds obsolete, we have to deal with the reality of device and browser support for audio playback now.

Conclusions & Further Reading

Implementing a widely usable audio Captcha requires dealing with a broad range of client devices – from modern browsers supporting Html5 multimedia, through older browsers requiring an external sound player plugin, to clients with no JavaScript support at all – and gracefully adapting to the available sound playback capabilities.

If you want to check the result of these cross-device compatibility efforts in the final BotDetect product, test Captcha audio compatibility here yourself. And if you have any trouble playing the Captcha sound in any client, please contact us and we'll investigate what needs to be done to handle it.

Sound Advice: All About Audio CAPTCHA