I give this some thought as I make my lunch with ingredients I purchased from an online grocery store. I keep thinking about it as I return to my desk and respond to a few emails using my online mailbox. I check my online calendar for upcoming meetings (there are two, both to be held using one ubiquitous VOIP platform or another), and I keep thinking about it.
I jot down some of my thoughts in a text editor, using my laptop with an accessible OS and an integrated screen reader, then check the time on my accessible watch (that has a different integrated screen reader).
I think about some of the things I've done in recent days. I think about the pair of jeans I bought from an online retailer, the statements I reviewed for multiple online bank accounts, the invoices I sent to customers from a cloud-based accounting application, the contract I reviewed and signed using an online signature platform, the ideas I found whilst browsing for ways to celebrate my birthday, the code I maintained using an online version control platform, the posts I made on social media, the TV (with Audio Description) I watched from different online broadcasters, the articles I read from numerous online news outlets, the recipe I found for dinner the other night, the online calculator I used to work out the carbs in the recipe so I could take the correct dose of insulin, the app I used to check my glucose levels before I ate, and...
I stop thinking about it.
I don't know if Nielsen has a disability or not and I won't presume to speak about his experience in any case. I do have a disability and since Nielsen didn't afford disabled people the same courtesy, this is my experience.
Of course there were some things I found difficult, and a few that I couldn't do at all without sighted assistance.
But this is my life, my experience, and much of it happens online.
Nielsen thinks accessibility has failed.
Nielsen thinks that generative AI will make my experience better. Nielsen apparently doesn't realise that generative AI barely understands accessibility, never mind how to make accessible experiences for humans.
I think Nielsen needs to think again.
BeMyEyes is one of the most remarkable apps to have emerged in recent years. You sign up either as a sighted volunteer, or a blind or low vision person. If, like me, you fall into the latter category, you can then borrow a pair of eyes from a randomly selected volunteer.
The app is smart enough to connect you with a volunteer who speaks your language, and for whom it is daytime when you make your request. This is especially useful when you're somewhere that makes it inconvenient to call the person you'd usually ask for this kind of help - because it's the middle of the night for them, for example.
I've used it to remind me of settings on my stove, to figure out air-con controls in hotel rooms, to check the results of a rapid COVID test, and much more besides.
Although the anonymity of both participants is preserved, there are times when asking a stranger for help isn't convenient or comfortable. This is where BeMyAI comes in.
You can take a picture from inside the BeMyEyes app or upload one from elsewhere on your phone. With help from ChatGPT4 you get a detailed description of the image presented as text, which VoiceOver reads automatically. So far at least, this isn't anything that other apps like SeeingAI don't offer. But here's the thing - you can talk to this just like you would one of the sighted volunteers.
Well, sort of.
For the moment you have to type or dictate your questions into a text box and submit them. In time it would be good if this feature could be made conversational, but this is an early beta, so it is what it is.
When I posted on the socials that I was experimenting with BeMyAI, people were curious about its accuracy - not surprising given its purpose and the fact that ChatGPT is, well, shall we say somewhat badly educated?
So I took a photo of a bit of my office that I know very well. I've owned most of the things in the picture since before I lost my sight, and I'm very familiar with the rest because it's my house after all.
Here's the initial description:
The picture shows a wooden shelf attached to a wall. On the shelf, there are three items from left to right; an old fashioned rotary telephone, a metal slinky, and a Rubik's cube. The telephone is beige with a black rotary dial and a coiled cord, the metal slinky is in its classic coiled position, the Rubik's cube is colourful with different colours on each side including green, yellow, and red.
The wall behind the shelf is painted in a light colour and the shelf is illuminated by a light source from above.
Not bad. Not entirely accurate either though. The telephone isn't beige, it's entirely black.
I asked BeMyAI:
Are you sure the telephone is beige?
It replied with:
Yes, I apologise for any confusion. The telephone is indeed beige in colour with a black rotary dial, and a coiled cord connecting the handset to the base.
Then there's the enormous yellow pencil. It's on the shelf between the telephone and the slinky.
So I asked:
Can you describe the pencil?
And received the reply:
I apologise, but there is no pencil visible in the picture you provided. The picture shows a rotary telephone, a metal slinky, and a Rubik's cube on a wooden shelf. If there is another picture with a pencil in it you would like me to describe, please upload it and I'd be happy to help.
The information I received along with access to the early beta had this to say:
Our AI pioneers have observed that sometimes AI will tell you some things are in the picture when they are not, or it will omit things that are (these are called hallucinations). That’s why at this early stage we urge you to follow a few simple safety guidelines...
Fair enough. You'd be surprised by how much stuff a sighted person doesn't pay attention to, or at least doesn't think worthy of commenting on, so I'm not holding BeMyAI to a particularly high standard just yet.
A lovely feature of humans is that you can ask them questions when you want more information, and those questions can be quite abstract. Sometimes, though, it might not be convenient to do so - with BeMyEyes you're always slightly conscious that some kind person has given up a few moments of their time to help, and you don't want to presume too much on their time. Or at least that's how I feel. Others may feel differently of course.
Either way, it's the ability to question BeMyAI that makes this particularly interesting. So I asked:
What is the biggest item on the shelf?
It told me:
The biggest item on the shelf is the old fashioned rotary telephone. It takes more space than the metal slinky and the Rubik's cube.
Alright:
What do I do with the middle item on the shelf?
BeMyAI cheerfully told me:
The middle item on the shelf is a metal slinky. A slinky is a toy that is typically played with by allowing it to stretch and reform its shape or by placing it at the top of a set of stairs or an inclined surface and watching it walk down step by step as it flips over itself. It's a classic toy that can be used for simple amusement and entertainment.
True that. But enough of these easy questions.
Has the Rubik's cube been solved?
BeMyAI explained that:
In the picture the cube has not been solved. The sides that are visible show different colours indicating that it is still mixed up.
Ah yes. It's been like that for at least 30 years I expect, but I digress.
How do I solve the Rubik's cube?
Wait for it...
Solving the Rubik's cube involves making each side of the cube a single solid colour. There are various methods and algorithms to solve it But I'll give a basic outline of steps:
- Solve the white: start by solving the white edge pieces to form a cross on one face of the cube.
- Solve the white corners: next solve the white corner pieces so that the entire white face is complete.
- Solve the middle layer: move onto solving the middle layer by aligning and positioning the edge pieces currently.
- Solve the top face: solve the yellow face by getting all the yellow pieces on the top layer without worrying about their final position.
- Position the yellow corners: position the yellow corner pieces currently.
- Position the yellow edges: finally position the yellow edge pieces currently to complete the cube.
Each of these steps involves specific moves and algorithms . It might be helpful to watch tutorials or read guides that provide detailed instructions and visual aids. Since you are blind I would recommend looking for audio tutorials or guides that are specifically designed for visually impaired individuals. There are also tactile Rubik's cubes with different textures on each side to make it.
Good grief! That seems plausible, but who knows? And before you ask, no, I have no idea if it works or not. This is not an accessible Rubik's cube, and the other member of this household is otherwise occupied right now. Washing his hair, apparently.
The truncation in some of the ChatGPT responses is a little odd (after the word "currently" for example), and the ending is something of a cliff-hanger. I'm not sure if that's ChatGPT or a bug in the BeMyAI feature though.
When BeMyEyes announced the BeMyAI feature it was reported that:
If a user sends a picture of the inside of their refrigerator, the Virtual Volunteer will not only be able to correctly identify the items within, but also extrapolate and analyze what can be prepared with those ingredients.
No. I'm not going to share a picture of the inside of my fridge with you. It's frankly embarrassing, although in my defense my weekly groceries don't turn up until tomorrow.
Here's what BeMyAI had to say on the matter though:
The picture shows the inside of an open refrigerator. On the top shelf there is a bag of something green, possibly vegetables, and a white container with a lid. On the second shelf there is a red jug with a lid and a carton of almond milk. On the third shelf there is a container of eggs, a bag of mixed nuts, and a lemon. In the bottom drawer there are some vegetables.
The refrigerator is white and has a wooden door handle on the right side. The light inside the fridge is on.
I posed the critical question:
What shall I cook for dinner?
BeMyAI chirpily suggested:
Based on the items in your refrigerator you could make a simple vegetable stir-fry
Well, yes, if I had any vegetables I suppose I could.
...with mixed nuts for some added crunch and protein.
Ah. That'll be those infamous hallucinated nuts - or should that be hallucinogenic? Probably came from the same place as the hallucinated eggs.
I'm reminded of the old joke - "If we had some bacon we could have some bacon and eggs if we had some eggs".
Anyway, back to the cooking suggestion:
You can use the green vegetables from the top shelf
Well, yes, except it's actually half a loaf of brown bread.
...the mixed nuts, from the third shelf
Those again?
... and add a squeeze of lemon for some freshness.
Um. What lemon? I know there isn't a lemon in the fridge because I'd have stuck some in my G&T at the weekend.
To be fair it's a yellow plum, so close enough.
If you have any additional ingredients like rice noodles or sauces, you can add them to make the dish more flavourful.
I can indeed. Good idea.
And there you have it. My adventures with the early beta BeMyAI feature of the BeMyEyes iOS app.
What's really going to bake your noodle though, is that I really did have stir-fry for dinner tonight, only some time before I stuck BeMyAI in the fridge and asked it for ideas.
The concern is that if a website makes your screen reader talk faster or louder, or in a different voice, or if it forces your screen reader to speak content in a particular way, it could make the content difficult or impossible to understand. These concerns are not unfounded, but neither are they as unique, serious, or unsolvable as they might seem, and arguably they do not outweigh the benefits CSS Speech could bring.
It's worth noting that the use case for CSS Speech is not just about screen readers. It also encompasses reader/read aloud capability (available in Firefox, Edge, and Safari, plus Chrome with an extension). So concerns about speaking rate and volume apply to anyone who listens to content for whatever reason.
Let's take a look at the protections and safeguards that already exist, that could be improved upon, and that will be needed if CSS Speech is to be supported in browsers and adopted across the web.
As a rule, screen reader users are accustomed to listening to synthetic speech and we often do so at higher speeds than most other people find comfortable. This probably puts us at something of an advantage over people using browser readers, but the likelihood is that the speech rate will be designed to suit the majority and if anything, screen reader users will probably find the chosen speech rate too slow rather than too fast.
If the CSS sets a rate that is too fast or too slow, screen readers have shortcuts that let you change the rate of speech on the fly. It's not uncommon for screen reader users to do this anyway - you might slow down your screen reader when reading a document that needs your close attention, with the speaking rate reverting to normal when you switch to another application or use the shortcut again.
The ability to change the speaking rate is also available in browser readers, so in short, if the speaking rate set in the CSS is too fast or slow for your preference, you can change it yourself using the browser reader or your screen reader.
The CSS Speech module has some safeguards built into it. For example, the voice-volume property can currently be set to "silent", "x-soft", "soft", "medium", "loud", or "x-loud". These are all relative to the volume you're listening at, so "loud" would make the voice a bit louder than usual, and "x-loud" a bit louder than that. Here's what the CSS Speech module currently says about this:
This sequence of keywords corresponds to monotonically non-decreasing volume levels, mapped to implementation-dependent values that meet the listener's requirements with regards to perceived loudness. These audio levels are typically provided via a preference mechanism that allow users to calibrate sound options according to their auditory environment. The keyword ‘x-soft’ maps to the user's minimum audible volume level, ‘x-loud’ maps to the user's maximum tolerable volume level, ‘medium’ maps to the user's preferred volume level, ‘soft’ and ‘loud’ map to intermediary values.
Which is a complicated way of saying that the increments are not meant to be dramatic or uncomfortable, just changes that are relative to your current volume level.
However, according to the CSS Speech module, the voice-volume property can also be set using a decibel offset from the chosen keyword. The idea is that more subtle incremental changes could be made to the volume, but it also opens up the possibility of more extreme changes too. This could be addressed in the specification, either by limiting the range of decibels that are supported, or by removing the decibel offset option entirely.
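As a rough sketch of how the two forms compare (the selectors are hypothetical, and the syntax follows the current CSS Speech draft, which may yet change):
/* Keyword values are relative to the listener's own volume level */
.breaking-news {
  voice-volume: loud;
}
/* A decibel offset fine-tunes a keyword - subtle here, but potentially
   extreme if the specification doesn't cap the permitted range */
.aside {
  voice-volume: soft -6dB;
}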
The ultimate safeguard is for browsers to offer a way to ignore CSS Speech. There is precedent in the existing ability to ignore all CSS, but that is too much of a blunt instrument in this context. Not all screen reader users are blind or have low vision, so the visual design of content is still relevant to some. Even screen reader users who cannot see are affected by certain CSS properties like display that have an impact on the way content is presented to sighted and non-sighted people alike. In other words, there needs to be the option to ignore CSS Speech that is independent of the option to ignore CSS generally.
With these protections and safeguards in place, CSS Speech would let us elevate the experience of listening to content from a monotone into something altogether more interesting.
There are some truly extraordinary Text To Speech (TTS) engines out there, but they're not the ones used by browsers or screen readers as a rule. For the most part this means that content spoken by web readers or screen readers is delivered in a flat, disinterested way that never changes - whether reading about a climate catastrophe or describing unfeasibly cute and fluffy things.
Imagine this news item:
<h2>News headline: Nothing happened!</h2>
<p>Reported on 1 April</p>
<p>Nothing happened today. Everybody went and had a nice cup of tea instead.</p>
MALE VOICE (FLAT VOLUME, RATE, PITCH): "News headline: Nothing happened! Reported on April the 1st. Nothing happened today. Everybody went and had a nice cup of tea instead."
With CSS Speech it would be possible to subtly alter the way parts of the news item are spoken - giving them a little bit of texture and interest:
.headline {
speak: auto;
voice-volume: x-loud;
voice-rate: fast;
voice-pitch: high;
}
.date {
speak: auto;
voice-volume: soft;
voice-rate: x-fast;
voice-pitch: low;
}
<h2 class="headline">News headline: Nothing happened!</h2>
<p class="date">Reported on 1 April</p>
<p>Nothing happened today. Everybody went and had a nice cup of tea instead.</p>
MALE VOICE (LOUD, FAST, HIGH PITCH): "News headline: Nothing happened!" MALE VOICE (SOFT, EXTRA FAST, LOW PITCH): "Reported on April the 1st." MALE VOICE (NORMAL VOLUME, RATE, PITCH): "Nothing happened today. Everybody went and had a nice cup of tea instead."
This is the real case for CSS Speech - the ability to make aural design as much a part of thoughtful user experience as visual design.
Once this was largely an accessibility issue. Listening to a screen reader is incredibly tedious. Everything sounds exactly the same, regardless of what it is. Imagine if every website looked like it did back in the 90s (all grey backgrounds with blue/maroon links) and you start to get the idea!
Now it's something much broader than that. Edge, Firefox, and Safari, all make it possible for someone to listen to content, and there are extensions that bring the same capability to Chrome. Users have limited choices (to change the voice, increase/decrease the speaking rate), but authors have no ability to design the voice experience itself.
In these days of voice interaction on every platform, there is a growing expectation that it should be possible to design that experience just like we can the visual experience. In the same way an organisation chooses a logo and colour palette for its website, it stands to reason that they may also choose a particular voice that represents their brand.
SSML is intended for this purpose, and indeed it's how voice design for the Echo and devices with Google Assistant is handled, but it is not supported in any browser that I know of. Even if it were, it arguably returns us to the days of HTML before we had CSS to let us control the design from a central source (think of the font and center elements for example).
Enter CSS Speech. It began life back in the early 00s as Aural Style Sheets. Although a version of it was implemented in Emacspeak, it never achieved the implementation experience needed to become a W3C Recommendation.
In summary, CSS Speech is a set of CSS properties intended to let authors design the aural presentation of content - whether by a platform Text To Speech (TTS) through the browser's read aloud capability or someone using a screen reader. In 2015, I used the Web Speech API to simulate CSS Speech support with a very basic proof of concept.
It begins with the speak property. Like display determines whether an element is visible or not, the speak property determines whether the content contained within the element should be spoken or not. In fact, when set to auto, the intent is that speak and display will be symbiotic - if display: none; is set, then speak: none; is automatically assumed.
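A minimal sketch of that relationship (the class names are hypothetical):
/* display: none removes an element both visually and aurally; with speak
   at its default of auto, there is no need to also set speak: none */
.dismissed-notification {
  display: none;
}
/* speak: never silences an element that remains visible on screen */
.decorative-flourish {
  speak: never;
}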
There are then numerous other properties for setting the voice-family, voice-pitch, voice-rate, and voice-volume of the speech output. Again, you'll notice the similarities with existing CSS properties - voice-family is the equivalent of font-family for example.
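Put together, a hypothetical rule for quoted speech within an article might look something like this (a sketch based on the draft spec, not on any shipping implementation):
/* Spoken presentation for quotations - analogous to styling them visually */
blockquote {
  speak: auto;            /* spoken only when the element is also displayed */
  voice-family: female;   /* the aural counterpart of font-family */
  voice-rate: slow;
  voice-pitch: low;
  voice-volume: soft;
}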
One of the arguments against implementing CSS Speech is that it could disrupt or even prevent someone from using the website. This is both true, not true, and relatively easily solved in any case.
It is possible for CSS to make a website thoroughly unusable already. It happens all the time. This is why browsers already give users the ability to turn off style sheets. With CSS Speech support it would be necessary to have an additional on/off toggle just for speech style sheets, but the basic capability has precedent.
Conversely, several features have recently been added to CSS that either enhance the usability of web content, or that enable users to indicate their personal preferences. Think prefers-reduced-motion for example. It may also be possible to add a prefers-no-speech media feature to the CSS Speech spec perhaps.
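If such a media feature existed, it might be used along the lines of the following sketch - to be clear, prefers-no-speech is entirely hypothetical and not part of any specification:
/* Hypothetical: only apply author speech styling when the user
   has not asked for it to be ignored */
@media not (prefers-no-speech) {
  h2 {
    voice-rate: fast;
    voice-pitch: high;
  }
}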
I've spent the past few years talking a lot about this at conferences, along with ways to design the voice experience on other platforms like Google Home or the Echo. Without fail, people are astonished that no such capability is available on the web, and so I think the time has come to revisit it.
As it is, the CSS Speech spec is too big, too wordy, and has too many features. It's essentially unchanged in the 20 years it's been around, and certain features like the "aural box model" are definitely "of their time"!
A while ago the CSS Working Group accepted my offer to edit the CSS Speech module. It didn't come to anything at the time, but with sufficient interest, I'd like to strip the spec down to the bare minimum (the speech media type and a handful of basic properties like speak, voice-family, voice-rate, voice-pitch, and voice-volume for example), then with some implementations (behind a flag possibly) get some evidence to validate the anecdotal evidence.
What do you think?
The concept of "affordance" was first described by JJ Gibson in the late 1970s and Don Norman applied it to design in the late 1980s. Norman later used the phrase "perceived affordance" to clarify the use of affordance in interface design. Perceived affordance, as Norman puts it, is "whether the user perceives that some action is possible (or in the case of perceived non-affordances, not possible)".
For example, a button is intended to be pressed with the expectation that something will happen when it is. When a button is not pressed it stands proud of the surrounding surface, and when it is pressed it's flat. This means you can tell that it's a button and what state it's in by touching it or by looking at it. In other words, a button has both physical and perceived affordances. The knowledge of what to do with it comes from your prior experience of using buttons that conform to the same convention.
The concepts of affordance and convention were so familiar to most people that they were transferred from the physical world to the digital one. Radio buttons, for example, are so called because of car radios from the days before digital broadcasting.
Finding a radio station used to involve twiddling a knob to slowly move through the different radio bands until you found a station you wanted to listen to. Doing this whilst driving and, presumably, wanting to keep your eyes on the road, was not altogether conducive to an accident free life, so radio buttons were invented. Car owners could pre-tune their radio to a few favourite stations and assign each station to a button on the front of the radio. Since listening to one station at a time is enough for most people, the radio buttons were designed so that only one could be pressed at a time.
The digital world still hasn't really figured out physical affordance, at least not to the extent it's useful across websites and applications. This means we depend on perceived rather than physical affordance.
Buttons and radio buttons are not the only User Interface (UI) components to make the transition from the physical to digital world (think restaurant menus or the tabs in filing cabinets and telephone organisers), but let's stick with them for now.
Even though digital radio buttons look nothing like their physical counterparts used to, a common visual convention for digital buttons and radio buttons has emerged over the past three decades or so. If you're sighted and you've used any form of Graphical User Interface (GUI) in that time, you'll probably be able to recognise a button or set of radio buttons by their visual appearance. If you can't see and use a screen reader instead, then you'll recognise them based on their semantics and the way your screen reader announces that information.
Which brings us to the mismatch problem.
Using a set of radio buttons for a component that lets people select one thing at a time makes sense - it's what they're for after all. It's when the radio buttons don't look like radio buttons that the mismatch happens.
If you're a sighted mouse user, you'll see a set of buttons and your prior experience will tell you that you can point at the one you want and click on it with the expectation that something will happen when you do - and indeed it does.
If you're a sighted keyboard user, you'll also see a set of buttons, and your prior experience will tell you that you can repeatedly use the Tab key to navigate to the button you want, then the Enter or Space keys to activate it - except you can't because it isn't a set of buttons, it's a set of radio buttons, and the keyboard interaction is not the same.
You can use the Tab key to move to the first radio button in the set, but if you press it again, you'll navigate to the next focusable element after the radio buttons, not to the next radio button in the set. In other words, unless you happen to guess you're really using a set of radio buttons and you realise you should be using the arrow keys not the Tab key to navigate between them, you're likely to think the component is broken because you can't reach any other buttons in the set.
Note: Firefox is the exception because it does allow you to use the Tab key to navigate between all radio buttons in a set.
If you're a sighted screen reader user, you too will see the set of buttons, and your prior experience will tell you that you can use your screen reader's shortcut key for navigating to and between buttons to reach them - except you can't because it isn't a set of buttons, it's a set of radio buttons, and the screen reader shortcut is not the same.
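As a rough illustration (the markup is hypothetical rather than taken from any particular site), this is the kind of component where the mismatch arises - real radio buttons dressed up, via CSS that isn't shown here, to look like a row of ordinary buttons:
<!-- Radio buttons styled to look like buttons: the perceived affordance
     says "button", but the keyboard and screen reader behaviour is that
     of radio buttons (arrow keys move between them, not the Tab key) -->
<fieldset>
  <legend>Choose a size</legend>
  <label><input type="radio" name="size" value="small"> Small</label>
  <label><input type="radio" name="size" value="medium" checked> Medium</label>
  <label><input type="radio" name="size" value="large"> Large</label>
</fieldset>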
One of the Inclusive Design Principles is:
Be consistent: Use familiar conventions and apply them consistently.
It goes on to explain:
Familiar interfaces borrow from well-established patterns. These should be used consistently within the interface to reinforce their meaning and purpose. This should be applied to functionality, behaviour, editorial, and presentation. You should say the same things in the same way and users should be able to do the same things in the same way.
Good design means following that principle and making sure that the functionality of your component matches the perceived affordances of your visual design, or that your visual design matches the functionality of your component.
Here's a question for you: if you read a text description that says "A person", what does the image in your head look like?
There's a good chance that the person in your imagination looks a lot like you. That's because to you, you are your own normal.
One of the ways we normalise diversity is by making it, well, normal. The more you see people who are not like you on TV, in adverts, in the workplace, and every other place that humans can be found, the more normal it becomes.
When it comes to skin tone and ethnicity, much of this depends on the ability to see; so how do we make sure that people who cannot see are experiencing the same normalisation?
Jake rightly points out that context is important, and he makes this very reasonable comment:
In the context of a conference website, if the speaker is talking about JavaScript performance it seems unnecessary to bring race into it. But if the speaker is talking about their experience of prejudice in the tech industry, that changes the context.
Jake is right, but I also trust he won't mind if I suggest another way to look at it:
When I'm invited to speak at a conference, or I'm interested in attending one, one of the things I want to know is if the organisers have brought together a diverse line-up of speakers. Not all forms of diversity are visible of course, but if a person's skin tone and/or ethnicity are included in the text description, it normalises the idea that great conference speakers come from diverse ethnic backgrounds, even if you cannot see the photos of the speakers themselves.
There can be times when not mentioning skin tone can have unexpected consequences. I recently watched Lupin, a Netflix series set in Paris, where the lead character is played by an actor of colour.
There is a scene where Lupin fools the police into arresting someone else, instead of Lupin himself. The switch is possible because both characters are wearing similar clothes, and because both characters are people of colour. Except the Audio Description (AD) hadn't mentioned that the other character was a person of colour, only that Lupin was, and because diversity has not yet become normalised on TV, I assumed the "someone else" was white.
Yes, I know. That was my assumption at fault. But that's the point I'm trying to make: in the absence of information to the contrary, and until diversity is normalised in the ways blind people consume content, I'm likely to assume that, left unspecified, that "someone else" looks like me. I'm trying to do better, but a little (text description) help would go a long way.
An interesting note on this particular case, is that if the AD had described the skin tone of the other person, it would have drawn attention to the fact and that in turn would have been a clue to the impending twist in the story. If the AD had included skin tone as a regular part of describing all the characters, I wouldn't have given it a second thought, and the end of the episode would have been as much of a surprise to me as everyone else.
Which brings me to another point: normalising diversity is important, but so is enjoying it. The world would be an incredibly boring place if we all looked the same, but of course to a blind person everyone does look the same, unless things like text descriptions and AD enable us to both normalise and enjoy all the wonderful ways a human being can look.
At the end, the consistency should be fairly thick (somewhere between a soup and a heavy gravy), with all the onions and peppers almost entirely dissolved (with possibly a few bits of pepper skin still intact - feel free to try and fish them out), and the meat fork-tender.
If the sauce is too runny for your liking, you can add some corn starch slurry (one teaspoon of corn starch, mixed with a little bit of boiling water in a bowl first before pouring into the pot, to avoid lumps) and briefly bring it to the boil again until it thickens.
Serve with whatever starchy accompaniment you want: spaetzle (as is most common in Austria), large potato dumplings (the more common German version), gnocchi, rigatoni, potatoes, rice, or just with bread.
There are many variations of goulash, and this is one of the common German/Austrian versions my dad used to make. Some people will throw in extra ingredients like carrots and diced potatoes, or deglaze the pot with some red wine before adding the stock, or throw in extra spices like marjoram, oregano, cumin, tarragon. If that sounds like something you might enjoy, go ahead and add those, why not.
I use a screen reader and it sounds the same, whatever I'm doing - reading an email from a friend, reading news of a global disaster, reading something written in a first-person female or first-person transgender voice, shopping for clothes or scrutinising a business contract. Although the quality of voices used by screen readers is OK, you're never in any doubt that you're listening to synthetic speech - and believe me, it's boring!
It's worse than that though. I've been using the same screen reader with the same voice for about 20 years. The quality of synthetic speech has improved considerably over that time, so you might well ask why I choose to listen to a voice that has its origins in the 1980s.
The reason is not vocal quality; it's performance. Like other screen reader users, I've grown used to listening at relatively high speed - something like 520 Words Per Minute (WPM), as opposed to the roughly 150 WPM of conversational English.
Like GPS navigation systems, automated phone menus, and voice assistants, screen readers use Text To Speech (TTS) engines to create the voice you hear. Each TTS engine comes with a different set of voices, and some screen readers let you switch TTS engines so you can choose a voice you prefer.
Different types of TTS have emerged over the years, and each has its advantages and disadvantages.
To understand the way different TTS engines work, it's helpful to know a bit about how human speech works.
Sound is created when a flow of air is interrupted. The interruption makes air particles vibrate so they bump into other air particles. Those other air particles also start to vibrate, and bump into yet more air particles. This keeps happening until the wave of vibration, or sound wave, runs out of energy.
When you speak, air is pushed out of your body by your diaphragm, ribs, and abdominal muscles. On the way out, the flow of air is interrupted by your vocal cords, which vibrate, creating a sound wave.
The number of times your vocal cords vibrate in a second determines the pitch of your voice. If your vocal cords vibrate 85 times a second, the flow of air is interrupted 85 times a second, which means the pitch of your voice is 85Hz. The higher the frequency of the vibrations, the higher the pitch of your voice.
Your vocal cords do not all vibrate at the same frequency. The lowest and strongest frequency is known as your fundamental pitch - the pitch you speak at. Harmonics are vibrations that happen at the same time, but at a higher and weaker frequency.
Sounds that have a low pitch are louder than those with a higher pitch, so if your vocal cords were the only part of you involved in making yourself heard, your harmonics would not be heard over your fundamental pitch.
Your body solves this problem by amplifying the volume of the sound in resonating chambers. Each chamber contains air that will vibrate at a frequency determined by the size of the chamber: the smaller the chamber, the higher the pitch.
The sound wave you generated passes through your resonating chambers. If one of them has a size and shape that means it has a vibration frequency that's the same as either your fundamental pitch or one of the harmonics, the air inside the chamber starts to vibrate at the same frequency - amplifying the sound.
You can change the size and shape of your resonating chambers using things like your diaphragm, tongue, soft palate, jaw, and lips; and in so doing you control the pitch (frequency) and volume (amplitude) of the sound that emerges from each chamber. These chambers typically resonate at between three and five frequencies, known as formants.
Formant TTS creates artificial sounds based on a set of rules that mimic the basic characteristics of human speech (like frequency and amplitude). It needs little processor power, making it a good choice for devices with little or unknown computational capability - like computers running screen readers in the 1990s or older computers today.
The fact that formant TTS is entirely artificial means it's flexible enough to convert any text into synthetic speech, but the rules for generating different voices are limited. For example, a female voice is typically created by nothing more sophisticated than doubling the pitch of a male voice.
My usual Windows screen reader is Jaws, which uses the Eloquence TTS by default. There are eight voices available: Reed, Shelly, Bobby, Rocko, Glen, Sandy, Grandma, and Grandpa. The voices are intended to represent people of different ages and/or binary genders, but as the following demo shows, the voice characterisations are rudimentary at best:
All voices have an American accent.
Robotic Male Adult Voice: "Hello, this is the voice of Reed," Robotic Female Adult Voice: "This is the voice of Shelly," Robotic Male Child Voice: "This is the voice of Bobby," Robotic Deep Male Adult Voice: "This is the voice of Rocko," Robotic High Male Adult Voice: "This is the voice of Glen," Robotic High Female Adult Voice: "This is the voice of Sandy," Robotic Elder Female Voice: "This is the voice of Grandma," Robotic Elder Male Voice: "and this is the voice of Grandpa."
Speech generated using formant TTS has few audio glitches, so it's intelligible even at faster speeds, but it's every bit as artificial as the TTS itself, sounding robotic, monotone, and utterly devoid of expression.
In the following demo, the formant Eloquence TTS is used to speak a passage of text at the default speaking rate set by Jaws, then again at the rate at which I typically listen to my screen reader:
Robotic Male Adult Voice (Slow): "For millions of years, humans lived just like the animals. Then something happened that unleashed the power of our imagination. We learned to talk."
Robotic Male Adult Voice (Fast): "For millions of years, humans lived just like the animals. Then something happened that unleashed the power of our imagination. We learned to talk."
To overcome the artificial sound of formant TTS, concatenative TTS uses recordings of human speech, stored in a database in chunks of different sizes like phones and syllables.
Speech is synthesized by searching the database for the recorded chunks that most closely match the features of the text to be spoken, then joining them together, or concatenating them, to form coherent words and sentences. This takes quite a bit of processor power, and the act of searching the database and sequencing the speech output takes time, so the result is generally slower and less responsive than formant TTS.
In theory, concatenative TTS preserves the human qualities of the original recordings, but in practice it isn't possible to record a large enough set of words, in enough combinations of speaking style, voices, or emotional expressions, to do so. The voice quality tends to be better than formant TTS, but it still lacks the variable characteristics of human speech, and often has a "mumbling" quality that makes it less crisp when listening at higher rates of WPM.
In the following demo, the concatenative Vocalizer Expressive 2 TTS is used to speak the same passage of text:
Artificial Male Adult Voice: "For millions of years, humans lived just like the animals. Then something happened that unleashed the power of our imagination. We learned to talk."
Concatenative TTS is a good choice in situations where the vocabulary is limited and the rate of speech is measured, like airport announcements or automated phone menus, but less good for screen readers that need to be able to read everything and anything. Despite this, the vocal improvement over formant TTS means that most screen readers have the capability to use concatenative TTS voices.
Parametric TTS was created to overcome the limitations of both formant and concatenative TTS. It uses recordings of human speech, but unlike concatenative TTS, it does not sequence chunks of sound to construct words and sentences. Instead, it describes the statistical characteristics of the recorded speech in a set of rules, or parameters. Those rules are then used to generate an artificial acoustic wave using a signal processor known as a vocoder.
By changing the parameters that are sent to the vocoder, parametric TTS can be used to create voices with more flexibility than concatenative TTS, and more nuance than formant TTS. Despite the ability to simulate voices of different genders, ages, and accents with reasonable accuracy, parametric TTS smooths out the sound, giving speech an oddly flat, monotone quality.
In the following demo, Google's parametric TTS is used to repeat the same text:
More Human Male Adult Voice: "For millions of years, humans lived just like the animals. Then something happened that unleashed the power of our imagination. We learned to talk."
Like parametric TTS, neural TTS creates artificial wave forms, but it does so using a neural network that's been trained by an enormous amount of voice data.
Being trained on truly vast datasets means that neural TTS voices can sound remarkably human, even to the point that it can be hard to tell they're synthetic unless you know what you're listening for.
In the following demo, Google's WaveNet TTS is used to repeat the same text:
Nearly Human Male Adult Voice: "For millions of years, humans lived just like the animals. Then something happened that unleashed the power of our imagination. We learned to talk."
It takes a lot of processor power to generate speech using neural TTS though, so speech is typically generated in the cloud where computational "oomph" is plentiful.
This means that neural TTS is not yet an option for screen readers.
There are privacy safeguards that would need to exist for one thing. Imagine everything you read, everything you look at, and everything you write, being processed in the cloud, and I do mean everything. All your email; documents like your will, the naughty message you just sent someone, the groceries you bought this week, or a Non-Disclosure Agreement (NDA) you need to sign; people's personal information including your own; and a thousand other things that you happen to look at over the course of time.
There are performance issues for another thing. Although the generation of speech using a neural TTS is itself highly performant, network latency is another matter. Even on a high bandwidth connection, the time needed to make the round-trip to the cloud and back is not fast enough for even the most leisurely screen reader user to contend with.
Then there's the problem of network resilience and of having no network connection at all. A screen reader must work wherever the person using it happens to be, and that includes places where there is no connectivity, or where the connectivity is there but it's flaky and unreliable.
So here is my predicament:
I want to listen to my screen reader fast; really fast. It makes my teeth itch when I must listen at fewer WPM. If you have a home assistant, make it speak two or three times more slowly than usual, and you'll see what I mean.
I'd also like to listen to a voice that sounds almost, if not yet entirely quite like, a proper human. One that can put a bit of expression into the things I listen to, even if it's still a bit limited.
But until neural TTS engines can be used with screen readers, I'm taking performance over voice quality because I have work to do, emails to write, contracts to sign, and maybe even the occasional naughty message to send someone, and I don't want to hang about!
This post was updated on 12 September, with thanks to feedback from Blake Roberts. The original 3 September post can be found through the Internet Archive Way Back Machine.
I use Firefox with Enhanced Tracking Protection (ETP) enabled. This is how Mozilla describes ETP:
Enhanced Tracking Protection in Firefox automatically protects your privacy while you browse. It blocks trackers that follow you around online to collect information about your browsing habits and interests... It also includes protections against harmful scripts...
So this is the browser I used when I visited the AccessiBe website to find out more about their accessibility overlay and how it works.
Except the AccessiBe overlay didn't work because it was blocked by the ETP.
When I turned off ETP and tried again, I found that the AccessiBe overlay automatically detected my screen reader and preselected a configuration based on that information.
I wrote about my thoughts on screen reader detection in 2014. Now, I wanted to understand the full impact screen reader detection could have on my personal privacy.
When the AccessiBe overlay is opened, the JavaScript it uses sets up local storage, session storage, and cookies from accounts.accessibe.com, although no persistent data seems to be stored in them. In other words, AccessiBe does not appear to use cookies or web storage for tracking (no matter if you use an assistive technology or not).
But...
The JavaScript that provides the AccessiBe overlay is loaded directly from AccessiBe's servers. This means it's a third-party JavaScript.
Third-party JavaScripts, especially those that are added by web managers outside of the development lifecycle, can pose a security risk. They can also be used to track you as you browse different websites.
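For illustration only (this is a generic, hypothetical embed, not AccessiBe's actual snippet), a third-party script include looks something like this - the script is fetched from the vendor's own servers, runs with full access to the page, and can report back whatever it observes:
<!-- Hypothetical third-party overlay embed; the URL is made up -->
<script src="https://overlay.example.com/widget.js" async></script>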
Having realised that the AccessiBe overlay is a third party JavaScript that could be used to track my browsing activity, I looked at the AccessiBe privacy policy. Here's what it says about the information it collects from the websites you visit that use the AccessiBe overlay:
When a website implements the AccessiBe Solution, it sends the following data to our servers: Image URLs, link URLs, HTML structure, CSS attributes, clicks, interactions, and pages viewed.
So, in addition to knowing you visited the website, AccessiBe also knows which pages you visited, and what you clicked on or otherwise interacted with whilst you were there.
It goes on to say:
This information does not include, actual content/text from the webpage or form or field information.
The interesting word there is "actual". AccessiBe may not collect "actual content" like an image, but according to the AccessiBe homepage:
AccessiBe uses contextual understanding and image recognition to scan and analyze the functionality of every element on your website...
So it might not collect the "actual" image file, but it says nothing about the data it collects by scanning the image to find out what's in it.
The privacy policy goes on to say:
In addition, the foregoing information does not include any Personal Information of any user and cannot be associated with any user.
This seems reassuring; until you remember that it only relates to the information AccessiBe collects from the website you visited. It does not cover the information AccessiBe collects about you:
Our servers automatically collect information when you access or use the Website and record it in log files. The log information we collect may include your IP Address, the address of the web page visited before using the Website, browser type and settings, and cookie data.
Unless you use a Virtual Private Network (VPN), the geolocation of your IP address can be used to determine roughly where you were in the world when you visited the site with the AccessiBe overlay enabled, as well as the internet service provider you were using at the time. It's likely that your IP address is shared by the other devices on your network, and so does not identify your device specifically, but it is another piece of information that can be used to profile you.
When you follow a link, your browser keeps note of the web page you were on at the time, and makes the URL available to the website you're navigating to in the HTTP referrer. This is another piece of personal information that AccessiBe has access to.
Your browser type and settings create your browser fingerprint. Like its name suggests, your browser fingerprint is like your actual fingerprint. Whenever you touch something you leave behind a fingerprint. If someone knows what your fingerprint looks like, they can track the places you've visited by matching your fingerprints to those found at different locations.
Your browser fingerprint includes information about the time zone you're in, the language you're using, the browser you're using (including the plugins you have enabled), your operating system, the device you're using, your screen resolution, and more besides.
You might think that millions of other people have the same browser type and settings as you, but according to the Electronic Frontier Foundation (PDF, 419Kb), your browser fingerprint is shared by 1 in every 287,777 people.
The population of Bristol, the region identified by my IP address, is currently 465,900. Divide that by 287,777 and you get roughly 1.6, which means there is 0.6 of another person somewhere in the area with the same browser fingerprint as me.
Now remember that as well as my IP address and browser fingerprint, AccessiBe also knows I use a screen reader, which, statistically speaking, means I'm blind or have extremely low vision.
I wonder, what are the chances of the other 0.6 of a person being a screen reader user too?
None of this information identifies me as Léonie Watson, but it does identify me as a unique individual that AccessiBe can track across the websites that use the AccessiBe overlay.
Except that the AccessiBe privacy policy also says:
We may collect information about you from other sources that may include your Personal Information. sources... such as third-party payment processors and social media services... publicly available sources; and marketing service providers... We may combine the information collected from third parties as set forth above with other information which is in our possession.
So there it is. Not only can AccessiBe identify me as a unique individual with a disability, they can also associate that information with my name, and any other information they can obtain from LinkedIn, Twitter, and other public sources.
According to the Information Commissioner's Office (ICO):
A Data Protection Impact Assessment (DPIA) is a process to help you identify and minimise the data protection risks of a project. You must do a DPIA for processing that is likely to result in a high risk to individuals.
AccessiBe's privacy policy makes me think there is a high risk to individuals, and the ICO states that a DPIA must be carried out in such cases, so I wanted to know what due diligence AccessiBe had carried out, and what ethical and legal basis they had to justify the risk.
On 17 March 2021, I sent AccessiBe a Direct Message (DM) on Twitter, asking them for a copy of their DPIA:
When AccessiBe did not respond, I completed the contact form on their website on 23 March. I was promptly called by one of their sales team, who could not tell me anything about AccessiBe's compliance with the General Data Protection Regulations (GDPR), but assured me someone would be in touch who could help.
When they had not responded by 1 April, I sent AccessiBe an open tweet asking them for a copy of their DPIA.
I'm still waiting.
If you choose not to risk letting AccessiBe have access to your personal data:
With thanks to Laura Kalbag for help uncovering what the AccessiBe JavaScript does; to Heather Burns for help understanding GDPR policy; and to Christine Runnegar for helping me learn more about privacy in general.
That is an awful lot of content that a blind person cannot see, understand, or use.
If an image in an HTML document does not have a text description, which is to say if it is missing an alt attribute (or other attribute capable of giving it an accessible name, like aria-label or aria-labelledby), the results are ugly. Different screen reader and browser combinations behave a little differently, but one common behaviour is to use the path to the image and/or the image file name instead.
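For completeness, here's a minimal sketch of those naming options (the file names and descriptions are made up for illustration):
<!-- The alt attribute is the usual way to give an image an accessible name -->
<img src="image001.jpg" alt="A tortoiseshell cat asleep on a sunny windowsill">
<!-- aria-label also provides an accessible name, though alt is generally preferred for img elements -->
<img src="image002.jpg" aria-label="Bar chart of monthly sales for 2021">
<!-- aria-labelledby points to the id of an element whose text names the image -->
<p id="fig1-caption">Figure 1: The Clifton Suspension Bridge at dusk</p>
<img src="image003.jpg" aria-labelledby="fig1-caption">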
Occasionally the file name offers a tiny clue as to the content of the image, but the days of human readable file names are largely gone and the majority of images now have nonsensical alpha-numeric strings as file names - and what "Image001.jpg" might contain is anyone's guess.
Even when there is a text description it isn't always helpful. The WebAIM Million further noted that just under 10% of images with text descriptions used descriptions that were effectively useless - things like "image", "blank", or even "filename.png".
In the absence of text descriptions provided by content authors, screen readers are turning to machine learning. Screen readers have used a form of machine learning called Optical Character Recognition (OCR) for some time. OCR examines a graphical representation of a document (like a scan of a paper document) for text content, then converts it into actual text that screen reader users can read. More recently, screen readers have begun introducing image recognition capabilities. VoiceOver on iOS uses image recognition to describe images and identify common objects like icons and buttons; Jaws Picture Smart describes images and controls sourced from files, websites, and the clipboard; and NVDA recognises objects, and can be extended further with add-ons.
As Arthur C. Clarke put it: "Any sufficiently advanced technology is indistinguishable from magic", and image recognition certainly seems like magic at first. The fact that a screen reader can examine an image and make some educated guesses about what it contains has only been possible for a few years.
Image recognition has been around since the 1960s, but it is only relatively recently that advances in machine learning and cloud computing have made it possible to add image recognition capabilities to consumer products and services. An "educated guess" is about as good as it gets though, because machine learning systems are only as good as the data they're trained with - and in the case of image recognition, which needs to be able to recognise anything that can be put into an image, we're still a long way from the accuracy of human recognition.
To help mitigate this, the Picture Smart feature in Jaws uses certainty qualifiers to describe the image recognition results. For example here are the results for this image of the Mona Lisa by Leonardo Da Vinci:
If I had not actually seen the Mona Lisa, I could be reasonably confident that it was a picture of a woman, most likely a portrait painting, but much less sure if it was a sketch or a drawing (assuming there is a difference at all). Not bad; not bad at all.
If you're thinking to yourself that providing text descriptions is no longer something you need to bother with though - think again.
Here are the Picture Smart image recognition results for this image of The Metamorphosis of Narcissus by Salvador Dali:
To borrow from Douglas Adams, it's almost, but not quite entirely nothing like it. Here is the description of the painting from Wikipedia:
In Dalí's painting, he depicts the figure of Narcissus on the left side of the canvas crouched by a lake, with his head resting on his knee, and a stone hand clutching an egg mirroring the shape of his body on the right. From out of the cracked egg, a narcissus flower sprouts. In the mid-ground of the painting stand a group of Narcissus's rejected suitors. Among the mountains in the background rests a third Narcissus figure.
So here's the thing - image recognition in screen readers is a massive improvement over the absence of anything better, but it isn't better than a text description provided by a content author who knows exactly what's in the image, why it's being used, and the context it's being used in.
For those of us that do not process visual data because some part of the infrastructure is missing or damaged, the act of looking at something, processing that information, deciding what to do with it, then filing it away for future reference, just doesn't happen.
In recent years Artificial Intelligence (AI) has begun to solve a small part of the problem. Apps like SeeingAI from Microsoft (an iOS app that uses image recognition) have made it possible for blind people to read text and handwriting, find objects and people, identify paper money in a variety of currencies, detect light sources and colours, and scan product bar codes.
As astonishing as this is, and believe me it really is astonishing, the apps all have one disadvantage - you have to point your phone at the thing you want to look at. Depending on the situation this can be inconvenient, socially awkward, or downright unsafe.
Imagine these scenarios:
How much less inconvenient, socially awkward, or unsafe would it be if you could keep your phone in your pocket and just do what everyone else does - look at the thing you're interested in?
Enter Envision Glasses. Envision Glasses use Google Glass Enterprise Edition 2 and Envision AI image recognition to create a wearable device that lets you look at something, and find out what it is or what it says.
Envision Glasses are available in two styles: the standard Titanium frames with no lenses, or a set of "designer" Smith Optics frames with lenses that can be swapped with your own prescription lenses.
The frames consist of the arm that goes over your left ear and the bits that frame your eyes. The rest, known as the body, attaches to the frames just over your right eye to form the arm that hooks over your right ear.
The body is made of black plastic and it's a lot more substantial than the frames because it's where all the hardware (including the camera and speaker) is found. When attached to the frames the body curves smoothly around the top of the frame over your right eye by about a centimetre. This is where the camera is found so it points forward in the direction you're facing. Continuing about two thirds of the way along the frame over your right eye is a clear perspex block that creates a Heads Up Display (HUD).
If you have usable sight, the HUD is in the top right corner of your field of vision. Architecturally Envision Glasses are based on the "screen first" model of the Google Glass but with a built-in screen reader, which is why the help and instructions mention screens so often, though the experience is seamless whether you use the HUD or not.
I chose the standard titanium frames because the little jewellery I wear is platinum or silver and very plain, so they felt more my style than the darker and heavier Smith Optics frames (think NHS specs from the 1970s or classic Ray Ban frames).
The aesthetic imbalance between the light metal frames and the heavier black plastic body, particularly where it curves around the top of one eye, makes me feel a bit like I've been assimilated by the Borg, but I'm actually OK with that. The hardware has to go somewhere and I think the glasses manage it in a reasonably stylish way.
The body is about 5cm longer than the other arm of the Envision Glasses, so it clears the back of my head by perhaps 2cm. This can't be helped and for people with larger heads than mine I daresay it's much less noticeable. At 45g the body is also heavy enough that it makes the top of my ear uncomfortable after a short amount of time, though I assume I'll get used to this with wear.
The Envision Glasses come neatly packaged with the frames and body in separate boxes (with the USB-A to USB-C charging cable in the same box as the body). Short instructions in print and Braille are included, but the essential information is that you use the Envision AI app to get everything set up.
With the Envision AI app installed go to the Glasses tab, then choose Pair Envision Glasses and follow these steps:
Once your Envision Glasses and the Envision AI app are paired a short tutorial begins on the glasses. The touch pad is found on the outside of the body, and Envision Glasses support the following touch gestures:
The main menu has the following options: Home, Read, Identify, Find, Call, Device Settings, Feature Preferences, and Help.
Under Device Settings you can configure the speed and volume of the audio; configure wifi and Bluetooth; choose another language (from English, Dutch, German, French, Spanish, and another language I do not recognise but which might be Chinese); find the software version and check for updates; check the battery and put the glasses into sleep mode; and turn the HUD on/off.
Tapping the side of the glasses felt a bit strange at first but I soon realised I was doing it without conscious thought, and compared to taking out my phone and pointing it at whatever I'm interested in, it feels like the epitome of discretion itself!
When you turn Envision Glasses on it takes about a minute for them to boot up. There is no audio confirmation that the boot sequence has begun, though lights do turn on should you be able to see them. I cannot, so it took a few attempts before I was able to turn them on for the first time. Even now, having turned my Envision Glasses on/off several more times I still find it an uncertain process.
When Envision Glasses finish booting an audio welcome is played, forming part of the sound scheme designed by Andre Louis. The sound effects are informative without being intrusive, helping you make sense of what the glasses are doing - taking a picture or processing information for example.
The features in the main menu are Read, Identify, and Find.
When you select Read from the main menu there are 3 options: Instant Text, Scan Text, and Batch Scan.
Instantly reads short chunks of text that come into camera range. To use Instant Text:
You can use Instant Text in offline mode. To enable this:
Reads longer chunks of text by scanning the document. You can use Scan Text with or without text detection, but text detection takes a lot of the guess-work out of the process. To use Scan Text with text detection:
The text is then processed and opened in the reader. The following gestures are then available:
If text detection is not turned on you'll hear a countdown after you double tap on Scan Text, and at the end of the countdown a picture will be taken of whatever is in front of you. To turn text detection on/off:
Reads multi-page documents by scanning them in succession. To use Batch Scan:
It turns out I'm about 20 years out of practice finding a line of sight so at first I was incredibly frustrated because I could not read anything. After my ever patient husband explained I was staring off into the middle distance instead of the thing I was trying to read, things improved dramatically and so far I've found the text recognition to be tolerably accurate using both Instant Text and Scan Text.
When you select Identify from the main menu there are 2 options: Describe Scene and Detect Colours.
Briefly describes the scene in front of you by taking a picture. To use Describe Scene:
Instantly detects the colour of whatever is in front of you. To use Detect Colour:
Scene descriptions are very succinct (for example "Person standing in a door" or "Flat screen TV on a wall"), but as someone who can see nothing at all I am overjoyed by the ability to understand even the smallest thing about my surroundings. It would be good to have the option to choose lengthier descriptions, but there is a lot to be said for brevity too.
The short wait whilst the image is processed is mitigated by the fact a double tap is all that's needed to request another scene description, and being able to do it just by looking at what I'm interested in and tapping a couple of times on my glasses is nothing short of happiness in a pair of spectacles.
Detect Colours needs good light and without it Envision Glasses struggle for accuracy. Looking at the same bit of blank wall in a room with good artificial light, the Detect Colours feature reported a constantly updating chain of olive green, chocolate brown, dark khaki and more, for a wall that is essentially cream albeit with some shadows from the lights. It's good enough to get a rough idea of what colour something is, but I don't think I'd trust it to match paint or choose an outfit - though given the fact I only wear black I suppose that's not too much of a problem really!
When you select Find from the main menu there are 3 options: Find Objects, Find People, and Explore.
Scans for a limited set of objects in your immediate vicinity. To use Find Objects:
Envision Glasses can currently recognise the following objects:
Detects people in your vicinity. To use Find People:
You must not add anyone to your library without their consent. If you think about it, stealthily taking pictures of a person you cannot see in the first place is unlikely to be a successful enterprise no matter how much you might wish it were otherwise, but still...
You have to use the Envision AI app and your phone to add someone to your library; you can't do it using the glasses. To add someone to your library:
You'll then need to take at least 5 pictures of the person. When the phone detects that there is a person within the camera's range the Take Picture button is enabled so you can do the honours. When all the pictures have been taken you can add a description such as the person's name. Next time you use Find Person and someone in your library is recognised the Envision Glasses will say "Looks like X".
Scans your environment and tells you about objects. To use Explore:
The set of objects the Envision Glasses can find is incredibly limited and at present it isn't possible to add your own objects to a library. Envision say they are planning to add new objects, and that they're looking at the possibility of letting you add your own.
I thought it might be possible to use the Teach Envision function to add an object instead of a person to my library, but since the Take Picture button is disabled unless there is a person in view, that particular hack didn't work.
The Find Object feature also seems to have very particular ideas about what certain objects look like. It identified my laptop without difficulty, but no matter how long I stared at my teacup and saucer it didn't seem to think there was a cup in sight, and it didn't seem to think the armchair in my living room was a chair either.
Interestingly, Explore does not seem to be limited to the same set of objects as Find Object. On a wander around the house I saw a bed, a couch, a toilet, and a refrigerator, as well as my laptop again, two sorts of TV, and a table (still no sign of a chair apparently). The experience was only slightly marred by the fact I was staring at a wall when it told me I was looking at a couch (and the nearest couch was behind me and around a corner), and I was in our office when it said there was a fridge and a toilet. I might be blind but I'm reasonably sure we do not have a toilet in our office, or a fridge come to that.
When you select Call from the main menu there are 2 options: Call an Ally and Add an Ally.
Lets you call someone you've added as an ally so they can see the view from your glasses using the Envision AI app. To use Call an Ally:
Adds a trusted person as an ally you can call on when you want an actual pair of eyes on the job. To use Add an Ally:
Your chosen ally will need to install the Envision Ally app and open the URL they receive. Once this happens they should appear in your list of allies to call using your Envision Glasses.
You can send ally requests using email or the messaging apps available on your phone. The person I sent the first request to wasn't able to get the URL to work at first, but after a conversation with Envision we realised it was because the ally needs the Envision Ally app not the Envision AI app.
Being able to call someone and have them see what you see is phenomenal. Even wandering through the house (lock-down #2 is in full effect), the potential is obvious; being able to negotiate an unexpected obstacle, navigating an unfamiliar space to get to my destination, or discovering how something works (like the tea machine in a hotel room when I'm travelling solo), are just a few of the possibilities.
There is also an element of personal security. Knowing someone can help you if you feel vulnerable (and getting lost in an unfamiliar place is still one of the most disorientating things I ever experience), or having someone take a proper look around you to assure you that you're not in danger or at risk, is an amazingly liberating feeling.
When the Envision Glasses arrive they have no charge, so the first thing you should do (even before you install the Envision AI app) is plug them in to charge for a bit. They charge quite rapidly though, so you won't have to wait long.
I haven't had the chance to really put the Envision Glasses through their paces yet, but based on my casual use around the house over the past couple of days, I think you'd probably get at least 4 or 5 hours from a full charge. I did notice that when using the Envision AI app, particularly during the setup and pairing phase, it hammered my phone battery into nothing. I have an iPhone 8 running iOS 14 though, so that's almost certainly a factor too.
The Envision Glasses need an internet connection for the most part. Some features like Instant Text, Detect Colours, and Find Objects work offline, but otherwise you need to be on a wifi network. I thought that switching to data when my phone left wifi range would do the trick, but it seems not - though I wasn't in a position to investigate at the time. The user manual explains you can use your phone as a hotspot, but I haven't tried that yet.
Envision Glasses are expensive. There is no other way to put it. The Super Earlybird price of EUR1,499 was eye-watering, and the pre-order price of EUR1,899 is enough to make your teeth itch.
The economics of assistive technologies have not changed over the years. The audience is small, the cost of production is high, and unless you are a large tech giant or on the receiving end of enough funding to make it possible to ship something for free, someone has to pay the cost, and in this case it's the consumer.
You do get a lifetime subscription to the Envision AI service which is something, though given that SeeingAI is free, making Envision AI a subscription service is a hard sell in the first place. We're back to the same problem though - Microsoft can afford to underwrite the cost of production for SeeingAI, I imagine Envision is not able to do the same for its app/service.
Envision does not have a privacy policy on its website or in the app. There is a short knowledge base article that states that no personal data is stored in the cloud, but that's it.
I've contacted Envision to suggest they provide a more detailed privacy policy and that they make it more easily accessible from the website and the app. I'll update this section if anything changes.
In terms of the audio output from the Envision Glasses it can likely be heard by anyone who is nearby. Obviously it depends on the volume setting, and you can obfuscate what's being said a bit by choosing a fast speech rate (unless the people nearby are also accomplished screen reader users), but essentially you should expect that your Envision Glasses will be overheard unless you use them with a Bluetooth earpiece instead - and a quick experiment with a pair of AirPods proved successful.
I've been waiting a long time for something like these to come along. Perhaps since the first time I watched Star Trek: The Next Generation after I lost my sight and found myself wishing for Geordi La Forge's visor.
The lack of a privacy policy from Envision is troubling, so until I have more information about what happens to my data, I'll use my Envision Glasses with greater than usual caution.
If you're wondering if Envision Glasses are worth it, for me I think the answer is yes.
The Envision Ally feature on its own is a huge part of the value for me. EUR1,899 is an awful lot of money, but if AIRA was available in the UK it would cost about EUR1,020 for a standard 12 month subscription, or EUR2,040 for an advanced subscription (based on converting the USD subscription costs). COVID-19 notwithstanding I travel a lot, and the ability to call someone for assistance, especially when my hands are full of my white cane, suitcase, rucksack, or other paraphernalia, or when I'm somewhere where I really don't want to hold my phone out in front of me, has immense value - and Envision Glasses are a one-off cost as opposed to an ongoing subscription.
Then there is the other value - the value of feeling like I'm another step closer to barrelling through the world like I used to. I can walk around, look at things, find out what they are or what they say (more or less), and I don't have to feel awkward or unsafe as I do it. That isn't something I've been able to do since the turn of the century when I last actually saw the world.
Yes, the Envision Glasses are limited. Unlike SeeingAI, Envision AI does not recognise money, and although the Envision AI app has a Scan Barcode option it is not present on the Envision Glasses. I mentioned the Find Object limitations earlier, and the brevity of the Describe Scene descriptions, but these are all things that can (and will) improve with time and advances in technology.
There is often a tendency to criticise things that are less than perfect. When image recognition was first introduced, it was criticised for its lack of detail, but mostly by people who could see the images being described. For those of us that otherwise had to put up with garbage information or no information, it was a revelation. As Gez Lemon remarked to me at the time, if you need to get some place and your choice is between a rusty bike and not going anywhere, a rusty bike looks like a damned good option.
So in spite of their limitations, and because of all they do, and all they will be able to do, and because a little bit of me really did want to be 7 of 9, it's fair to say that I think my Envision Glasses are really quite marvellous.
Web Monetization is a proposed standard being incubated at the W3C. It lets browsers securely pay for content, and that in turn enables content creators to choose how to respond - by taking down a paywall, removing advertising, or revealing additional content for example.
Coil is a service that uses Web Monetization to let content consumers pay content creators. You pay a monthly subscription of $5, install the Coil browser extension, and then keep browsing exactly as you always do. Only now something cool, or should that be Coil, happens in the background - whenever you visit a website that has Web Monetization enabled a tiny (really tiny) payment is sent from your browser to the content creator.
To monetize your content you need a digital wallet from a provider that supports the Interledger Protocol, for example Uphold or Gatehub. When your wallet is created it's given a public address known as a payment pointer, so the browser knows where to send payments.
You then use the meta element to add your payment pointer to your website, like this:
<meta name="monetization" content="$ilp.uphold.com/dPdrQinmELyw">
Note: that example uses my payment pointer. You're welcome to use it of course but...
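If you want your site to respond when a payment stream starts - by hiding advertising, for example - here is a minimal sketch using the events from the proposed document.monetization API (the "ads" element id is made up for the purposes of the example):
// Only do anything if the browser (or the Coil extension) exposes the proposed API
if (document.monetization) {
  document.monetization.addEventListener('monetizationstart', function () {
    // A payment stream has started, so hide the hypothetical advertising container
    var ads = document.getElementById('ads');
    if (ads) {
      ads.setAttribute('hidden', true);
    }
  });
}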
I've had this website since 1998 and I've been creating content here and elsewhere for much of that time. I've never seriously considered making a living from writing; I've been having too much fun with my chosen career for one thing. With Coil and Web Monetization I might be able to make the odd penny or two, but that is not why I decided to monetize this website.
I am under no illusion that writing will ever be a major source of income, in fact if I ever receive enough to cover my own monthly subscription to Coil I'll be a happy person. But I do think that Coil and Web Monetization are at the vanguard of a quiet revolution.
I decided to monetize this website and to become a Coil member because I've had enough of being tracked across the web, had enough of having my personal information bought and sold, had enough of being pestered to buy the washing machine I bought last week or the greatest hits album of a band I once listened to in 2008, and most of all because I've had enough of being bombarded and battered by online advertising.
Like all revolutions this one needs a critical mass to have the most impact. So I hope you will join me and others to make it happen. We Coil?
For the purposes of this post I'm going to use NVDA. Jaws uses the same keyboard commands in this situation, which is useful because between them they represent more than 80% of the screen reader market on desktop/laptop devices. Check the documentation for VoiceOver and Narrator to find the corresponding keyboard commands for those screen readers.
One more thing before we get started; there is a difference between keyboard and screen reader navigation. Although most screen reader users use a keyboard not a mouse, they are not restricted to the same limited set of keyboard commands as other keyboard users. It also means that keyboard focus and screen reader focus are not the same thing.
Contrary to what you might have heard, you do not need to make each cell of a table focusable with a keyboard to aid navigation. If the cell contains a focusable and interactive element that's OK, but if it contains non-interactive content it is likely that you will make keyboard users work much harder to navigate the table than you intended.
Let's use the following table as our working example:
<table>
<caption>Average daily tea and coffee consumption</caption>
<tr>
<th>Person</th><th>Coffee</th><th>Tea</th>
</tr>
<tr>
<th>Njoki</th><td>5 cups</td><td>0 cups</td>
</tr>
<tr>
<th>Iesha</th><td>1 cup</td><td>2 cups</td>
</tr>
<tr>
<th>Léonie</th><td>0 cups</td><td>25 cups</td>
</tr>
</table>
Press t to move screen reader focus to the table and NVDA will say:
Average daily tea and coffee consumption
Table with 3 columns and 4 rows
Average daily tea and coffee consumption caption
NVDA starts by announcing the table's accessible name, then that there is a table with 3 columns and 4 rows, before repeating the accessible name again but this time indicating it is the table's caption.
All of this information comes from the browser; the table's accessible name is taken from the caption element, the implicit role of the table element is "table" (which is how the screen reader knows what it is), and the browser tells the screen reader how many rows and columns there are based on the number of td and tr elements.
The repetition of the table caption makes NVDA a little verbose, but it is an example of a common phenomenon - screen readers often do the same thing in slightly different ways. Jaws for example announces this when you press the t key:
Table with 3 columns and 4 rows
Average daily tea and coffee consumption
Column 1, row 1
Person
Jaws does not repeat the caption, but it does not indicate that the name of the table is a caption either. It also takes the additional step of moving screen reader focus to the first cell in the table and announcing its coordinates and content, where NVDA only moves screen reader focus to the table (not into it).
These differences are entirely normal and nothing to worry about. The important question to ask when you're testing with any screen reader is: will someone who uses this screen reader know what kind of content they're dealing with?
Whether you're using NVDA or Jaws, you'll know that there is a table, the table's dimensions, and something about the table content. In other words, you'll have the same information someone sighted might discover by glancing at it.
Use the down arrow key to move NVDA's focus into the table and NVDA will say:
Row 1, column 1
Person
Now the screen reader focus is in the first cell of the table, the next step is to explore the table and orientate yourself so you know what data the table contains.
Use the keyboard command control alt + right arrow to move one cell to the right, and NVDA will say:
Column 2
Coffee
Then repeat the same keyboard command until NVDA says:
Edge of table
Then use control alt + left arrow to reverse direction until NVDA says:
Edge of table
Now use control alt + down arrow to move one cell down, and NVDA will say:
Row 2
Njoki
Repeat the same keyboard command until NVDA says:
Edge of table
Then use control alt + up arrow to reverse direction until NVDA tells you you're at the edge of the table again.
These four keyboard commands let you move left/right through rows and up/down through columns, and with them you have the basic method of navigating data tables with NVDA (and Jaws).
Let's say you want to find out how much coffee Njoki drinks. Use control alt + down arrow until NVDA says:
Row 2
Njoki
Then use control alt + right arrow and NVDA will tell you:
Coffee
Column 2
5 cups
Even though the screen reader is focused on the td element that represents the cell containing the number of cups, not the th element that represents the column header, the screen reader uses the information it got from the browser to create an association between the two. The screen reader counts on you remembering which row you're in, and just announces the column header as well as the cell content as you move horizontally across the row.
Now you might want to know how many cups of coffee Iesha drinks. Use control alt + down arrow to move down one cell and NVDA will say:
Iesha
Row 3
1 cup
This time the screen reader associates the th element that is the row header with the contents of the td it is focused on, and speaks both pieces of information. As before, the screen reader counts on you remembering which column you're in, and just announces the row header as you move vertically through the column.
This ability to navigate the table using these commands is entirely dependent on the proper HTML (or equivalent ARIA roles) being used; especially the th elements that act as row and column headers.
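If you want to make the direction of each header explicit - helpful when a table has both row and column headers, as this one does - the scope attribute can be added to the th elements. A sketch based on the table above:
<tr>
<th scope="col">Person</th><th scope="col">Coffee</th><th scope="col">Tea</th>
</tr>
<tr>
<th scope="row">Njoki</th><td>5 cups</td><td>0 cups</td>
</tr>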
Try it for yourself with this demo data table, or watch this screen reader demonstration of the same thing.
This post is about aria-label and aria-labelledby, and how to choose the right one.
The aria-label and aria-labelledby attributes are both used to give an element its accessible name. An element's accessible name is a piece of text that differentiates one instance of that element in a document from the rest; the text of a link or the text description for an image, for example.
The difference between aria-label and aria-labelledby is where they get that piece of text, and the clue is in the name. If you think of the accessible name for an element as its label it becomes more understandable. The aria-label attribute gives an element its label; an element with the aria-labelledby attribute is labelled by something else.
The aria-label attribute takes a piece of text as its value. This text becomes the element's label or accessible name. For example, the accessible name of the button in the following example is "Search":
<button id="search" aria-label="Search"></button>
The aria-labelledby attribute points to another element in the document. It takes the value of the id attribute on that other element as its value, creating a relationship between the two elements. The text contained in the other element becomes the label or accessible name for the element with aria-labelledby applied to it. For example, the accessible name for the button in the following example is also "Search":
<h2 id="searchLabel">Search</h2>
<label for="search">Enter search term
<input type="search" id="search">
</label>
<button aria-labelledby="searchLabel"></button>
When it comes to deciding which attribute to use, consider these things:
- Is ARIA the right solution at all?
- Does the piece of text already exist in the document? If yes, use aria-labelledby; if no, use aria-label.
The first question references the First Rule of ARIA:
If you can use a native HTML element or attribute with the semantics and behaviour you require already built-in, instead of repurposing an element and adding an ARIA role, state or property to make it accessible, then do so.
There are ways an element can be given an accessible name without using aria-label or aria-labelledby. For example, put text inside a link or button, use the alt attribute to give an image a text description, or match the for attribute on a label element with the id attribute of the form field it relates to.
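By way of illustration, here is a quick sketch of those three native techniques (the link, image, and form field are invented examples):
<!-- The link text becomes the accessible name -->
<a href="/contact">Contact us</a>
<!-- The alt attribute gives the image its accessible name -->
<img src="logo.png" alt="Company logo">
<!-- The label element gives the field its accessible name -->
<label for="email">Email address</label>
<input type="email" id="email">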
If you think ARIA is the right solution, then the second question is whether the piece of text already exists in the document. Generally speaking it's better to reuse than duplicate, so using aria-labelledby to associate the piece of text with the element makes sense if the text already exists. If it does not exist elsewhere, then use aria-label.
The aria-label and aria-labelledby attributes do not work consistently with all HTML elements.
The aria-label and aria-labelledby attributes will override any other accessible name assigned to the element. ARIA always takes precedence over native HTML semantics. The accessible name in the following example is "This", despite the content of the button:
<button aria-label="This">That</button>
After browning the veggies and pushing them to the side of the pan (step 5 above):
When using meat, you don't need as much of the savoury flavourings, or the lentils and kasha (which add a lot of the meaty texture to the vegetarian version). However, it definitely doesn't hurt to add it all in, for an extra hearty meal!
This is one of those recipes that is slightly different every time I make it. Some ingredients were taken from an article in Cook's Illustrated magazine, but otherwise the recipe has evolved from experience. I often make it when it's been a long time since a grocery run and fresh veggies are running out, so I make it with what I have on hand. Pretty much everything can be omitted or adjusted, although I make no guarantee that the end result will still look like a classic chilli.
Beyond the veggie-or-not decision, the biggest choice you make is crushed (puréed) or diced (chunky) canned tomatoes. Crushed tomatoes make a classic thick and hearty chilli; diced tomatoes make more of a vegetarian stew. However, with crushed tomatoes you run the risk of it tasting more like tomato sauce than chilli, especially for the vegetarian option, so you need to be more careful about balancing your other savoury flavours.
The goal of the different savoury liquids (including the beer) is to add a rich, deep flavour without tasting like any one particular thing. If you have other “umami” ingredients in your pantry, like marmite or MSG, then they can probably go in the mix, too. None of the ingredients are essential on their own, but it is essential that you have a mix so they cancel each other out. You don't want chilli that tastes like beer. You don't want chilli that tastes like vanilla, or soy sauce. But if you mix them all together, the unique flavours disappear and what's left is a warm savoury taste.
After you add the savouries, stir well for a minute, then smell or taste: if you can still detect any one of the ingredients distinctly, add something to contrast. If it's too acidic, add some baking soda. And of course, feel free to add more spices if that's your taste. I usually go for a warm but not hot level of spiciness, and serve with hot sauce for those who like it more intense.
The chilli stores well in the fridge or freezer and reheats in the microwave.
PS. Chilli, chili, or chile? The folks at Merriam-Webster explain they're all the same, just different regional spellings.
You can use pork belly instead of shoulder for this, and plenty of recipes do, but I tend to think the fat to meat ratio of pork shoulder is better.
You can experiment with the herbs. The first time I made rillettes I used fresh Thyme amongst other things. It tasted good, but it turned the fat a slightly odd shade of green! A bouquet garni works well, not least because it's like herbs in a teabag that you can easily fish out of the casserole once the meat is cooked.
It's worth mentioning that when you shred the pork it will be at room temperature. On the basis that the rillettes will be served cold from the fridge, you might want to be generous with the salt you add. Salt as a flavour is less noticeable when foods are cold. Don't over-do it though - it's better to be cautious at this stage and sprinkle a little salt on as you eat it, than add too much and ruin the whole batch now.
People often include screen reader users in the much larger group of keyboard-only users. Whilst this is correct (most screen reader users don't use a mouse), it also creates a false impression of the way screen reader users navigate content.
To help explain this, I'm going to generalise and refer to the two following groups of people:
This is a massive over-simplification of the different groups, but the ability to see the content or not is the crux of the difference in navigation.
Keyboard users navigate through content using a limited set of keyboard shortcuts:
This isn't a very refined way of navigating content, and it isn't without problems, but it generally works if you can see the content as it moves into view. That, of course, is where screen reader users come unstuck.
Apart from things like live regions, screen readers only speak the content they're focused on, and here we need to draw an important distinction: keyboard focus and screen reader focus are not the same thing!
Keyboard focus is restricted to tabbing between focusable elements. If a screen reader user uses the tab key to navigate content, all they will hear is the name of each focusable element as it receives keyboard focus. What they won't hear is all the other content like text, headings, and images.
When using the tab key, keyboard focus and screen reader focus are synchronised with each other. The rest of the time, screen reader users have an enormous range of commands at their disposal for reading and navigating content independently of keyboard focus. The commands vary between screen readers, but they all have one thing in common: they're tied to different HTML elements. There are commands for moving screen reader focus between things like headings, images, paragraphs, sectioning elements, lists and listitems; for moving between tables, as well as the rows and columns inside them; for moving to specific types of form field, like checkboxes, radio buttons, text fields, or buttons; and many more commands besides.
You can see a screen reader in action in this Smashing TV webinar. It's a little long, but there's a short extract available too.
Whether someone is a keyboard user or a screen reader user, the importance of HTML cannot be emphasised enough. Without well-formed HTML that uses the appropriate element for the purpose, screen reader navigation breaks down completely, and keyboard navigation is at a high risk of doing the same.
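As a rough sketch (the content is invented), markup like this gives screen reader users headings, landmarks, lists, and links to move between, as well as everything the tab key can reach:
<nav aria-label="Site">
<ul>
<li><a href="/">Home</a></li>
<li><a href="/articles">Articles</a></li>
</ul>
</nav>
<main>
<h1>Articles</h1>
<h2>Latest</h2>
<p>...</p>
</main>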
The ARIA1.1 specification describes the aria-roledescription attribute like this:
Defines a human-readable, author-localized description for the role of an element. Some assistive technologies, such as screen readers, present the role of an element as part of the user experience. Such assistive technologies typically localize the name of the role, and they may customize it as well. Users of these assistive technologies depend on the presentation of the role name, such as "region," "button," or "slider," for an understanding of the purpose of the element and, if it is a widget, how to interact with it.
Let's take a link as an example:
<a href="https://tink.uk">tink.uk</a>
The screen reader takes the element's role ("link") and uses it to tell the user what kind of thing they're dealing with. The user can then make an informed decision about what to do with the element (in this case activate the link or ignore it).
Screen readers do this for most HTML elements. In other words, when you can't see the element on-screen, the role is critical to understanding the nature of the element. So why would anyone want to override such useful information?
The ARIA1.1 specification offers two possible use cases for aria-roledescription:
- A section element, when it's used to represent a slide in an HTML slide deck;
- A button element, when the button is used to manage an email attachment.
The first use case looks something like this:
<section aria-roledescription="slide" aria-labelledby="s1">
<h2 id="s1">Accessibility mechanics</h2>
...
</section>
The implicit role of the section element is "region", so without the aria-roledescription attribute, screen readers would announce this as "Accessibility mechanics region". With the aria-roledescription attribute, the announcement becomes "Accessibility mechanics slide" instead.
The second use case looks something like this:
<button aria-roledescription="attachment button">this.txt</button>
The implicit role of the button element is "button", but with the aria-roledescription attribute the role announcement has been extended to "attachment button". On encountering this particular button, a screen reader would therefore announce "this.txt attachment button".
It can be argued that in both cases, the alteration is worthwhile, because it provides the user with more specific or accurate information about the thing they're dealing with.
The trouble with aria-roledescription is that it can too easily be used to break accessibility for screen reader users. Let's take a theoretical example:
<button aria-roledescription="slice">Pizza</button>
It's a button that you might find on a delivery restaurant website, for adding a slice of pizza to your order. Except that a screen reader will announce "Pizza slice", with no mention of the word "button" at all.
So how is a screen reader user supposed to know they should use it like a button?
The ARIA1.1 specification acknowledges some of the risk:
The aria-roledescription property gives authors the ability to override how assistive technologies localize and express the name of a role. Thus inappropriately using aria-roledescription may inhibit users' ability to understand or interact with an element. Authors SHOULD limit use of aria-roledescription to clarifying the purpose of non-interactive container roles like group or region, or to providing a more specific description of a widget.
This really needs to be said much more explicitly though: do not use aria-roledescription to change the way a role is announced, if it will prevent the user from knowing what to do with it.
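In the pizza example, a safer approach (just a sketch) is to leave the role alone and let the accessible name do the work, or - if a custom description really is wanted - to keep the word "button" in it, as the spec's own attachment example does:
<!-- The role announcement stays intact -->
<button>Add a slice of pizza</button>
<!-- Or keep the role word in the custom description -->
<button aria-roledescription="pizza slice button">Pizza</button>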
The ARIA1.1 specification goes on to place constraints on the way authors use aria-roledescription:
When using aria-roledescription, authors SHOULD also ensure that:
- The element to which aria-roledescription is applied has a valid WAI-ARIA role or has an implicit WAI-ARIA role semantic.
- The value of aria-roledescription is not empty or does not contain only whitespace characters.
It then places stronger constraints on the way browsers handle the aria-roledescription attribute:
User agents MUST NOT expose the aria-roledescription property if any of the following conditions exist:
- The element to which aria-roledescription is applied does not have a valid WAI-ARIA role or does not have an implicit WAI-ARIA role semantic.
- The value of aria-roledescription is empty or contains only whitespace characters.
The difference in the language is subtle: in W3C language, things marked "SHOULD" (in uppercase) are advisory, but things marked "MUST" (in uppercase) are required. So in this case, authors are advised against using aria-roledescription on elements without an implicit or explicit ARIA role (refer to the HTML Accessibility API Mappings specification for details) and against giving aria-roledescription an empty or whitespace value, whereas browsers are explicitly prohibited from exposing the aria-roledescription attribute if either of those conditions fails.
One use case worth exploring is WebVR. As we start to see more Virtual Reality (VR) content on the web, we're going to need ways to identify all kinds of objects; everything from a pair of jeans to the Ravenous Bugblatter Beast of Traal. Some of those things will be interactive and some will not, but most will be things outside of the standard set of UI components ARIA is able to handle.
WebVR is scripted, and attached to the canvas element. This means that objects inside the WebVR have no DOM, so for now it's necessary to put fallback content inside the canvas element.
<canvas id="webgl-thing">
<div role="img" aria-roledescription="Ravenous Bugblatter Beast of Traal" aria-label="A red monster with lots of teeth"></div>
</canvas>
This example assumes there is a single static object inside of the WebVR, so it's a long way short of production reality, but it does mean the object inside the WebVR would be announced by a screen reader as "A red monster with lots of teeth; Ravenous Bugblatter Beast of Traal". The ordering of the information isn't ideal, but it does at least communicate what the object is, and give a brief description of it.
Note: it's necessary to include role="img" on the div in order to conform to the requirement that aria-roledescription is only used on elements with an implicit or explicit ARIA role (div elements do not have an implicit ARIA role, so one is provided explicitly using the role attribute).
For now though, the use cases for aria-roledescription are there, but they're not particularly compelling. In the future this could well change, but in the meantime think very carefully before you decide to change the way an element's role is announced by screen readers!
The example comes from the W3C Process document, and a flowchart that describes the process of taking a specification from its First Public Working Draft (FPWD) through to official W3C Recommendation (Rec). The original SVG can be found in the W3C Process 2015 document, and a revised version (with a first attempt at accessibility) is in the W3C Process 2017 document.
SVG lacks the semantics to express structures like line graphs, bar charts, or flowcharts, and so SVG data visualisations are difficult for screen reader users to understand. The structure of a flowchart can be represented as a series of nested lists, so we can use ARIA to impose these semantics for the benefit of screen reader users.
The following ARIA roles and attributes were used for the W3C flowchart:
The first step was to create the skeleton structure of the flowchart, using the g element as the base. Here's a fragment:
<g role="list">
<g role="listitem"></g>
<g role="listitem">
<g role="list">
<g role="listitem"></g>
<g role="listitem"></g>
</g>
</g>
</g>
Once the existing SVG content had been moved into the new structure, the img role was applied to the first child element inside each link (more on the links later), and the title element used to give the element its accessible name, to encourage screen readers to recognise the listitems.
The aria-hidden attribute was then used to hide the text element from screen readers (because it duplicated the accessible name of the title element).
<g role="list">
<!-- FPWD -->
<g role="listitem">
<a xlink:href="#wd-1">
<rect role="img" fill="white" y="122" width="57" height="45">
<title>First Public Working Draft (FPWD) - Exclusion opportunity</title>
</rect>
<text font-size="8">
<tspan aria-hidden="true" y="132" x="0">First Public WD</tspan>
<tspan x="0" y="152">WG decision</tspan>
<tspan x="0" y="162">Director's approval</tspan>
</text>
<path d="M0,140h53" fill="none" stroke="#000"></path>
<polygon points="47,136 57,140 47,144"></polygon>
</a>
</g>
<!-- WD -->
<g role="listitem">
<a xlink:href="#cr-1">
<ellipse role="img" id="wd-1" ry="18" rx="38" cy="140" cx="97" stroke="black" fill="#fff">
<title>Working Draft (WD)</title>
</ellipse>
<text aria-hidden="true" font-size="14" y="144" x="97" text-anchor="middle">WD</text>
</a>
<g role="list">
<!-- New WD -->
<g role="listitem">
<a xlink:href="#wd-1">
<g role="img">
<title>Publish a new Working Draft</title>
</g>
<text font-size="8"><tspan x="30" y="92">WG Decision: review needed, or</tspan>
<tspan x="40" y="100">No change for 6 months</tspan>
</text>
<path d="M78,124C73,114 79,104 97,104 108,104 115,108 117,114" fill="none" stroke="black" stroke-dasharray="6 1"></path>
<polygon points="120,114 116,124 114,113"></polygon>
</a>
</g>
<!-- Advance to CR -->
<g role="listitem" fill="#060">
<a xlink:href="#cr-1">
<g role="img">
<title>Advance to Candidate Recommendation</title>
</g>
<text x="138" y="134" font-size="8">Director's approval</text>
<path stroke="#060" d="M135,140h81"></path>
<polygon points="211,136 221,140 211,144"></polygon>
</a>
</g>
</g>
...
</g>
...
</g>
The SVG flowchart now has a semantically meaningful structure, but to really understand the data in a flowchart, it's helpful to be able to navigate between siblings, as well as to follow a parent/child path through it.
The flowchart in the W3C Process document attempts to provide this form of navigation using links. This makes it possible to navigate directly between the siblings in the first layer of the flowchart (using the tab key), but it also introduces some different problems:
Although the keyboard experience still leaves something to be desired in SVG, adding ARIA goes some way to making the content more usable by screen reader users. The ARIA enabled version is included in the W3C Process 2018, and a demo version is also available. If you find any issues, you can file them on Github.
I started making this years and years ago, and the whole family enjoys it, especially the kids. It was one of Rebecca’s favorite dinners. The recipe is quick and easy and very robust, easily able to withstand adjustments. I’ve made it without the onion and it was different but still tasty. I’ve used sun-dried tomatoes with a little water and the result was lovely. I once even added water to tomato paste to make a tomato sauce, there being nothing else on hand, and it was not bad—not recommended, but not bad.
Basically, if you can get the cinnamon, honey, and red wine vinegar in with some tomato, you’re at least 80% of the way there. I do strongly recommend the step of dissolving the honey in the red wine vinegar, though. I added them separately a couple of times, and the flavors didn’t really come together.
When my wife Kat went mostly-vegetarian, I started using tofu for her portion, and sometimes mine. We start with extra-firm tofu and press it further, and it works quite well. The chicken and tofu can easily cook side by side if all parties are okay with this—again, being careful with cooking times since tofu and chicken generally brown at different rates.
I find couscous is the ideal accompaniment for this dish, and I often add toasted pine nuts to the couscous for an extra layer of flavor. Rice is also decent as an accompaniment, as is orzo. A shaped small pasta like gemelli, farfalle, or campanelle is an acceptable substitute, but honestly never my first choice in this case.
This recipe originally comes from the long out of print 365 Great 20-Minute Recipes by Beverly Cox. Its official cookbook name is “Sicilian Chicken”, but my kids misheard “Sicilian” as “Cinnamon” and the name stuck.
I lost my sight at the turn of the century, over the course of about 12 months. For some of you reading this, that may be before you were born or before you were old enough to remember much at all. For the rest of you, think back to who you were, where you were, and what you were doing as the new millennium rumbled into life.
I was 25 when the Y2K bug was almost a thing; the last film I actually saw was The Matrix; I danced to Public Domain (Operation Blade (Bass in the Place)) and chilled out to William Orbit (Barber's Adagio for Strings); I had platinum blonde hair and a tongue stud; the Nokia 7110, with its spring loaded cover, was the must have phone; Windows XP was my OS of choice (on a 19" CRT monitor); my last pack of Marlboro cigarettes cost me about £4; I could pull off Bruce Willis moves (if no-one was looking) when I played Time Crisis on the original PlayStation with the gun style game controller; and I was working for one of the UK's first ISPs when the .Com bubble burst.
It was a long time ago, and a lot has happened since then.
My last impressions of movie special effects include the Computer Generated (CG) ship in Titanic, and the rotating slow-motion bullets and fight scenes in The Matrix. I missed the arrival of Gollum in The Lord of the Rings trilogy, the 3D of Avatar, and continue to miss the increasing reality of CG like the Hulk in the Marvel universe.
Tekken 3, Oddworld, The 11th Hour, and Diablo are some of the last games I remember with any clarity. The rise of MMORPGs like World of Warcraft, of the WII and other motion-sensitive game controllers, and the emergence of VR, are all things I have only an incomplete understanding of.
I missed out on flat screens and the introduction of touch screens (at least in the visual sense), and will probably miss the era of curved screens as well. My understanding of heads-up displays and augmented reality devices is limited, and I freely confess I don't really get flat design because skeuomorphism was the design paradigm when I lost my sight (though most of us didn't know that's what it was called at the time).
Now imagine trying to interpret all that progress in a single step. It would be like stepping from Tron 1982 into Tron Legacy 2010 in the blink of an eye!
I don't look where I'm going anymore. I don't even really need to be facing in the direction I'm moving. The amount of visual processing required to negotiate the world is something I remember like a half forgotten dream. I remember perspective but not its proportions. I know things look smaller when they're further away, but don't remember how to judge distance based on line of sight.
I haven't focused on anything in nearly two decades. I can still do it if I try, but it takes a concerted effort, and switching from near to far focus is no longer something I can do without conscious thought. I doubt I'd be able to read even if I could see. The muscles that control eye movement and focus are at best out of practice and at worst atrophied beyond repair. Oddly I can still cross my eyes though...
Other than scrawling my signature (which looks like a spider dunked in ink and left to wander across the paper), I haven't written anything by hand in many years. Even this happens less and less often, thanks to chip/pin credit cards, biometric scanners, and electronic signatures. I remember the shapes of letters and numbers with complete clarity, but reproducing them legibly in print is unlikely to be a skill I'll be able to summon up should the opportunity arise.
I was in my mid-twenties when I last saw myself in the mirror. I'm forty-mumble-something now. My husband is a fraction older than me, and our friends all much of an age as us. In the intervening years we've collectively changed our do's (umpteen times), gathered the odd wrinkle and some grey in our hair, lost weight, gained weight, acquired glasses, stopped wearing cycling shorts and other crimes against fashion, and many other things that happen as youth trundles into middle age. In my head we all still look like twenty-somethings, full of the promise and beauty of youth though.
I'm barely a decade younger now, than my parents were the last time I saw their faces. If I regained my sight now, I'd be the one who looks middle-aged, not them. However you look at it, that's the kind of thing that will scramble your head!
Then there are all the people I've never seen. All the people I've met since Y2K, all the people who were born after I lost my sight, so many people I've formed strong mental images of but on whom I've never laid eyes.
Have you ever listened to someone on the radio, then later seen a picture of them for the first time? Do you remember how completely wrong your mental image of them was? Now imagine experiencing that for everyone you've met in the past two decades!
When I lost my sight I learned how to do things for the second time in my life. I learned to cook, cross the road, use a computer, catch a plane, recognise people, identify money, write legible code, swim in the right direction, watch movies with audio description, buy clothes, and a million other things besides.
If I regained my sight I would need to learn how to do all those things again. Twenty years is a lot of time to be out of practice at looking at what you're doing! In another few years I'll have been blind longer than I was sighted, and most of my life experiences will have happened without sight (including sky diving).
So when people ask about regaining my sight (assuming I had the choice), I can't readily give the answer most people expect. I daresay I would choose to have my sight back, mostly because I'm curious about how things have changed since the Y2K, and because I want to see the faces of my friends and family (some for the first time, others for the first time this century). But I don't under-estimate the cost of doing so; and that's the thing, I know the cost. I've paid it once, only this time I'd be doing it in reverse. The emotional stress, the physical effort, the fear and uncertainty, the tears and frustration, would all be just as much to contend with as they were when I lost my sight. Only this time there would be light at the end of the tunnel, and perhaps that's all I'd really need to know.
Note: accessibleNode has been dropped. Read the AOM explainer for more information.
The Accessibility Object Model (AOM) is an experimental JavaScript API that enables developers to modify the browser accessibility tree. The AOM has four phases, and support for phase one recently landed in Chrome Canary behind the flag.
Traditionally, access to the accessibility tree has been limited to the platform accessibility APIs used by Assistive Technologies (AT). Even so, the access has been one-way, AT can query the accessibility tree but not manipulate it.
For developers, the only way to manipulate the accessibility tree has been to use ARIA to add, remove, or change the native semantics of HTML elements. Where the semantics (role, name, and state) of native HTML elements are implicit, ARIA forces us to declare the additional semantics explicitly. Using the AOM it's possible to avoid "sprouting attributes" (as the AOM explainer cheerfully puts it), and instead keep the HTML clean. When it comes to Custom Elements it also means the semantics can be encapsulated within the implementation, instead of added as a "leaky abstraction".
The AOM is being developed by Alice Boxhall and Dominic Mazzoni of Google, James Craig of Apple, and Alexander Surkov of Mozilla. They propose to introduce AOM capabilities in four phases:
The AOM Phase 1 specification introduces the AccessibleNode and AccessibleNodeList interfaces. These enable developers to modify the semantics of nodes in the accessibility tree, and to pass references to accessible nodes to other properties.
As an experiment I decided to create a custom disclosure widget using the Accessibility Object Model (AOM) instead of ARIA. For many reasons it's not advisable to start from a span and a div when you create this kind of interaction, but in the interests of playing with the AOM it serves the purpose.
The underlying HTML looks like this:
<span id="button">Tequila!</span>
<div id="container">
Makes me happy!
</div>
Everything else happens in the JavaScript. First we create references to the DOM nodes that represent the span and div elements:
var button = document.getElementById('button');
var container = document.getElementById('container');
Then modify their properties by setting the tabindex attribute on the span to make it focusable, and the hidden attribute on the div to hide the content:
button.setAttribute('tabindex', 0);
container.setAttribute('hidden', true);
We could then start adding semantics to the DOM node for the span in the usual way:
button.setAttribute('role', 'button');
button.setAttribute('aria-expanded', false);
But instead we can create a reference to the accessible node that corresponds to the DOM node for the span element, and add the semantics straight into the accessibility tree:
button.accessibleNode.role = "button";
button.accessibleNode.expanded = false;
The AOM uses the same set of roles as ARIA. The AOM Phase 1 spec also includes a table that maps AOM properties to their corresponding ARIA attributes (the expanded property corresponds to the aria-expanded attribute for example).
We can then create and call the function that handles the behaviour of the disclosure widget:
function disclose(event) {
if(container.getAttribute('hidden')) {
button.accessibleNode.expanded = true;
container.removeAttribute('hidden');
}
else {
button.accessibleNode.expanded = false;
container.setAttribute('hidden', true);
}
}
button.addEventListener('click', disclose, false);
button.addEventListener('keydown', function(event) {
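// 13 is the Enter key and 32 is the Space bar - the keys that usually activate a button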
if (event.keyCode == 13 || event.keyCode == 32) {
disclose();
}
});
It removes the hidden attribute from the div element, and changes the value of the AOM expanded property on the accessible node for the span element.
The AOM also makes it possible to pass object references to other accessible node properties. We'd usually set the aria-controls attribute on the span element like this:
button.setAttribute('aria-controls', 'container');
But the AOM means we don't have to pass an idref to the aria-controls attribute to indicate that the span element controls the div element. Instead we create an AccessibleNodeList:
var content = new AccessibleNodeList();
Then we add the accessible node for the div element to the array:
content.add(container.accessibleNode);
Lastly, in the function that handles the disclosure behaviour, we create the association between the accessible nodes for the span and div elements directly in the accessibility tree:
function disclose(event) {
if(container.getAttribute('hidden')) {
button.accessibleNode.expanded = true;
button.accessibleNode.controls = content;
container.removeAttribute('hidden');
}
else {
button.accessibleNode.expanded = false;
button.accessibleNode.controls = null;
container.setAttribute('hidden', true);
}
}
By using an array to store one or more accessible node references, it's possible to associate multiple accessible nodes with another. Think of the AOM properties that equate to ARIA attributes like aria-labelledby
, aria-describedby
, or aria-owns
, that can take multiple idrefs as values.
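As a rough sketch of what that might look like, assuming two hypothetical labelling elements with the ids label1 and label2 (the labeledBy property name follows the Phase 1 mapping for aria-labelledby, but as an experimental API the exact name should be treated as an assumption):
var labels = new AccessibleNodeList();
labels.add(document.getElementById('label1').accessibleNode);
labels.add(document.getElementById('label2').accessibleNode);
// labeledBy maps to aria-labelledby in the AOM Phase 1 proposal (assumed name)
button.accessibleNode.labeledBy = labels;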
As mentioned at the start, the AOM is an experimental API. For now, support for Phase 1 is only available behind a flag in Chrome Canary. This means you need to run Canary from the command line. To do this, open Windows Command Prompt or MacOS Terminal, and navigate to the directory where Chrome Canary is installed.
In Windows run:
chrome.exe --enable-blink-features=AccessibilityObjectModel
In MacOS run:
open -a Google\ Chrome\ Canary --args --enable-blink-features=AccessibilityObjectModel
You can then open this AOM disclosure demo, and with a screen reader running it'll behave exactly like you'd expect. The screen reader recognises a button in a collapsed state. When the button is activated, the screen reader recognises the button is now in the expanded state and (Jaws only) recognises that the button is being used to control the disclosed content.
With thanks to Dan Hopkins and Ian Pouncey.
]]>Let's take a simple SVG line graph as a working example. In this graph the y axis represents usage by percent, and the x axis represents time. The three lines represent the different screen readers:
Using the graph you can tell what percentage of usage each screen reader had at a particular point in time. For example, in January 2014 74% of screen reader users used Jaws, in May 2012 43% used NVDA, and in July 2014 30.9% used VoiceOver.
This data can be represented as a table, where the rows represent the screen readers, the columns represent the points in time, and the usage percentage data sits in the table cells.
It's a good idea to give people choices about the way they consume content, so letting people choose between a graphical and tabular view of the data would be a good thing to do in any case. But in the interests of making your primary content as accessible as it can be, you can use the following ARIA to add table semantics directly in the SVG:
Use a g
element with role="table"
to represent the table:
<g role="table">...</g>
Add role="row"
to each of the g
elements that represent the table rows:
<g role="row">...</g>
Add role="columnheader"
to each of the text
elements that represent the column headers:
<g role="row" class="grey small">
<text role="columnheader" x="334.5" text-anchor="middle" class="grey" y="306">Time</text>
<text role="columnheader" x="130" text-anchor="middle" y="269">Jan 2009</text>
<text role="columnheader" x="232" text-anchor="middle" y="269">Dec 2010</text>
<text role="columnheader" x="334.5" text-anchor="middle" y="269">May 2012</text>
<text role="columnheader" x="436.5" text-anchor="middle" y="269">Jan 2014</text>
<text role="columnheader" x="539" text-anchor="middle" y="269">Jul 2015</text>
</g>
Use a g element with role="rowheader" to represent the header for each row in the table. Move the title element that is the first child of the rowheader so that it becomes the child of the path element, and give the path element role="img":
<g role="row">
<g role="rowheader">
<path role="img" transform="translate(79,63)" d="M 51 48.5 C 51 48.5 112.5 53 153.3 56.8 C 194 61 214 68 255.5 68 C 296 68 317 67.5 357.7 67.5 C 398.5 67.5 460 105 460 105" class="s1" stroke-linecap="round">
<title>Jaws</title>
</path>
</g>
...
</g>
It's necessary to move the title element inside the path element, and to give the path element role="img" in order to fool browsers into thinking there is content inside the left-most column of the table. IE doesn't need this, but it's a workaround for other browsers.
Use a g
element with role="cell"
, to represent each of the table cells. Add role="img"
to each of the use
elements that represent the content of the cells:
<g role="cell">
<use role="img" href="#points-0" x="130" y="112">
<title>74%</title>
</use>
</g>
The img
role is again used to expose the content as a graphic in the browser, with the title
element providing its accessible name.
Use the aria-hidden
attribute to hide the y axis values, the y axis label ("Percentage usage"), and the legend, from screen readers. The same information is available to screen reader users through the row and column headers, so this avoids duplication of information without removing it from the visual presentation.
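For example, hiding the y axis group and the legend group could look something like this (the class names are illustrative rather than taken from the demo's actual markup):
<g class="y-axis" aria-hidden="true">...</g>
<g class="legend" aria-hidden="true">...</g>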
A working demo of the chart has all of these changes applied.
The state of SVG accessibility support in browsers and screen readers is still highly inconsistent, even when ARIA is used to polyfill semantic information. This solution works with Jaws in Chrome and IE, but only partially in Firefox (because Jaws doesn't recognise the contents of the table cells in that browser); NVDA struggles with the ARIA table semantics in Chrome, Firefox, and IE; whilst this solution works well with VoiceOver in Safari.
The upshot is that this technique is enough to make the primary SVG content more screen reader accessible, and support is likely to improve as browsers and screen readers improve their support for SVG and ARIA, but it isn't enough to make it completely so. For this (and many other good reasons), it's a good idea to provide an alternative view of the information, for example by providing both a graphical and a tabular view of the content (as noted above).
Thanks to WebAIM for the data (taken from their screen reader surveys) used to create this line graph, and to Chaals McCathie Nevile who borrowed the original SVG chart from HighChart and started thinking about its accessibility.
]]>If you're using HTML, use the relevant HTML elements to create tables. The table, thead, tfoot, tbody, th, and td elements are purpose built to do the job. Don't make things harder for yourself by reinventing the wheel.
That said, if you're working with SVG (which doesn't have elements for representing tables), then polyfilling the missing semantics with ARIA makes the SVG content more accessible for screen reader users.
Let's take a simple SVG table as a working example.
The SVG code looks like this:
<svg width="100%" height="100%" viewBox="0 0 370 160">
<title>Expenses</title>
<g id='rowGroup' transform='translate(0, 0)'>
<rect x='25' y='40' width='310' height='20' fill='gainsboro'/>
<rect x='25' y='76' width='310' height='20' fill='gainsboro'/>
<text x='30' y='30' font-size='18px' font-weight='bold' fill='crimson' text-anchor='middle' role="row">
<tspan x='100'>Sales</tspan>
<tspan x='200'>Expenses</tspan>
<tspan x='300'>Net</tspan>
</text>
<text x='30' y='30' font-size='18px' text-anchor='middle' role="row">
<tspan x='30' dy='1.5em' font-weight='bold' fill='crimson' text-anchor='start'>Q1</tspan>
<tspan x='100'>$ 223</tspan>
<tspan x='200'>$ 195</tspan>
<tspan x='300'>$ 28</tspan>
</text>
<text x='30' y='30' font-size='18px' text-anchor='middle' role="row">
<tspan x='30' dy='2.5em' font-weight='bold' fill='crimson' text-anchor='start'>Q2</tspan>
<tspan x='100'>$ 183</tspan>
<tspan x='200'>$ 70</tspan>
<tspan x='300'>$ 113</tspan>
</text>
<text x='30' y='30' font-size='18px' text-anchor='middle' role="row">
<tspan x='30' dy='3.5em' font-weight='bold' fill='crimson' text-anchor='start'>Q3</tspan>
<tspan x='100'>$ 277</tspan>
<tspan x='200'>$ 88</tspan>
<tspan x='300'>$ 189</tspan>
</text>
<text x='30' y='30' font-size='18px' text-anchor='middle' role="row">
<tspan id="q4" x='30' dy='4.5em' font-weight='bold' fill='crimson' text-anchor='start'>Q4</tspan>
<tspan x='100'>$ 402</tspan>
<tspan x='200'>$ 133</tspan>
<tspan x='300'>$ 269</tspan>
</text>
</g>
</svg>
It looks like a table, but the semantic information exposed by SVG in the accessibility tree is not that of a table. According to the SVG Accessibility API Mappings (AAM), the exposed roles are:
graphics-document for the svg element;
graphics-symbol for the rect element;
group for the g element;
group or text (depending on the accessibility API) for the text element.
Note: the graphics-document and graphics-symbol roles are documented in the ARIA Graphics module, and the Graphics AAM.
There is an added complication: although these roles are supported by browsers and accessibility APIs, they are not well supported by screen readers. The group role is recognised by screen readers but is not communicated to users in this context, and neither the graphics-document nor the graphics-symbol role is recognised at all. The text content is available to screen readers though.
So in the absence of useful semantic information, we can use the ARIA table, row, columnheader, rowheader, and cell roles as an accessibility polyfill.
This SVG table looks the same, but thanks to the ARIA it can be navigated by screen reader users in the same way as an HTML table.
The updated code looks like this:
<svg width="100%" height="100%" viewBox="0 0 370 160">
<title>Expenses</title>
<g id='rowGroup' transform='translate(0, 0)' role="table">
<rect x='25' y='40' width='310' height='20' fill='gainsboro'/>
<rect x='25' y='76' width='310' height='20' fill='gainsboro'/>
<text x='30' y='30' font-size='18px' font-weight='bold' fill='crimson' text-anchor='middle' role="row">
<tspan role="columnheader" x='100'>Sales</tspan>
<tspan role="columnheader" x='200'>Expenses</tspan>
<tspan role="columnheader" x='300'>Net</tspan>
</text>
<text x='30' y='30' font-size='18px' text-anchor='middle' role="row">
<tspan role="rowheader" x='30' dy='1.5em' font-weight='bold' fill='crimson' text-anchor='start'>Q1</tspan>
<tspan role="cell" x='100'>$ 223</tspan>
<tspan role="cell" x='200'>$ 195</tspan>
<tspan role="cell" x='300'>$ 28</tspan>
</text>
<text x='30' y='30' font-size='18px' text-anchor='middle' role="row">
<tspan role="rowheader" x='30' dy='2.5em' font-weight='bold' fill='crimson' text-anchor='start'>Q2</tspan>
<tspan role="cell" x='100'>$ 183</tspan>
<tspan role="cell" x='200'>$ 70</tspan>
<tspan role="cell" x='300'>$ 113</tspan>
</text>
<text x='30' y='30' font-size='18px' text-anchor='middle' role="row">
<tspan role="rowheader" x='30' dy='3.5em' font-weight='bold' fill='crimson' text-anchor='start'>Q3</tspan>
<tspan role="cell" x='100'>$ 277</tspan>
<tspan role="cell" x='200'>$ 88</tspan>
<tspan role="cell" x='300'>$ 189</tspan>
</text>
<text x='30' y='30' font-size='18px' text-anchor='middle' role="row">
<tspan id="q4" role="rowheader" x='30' dy='4.5em' font-weight='bold' fill='crimson' text-anchor='start'>Q4</tspan>
<tspan role="cell" x='100'>$ 402</tspan>
<tspan role="cell" x='200'>$ 133</tspan>
<tspan role="cell" x='300'>$ 269</tspan>
</text>
</g>
</svg>
This has the advantage of being supported on multiple platforms, in multiple browsers, and with multiple screen readers. The ARIA tables demo includes information about browser and screen reader support for Windows, MacOS, iOS, and Android.
]]>This post assumes you have a custom domain you want to use, and a Github account with one (or more) repositories that have a gh-pages branch. In my case tink.uk is the domain I want to use, and ljwatson.github.io is the root for the repositories that use gh-pages.
You need to create a custom sub-domain for each repository. To do this, you need to edit the zone file for your custom domain. The way to do this depends on where your domain is registered, so check out the help documentation provided by your domain registrar.
Add a CNAME record to your zone file for each sub-domain you want to create. I created four sub-domains for tink.uk, using the following CNAME records:
Example CNAME records
Name             Type   TTL   Value
decks            CNAME  1800  ljwatson.github.io.
design-patterns  CNAME  1800  ljwatson.github.io.
playground       CNAME  1800  ljwatson.github.io.
test-cases       CNAME  1800  ljwatson.github.io.
When you've added all the CNAME records, save your changes and make sure your custom domain is using the updated zone file.
Meanwhile, you need to configure each Github repository to recognise the sub-domain name you've just created for it. To do this, go to the Github page for each repository, and in Settings add the sub-domain to the Custom domain field (don't forget to save it).
Once the CNAME records have propagated, you'll be able to use the sub-domain to reference the gh-pages branch of your Github repositories. For example, design-patterns.tink.uk instead of ljwatson.github.io/design-patterns/.
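If you want to check whether a record has propagated before trying it in the browser, a DNS lookup tool such as dig will show what a sub-domain currently resolves to (using one of the example sub-domains above):
dig decks.tink.uk CNAME +short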
Thanks to Dan Hopkins for his help.
]]>This is where ARIA itself is defined. ARIA 1.0 was released in March 2014, and ARIA 1.1 is expected to become a W3C Recommendation within the next few months.
The ARIA specification describes the roles, states, and properties that can be used to polyfill accessibility for screen readers. This is the specification you need when you want to find out which roles, states, and properties are available for use with HTML and/or SVG.
The ARIA APG is a collection of design patterns for common interface components. The ARIA APG 1.0 was never officially released, but the ARIA APG 1.1 is under active development.
The APG describes the keyboard interaction expected for custom components like tabpanels, dialogs, and menubars. It also explains which ARIA roles, states, and properties are required to make custom components accessible to screen readers. If you're creating custom components in HTML or SVG, the ARIA APG is the reference manual you need.
The AAM specifications document the way native components are mapped to platform accessibility APIs. The Core AAM 1.1, HTML AAM 1.0, and SVG AAM 1.0, are all works in progress.
The HTML AAM describes how native HTML elements are mapped to ARIA roles, and to the equivalent roles on different platforms (and consequently different browsers). The SVG AAM attempts to do the same for SVG elements, but at present browser support for SVG accessibility API mappings is much less robust than for HTML. The AAM specifications are where you should go when you want to know how browsers should handle different elements in terms of keyboard focus, role, state, and property information for accessibility.
This specification documents the way tools should parse ARIA in HTML. ARIA in HTML 1.0 is under active development, and is expected to become an official W3C Recommendation later this year.
ARIA in HTML is intended for use by conformance checkers, and tools that test for accessibility. If you want to know how these tools are expected to handle ARIA when it's used with HTML, this is the specification to use.
This describes a set of best practice principles for Using ARIA in HTML. It's a reliable W3C Note that is continually updated.
Using ARIA provides practical guidance for using ARIA in development. If you use ARIA with HTML, this specification is an essential reference point to help you make smart implementation choices.
]]>There are several common patterns where the current thing in a group or collection is highlighted visually:
Usually the current thing is differentiated from the other things in the collection using CSS. For example:
<style>
.current {
font-weight: bold;
background-color: #cc33ff;
}
</style>
<nav>
<ul>
<li><a href="/" class="current">Home</a></li>
<li><a href="/">About</a></li>
<li><a href="/">Contact</a></li>
</ul>
</nav>
CSS is predominantly a visual medium, and with one notable exception it isn't exposed to screen readers - and therein lies the problem.
It's possible to solve this particular use case by removing the href attribute on the current link, rendering it inert and preventing screen readers from announcing it as a link at all. Or by using a foreground image instead of a background image, and giving it an alt text (like "Current page") that conveys the relevant information to screen reader users.
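As a rough sketch, those two workarounds look something like this (the markup and file name are illustrative):
<!-- No href, so the current item is no longer announced as a link -->
<li><a class="current">Home</a></li>
<!-- A foreground image whose alt text conveys the state -->
<li><a href="/">Home <img src="current.png" alt="Current page"></a></li>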
There are reasons why you may not want to use either of these solutions, and in any case they're only relevant to this particular pattern. What if the current thing isn't a link, or isn't an interactive element at all? For example:
<table>
<caption>July 2016</caption>
<tr>
<th>Mon</th><th>Tue</th><th>Wed</th><th>Thu</th><th>Fri</th><th>Sat</th><th>Sun</th>
</tr>
<tr>
<td></td><td></td><td></td><td></td><td>1</td><td>2</td><td>3</td>
</tr>
<tr>
<td>4</td><td>5</td><td>6</td><td>7</td><td>8</td><td>9</td><td>10</td>
</tr>
<tr>
<td>11</td><td>12</td><td>13</td><td>14</td><td>15</td><td class="current">16</td><td>17</td>
</tr>
<tr>
<td>18</td><td>19</td><td>20</td><td>21</td><td>21</td><td>22</td><td>23</td>
</tr>
<tr>
<td>24</td><td>25</td><td>26</td><td>27</td><td>28</td><td>29</td><td>30</td>
</tr>
<tr>
<td>31</td><td></td><td></td><td></td><td></td><td></td><td></td>
</tr>
</table>
The aria-current attribute
The aria-current attribute is intended to solve this problem. Here's the official attribute definition:
Indicates the element that represents the current item within a container or set of related elements. The aria-current attribute is an enumerated type. Any value not included in the list of allowed values should be treated by assistive technologies as if the value true had been provided. If the attribute is not present or its value is an empty string, the default value of false applies and the aria-current state must not be exposed by user agents or assistive technologies.
According to the ARIA 1.1 specification, the aria-current attribute can be given one of a predefined set of values (enumerated tokens):
page represents the current page within a set of pages;
step represents the current step within a process;
location represents the current location within an environment or context;
date represents the current date within a collection of dates;
time represents the current time within a set of times;
true represents the current item within a set;
false does not represent the current item within a set.
So the aria-current
attribute can be used to solve the first use case in this post like this:
<style>
[aria-current] {
font-weight: bold;
background-color: #cc33ff;
}
</style>
<nav>
<ul>
<li><a href="/" aria-current="page">Home</a></li>
<li><a href="/">About</a></li>
<li><a href="/">Contact</a></li>
</ul>
</nav>
When a screen reader encounters the link identified with aria-current
, it will announce something like "Home, current page link".
Whenever aria-current
is used with a value other than true, that information is incorporated into the screen reader announcement. For example in this set of steps, a screen reader will announce "Do this, current step link".
<ol>
<li><a href="/" aria-current="step">Do this</a></li>
<li><a href="/">Do that</a></li>
<li><a href="/">Do the other</a></li>
</ol>
Whereas in this calendar example, a screen reader will announce something like "Sat 16, current date".
<table>
<caption>July 2016</caption>
<tr>
<th>Mon</th><th>Tue</th><th>Wed</th><th>Thu</th><th>Fri</th><th>Sat</th><th>Sun</th>
</tr>
<tr>
<td></td><td></td><td></td><td></td><td>1</td><td>2</td><td>3</td>
</tr>
<tr>
<td>4</td><td>5</td><td>6</td><td>7</td><td>8</td><td>9</td><td>10</td>
</tr>
<tr>
<td>11</td><td>12</td><td>13</td><td>14</td><td>15</td><td aria-current="date">16</td><td>17</td>
</tr>
<tr>
<td>18</td><td>19</td><td>20</td><td>21</td><td>21</td><td>22</td><td>23</td>
</tr>
<tr>
<td>24</td><td>25</td><td>26</td><td>27</td><td>28</td><td>29</td><td>30</td>
</tr>
<tr>
<td>31</td><td></td><td></td><td></td><td></td><td></td><td></td>
</tr>
</table>
The aria-current
attribute can be applied outside of these use cases. When aria-current
is set to true, then screen readers simply indicate that something is "Current", whatever that thing may be. For example:
<ul>
<li><a href="/" aria-current="true">Apples</a></li>
<li><a href="/">Bananas</a></li>
<li><a href="/">Cherries</a></li>
</ul>
Working examples, screen reader support and links to screen reader demos are available for all the aria-current patterns mentioned in this post.
]]>An emoji is an ideogram (a picture that represents an idea or concept). Unlike an image that is displayed from a (.png, .gif or .jpg) file, an emoji is displayed using a Unicode character. On the web this is often done using the hexadecimal or decimal representation of the Unicode character. For example:
<span>☃</span>
This presents a snowman emoji (☃) in the browser. Unless you're looking at it with a screen reader, in which case there is a chance you'll be completely unaware it's there at all.
The first problem is that browsers do not always expose the emoji as an image in the accessibility tree. The <span> element is semantically neutral - its role may not be communicated by the browser and/or used by screen readers.
The second problem is that the content of the <span>
is just a character reference. The browser may not automatically assign it an accessible name or an accessible description. The upshot is that emoji may not be recognised or only partially recognised with some browser and screen reader combinations.
These things are simple to fix though. We need to tell the browser to expose the emoji (or its container) as an image, then give it an accessible name.
Use the ARIA img
role to expose the <span>
as an image in the accessibility tree:
<span role="img">☃</span>
Then use the aria-label attribute to give the emoji an accessible name:
<span role="img" aria-label="Snowman">☃</span>
If you use the <i> element as a container for an emoji (you probably shouldn't, but if you do), the same solution can be used:
<i role="img" aria-label="Snowman">☃</i>
Now the browser will expose the emoji as an image in the accessibility tree, and use the value of the aria-label attribute as its accessible name. When a screen reader queries the accessibility tree it will use this information to tell the user that there is an image of a snowman (☃) on the page.
Update 30th December 2016: Adrian Roselli has posted a useful follow-up article to this one - Accessible emoji, tweaked.
]]>When you develop accessible apps, webapps or websites, one of the challenges is understanding how screen readers consume content. Making sense of speech output and relating it to the different objects on-screen is an acquired skill.
Narrator dev mode helps developers identify which on-screen objects are accessible to Narrator, which object Narrator is currently focused on, and what information is available to Narrator about that object.
When Narrator dev mode is enabled, the screen is masked except for the object Narrator is focused on. Only accessible objects and the text associated with them are visible. In other words, Narrator dev mode presents the visual equivalent of using Narrator, in addition to Narrator's default speech output.
Narrator must be turned on before Narrator dev mode is enabled. It's a good idea to get a basic understanding of Narrator commands before experimenting with Narrator dev mode, but if you want to jump right in, the following table has the commands you'll need.
Useful Narrator commands
Action                                        Command
Turn Narrator on/off                          Control + Windows key + Enter
Discover Narrator commands                    Capslock + F1
Discover Narrator context sensitive commands  Capslock + F2
Turn Narrator dev mode on/off                 Capslock + Shift + F12
Getting into the habit of testing your app, webapp or website with Narrator and Narrator dev mode won't give you a comprehensive knowledge of how the entire screen reader ecosystem consumes content. But it will help you understand the linear nature of screen reader interaction, and whether your product exposes information that is useful to screen readers - and that is knowledge that can be transferred to testing with other screen readers and/or platforms.
]]>The earliest "historical" thing I can vaguely remember was the tail end of the Vietnam War. I have vivid memories of an oil crisis in 1973 which resulted in the issuing of ration coupons (I remember my school teacher being really pleased because he drove a frugal Mini and he could get much further on his ration).
I remember a miners’ strike in 1974 resulting in regular power cuts, including my favourite one which happened during a report on Nationwide on how power stations were handling it. The man at the control desk said, "And now we’ll switch off the Sydenham district," and reached out for the switch. The electricity went off in our house and we learned that the "Sydenham district" was their name for the whole of south east London.
I remember that the power shortages resulted in the illumination of only every other streetlight and the introduction of the three-day week. I remember that that brought down the government and there being two general elections in 1974 because no one could decide who should dig us out of the hole.
I remember inflation over 20%. I remember bread shortages and sugar shortages.
I remember the IMF crisis in 1976. Then there was the "Winter of Discontent" in 1978/79. Another recession in 1980 (I think there’d already been two in the ’70s), accompanied by record unemployment. I remember riots in Bristol, Toxteth and Brixton.
I remember the Falklands War in 1982. Then there was the Miners’ Strike in 1984 culminating in the gutting of the Trade Union movement, the end of collective bargaining and the wholesale dismantling of the UK’s industrial base.
As a constant, terrifying, heart-breaking background to all that through the ’70s and ’80s there was sectarian unrest in Northern Ireland - daily reports of murders and bombings, both in Ulster and on the mainland.
I haven’t even touched on the Arab-Israeli conflict, Apartheid, famines, natural disasters and assorted other horrors.
Or Watergate.
I remember another recession in 1990/91. I remember the house-price crash which left the value of my home "underwater" for nearly 10 years.
I remember Black Wednesday in 1992 when Sterling crashed so low it had to be withdrawn from the ERM. There was the Gulf War and the war in Kosovo. The 2001 terror attacks on the USA led to more war in Iraq and Afghanistan, and chaos throughout the Middle East.
Mixed in with all that for the past 40-odd years that I’ve known what was going on there were assorted political crises and scandals big and small. Some of them involving blowjobs in the Oval Office.
I’ve seen a lot of hand-wringing in all branches of the media over the past couple of weeks explaining just how fucked we are now, like it’s a new thing. But look again at that list. It’s not even close to being a comprehensive list of all the shit things that have happened in my lifetime, but it should serve to illustrate that life isn’t more shit now than it’s ever been. If you want to think it’s shit then by all means go ahead, but at least have the perspective to realize that it’s not the shittest - it’s exactly as shit as it’s always been.
And yet somehow we’re all still here. We do our bit every day to try to make things better. Just like we always have. So… you know… get the fuck over it.
The word “semantic” is regularly used in the context of web development. We talk about “semantic code” and the “semantics” of a given element, but what do we really mean by it, and why is it important?
The word semantic comes from the French sémantique, which in turn derives from the Greek sēmantikos, meaning significant. In English it has three meanings:
It is the second definition that is relevant here. Semantic code has both structure and meaning, and both things are equally important.
When we write HTML we give content structure. We define things like paragraphs, lists, tables and headings. Writing semantic code means choosing the most appropriate element to define the required structure. This helps interoperability.
There is a common understanding of what each element represents and does. For example the p
tag represents a paragraph of static text, an a
element is interactive and will fetch a new resource when activated. This common understanding enables browsers and other user agents to provide default functionality and styling for many elements.
When we use HTML we also give content meaning. Most HTML elements have an implicit role that defines the element’s purpose. For example the a
element has an implicit role of “link”, the img
element’s implicit role is “graphic” or “image” (depending on the platform), and the header
element has an implicit role of “banner”.
These roles are discoverable using the accessibility APIs supported by the platform, and they are used by assistive technologies like speech recognition tools and screen readers.
<a href="https://tink.uk">Tink UK</a>
When a screen reader queries the browser for information about the piece of content represented by the HTML shown above, it will discover that it is a link to this website. The screen reader identifies the element’s role as “link”, uses the text content inside it to give the link an accessible name, and makes this information available in synthetic speech.
Compare this to the pseudo link represented by the following code:
<span class="link">Tink UK</span>
The browser does not recognise this structure as a link, and so does not provide any of the expected behaviour (the styling is provided using CSS). The span
element is semantically neutral, so it does not have an implicit role that is useful in this context.
It is possible to use JavaScript, ARIA, and CSS to polyfill the missing semantic information, and to provide some (but not all) of the styling and behaviours associated with the a
element. That takes a lot more code, a lot more effort, and it usually results in a very brittle implementation compared to that of a native HTML element.
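To illustrate the point, a rough (and still incomplete) polyfill of the pseudo link might look something like this:
<span class="link" role="link" tabindex="0" onclick="location.href = 'https://tink.uk';" onkeydown="if (event.key === 'Enter') { location.href = 'https://tink.uk'; }">Tink UK</span>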
HTML semantics are therefore important in two ways: We get a consistent understanding of content structure and native behaviour, and we get a common understanding of the content’s meaning and purpose. The best thing of all, is that we get those things for free whenever we use HTML as intended.
]]>When you use <input type="password">
to create a native password field, the browser obscures the characters that are entered into the field, and prevents them being discoverable with cut/paste. At the accessibility layer the field has a role of “text” and a state of “protected”, which informs assistive technologies like screen readers what type of field they are dealing with. When a screen reader encounters a native password field it automatically stops echoing keys as they’re pressed, and instead announces the characters as they’re rendered inside the password field – as asterisks for example.
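For comparison, the native field needs nothing more than this to get all of that behaviour:
<label for="pwd">Password</label>
<input type="password" id="pwd">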
When you use a <div>
or <span>
to create a custom password field you must obscure the characters with scripting, and there is nothing you can do to protect against cut/paste discoverability. At the accessibility layer there is no role or state available to assistive technologies, and because screen readers do not recognise it is a password field they continue to echo keys as they’re pressed.
So there are security issues with custom password fields, irrespective of accessibility. There is no way to tell whether a password field is native or custom, and no guarantee that your password is adequately protected from discoverability.
There are also accessibility issues. It is not possible for assistive technologies to tell that it is a password field (or even a text input), and nothing to stop screen readers announcing the characters of your password as you type them.
To solve the first of the accessibility issues, a password role has been proposed in ARIA 1.1. When applied to a custom password field, the role will inform assistive technologies that the field should be considered a password field. The proposal will likely be extended to make the role more specific – to inform assistive technologies that it is a custom password field.
The password role also attempts to solve the second accessibility issue. According to the password role definition, when the role is applied to a custom password field screen readers should stop echoing keys as they’re pressed and instead announce only the characters as they are rendered inside the password field.
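As a sketch only (the role is a proposal, and the scripting needed to obscure and store the typed characters is omitted here), a custom password field using the role might look like this:
<span role="password" contenteditable="true" aria-label="Password"></span>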
Neither of the other security issues with custom password fields will be solved by the password role, but ARIA is not the place to solve those problems. They do need to be solved (and soon), but it is a much wider discussion than accessibility alone.
If it sounds as though the ARIA password role is a good solution, there is a catch. The role itself is a step in the right direction, but if screen reader vendors don’t implement it properly it could leave screen reader users vulnerable. If the screen reader continues to announce the characters of your password as they’re typed, someone may overhear them. If the scripting fails or is implemented badly, it could also leave screen reader users uncertain about whether their password was displayed on-screen in plain text or whether the characters had been obscured because all they would hear would be the characters of their password echoed back to them. This also leaves open the possibility, however unlikely, of a malicious attack targeted at screen reader users.
So here is the catch… there is no requirement for ARIA to be tested in screen readers before it becomes an official W3C standard. This means the password role could be put out there in the wild with no reassurance that any screen reader had implemented it properly.
Each feature of a W3C standard is expected to be properly implemented and tested in at least two user agents. For most standards this means two shipping browsers or browser engines, but ARIA is unique in that it is the only standard that requires a screen reader to be used.
Until now this hasn’t been a problem. Testing ARIA implementations by examining platform accessibility API and browser support has been more than sufficient. If a screen reader vendor then implemented support for a feature incorrectly it might cause a feature not to work as expected, or a role to be misidentified, but irritating as that might be it does not represent a security risk for screen reader users.
The upshot is that the password role will offer better security for screen reader users if it is implemented correctly, but unless we can be sure it is implemented correctly there is a risk to screen reader users.
Testing ARIA with screen readers seems like the sensible thing to do. Screen readers are dependent on the accessibility APIs and browsers, but they also play a role in correctly supporting ARIA. If the ARIA Working Group chose to include screen readers in its testing for ARIA 1.1, even if only for the password role, that would be another step in the right direction.
There is another option, but it involves breaking the browser’s "fourth wall". If browsers implemented the same behaviour for custom password fields using the password role, as they do for native password fields, the security problems would go away – all of them, not just the accessibility ones.
Browsers have traditionally held back on providing functionality or making changes to the UI, based on ARIA. Custom password fields give us a compelling use case for revisiting this idea though, so perhaps the more progressive browser vendors will look again at their position on this. Then again, perhaps it is time to look beyond ARIA for an HTML solution that will give us the ability to imbue custom controls with native behaviours.
Update 6th November 2016: On 7th July 2016 the ARIA WG agreed to move the password role to ARIA 2.0. This move was prompted by the realisation that no clear use cases for the role existed.
However, at a joint meeting between the ARIA WG and Web Platform WG on 19th September 2016, there was discussion about ARIA 2.0 roles having parity with all HTML elements. If this goes ahead it would mean the inclusion of the password role by default, as it would be mapped to <input type="password">
. It may be that the browser vendors will look again at the role and imbue it with the same characteristics as its native HTML counterpart, but for now the issue of ARIA role and HTML element parity remains open for discussion on Github.
]]>The application role should be used with caution, but this post looks at a use case where the application role is used to good effect.
Screen readers sometimes intercept keystrokes and repurpose them for screen reader specific tasks, like navigating content by headings or reading a line of text. The application
role prevents the screen reader from intercepting keystrokes, and passes them back through to the browser as though the screen reader wasn’t running. Read Understanding screen reader interaction for more information.
It is this intercept behaviour that means some screen readers, notably those that run on Windows, do not work with keyboard shortcuts provided using JavaScript. The screen reader intercepts the given shortcut keys and uses them for its own purpose, instead of letting them through to the browser and the waiting Javascript. Read Time to revisit accesskey for more information.
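For example, a page might register a shortcut like this (an illustrative sketch with a hypothetical showNextSlide function), and a Windows screen reader in browse mode will typically repurpose the keystroke before the handler ever sees it:
document.addEventListener('keydown', function(event) {
  // "n" is meant to move to the next slide, but browse mode screen readers claim single letter keys for their own navigation
  if (event.key === 'n') {
    showNextSlide();
  }
});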
The answer may seem simple at this point – use the application
role to stop the screen reader intercepting the shortcut keys, so the Javascript can do its job. The problem with this is that it will stop the screen reader intercepting every keystroke, not just those intended to be used as keyboard shortcuts. In other words, none of the keystrokes used for navigating and interacting with content will be available to the user. This seriously impairs their ability to read and interact with content in the normal way.
The decision to use the application
role must therefore be taken carefully. It should only be used when there is no need for a screen reader user to be able to use any of the standard keyboard commands at their disposal. To put it another way, if you use the application
role, the responsibility to provide all keyboard interaction lies with you.
One such case is the Shower Presentation Engine by Vadim Makeev. It is an HTML, CSS, ARIA and JavaScript based engine for writing presentation slides. It has the ability to display slides in a gallery or slideshow view, and to switch between views in response to click, tap or keyboard.
When in slideshow view there is a limited set of expected interactions: move to next/previous slide or move to first/last slide. The Shower engine provides keyboard shortcuts for each of these interactions, but it does so using JavaScript, resulting in the problem described above for Windows screen reader users.
In gallery view a screen reader user might want to navigate slides by heading, list or other element type, or to query content by word, line or paragraph. When in slideshow view and giving a presentation there is (almost) no need for any of the standard screen reader commands to be used though.
The exception is that a screen reader user needs to be able to query the title of each slide, and possibly the content of each slide as it appears. Instead of providing a specific keyboard shortcut to enable screen reader users to do this, the Shower engine accomplishes this using an ARIA live region, that causes screen readers to automatically announce the title and content of each slide as it appears in the slideshow view.
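A live region for that purpose can be as simple as the following sketch (the politeness setting and structure are assumptions, not necessarily what the Shower engine uses):
<div aria-live="assertive">
  <!-- the current slide's title and content are injected here as each slide is shown -->
</div>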
The slideshow view is therefore that rare thing – a suitable place to use the application
role. There is a clearly defined set of interactions, with complete keyboard support provided using JavaScript, and no need for standard screen reader commands to be used whilst giving a presentation.
Thanks to a contribution from Dan Hopkins, the Shower engine applies the application
role to the element whenever slideshow view is enabled. When switching back to gallery view, the application
role is removed and standard screen reader behaviour resumes.
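A minimal sketch of that switching logic (the function name and target element are assumptions, not the Shower engine's actual code):
function setSlideshowMode(enabled) {
  if (enabled) {
    // hand all keystrokes back to the page's own shortcut handlers
    document.body.setAttribute('role', 'application');
  } else {
    // restore standard screen reader navigation for gallery view
    document.body.removeAttribute('role');
  }
}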
TLDR: The only viable way (in my opinion) for the flexbox disconnect to be resolved is in the browser (as with the Firefox "bug") and the accessibility tree.
When content is presented in sequence, we expect to read it and navigate through it in a logical order. In the West this is typically top to bottom/left to right. In the following example, the expected order is "1", "2", "3".
<div>
<a href="/">One</a>
<a href="/">Two</a>
<a href="/">Three</a>
</div>
With flexbox it is possible to display content in a different visual order, without changing the DOM order. Items inside a flex container can be visually sequenced using the CSS order
property. In the following example, the visual order is "3, "2", "1", but the DOM order is still "1", "2", "3".
<div style="display: flex;">
<a href="/" style="order: 3;">One</a>
<a href="/" style="order: 2;">Two</a>
<a href="/" style="order: 1;">Three</a>
</div>
When you use the tab key to move through the content, there is a disconnect between the visual order and the keyboard navigation (DOM) order. In this simple example it's mildly awkward, but in a situation where flexbox is used to layout a complex interface it could make things horribly unusable.
There are two things that can be done to bypass the DOM order at the code level, but neither of them solve the flexbox disconnect problem.
The tabindex attribute
The HTML tabindex attribute can be used to impose a specific DOM order on content. If the tabindex values match the corresponding order property values, the visual and DOM orders can be brought into alignment. In the following example the visual and keyboard order are both "3", "2", "1".
<div style="display: flex;">
<a href="/" style="order: 3;" tabindex="3">One</a>
<a href="/" style="order: 2;" tabindex="2">Two</a>
<a href="/" style="order: 1;" tabindex="1">Three</a>
</div>
The problem is that tabindex
is scoped to the document. If the above code were included in a page, it would completely hijack the tab order. The three items with tabindex would be the first three things on the page to receive keyboard focus, irrespective of their overall location in the DOM structure. In other words, you can use tabindex to solve the flexbox disconnect, but only by pushing the problem up to the document level.
The aria-flowto attribute
The aria-flowto attribute can be used to provide an alternative keyboard navigation for screen reader users. In the following example the visual and aria-flowto orders are both "3", "2", "1", and the DOM order is "1", "2", "3".
<div style="display: flex;">
<a href="/" style="order: 3;" id="i1">One</a>
<a href="/" style="order: 2;" id="i2" aria-flowto="i1">Two</a>
<a href="/" style="order: 1;" id="i3" aria-flowto="i2">Three</a>
</div>
The first reason aria-flowto
does not solve the flexbox disconnect, is because it complicates rather than simplifies the problem. It introduces a third mode of navigation, and one that is only applicable to screen reader users (who must use specific keyboard commands to benefit from it).
The second problem is that aria-flowto
has extremely poor accessibility support. Of the various popular browser and screen reader combinations, only Jaws with Firefox, and Narrator with Edge and IE, support aria-flowto.
A more promising option is for the browser to handle the flexbox disconnect.
Firefox realigns the tab order to match the visual order (based on the order property). When you use the tab key to move through the above example, focus moves to each link in the order in which it appears on screen.
Interestingly, this behaviour is considered to be an implementation bug because it's contrary to the Flexbox specification. This may explain why neither the DOM order nor the accessibility tree seems to be altered (the realignment appears to happen elsewhere), but the upshot is that it solves the disconnect problem for people using the tab key to navigate content.
It doesn’t entirely solve the flexbox disconnect problem for screen reader users though. If a screen reader user tabs through the content the realignment happens as described, but screen readers that use a virtual buffer will also present the content in DOM order when the virtual mode is used. It’s worth mentioning that this is the case with all the options mentioned here, and it’s likely to be the case with any future option that doesn't rearrange the DOM and/or accessibility tree.
So the current situation is unsatisfactory for developers and users alike. As Rich Schwerdtfeger explains, the CSS "don’t use it" recommendation is unacceptable, and hacking things at the code level isn’t desirable or worthwhile in this instance.
The Firefox implementation/bug seems to have merit though. It isn’t perfect, but it comes close to solving the problem with a minimum of fuss and bother for developers and a reasonable outcome for keyboard users.
You can use dry, medium or sweet cider. I use dry cider with 2tbsp brown sugar because it's sweet enough without masking the natural sweetness of the apples and spices. It's entirely up to you though.
You can substitute 500ml of cider with apple juice. This adds to the sweetness without using more brown sugar.
If you want to give your mulled cider a real kick, you can add a slosh of Calvados, apple brandy, or Grand Marnier. The wonderful thing about mulled cider is that you can flagrantly disregard measurements and quantities and do what you like with it. Start with the basic recipe, taste often, and experiment as you go!
]]>If you want to make sorbet (instead of sauce for ice-cream), you'll also need:
I regularly make strawberries with balsamic vinegar to pour over vanilla ice cream. With more people in my life who have dietary restrictions (lactose intolerant, celiac, vegetarian, vegan) and who don’t get to enjoy ice cream with the rest of us, I opted to modify this recipe to turn it into a sorbet. Because that turns the recipe into a two-for-one, I can make just the sauce if I’m lazy and/or want it over ice cream, or I can go make a big bowl of sorbet if I’m a bit more patient (which I sometimes then have with a scoop of vanilla anyway).
]]>The parent/child relationship is a cornerstone of HTML structure. Every element in an HTML document is the parent, sibling or child of another.
In some cases these relationships are critical to the semantics of a document. A table contains rows, headers and cells for example. Without the correct arrangement of all the elements that the table consists of, the structure breaks down and becomes meaningless.
These relationships are not always available in the DOM though. Sometimes because of the way code is written, sometimes because the structures themselves do not exist. HTML has no elements for tree structures, and SVG has no concept of data structures like lists or tables for example.
Let’s take an artificial example, which isn’t likely to happen in the wild, but which makes it easier to explain how aria-owns
works.
<ul>
<li>Fruit</li>
<li>Vegetables</li>
</ul>
<ul>
<li>Apples</li>
<li>Bananas</li>
</ul>
The two lists are separate in the DOM. Beyond the fact they are siblings in the DOM tree, they have no relationship with each other.
Now suppose that we want to change the way the lists are displayed, so that the second list appears as a child of the first. We could rewrite the source code and rebuild the DOM, or we could style the lists so they have the appearance of being nested.
The problem with the first option is that it takes effort and could affect performance. The problem with the second option is that it’s only cosmetic. An assistive technology like a screen reader (that relies on the DOM for information) will still consider the two lists as completely separate.
We can use the aria-owns
attribute to fix this problem though.
<ul>
<li aria-owns="child">Fruit</li>
<li>Vegetables</li>
</ul>
<ul id="child">
<li>Apples</li>
<li>Bananas</li>
</ul>
The aria-owns
attribute creates a parent/child relationship between the two lists in the accessibility layer. The DOM tree remains unchanged, but the accessibility tree now exposes the two lists as though they were nested. In other words, the lists are now exposed like this in the accessibility tree:
<ul>
<li>Fruit
<ul>
<li>Apples</li>
<li>Bananas</li>
</ul>
</li>
<li>Vegetables</li>
</ul>
Which is to say that is what should happen. Unfortunately screen reader support for aria-owns
is still inconsistent.
Using this aria-owns test case, Jaws has support with Chrome, Firefox and Internet Explorer. NVDA has support with Chrome, but not Edge, Firefox or Internet Explorer. TalkBack does with Chrome, but VoiceOver does not in Safari (OSX or iOS).
Updated on 19th October to include results for Narrator (on Windows 10). The aria-owns
attribute is not supported by Narrator with either Edge or Internet Explorer.
I don’t know who will read this. I don’t even know why it has suddenly become important to write it, but for whatever it’s worth, this is an account of an event in my life that changed everything.
I lost my sight over the course of 12 months from late 1999 to late 2000. It was mostly my fault that it happened. I was diagnosed with Type I Diabetes when I was a little girl. At the time they explained that I would have to eat a precise amount of food each day, and that I would need to inject a precise amount of insulin to handle it. These measurements were reviewed and revised on an annual basis.
Note: Type I Diabetes means your body stops producing insulin and you have to inject it instead. Type II Diabetes (the kind you hear about on the news) means that your body still produces insulin but is unable to absorb it properly, which is why it can often be controlled through diet and tablet based medication instead of insulin injections.
When I was a little older I asked my paediatrician why it had to be this way. I wanted to know why I couldn’t work out how much food (carbohydrate) I was about to eat, measure my blood glucose, and then calculate my insulin dose based on those and other factors. After-all I reasoned, this is what my body would have done naturally, so why couldn’t I emulate that behaviour? To this day I don’t know whether he actually patted me on the head, or whether my sub-conscious has superimposed that memory based on his reply (“don’t be so ridiculous”), but it doesn’t really matter in the scheme of things.
Looking back I understand that was the moment that everything changed. It would take another 15 years for the impact of that moment to be felt, but that was where it all began. That was where the rebellion started.
At some point during my teens I discovered that I could skip an injection without anything terrible happening. It wasn’t something I did intentionally, at least not then, but it made me think I could get away with it from time to time. From then on I did just enough injections to stop anyone from figuring out what I was up to. I stopped monitoring my blood glucose levels almost entirely, and as soon as I was old enough I stopped going to see my doctor for annual checkups.
Throughout my student days I had a riot. I went to drama school where I worked hard and played harder. I smoked, danced, partied, fell in and out of love, discussed Stanislavsky and Brecht until the wee small hours, drunkenly declaimed Shakespeare, and kept on ignoring the fact I was diabetic.
By the time the century was drawing to a close I was working in the tech industry. Quite by accident I had gone to work for one of the UK’s first ISPs in early 1997. Somewhere along the way I taught myself to code and by 1999 was working as a web developer. This was the era of the .Com boom and everyone was having fun. There were pool tables in the office, Nerf guns on every desk, insane parties that the company wrote off against tax, Paul Okenfold was the soundtrack to our lives, and we’d fall out of clubs at 6am and drive to Glastonbury Tor to watch the sunrise just for the hell of it.
One morning in October 1999 I woke up with a hangover. As I looked at my reflection in the mirror I realised I could see a ribbon of blood in my line of sight. As I looked left then right, the ribbon moved sluggishly as though floating in dense liquid. Assuming it was a temporary result of the previous night’s antics, I left it a couple of days before visiting an optician to get it checked out. When I did, the optician took one look at the backs of my eyes and referred me to the nearest eye hospital for further investigation.
When diabetes is uncontrolled for a time it causes a lot of unseen damage. It works something like this: When you eat something your blood sugar levels rise and your body produces insulin to convert that glucose into usable energy. If enough insulin isn’t available then the cells in your body are starved of energy and begin to die and the excess glucose remains trapped in your blood-stream. If that wasn’t enough, the excess glucose smothers your red blood cells and prevents them from transporting oxygen efficiently around your body.
One of the ways this damage eventually manifests itself is Diabetic Retinopathy. In an attempt to get enough oxygen to the retina at the back of your eye, your body creates new blood vessels to try and compensate. The new blood vessels are created as an emergency measure and so they’re weaker than they need to be. This means they’re prone to bursting and haemorrhaging blood into the eye – creating visible ribbons of blood like the one I could see. The lack of oxygen to your retina and the accumulation of blood in your eye has the inevitable effect of damaging your sight.
The people at the eye hospital told me I would need laser treatment. This might halt the breakdown of the blood vessels at the back of my eye and enable the remaining vessels to strengthen enough to get the oxygen where it needed to go. It seemed like a reasonable option since my sight was still reasonably good at this point.
Laser treatment isn’t pleasant. It requires an anaesthetic injection into the eye, before a laser is fired repeatedly at the blood vessels at the back. One side-effect of this is that it plays havoc with your retina. If you think of your retina like a piece of clingfilm pulled taut over the open end of a jar, then imagine how it distorts when pressure is applied to it, you won’t be far off the effect laser treatment can have. I remember one round of treatment skewing the sight in one eye around 90 degrees and making everything slightly pink for several hours. Trust me when I tell you that it’s impossible to remain upright with one eye seeing straight and the other 90 degrees out of whack!
Looking back now I realise my consultant knew that laser treatment was almost certainly a futile gesture. Despite this no-one ever came straight out and told me the consequences of having advanced Diabetic Retinopathy, it was always spoken of in terms of progressive deterioration without ever mentioning the logical conclusion of that progression.
I do remember the day I admitted it to myself though. My sight had been steadily worsening over the months and it was a day in the spring of 2000 that it happened. I have no idea what prompted it, but I was walking down the stairs at home when it hit me like the proverbial sledge hammer - I would be blind. Until that moment I had never believed people when they said an emotional reaction could be like a physical blow. I don’t doubt that anymore. With absolute certainty I knew I would lose my sight and that I only had myself to blame. I sat on a step a couple up from the bottom of the stairs and fell apart. I cried like a child. I cried for my lost sight, for all my broken dreams, for my stupidity, for all the books I would never read, for the faces I would forget, and for all the things I would never accomplish.
Having come unravelled I couldn’t pull myself together, and after a few weeks reluctantly realised I needed help. My doctor prescribed anti-depressants that effectively put me to sleep for about 23 hours out of every 24. After six weeks I decided enough was enough and took myself off the meds for good.
A curious thing had happened in the intervening weeks though. Whilst I was asleep my mind appeared to have wrapped itself around the enormity of what was happening. This wasn’t any kind of revelation, but it was a recognition of what I was up against, and that was enough for the time-being.
Over the ensuing months I gave up work as my sight continued to deteriorate, and I stayed at home and tried to keep busy. Most people look baffled when I tell them that going stir crazy was one of the hardest things to deal with about losing my sight. People who know me understand that to me boredom is a fate worse than death (or blindness as it turns out)!
There were days when I raged out of control. Days when I screamed and threw things at the people I loved just because they were there. There was the day I stumbled in the kitchen and up-ended a draining rack full of crockery that smashed into a thousand pieces around me. The days when I demolished a keyboard I could no longer use, or kicked the hell out of the hoover because the cable was so caught up around the furniture I couldn’t untangle it. Days too numerous to count when I bruised, cut, scratched or burned myself in pursuit of everyday tasks, and the rage and the tears would overwhelm me all over again.
That amount of fury isn’t a good thing. It took my relationship with the most important person in my life to the brink of collapse, but fortunately patience is one of his abiding qualities even though I tested it to the limit during those times. My friends and family went through this every bit as much as I did, only they managed it with a degree of grace, humour and affection I was incapable of recognising at the time. Now I know with absolute certainty that if it hadn’t been for their collective love and support, things would have turned out very differently for me.
Towards the end of that year, not long before Christmas, the last of my sight vanished. To think of it now, it seems that I went to bed one night aware of a slight red smudge at the farthest reaches of my vision (the standby light on the television), then woke the next morning to nothing at all. I don’t suppose it happened exactly like that, but it’s close enough.
I do remember being surprised to learn that only 3% of blind people are completely blind. Most have some degree of light perception or even a little usable vision, but I’m one of the few who can see nothing at all, and nothing is the best way to describe it. People assume it must be like closing your eyes or being in a dark room, but it’s not like that at all. It’s a complete absence of light, so it isn’t black or any other colour I can describe.
To offset what would otherwise be an incredibly boring view, my mind obligingly gives me things to look at instead. It shows me a shadowy representation of what it thinks I would see if I could – like my hands holding a cup of tea in front of me. Since my mind is constrained only by my imagination, it rather charmingly overlays everything with millions of tiny sparkles of light, that vary in brightness and intensity depending on my emotional state.
My retinas are long since gone, so no actual light makes its way to the backs of my eyes. This is what gives my eyes their peculiar look – each pupil is permanently open to its fullest extent in an effort to take in light. Oddly this is the one thing that still makes me feel a little uncertain about being blind, but given that I no longer really remember what I look like, perhaps there will come a time when that uncertainty will fade.
With the last of my sight gone, I discovered something I hadn’t expected. Now I couldn’t see anything, things started to get a lot easier to deal with. Looking back now I realise that was because I stopped trying to look at what I was doing, and started to use my other senses.
I’ll pause at this point to clear up a common misconception – I do not have extraordinary hearing, sense of smell or any other sense. I do pay more attention to those other senses though, so although I’ll often hear a phone ringing when other people don’t, it’s only because I’m devoting more of my concentration to listening than they are.
The next few months were a time of discovery, sometimes painful, often frustrating, but also littered with good memories. I learned how to do chores, how to cook, where to find audio books, how to cross the road, what it feels like to drink too much when you can’t see straight to begin with, and many more things I’d learned once in my life before. The one thing I didn’t do was learn Braille, at least not with any dedication. I simply didn’t have the patience to go back to reading baby books, at a time when so much else was new and strange. I did learn something far more important though.
I discovered something called a screen reader. A piece of software that could be installed on my computer and which would speak on-screen content to me in a synthetic voice. It would even echo my keystrokes back to me as I typed, which was just as well because I suddenly realised I couldn’t touch-type despite having used computers on/off since the early 1980s!
I then decided to enrol on an online course with the Open University. The course was called “You, your computer and the internet”, a subject that was child’s play for someone who had been using the internet for nearly a decade. For someone who could barely use a computer with a screen reader, it was something of a practical challenge though. It took me an entire day to figure out how to log into the course environment, and for a while every new task seemed to take as long. Day by day things got easier though, and by the time I finished the course I was well on my way to regaining an important part of my life. Somewhere along the way I’d also rediscovered my love of learning, and promptly enrolled in another course. One thing led to another, and I eventually graduated with a degree in computer science in 2010.
It’s been 15 years since all this happened. Somewhere along the way I went back to work, and I now have the uncommonly good fortune to be working and collaborating with lots of smart and interesting people, many of whom I’m delighted to call friends.
I still find technology challenging sometimes, because we have yet to reach a time when things are engineered to be accessible as standard. That too is changing.
So life moved on, as life has a habit of doing, and as I celebrated my 40th birthday last year, it gave me cause to reflect.
There are three configuration changes you can make to Firefox. I'm obliged at this point to say that unless you're confident about changing such things, you probably shouldn't try this. That said, you're smart people and besides, it's your browser...
Note: This workaround does not seem to work with Firefox 40 on Windows 10 with the Jaws 17 public beta. Please use the comments to add further information about this workaround - but no technical support requests please.
The CSS Speech properties are intended to give content aural style, in the same way other CSS properties give content visual style. The CSS Speech module is largely unsupported in browsers, but the Web Speech API can be used to demonstrate something of the way CSS Speech might work in practice.
The CSS Speech module and Web Speech API both use Text To Speech (TTS). The CSS Speech module describes how a service that uses TTS (like a screen reader or voice assistant) speaks web content, and the Web Speech API produces synthetic speech using a TTS engine.
There are TTS engines bundled with most platforms. Voice assistants like Siri, Cortana or Google Now tend to use platform TTS engines. Screen readers may also use the platform TTS, but often come bundled with alternative TTS engines that offer a wider choice of voices or language support.
TTS voices are reasonably good at mimicking basic human speech patterns. Most respond to punctuation as though a person was reading the content aloud – they pause at commas, pause a little longer at full stops, or increase in pitch at the end of a sentence that asks a question. Some also simulate breathing patterns – decreasing in pitch when speaking long sentences without punctuation (as though running out of breath), or simulating the sound of breathing itself.
The trouble with TTS voices is that they’re completely flat. There is little variation in cadence or tone, and no emotional inflection at all. When you use a screen reader or listen to a voice assistant, everything is spoken in the same unchanging voice.
If CSS Speech was supported by browsers, it would be possible for a screen reader or voice assistant to determine the aural style of content and speak it accordingly. Content might be spoken more slowly, more loudly, or in a different voice for example.
In the absence of consistent browser support for CSS Speech, it isn’t possible to determine the computed aural style of content. Even if it were, there is no way to relay that information to a screen reader and force it to respond accordingly. There are no APIs for interacting directly with screen readers.
It is possible to use the Web Speech API to simulate the way a screen reader might respond to CSS Speech though. A basic demo is available (with the warning that it’s held together with chicken wire and sticky tape). Thanks to Aurelio De Rosa, from whom I borrowed the workaround for the getVoices() bug in Chrome.
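To give a flavour of how such a simulation might work (this is a rough sketch, not the demo itself, and the data-* attribute names are invented for the example), an element's text could be spoken with aural hints read from its markup:

// Rough sketch: speak an element's text using aural hints stored in
// hypothetical data-* attributes (data-voice-rate, data-voice-pitch, data-voice-volume).
function speakElement(element) {
  var utterance = new SpeechSynthesisUtterance(element.textContent);
  var rate = parseFloat(element.getAttribute("data-voice-rate"));
  var pitch = parseFloat(element.getAttribute("data-voice-pitch"));
  var volume = parseFloat(element.getAttribute("data-voice-volume"));
  // The SpeechSynthesisUtterance rate, pitch and volume properties are numbers,
  // so only apply the hints that parse successfully.
  if (!isNaN(rate)) { utterance.rate = rate; }
  if (!isNaN(pitch)) { utterance.pitch = pitch; }
  if (!isNaN(volume)) { utterance.volume = volume; }
  window.speechSynthesis.speak(utterance);
}
// For example: speakElement(document.querySelector("h1"));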
The CSS Speech properties let you define aural styles for content in the same way you define visual style. There is even an aural box model for arranging the spatial and temporal aspects of the aural presentation.
At present only WebKit/Safari has support for CSS Speech, and then only for the speak and speak-as properties.
As the display property determines whether content is rendered visually, the speak property determines whether it is rendered aurally. The speak property can be set to auto, none or normal, with auto being the default.
When the speak property is set to auto, it defers to the state of the display property. When display: none; is set, the computed value of the speak property is also none; otherwise it is normal.
There are CSS properties for manipulating other basic speech characteristics:
The voice-volume property determines the relative loudness of the TTS output. It can be set by keyword (silent, x-soft, soft, medium, loud, x-loud), or by a decibel value (15dB). By default it's set to medium.
When voice-volume: silent; is set, the content is rendered aurally but spoken at zero volume. In this respect it is similar to visibility: hidden;, which causes content to be rendered in the DOM but hidden from view.
The voice-rate property determines the speed at which content is spoken. It can be set using a keyword (normal, x-slow, slow, medium, fast, x-fast), or a percentage. It defaults to normal.
The voice-pitch property determines the frequency at which the content is spoken. It can be set using a keyword (x-low, low, medium, high, x-high), or by percentage. It is set to medium by default.
p {
speak: normal;
voice-volume: loud;
voice-rate: medium;
voice-pitch: low;
}
The Web Speech API consists of two interfaces: The SpeechRecognition interface and the SpeechSynthesis interface. The SpeechRecognition interface handles speech input, and can be used to enable voice commands within a web application. The SpeechSynthesis interface handles synthetic speech output via a TTS engine.
According to Can I Use, the Speech Synthesis Interface is supported in all major browsers.
The SpeechSynthesis interface is available on the window object. Its methods are speak(), pause(), cancel(), resume(), and getVoices().
The SpeechSynthesisUtterance() constructor is used to create a speech object. The text to be spoken is then assigned to the speech object using the text property. Other properties include volume, rate and pitch.
Passing the speech object to the SpeechSynthesis interface using speak(), causes the default TTS voice to speak the text content.
var utterance = new SpeechSynthesisUtterance();
utterance.text = "Tequila";
window.speechSynthesis.speak(utterance);
The getVoices() method can be used on the SpeechSynthesis interface to return a list of available TTS voices. A TTS voice can then be assigned to the speech object using the voiceURI property.
Chrome and Safari use the voice property instead of voiceURI, so the examples and demos use voice.
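A minimal sketch of that voice selection might look like the following (the en-GB language filter is an arbitrary choice for the example, and the voiceschanged handler allows for browsers that populate the voice list asynchronously):

// Sketch: choose a voice from getVoices() and assign it to an utterance.
// Some browsers return an empty list until the voiceschanged event fires,
// so the list is read inside that handler.
window.speechSynthesis.addEventListener("voiceschanged", function () {
  var voices = window.speechSynthesis.getVoices();
  var utterance = new SpeechSynthesisUtterance("Tequila");
  for (var i = 0; i < voices.length; i++) {
    // Use the first available en-GB voice, if there is one.
    if (voices[i].lang === "en-GB") {
      utterance.voice = voices[i];
      break;
    }
  }
  window.speechSynthesis.speak(utterance);
});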
The demos are poor imitations of screen reader interaction. Only the tab key is supported as a means of navigating through content, and to make even this work it was necessary to make non-interactive content focusable (something that should never be done in production code). Hopefully the demos do illustrate how content could be made more interesting to listen to with CSS Speech though.
If/when browser support for CSS Speech becomes more prevalent, it would be important for browsers to provide a mechanism for ignoring aural styles. Not everyone will want to listen to content spoken in different ways, and it should be possible for them to turn off this feature.
All the demos mentioned in this post, plus some earlier examples of the SpeechSynthesis interface are available on Github.
If you don't have a blow-torch you can use a grill. Heat it to the highest possible setting (it's important that it's as hot as possible so the sugar melts without heating up the crème underneath), then place the ramekins under the grill and watch carefully until the sugar has melted.
You can use cinnamon instead of vanilla (technically this makes it a crema catalana, the Spanish version of a crème brulée). Just put half a cinnamon stick in with the cream/milk (removing it before mixing with the eggs).
You can use vanilla bean paste if you can't get hold of a vanilla bean pod. It's also possible to use vanilla extract, but that can add an artificial flavour to the taste if you're not careful.
Websites like Facebook, Twitter and Google provide keyboard shortcuts for repetitive tasks, like posting a status update, replying to a tweet, or deleting an email. The shortcuts are intended to make things easier for keyboard-only users, but also for mouse users (because it's often quicker to hit a keyboard shortcut than it is to locate the relevant control and click on it with a mouse).
Each of these sites (and many others) provide keyboard shortcuts using JavaScript. On the face of it this seems like a reasonable way to do things, but layering keyboard functionality on top of the application means that none of the shortcuts are available to Windows screen reader users.
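As a rough sketch of the kind of script these sites use (the "p" shortcut and the status field id are assumptions for the example, not code from any of the sites mentioned), a shortcut might be wired up like this:

// Sketch of a JavaScript keyboard shortcut: pressing "p" moves focus to a
// (hypothetical) status update field with an id of "status".
document.addEventListener("keydown", function (event) {
  // Don't hijack the key when the user is typing in a form field.
  var tag = event.target.tagName;
  if (tag === "INPUT" || tag === "TEXTAREA" || event.target.isContentEditable) {
    return;
  }
  if (event.key === "p") {
    var statusField = document.getElementById("status");
    if (statusField) {
      statusField.focus();
      event.preventDefault();
    }
  }
});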
Windows screen readers use a virtual model to support interaction with content rendered in the browser. A virtual buffer of the content is created, based on information from the DOM and accessibility tree. It is this virtual version of the content that the screen reader presents to the user.
Interaction with the virtual buffer works like this: the screen reader listens for keyboard events. When it detects a key press that corresponds to a screen reader command, that action is executed in the virtual buffer. For example, if the h key is pressed when a Windows screen reader is running, focus will move to the next heading (h1 – h6) in the virtual buffer. Otherwise the key is passed back through to the browser, where it will be caught by the JavaScript that handles the shortcuts provided by the website.
The trouble is that Windows screen readers utilise almost every available key for screen reader specific commands. This means that few keys make it past the screen reader to the browser, and consequently the JavaScript shortcuts are never triggered.
To take a specific example, p is the Facebook shortcut for posting a status update. If you use it without a Windows screen reader running, focus moves to the “What’s on your mind” field. If you try it with Jaws running, focus will move to the next paragraph, and in NVDA nothing will happen at all. In both cases the p key is intercepted by the screen reader, and never makes it any further.
It is possible to tell both NVDA and Jaws to pass the next key through to the browser. In NVDA you press the NVDA modifier (insert) plus f2, followed by the key you want to pass through. In Jaws it's the Jaws modifier (insert) plus 3, followed by the key to pass back to the browser. This is not a user-friendly way to interact with a website though, even assuming you know it's possible in the first place.
If the screen reader could identify the shortcut keys provided by the website, it would be possible to give the user a choice: use the screen reader's native shortcuts, or use the shortcuts provided by the website. The problem is that when JavaScript is used, there is no information about the available shortcuts exposed in the browser's accessibility tree. Each of the platform accessibility APIs has a shortcut property (dating back to the days of accesskey), but it isn't possible to access the APIs with JavaScript.
An April 2015 update to Jaws 16 introduced a feature that attempts to solve the problem. The “Web application reserved keystrokes” setting tells Jaws to stop intercepting keys that have been reserved by the website. In the case of Facebook, Jaws would ignore the p as a command to move to the next paragraph, and send the key through to the browser where it would trigger the JavaScript shortcut for posting a status update.
This feature does solve the problem (for Jaws users at least), but it’s something of a dirty hack. In order for Jaws to know which keys have been reserved by the website, it’s necessary to reference the shortcuts in the HTML. For example:
<div data-at-shortcutkeys="{'j': 'Key to navigate to next post', 'q': 'Key to place focus in chat'}"></div>
Jaws then has to scrape the DOM to get at that information. It’s a method that’s prone to error, and scraping the DOM is always much less efficient than querying the accessibility API.
What we need is a way for websites to provide keyboard shortcuts that are exposed in the browser’s accessibility tree – so that the information can be queried by any assistive technology using the accessibility API. Perhaps it’s time to revisit accesskey, fix the broken bits and extend it to become a useful thing?
Quick recap: The :before/:after pseudo-elements insert content either before or after an element's existing content. For example the CSS shown below will insert "bar" immediately after the "Foo" that is already present in the HTML:
#a:after { content: 'bar'; }
<a href="/" id="a">Foo</a>
The result of which can be seen in this test case. For more information on using these pseudo-elements, read Chris Coyier's introduction to after/before on CSS Tricks.
CSS generated content isn’t included in the DOM. Ordinarily browsers take information from the DOM to create the accessibility tree, but in this case the generated content is still factored into the accessible name computation for the element.
Using the test case mentioned before indicates that generated content is accessibility supported in most browsers, and recognised by screen readers accordingly. Internet Explorer is the only browser regularly used with a screen reader that does not expose the generated content as the accessible name for the element. Screen readers depending on MSAA/UIAutomation in IE are therefore unaware of the generated content.
Browser and screen reader support for CSS generated content
Screen reader | Chrome 41 (Android) | Chrome 41 (Windows) | Firefox 36 (Windows) | Internet Explorer 11 (Windows) | Safari 8 (OSX) | Safari 8.1 (iOS) |
---|---|---|---|---|---|---|
Jaws 16 | N/A | Yes | Yes | No | N/A | N/A |
NVDA 2015.1 | N/A | Yes | Yes | No | N/A | N/A |
TalkBack | Yes | N/A | N/A | N/A | N/A | N/A |
VoiceOver | N/A | N/A | N/A | N/A | Yes | Yes |
With Internet Explorer accounting for about 15% of traffic (in March 2015), there is good reason to consider the viability of using CSS generated content.
There is another more important consideration however - the separation of content and structure from design. Introducing content through the design layer breaks the standards model, and so it should be done only when the generated content does not alter the meaning of the original content. In other words, use CSS generated content to change or supplement the design, but not to create or alter important content on the page.
There are several different ways to markup web content so it's meaningful to search engines, but search engine support for each format is inconsistent. This makes it awkward for developers who must identify the schema that most closely matches their requirements.
The vocabularies available on schema.org are the result of a public collaboration led by Google, Microsoft, Yandex and Yahoo! This means that developers can choose a single format for marking up information that is supported by all four major search engines (Google, Bing, Yandex and Yahoo!).
The schema.org vocabularies can be used with Microdata, RDFa or JSON-LD. Microdata and RDFa can both be dropped straight into HTML, and of the two Microdata is the easiest way to start using the schema.org vocabularies.
The posts in the recipe book category of this website use schema.org vocabularies with Microdata to markup information. The basic HTML looks something like this:
<article>
<h1>Scrambled eggs on toast</h1>
<p>Recipe introduction...</p>
<h2>Information</h2>
<dl>
<dt>Makes</dt>
<dd>Two helpings.</dd>
<dt>Time</dt>
<dd>15 minutes.</dd>
</dl>
<h2>Ingredients</h2>
<ul>
<li>4 large eggs.</li>
<li>90ml (6tbsp) single cream.</li>
...
</ul>
<h2>Method</h2>
<ol>
<li>Beat the eggs lightly...</li>
...
</ol>
<h2>Notes</h2>
<p>Recipe notes...</p>
</article>
The itemscope attribute
The first step is to establish the scope of the thing being marked up. We do this by adding the itemscope attribute to the element that contains the relevant content:
<article itemscope>
...
</article>
The itemtype attribute
The next step is to identify what type of thing it is. The schemas on schema.org are based on type, where each type has a set of properties that describe the thing in question. The schemas are arranged in a hierarchy, with properties being inherited down through the ancestral pathway.
The type that sits at the top of the hierarchy is Thing. It's the most generic of all the schema.org types, with properties like name and description. Types that inherit from Thing include Event, Organization, Place and CreativeWork.
One of the types that inherits from CreativeWork is Recipe, which is exactly the type of thing we need here. The type is defined using the itemtype attribute with the URL for the Recipe schema as its value. The itemtype attribute is applied to the element that has the itemscope attribute:
<article itemscope itemtype="https://www.schema.org/Recipe">
...
</article>
The itemprop attribute
From here it's simply a matter of applying the relevant Recipe schema properties to the rest of the content. This is done using the itemprop attribute with the property name as its value.
The name property is assigned to the heading that contains the name of the recipe:
<article itemscope itemtype="https://www.schema.org/Recipe">
<h1 itemprop="name">Scrambled eggs on toast</h1>
...
</article>
The text property is assigned to the introductory text at the start of the recipe, and can later be applied to the notes at the end of the recipe:
<article itemscope itemtype="https://www.schema.org/Recipe">
<h1 itemprop="name">Scrambled eggs on toast</h1>
<p itemprop="text">Recipe introduction...</p>
...
</article>
The recipeYield property is used to markup the number of servings the recipe will make.
The totalTime property is used to indicate the amount of time it will take to create the dish. The Recipe schema also includes properties for prepTime and cookTime, should you want to break it down.
Properties relating to time must be presented in ISO8601 format. To do this and maintain human readability, the HTML5 time element is used:
<article itemscope itemtype="https://www.schema.org/Recipe">
<h1 itemprop="name">Scrambled eggs on toast</h1>
<p itemprop="text">Recipe introduction...</p>
<h2>Information</h2>
<dl>
<dt>Makes</dt>
<dd itemprop="recipeYield">Two helpings.</dd>
<dt>Time</dt>
<dd><time datetime="PT0H15M" itemprop="totalTime">15 minutes.</time></dd>
</dl>
...
</article>
The ingredients property is used to markup each of the recipe's ingredients, and the recipeInstructions property is assigned to each of the steps that explain how to create the dish:
<article itemscope itemtype="https://www.schema.org/Recipe">
<h1 itemprop="name">Scrambled eggs on toast</h1>
<p itemprop="text">Recipe introduction...</p>
<h2>Information</h2>
<dl>
<dt>Makes</dt>
<dd itemprop="recipeYield">Two helpings.</dd>
<dt>Time</dt>
<dd><time datetime="PT0H15M" itemprop="totalTime">15 minutes.</time></dd>
</dl>
<h2>Ingredients</h2>
<ul>
<li itemprop="ingredients">4 large eggs.</li>
<li itemprop="ingredients">90ml (6tbsp) single cream.</li>
...
</ul>
<h2>Method</h2>
<ol>
<li itemprop="recipeInstructions">Beat the eggs lightly...</li>
...
</ol>
<h2>Notes</h2>
<p itemprop="text">Recipe notes...</p>
</article>
The Recipe schema has many more properties that could be used, including recipeCuisine, nutrition and cookingMethod. Properties from the CreativeWork schema could also be added to other posts on this site, including about, comment, datePublished and keywords.
When you use any kind of markup it's worth checking it conforms to the standards you're using. The Google structured data testing tool checks code snippets or a URL for conformance with the schema.org vocabularies.
That's how easy it is to start marking up your content so that Google, Bing, Yandex and Yahoo! can reliably identify what it is. Should making things easier for search engines seem like wasted effort, it's worth bearing in mind that the better search engines get at identifying things, the better we'll all get at finding them!
When you set the timeout initially, don't forget that some people will take longer to use your application than others. This might be because someone lacks confidence with the web or because they find reading or comprehension difficult. In both cases an extended period of inactivity might be the result of them taking extra time to read and understand the content, and a timeout notification will only exacerbate the problem. Make the initial timeout as generous as possible.
When the timeout is imminent, a common approach is to display a notification on-screen. The notification usually indicates how much time is left before the timeout occurs, and includes an option to postpone it for a further period of time.
It's better to use DOM scripting to add the notification to the page at the appropriate time. If you rely on toggling display:none; in the CSS, the notification will be permanently available to anyone who has CSS disabled and/or who uses their own stylesheets.
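As a rough sketch of that approach (the session length and warning period are assumptions for the example), the notification could be created and added to the page only when the warning is due:

// Sketch: only add the notification to the DOM when the timeout is imminent.
var SESSION_LENGTH = 20 * 60 * 1000; // assumed 20 minute session
var WARNING_BEFORE = 2 * 60 * 1000;  // warn 2 minutes before the timeout
setTimeout(function () {
  // Build the notification with DOM scripting, rather than un-hiding markup
  // that is always present in the page.
  var notification = document.createElement("div");
  var heading = document.createElement("h1");
  heading.textContent = "You will be logged out soon";
  notification.appendChild(heading);
  document.body.appendChild(notification);
}, SESSION_LENGTH - WARNING_BEFORE);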
When the notification appears there are a few simple things you can do to make sure everyone is aware of it. Your basic notification might look like this:
<div>
<h1>You will be logged out soon</h1>
…
</div>
Use the tabindex attribute with a value of -1 on the div. When the notification appears on-screen, use obj.focus() to move keyboard focus to the notification. This will ensure that keyboard users (including people using screen readers and screen magnifiers) are aware that the notification has appeared.
When the user confirms the timeout extension, keyboard focus should be returned to its original location on the page. This stops keyboard users from having to manually navigate back to their original point in the content.
Use the group role on the div. This will indicate to screen readers (and other assistive technologies that support ARIA in the future) that the content contained within the div has a common purpose.
Use the aria-labelledby attribute on the div and use the idref of the h1 as its value. This will give the div an accessible name based on the content of the h1 element so that screen readers can differentiate the group from other content.
<div tabindex="-1" role="group" aria-labelledby="notification">
<h1 id="notification">You will be logged out soon</h1>
…
</div>
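A minimal sketch of that focus management might look like this (the function names are invented for the example; only the markup above is taken from this post):

// Sketch: move focus to the notification when it appears, and return it when
// the user confirms the timeout extension.
var previousFocus = null;

function showNotification() {
  var notification = document.querySelector('[aria-labelledby="notification"]');
  previousFocus = document.activeElement; // remember where the user was
  notification.focus();                   // works because of tabindex="-1"
}

function extendSession() {
  // Called when the user confirms the timeout extension.
  if (previousFocus) {
    previousFocus.focus();                // return focus to its original location
  }
}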
In doing these things you’ll make sure that people have a reasonable amount of time to use the application comfortably, that they’re aware when a timeout is about to happen, and that they can act to postpone it in good time.
I lived in China in the late 90s and was in food heaven. When I came back to the UK I really struggled to find any decent Chinese food cooked the way it's cooked in China and not in the west. Out of desperation I taught myself how to cook properly.
This beef recipe is as easy as they come and for me captures the taste of China. It’s also a dish that keeps on giving as you can use it for lots of other dishes which I’ve also included below.
I use the left overs for all sorts of things including the sauce which is called ‘master sauce’. Master sauce is kind of like the Chinese version of stock and can be frozen or used right away. Below are a few of the things I make with both the sauce and the beef.
Stir-fry onions, vegetables, meat or tofu in sesame oil. Add noodles and a splash of the sauce and you're done.
One of my favourite street foods is tea eggs. These are eggs that have been hard boiled in a tea and soya sauce mixture. They are delicious hot and can be bought at stalls with bubbling vats of hot tea with eggs in.
I like to make 'tea eggs' in the master sauce mixture along with the beef. The trick is to boil them for 10 minutes or so, remove the eggs from the sauce then gently crack the shell by rolling the egg on a counter. Don't remove the shell. Return the eggs to the mixture and cook for another 30 to 40 minutes. The sauce will seep into the egg giving it a lovely rich flavour; the cracks also make beautiful patterns on the egg after it has had its shell removed.
Eat as is, or cut them in half and put them in noodle soups. For a real treat use quail eggs.
This is another hybrid of a very traditional Chinese dish of soft bean curd with soya and sesame oil.
Place a cube of soft bean curd on a plate, chop up coriander, spring onion, and some chili and pop it on top of the bean curd. If you have tea eggs chop a couple of them up and add them too. Pour on some of the master sauce. Heat a tablespoon of sesame oil then pour that on top to give it a smoky look and taste.
This makes a good canapé and is another mix of east and west. Mix some crème fraîche and wasabi. You can make this as strong or as mild as you like. Place a dollop in mini Yorkshire puddings with a generous slice of beef on top. Sprinkle chopped up chives on top and serve.
If you are into good Chinese food the best recipe books are:
The aria-controls attribute creates a cause and effect relationship. It identifies the element(s) that are controlled by the current element, when that relationship isn't represented in the DOM. For example a button that controls the display of information contained within a div:
<button onclick="showInfo();" aria-controls="I">Show info</button>
…
<div id="I">Information.</div>
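The showInfo() function isn't defined in the example above; one possible (and purely illustrative) implementation would simply toggle the visibility of the controlled element:

// Illustrative sketch of showInfo(): toggle the display of the controlled element.
function showInfo() {
  var info = document.getElementById("I");
  info.hidden = !info.hidden;
}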
The more widely the two elements are separated in the DOM, the more useful the aria-controls attribute becomes. Imagine a checkbox that controls the filtration of search results, or a tab that controls a tabpanel:
<ul role="tablist">
<li role="presentation"><a href="#" onclick="displayTab(1);" role="tab" aria-controls="panel1" aria-selected="true">Tab 1</a></li>
<li role="presentation"><a href="#" onclick="displayTab(2);" role="tab" aria-controls="panel2" aria-selected="false">Tab 2</a></li>
...
</ul>
<div role="tabpanel" id="panel1">...</div>
<div role="tabpanel" id="panel2">…</div>
When a User Agent (UA) supports aria-controls, it makes it possible for focus to be moved from the current element directly to the element it controls. The alternative is to navigate through all the intervening content in hopes of discovering what might have changed elsewhere on the page. For this reason, aria-controls should only be used to point to something that is available in the DOM and which can be navigated to.
UA support for aria-controls is still somewhat inconsistent. Firefox and Internet Explorer both expose aria-controls through their accessibility API (IAccessible2 and UIAutomation respectively). Assistive Technology (AT) support is less encouraging though. Jaws (14+) supports aria-controls, but only in Firefox. When Jaws encounters the aria-controls attribute it announces "Use the Jaws key Alt and m to move to the controlled element". NVDA, VoiceOver and Window Eyes don't seem to support it at all in any browser.
Despite the fact support amongst browsers and screen readers could be better, the aria-controls attribute is still worth using. Creating relationships within the DOM is certainly more robust, but it isn't always practical or achievable. The presence of aria-controls won't damage the User Experience (UX) for people using UAs that don't support it, but it will enormously improve the UX for those people whose browser/screen reader combinations do.
It was the week that HTML5 became a W3C Recommendation, the week we celebrated the 25th anniversary of the web, and 20 years of the W3C itself.
It was also a week where people and ideas came together. Robin Berjon organised a set of lightning talks. Each person was given four minutes to share their ideas, and the result was an hour of interesting, informative, funny and challenging content.
All the TPAC 2014 lightning talks are available on Vimeo, but for me the highlights included Robin Berjon on Web Ants, Mark Sadecki on Canvas accessibility, Chaals McCathie Nevile on Good Intentions, and Dr Stanley Dards (AKA Daniel Davis) on the real origins of the web.
For my part, I chose to share some ideas about how we can (and should) design like we give a damn!
Windows screen readers have multiple modes of interaction, and depending on the task being carried out they'll automatically switch to the most appropriate mode. This post explains why Windows screen readers behave the way they do, and how your code can influence that behaviour.
When a document is rendered in the browser, Windows screen readers like JAWS and NVDA access the Document Object Model (DOM) either directly or through the available accessibility APIs. The DOM is a hierarchical representation of the objects in the web-document, and the information that’s retrieved from it is augmented by the screen reader and displayed to the user as a virtual copy of the original.
By creating a virtual copy of the document, screen readers make it possible for blind people to interact with content in ways that would otherwise be impossible on the Windows platform. This happens because the screen reader intercepts most keypresses before they reach the browser, triggering an interaction with the virtual document instead.
For example the left/right cursor keys are intercepted and used to move focus to the previous/next character in the content, and the up/down keys move focus to the previous/next line instead of scrolling the page.
This behaviour also makes it possible to navigate through content using shortcut keys that are native to the screen reader. Most Windows screen readers follow a broadly similar shortcut convention: For example t moves focus to the next table, h to the next heading, l to the next list, g to the next graphic and so forth. It is also possible to open dialogues that list all the elements of a particular type – for example form controls or links.
In JAWS this mode of interaction is known as virtual mode, and in NVDA and Window-Eyes as browse mode. The copy of the original document is generally referred to as the virtual buffer.
Not all keypresses are captured by the screen reader however. When the tab key is pressed it is automatically passed through to the browser where it causes keyboard focus to move to the next piece of interactive content, exactly as though the screen reader weren’t running. The same thing happens in other circumstances too, when the enter key is used to activate a link or the space key to select a checkbox for example.
This intelligent processing happens automatically and without the user being aware of it, but there are circumstances in which the user needs to know about a change of interaction style. When interacting with a text field or combobox the user needs to know that the keys they press will do something other than perform a screen reader navigation command – for example that h will type a character instead of move focus to the next heading, or that the down cursor key will select an option in a combobox instead of move to the next line of content.
In NVDA this mode of interaction is known as focus mode, and in JAWS it's forms mode. Window-Eyes doesn't give it a name, but simply refers to browse mode being off. There are subtleties to this mode of interaction though. For example, NVDA will automatically enter/exit focus mode when the tab key is used to move focus on/off the form field, but not if the cursor keys are used. JAWS will automatically enter/exit forms mode whichever method is used to move focus to the field, although as of JAWS 16 it's possible to configure JAWS to ignore forms mode when navigating through content using the cursor keys. Both screen readers can also be forced to switch modes manually, and both indicate the switch in mode with an audible "click".
There is one anomaly amongst form fields when it comes to forms/focus mode. Although it’s possible to select a radio button without switching modes, it is necessary to be in forms/focus mode in order to use the cursor keys to cycle through the radio buttons in a group. Being unaware of this can sometimes lead to the mistaken belief that a radio group is somehow flawed.
Although this mode switching may seem unintuitive to someone unused to Windows screen readers, it works well in practice and most screen reader users are unaware of what’s happening “under the hood”. From a development point of view it’s far more important to understand something of screen reader mechanics though.
For the most part a screen reader will handle the different interaction modes automatically, providing the underlying code of the original document is robust semantic markup. All bets are off when it comes to custom/rich internet widgets though. A custom widget (like a menubar or tab set) is a web-based component that’s designed to behave like its counterpart in a software application. As a rule Windows screen readers don’t use a virtual buffer with software applications, so putting a custom widget in a web-document suddenly forces two screen reader paradigms into the same space.
A set of tabs is a good example: When interacting with a set of tabs in a software application, the left/right cursor keys cycle between each of the tabs in the set. When a set of tabs is transposed into a web-document the same interaction design pattern is supported by the script that provides the widget’s functionality. Herein lies the challenge though: A Windows screen reader will intercept the left/right keystrokes and use them to move focus within the virtual buffer, instead of passing them through to the browser to interact with the set of tabs.
ARIA (known as WAI-ARIA on formal occasions) is the solution. When certain ARIA roles are applied to custom widgets, they inform the screen reader that the element (or group of elements) has a specific purpose, and also that virtual/browse mode is not appropriate. The result is that the screen reader switches into applications mode and treats the widget as though it were a component of a software application.
To all intents and purposes, applications mode is the same as forms/focus mode – it causes the screen reader to pass keystrokes back through to the browser so they can fulfil their original purpose. For example, when the tablist and tab roles are used as part of the tab widget design pattern, using the tab key to move focus onto the first tab in the set causes a Windows screen reader to automatically switch into applications mode. From that point all the keyboard interaction is handled by the script. This does mean of course that the script driving the functionality of the widget has to be setup to handle keyboard interaction!
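As a rough sketch of the kind of script-driven keyboard handling being described (the markup and selector are assumptions, not taken from any particular library), arrow key support for a set of tabs might look like this:

// Sketch: cycle focus between elements with role="tab" using the arrow keys.
// Once focus is on a tab the screen reader is in applications mode, so these
// keystrokes reach the script instead of the virtual buffer.
var tabs = Array.prototype.slice.call(document.querySelectorAll('[role="tab"]'));
tabs.forEach(function (tab, index) {
  tab.addEventListener("keydown", function (event) {
    var next = null;
    if (event.key === "ArrowRight") {
      next = tabs[(index + 1) % tabs.length];
    } else if (event.key === "ArrowLeft") {
      next = tabs[(index - 1 + tabs.length) % tabs.length];
    }
    if (next) {
      next.focus();           // move keyboard focus to the adjacent tab
      event.preventDefault(); // stop the keystroke doing anything else
    }
  });
});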
With thanks to Hans Hillen.
My disability is personal to me, and I share that information at my discretion. Proponents of screen reader detection say it would be discretionary, but that's like choosing between plague and pestilence. Choosing between privacy and accessibility is no choice at all.
We’ve spent years encouraging people to move away from text-only websites, and with good reason. If there is one thing that history should have taught us by now, it’s that social segregation is a bad idea.
The best screen reader detection can hope for, is to tell whether a screen reader is present or not. People use screen readers for different reasons – because they’re blind, partially sighted, Dyslexic or on the Autistic spectrum for example. Even this may be overly optimistic. Unless screen reader vendors agree to share some kind of UA string, the detection mechanism is likely to be access to the browser’s accessibility layer. In that case it may only be possible to detect an assistive technology in the most general sense, whether it be a screen reader, screen magnifier, speech recognition tool or something else entirely.
We've spent time turning to web standards and feature detection, instead of browser sniffing and excluding the ones we didn't care to support (guilty as charged). Screen reader detection leaves us vulnerable to the same exclusion all over again, only this time feature detection won't come to the rescue. The relationship between screen readers and browsers is symbiotic, and in terms of traditionally detectable features, screen readers derive most of their capability from the browser.
Most things that make a website usable with a screen reader are achieved by conforming to web standards, and the rest require relatively little modification. In these days of responsive design, including a media feature for screen readers would automatically double the work involved. You’d need to serve up a screen reader alternative for every break point version of your website.
Hence this blog post. But as the conversation continues, please bear in mind that this isn’t really about screen reader detection, or even assistive technology detection, at all. What is really being discussed is disability detection, and that is a very different thing altogether.
The Cyber Streetwise website has already received criticism from industry and the press, for its poor technical architecture and unintuitive interface design. Even with such criticisms in mind, the appalling lack of accessibility on the Cyber Streetwise website is simply astonishing.
The Cyber Streetwise website cannot be used with the keyboard. It can’t be used if you use a screen reader, screen magnifier or speech recognition tool. It catastrophically fails almost every principle of inclusive design there is.
The only thing that surpasses the inexcusable absence of accessibility, is Cyber Streetwise’s belief that accessibility was a considered part of the website’s development to begin with.
When contacted on Twitter to ask why accessibility for disabled people hadn’t been considered, @CyberStreetwise replied to say "it was! We’re deploying the accessible version of the site this week". Pushing an update to a website over a month after launch is certainly an interesting definition of having considered something!
When asked whether they meant the update would make the website accessible, or that an alternate version would be made available, @CyberStreetwise replied to say it would be a "progressively enhancing experience".
Checking the Cyber Streetwise website a week later, it was still impossible to use with a keyboard or any of the aforementioned assistive technologies. When contacted again, @CyberStreetwise replied to say "the accessibility update has already been completed! Thank you for your comment, remember to #BeCyberStreetwise". They helpfully followed up with "There's a whole different version of the site that can be accessed by clicking on the 'accessible site' link.".
Closer inspection did indeed reveal the presence of a "link" leading to an alternate version of the website. It’s been years since an alternate version of a website was considered a reasonable (or even necessary) way to provide accessibility. Apart from building websites like it’s 1999, it’s the antithesis of progressive enhancement.
The truly ironic thing, is that the link can’t be accessed with the keyboard, or a screen reader, screen magnifier or speech recognition tool. It isn’t even a proper link, it’s a span with onclick functionality and absolutely no semantic meaning whatsoever.
At the time of writing, @CyberStreetwise remains silent on the subject of whether they’d actually tried the link with anything other than a mouse.
Accessibility could not have been considered during the design and development of the Cyber Streetwise website. If it had, it would have been built from standards compliant markup with fully integrated accessibility. Instead it’s a peculiar mess of pre-loaded JavaScript that one developer described as "code you might write for a bet".
Having signally failed to build in accessibility from the outset, Cyber Streetwise is now adding insult to injury by offering an alternate version for disabled people to use. Patrick Lauke summed it up beautifully: An alternate website is segregation not inclusion.
Choosing to call out a website for poor practice isn't a decision taken lightly. Web development can be complex and challenging. However the accessibility failure of the Cyber Streetwise website is systemic. All the evidence of the finished website suggests that accessibility was ignored at every level, from the moment the website was commissioned to the moment it was launched.
The situation arises when you have a form inside a container with role="main" applied to it. For example:
<div role="main">
<form method="post">
<div>
<label for="username">Username: </label><br>
<input type="text" id="username">
</div>
<div>
<label for="password">Password: </label><br>
<input type="password" id="password">
</div>
<div>
<input type="submit" value="Log in">
</div>
</form>
</div>
The problem occurs in Internet Explorer 8, 9 and 10 (and possibly older versions too). It first appeared with Jaws 13 and persists in Jaws 14.
In an added twist the problem only happens when Jaws verbosity is set to beginner (the default) or intermediate. If verbosity is set to advanced the erroneous announcements don't happen.
There is a workaround that stops Jaws from doing this. Ordinarily it's good to be cautious about hacking around user agent bugs, but in this case it isn't too much of a dirty hack. Just add role="form" to the form element.
<div role="main">
<form role="form" method="post">
...
</form>
</div>
It shouldn't be necessary to do this. The role maps to the element natively in HTML anyway. In this case though, adding role="form" directly in the HTML doesn't break the natural semantics of the page, and it won't cause any backwards/forwards compatibility issues either.
Josephe’s notes: In the USA top sirloin is below the sirloin and the tenderloin but above the bottom sirloin which is behind the short loin. I believe in the UK that would come somewhere in the rump. Pain in the rump you say? You just want 90 to 92 percent lean ground (minced) beef.
Josephe’s notes: you can use any pasta you like, sometimes I like to use shells, my wife, Linda, prefers full bore spaghetti. Advanced: make your own pasta.
Josephe’s notes: I use a long handled wooden spoon to stir. If the sauce starts to stick to the bottom of the pot, reduce the heat some and use the spoon to scrape it back into the mix.
Steam will condense on the lid, pour that off into the sink. You want to reduce the sauce over a long cooking time so be sure to keep pouring the liquid off the lid.
Caution: bubbling hot sauce will spurt at you when you open the lid. That’s where the long handled spoon comes in handy. I stick the spoon in and start stirring using the lid as a shield.
Josephe’s notes: Once I have the sauce simmering I make the breadcrumbs. I use sourdough bread, you can use whatever you have around. I make sandwiches with the large pieces and breadcrumbs with the smaller end pieces. I store up the ends in the freezer until I’m ready to make a batch. Pro tip: I keep all my bread in the freezer and toast it when ready to use so it’s always fresh.
Josephe’s notes: Store any leftover breadcrumbs in the freezer for next time.
Josephe’s notes: In dry weather and in torrential downpours, at night and during the day, all year I use an outdoor propane grill. I grill the sausages, being very careful not to burn them. This requires constant turning. You may not have a grill for your sausages. Get one. Otherwise, use whatever barbarous method you may have of browning the sausages.
Josephe’s notes: A method for boiling pasta? Really? Yes, really. People use far too much water when boiling pasta. It doesn’t take much boiling to cook angel hair pasta, which is one reason I prefer it. After all, I’ve just spent the better part of a day preparing to eat, the last step needs to be fast!
Play the soundtrack by Nino Rota to Fellini's Juliet of the Spirits on your stereo.
My parents emigrated from Ireland to New York in the late 1940s. The Italian women in the neighborhood took one look at my painfully thin dad and proceeded to teach my mom Italian cooking. You should have tasted her lasagna. Magnifico!
The first time my dad saw a dish of spaghetti he thought it was worms. This basic dish, spaghetti and meatballs, fed us all for days each time mom made it. Nothing beat a meatball sandwich the day after. It was, and still is, my ultimate comfort food.
My daughter, Siobhan, using her speech generating device, starts asking for spaghetti just before the fall weather begins. Then I truly know it's that time of the year again.
Let's take an example that crops up from time to time:
<nav>
<ul role="navigation">
<li><a href="home.html">Home</a></li>
<li><a href="about.html">About us</a></li>
<li><a href="contact.html">Contact us</a></li>
...
</ul>
</nav>
This approach is problematic for two reasons: It breaks the intended relationship between ARIA and HTML, and it offers a poor experience for screen reader users.
Here's how the ARIA specification defines the navigation landmark role:
A collection of navigational elements (usually links) for navigating the document or related documents.
At first glance this seems to suggest that the navigation role can be applied to the ul element, because it's a collection of navigational links. It actually causes a conflict though. The ul element already has an ARIA role of "list", and it's treated as a list by other accessibility APIs.
Here's how the HTML5 specification defines the nav element:
The nav element represents a section of a page that links to other pages or to parts within the page: a section with navigation links.
The nav element was purpose built to contain a number of navigational elements. This makes it a much closer fit for the ARIA navigation role than the ul element. In fact the ARIA navigation role maps directly to the HTML5 nav element.
When the navigation role is applied to the ul element it creates extra verbosity for screen readers that support both ARIA and HTML. For example NVDA announces "Navigation landmark" when it encounters the start of the nav, then again when it encounters the navigation role. Jaws does the same thing with a slightly different announcement ("Navigation region"), and it also announces the end of each region. It announces "Navigation region end" when it encounters the closing </ul> tag, then again when it encounters the closing </nav> tag.
Things are further complicated by the conflicting roles: the native list role and the applied navigation role of the ul. NVDA announces "Navigation landmark, List of 3 items". It does this in Firefox because the accessibility API concatenates the two roles, but in Internet Explorer it has to go into the DOM to create the same effect. Jaws appears to ignore the accessibility APIs entirely. In Firefox it fails to announce the list, although it does preface each list item with "Bullet". In Internet Explorer it does neither, effectively ignoring the list semantics altogether.
The upshot is that the navigation role should be applied to the nav element. This represents the relationship between ARIA and HTML5 correctly, prevents the loss of the list semantics, and reduces screen reader verbosity to a manageable level.
<nav role="navigation">
<ul>
<li><a href="home.html">Home</a></li>
<li><a href="about.html">About us</a></li>
<li><a href="contact.html">Contact us</a></li>
...
</ul>
</nav>
This recipe is one I've been using for many years. I developed it when I was vegetarian in the 80s and it was inspired by the recipe book I bought from the veggie restaurant at Sussex University. I used to love spaghetti bolognese and so it made a good substitute. We still cook it probably once every ten days and I love the way the aubergine combines with the tomatoes and the onion when it's cooked. The Marmite adds lower notes to the overall flavour.
Occasionally, if I have more time I will use the aubergine sauce to create a superb vegetarian lasagne. For that you'll need to layer some lasagne sheets with the cooked sauce in an oven-proof dish. Then make a white/cheese sauce and pour over the top – with a bit of grated cheese. Bake in the oven at 200C for 45-50 mins until the lasagne is soft. Serve with a tomato and lettuce side salad.
With all the strong flavours about, the dish suits a red wine – maybe an oaky Rioja or Navarra.
The first problem we need to solve: It must be possible to provide a detailed description for complex images.
A complex image might be something like the blueprint for a building, a UML diagram, a renaissance oil painting or a movie poster. A person who is unable to see the image would use a detailed description to understand and enjoy its content. A person who struggles to understand the image would use the detailed description to interpret and appreciate the image.
The second problem we need to solve: It must be possible to discover and access the detailed description.
Complex images might be used to educate, inform or entertain. Whatever the circumstances, there must be a way for people to find out that a detailed description is available, and a way for them to access it. This means the call to action must be visible (for sighted people), and programmatic (for non-sighted people).
The third problem we need to solve: It must be possible for a detailed description to include structured content.
A detailed description of a product in an online store might only need text. A detailed description for something like a graph or chart is likely to need more structure, so a detailed description must be able to include headings, lists, data tables and so forth.
In short we need a mechanism for providing detailed descriptions for complex images. The detailed description must be discoverable by anyone, and have content that is appropriate to the image it corresponds to.
The pros and cons of longdesc have been discussed ad nauseam (let’s not do it all again). For the moment a flawed solution is better than no solution at all, but now seems like a good time to re-energise the search for a better solution. Look at it this way: If longdesc had never existed, how would we solve these problems today?
The header, footer, nav, article, section and aside elements give different regions of a document meaning. Amongst other things, that meaning can be understood by screen readers, and the information used to help blind and partially sighted people orient themselves and navigate around the page.
Now that HTML5 is being used on mainstream websites (such as that of the UK Government), it’s interesting to see how well screen readers support this handful of elements. Testing included Jaws 14, NVDA 2012.3 and Window Eyes 8, with Firefox 18 and Internet Explorer 9, and VoiceOver (Mac OSX 10.8.2) with Safari 6.
Thanks to Alastair Campbell and Kiran Kaja for VoiceOver testing, and Steve Faulkner for the mighty useful HTML5 Accessibility test page.
Screen reader support for HTML5 sections
Element | Jaws 14/FF 18 | Jaws 14/IE 9 | NVDA 2012.3/FF 18 | NVDA 2012.3/IE 9 | VoiceOver/Safari 6 | Window Eyes 8/FF 18 | Window Eyes 8/IE 9 |
---|---|---|---|---|---|---|---|
article | Yes | Yes | No | No | No | No | No |
aside | Yes | Yes | Yes | No | Yes | No | No |
footer | Yes | No | Yes | No | Yes | No | No |
header | No | No | No | No | Yes | No | No |
nav | Yes | Yes | No | No | Yes | No | No |
section | Yes | Yes | No | No | No | No | No |
Of the six HTML5 elements tested, header is the least well supported.
Only Window Eyes does not support any of the HTML5 elements (in any browser). VoiceOver, Jaws and NVDA all support some elements, but not always consistently across browsers.
NVDA reports the start of supported elements. For example the aside element is announced as "Complementary information landmark". Jaws and VoiceOver report both the start and end of supported elements. For example Jaws announces "Article" and "Article end" for the article element, and VoiceOver uses the format "Entering article landmark" and "Leaving article landmark".
One difference between the screen readers that support the footer element is that VoiceOver reports it as "footer", whilst NVDA and Jaws use "Content information" and "Content info" respectively.
Jaws, VoiceOver and NVDA include supported elements when navigating by landmark. This makes sense because the header, footer, nav, article, section and aside elements all map to ARIA landmarks. Worth noting that all screen readers tested have good support for ARIA landmarks.
Those elements that are not supported in a particular screen reader/browser combination are handled in the same way as a div or other element without semantic meaning would be. Which is to say that the content is accessible, but the element itself is not acknowledged in any way.
The challenge with this recipe is making the Hollandaise sauce and poaching the eggs, so they're both ready at the same time. One approach is to cook with a friend and split the tasks between you, another is to buy in some ready-made Hollandaise sauce from the grocery store.
Hollandaise sauce has a bit of a reputation for being difficult to make. It isn't, but you have to take care not to overheat it (or the sauce curdles). Using a bowl over simmering water (known as a bain-marie) is the secret, because it stops the sauce coming into contact with direct heat.
Jaws scripts no longer available. A shortcut ("q") for moving directly to the main element is now available in Jaws as standard.
The main element extension specifies a way to mark up the primary content area of a web page in HTML5. There are several good reasons for introducing the main element, including a more reliable way for screen readers to pinpoint the start of the primary content area on the page.
ARIA roles can be applied to HTML elements to give them semantic meaning. For example a div with role="navigation" is announced by Jaws as "Navigation region".
The advantage of ARIA roles is that screen readers can use them to navigate through a page. Use the semicolon key in Jaws, the d key in NVDA, or the rotor with VoiceOver to move between these landmarks.
The disadvantage is that developers must consciously include ARIA roles in their HTML code. The advent of HTML5 elements with stronger native semantics means that (in time) this will no longer be necessary (with HTML at least). For example the HTML5 nav element already causes Jaws to announce "Navigation region", whether role="navigation" is present or not.
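To illustrate the difference (the link target is illustrative), the HTML4-style approach needs the role added by hand, whereas the HTML5 nav element carries the semantics natively:
<div role="navigation">
<ul>
<li><a href="home.html">Home</a></li>
</ul>
</div>
<nav>
<ul>
<li><a href="home.html">Home</a></li>
</ul>
</nav>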
Until the main element extension was proposed by Steve Faulkner, there was no direct mapping between role="main" and an HTML5 element. This means that a screen reader mechanism for moving focus to the primary content area would be entirely dependent on role="main" being added to the HTML.
One of the reasons for introducing the main element is that it's easier for developers to code and style <main></main> than it is to use <div id="main" role="main"></div>.
On the basis that the main element is more convenient for developers (and is therefore more likely to be present), it's much more reliable for screen readers to hook into as well. It then becomes feasible for screen readers to introduce commands for moving straight to the primary content area of the page.
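A minimal sketch of the markup, with role="main" included as a belt and braces measure until screen reader support for the main element itself arrives (the surrounding structure is illustrative):
<body>
<header>…</header>
<main role="main">
<h1>Page title</h1>
<p>Primary content…</p>
</main>
<footer>…</footer>
</body>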
These (proof of concept) Jaws scripts for Firefox demonstrate how this interaction might work. When installed they introduce a new layered keystroke: Jaws key + Space, then j.
Invoking the command causes Jaws to look for the main element with role="main" applied. The main element doesn't exist yet, so Jaws doesn't recognise it. The script cheats by using the MoveToTagWithAttribute function instead.
If Jaws finds that combination of element and role, it jumps focus to that point and announces "Main content area" (or "Start of main content area" if beginner verbosity is enabled). Otherwise it announces "No main content area" (or "There is no main content area on this page").
There is a test page included in the zip file that uses <main role="main">, but there has been at least one sighting of <main> in the wild already.
Quick recap: Screen readers rely almost entirely on the HTML of the page, and focus is moved from one element to another using a range of navigation commands. In other words, a screen reader can only focus on one element at a time.
That's where the trouble starts. If a screen reader is focused on a form field, it can't be focused on the bit of the page being updated as well.
The Marks & Spencer website is a good example. When an item is added to the shopping basket, the basket summary at the top right of the page gets updated. In fact it's the only confirmation that the item has been successfully added.
The update is easy to see at a glance, but not at all easy with a screen reader. First you have to discover that something has changed, then you have to find out where it happened. Even once you know this, you still need to move focus back and forth between the summary and the product information, every time you add an item to the basket.
A massively simplified version of this interaction might look like this:
<!DOCTYPE html>
<html lang="en">
<head>
<title>Tequila</title>
<script>
var items = 0;
function updateItems () {
items = items + 1;
document.getElementById("quantity").innerHTML=items;
}
</script>
</head>
<body>
<h1>Tequila</h1>
<p>Tequila makes me happy...</p>
<p><button onclick="updateItems()">Add tequila to basket</button></p>
<h2>Basket summary</h2>
<div>
<p>Your basket contains <span id="quantity">0</span> items.</p>
</div>
</body>
</html>
When the button is activated with a screen reader, nothing appears to happen. The page doesn't reload, so focus remains on the button and the screen reader stays silent. The basket summary is updated, but the screen reader user remains oblivious.
The aria-live attribute can be used to turn the basket summary into an ARIA live region. ARIA enabled screen readers can monitor ARIA live regions for changes, and automatically announce them as they happen. The monitoring is done in the background, so even if the screen reader is focused somewhere else on the page at the time, changes within the live region are still announced.
Adding the aria-live attribute to the basket summary:
<h2>Basket summary</h2>
<div aria-live="assertive">
<p>Your basket contains <span id="quantity">0</span> items.</p>
</div>
The aria-live attribute takes three possible values: off (default), polite and assertive. The polite value means that the screen reader will wait until it has finished its current task before announcing the update, and the assertive value means the update is announced as soon as the screen reader encounters a slight pause (in reality it's almost always immediate).
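If an update is not urgent enough to interrupt whatever the screen reader is doing, the polite value can be used instead; the markup is otherwise identical to the example above:
<h2>Basket summary</h2>
<div aria-live="polite">
<p>Your basket contains <span id="quantity">0</span> items.</p>
</div>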
The aria-atomic attribute defines whether all of the content in the ARIA live region should be announced, or only the part that's changed.
Adding the aria-atomic attribute to the basket summary:
<h2>Basket summary</h2>
<div aria-live="assertive" aria-atomic="true">
<p>Your basket contains <span id="quantity">0</span> items.</p>
</div>
The aria-atomic attribute has two possible values: true and false (default). Using aria-atomic="true" means that all of the content within the ARIA live region will be announced, even though only a small part of it has changed. So screen readers will announce something like "Your basket contains 3 items", instead of just "3".
The All Star route deviation calculator is one of the best examples of this technique in the wild. Developed by Matt Lawson of Nomensa, the form is used to calculate cost savings based on reduced mileage.
As you manipulate information within the form, your potential cost saving is dynamically updated on the page. Because the updated information is an ARIA live region, using the form with a screen reader couldn't be easier.
You can use a teaspoon of horseradish sauce or English mustard, instead of the Worcestershire sauce. The potato wedges can be dusted with paprika, or you can use a packet of potato seasoning for a spicy flavour.
You can use almost any kind of cheese. Cheddar is a traditional option, but Port Salut, Monterey Jack or Dolcelatte all make tasty alternatives. If that isn't enough, you can even grill a couple of slices of bacon to add to the cheeseburgers.
Happily you can start making your HTML5 more inclusive with a minimum of effort. One simple way is to make your forms easier for people to complete by using the HTML5 placeholder attribute.
The HTML5 placeholder attribute lets you provide a short hint that helps people enter the right information into a form field. The hint might be an example of the required information, or a description of the expected format.
The placeholder hint or description is displayed inside the empty form field, and is removed as soon as the field is given focus. In other words, the HTML5 placeholder attribute replaces one of the most common uses of JavaScript on the web.
You can apply the placeholder attribute to input fields like text, email, tel, search or url. It can also be applied to a textarea.
<label for="email">Email address:
<input type="email" id="email" name="email" placeholder="you@example.com"/></label>
The placeholder attribute is not a replacement for a label. It's tempting to think that it could be, especially when screen real estate is at a premium and space for a visible label is limited.
The trouble is that once the field is given focus and the placeholder text disappears, there's nothing left to explain what kind of information should be entered. That makes things awkward for most of us, but it's particularly challenging for people with memory difficulties because there's no text label to remind them what information is expected.
It's also inconvenient for keyboard users. Instead of tabbing onto a field and reading the label, they need to look ahead to read the placeholder before moving focus to the field itself. Using a placeholder without a label makes the task of completing a form counterintuitive and much harder work.
By default the placeholder text is displayed in light grey. This is to create the impression that the text is temporary and not something that has been entered by the person completing the form. It also makes it difficult to read.
It's a good idea to style the placeholder text to give it greater visibility. Choosing a foreground colour that meets the 4.5:1 contrast ratio recommended by WCAG 2.0 SC 1.4.3 is a good way to do this.
The drawback is that styling for the placeholder attribute is not well supported yet. It's necessary to use vendor specific prefixes, and even then support is limited.
input::-webkit-input-placeholder {color: #912cee;}
input:-moz-placeholder {color: #912cee;}
The HTML5 placeholder attribute is supported in recent versions of Chrome, Firefox, Opera and Safari/Webkit. It isn't currently supported in Internet Explorer, but that should change when IE10 is released later this year.
Until the web sheds itself of all those pesky versions of Internet Explorer, it's a good idea to use a script such as jQuery.placeholder to handle support for legacy browsers. With that in mind though, you can go right ahead and use the HTML5 placeholder attribute to make your forms easier for everyone to complete.
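A rough sketch of how that fallback is usually wired up (the script paths are illustrative, and the exact plugin call may differ, so check the plugin documentation):
<script src="jquery.js"></script>
<script src="jquery.placeholder.js"></script>
<script>
// Only invoke the plugin when the browser lacks native placeholder support
if (!('placeholder' in document.createElement('input'))) {
$('input[placeholder], textarea[placeholder]').placeholder();
}
</script>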
The challenge with using any new screen reader is learning the keyboard commands to operate it. Narrator introduces a whole new set of keyboard commands, quite unlike those used by other screen readers.
Like other screen readers, Narrator uses a modifier key. This tells the computer that the keyboard command is meant for the screen reader, and not any other application you might be using. Under Windows 8, the modifier key for Narrator is the Caps lock key.
A few important commands you'll need to know, no matter what you're doing:
Knowing where to get help is always a good plan. You can pull up lists of Narrator commands wherever you are:
Most Narrator options are set in the Ease of Access Centre. You can also change a few settings on the fly:
There are different ways you can use Narrator to move around the screen and discover what's there:
Narrator has several commands for reading and exploring text:
Narrator also includes a set of commands for exploring and reading content displayed in tables:
Narrator has a few commands for quickly moving around web pages:
You can manipulate the Narrator cursor in relation to the mouse or keyboard cursor:
As well as the Narrator specific commands, there are many different Windows 8 shortcut keys that make it possible to move around Windows 8 much more easily.
Windows 8 doesn't have a start menu, at least not in the way we're used to. Instead it has the Metro start screen. It displays your apps in a rough grid pattern that takes up most of the screen.
When you return to the Metro start screen you can browse through your apps, or begin typing to search for a specific app.
The charm bar slides out from the right of the screen. It lets you access the following features:
Instead of going through the charm bar, you can open the settings panel directly to adjust the settings for the app you're using. You'll find the shutdown options here, as well as settings for things like volume, brightness and network connectivity.
You can also open the other panels in the charm bar directly:
The classic Windows desktop, with the Recycle bin and other familiar features, is still there in Windows 8. It's just tucked out of the way a bit more.
There are a couple of other keyboard shortcuts for classic Windows features:
The Ease of Access Center is still part of Windows 8. It contains the Magnifier, Narrator, and on screen keyboard applications, plus settings for making Windows easier to see, usable without a mouse or without sound. You can also start Narrator directly with Windows key + Enter.
There are many more Windows 8 keyboard shortcuts available. Download a spreadsheet of 30 Windows 8 shortcut keys (XLS, 26Kb), including those featured in this post.
If you haven't come across ARIA landmark roles before, you might find the following screen reader demo and articles worth a look:
Let’s take a typical situation. A web page that has two navigation blocks. You apply the appropriate ARIA landmark role to each containing element. The code might look something like this:
<nav role="navigation">
<p>Choose an aisle to browse:</p>
<ul>
<li><a href="fresh.html">Fresh foods</a></li>
<li><a href="dairy.html">Milk and dairy</a></li>
…
</ul>
</nav>
<nav role="navigation">
<p>Choose a shelf to browse:</p>
<ul>
<li><a href="milk.html">Milk</a></li>
<li><a href="butter.html">Butter</a></li>
<li><a href="eggs.html">Eggs</a></li>
…
</ul>
</nav>
By applying the landmark role you've made it easy for screen reader users to move from one landmark to the next. Screen readers like NVDA, Jaws and VoiceOver include shortcuts for moving quickly between the landmarks on a page. In this case (depending on the screen reader) you'll hear something like "Navigation region start" or "Navigation landmark".
With just a little more ARIA you can make even more information available. With the above example there isn’t any way to tell what differentiates the two navigation blocks. Screen readers will report them both in exactly the same way, until you drill down into the content itself of course.
Using aria-labelledby you can supplement the information available to screen readers. This creates an association between the nav element (or div if you're using HTML4) and a piece of text elsewhere on the page. The effect is that screen readers will now announce "Aisle navigation region start" or "Shelf navigation region start", or whatever the screen reader specific variation might be.
<nav role="navigation" aria-labelledby="firstLabel">
<p>Choose an <span id="firstLabel">aisle</span> to browse:</p>
<ul>
<li><a href="fresh.html">Fresh foods</a></li>
<li><a href="dairy.html">Milk and dairy</a></li>
…
</ul>
</nav>
<nav role="navigation" aria-labelledby="secondLabel">
<p>Choose a <span id="secondLabel">shelf</span> to browse:</p>
<ul>
<li><a href="milk.html">Milk</a></li>
<li><a href="butter.html">Butter</a></li>
<li><a href="eggs.html">Eggs</a></li>
…
</ul>
</nav>
This technique makes it a lot easier for blind and partially sighted people (with ARIA enabled screen readers) to do something very like visually scanning a page.
When it comes to tabbed interfaces on web pages, there are two disadvantages for screen reader users. The visual clues that establish the tab metaphor are not available, and the required form of interaction isn't obvious. This makes understanding and using a tabbed interface with a screen reader an interesting experience!
The first point is straightforward. The visual design helps a sighted person to understand the concept of layered content. The use of light and shadow gives a sense of depth, and the tabs are often styled to mimic their real life counterparts.
This problem can be solved with ARIA. The tablist, tabpanel and tab roles make additional information available to screen readers like NVDA and Jaws. A tablist is a set of tabs, and each tab has a corresponding tabpanel.
<ul role="tablist">
<li role="tab">Dogs</li>
<li role="tab">Cats</li>
<li role="tab">Sheep</li>
</ul>
<div role="tabpanel">
<h2>Dogs</h2>
…
</div>
<div role="tabpanel">
<h2>Cats</h2>
…
</div>
<div role="tabpanel">
<h2>Sheep</h2>
…
</div>
ARIA bridges the visual gap, and helps blind people understand a little more about the widget they’re dealing with. Instead of announcing "Dogs", "Cats" and "Sheep", screen readers like NVDA and Jaws will announce "Dogs tab", "Cats tab" and "Sheep tab".
The second point is more complex. To some extent the additional information provided through the ARIA indicates what form of interaction is required.
It’s standard screen reader behaviour to provide auditory clues that let people know what kind of action can be taken. Prefacing link text with the word "Link" sets up the expectation that pressing the Enter key will activate it for example. In this case the word "Tab" sets up the expectation that the set of tabs can be traversed, usually left to right (or vice versa).
The trouble is that it isn’t the action taken once a tab has focus that’s important. It’s the action taken to focus on the tab in the first place that makes all the difference.
For a tabbed interface to work, screen readers like Jaws and NVDA need to switch into applications mode. In other words, they need to start passing the keystrokes through to the browser, instead of intercepting them to perform screen reader specific commands.
Ordinarily a screen reader intercepts the left/right arrow keys and uses them to move focus backwards/forwards one character at a time. To interact successfully with a set of tabs, the left/right arrow keys need to be ignored by the screen reader and used to move focus backwards/forwards through the set of tabs instead.
Triggering applications mode is the key to getting screen readers like NVDA and Jaws to work with tabbed interfaces. Using the tab key to move focus to one of the tabs is the key to triggering applications mode.
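On the authoring side, the pattern that makes this work usually pairs the ARIA roles above with a roving tabindex and a keydown handler, so that once focus lands on a tab the arrow keys (passed through in applications mode) move focus and selection along the tablist. A much simplified sketch (the id is illustrative, and switching the visible tabpanel is left out):
<ul role="tablist" id="animals">
<li role="tab" tabindex="0" aria-selected="true">Dogs</li>
<li role="tab" tabindex="-1" aria-selected="false">Cats</li>
<li role="tab" tabindex="-1" aria-selected="false">Sheep</li>
</ul>
<script>
// Left/right arrow keys move focus and selection along the tablist
var tablist = document.getElementById("animals");
tablist.addEventListener("keydown", function (event) {
if (event.keyCode !== 37 && event.keyCode !== 39) { return; }
var tabs = tablist.querySelectorAll("[role=tab]");
var index = Array.prototype.indexOf.call(tabs, document.activeElement);
var next = (event.keyCode === 39) ? index + 1 : index - 1;
if (index === -1 || next < 0 || next >= tabs.length) { return; }
document.activeElement.setAttribute("tabindex", "-1");
document.activeElement.setAttribute("aria-selected", "false");
tabs[next].setAttribute("tabindex", "0");
tabs[next].setAttribute("aria-selected", "true");
tabs[next].focus();
}, false);
</script>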
The demo here uses Hans Hillen’s accessible jQuery UI components page. It’s a terrific set of accessible widgets, and the code is available from Hans’ Github repository.
This demo was recorded using NVDA 2012.2 beta and Firefox 12.
To be honest, I don’t get the chance to cook very often. Not that I’m complaining: my wife, Jessica, loves to cook and is very, very good at it. But if Jessica is away, then I get to break out my chops. Literally.
My go-to meal for one is a pork chop recipe from TV chef Valentine Warner. He presented a programme called “What To Eat Now”. There was an accompanying book, which is very good, but for some reason, this recipe wasn’t included.
I know it *sounds* weird to have anchovies and capers with pork, but trust me: it is absolutely delicious. Serve it with some mashed potatoes and a vegetable of your choice—I’m quite partial to purple sprouting broccoli. Have a glass of hearty Shiraz (or maybe a Grenache-Shiraz mix). Enjoy!
This is a terrific recipe for feeding lots of people – particularly good as a rich, warming winter meal, but tasty in summer too! It’s a great way to use up left-over booze, or an excuse to open a new bottle.
This recipe is adapted from (I suppose I shouldn’t say steal, should I?) one by Antony Worrall Thompson, but the figures are approximate – feel free to experiment with quantities. It’s a long list of ingredients, but only the meat, veg, stock and red wine are absolutely essential...
It tastes particularly good the next day, or you could freeze the casserole, and save it for later. Sometimes I make some dumplings to go on top – add them 45 minutes before the end of cooking time, and turn the heat up to 160 C.
This casserole goes well with mashed potatoes and some kale or spring greens on the side; serve any leftovers with tagliatelle and some parsley chopped over the top.
It’s hard to pick a favorite, and admittedly, I’m a bit of a meat and potatoes outdoorsy type of guy, so here’s one of my favorite recipes, which incidentally contains both meat and potatoes and is typically eaten outdoors!
Windows 8 has several accessibility improvements. One of the most noticeable is Narrator, Windows’ built-in speech capability. Narrator has grown up in Windows 8, and now comes with a lot more to offer.
Narrator is much more configurable under Windows 8. Under Windows 7 you can choose Narrator’s voice, and alter its speed, pitch and volume. It’s rudimentary at best. Narrator under Windows 8 has many of the configuration options you’d expect to find with an old school desktop screen reader.
Like many other desktop screen readers, Narrator uses a modifier key. The Capslock key is used in combination with other keys to create a suite of Narrator specific keyboard commands.
Under “General” settings you can choose to permanently turn on the Narrator modifier key. This saves you from having to include the Capslock key before each Narrator command. You can toggle the modifier key lock on/off at any time using one of Narrator’s many new keyboard commands (Capslock z).
You can configure Narrator to start at logon, and to start minimised. There is an option to turn keyboard keystroke echo on/off, and you can also toggle this setting on the fly (Capslock f12).
There is an option to enable visual highlighting of the Narrator cursor, and this is on by default. Particularly helpful if you’re working on something alongside sighted people. The final setting under “General” lets you choose how long Narrator notifications are retained for reading.
You can choose to enable reading and interaction with a mouse, useful if you have enough vision. If your device supports a touch interface, another option appears that lets you make the same choice about reading and interaction via touch.
Narrator is a lot more sophisticated when it comes to cursor modes. By default the Narrator cursor follows the keyboard cursor, a setting you can set within the “Navigation” settings. You can also route the Narrator cursor to the keyboard cursor (Capslock g) or to the mouse pointer (Capslock t).
You can choose between two Narrator cursor interaction modes (normal and advanced). It isn’t entirely clear what the difference between the two is, so that’ll take a bit more investigation (and probably another blog post). The setting can be toggled on the fly as well though (Capslock a).
The settings here remain largely unchanged from Windows 7. You can select the voice that Narrator uses, then set the speed, pitch and volume it uses. It must be said that the voices available under Windows 8 are much better though.
On a desktop computer, Microsoft David is the default. It’s a pleasant American voice that’s easy to listen to. The desktop alternative for people in the UK is Hazel. It’s another voice that’s easy on the ear, this time with a British accent.
Narrator has 72 keyboard commands (as of the June 2012 release edition). When Narrator is running you can call up a list of available commands (Capslock f1). Within the “Commands” settings you can change the keystrokes required to invoke any particular command.
Under Windows 8, Narrator can be used to interact with content much more intuitively. For example it has commands for reading the current/next/previous paragraph, sentence, word or character. It can read the column or row header from a table, the entire row or column, as well as the cell content itself. In fact many of the basic commands you’d expect from any screen reader are available.
Narrator has clearly acquired many more features under Windows 8. Only time and use will tell whether the experience of using it has similarly improved. In the meantime it does seem as though Narrator might be on its way to being a screen reader, instead of a last resort when your primary screen reader dies.
A simple, fun, and tasty appetizer. My family has used this recipe for almost 30 years! Tip: Strong arms required for all the stirring. Alternatively, just take turns. Goes great with football games!
It was tough getting it down to one, but in the end I decided on Broccoli Pesto Pasta. I discovered it in BBC Good Food Vegetarian Christmas 2007 magazine, and thought it would be a good one to try. The original recipe was quite simple and tasty, but over the years I’ve changed things so that it hardly resembles the original.
I particularly like this recipe as it’s very healthy, lemony, and despite having had it regularly for the last 4 years, I still absolutely love it.
I recently began eating more raw foods and have fallen in love with Alissa Cohen’s Raw Food for Everyone. This recipe has been the biggest surprise. I had no idea how good it would taste. I like it better than regular pudding. So, for the surprise factor and the simplicity, I’ll call it my “favorite of the moment”.
This is a recipe from Thailand, the home of my lovely missus and where I met her when we were both teachers in Bangkok. She teaches Thai cookery for a living now.
I’ve had this recipe for years now: I don’t recall exactly where it came from, but I dug it out of one of my mother’s old recipe books back in the early ’90s. In those days I owned an old house in the country (just outside of Ottawa, Canada) where I had a silly-sized vegetable garden – it measured something like 12ft x 40ft and I grew so many tomatoes and hot peppers that I harvested them in a wheelbarrow (in fact, I ended up growing all of the ingredients for this recipe in my garden, including the garlic and herbs). Each Autumn I would settle into a production-mode weekend where I canned dozens of quart Mason jars of this salsa – enough to supply me for the full year and also give away to family, friends and neighbors.
While the fresh salsa is great straight out of the pot, I found that the canned salsa mellowed and melded the flavors, which I preferred. Enjoy!
The nav element is a big help to screen reader users, as well as a step forward in semantic meaning.
With HTML4, a typical navigation block might look like this:
<div>
<ul>
<li><a href="home.html">Home</a></li>
<li><a href="about.html">About us</a></li>
…
</ul>
</div>
Using a screen reader it would be possible to move through a page that uses this approach, and access the navigation without difficulty. One strategy might be to use your screen reader’s shortcut key for moving from one div on the page to the next.
The drawback is that to a screen reader, one div looks pretty much like another. There isn’t any semantic information that a screen reader can use to inform you about the purpose of the content you’re dealing with.
This is where the nav element comes in. Here’s what the HTML5 specification says about the nav element:
"The nav element represents a section of a page that links to other pages or to parts within the page: a section with navigation links."
So the same navigation block in HTML5 might look like this:
<nav>
<ul>
<li><a href="home.html">Home</a></li>
<li><a href="about.html">About us</a></li>
…
</ul>
</nav>
The nav element makes it possible for screen readers (and other user agents) to identify the purpose of the content. NVDA 2011.3 already does this. It won’t be long before other screen readers also support the nav element, but in the meantime you can use ARIA landmark roles.
Adding the ARIA navigation role to the nav element is a useful belt and braces technique.
<nav role="navigation">
<ul>
<li><a href="home.html">Home</a></li>
<li><a href="about.html">About us</a></li>
…
</ul>
</nav>
Using the nav element
Use the nav element wisely. The HTML5 specification isn’t too prescriptive, but instead offers the following guidance:
"Not all groups of links on a page need to be in a nav element – the element is primarily Intended for sections that consist of major navigation blocks."
Using the nav element too often will rapidly reduce the benefit to screen reader users. In the same way that one div looks like another to a screen reader, so does one nav element look like the next. Using the nav element to mark up just one or two key navigation blocks helps keep those sections semantically distinct from the rest of the page.
To highlight this, the HTML5 specification gives the following example:
"… it is common for footers to have a short list of links to various pages of a site, such as the terms of service, the home page, and a copyright page. The footer element alone is sufficient for such cases; while a nav element can be used in such cases, it is usually unnecessary."
Use the nav element to mark up the primary (site) navigation block. If a secondary (page) navigation block is needed, use the nav element here as well.
Otherwise, it’s your judgement call as to when you use the nav element. Just remember that screen reader users will benefit most when the nav element is used lightly, and that doing this won’t have a negative impact on other people either.
The caramelised onion chutney should be left for around four weeks before eating, to really bring out all the flavours.
Originally I recommended using a flat h1 heading structure. Four years later the HTML5 outline algorithm is still not supported in browsers, and without that an h1-only heading structure is meaningless. Better to use explicitly ranked HTML headings and take care to get the heading hierarchy right!
Updated: 2 October 2022
The HTML Standard has been updated to reflect reality. No browser supports the outline algorithm properly (some support it visually in the default browser styles, but none support it semantically), finally resolving issue #83.
HTML5 headings make it easy to syndicate and reuse content, without breaking the heading hierarchy. Using HTML5 headings and keeping the heading hierarchy backwards compatible proves unexpectedly complicated though. The HTML5 specification has a solution, but is it the right one?
HTML5 uses sectioning content to break a page up into chunks. The article, aside, nav and section elements can be used to explicitly create sections. Here’s what the HTML5 specification says about sectioning content:
Sectioning content is content that defines the scope of headings and footers… Each sectioning content element potentially has a heading and an outline.
This means that each section contains its own heading hierarchy. Here’s what the HTML5 specification says about it:
The first element of heading content in an element of sectioning content represents the heading for that section.
This means that you could use an h1 at the start of each sectioning element. For example:
<body>
<h1>Favourite colours</h1>
<section>
<h1>Purple</h1>
<p>Purple is my favourite colour.</p>
<aside>
<h1>Reasons</h1>
<p>I like purple because…</p>
</aside>
</section>
<section>
<h1>Pink</h1>
<p>Pink is my least favourite colour.</p>
<aside>
<h1>Reasons</h1>
<p>I dislike pink because…</p>
</aside>
</section>
</body>
HTML5 calculates the rank of each heading based on its location within the sectioning content elements. The above example would therefore have the following structure:
1. Favourite colours
 1.1 Purple
  1.1.1 Reasons
 1.2 Pink
  1.2.1 Reasons
If the same page were to be coded with semantically correct HTML4.01, it might look something like this:
<body>
<h1>Favourite colours</h1>
<div>
<h2>Purple</h2>
<p>Purple is my favourite colour.</p>
<div>
<h3>Reasons</h3>
<p>I like purple because…</p>
</div>
</div>
<div>
<h2>Pink</h2>
<p>Pink is my least favourite colour.</p>
<div>
<h3>Reasons</h3>
<p>I dislike pink because…</p>
</div>
</div>
</body>
The idea is that the HTML5 heading hierarchy is adaptable. You could pull out the section elements and slot them into another HTML5 website, without breaking the heading hierarchy.
Things get a touch more involved when you create a heading hierarchy within a single section. The HTML5 specification goes on to say:
The first element of heading content in an element of sectioning content represents the heading for that section. Subsequent headings of equal or higher rank start new or implied sections. Headings of lower rank start implied sub sections that are part of the previous one.
This means that if you use nested headings within the same section, HTML5 will behave as though they’re explicit sub sections. For example:
<body>
<h1>Favourite colours</h1>
<section>
<h1>Purple</h1>
<p>Purple is my favourite colour.</p>
<h2>Reasons</h2>
<p>I like purple because…</p>
</section>
</body>
In the above example the aside element has been removed, so that section is no longer explicitly defined. The “Reasons” heading is of lower rank than the “Purple” heading though, so HTML5 implicitly assumes there is a sub section there anyway. The outline would look like this:
1. Favourite colours
 1.1 Purple
  1.1.1 Reasons
If the “Reasons” heading had been an h1 instead, HTML5 would automatically close the explicitly defined section and start a new implied section at the same level:
<body>
<h1>Favourite colours</h1>
<section>
<h1>Purple</h1>
<p>Purple is my favourite colour.</p>
<h1>Reasons</h1>
<p>I like purple because…</p>
</section>
</body>
The resulting outline structure would therefore be:
1. Favourite colours
 1.1 Purple
 1.2 Reasons
If you wanted to conform to the HTML5 specification, and there was no need to provide backwards compatibility, the h1-only technique would be fine.
It’s going to be some while before we reach that utopia though. In the meantime, developers can use explicitly ranked headings to support the browsers and assistive technologies that haven’t implemented HTML5 headings at all. Here’s what the HTML5 specification has to say on the subject:
Sections may contain headings of any rank, but authors are strongly encouraged to either use only h1 elements, or to use elements of the appropriate rank for the section’s nesting level.
This means that the following HTML5 examples would both represent the same outline structure.
<body>
<h1>Favourite colours</h1>
<section>
<h1>Purple</h1>
<p>Purple is my favourite colour.</p>
<aside>
<h1>Reasons</h1>
<p>I like purple because…</p>
</aside>
</section>
</body>
<body>
<h1>Favourite colours</h1>
<section>
<h2>Purple</h2>
<p>Purple is my favourite colour.</p>
<aside>
<h3>Reasons</h3>
<p>I like purple because…</p>
</aside>
</section>
</body>
The outline structure represented by both of the above examples would be:
1. Favourite colours
 1.1 Purple
  1.1.1 Reasons
The Jaws screen reader briefly introduced support for the HTML5 outline algorithm in version 13. The implementation was broken however, and subsequently removed in Jaws 15. No browsers currently support the outline algorithm.
The question has been asked whether the h1-only or explicitly ranked headings technique is the best one to use.
Inspiration comes courtesy of an old school friend, founder and CEO (Chief Eternal Optimist) of Architecture for Humanity. His philosophy? Design like you give a damn!
It’s a philosophy to change the world by. It’s a mantra that is as relevant in the digital world as it is in the physical. It’s a clarion call to anyone who believes things need to change, and that together we can do it!
Design like you’re Banksy. Read the rule book, then rip it up and create something that challenges the accepted wisdom.
Design like you’re Da Vinci. Be curious about everything, and never stop questioning how you can make things better.
Design like you’re Brunel. Know that planning is important, but that imagination makes the extraordinary possible.
Be bold, be disruptive, be innovative, be inclusive. Be whatever you want to be. But whatever you do, let’s get out there and design like we give a damn!
When you’re browsing a website using its primary navigation, it’s helpful to know which page you’re on, and which pages you can go to. HTML5 makes it simple to do this, but the technique moves away from a common approach used in HTML4.01.
When you’re looking at a website’s navigation, the current page should be visually distinct from the other pages. When you’re using a screen reader, the separation needs to be programmatic rather than visual.
A common technique used with HTML4.01 is to use the strong element to mark up the current page:
<ul>
<li><a href="home.html">Home</a></li>
<li><strong>About us</strong></li>
<li><a href="contact.html">Contact us</a></li>
…
</ul>
This has the advantage of creating both the visual effect and the programmatic effect needed to differentiate the current page from the rest.
The HTML5 specification subtly changes the purpose of the strong element though. Instead of marking up content with strong emphasis, it should be used to mark up content of strong importance. More on this (and other similar changes) within the HTML5 spec can be found on the HTML5 Doctor website.
This change in the specification probably doesn’t make it wrong to use the above technique, since it could be argued that the current page is something of strong importance. It is one of those potentially murky areas though, and besides, the HTML5 spec recommends a much cleaner approach instead.
If the a element has an href attribute, then it represents a hyperlink (a hypertext anchor).
If the a element has no href attribute, then the element represents a placeholder for where a link might otherwise have been placed, if it had been relevant.
In other words you can strip out all the usual attributes from the a element, to turn it into a placeholder instead:
<ul>
<li><a href="home.html">Home</a></li>
<li><a>About us</a></li>
<li><a href="contact.html">Contact us</a></li>
…
</ul>
As far as the screen reader is concerned, the current page link is deactivated. This provides the separation needed to understand which page you’re on, and which pages you can go to. To create the same separation visually, you can apply a CSS class that gives the current page a different appearance from the other links within the navigation.
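For example, the placeholder link could be given a class and styled so that it stands out; a minimal sketch (the class name and colours are illustrative) might be:
<li><a class="current-page">About us</a></li>
With a rule along these lines in the stylesheet:
a.current-page {background-color: #205493; color: #ffffff; font-weight: bold;}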
How ARIA landmark roles help screen reader users
This video was first posted on Nomensa's Humanising Technology blog, on June 29th 2011.
So just what is a decorative image? It seems to me that one person’s eye candy is another person’s emotional link to a website.
For some, decorative images are those horizontal rules, bullets and other forms of minor clipart we find sprinkled around the web. For others, the term is wider ranging. It includes more content rich images such as photos and artwork.
So, you might ask, what’s the problem with this varied point of view? The answer is simple. Alt texts.
If you Google for the term “decorative image + alt texts”, you’ll come across countless sites that suggest that a decorative image be given a null alt text. It’s possible that we can all agree that for minor forms of clipart, a null alt text will do nicely. But it gets a little more difficult when it comes to more complex images.
I’m not talking about diagrams, blueprints or other information rich images. There’s no argument that they should always carry an alt text, possibly even a long description. I mean the vibrant, emotion rich images that provide a website with a sense of atmosphere.
It’s sometimes argued that providing such images with descriptive alt texts provides too much “noise” for a screen reader user. If we screen reader users stopped to listen to every alt text, every time we came across an image, then this assumption would probably be right. But I’ll let you into a secret. We won’t.
Like sighted users, we’ll skip around the content of the page until we find something that interests us. If the first few syllables of an alt text sound promising, we’ll pause to read. If they don’t, we’ll move on to the next element on the page. Also like sighted users, we’re often likely to pause on something unimportant, but which captures our imagination.
A good alt text can conjure up wonderfully stimulating mental images. A friendly smile is the same in print, photo or wax crayon. Whether you listen to an image or see it, the emotional response is the key factor, so why should we recommend that these emotion rich images should be given a null alt text and hidden from screen reader users?
Perhaps it’s time we introduced another group of images, emotion rich images, and encouraged the practice of providing descriptive alt texts for them. If people don’t want to listen to the alt text, they won’t. If people don’t want to pause and look at the image, they won’t. In either case, it’s good to have the choice.
Ladies and gentlemen of the class of '97:
Wear sunscreen.
If I could offer you only one tip for the future, sunscreen would be it. The long-term benefits of sunscreen have been proved by scientists, whereas the rest of my advice has no basis more reliable than my own meandering experience. I will dispense this advice now.
Enjoy the power and beauty of your youth. Oh, never mind. You will not understand the power and beauty of your youth until they've faded. But trust me, in 20 years, you'll look back at photos of yourself and recall in a way you can't grasp now how much possibility lay before you and how fabulous you really looked.
You are not as fat as you imagine.
Don't worry about the future. Or worry, but know that worrying is as effective as trying to solve an algebra equation by chewing bubble gum. The real troubles in your life are apt to be things that never crossed your worried mind, the kind that blindside you at 4 pm on some idle Tuesday.
Do one thing every day that scares you.
Sing.
Don't be reckless with other people's hearts. Don't put up with people who are reckless with yours.
Floss.
Don't waste your time on jealousy. Sometimes you're ahead, sometimes you're behind. The race is long and, in the end, it's only with yourself.
Remember compliments you receive. Forget the insults. If you succeed in doing this, tell me how.
Keep your old love letters. Throw away your old bank statements.
Stretch.
Don't feel guilty if you don't know what you want to do with your life. The most interesting people I know didn't know at 22 what they wanted to do with their lives. Some of the most interesting 40-year-olds I know still don't.
Get plenty of calcium. Be kind to your knees. You'll miss them when they're gone.
Maybe you'll marry, maybe you won't. Maybe you'll have children, maybe you won't. Maybe you'll divorce at 40, maybe you'll dance the funky chicken on your 75th wedding anniversary. Whatever you do, don't congratulate yourself too much, or berate yourself either. Your choices are half chance. So are everybody else's.
Enjoy your body. Use it every way you can. Don't be afraid of it or of what other people think of it. It's the greatest instrument you'll ever own.
Dance, even if you have nowhere to do it but your living room.
Read the directions, even if you don't follow them.
Do not read beauty magazines. They will only make you feel ugly.
Get to know your parents. You never know when they'll be gone for good. Be nice to your siblings. They're your best link to your past and the people most likely to stick with you in the future.
Understand that friends come and go, but with a precious few you should hold on. Work hard to bridge the gaps in geography and lifestyle, because the older you get, the more you need the people who knew you when you were young.
Live in New York City once, but leave before it makes you hard. Live in Northern California once, but leave before it makes you soft.
Travel.
Accept certain inalienable truths: Prices will rise. Politicians will philander. You, too, will get old. And when you do, you'll fantasize that when you were young, prices were reasonable, politicians were noble and children respected their elders.
Respect your elders.
Don't expect anyone else to support you. Maybe you have a trust fund. Maybe you'll have a wealthy spouse. But you never know when either one might run out.
Don't mess too much with your hair or by the time you're 40 it will look 85.
Be careful whose advice you buy, but be patient with those who supply it. Advice is a form of nostalgia. Dispensing it is a way of fishing the past from the disposal, wiping it off, painting over the ugly parts and recycling it for more than it's worth.
But trust me on the sunscreen.
A pre-conference workshop on the accessibility features of HTML5 and ARIA, plus other advanced techniques for AJAX driven web applications.
HTML5 allows web pages to behave like native applications. The HTML5 specification addresses many accessibility issues, but also introduces new challenges.
Browser and access technology support for HTML5 media elements, captions and audio description.
Understand how IBM used WAI-ARIA to create accessible forums, wikis, blogs, social networks and rich text editing.
Presentation source to be confirmed.
Case studies and examples demonstrating how categorising WCAG into role based groups can help increase engagement and improve overall accessibility.
Presentation source to be confirmed.
A comparison between Jaws, NVDA and VoiceOver, and their ability to handle web content and accessibility standards.
Case study illustrating the challenges of implementing accessibility within a Fortune 100 company, and the lessons learned.
For accessibility requirements to be successfully built into the web development lifecycle, responsibility must be divided amongst the various contributors to the project.
Best practice approaches taken by Yahoo! to increase the usability of accessible web pages.
Panel discussion with four accessibility experts working on the HTML5 specification.
Presentation source/notes to be confirmed.
A look at the work being done by AOL and TPG to address the accessibility shortcomings of the jQuery widget library.
https://access.aol.com/csun2011/
Learn about the features and technical differences between the two most popular platforms for creating rich internet applications, Flash and HTML5.
Presentation source to be confirmed.
Ways to achieve WCAG 2.0 conformance with the HTML5 media elements.
Presentation source to be confirmed.
Could there be a secure, accessible and inclusive alternative to CAPTCHA?
One of the five minute micro presentations from the CSUN 2011 Tweetup.
https://csuntweetup.com/
* Accessible Twitter: An update
A look at ways to approach enterprise level accessibility, including project management and procurement.
A look at the barcamp/unconference format, and the practical challenges of holding an event based on first hand experience.
A look at the common misconceptions about accessibility commonly felt by engineers working on web projects.
Case study from CA Technologies and TPG. Exploring the use of Section 508 compliant video as a solution for product documentation, and meeting the challenges of audio description and captioned content.
Perhaps the most evocative idea to come out of CSUN 2011 was “the tribe”. John Foliot seemed to capture the CSUN zeitgeist when he described the global accessibility community as a tribe, drawn together by a shared ambition to make the web a truly inclusive place.
The Tribe was a theme that ran through many conversations and into sessions like the Web accessibility game plan. The idea seemed to gather quiet momentum throughout the week, and John issued a challenge to everyone who wants to be part of it:
Get out there, create, innovate, captivate and get a little disruptive. We have each other’s backs, let’s see what we can achieve together!
NVDA (Non Visual Desktop Access) began as an open source project in the mid 2000s. VoiceOver had recently emerged as part of OS X Tiger on the Mac, but no viable free screen reader existed on the Windows platform.
In 2007 NV Access was set up as a non profit organisation to manage activities relating to the NVDA project. A grant from the Mozilla Foundation in early 2008 enabled James Teh to become a full time developer on the project, joining founder Michael Curran.
A grant from Microsoft later that year enabled NVDA's developers to work on Windows 7 UI automation support, and in 2009 Yahoo! stepped in with a grant that facilitated the introduction of several WAI-ARIA features.
The trouble is that now the money is running out. According to a message posted on the NVDA blog a couple of days ago, there is only enough money left to sustain the project until July 2011.
Their appeal is remarkably simple. Last year 50,000 people downloaded NVDA from their website. Even if they exclude 20,000 of those to allow for multiple downloads by the same people, that could still leave as many as 30,000 unique users. If each person contributed just AUS$10 (£6.20) that would give them AUS$300,000 to support the project.
Free access technologies are giving the old school screen readers a serious run for their money. NVDA is absolutely vital to this revolution, perhaps more so than any other screen reader. VoiceOver may be an excellent choice for anyone able to afford an Apple product, but with few available for less than £500 there is still a cost implication for many people. NVDA can be used on even the cheapest netbook, and can even be stored on a portable drive for use on a shared or borrowed computer.
Quite simply, NVDA's future lies in our hands. If you've ever used it to access a computer, if you've ever used it to test a website, spoken about it at a conference, recommended it to another person, or if you simply believe in the philosophy behind free software, the time is now. Please make a donation to the NVDA Project.
Papa Sangre is a video game with no video. It’s a first-person thriller, done entirely in audio by an award-winning team of game designers, musicians, sound designers and developers.
Created by Somethin’ Else with support from 4IP, Papa Sangre uses the first real time 3D audio engine on a hand held device. It’s a remarkably slick app, with added kudos for being perfectly accessible to blind/partially sighted gamers.
You’re in Papa Sangre’s palace. His palace is in an afterlife that takes the form of a malevolent, unpredictable carnival: imagine a Mexican graveyard on the Day of the Dead — with the lights off.
The gameplay is extremely simple. You hold your iOS device in landscape orientation and tap the bottom left/right corners to walk forward. The faster you tap, the more quickly you move forward. Don’t tap too fast though or you’ll fall flat on your face in the dark!
You change direction by swiping left/right across the top of the screen. As you move through the game world the 3D audio engine creates a vivid landscape around you. You’ll need a set of earphones though. Without them it’s impossible to react to the 3D soundscape and play the game successfully.
Papa Sangre works seamlessly with VoiceOver. You don’t need the screen reader to play the game itself, only to start the game and move between levels. Useful cues are given to let you know when to turn VoiceOver on/off, so it couldn’t be easier.
All you know is someone is in grave danger and desperately needs your help. Can you save them and make your escape, or will you be trapped in the blackness forever?
Papa Sangre’s palace consists of a series of levels. Throughout the game your companion is a “fluttery watery thing” who provides help and guidance at each level, but can you really trust her?
You must collect the musical notes, and decide whether to rescue the lost souls you find along the way. Danger lurks around every corner though!
You must sneak past the many delightfully dreadful creatures that inhabit Papa Sangre’s palace. Snuffle hounds, vicious hogs, giggling grim reapers and other demented creatures will attack if they’re alerted to your presence. One wrong step and you could stray off the path of safety, crunch the bones of other victims, or start cymbals crashing loud enough to wake the dead!
Get out. Save the one you love. Do the right thing.
The production quality of Papa Sangre is brilliant. The characters are elegantly simple, and the dialogue far better than you’ll find in many (revenue churning) online game worlds.
At £3.99 from the iTunes app store it’s worth absolutely every single penny. Papa Sangre is a captivating adventure in a child like world of macabre fantasy!
Possibly the biggest misuse of tables is using them to control page layout. Although tables were originally intended to mark up tabular data, the concept of a layout table is now generally acknowledged.
As explained in a Webaim article on creating accessible tables, screen reader users often struggle with tables. The mechanics of interacting with a table vary quite significantly between different screen readers.
When focus is on a table, Window Eyes has two specific commands for entering/exiting tables mode (Control +/-). Once in tables mode, you can use Insert + arrow keys to move up/down or left/right from one table cell to the next.
Jaws and NVDA have a slightly easier approach. Once a table has focus, the Control, Alt + arrow keys can be used to move through the table without entering into any kind of tables mode.
When it comes to reading a table summary, there is a rare moment of consistency. Jaws, NVDA and Window Eyes all read the summary just after the table itself has been announced. VoiceOver also speaks the summary, but only once focus has paused on the table for a few seconds first.
No, not some kind of tabular existentialism, but a reasonably serious question. What are the characteristics of a layout table? If a table has only one row/column, is it really a table at all? How does a screen reader identify a data table anyway?
Using a simple table test page and some help from Everett Zufelt, I put several popular screen readers through their paces in Firefox 3.6. The four screen readers were:
Each of the screen readers was tested using its default configuration. Jaws and Window Eyes both have advanced table customisation settings, but it’s questionable how often such settings are used. More on this later though.
All four screen readers agree that a single row/column table without a summary is a layout table. In other words, the table itself isn’t announced, and only the content is read.
When a summary is introduced into a single row/column table, there’s a definite split amongst the screen readers. Jaws and Window Eyes continue to treat it as a layout table, whilst NVDA and VoiceOver both announce that a table is present before reading the content.
It’s when dealing with multiple row/column tables that the differences really start to appear. NVDA for example isn’t able to handle row/column headers at all. It’s a known bug that’s been partially fixed in the forthcoming 2010.2 release. NVDA will then be able to handle row/column headers in Firefox, but not in Internet Explorer.
If a table has multiple rows/columns, Jaws treats it as a correctly marked up data table whether the row/column headers have been marked up or not. It automatically uses the upper row and left hand column cell content as header information.
The table settings in Jaws mean you can switch to a more standards compliant interpretation of the code. The default assumption however, is that the table will not be marked up correctly and so Jaws fills in the gap on behalf of the user.
Window Eyes takes the reverse approach. By default it ignores even correctly marked up row/column headers. You must first change the table settings in Window Eyes before any information is made available.
Of the four, VoiceOver pays the closest attention to the code. Where the table has multiple rows/columns and no header markup, VoiceOver announces the table is present and simply reads the content.
When row/column headers are introduced using the
With such a range of different behaviours, it’s enough to make you wonder whether it’s worth using/coding tables properly at all. The short answer is yes, it is.
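Part of coding tables properly means marking up the row and column headers explicitly. For reference, a correctly marked up data table (the data and summary text are illustrative) uses th elements with the scope attribute:
<table summary="Favourite colours, listed by person">
<tr>
<th scope="col">Person</th>
<th scope="col">Favourite colour</th>
</tr>
<tr>
<th scope="row">Alice</th>
<td>Purple</td>
</tr>
<tr>
<th scope="row">Ben</th>
<td>Pink</td>
</tr>
</table>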
It comes down to the theoretical contract between access technology vendors and web developers. If accessibility is really going to succeed, the contract to support web standards must be upheld by both parties.
Of the four screen readers covered here, VoiceOver probably comes the closest to supporting web standards when it comes to tables. Given its track record, there is every reason to think that NVDA 2010.2 will do the same when it’s released, albeit only in Firefox.
Jaws and Window Eyes are the real conundrum however. Both are quite capable of standards support, but neither do it out of the box. Jaws at least behaves this way on the assumption that data tables won’t be properly marked up, and it’s difficult to argue with that for the time being.
Right now, there is work that needs to be done on both sides as far as the web standards contract goes. There are still far too many badly marked up and confusing tables out there, and there is still much that screen readers could do to improve the way they interact with the good ones.
The signs are encouraging though. A few versions back, Jaws didn’t have an option to switch to using marked up headers at all. When the feature was first introduced in 6.01 it briefly became the default setting, and perhaps in time Jaws will revert to this behaviour.
NVDA and VoiceOver bring the greatest hope for a standards compliant web. Both were created in a time of standards awareness, and neither is held back by the millstone of legacy code and backwards compatibility.
The HTML 4.01 specification defines two attributes that can be used to prevent people from interacting with form fields: disabled and readonly.
Both attributes are boolean, meaning they can either be on or off.
The disabled attribute can be applied to button, input, optgroup, option, select and textarea form fields. For example:
<input type="text" disabled="disabled" />
When applied to a form field, the disabled attribute means that the field cannot receive focus, is skipped in the tab order, and its value is not submitted with the rest of the form.
The readonly attribute can be applied to input and textarea form fields. For example:
<input type="text" readonly="readonly" />
When applied to a form field, the readonly attribute means that the field's value cannot be modified, but the field can still receive focus, is included in the tab order, and its value is submitted with the rest of the form.
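To put the two side by side, here is a small sketch of my own (it isn't one of the test cases referred to below, and the field names, values and action URL are made up). The read only field can be tabbed to and its value is sent when the form is submitted, whilst the disabled field can be neither focused nor submitted.
<form action="/order" method="post">
  <label for="ref">Order reference (read only)</label>
  <input type="text" id="ref" name="ref" value="ABC123" readonly="readonly" />
  <label for="old">Previous reference (disabled)</label>
  <input type="text" id="old" name="old" value="XYZ789" disabled="disabled" />
  <input type="submit" value="Send order" />
</form>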
Using these disabled form fields and read only form fields test cases, I looked at the way three popular screen readers dealt with disabled and read only form fields. It wasn't meant to be an exhaustive investigation, but more a chance to get a flavour of screen reader support. The screen readers in question were Jaws, NVDA and Window Eyes.
To keep things even, I put each screen reader through its paces in two different browsers: Internet Explorer and Firefox.
The results were extremely varied, with little consistency across either screen readers or browsers.
All three screen readers correctly reported when a form field was disabled, except for Jaws in Firefox. For some reason, Jaws treats disabled textboxes and textareas as though they weren’t there in Firefox at all.
There are slight differences in the way each screen reader reports a disabled form field. Jaws and NVDA both indicate the field is “Unavailable”, whilst Window Eyes reports that the field is “Disabled”.
Things get a little more complicated with read only form fields. NVDA and Window Eyes treat read only textboxes and textareas as plain text, so don’t report them as read only fields. Jaws does treat them as form fields and reports them as “Read only”, with the exception of textareas in Firefox.
Read only radio buttons are recognised as form fields by all three screen readers, but none of them correctly report them as “Read only”. To add to the confusion, read only buttons could be selected using all three screen readers.
Jaws and NVDA both exclude disabled form fields from the tab order, whilst Window Eyes does not. Bear in mind that Jaws completely ignores disabled textboxes and textareas in Firefox, which isn’t quite the same as skipping them in the tab order.
All three screen readers include read only radio buttons in the page tab order. In a reversal of the way disabled form fields are treated, Jaws and NVDA both include read only textboxes and textareas in the tab order, whilst Window Eyes does not.
Generally speaking, NVDA treated disabled and read only form fields most consistently. It was also the most accurate in terms of following the HTML 4.01 specification.
Window Eyes was reasonably consistent across both browsers, but didn’t follow the specification for disabled and read only fields particularly well. Jaws on the other hand seemed a little more true to the specification, but gave the most inconsistent results overall.
If anyone would like to contribute results for other screen readers or can add more information to the above results, please drop me a line.
Contacting Organisations About Inaccessible Websites is a splendid new resource from the Web Accessibility Initiative (WAI).
People who are familiar with web accessibility will have no trouble contacting an organisation. They'll happily get up close and technical with anyone they can find.
If that isn't you though, don't worry. Contacting Organisations About Inaccessible Websites has everything you need. Here's what it has to say about why feedback is important:
Your feedback to an organization can help improve the accessibility of websites for you and many other people who use the websites. Website owners have many priorities for changes and improvements, and the more an organization hears about accessibility from people who use their website, the more likely it is that accessibility will become a higher priority.
With that in mind, the document goes on to give practical advice on a range of related topics.
If you're still feeling a little cautious, an email template is provided. Some example emails are also included, to give you an idea of how to go about things.
There's one further piece of advice in the document. Actually, I think it's the most important piece of advice of all, so I'm going to repeat it here:
Consider also contacting organizations that do a good job of making their websites accessible and easy to use by people with disabilities and older users, to acknowledge and encourage their efforts.
Sometimes, the accessibility of a website is due to the organisation's policy. More often, it's down to a small band of accessibility champions making the case as best they can. In either case, receiving positive feedback can only reinforce the message and encourage them to keep building in accessibility.
Magnifier is a simple screen magnification tool. It lets you increase on-screen content up to 16 times in size. The magnified area tracks the mouse or keyboard, and Magnifier now supports full screen, lens and docked modes.
Speech Recognition is a tool that lets you control your computer using your voice. An interactive tutorial guides you through the setup process. You can then dictate into almost any application, such as your email or Microsoft Word, and surf the web by “saying what you see”.
Narrator is a basic screen reading tool. It lets you hear on-screen content read aloud, including information about your desktop and error messages. You can also use it to access most common Microsoft applications, such as Windows Mail or Notepad.
On Screen Keyboard is a virtual alternative to a proper keyboard. It displays a full keyboard on-screen, which you can resize and customise. You can select keys using your mouse or keyboard, and predictive text is also available.
Windows Touch is an alternative to a keyboard or mouse. It works in conjunction with a touch-screen monitor and lets you control on-screen content by touch. You can scroll through information, play multi-media, resize windows and pan or zoom on-screen content.
Visual Notifications are on-screen alternatives to audio cues. Instead of playing standard system sounds such as the email notification, Windows 7 can give you a visible cue such as an on-screen flash instead.
Windows 7 includes a range of options that help you use a keyboard, such as Sticky Keys, Filter Keys and Mouse Keys.
Personalisation is a range of options that let you customise the way Windows looks and feels. Amongst other things, you can choose high or low contrast colour schemes, increase or decrease text size, and add or change audio sounds.
Windows 7 has plenty of accessibility features, aimed at helping people with a wide range of disabilities. In some cases, such as Speech Recognition and Narrator, they are simple tools in comparison to existing access technologies. In other cases, such as the On Screen Keyboard and Visual Notifications, they are robust solutions. In all cases, Microsoft has made a real effort to make Windows 7 as flexible and customisable as possible for people with disabilities.
As Jared Smith rightly points out, Google Wave is still in early preview release. It's still in development, with many sections still shown as "under construction". That said, the complete lack of accessibility features does not bode well for the future.
For a screen reader user, the application is impossible to use. The principal navigation links cannot be accessed. They appear as plain text on the page, and no amount of simulating mouse clicks with the screen reader does the trick.
There is no apparent structure to the page. No headings, lists or other useful markup to help make sense of the information. Actually, there is one heading. It's part of the noscript alternative for people without JavaScript enabled, but I'm not sure that counts in the scheme of things.
Visually, I'm told there are a number of panes on the screen. Semantically speaking, that information isn't available to a screen reader user. Other information is also unavailable. For example it doesn't seem possible to see people in the contacts pane, identify messages or read the content of an individual message.
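To give a rough idea of the kind of markup that would help (this is my own illustration, not anything taken from the Google Wave code), each pane could be given a heading and an ARIA landmark or region role so that screen readers can identify it and move to it.
<div role="navigation" aria-label="Main navigation">
  <h2>Navigation</h2>
  <a href="#inbox">Inbox</a>
</div>
<div role="region" aria-labelledby="contacts-heading">
  <h2 id="contacts-heading">Contacts</h2>
  <!-- contact list rendered here -->
</div>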
In short, it isn't possible to engage with Google Wave in a meaningful way if you're a screen reader user. The puzzling thing is why it's like this at all.
On their website, Google present 10 principles that contribute to a good user experience. Principle 6 includes the following commitment to accessibility:
Google is also committed to improving the accessibility of its products. Our desire for simple and inclusive products, and Google's mission to make the world's information universally accessible, demand products that support assistive technologies and provide a useful and enjoyable experience for everyone, including those with physical and cognitive limitations.
So there it is in black and white. A clear statement about Google's approach to accessibility. Surely we can rest easy knowing that when it comes to full release, Google Wave will have assistive technology support and offer everyone a useful and enjoyable user experience?
The trouble is that the evidence isn't there to back this up. Google Chrome, Google Docs and many other Google applications fail abysmally to honour this commitment. With web standards, and emerging technologies such as ARIA, it really doesn't have to be this way though.
Perhaps the preview of Google Wave has brought Google's commitment to accessibility into question again. Certainly a petition urging Google to become leaders in accessible website development is gaining support, and hopefully somewhere, Google are listening.
KNFB Reader is a portable OCR package that works with your mobile phone. It's very convenient, easy to use, and highly accurate.
KNFB Reader uses the built in camera of your mobile phone. You capture an image of the text and it translates it into plain text. You can use it to read envelopes, letters, articles, medicine packaging, menus and quite a bit more besides.
It has a field of view report, which takes a practice image. KNFB Reader then provides guidance on aligning the text correctly.
It will tell you whether each side of the document is visible, and how much of the available image is filled with text. All four sides visible and 70% text fill is about optimum.
When you're happy with the alignment, you capture the actual image. KNFB Reader then converts the image text into plain text.
It corrects problems such as the text being upside down, and alerts you if the text is cut off at any point.
Once the image has been fully processed, KNFB Reader begins reading the text automatically. You can listen to the text as it is read out, or you can navigate through it in chunks.
KNFB Reader has several simple text navigation keys that make this extremely easy.
You can save captured images and the corresponding text files with KNFB Reader. When you hook your mobile phone up to your computer, you can transfer the files across for more permanent storage.
For American citizens, KNFB Reader can be used to identify US paper currency. It also has a translation feature. Foreign language text captured in an image can be translated into English from around 16 different languages.
With some practice, KNFB Reader is very easy to use. Each function takes just a single key press, so there's no need to remember complex actions.
Capturing images is the tricky bit at first. KNFB Reader comes with clear instructions and a training document to practice with.
When experimenting with the training document, KNFB Reader provides additional information to help you get the hang of it. This includes your distance from the surface of the document, and guidance on whether to move left, right, backward or forwards for better alignment.
At the moment KNFB Reader is only available for three mobile phones: the Nokia N82, N86 and 6220 Classic. The software requires the Symbian Series 60 (3rd edition) operating system, so with luck more phones will be supported in the future.
It is also expensive. At £695 for the software, excluding your phone or mobile screen reader, the cost is breathtaking. Add a mobile phone (N86 with no sim card) and a mobile screen reader (Talks IMEI license) and you're looking at close to £1,200 for the package.
I've been using KNFB Reader for several days now, and I've found it tremendously liberating. Everywhere I've been, I've found more and more practical applications for using it.
It took me a long time to decide whether to buy the software. The cost is not to be taken lightly, and with no demonstration version available, I found it took quite a leap of faith. Positive feedback from many other users helped me choose.
It did take me several days of practice before I got the hang of capturing images successfully. Now that I have, I rarely use the field of view report and usually go for the live image capture straight away. It remains a useful feature though, particularly if my first attempt is wildly out of alignment.
I've also experienced some problems with light sensitivity. Other users assure me they can capture images in dimly lit conditions, but I don't seem to have that luxury. I'm hopeful I'll find a solution to this in due course. In the meantime, I can confirm that the camera shutter is indeed open...
Greg Fields from RIM kicked things off by presenting 10 guidelines for accessible mobile interface design. Several of the guidelines reflected best practice across all digital content types, for example colour contrast and logical grouping of information. Other guidelines focused on the importance of using native UI libraries and inheriting global settings to prevent conflicts with user defined settings.
Christian Heilmann from Yahoo! used Finite Incantatem, the charm from Harry Potter, to explain why he believes that accessibility and technology are not dark magic. Presenting a series of examples, he demonstrated the innovation that can happen when technology, accessibility and imagination come together.
Towards the end of his presentation, Christian spoke briefly about the need for passion and talent within the web community. Without the talent to innovate, and the passion to challenge established thinking, we cannot build a better web. Christian then asked the audience to get to their feet, grab their asses and swear to do everything in their power to make the web a better place.
Discussion focused on the challenges of building one web for everyone. Personalisation came through strongly as a potential way forward, with the thought that data from one original source could be tailored to suit an individual's requirements. Bandwidth concerns were raised when the suggestion of using CSS to control the delivery of specific content was made. It was also pointed out that many people use mobile sites out of preference on their desktops because of the perceived improvement in accessibility.
Read Henny's notes from the panel discussion: Thoughts around universal access on mobile.
Lisa Herrod from Scenario Seven talked about the history of sign language, tracing its roots back to a small village in Kent in the 1600s. With the migration of Deaf people to America during that time, sign language also migrated and began to evolve into the language of today.
Lisa also explained that people who sign as their primary language are a cultural group, hence the use of a capital D when referencing them. It's the same as referring to the English, or the Chinese. People who perhaps lose their hearing during their lifetime, and who do not sign as their primary language, tend not to consider themselves part of this cultural group. It's a clear example of the difference between the medical and social models of disability.
Steve Faulkner from The Paciello Group talked about two emerging standards from the W3C: HTML 5 and WAI ARIA. HTML 5 is the successor to HTML 4.01. It is still in development, but some browsers are already building in support for small parts of the specification. For the moment though, no access technologies support HTML 5.
Steve went on to talk about WAI ARIA, a standard that allows developers to build accessibility into rich internet applications for keyboard only and screen reader users. The specification hasn't been formally published yet, but support across both browsers and access technologies is already good.
Mark Boulton from Mark Boulton Design spoke about the designer's role within accessible web development. He stressed the need to introduce designers to accessibility appropriately, and not through lengthy reports and complex guidelines.
Mark explained why he feels accessibility has been marginalised, and disassociated from user experience design. Accessibility should not be left until last, or added as an afterthought. It should be considered as part of the core design process.
Saqib Shaikh from Microsoft echoed Mark's thoughts when he spoke about how often accessibility is considered an optional extra or something that can be left out if budgets are tight. Sometimes a more subtle approach works best though. Saqib spoke about his efforts to promote best practice accessibility by talking to his fellow developers, and encouraging them to build in accessibility as they went along.
Saqib went on to talk about Silverlight specifically, and some of the accessibility features introduced in version 3.0. I'm afraid I missed some of Saqib's presentation, as we were preparing for the panel session.
The session looked at how guidelines and standards fit into the process of developing web technologies. Testing with people with disabilities was also discussed, and the dangers of testing with expert users was highlighted.
When looking for people to test with, focus on the target audience and then find people with disabilities within that group. In other words, if your website is for children, find young people with disabilities to test with, as they fall into your intended target audience.
The panel were asked why devices designed for accessibility were so often not designed to be beautiful. Although this is all too often the case, a community of people celebrating modern design in assistive technology can be found at Enabled By Design.
Where do I begin? For me, the cultural insight into the Deaf community was really fascinating. I was already aware of the social distinction, but learning more about the historical development of sign language and the way that digital communication transformed social interaction for people with hearing impairments, really made a difference.
Another clear message was the need to celebrate innovation and creativity. There is so much happening with web, mobile and console technology, and so much of it is happening with people at heart. With a little imagination, a little collaboration and a little effort, the possibilities are limitless. Let's do it!
Cognition covers an incredibly wide range of conditions and requirements. It's one of the least well documented areas of web accessibility, and perhaps the least understood. The four presentations at Standards.Next provided a fantastic insight into some of the key issues.
Antonia Hyde kicked off with a powerful message: People are not defined by their disability. She went on to say that people with learning disabilities do use websites, and they often treasure the ability to use the web in their own homes, where there is shelter from the bustle and distraction of the outside world.
Antonia played two videos of a gentleman using the Ebay and Amazon websites, and highlighted some of the key issues he ran into along the way.
Antonia also explained that people don't often know how to use their browser settings. Giving people the ability to choose text size and alternative colour schemes through the website itself can help overcome this. Providing a range of pre-defined options can also help protect the design.
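As a very small sketch of the kind of in-page option being described (the class name, text size and button labels are my own, and a real implementation would also remember the visitor's choice between pages):
<style>
  body.large-text { font-size: 150%; }
</style>
<p>
  Text size:
  <button onclick="document.body.className='large-text';">Larger</button>
  <button onclick="document.body.className='';">Standard</button>
</p>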
Jamie Knight from JK3 explained that Autism isn't being stupid or being difficult. It's a different way of processing information and it affects people in different ways. It's just that some people are at a more noticeable end of the spectrum than others. Jamie went on to share some of his ideas about how information on the web should work.
Jamie also talked about the importance of literal meanings. With typical good humour, he told a story from his childhood, about the time he was told to go and wash his hands in the toilet. He did precisely that, washed his hands in the toilet bowl. Jamie really emphasised the importance of clear language and user friendly instructions.
David Owens from Transmedia Gateway spoke about his experiences of testing with people with Autism. His message was simple. If people can't use it, you're building it the wrong way. David went on to talk about some of the lessons he'd learnt along the way.
David also talked about the importance of sharing insights from usability testing. This really echoed the knowledge sharing and mutual learning message of the day.
Ian Pouncey from Yahoo! wrapped up the day by taking it back to basics, covering many of the standard concepts of web accessibility but with a focus on cognition.
Ian also explained that spelling mistakes can cause problems for people as they read the page. He suggested that using a screen reader was an ideal way to catch spelling mistakes before a page is published. It's certainly true that spelling mistakes are sometimes easier to pick up when listening, rather than reading.
Some key points really stood out for me during the day. For the first time I came down off the fence about style switchers. I'd never quite decided whether it was better to give people the ability to use their browser settings, or provide a widget on the page.
Now I've come to understand that some people will always find it difficult to go through their browser to make those choices.
Many of the ideas presented during the day were familiar: colour contrast, consistent design, good structure, literal meaning and so forth. What really struck me though was their importance in relation to cognition. We're accustomed to the blunt force benefits each of these concepts brings to people with other disabilities. For people with cognitive disabilities, the impact is more subtle, but no less dramatic.
Thanks to Henny Swan and Bruce Lawson for organising the day, and to Kath Moonan for finding a great pub to continue the discussions in!
Jaws 11 makes use of two ARIA properties intended to facilitate accessible drag and drop: aria-grabbed and aria-dropeffect.
The aria-grabbed property indicates whether an element has been selected or not. If an element has been grabbed, it can be dragged and dropped elsewhere on the page. If an element does not have the aria-grabbed attribute, it cannot be selected at all.
The aria-dropeffect property describes what happens when the grabbed element is dropped onto a target. Amongst other things, an element can be copied or moved onto a target. More than one drop effect can be defined for a given element.
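As a rough sketch of the markup involved (my own example rather than anything taken from the Jaws 11 documentation), a draggable item and a possible drop target might look something like this, with the page's script switching aria-grabbed to "true" when the user selects the item:
<ul>
  <li tabindex="0" aria-grabbed="false">Quarterly report</li>
</ul>
<ul aria-dropeffect="move">
  <!-- the grabbed item can be moved into this list by the page's script -->
</ul>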
Freedom Scientific include the following information in the Jaws 11 public beta announcement:
"The CTRL+EQUAL SIGN keystroke opens the ARIA Drag and Drop dialog box. It shows a list of droppable objects, as tagged by the author, on the current Web page. When you select one of these objects, JAWS will move focus to that object. If no droppable objects are available, JAWS will announce the message, “No droppable elements were found on the page” instead of opening the dialog box."
Jaws 11 public beta ARIA drag and drop support can be tested in either Internet Explorer 8 or Firefox 3.x.
With thanks to Gez Lemon.
Vinux 1.5 was the last stable release of the distribution, based on Ubuntu 8.10 (Intrepid Ibex). Vinux 2.0 beta is based on Debian 5.0.2 (Lenny). The decision to move to a different baseline distribution was taken in light of Vinux lead developer Tony Sayles' concerns over Ubuntu's ongoing stability.
Vinux uses the Gnome desktop, which may seem like an unusual choice for a distro aimed at blind and partially sighted people. Graphical User Interfaces (GUI) do tend to be easier to learn than the typed commands of a console interface though, and a GUI also offers greater parity for people working alongside sighted colleagues.
The Vinux 2.0 beta GUI interface has been customised for partially sighted people. It offers one click access to all commonly used applications, and has a layout particularly suitable for wide screen displays. The desktop is also keyboard accessible, with easy movement between different panels. If the old school command line is more your style though, Vinux switches easily between GUI and console interfaces.
Orca is the primary assistive technology for the Vinux distro. It provides access to applications such as the Gnome desktop, which support the Assistive Technology Service Provider Interface (AT-SPI). Orca provides screen reading, screen magnification and Braille support capability.
Speech synthesis is handled by the Speech Dispatcher interface, a device independent layer that sits between the application and the Text To Speech (TTS) engine. As well as Orca, Speakup is also installed, which provides screen reader access when using the command line. Both Orca and Speakup use the ESpeak TTS, which means there's a consistent audio quality across the GUI and console interfaces.
Braille support is provided by BRLTTY, a package that enables access to the console interface using a range of refreshable Braille displays. Screen magnification in Orca is provided by the Gnome-Mag magnification service. Vinux supports both 2D and 3D screen magnification, with 2D magnification available from bootup at a simple keystroke.
Vinux 1.5 includes both Open Office and Evolution, providing easy access to spreadsheet, word processing and email functionality. Unfortunately, with the move to a Debian based distro, space has become limited on the CD version of Vinux 2.0 beta.
Neither Open Office nor Evolution is included with Vinux 2.0 beta, although both applications can be downloaded later if you install Vinux on your hard drive. Icedove, Abiword and Gnumeric have been included instead, but they only work with screen magnification and not the screen reader.
Amongst other things, the ARIA roadmap defines live regions. Live regions are areas of a web page that allow updates to be announced, without the screen reader focusing on that part of the page.
The aim is to automatically provide screen reader users with appropriate information each time a live region of a page is updated. Live regions should help make many rich internet applications more accessible to screen reader users.
There are three different types of live region: polite, assertive and rude.
*The rude live region type has been removed from the ARIA specification.*
The following tests were carried out independently on different virtual machines, each running a variation of Windows XP. The test cases were sourced from Charles Chen's AJAX Accessibility simple test cases.
Live regions that are marked as polite should cause the screen reader to announce the update as soon as it's finished its current activity. For example, an update would be announced as soon as you finished reading the current line of text, or finished reading the page with a Say All command.
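For illustration, here is a minimal sketch of my own rather than one of the test cases linked below (which use the shorter live syntax from earlier ARIA drafts). A polite live region is created by adding aria-live to the element that receives the updates:
<div aria-live="polite">
  <!-- content injected into this element by script should be announced
       once the screen reader has finished its current activity -->
</div>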
Test case: live="polite"
Screen Reader | IE8 | FF3.x | Notes |
---|---|---|---|
Hal 11 | No | No | Updates are not announced automatically, and are not available when the live region is accessed |
Jaws 10.x | Yes | Yes | Updates are announced automatically at the end of the current activity, subsequent updates are announced until a further activity is invoked by the screen reader |
NVDA 0.6P3.2 | No | No | Updates are not announced automatically, although the information can be determined by accessing the live region itself in FF3.x. The information cannot be determined by accessing the live region in IE8 |
SA To Go | No | No | Updates are not announced automatically, although the information can be determined by accessing the live region itself in IE8. The information cannot be determined by accessing the live region in FF3.x |
Window Eyes 7.01 | No | No | Updates are not announced automatically, and are not available when the live region is accessed |
Live regions marked as assertive should cause the screen reader to announce the update as soon as the current activity is finished, or perhaps sooner depending on the current activity.
For example an update would be announced at the end of a short activity such as reading a line of text, but would interrupt a longer activity such as reading the page with a Say All command.
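Again as a sketch of my own, the markup differs only in the value of the attribute:
<div aria-live="assertive">
  <!-- updates here take priority, interrupting longer screen reader activities -->
</div>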
Test case: live="assertive"
Screen Reader | IE8 | FF3.x | Notes |
---|---|---|---|
Hal 11 | No | No | Updates are not announced automatically, and are not available when the live region is accessed |
Jaws 10.x | Yes | Yes | Updates are announced automatically at the end of the current activity, subsequent updates are announced until a further activity is invoked by the screen reader |
NVDA 0.6P3.2 | No | No | Updates are not announced automatically, although the information can be determined by accessing the live region itself in FF3.x. The information cannot be determined by accessing the live region in IE8 |
SA To Go | No | No | Updates are not announced automatically, although the information can be determined by accessing the live region itself in IE8. The information cannot be determined by accessing the live region in FF3.x |
Window Eyes 7.01 | No | No | Updates are not announced automatically, and are not available when the live region is accessed |
Live regions marked as rude should cause the screen reader to announce the update immediately. For example it would override any activity, including reading a line of text or reading a page with a Say All command.
Test case: live="rude"
Screen Reader | IE8 | FF3.x | Notes |
---|---|---|---|
Hal 11 | No | No | Updates are not announced automatically, and are not available when the live region is accessed |
Jaws 10.x | Yes | Yes | Updates are announced automatically at the end of the current activity, subsequent updates are announced until a further activity is invoked by the screen reader |
NVDA 0.6P3.2 | No | No | Updates are not announced automatically, although the information can be determined by accessing the live region itself in FF3.x. The live region itself cannot be accessed in IE8 |
SA To Go | No | No | Updates are not announced automatically, although the information can be determined by accessing the live region itself in IE8. The information cannot be determined by accessing the live region in FF3.x |
Window Eyes 7.01 | No | No | Updates are not announced automatically, and are not available when the live region is accessed |
Given proper support in the screen reader, the experience of using a web page with polite or assertive live regions is good. The balance between interacting with the page, and being kept up to date with the activity in the live region, works well.
The experience of using a page with rude live regions is quite different. It's disruptive, irritating and frustrating, even when the interval between updates is long. It's impossible to fluidly access any other content on the page, because the updates are relentless in their interruption.
Twitter is also proving popular with blind and partially sighted people. The Twitter website itself is reasonably accessible, and there are several accessible alternatives available to people as well.
Accessible Twitter is an alternative Twitter website. It has all the functionality of the standard Twitter site, but is much more user friendly to blind and partially sighted people.
When typing in a tweet, Accessible Twitter automatically warns you when you only have a certain number of characters left. The audio warnings alert you to the fact that either 30, 15 or 5 characters remain from your 140 character limit.
TwInbox, previously known as OutTwit, plugs Twitter directly into Outlook. It adds a toolbar to Outlook, and once configured gives you the ability to send, receive and manage tweets as easily as your email.
A simple keyboard command calls up a message box from anywhere in Outlook. You can create any kind of tweet, including @messages and direct messages. TwInbox automatically checks that your tweet is under the 140 character limit and lets you publish your tweet at the touch of a button.
Jawter is an extension to the Jaws screen reader. It brings all the standard functionality of Twitter to Jaws, without the need for an intermediate Twitter application to be open.
Jawter lets you change the way tweets are displayed. You can change the order that the sender's name, the tweet time and the tweet itself are displayed. You can also choose to hear tweet times expressed as "11.30am", or "15 minutes ago".
McTwit is a powerful Twitter application developed specifically for blind and partially sighted people. It has a simple interface, which uses standard Windows controls to give you access to all the standard Twitter features. It's also available in a command line version.
Brian Hartgen has also produced Jaws scripts for McTwit. Whilst they're not required, they enhance this flexible Twitter application for Jaws users.
There are simple keyboard commands for accessing your tweets. For example, Alt + n grabs the tweets from people you're following, Alt + R grabs all the @messages about you, Alt + r grabs all the direct messages sent to you, and Alt + p grabs all the tweets from the public timeline.
Qwitter is an invisible Twitter application that can be accessed from anywhere within Windows. Compatible with most popular screen readers, it provides simple access to most standard Twitter functionality.
Qwitter uses very simple commands to call up lists of tweets, @messages, direct messages and sent tweets. The commands are all easy to remember, and include useful options such as the ability to call up a list of links within a tweet. This makes it easy to choose a link and open it in the browser.
Tweets60 is a Twitter application for Series 60 mobile phones (3rd/5th editions). It is compatible with the Talks screen reader, and provides access to most standard Twitter functionality on the move.
Tweets60 lets you post tweets, manage who you're following, and keep up to date with automatic polling, all from your mobile phone. Despite all this functionality, it uses smart synchronisation technology to keep data usage to a minimum.
Bruce Lawson recently suggested that adding rel=accessibility to the HTML 5 specification was a good idea. Among the initial comments, Jared Smith noted that getting assistive technologies to support the idea might be a problem. For the major screen readers at least, I thought it could be easily solved though.
Bruce's idea went a little something like this:
So I want to float the idea of rel=accessibility that would allow assistive technologies to discover and offer shortcuts to accessibility information, such as a WCAG 2 conformance claim, or a form to request content in alternate formats (for example).
The immediate reaction to the idea was extremely positive, with only one or two people voicing a concern. From a screen reader user's perspective, the idea certainly has merit.
There are of course many different techniques available to a screen reader user looking for a particular link on a page. It's possible to tab through the links or open a dialogue box listing all the links on a page. It's possible to search the page for a certain key word or phrase. Both approaches will work well, but neither would be as reliable or consistent as the ability to move directly to a link relating to accessibility.
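For illustration, a link using the proposed value might look like this (the link destination is a placeholder of my own, since the attribute value is only a suggestion at this stage):
<a href="/accessibility" rel="accessibility">Accessibility information</a>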
The Jaws script I put together enables the user to move backwards and forwards between links with the rel="accessibility" attribute value. Where no such links are present, Jaws reports using the standard short or long message formats, depending on the user's verbosity settings.
Available keyboard commands are notoriously difficult to find in Jaws, particularly where quick navigation keys are concerned. In this case, 8/Shift 8 works well enough for the purpose. The keyboard command can be edited by the user through the quick navigation keys manager, if the chosen one proves unsuitable for any reason.
Appending the script to a quick navigation key made sense for two reasons. Firstly, because it's consistent with similar behaviour for moving through a web page using the virtual cursor. Secondly, because using the virtual.jss file means that the script is only effective when the virtual cursor is in use.
At present, the script has been tested with Jaws 10 on Internet Explorer 8 and Firefox 3.x. When I can find time, I'll run some tests on older versions of Jaws, with older browsers. None of the functions used require Jaws 10 specifically, so the script should be reasonably backwards compatible. Some slight changes may be needed to deal with different browser versions though.
Looking briefly to other popular screen readers, it should be straightforward to script Window Eyes to do something similar. It may also be possible with Hal, but my knowledge of that screen reader isn't up to scratch I'm afraid. SA To Go isn't scriptable, so we would need to wait for the developers to adopt it as a new feature. NVDA could be customised under the terms of the GNU General Public License of course.
Warning: Scripts no longer available.
Note: Screen reader users may wish to set punctuation levels to 'All'.
A punctuation mark is used to represent the eyes, another for the nose and another for the mouth. They are entered along the line of text in that order and the idea is for the viewer to tip their head over onto their left shoulder, so that the punctuation becomes a face.
For example, :-) becomes a face with eyes, nose and smiling mouth.
It has been argued that if proper expressive language were used, emoticons wouldn't be necessary, and in some media this is absolutely correct. But what opponents of emoticons don't take into account is that electronic communication is more akin to verbal communication than it is to written.
With written work there is time and room to describe and explain events or behaviour, but electronic communication is quick and dirty and doesn't permit time and space to offer in depth descriptions of very much at all.
History recalls that the first emoticon was posted to a Carnegie Mellon University bulletin board on September 19 1982, by Scott E. Fahlman.
I propose that the following character sequence for joke markers: :-) Read it sideways. Actually, it is probably more economical to mark things that are not jokes, given current trends. For this, use :-(
For Fahlman's own view of how it all began, take a look at his page on Smiley Lore.
In the nearly three decades since the first emoticon was suggested, their scale and complexity have grown to become something close to an art form. Applications such as Microsoft's Windows Messenger have the capability to turn the strings of punctuation into miniature graphics, but this does sort of miss the point of emoticons in the first place.
The tables below provide a longhand description and a verbal description for each of the most commonly used emoticons.
Despite the fantastic range of emoticons now in existence, only a few basic ones form the core of expressions used.
Table of Common Emoticons
Emoticon | Longhand Description | Verbal Description |
---|---|---|
:-) | Colon, Hyphen, Right Paren | Smile |
:-( | Colon, Hyphen, Left Paren | Sad Smile |
;-) | Semi Colon, Hyphen, Right Paren | Winking Smile |
:-> | Colon, Hyphen, Greater | Grin |
:-P | Colon, Hyphen, Capital P | Poke Tongue |
:-/ | Colon, Hyphen, Slash | Frown |
:-\| | Colon, Hyphen, Vertical Bar | Stern Face |
:-1 | Colon, Hyphen, Number 1 | Smirk |
:-D | Colon, Hyphen, Capital D | Big Smile |
Many people choose to use different punctuation marks to represent the nose. You may see the above emoticons used with a lower case o, as a nose, or you may find that people forget the nose entirely, leaving just the eyes and mouth instead.
The fun does not stop there. Emoticons have been created to represent a wide variety of expressions and even characterisations.
As communication becomes quicker and more minimalist, the need for greater expression in a smaller number of keystrokes has become widespread.
Table of Less Common Emoticons
Emoticon | Longhand Description | Verbal Description |
---|---|---|
%*} | Percent, Star, Right Brace | Very Drunk |
:-* | Colon, Hyphen, Star | Kiss |
':-) | Apostrophe, Colon, Hyphen, Right Paren | Sarcastically raised Eyebrow |
:-O | Colon, Hyphen, Capital O | Surprised |
:'-( | Colon, Apostrophe, Hyphen, Left Paren | Crying |
Emoticons are also used to describe the way a person looks, particularly their facial features.
Table of Descriptive Emoticons
Emoticon | Longhand Description | Verbal Description |
---|---|---|
:-{) | Colon, Hyphen, Left Brace, Right Paren | Wears a Moustache |
:-)> | Colon, Hyphen, Right Paren, Greater | Wears a Beard |
%-) | Percent, Hyphen, Right Paren | Wearing Spectacles |
&:-\| | Ampersand, Colon, Hyphen, Vertical Bar | Bad Hair Day |
(:-) | Left Paren, Colon, Hyphen, Right Paren | Bald Head |
It is even possible to portray different characters using emoticons. These are often used in chat rooms to express a person's character.
Table of Character Emoticons
Emoticon | Longhand Description | Verbal Description |
---|---|---|
O:-) | Capital O, Colon, Hyphen, Right Paren | Angel Face |
>:-> | Greater, Colon, Hyphen, Greater | Devil Face |
8=X | Number 8, Equals, Capital X | Skull and Cross Bones |
(P-\| | Left Paren, Capital P, Hyphen, Vertical Bar | Star Trek Borg |
:-[ | Colon, Hyphen, Left Bracket | Vampire |
<:+D | Less, Colon, Plus, Capital D | Clown |
When you first start using emoticons it's tempting to add one to every sentence, but as you become more accustomed to highlighting your messages with them you'll settle into a regular pattern.
A good rule of thumb is to use only one emoticon in any paragraph and no more than three or four in a message, depending on its length.
The mission statement for the W3C explains the organisation's aim:
The World Wide Web Consortium (W3C) develops interoperable technologies (specifications, guidelines, software, and tools) to lead the Web to its full potential.
The Web Content Accessibility Guidelines are recognised as the international standard for building accessible websites, and measuring web accessibility. They are part of a suite of guidelines that represent three different aspects of web accessibility: web content (WCAG), authoring tools (ATAG) and user agents (UAAG).
The Web Content Accessibility Guidelines are intended to cover accessibility issues relating to all kinds of disability. This includes visual, hearing, cognitive and physical conditions. The aim is to provide a unified approach to web accessibility, preventing a solution for one user group unintentionally becoming an obstacle for another.
WCAG 1.0 were released in 1999, heralding the start of a new era in web development. For nearly a decade they have provided the basis for web accessibility, guiding the development of new websites and the evaluation of existing ones.
WCAG 1.0 consists of 14 guidelines. Each guideline covers a basic theme of web accessibility.
Each guideline is broken down into a series of checkpoints. There are 65 checkpoints in total, and each one has a priority level assigned to it. WCAG 1.0 defines three priority levels: Priority 1 (checkpoints that must be satisfied), Priority 2 (checkpoints that should be satisfied) and Priority 3 (checkpoints that may be satisfied).
Depending on how many checkpoints a web page meets successfully, a different conformance level is achieved. WCAG 1.0 defines three conformance levels: Level A (all Priority 1 checkpoints are satisfied), Level Double-A (all Priority 1 and Priority 2 checkpoints are satisfied) and Level Triple-A (all Priority 1, Priority 2 and Priority 3 checkpoints are satisfied).
Web accessibility is measured on a page by page basis. The guidelines are applied to each individual page within a website and each page is awarded the appropriate accessibility level. A website is only as strong as its weakest page, but overall conformance is usually based on the level achieved by the majority of pages.
WCAG 2.0 were released in late 2008, bringing web accessibility guidance up to date with modern web development techniques. In the time since WCAG 1.0 were released, the assistive technologies used by people with disabilities improved dramatically. New technologies and techniques were also introduced into web development. WCAG 2.0 reflects these changes, taking a wider range of web technologies into account. WCAG 2.0 consist of four principles: content must be perceivable, operable, understandable and robust.
The principles are broken down into 12 guidelines, covering text alternatives, time based media, adaptable content, distinguishable content, keyboard accessibility, enough time, seizures, navigability, readability, predictability, input assistance and compatibility.
Each guideline is further broken down into a number of success criteria. There are 61 success criteria in total, and each one has a level assigned to it. WCAG 2.0 defines three levels of success criteria: Level A, Level AA and Level AAA.
WCAG 2.0 measures conformance using the same three levels that define the success criteria. This approach also carries through the conformance levels used in WCAG 1.0, although the requirements are slightly different: Level A requires all Level A success criteria to be satisfied, Level AA requires all Level A and Level AA success criteria to be satisfied, and Level AAA requires all Level A, AA and AAA success criteria to be satisfied.
The W3C recommend using WCAG 2.0 as the basis for all new and updated content on the web. They also recommend referencing WCAG 2.0 in web accessibility policies or statements.
However, as the W3C acknowledge, it is possible to conform to WCAG 1.0, or WCAG 2.0, or both. It is therefore likely that both versions will run in parallel for some time to come.
Many organisations use WCAG 1.0 as the basis for their web accessibility policies and programmes. Similarly, many accessibility targets are defined using WCAG 1.0. For example, the minimum target for UK government websites is WCAG 1.0 Double-A.
It will take time for all such policies, programmes and targets to be revised to use WCAG 2.0. Whilst that process is happening, WCAG 1.0 will remain in operation in a practical sense.
For organisations looking to the future, WCAG 2.0 should be used as the basis for all web accessibility policies, programmes and targets. As web development techniques continue to evolve, WCAG 2.0 provides the most robust platform upon which web accessibility can be built or measured.