Our current gecko accessibility engine (GAE) happily serves chrome and content information through synchronous desktop accessibility API such as MSAA/IA2. For example a screen reader makes a synchronous IPC call over COM via MSAA/IA2 into our process where we (GAE) grab the information requested and return it (pass it back).
Firefox will almost certainly be moving web content rendering out of the chrome process. Communication between chrome and content processes will be done asynchronously because users like their browsers to be responsive. Desktop accessibility API is implicitly synchronous. In addition to the chrome (browser UI) process, content processes contain a lot of juicy information (i.e DOM) that needs to be served through desktop accessibility API.
What to do?
Forward messages between the chrome and content processes.
Fail. Google tried this and it is too slow.
Cache all the chrome and content information in one process.
Google is now doing this with some success. If a screen reader is detected, they cache the accessible DOM tree(s) in the main browser process. Cache latency can lead to screen readers getting stale information sometimes, but it is expected that firing desktop update events will mitigate this.
Create asynchronous desktop API.
Yeah, this might be a pipe dream.
I am currently thinking about something like H2, but that would allow assistive technology to transition to H3. In terms of pictures, here is my thinking:
This minimalist figure shows boxes for chrome, and content processes with asynchronous connections to box that represents an accessibility process called "Async Accessibility" (needs a better name). This box butts up to another box that essentially represents our current gecko accessibility engine, which in turn has a synchronous connection to assistive technology (over desktop API).
If we then begin providing desktop access to "Async Accessibility", my hope is that progressive AT could begin using this API when they are ready. As I write this I admit I have serious doubts this would actually happen, but I want to get this thinking out there.
This figure shows AT using asynchronous API to access "Async Accessibility" through some as yet undefined API and usage pattern.
In some ways H2 and H3 are similar, with H2 being more straightforward. I haven't attempted to estimate the engineering effort for either approach but there is a good chance we'll end up with something closer to H2. It all depends on resourcing and time, and how quickly our desktop firefox moves to multi-process content and how long we might support a single process mode.
I'd like to thank Josh Matthews (jdm) for sanity checking this before I posted it. Feedback is welcome. I'm hoping to iterate on this and to ultimately develop a solid implementation plan by mid December.
Thanks for reading.