
Seeing AI from Microsoft

I’m going to talk about an accessibility app called Seeing AI, developed by Microsoft. While you might not need it yourself, it’s a fascinating, free tool for iOS with huge capabilities. I’ll walk through how I heard about it and the problem it originally solved, and then I’m going to do some crazy multimedia. I took a series of short videos of each of the tools within Seeing AI using iOS’s built-in screen recording function. The videos are embedded in the blog post as I go along, and I’ll play the audio from them for the podcast, setting up what’s going on in each segment.

I was alerted to Seeing AI by Kevin Jones (@kevinrj) after I did the review of the NiteIze Taglit, the little LED lights I clamp on my shirt when I walk my dog Tesla in the evening to keep from being plowed down by an unobservant motorist. Kevin is blind and wanted to know if there was any physical indicator to tell him whether the light was on, off, or blinking.

Since there wasn’t a physical indicator on the Taglits, he asked if I’d test a free app called Seeing AI from Microsoft. This app does a ton of cool stuff and I’ll walk you through all of it, but let’s start with the problem Kevin wanted to solve.

Seeing AI is available in the App Store, of course, and there’s a link to it in the shownotes. When you launch Seeing AI, you’ll find a series of buttons across the bottom, the last of which says “Light”.

Light

After tapping on Light, I held the Taglit under the camera on my iPhone and I heard a solid, low tone. Then I clicked on the Taglit, and as the light turned on, my phone started playing a higher-pitched tone. I clicked the Taglit again, which makes the light blink and then I could hear an alternating high/low tone from Seeing AI. One more click and the light went off and Seeing AI gave me the solid low tone again.
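If you’re curious how a tool like this might work under the hood, here’s a minimal sketch in Python. To be clear, this is my guess at the general idea, not Seeing AI’s actual implementation. The function names, the tone frequencies, and the 0.5 threshold are all made up for illustration, and I’m assuming the camera gives us a brightness reading between 0.0 (dark) and 1.0 (bright).

```python
from typing import Sequence

# Hypothetical values chosen for illustration; not taken from Seeing AI.
LOW_TONE_HZ = 220.0
HIGH_TONE_HZ = 880.0


def brightness_to_tone(brightness: float) -> float:
    """Map a brightness reading in [0.0, 1.0] to a tone frequency in Hz."""
    # Clamp out-of-range readings so the tone stays between the two limits.
    clamped = min(max(brightness, 0.0), 1.0)
    return LOW_TONE_HZ + clamped * (HIGH_TONE_HZ - LOW_TONE_HZ)


def describe_light(samples: Sequence[float], threshold: float = 0.5) -> str:
    """Classify a run of brightness readings as 'off', 'on', or 'blinking'."""
    if not samples:
        raise ValueError("need at least one brightness sample")
    # A sample counts as lit when it meets the (assumed) threshold.
    lit = [s >= threshold for s in samples]
    if all(lit):
        return "on"
    if not any(lit):
        return "off"
    # A mix of lit and unlit samples is treated as blinking.
    return "blinking"
```

With this sketch, a dark reading gives the low tone, a bright reading gives the high tone, and a series of readings that flips between dark and bright would be reported as blinking, which matches the alternating high/low tone I heard.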

I made a little recording of Seeing AI playing the sound. If you listen carefully, you’ll hear the slight click, followed by the tones indicating what the light is doing:

I think we can categorically say that Seeing AI will easily tell Kevin whether his Taglit is on, off, or blinking. But you know I couldn’t stop there. I had to see what else Seeing AI could do!

Short Text

In the next set of tools, you’ll hear Seeing AI read things out to us. If you’re blind and have VoiceOver enabled, it will do the reading and Seeing AI will not speak out loud. I think it’s pretty cool that Seeing AI will read out loud for those with low vision. It’s also useful for those with sight who want to test its features to learn about the power of the tools.

Seeing AI’s first button is entitled Short Text. They don’t define Short Text, but I interpret it as anything that’s not a document. With Short Text, Seeing AI reads to you in real time, doing optical character recognition (OCR) on the fly.

In testing Seeing AI, it’s really tempting for me to hold things up perfectly aligned and in view of the camera since I have the gift of sight. To stop my natural tendencies, I put on the blindfold I used for my Tech Talk at Macworld back in 2012 where I demonstrated VoiceOver on iOS and the Mac entirely blindfolded. Of course, I still had my blindfold!

I tested it on a tube of sunscreen whose ingredients I could not read at all, even with my reading glasses. While Seeing AI wasn’t perfect, it did a lot better than I could on my own.

In the audio/video you’re about to hear, I am holding up to the camera the box for my Ring Chime. I start by holding it perfectly in view and then I start making it more challenging, holding it at crazy angles. You’ll hear it refer to “IT” as though it’s an acronym, but the word should be “it”. In the third set of things it read, the box is at a 45-degree angle with the text nearly upside down. I thought it was very impressive.

Document

Now let’s say you’ve got a full page you need to read. This is where the Document section comes into play. Again testing with my blindfold, I put a white piece of paper with printed text on it down on my brown faux-wood desk. I heard Seeing AI tell me that it couldn’t see the upper left corner, so I moved in that direction. Then it said the bottom right was missing. Oh, duh, that means I’m too close. It gives you these verbal instructions to not only tell you how to align the document but also to make sure you’re far enough away that it can see the whole thing.

Once I was far enough away, it told me to hold steady, and then I heard the camera shutter. After that, it played a little tune for maybe 2-3 seconds and said: “processing”. At that point, nothing else happened so I had to peek out from under my blindfold.

Seeing AI was now displaying the entire document, completely OCR’d. On the bottom left was a play/stop button which, when pressed, would read out the text. If you have VoiceOver turned on, you can just drag your finger over the text and it will read it out loud, or you can hit the play/stop button. But if you’re low vision, there are also A+ and A- symbols that increase and decrease the text size when tapped, for easier reading.

I like that the Document setting is useful for both the blind and the visually impaired. It’s really cool to see the dual use. One warning: the OCR is highly dependent on how much light you have. I got some poor results, so I moved a bright light over the document and it did much better. The good news is that if you’re blind, you can use the Light function in Seeing AI to find out if you have adequate lighting!

Let’s listen to the Document tool in action. Unfortunately, there’s a conflict between Seeing AI reading out loud and the screen recorder functionality in iOS, so I wasn’t able to capture the reading part. I can tell you that the OCR was well over 90% accurate, though.

Product

The next one isn’t nearly as big of a challenge – it’s called Product and the problem it solves is identifying a product by its barcode. If you’re a sightling, it’s pretty easy to line up a barcode under the camera, but how the heck do you do that if you can’t see?

Seeing AI solves this in an interesting, audible way. It gives you a fast beeping sound if it sees a barcode, and then makes a tone when it thinks it has captured the code. In the recording below, you’ll hear it try and fail to find a barcode and then succeed the second time. I do it on three packages, and even though I’m not holding the boxes aligned properly at all, it identifies all three products successfully using their barcodes.

Person

I’m not sure whether the next feature would be useful or rudely intrusive in real life, but it’s called Person and it’s meant to help you identify people. The idea is that you take three photos of someone with Seeing AI, and then you give the person a name. The camera gives feedback on whether you’ve gotten them in the frame, but they suggest you have the person take their own picture for you. I think letting the person take the photos would make it a little bit less weird when you use it.

Anyway, after you’ve stored the person in your Seeing AI app, you can simply point your phone at them and it will tell you who they are and tell you how far away they are. I used Seeing AI to take pictures of Steve and me and it was flawless in telling us apart. I know we’re really similar being born on the same day only 4 hours apart, so that’s pretty amazing. Here’s what it sounds like as I hold the phone facing first Steve, and then flip the camera around towards me and then back.

Currency Preview

The famous blind jazz musician Ray Charles is reported to have insisted on always being paid in single dollar bills so he could be certain he wasn’t being cheated. With Seeing AI’s currency reader, he wouldn’t have needed to do that. Seeing AI isn’t the first iOS app to solve this. I told you about LookTel’s Money Reader app way back in 2011, but it’s cool that this feature is folded into Seeing AI.

The tool is called Currency but it’s labeled Preview to designate it as an experimental tool. Not sure why since obviously this one isn’t as hard as identifying a person.

Here’s a quick demo where I hold a $1 and $20 bill at all kinds of angles, showing only part of the bill to the camera and it correctly identifies them.

Currency Preview also has options to identify British Pounds, Canadian Dollars, and Euros. We had some British and European currency but oddly Currency Preview stayed completely silent when those options were set. I guess this is the experimental part they were talking about.

Scene Preview

The next experimental tool is supposed to identify scenes. To use Scene, you point at the area you want it to identify and then tap on the screen for it to take a picture. You hear the happy thinking sounds and the word “processing” and then it comes back with text that it reads out telling you what it found. From there you can save the photo it took, share the image, or close the image.

In the demo, I’ll point it first at my bathroom from an angle where you can only see the sinks, cabinets, and mirror. Next I point it at Tesla asleep on her dog bed, and finally, I point it at my bed and dresser with a window behind it.

The answers were a little bit comical, as you can tell. After this demo, I was playing with it and took a picture of just part of my computer screen with not much showing on screen. It came back with “probably a screenshot of an open laptop computer sitting on a desk”. For what it saw, that was a darn good description.

Color Preview

Another experimental feature is Color Preview. I’m not going to play a recording of this because it’s kind of boring. It does say colors but it’s highly dependent on lighting for accuracy. I’m not sure it would be of that much use, but maybe I don’t understand the use case.

Handwriting Preview

Probably the tool that surprised me the most is called Handwriting Preview. My handwriting isn’t illegible, but it’s also kinda weird. I can’t seem to decide whether I should be printing or using cursive so my text is pretty random. I read once that someone whose writing changes mid-word from printed to script and back has some sort of psychopathy. Either way, I figured my random handwriting would give Seeing AI a run for its money.

I wrote on a pad of paper the words, “Hi this is Allison Sheridan. I’m a podcaster with an EVER so slight Apple bias.” In the demo you’ll hear, I first bring the camera over the pad with it right-side up and centered.

With no indicators to tell me whether it can see the page or the text, it suddenly says “Processing” and then speaks the exact words I wrote. I did notice that it said “command” at the beginning but I finally figured out why. My keyboard is slightly viewable in the upper left, and the command key is showing! Pretty funny.

Next I turned the paper so it was at a diagonal. You’ll hear a lot of gibberish out of Seeing AI with that angle but a few of the correct words are still in there. I then turned the page upside down and you’ll hear only gibberish. Finally, I put it back upright and you hear the text clearly read.

Handwriting Preview is definitely impressive but a bit more challenging to use. I guess my advice would be that if you hear gibberish, try turning the paper to different 90-degree angles until you hear something intelligible. You’d also have to flip back and forth between Short Text and Handwriting Preview to figure out what’s on the paper in the first place.

Conclusion

Seeing AI is an amazingly powerful toolset that is free from Microsoft. I really like the new Microsoft with their increased focus on accessibility and of course their support for iOS. I highly encourage you to download Seeing AI and see for yourself.
