Several years ago, Ilya Ovodov, a specialist in computer vision, and his wife, Olga, from Mendeleyevo settlement in Moscow region, adopted a sightless girl, Angelina. She was nine, she could read and knew Braille letters, although she was just preparing to enter the first grade. It was for her that Ilya Ovodov invented Angelina Braille Reader, the program which translates Braille into ordinary text. Thanks to this tool Angelina was able to study at a regular school.
Now Ilya’s invention is used by teachers, parents and students. The project won several contests, and the Russian President Vladimir Putin promised to support the reader.
In an interview with the Special View portal, Ilya Ovodov told us how the program was created, how it works, and what lies ahead for it.
Let’s go back to the very beginning: how did you come up with the project and how much time did it take to develop it?
The idea came very easily: every author writes the book he or she would like to read. It had been two and a half years since we’d adopted Angelina. We took her from Sergiyev Posad orphanage. The girl had had a very hard life. When you take a child from an orphanage, blindness is the least of your concerns. Do you know how they teach blind children in Russia? First of all, it is done at a residential school. We live in Moscow region, so we had several specialized schools to apply to: in Korolyov, in Sergiyev Posad where she came to us from, and residential school # 1 in Moscow. So we were supposed either to send her to a residential school, or to waste half a day driving her to school and back. And for a child, the most important thing is to feel that he or she has a family, Mom and Dad to talk to.
But then the miracle happened. I guess, in such circumstances God helps the blind and the orphans. There is only one school in our settlement. When Olga went there with Angelina, a primary school teacher said: “Oh my, you’ve got a new child? Is she blind? You know, I am a certified visual impairment specialist. This year my fourth-graders will graduate from primary school, and in the future I will be able to give her individual lessons”.
Audio description: a coloured photo. Angelina in a grey jersey is sitting at a writing desk. There is a Braille script aid in front of her. To the left, there are a desk cup and a doll in pink clothes. The girl is propping the doll with her hand.
We began studying at home, and there were all these textbooks. By the end of the first year Larisa Nikolayevna (Angelina’s teacher; editor’s note) was about to give up. She spent so much time preparing for these lessons. And then my wife (the author of the idea) said, “Listen, Ilya, computer vision is your job. At work, you write programs that can do amazing things. Can’t you write a program to decipher all these dots?” I had no choice but to get down to writing this program.
Imagine: here is our child sitting with her Braille textbook, and me beside her with the same textbook in regular script. Naturally, text layout is completely different. Even if you know Braille, it is difficult to read the writing, especially if the child’s hand is covering a part of the page. After I wrote the program, it took several months for it to start working properly, when I was able to take a picture of a textbook and get the idea of what was written there and where each portion of the text was placed. The quality used to be lower than now. But all the same, it was usable.
In the second grade we began taking pictures of textbooks and printing them out. As a result, we had transcriptions of all the textbooks, in which each Braille letter had the corresponding print letter written under it. After that it dawned on us that we were not the only people who had to struggle like this, and we were not the only ones who needed the program.
Our next goal was to improve the quality and give other people the opportunity to use the program. I worked on it in my free time. The users needed the program to work online. For me this was an unusual challenge. I am a computer vision specialist, but not an expert in web design and creating apps. So I had to learn all this, too. Even now the website is a make-do-and-mend affair, but it works. Also, I added the function of sharing the translation result via email.
At the moment, we are getting feedback from users, so it is becoming clearer what needs fixing. One of our priorities is making texts that users have already translated available for opening and downloading. I have put some effort into improving the quality of the program. This academic year, we decided we should show the tool to other people. We sent it to specialized schools. I hope that now, especially after the president said some words of appreciation about it, people will start using it. It would be such a waste to have an amazing tool but no way to tell the people who need it most about it.
Audio description: a coloured group photo. The Ovodovs have gathered by a Christmas tree and a fireplace: six adults, three teenagers and two girls of about ten. All of them are dressed casually. Some of the family members are standing, three of them are sitting on chairs in the front row, Ilya among them. He is gently hugging his wife Olga who is standing beside him. Next to them, Angelina is sitting in the lap of a thickly-built aged man with a beard. Her gaze is unfocused. A large golden retriever is sleeping at her feet.
Please elaborate on the program’s working principles.
The technology that makes what I’m doing possible appeared only a few years ago. When I started working on the program a year and a half ago, being a technical person, I read a lot of articles, learned about similar tools, and compiled a list of references. I have about 50 articles on optical recognition of Braille script among my files. I analyzed them and thought I could base my work on programs that were available at the time, because I don’t like to reinvent the wheel. But they all used what specialists would call the classic computer vision approach. It does not involve neural networks, but relies on simpler, more direct solutions. All of them stumble over the peculiarities of Braille script. They can interpret the script properly only if the Braille dots are arranged along straight lines. Most papers on optical recognition of Braille mention the problem of putting a Braille book into a scanner, which is quite a challenge. Besides, the scanner needed in this case is not an ordinary one from a computer shop, but a specialized one. You need to align the Braille script so that the curvature of the lines does not exceed the distance between them.
By then I was often using neural network approaches in my work. It turned out that neural networks were more flexible when it came to recognizing Braille symbols. Using the context around a Braille symbol, they can work out which dot sits in which row. If you know how to handle them, these networks cope very well with cases where the page is not completely flat, is warped, or has been photographed at the wrong angle.
The algorithm works in stages. The first stage is Braille symbols recognition. The symbols are recognized as a whole. Then the symbols are united into lines. The next thing is to translate Braille text into ordinary text. Translating a text written in one language is not a problem. But then there are nuances, such as numeric characters, math problems, paragraphs, fractions, words or phrases in English, Roman numerals. This is difficult.
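The interview does not say how the recognized symbols are united into lines, so the following is only a minimal sketch of that one stage under assumed inputs. It supposes the recognizer returns a center point and a label for each symbol (the `Cell` type, the `group_into_lines` function, and the `max_dy` tolerance are hypothetical names introduced here), and it groups symbols greedily by vertical position.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One recognized Braille symbol (hypothetical structure for illustration)."""
    x: float    # horizontal center in the photo, in pixels
    y: float    # vertical center in the photo, in pixels
    label: str  # symbol produced by the recognition stage


def group_into_lines(cells, max_dy):
    """Greedily unite symbols into text lines.

    Symbols are visited from top to bottom. A symbol joins the current
    line if its vertical center is within max_dy of the symbol added to
    that line just before it; otherwise it starts a new line.
    max_dy is an assumed tolerance that should stay below the line spacing.
    """
    lines = []
    for cell in sorted(cells, key=lambda c: c.y):
        if lines and abs(cell.y - lines[-1][-1].y) <= max_dy:
            lines[-1].append(cell)
        else:
            lines.append([cell])
    # Within each line, restore reading order from left to right.
    return ["".join(c.label for c in sorted(line, key=lambda c: c.x))
            for line in lines]


if __name__ == "__main__":
    detected = [
        Cell(30, 12, "c"), Cell(10, 10, "a"), Cell(20, 11, "b"),
        Cell(10, 50, "d"), Cell(20, 52, "e"),
    ]
    for text in group_into_lines(detected, max_dy=5):
        print(text)
```

A simple rule like this would break down on strongly curved or tilted lines, which is exactly the kind of situation the interview says needs more flexible handling.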
Simply put, there is a symbol that marks the beginning of text in English, but the switch back to Russian is often not marked as it should be. Or, for example, we have the Roman numeral I followed by text in Russian. The program may miss the marking symbol or misrecognize it, and after that everything else goes haywire. As a result, we have to write linguistic rules to avoid such trouble. Of course, there is much to be improved here. For instance, for now I have had to limit a switch into English within Russian text to just one word. One can also set English as the language of the original text, and then everything will be rendered in English print, including all the Russian letters.
One needs a complex approach based on what we are going to analyze, what kind of a text, what is the correct way to interpret it, but for now, this is just a task ahead of me. Maybe, this is what should be done in the future.
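The rule of limiting a switch into English to a single word can be pictured as a small state machine. The sketch below is an illustration under stated assumptions, not the program's actual code: the letter tables cover only a handful of cells, and the `LATIN_MARK` cell is a placeholder chosen for this example rather than the marker Russian Braille actually uses.

```python
# Tiny illustrative tables: the same Braille cell maps to a different
# letter depending on the active alphabet.
RUSSIAN = {"⠁": "а", "⠃": "б", "⠅": "к", "⠕": "о", "⠞": "т"}
LATIN = {"⠁": "a", "⠃": "b", "⠉": "c", "⠅": "k", "⠕": "o", "⠞": "t"}

LATIN_MARK = "⠰"  # placeholder marker cell, an assumption for this sketch


def translate(cells):
    """Translate a string of Braille cells into print letters.

    The text starts in Russian. LATIN_MARK switches to the Latin table,
    and the next space switches back to Russian, so an English insert
    never extends beyond one word even if the return marker is missing.
    Cells missing from the active table are rendered as '?'.
    """
    table = RUSSIAN
    out = []
    for cell in cells:
        if cell == LATIN_MARK:
            table = LATIN
        elif cell == " ":
            out.append(" ")
            table = RUSSIAN  # the one-word limit: reset at the word boundary
        else:
            out.append(table.get(cell, "?"))
    return "".join(out)


if __name__ == "__main__":
    print(translate("⠅⠕⠞ ⠰⠉⠁⠞ ⠅⠕⠞"))
```

The trade-off is the one described in the interview: a missing "back to Russian" marker can no longer garble the rest of the text, but an English phrase of several words would need a marker before every word.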
Audio description: a coloured photo. A sunny day. Ilya and Angelina are sailing on a yacht at sea. They are winding a rope onto the drum of a winch. Ilya is wearing a grey T-shirt and a wide-brimmed bucket hat. He is turning the handle of the winch. Angelina is wearing a bright pink sundress. She is sitting in his lap and helping him. Behind them, a Russian flag is attached to the side of the vessel, and mountains are silhouetted on the horizon.
Do any mistakes occur while recognizing Russian text?
There may be problems, but that depends on the quality of the original. If it is a simple literary text, the quality of symbol recognition is quite high. If we have inserts in another language or math problems within the text, errors are more likely to occur.
But our website has a Help page with tips on how to take a picture of the text. The program has its limitations. If you follow the instructions precisely, it will work just fine. People often write to me that the program doesn’t work and produces gibberish. But then I look at the picture they have fed it and see that these people never even read the instructions.
And the requirement is simple: the light should not be directed onto the page from above, but fall diagonally, from the side opposing the camera. The second requirement: the picture of the page must be taken from above.
But I am planning to adjust the program so that the neural network will be able to cope with difficult situations. For this reason, when you use the program, the following question pops up: “Do you give your permission to use the text you are uploading in order to further improve the program?” What is more, I am planning to publish all the texts with such permissions on the net so that others can build on this work.
How many people are using the service now?
340 people have registered on the www.angelina-reader.ru website. About a hundred of them got interested and joined during the testing stage in the first year of development, when the program was still quite unfinished and weak. The rest came this year, around September, because we started to actively promote the initiative and shared the tool with a number of schools.
If I am not mistaken, the program users are mostly teachers and parents?
Teachers, parents and students. I can give you an example: there is an amazing young woman who graduated from The Higher School of Economics with a bachelor’s degree this year. She wrote her diploma work in Braille. Her wonderful teachers studied Braille in order to teach her. When the program came out, they began to make extensive use of it. The young woman asked me to add French Braille into my reader, but I haven’t succeeded yet. I don’t know French and don’t see how to make it happen. And what’s more, I don’t know French Braille.
At the moment, my priority is to make this program function properly for it to be used in Russian schools.
Am I right to suppose that this program has no analogs in Russia or elsewhere?
I’ve been looking for any for a year and a half but have found none. The only thing I was able to find was a company in Russia that offered a program recognizing Braille and installation of the same specialized scanner. But frankly speaking, I couldn’t run their program properly.
One of our testing team members lives in Australia. The child’s mother showed the program to the teachers at a specialized school, and they said, “Wow! We’ve never seen anything like this. Why don’t we have such a thing the Russians have?”
As for the methods I’ve used, after I’d already implemented the algorithms, I learned that Chinese developers had sent an article to a US computer vision conference where they’d described an approach of Braille recognition rather similar to the one I’d invented.
Unfortunately, while a year and a half ago it was a revolutionary idea, now it no longer seems so extraordinary. But what the Chinese developers described was just an algorithm, not a usable solution. On the other hand, I owe much to them, because they had the only dataset, about one hundred pages of Braille text, in which every symbol was labeled. I used this dataset as my starting point, and without it I wouldn’t have been able to get my system working so fast or make it so effective.
After my neural network had learned from the Chinese dataset, it was able to recognize texts in Russian. And this gave me a foundation I could build upon. From the texts I prepared in Russian, I could select what had been identified acceptably and discard the errors. I had this interesting and quite out-of-the-box idea: I used short poems to train my neural network. They are quite convenient because in a poem we know for sure what is written in each line. We can compare the text produced by the program with the original text of the poem line by line, and see what has been successfully identified and what hasn’t, and which parts of which pictures have been properly recognized.
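The line-by-line comparison with a known poem can be sketched with Python's standard `difflib` module. This is only an illustration of the idea, assuming the recognized lines are already paired one-to-one with the poem's lines; the function names and the 0.9 threshold are hypothetical and do not come from the program.

```python
from difflib import SequenceMatcher


def score_lines(recognized, reference):
    """Pair each recognized line with the known poem line and score the match.

    Returns a list of (reference_line, recognized_line, similarity) tuples,
    where similarity is between 0.0 and 1.0.
    """
    return [
        (ref, rec, SequenceMatcher(None, rec, ref).ratio())
        for rec, ref in zip(recognized, reference)
    ]


def accepted_lines(recognized, reference, threshold=0.9):
    """Indexes of lines recognized well enough to be kept as training data.

    The threshold is an assumed value chosen for this sketch.
    """
    scored = score_lines(recognized, reference)
    return [i for i, (_, _, ratio) in enumerate(scored) if ratio >= threshold]


if __name__ == "__main__":
    poem = ["roses are red", "violets are blue"]
    output = ["roses are red", "vxxxxts zzz blue"]
    print(accepted_lines(output, poem))
```

Lines that pass the check could then be reused with the known text as their labels, while lines that fail point to the parts of the photo that were recognized poorly.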
In an interview, you said that the program is installed on your personal computer. All the users that come to your website are actually “logging into” your computer. In the future, are you planning to put your program elsewhere?
I’m hoping to work on it soon. In May, I applied for a competition organized by The Agency for Strategic Initiatives (ASI). The competition is called World AI&Data challenge, and it has three stages. The first stage is the contest of problems. People who think they’ve found a socially important problem and have the data needed to solve it submit the problem to the contest. Next comes the selection among the applications, when the ASI picks out the problems that can be solved by the community. The next stage is the solution itself. The problem of recognizing Braille script was among those that passed into the second stage. The idea seems to be in the air. I submitted my program and took second place. First place went to a solution for flood prediction.
Thanks to this competition, several days ago I was talking about my program to Vladimir Putin. Agency for Strategic Initiatives arranged this. Also, Yandex company has given me its support. It gave me a grant so that I could develop my system based on its facilities. I hope that the program will soon move out of my computer, especially as the number of users is growing.
What are your plans for the future? Are you going to add other languages, or make the program autonomous, or create a mobile version?
A mobile version is still an open question. But at the moment it doesn’t seem to be a priority. I think that in the immediate future the program will stay Internet-based.
At the moment I’ve got two website versions: a PC version and a smartphone version. They are different. In the mobile version there is a Take A Picture button. One just pushes it, takes a picture using smartphone camera and sends it directly into the program. The only drawback is that it works as a website. Some people don’t know that you can make an icon of the web page and put it onto your smartphone homescreen, so it will look like an app.
I’m planning to improve the program’s functions, add the opportunity of viewing things the user has already recognized, to go back to previous files. The website itself doesn’t look like much, I’d like it to look better. My IT colleagues have already offered their help. I hope we’ll team up nicely.
As for other languages, I have that in mind. But when this comes to be is still uncertain.
What does the project need right now?
People can already use the program, so I would like information about it to spread. But the most important resource, and the one I am always short of, is my personal time. I’m coordinating the project and directing it, and now I’ve got several people who are ready to contribute to it.
IT specialists have the concept of an open-source project: the project’s source code is open. It is published on the net (you may find the source code of the program here), and anyone can download it. Those who are good at programming can improve the code and then submit the proposed improvement to the project’s central page. In any case, for a project to develop, it needs a person who maps out its strategy and decides which aspects need improving and what the project needs. At the moment these are my responsibilities.
This project is non-profit. I’m thinking about whether to add a Donate button, so that people can contribute as much as they wish. I don’t want to make the users pay for the reader, and for that I have both ethical and pragmatic reasons. I also hope that people who are actually going to contribute will do so for the love of art and humanity.
Audio description: a coloured photo. Ilya Ovodov and his wife Olga are standing in front of high doors and heavy green curtains. Ilya is wearing a white shirt. He is looking at his wife, hugging her shoulders lightly, his eyes half-closed, a gentle half-smile on his lips. Olga is looking directly at the camera. She has an honest round face with no makeup, fair hair, a slight smile on her lips. She is wearing a blue jumper with rounded neckline.
Your project became one of the winners of We Are There For You Moscow region governor’s award, a contest organized by The Agency for Strategic Initiatives (ASI). The president promised to give you information and grant support. Does this all prove that the society has begun to care more about the problems of people with visual impairments?
Yes, I have a feeling that in the world and in Russia, the society is turning towards the disabled, and the blind in particular. We can see Braille signs, wheelchair ramps, beeping street lights appear all over the place. Even the fact that of all the projects presented for the contest ASI chose this one, shows that they see the program for the blind as a priority.
But at the same time I’ve got another thing that gives me pain. I wanted to tell Vladimir Putin about it while I was describing my project, but I got a bit confused.
In our family, we have two children with disabilities: Angelina and Varvara. Angela recently underwent surgery in St. Petersburg, performed by a marvelous surgeon, Oleg Diskalenko. At the orphanage she was given the same verdict year after year: no treatment possible, the prognosis is unfavorable. Oleg Diskalenko operated on her, and now Angelina is able to read font size 80. Varya, whom we took from an orphanage five years ago and who could only sit in the lotus position, can now walk around the house using special devices. She is already thinking about what she could do for a living. She is twelve now. After the orphanage, both girls would most likely have ended up in a psychoneurological residential facility.
The government amply supports us with money as parents of children with disabilities. But the real heroes are those parents who are raising such children on their own, because they don’t get such support. Disability pension amounts to 10 thousand rubles. For a family, it is a catastrophe. That is why such families are sorely tempted to put disabled children into an orphanage where the government will pay them big money, much more than it is paying us now. But the child will end up in a psychoneurological residential facility anyway. Alternatively, the orphanage will look for foster parents for the child and offer money to them. And what will be the motivation of such a foster family? Some people may adopt a child because they sincerely care, but others may be enticed by the money. This is always a risk. And this is a real problem. It would be fairer to support the biological parents who are raising the child and giving him or her proper treatment.