Viewpoint is an AI-powered accessibility tool that runs alongside your existing screen reader and handles tasks that traditional screen readers often struggle with. Using Google's latest Gemini AI model, Viewpoint helps you navigate inaccessible software, read on-screen text, and access scanned or otherwise inaccessible PDF files.
Viewpoint works by taking a screenshot, analyzing it to identify user interface (UI) elements and text, and presenting the results back to you in an accessible format. It offers four modes to handle different situations.
The primary modes of Viewpoint are:
- UI Mode: This mode is designed for interacting with inaccessible software. When activated, Viewpoint identifies any on-screen UI elements like buttons and text fields, allowing you to navigate and interact with them.
- OCR Mode: This mode identifies all text currently visible on the screen and displays it in a simple, accessible dialog for your screen reader to announce.
- Query Mode: This mode lets you ask natural language questions about what is on your screen or find specific UI elements. For example, you could ask "where is the login button?", and Viewpoint will find the button and open it in UI Mode for you to interact with.
- PDF Reader: This mode extracts text from scanned or image-based PDF files that are normally inaccessible to screen readers and displays the content in a simple dialog.
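
To make the idea of identified UI elements more concrete, here is a minimal Python sketch of how a detected element might be represented and looked up from a question. It is an illustration only and not Viewpoint's actual code: the `UIElement` class, its fields, and the `find_elements` helper are hypothetical names, and the simple label matching stands in for the analysis that Viewpoint delegates to the Gemini model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class UIElement:
    """One element detected in a screenshot (hypothetical shape)."""
    role: str    # e.g. "button" or "text field"
    label: str   # visible text or inferred name
    x: int       # left edge, in screen pixels
    y: int       # top edge, in screen pixels
    width: int
    height: int

    def center(self) -> Tuple[int, int]:
        """Return the point an activation (such as a click) could target."""
        return (self.x + self.width // 2, self.y + self.height // 2)

    def announce(self) -> str:
        """Return text a screen reader could speak for this element."""
        return f"{self.label}, {self.role}"


def find_elements(elements: List[UIElement], query: str) -> List[UIElement]:
    """Stand-in for Query Mode: keep elements whose label appears in the query.

    This is a plain case-insensitive substring check, used here only so the
    sketch runs without a model.
    """
    q = query.lower()
    return [e for e in elements if e.label and e.label.lower() in q]


# Pretend these came back from analyzing a screenshot.
detected = [
    UIElement("text field", "Username", 100, 120, 200, 30),
    UIElement("button", "Login", 100, 200, 80, 30),
]

matches = find_elements(detected, "Where is the login button?")
for element in matches:
    print(element.announce(), "at", element.center())
```

Storing a bounding box for each element is one plausible design: it lets the same record serve both for announcing the element and for working out where to interact with it on screen.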