Latency dropped to 38ms end-to-end
Rewrote the inference pipeline with quantized weights so narration feels instant.
Point your phone at the world and it tells you what's there, out loud, as it happens. It helps with getting around, reading signs, and the everyday things sight makes easy.
Best on mobile · Allow camera access · Korean, English and Nepali supported
No download needed · Works in browser · Try in Korean: select 한국어
01 · The problem
Every day, billions of people move through places that weren't built with them in mind. We're here to close some of that gap.
Unfamiliar streets, big indoor spaces, crowded stations. They're all hard to move through when you can't see what's coming.
1.3 billion people live with a disability. Most digital tools still aren't built around how they actually live.
Signs, menus, labels, a prescription bottle. So much of daily life is printed, and it's out of reach right when you need it.
Travelling, studying, an emergency. It all gets harder when the words around you are in a language you can't read.
02 · The solution
What you'd take in at a single glance, DrishtiLabs works out in about 40 milliseconds and speaks back to you right away.
Captures the world as it is.
On-device vision models analyze the scene.
It reads objects, text and distance, and works out what you're trying to do.
Natural, contextual narration in your language.
03 · Capabilities
Wherever we can, it runs right on your device. That keeps it fast, and keeps your data with you.
/ 01 Detects 1000+ everyday objects with bounding-box precision and depth estimation.
/ 02 It picks up how people, surfaces and spaces relate, not just what's in the frame.
/ 03 Reads signs, menus, labels and documents aloud in real time.
/ 04 Clear, quick narration that still cuts through on a noisy street.
/ 05 Speaks 40+ languages and translates what's around you as you go.
/ 06 Designed alongside the blind and low-vision community from day one.
04 · How it works
Open DrishtiLabs on your phone or wearable. The camera streams a live view to the model.
Camera online · 1080p · 30fps
A vision-language model reads objects, depth, text and motion in under 40ms, with the full response back to you in about 300ms.
Detected: crosswalk, 2 pedestrians, traffic light (red)
It only says what matters, clearly and calmly, in your language.
Wait. Red light. Two people on your left.
Tap or say a question. DrishtiLabs answers based on what it's seeing right now.
"What does that sign say?" → Emergency Exit
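For the technically curious, here is a minimal sketch of how the "only says what matters" step could work. It is an illustration under stated assumptions, not DrishtiLabs' actual pipeline: the Detection type, the priority scale, the narrate function and the "Crosswalk ahead." phrase are all hypothetical names made up for this example. It ranks what the model detected by urgency and speaks only the top few items.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One thing the vision model reported (hypothetical structure)."""
    label: str     # what was detected, e.g. "traffic light (red)"
    priority: int  # assumed urgency scale: lower number = more urgent
    phrase: str    # what would be spoken if this detection is narrated


def narrate(detections, max_phrases=3):
    """Return a short narration built from the most urgent detections."""
    # sorted() is stable, so equal priorities keep their original order.
    ranked = sorted(detections, key=lambda d: d.priority)
    return " ".join(d.phrase for d in ranked[:max_phrases])


# The example scene from the steps above, with made-up priorities.
scene = [
    Detection("crosswalk", priority=3, phrase="Crosswalk ahead."),
    Detection("2 pedestrians", priority=2, phrase="Two people on your left."),
    Detection("traffic light (red)", priority=1, phrase="Wait. Red light."),
]

print(narrate(scene, max_phrases=2))
```

With a limit of two phrases, the red light and the pedestrians are spoken and the crosswalk is left out, which keeps the narration short when something urgent is in view.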
05 · Vision

Prototype v2 · built in Kathmandu · ~12g · OV2640 camera · earbud audio (prototype) · bone conduction Phase 1 Korea
Proof-of-concept vision pipeline with live narration.
End-to-end product with onboarding and offline modes.
iOS and Android launch with wearable companions.
Custom optics with always-on, hands-free guidance.
Open accessibility platform for partners worldwide.
06 · Build logs
Rewrote the inference pipeline with quantized weights so narration feels instant.
DrishtiLabs guided a low-vision tester through a 1.2km route in Kathmandu with zero misses.
Shipped a calmer narration style after 30+ interviews with accessibility advocates.
Trained a denoising adapter that improves sign reading accuracy by 22% at dusk.
07 · Team
We're hiring researchers, engineers and accessibility specialists who care about this as much as we do.
Founder · CEO
Building AI that sees the world so more people can experience it.
Web & Backend
Keeps the backend fast and dependable, so the AI is there the moment you need it.
Research & Strategy
Connects the research with design that actually works for the people using it.
Open role
Engineering, research, design and accessibility roles open.
08 · Accessibility
We care about accessibility in the product and on this site too. We try to build it in from the start rather than bolt it on at the end, and we keep improving as we learn from the people we build for.
Every interactive element is reachable and operable with a keyboard alone, including a skip-to-content link.
Semantic HTML, landmarks and ARIA labels help assistive technologies announce content clearly.
Type, colour and spacing are chosen for legibility, and the site ships with fully themed light and dark modes.
The entire interface is available in English, Korean and Nepali, built on an i18n system designed to grow.
Layouts adapt fluidly from small phones to large desktops so content stays comfortable on any screen.
Accessibility is an ongoing commitment. Found a barrier? The link below reaches us directly, and we keep refining.
An intense 30-second audio experience. Headphones on, eyes closed, only what you can hear.
Found an accessibility barrier? We genuinely want to know.
Tell us how we can do better
09 · Contact
Investors, partners, accessibility experts, early testers. We'd love to hear from you.
Kathmandu · Remote