I built a web app to practice the American Sign Language (ASL) alphabet in one week, using a 10-year-old laptop. The app is live at handsign.pages.dev.
This isn’t a technical deep-dive but rather a summary of my experience turning a frustration into a focused MVP. It started, as all good products should, with empathy.
Starting with a real problem
The idea didn’t come from a brainstorming session; it came from a movie I had recently watched, CODA, about a hearing daughter struggling to relate to her deaf family. Her challenge highlighted a fundamental barrier: learning ASL is hard, not because the resources don’t exist, but because the journey is full of friction.
My goal became a statement of intent:
Make learning ASL more accessible.
Your user is not everyone
At first glance, the ecosystem around ASL is complex: instructors, learning platforms like Udemy, students, and even governments driving inclusion. To focus my effort, I prioritised stakeholders based on a set of criteria:
- Market Size: Where is the largest available audience?
- User Need: Which segment is most underserved?
- Growth Potential: What offers long-term viability? (I noted this as a corporate concern, but less so for my immediate goal).
After reviewing some publicly available research reports, I landed on an obvious answer: students and learners. They represent the largest, most underserved market with the most acute need. This is who I would build for.
I reached out to online ASL learning communities on Discord and Reddit and conducted open-ended interviews with different kinds of ASL learners (active, stalled, and aspiring) to understand their learning barriers. Three core problems emerged:
- Dialect Paralysis: Beginners don’t know which dialect to start with.
- The Practice Gap: It’s hard to get real-time practice without being immersed in the deaf community.
- Time Scarcity: Learners lack time for dedicated practice.
To move forward, I had to prioritise again. I assessed these problems against criteria that matter: Is the problem real? How high is the customer impact? And does solving it align with my goal of making ASL more accessible?
| Problem | Is this a real problem? | Impact to the customer | Does it tie to my goal? |
|---|---|---|---|
| Not enough time | M | M | H |
| Can’t find someone to practice with | H | H | H |
| Don’t know which dialect to learn | M | L | L |

(H = high, M = medium, L = low)
The problem was coming into focus: the inability to practice and get real-time feedback was the most painful and acute one.
The goal was further refined:
The core problem to solve is practicing ASL with real-time feedback.
Building within my means
Wishes are ideas without constraints, and I had a few constraints:
- Time: I had just one week to build an MVP.
- Cost: The only hardware I had was a 10-year-old laptop with 8GB of RAM, so any ML models had to run on the CPU rather than depend on a GPU.
- Effort: I was already familiar with web development and machine learning, so I would have to leverage what I already knew.
With these constraints in mind, I evaluated three potential solutions, supported by feedback from user surveys:
| Solution | Cost/Effort | Directly impacts user problem | Unique Value/Differentiator |
|---|---|---|---|
| Anki-style memorization system | Low | Low | Low |
| Camera-based Sign Recognition App | High | Medium | High |
| Platform to connect learners with practitioners | High | High | Low |
The platform connecting learners with practitioners would have had the highest impact, but the cost and effort were too high for an MVP. The Anki-style system was low effort but failed to address the core problem of practice and feedback. That left the camera-based recognition app: more effort, but it directly addressed the practice gap and offered a clear differentiator.
The architecture of an MVP
So I decided to build a computer vision app that runs entirely in the browser. This approach meant no downloads, no installations, and maximum accessibility. The ML model runs client-side, processing the video feed directly on the user’s device.
We’ll get a bit technical now. The system architecture is composed of three core components:
- Component #1: Gesture Capture
To capture hand movements via the webcam, I used a third-party React module that processes the webcam feed and outputs a JSON array representing the hand’s position. I confirmed it worked by testing the output data type and ensuring all dependencies loaded correctly.
- Component #2: Hand Detection
Interpreting the captured gestures and detecting ASL alphabet signs meant comparing the incoming JSON data from the Gesture Capture module against a trained dataset of signs and ranking potential matches by a confidence score. The initial accuracy for a few letters was around 30%, which I improved by focusing the training data on different poses of the same letter. I chose to optimize for accuracy over latency, as users have a higher tolerance for slight delays than for incorrect feedback.
- Component #3: Flashcard Component
To provide visual feedback to the user, this component takes the highest-confidence letter from the Hand Detection component, looks up the positions of the hand joints, and draws a skeletal overlay and the corresponding letter on the screen. Fortunately, it performed adequately even in low-light environments. A rough sketch of how these pieces fit together follows.
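Here is a minimal TypeScript sketch of that pipeline. It assumes the TensorFlow.js handpose model for landmark capture and the fingerpose library for sign matching, which may not be exactly what the app ships with, and the single letter definition is illustrative only.

```ts
// Minimal sketch of the capture -> detect -> feedback pipeline, assuming the
// TensorFlow.js handpose model for landmarks and the fingerpose library for
// matching. Actual dependencies and letter definitions may differ; the
// letter "B" description below is purely illustrative.
import '@tensorflow/tfjs';
import * as handpose from '@tensorflow-models/handpose';
import * as fp from 'fingerpose';

// Component #2's "trained dataset of signs", expressed as finger curls:
// for "B", the four fingers are extended and the thumb is folded in.
const letterB = new fp.GestureDescription('B');
for (const finger of [fp.Finger.Index, fp.Finger.Middle, fp.Finger.Ring, fp.Finger.Pinky]) {
  letterB.addCurl(finger, fp.FingerCurl.NoCurl, 1.0);
}
letterB.addCurl(fp.Finger.Thumb, fp.FingerCurl.FullCurl, 1.0);

const estimator = new fp.GestureEstimator([letterB /* ...the other letters */]);

export type HandposeModel = Awaited<ReturnType<typeof handpose.load>>;

// Components #1 + #2: pull hand landmarks out of the current video frame and
// rank candidate letters by confidence (fingerpose scores run from 0 to 10).
export async function detectLetter(model: HandposeModel, video: HTMLVideoElement) {
  const hands = await model.estimateHands(video);
  if (hands.length === 0) return null;

  const result = estimator.estimate(hands[0].landmarks, 7.5);
  // The field is `score` in recent fingerpose releases (`confidence` in older ones).
  const best = [...result.gestures].sort((a, b) => b.score - a.score)[0];
  return best ? { letter: best.name as string, landmarks: hands[0].landmarks } : null;
}

// Component #3: draw a simple skeletal overlay plus the detected letter.
export function drawOverlay(ctx: CanvasRenderingContext2D, landmarks: number[][], letter: string) {
  ctx.clearRect(0, 0, ctx.canvas.width, ctx.canvas.height);
  for (const [x, y] of landmarks) {
    ctx.beginPath();
    ctx.arc(x, y, 4, 0, 2 * Math.PI);
    ctx.fill();
  }
  ctx.font = '32px sans-serif';
  ctx.fillText(letter, 10, 40);
}
```

The confidence threshold (7.5 out of 10 in this sketch) is the knob for trading accuracy against how often a frame produces a match, which is where the accuracy-over-latency choice shows up in practice.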
Why the browser is the best first bet
I deliberately chose to build a web app and not a native mobile app. My rationale is simple:
- Low Barrier to Entry: A web app is instantly accessible. There is nothing to download. This reduces friction and speeds up the feedback loop.
- Zero Hosting Costs: Static file hosting on platforms like Cloudflare or Netlify is practically free.
- Maximum Accessibility: It works on any device with a modern browser, from a laptop to a smartphone.
I ended up choosing this tech stack for speed and efficiency:
- React.js for modular, extensible components
- TensorFlow.js for client-side ML processing
- Cloudflare for fast, cheap, and scalable static hosting
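To show how these pieces hang together, here is a rough React sketch. It assumes the react-webcam package for the camera feed (the actual "third-party react module" I used may differ) and reuses the hypothetical detectLetter/drawOverlay helpers from the earlier sketch.

```tsx
// Rough wiring sketch: React renders the webcam feed with a canvas overlay,
// TensorFlow.js does the per-frame work client-side, and the host only has
// to serve static files. `./detect` refers to the helpers sketched earlier.
import { useEffect, useRef, useState } from 'react';
import Webcam from 'react-webcam';
import * as handpose from '@tensorflow-models/handpose';
import { detectLetter, drawOverlay } from './detect';

export function Flashcard() {
  const webcamRef = useRef<Webcam>(null);
  const canvasRef = useRef<HTMLCanvasElement>(null);
  const [letter, setLetter] = useState<string | null>(null);

  useEffect(() => {
    let cancelled = false;
    handpose.load().then((model) => {
      const tick = async () => {
        if (cancelled) return;
        const video = webcamRef.current?.video;
        const ctx = canvasRef.current?.getContext('2d');
        if (video && video.readyState === 4 && ctx) {
          const hit = await detectLetter(model, video);
          if (hit) {
            setLetter(hit.letter);
            drawOverlay(ctx, hit.landmarks, hit.letter);
          }
        }
        requestAnimationFrame(tick); // simple detection loop, paced by model latency
      };
      tick();
    });
    return () => { cancelled = true; };
  }, []);

  return (
    <div style={{ position: 'relative' }}>
      <Webcam ref={webcamRef} width={640} height={480} />
      <canvas
        ref={canvasRef}
        width={640}
        height={480}
        style={{ position: 'absolute', top: 0, left: 0 }}
      />
      {letter && <p>Detected: {letter}</p>}
    </div>
  );
}
```

Everything above runs in the browser; the only server involvement is serving the static bundle, which is why hosting stays effectively free.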
An MVP knows what to leave out
A critical part of building an MVP is not just deciding what to build, but deciding what not to build. I explicitly left some things out:
- No User Sign-ups: The primary goal is to validate the core mechanic. User accounts can come later.
- No Fraud Protection: The app runs client-side and doesn’t store or record video, mitigating privacy risks from the start. For a public-facing service, I’d leverage a provider like Cloudflare for built-in CDN and privacy protection.
Validation
The app, handsign.pages.dev, is now live. It’s rough, but it works. I took the app back to the communities I first interviewed, and the usability testing was invaluable. The app achieved roughly 70% accuracy on the ASL alphabet, but the sessions revealed clear areas for improvement:
- Improve 3D Sign Detection: The current model struggles with signs requiring wrist rotation (like ‘J’ or ‘Z’). The next step would be to explore a 3D model (like TensorFlow 3D), though this presents a challenge for CPU-only processing.
- Refine the User Interface: The UI is functional but rough. A seamless overlay for the hand-tracking graph would improve the experience, especially on mobile, where the current React video library is limited.
What I actually learned
Winning
Honestly, I just wanted to build something that might help people learn ASL. I ended up with a rough app that works about 70% of the time. More importantly, I learned that asking “why this problem” and “why this solution” forces you to make better choices, even when you’re just messing around on weekends.
The stuff that actually mattered:
- Start with empathy, not tech. A real user problem is the only testable foundation.
- Focus on the single most painful problem. Don’t try to solve everything at once.
- Embrace your constraints. They are a feature, not a bug: they force you to be creative.
- Close the loop. Validating with real users is the only way to know if you’ve built something of value.