Cuttlefish

Cuttlefish was a research and development problem first and a product idea second. The opportunity was to make Auslan learning more accessible by taking practice that was only available face to face and making it something that could be explored in the browser, with AI helping bridge the gap between gesture recognition and conversational practice.

Context

According to Deaf Connect, only around 30,000 people in Australia can communicate in Auslan, at varying levels of fluency. Auslan does not have the depth of online learning infrastructure that spoken languages have, and current digital tools struggle to emulate conversational use. That creates a clear access problem for people in remote locations, people with low mobility, and people who are incarcerated. It also creates a retention problem for students who can learn in class but have little opportunity to practise outside it.

Role

I was the Creative Technologist responsible for researching and developing the solution. I identified the opportunity, pitched it for internal buy-in, built the prototype and engine, and shaped the instructional design for an MVP that could be taken to organisations in the sector. I also met with organisation leaders to understand the landscape, test the fit of the concept, and pressure-test the idea against real funding and governance constraints. In parallel, I managed the project end to end: roadmaps, scoping, estimation, reporting, and keeping up momentum.

Hurdles

The core challenge was not simply teaching Auslan online. It was finding a format that could make conversational practice feel credible, accessible, and useful without pretending the web could replace face-to-face learning entirely. The solution also had to work for a broad audience that included children and adults, support users with limited connectivity, and respect the cultural and grammatical complexity of Auslan itself.

Solution

The concept was a web application that could recognise gestures and converse with the user in Auslan through a browser. Users would grant camera access in the same familiar way they would on a video call, which reduced friction and made the interaction easy to understand from the start. AI would interpret complex gestures as text, opening the door to real-time conversation and more interactive educational design. The experience was also designed around accessibility, eager loading, gamification, and positive feedback loops so the learning model could be enjoyable as well as technically credible.
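
The camera-permission step can be illustrated with a minimal sketch. It is not the project's code: the function names, the resolution values, and the failure categories are assumptions made for illustration. The only real browser API it relies on is `navigator.mediaDevices.getUserMedia`, which rejects with an error named `NotAllowedError` when permission is refused and `NotFoundError` when no camera is available.

```typescript
// Hypothetical sketch: every name and value below is an illustrative assumption.

type CameraFailure = "denied" | "no-camera" | "unknown";

interface CameraConstraints {
  video: { facingMode: "user"; width: { ideal: number }; height: { ideal: number } };
  audio: false;
}

// Front-facing video only: the learner signs to the camera, so no microphone is requested.
function cameraConstraints(idealWidth = 640, idealHeight = 480): CameraConstraints {
  return {
    video: { facingMode: "user", width: { ideal: idealWidth }, height: { ideal: idealHeight } },
    audio: false,
  };
}

// Map the error names that getUserMedia rejects with to a small set of UI states.
function describeCameraError(errorName: string): CameraFailure {
  if (errorName === "NotAllowedError") return "denied";
  if (errorName === "NotFoundError") return "no-camera";
  return "unknown";
}

// Minimal shape of the dependency, so the function can be exercised with a stub.
interface MediaDevicesLike {
  getUserMedia(constraints: CameraConstraints): Promise<unknown>;
}

async function requestCamera(
  devices: MediaDevicesLike
): Promise<{ stream: unknown } | { failure: CameraFailure }> {
  try {
    return { stream: await devices.getUserMedia(cameraConstraints()) };
  } catch (err) {
    const name =
      typeof err === "object" && err !== null && "name" in err
        ? String((err as { name: unknown }).name)
        : "";
    return { failure: describeCameraError(name) };
  }
}
```

In a browser, `navigator.mediaDevices` would be the object passed to `requestCamera`. Separating the failure categories lets the interface explain a refused permission differently from a missing camera.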

Process

I began by researching the problem space and defining a solution that could be defended internally and externally. After the concept was pitched and approved, I built the proof of concept to test the technical feasibility of gesture recognition in sequence, including recognition from both hands and the face at the same time. I also built a dictionary-management system so admin users could create and manage gesture data and training inputs more easily. From there, I continued improving accuracy through additional training data and explored how facial-expression recognition could support grammatical markers, intensity, emotion, and other modifiers.
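
One way to picture recognition "in sequence" is as a step that turns noisy per-frame predictions into a clean list of gestures. The sketch below is an assumption about how that could be done, not a description of the engine that was built: the `FramePrediction` shape, the confidence threshold, and the minimum run length are all hypothetical. A gesture is emitted once it has been seen confidently for several consecutive frames, and a gap between gestures allows the same sign to be repeated.

```typescript
// Hypothetical sketch: types, thresholds and names are illustrative assumptions.

interface FramePrediction {
  label: string | null; // null when no gesture was detected in the frame
  confidence: number;   // 0..1
}

// Collapse a stream of per-frame predictions into a sequence of gestures.
// A label is emitted once per run, after it has held for `minRun` confident frames.
function collapseFrames(
  frames: FramePrediction[],
  minConfidence = 0.8,
  minRun = 3
): string[] {
  const sequence: string[] = [];
  let current: string | null = null;
  let run = 0;
  let emitted = false;

  for (const frame of frames) {
    // Low-confidence frames are treated as "no gesture" and break the run.
    const label =
      frame.label !== null && frame.confidence >= minConfidence ? frame.label : null;

    if (label === current) {
      run += 1;
    } else {
      current = label;
      run = 1;
      emitted = false;
    }

    if (current !== null && run >= minRun && !emitted) {
      sequence.push(current);
      emitted = true;
    }
  }
  return sequence;
}
```

With these defaults, a two-frame flicker of a different label is ignored, while two runs of the same label separated by an empty frame are counted as two gestures.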

Key decisions

The first decision was to keep the product browser-based so it could be accessible to anyone with a web connection, rather than requiring a native app. The second was to frame the product around educational value and conversational practice rather than just gesture detection, because the learning experience had to be useful outside the classroom. The third was to treat Deaf grammar, culture, and etiquette as core parts of the system, not optional extras. The fourth was to prove the engine and the instructional model together, because the concept only mattered if the product logic and the learning experience could both hold up.

Outcome

The project reached a working proof of concept capable of recognising an alphabet of gestures in sequence at reasonable speed, including signals from both hands and the face within more complex gestures. The dictionary-management system was also proven, which made the underlying model more practical to maintain. The next step was to prove the conversational interface itself by generating textual responses in Deaf grammar and converting them into animated gestures.
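
To make the dictionary-management idea concrete, here is a small sketch of what an admin-facing gesture dictionary could look like in memory. The real system's data model is not described here, so the `GestureEntry` fields, the `Channel` values, and the method names are all assumptions. The sketch tracks which body parts a gesture depends on and how many training samples have been recorded, so that under-trained entries can be surfaced.

```typescript
// Hypothetical sketch: the data model and method names are illustrative assumptions.

type Channel = "leftHand" | "rightHand" | "face";

interface GestureEntry {
  id: string;
  gloss: string;       // written label shown to admins, e.g. a fingerspelled letter
  channels: Channel[]; // body parts the gesture depends on
  sampleCount: number; // number of recorded training samples
}

class GestureDictionary {
  private entries = new Map<string, GestureEntry>();

  // Create an entry, or update its details while keeping its sample count.
  upsert(entry: Omit<GestureEntry, "sampleCount">): GestureEntry {
    const existing = this.entries.get(entry.id);
    const merged: GestureEntry = { ...entry, sampleCount: existing ? existing.sampleCount : 0 };
    this.entries.set(entry.id, merged);
    return merged;
  }

  // Register that one more training sample was recorded for a gesture.
  recordSample(id: string): void {
    const entry = this.entries.get(id);
    if (!entry) throw new Error(`Unknown gesture: ${id}`);
    entry.sampleCount += 1;
  }

  // List the glosses that still have fewer than `minSamples` samples.
  needingSamples(minSamples: number): string[] {
    return Array.from(this.entries.values())
      .filter((entry) => entry.sampleCount < minSamples)
      .map((entry) => entry.gloss);
  }
}
```

A structure like this would let an admin see at a glance which gestures need more training data before accuracy work continues.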