Intro
As the solo product designer at Canny, I worked closely with Product and Engineering to shape Autopilot. It's an AI-powered system for capturing and organizing customer feedback from the tools teams already use.
Keeping all of that feedback actionable takes real effort. Autopilot reduces that work by sorting and surfacing what matters. Teams stay in control of what gets prioritized while Autopilot handles the busywork.
Context
Think of Canny like Reddit for product feedback. Users post requests, others upvote and comment, and teams use that signal to decide what to build next.
That works well when the volume is manageable. But once usage grows, the backlog becomes a living system: posts to review, duplicates to merge, customers to respond to.
And if teams are collecting feedback in other tools too, Canny starts to feel like one source among many rather than the place where decisions get made.
Feature requests from Canny's users
Problem
That is exactly what was happening. As teams scaled, we noticed that feedback stopped living in Canny.
High-signal requests were showing up across support conversations, sales threads, CRMs, Slack, emails, and app reviews. Each had different context and different owners. Canny was supposed to be the source of truth, but when half the signals never made it there, it became noise in its own users' stack.
Capture became inconsistent. Duplicates multiplied. Important asks got buried in threads nobody had time to comb through. The less teams knew what was in Canny (and elsewhere), the less useful it became.
Feedback, everywhere, all at once
Strategy
Instead of asking our users to bring feedback to Canny, we wanted to make Canny the place feedback naturally ends up.
If we could pull feedback from the tools teams already use and surface it as reviewable insights, they'd spend less time hunting for signals and more time acting on them.
We mapped the workflow end to end and worked backwards from the failure modes: missed signals, duplicates, and conflicting sources.
The concept we landed on was Autopilot. It connects external tools, extracts feature requests, deduplicates them, and returns them to Canny as drafts for review. It handles the repetitive work without taking decisions away from the team.
Source ingestion and routing
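To make that loop concrete, here is a minimal sketch in Python. It is illustrative only and not Canny's implementation: `extract_request` stands in for the model-based extraction step with a naive "request:" marker, `normalize` stands in for real similarity matching, and every name here is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Draft:
    """A candidate feature request awaiting human review."""
    title: str
    sources: list[str] = field(default_factory=list)


def normalize(text: str) -> str:
    # Crude stand-in for similarity matching: lowercase and collapse whitespace.
    return " ".join(text.lower().split())


def extract_request(message: str) -> Optional[str]:
    # Placeholder for model-based extraction (an assumption for this sketch):
    # treat whatever follows "request:" as the ask.
    marker = "request:"
    idx = message.lower().find(marker)
    if idx == -1:
        return None
    return message[idx + len(marker):].strip() or None


def build_drafts(messages: list[tuple[str, str]]) -> list[Draft]:
    """Turn (source, message) pairs into deduplicated drafts for review."""
    drafts: dict[str, Draft] = {}
    for source, message in messages:
        request = extract_request(message)
        if request is None:
            continue  # no feature request found in this message
        key = normalize(request)
        # Duplicates collapse onto the first draft and keep their source.
        draft = drafts.setdefault(key, Draft(title=request))
        draft.sources.append(source)
    return list(drafts.values())
```

Nothing is published automatically in this sketch. The output is a list of drafts, each carrying the sources it came from, which mirrors the idea of returning work for review instead of making decisions for the team.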
Constraints
We began exploring Autopilot in late 2023, when newer AI models made reliable extraction practical. We gave ourselves 6 months to reach public launch, with 4 months to develop an MVP and 2 months to run a closed beta.
The core team was small to keep scope focused: me, a PM, and 4 engineers. I was embedded with engineering throughout the project, joining daily syncs, iterating on shared specs, and reviewing builds as they shipped.

Technical planning with the engineers
We couldn't overbuild the bet or pull engineering focus from the rest of the product. But beyond just scoping down, I had to make sure the design still gave teams enough context to feel in control. Businesses were still figuring out what AI could do for them, and most people didn't trust it yet. So the interface had to do a lot of that trust-building on its own.
The biggest requirement we had to satisfy was accuracy. In discovery calls, our users told us Autopilot had to hit at least 90% accuracy before they would trust it. We benchmarked multiple models and prompt stages on our own feedback backlog, then ran Autopilot on that backlog end to end until results held up.
Cost pressure also made development more challenging. Benchmarking is token-heavy by nature, so we had to be deliberate about where we spent compute. We reduced unnecessary model calls with lightweight pre-filtering, batched and cached where possible, and reserved heavier models for low-confidence cases.
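The pre-filter, cache, and escalation pattern can be sketched roughly as follows in Python. This is an assumption-laden illustration, not the production code: the keyword list, the 0.8 confidence threshold, and the `cheap` and `heavy` model callables are all hypothetical, and batching is left out to keep the sketch short.

```python
from functools import lru_cache
from typing import Callable, Optional, Tuple

# Hypothetical model signature: text in, (extracted request, confidence 0..1) out.
Model = Callable[[str], Tuple[Optional[str], float]]

# Illustrative keywords only; a real pre-filter would be tuned on actual data.
KEYWORDS = ("wish", "would be great", "feature", "please add", "support for")


def looks_like_feedback(text: str) -> bool:
    """Lightweight pre-filter: skip messages with no hint of a request."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def make_extractor(cheap: Model, heavy: Model, threshold: float = 0.8):
    """Build an extractor that only escalates low-confidence cases."""

    @lru_cache(maxsize=4096)  # identical messages are answered from cache
    def extract(text: str) -> Optional[str]:
        if not looks_like_feedback(text):
            return None  # filtered out before any model call
        request, confidence = cheap(text)
        if confidence >= threshold:
            return request
        # Low confidence: spend the extra tokens on the heavier model.
        request, _ = heavy(text)
        return request

    return extract
```

The ordering is the point: the cheapest check runs first, and the heavier model only sees the messages the lighter one was unsure about.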
Solution
We built the smallest loop that could handle real volume. It needed to be fast, predictable, and auditable, something that fits into a daily workflow.
I centered the experience on one surface, the Inbox. Connected sources feed in candidates with quick actions, showing source context, the extracted request, and a suggested create or merge target.
Autopilot inbox
Manual mode works like an approval queue. Automated mode turns the same view into an audit log of actions taken. Both live on the same surface, so teams don't have to switch contexts as they gain trust in the system.
Merge targets and results stay visible so teams can verify consolidation is correct. This mattered especially once automation was enabled, where confidence in outputs needed to be immediate, not something you go looking for.
Filters and source breakdowns let teams separate signal by channel and type without leaving the inbox. Spam is surfaced, not silently handled. Credits appear where value is realized, not buried behind processing.
Knowledge Hub
Accuracy depended on context. Conversations rarely include the rules, terminology, and edge cases behind what teams ship and support.
I added a lightweight way to provide that context through the Knowledge Hub. Teams upload reference material relevant to their product so extractions and replies stay grounded, without turning setup into a project.
Validation
Before launch, we ran a closed beta to validate Autopilot under real volume.
Typeform had the clearest proof point. They handle about 7,500 tickets a month from Zendesk across support, sales, onboarding, and CS. At that volume, manual tagging was never going to scale.
In a side-by-side review of 1,600 tickets, Autopilot processed the set in 53 minutes, surfaced 109 feature requests, and hit 93% accuracy, about 30 points higher than manual review. Deduplication held up too, with a 98.3% acceptance rate.
At that pace, a full 7,500-ticket month is processed in about 4 hours.
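For anyone who wants to check the math, the projection is a straight-line extrapolation from the side-by-side review. The short Python sketch below uses only the figures quoted above and assumes throughput stays constant at higher volume; the function name is just for illustration.

```python
def project_hours(sample_tickets: int, sample_minutes: float, target_tickets: int) -> float:
    """Scale an observed processing time linearly to a larger ticket count."""
    minutes = target_tickets * sample_minutes / sample_tickets
    return minutes / 60


tickets_per_minute = 1_600 / 53
monthly_hours = project_hours(1_600, 53, 7_500)
request_share = 109 / 1_600

print(f"{tickets_per_minute:.1f} tickets per minute")      # 30.2 tickets per minute
print(f"{monthly_hours:.1f} hours for 7,500 tickets")      # 4.1 hours for 7,500 tickets
print(f"{request_share:.1%} of tickets held a request")    # 6.8% of tickets held a request
```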
Impact
Autopilot fundamentally changed how teams used Canny. What had been a backlog they tried to keep updated became a workflow they returned to daily.
- Productivity improved 20-30x over manual review.
- We logged 80% more feature requests after launch, generating over 100k insights for teams using Canny.
- In its first year, Autopilot drove $300k+ in ARR, roughly 10% of Canny's total. For a bootstrapped company, that was meaningful leverage from a single feature.
Beyond the numbers, it also shifted the product direction. Autopilot became the foundation for Canny's next phase of growth. It proved that we could evolve the product beyond what it had been and create a new vertical in the space.

Celebrating Autopilot's launch in Perugia