Unlocking AHA Moments
Table of Contents
- Introduction: Navigating the Maze of Modern Web Platforms
- Our Approach: Simplifying the User Journey
- Technical Underpinnings: GPT’s Role
- The Competitive Landscape
- Why This Matters?
Introduction: Navigating the Maze of Modern Web Platforms
Have you ever felt overwhelmed while using a sophisticated web platform, like Google Cloud, because of its sheer number of options? This common challenge prevents many users from fully utilizing these powerful tools.

A much better experience would be this: a chat assistant asks you about your goal and then guides you through the process of achieving it or, even better, does it for you!

In my recent pitch at an AI Tinkerers event, I presented a solution to this problem that we developed at pyne. It harnesses AI to simplify and personalize user interaction with complex websites.
Our Approach: Simplifying the User Journey

Achieving this requires significant processing power to understand and navigate a website's structure, a task that is often too complex and costly. Our solution? We focus on identifying and recording the most common user flows and offer detailed, AI-driven demonstrations to end customers. A demonstration can take the form of a video, a chatbot, an interactive guide, or even an automation that performs the clicks on the website.
The Process
- Capturing User Interaction: We developed a browser extension to record user clicks and voice. I would like to emphasize that the output is not a traditional video recording; it is an events file that captures the DOM of the website, together with an audio recording.
- Analyzing Data: The captured data, including an events.json and an audio.webm file, is processed to extract the relevant information from the user actions and website structure data.
- Creating the Tour: The refined data is then used to construct a user tour on our platform, which can be edited for precision.
- Presenting the Guide: The final step involves integrating the guide into the user’s web experience, either through a Chrome extension or a future SDK.
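
As a rough illustration of the "Analyzing Data" step, here is a minimal sketch of filtering a recording down to its clicks. The event schema used here (the `type`, `timestamp`, `selector`, and `text` fields) is an assumption made for the example; the actual layout of events.json is not described in this post, and `extract_clicks` is a hypothetical helper name.

```python
import json

# Hypothetical sample of an events.json file. The real schema is not
# described in the post; these field names are assumptions.
SAMPLE_EVENTS = json.loads("""
[
  {"type": "mousemove", "timestamp": 1000, "x": 10, "y": 20},
  {"type": "click", "timestamp": 4200, "selector": "button.submit", "text": "Save"},
  {"type": "scroll", "timestamp": 1800, "y": 300},
  {"type": "click", "timestamp": 1500, "selector": "#create-project", "text": "Create project"}
]
""")


def extract_clicks(events):
    """Keep only click events that carry a selector, ordered by timestamp."""
    clicks = [
        {
            "timestamp": event["timestamp"],
            "selector": event["selector"],
            "text": event.get("text", ""),
        }
        for event in events
        if event.get("type") == "click" and "selector" in event and "timestamp" in event
    ]
    return sorted(clicks, key=lambda click: click["timestamp"])


clicks = extract_clicks(SAMPLE_EVENTS)
for click in clicks:
    print(click["timestamp"], click["selector"], click["text"])
```

Everything that is not a click (mouse moves, scrolls) is dropped here for simplicity. A real pipeline would likely keep more context, such as the DOM snapshot around each click.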

The Output
- The tour highlights specific webpage elements accompanied by text explanations, enhancing user understanding and engagement.
- Our platform supports various formats, including videos, chats, and automated tours.
- Leveraging a rich database of actions, we can automate tasks based on user input, such as setting up a Notion integration.

Technical Underpinnings: GPT’s Role

Our system utilizes GPT capabilities in several key areas:
- Transcription of Voice to Text: Converting user voice input into actionable text.
- Aligning Clicks with Transcription: We use timestamps and the DOM structure to synchronize user actions with their verbal instructions.
- Selector Selection: Choosing reliable CSS selectors or XPaths is a challenge. To address it, we are developing a prototype for semantic analysis of selectors.
- Generating Interactive Sections: Creating user-friendly tooltips and interactive elements.
- Storing and Retrieving Click Information: A vector store enables efficient retrieval of specific user actions and related information.
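
To make the click and transcription alignment more concrete, here is a minimal sketch that uses timestamps only. It assumes that clicks and transcript segments share one clock in milliseconds and that each segment has a `start` time and a `text`. These field names and the `align_clicks` helper are hypothetical, and the sketch leaves out the DOM structure that our system also takes into account.

```python
from bisect import bisect_right


def align_clicks(clicks, segments):
    """Pair each click with the transcript segment that was being spoken.

    clicks:   list of dicts with "timestamp" (ms) and "selector"
    segments: list of dicts with "start" (ms) and "text", sorted by "start"

    A click is matched to the latest segment that started at or before it.
    A click that happens before any speech gets an empty explanation.
    """
    starts = [segment["start"] for segment in segments]
    steps = []
    for click in clicks:
        index = bisect_right(starts, click["timestamp"]) - 1
        text = segments[index]["text"] if index >= 0 else ""
        steps.append({"selector": click["selector"], "text": text})
    return steps


# Hypothetical sample data for illustration only.
sample_clicks = [
    {"timestamp": 500, "selector": "#menu"},
    {"timestamp": 1500, "selector": "#create-project"},
    {"timestamp": 4200, "selector": "button.submit"},
]
sample_segments = [
    {"start": 1000, "text": "First, create a new project."},
    {"start": 3800, "text": "Then save your changes."},
]

steps = align_clicks(sample_clicks, sample_segments)
for step in steps:
    print(step["selector"], "->", step["text"] or "(no narration)")
```

Matching on the segment start time is a simple heuristic. People often describe a click slightly before or after they perform it, so a real implementation would probably need a tolerance window or a manual editing pass.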
The Competitive Landscape
- Adept: With over $200M in funding, Adept is training foundational models for visual reasoning across web pages.
- Induced AI: Backed by Sam Altman, they focus on Robotic Process Automation.
- Open-Source Contributions: Pioneers like Andrej Karpathy and projects like natbot and webLM.
Why This Matters?
Our end-to-end solution begins with a simple browser extension and culminates in an immersive, AI-crafted product tour. This matters to SaaS companies because, if they are honest, most of their users have no clue about their product’s capabilities. With our AI-first solution to this problem, we enable them to hyper-personalise the product experience for each user, democratizing access to complex web platforms and making technology more approachable and user-friendly.