Still opening a browser tab, typing a question, and reading the reply? This video shows how AI goes from answering questions to actually doing things, one step at a time.
What the Video Is About: A Six-Stage Arc from Chat to Agent
The video uses "getting increasingly hooked" as its central metaphor, treating each AI tool as a different level of intensity. I have organized its main thread into six stages, each with a direct link back to the original video. The following is a paraphrased summary; watch the source video for the full experience.
Stage 1: Web-based chat, easy and low-stakes
It starts the way most people start: everyone around you is playing with it. A bit of Doubao for casual chat and image generation. Then Qianwen, stable enough for Chinese text, something like a reliable snack. At this stage AI feels light, harmless, and nothing like an addiction. (Video 00:07)
Stage 2: Starting to lose control
Then comes DeepSeek. It feels different: logical, analytical, and cheap. So you open one conversation, then another, then another. You finish a question and immediately want more, opening new threads the way you crack sunflower seeds, unable to stop. (Video 00:36)
Stage 3: The tool shift. AI starts actually doing things.
GPT changes the feel. You share an idea and it takes action: you say you want a tool, it breaks down the requirements; you say you want a script, it gives you a plan; you say you want a page, it thinks through the structure, the code, and the interactions together. There is a moment of self-deprecating humor in the video: the interviewee used to comfort herself that doing things by hand had value, that the process mattered, and then realized the process could be outsourced too. (Video 01:06)
Stage 4: Multi-model collaboration. You have hired a team.
Then Gemini, with its enormous context window that can swallow documents, web pages, videos, and images all at once, producing output with an aesthetic sensibility that holds up to scrutiny. (Video 02:15) And then Claude and Claude Code, which had been held at arm's length until now. The first use is disorienting. It is not just answering questions; it is understanding what you are trying to accomplish. When requirements are vague, it asks clarifying questions. When the file structure is a mess, it helps untangle it. Architecture, logic, code, bug fixes, optimization suggestions, all in one place. The video's description: you get the illusion of having hired a team that never sleeps, and you realize you stopped thinking of it as a single tool long ago. (Video 02:52)
Stage 5: The behavioral flip. Demand starts generating itself.
This is the sharpest observation in the video. Before: you had a problem, so you opened AI. Now: you open AI, and that generates the problems. (Video 04:07) You only wanted to build a small webpage. Once it was done, you wanted to add login. Once login was done, you wanted an admin panel. Once you had the admin panel, you wanted a database. Once you had the database, you wanted to turn it into a SaaS. Once you had SaaS, you wanted to add payments. Once payments were in, you wanted to send it to real users. The video's observation is exact: AI does not stop you. It just says, "Sure, let's take it one step at a time."
Stage 6: Four models running in parallel, and then Codex breaks everything
Eventually four model threads run at once: DeepSeek for the cheap bulk work, GPT for the main push, Gemini for ingesting data and aesthetic review, Claude for architecture and coding. Then Codex enters and everything unravels. Every new idea feels like it should become a project. Stumbling across a GitHub repo at midnight triggers the urge to replicate it. AI canvases, auto-editing tools, MCP implementations, all of them. The computer fills up with unfinished scenes from a dozen half-started productions. (Video 04:53)
