I agree, and sadly I wouldn't hold out hope for actual meaningful changes (granted, the last time I had Windows was Win 7).
My reasoning comes from bitter experience. I've seen too many of these honest talks/commitments; it's always the same pattern when a product or company starts to decline. Suddenly somebody with a technical background shows up, talks about past mistakes and what needs fixing, and sometimes even holds a discussion, which is usually very reasonable. But as time goes on there are only cosmetic changes, with excuses like lack of resources, the market winds changed this time, it's too hard to make changes due to politics, and so on.
Something that comes to mind for me is the old Bill Gates Trustworthy Computing memo [0], from the era when early Windows XP was getting flak for poor security. That was supposedly the turning point where they started the overhauls that led to Service Pack 2, added a security focus to other products, and decided they couldn't sneak easter-egg flight simulators into Excel any more because it just added opportunities for flaws.
What stands out to me is that the organization needs to accept that change is needed and 'walk the walk', and also that those efforts take time. I've no idea what's in motion at MS, but I wonder how quickly they can turn the ship: how much momentum is in their current direction, and how much force is behind the turn. Moving the taskbar addresses one loud, persistent talking point, but it's one among many. What's the timeline (even though Windows version timing seems to be 'whenever they need branding')? Win12? Win13?
The only thing I'd add is that Pavan didn't just post the infamous tweet that caused the backlash; he also ridiculed the people pushing back (since deleted). Also, Satya was still pushing the same "agentic OS" narrative as recently as last week.
So, I hope for the best, but I don't plan on taking them at their word.
Everyone at MSFT who is senior is a lying piece of shit these days. I remember on here Satya being treated like the second coming of Jesus due to his promises. Any comments against him were downvoted.
Absolutely nothing wrong with an "agentic OS"; agentic UX is the future of personal computing. The ideal is that something intelligent understands what you want to do and gets it done.
Unless you really think we've reached the pinnacle of user interfaces with repetitive clicking around and menus.
The problem is with shoving AI down users' throats. Make it an option, not the only option.
> The ideal is that something intelligent understands what you want to do and gets it done.
Maybe? For a couple of decades, we believed that computers you can talk to are the future of computing. Every sci-fi show worth a dime perpetuated that trope. And yet, even though the technology is here, we still usually prefer to read and type.
We might find out the same about some of the everyday uses of agentic tech: it may be less work to do something yourself than to express your desires to an agent perfectly well. For example, agentic shopping is a use case some companies are focusing on, but I can't imagine it being easier to describe my sock taste preferences to an agent than to click around for 5 minutes and find the stripe pattern I like.
And that's if we ignore that agents today are basically chaos monkeys that sometimes do what you want, sometimes rm -rf /, and sometimes spend all your money on a cryptocurrency scam. So for the foreseeable future, I most certainly don't want my OS to be "agentic". I want it to be deterministic until you figure out the chaos monkey stuff.
I think your last paragraph is the real issue that will forever crush improvements over clicking on stuff. Once you get to "buy me socks" you're just entering a different advertising domain. We already see it with very simple things like getting Siri to play a song: with two songs of the same name, the more popular one wins. Apply that simple logic to everything, put a pay-to-play model on it, and there's your "agentic" OS of the future.
I beg to differ that "the technology is here". Everyone I see who uses voice commands has to speak in a very contrived manner so that the computer can understand them properly. Computer vision systems still run into all sorts of weird edge cases.
We've progressed an impressive amount since, say, the nineties, when computers (and the internet) started to spread to the general consumer market, but the last 10% or so of the way is what would really be the game changer. And if we believe Pareto, of course that is gonna be 90% of the work. We've barely scratched the surface.
> it may be less work to do something than to express your desires to an agent perfectly well
As I use AI to write code more and more, I find myself just implementing things myself for exactly this reason. By the time I have explained what I want in precise detail, it's often faster to have just made the change myself.
Without enough detail, SOTA models can often still get something working, but it's usually not the desired approach, and it causes problems later.
yeah, for me it's the same as with other people: the number of times I think "it would be easier for me to just show you" is maybe 30% of interactions with agents currently.
Perplexity keeps trying to get me to use "computer" and for the life of me I can't think of anything I'd actually do with it.
It all depends on where the AI is running. The problem with the idea is that the majority of Windows boxes it would be running on don't have the bare-metal hardware to support local models, so it would run in the cloud, with all of the privacy/security issues that come with that. It would be neat, given MSFT's footprint, for them to develop small models running locally, with user transparency around actions, but that doesn't align with MSFT's core objectives.
AFAIK the existing Copilot features always use the NPU and do not fall back to the cloud. Given that Windows 12 will require an NPU, I don't see why it would fall back either.
This is true only for Copilot+ features. The issue MSFT faces, especially as it pushes Copilot EVERYWHERE, is that the majority of the hardware running Windows does not have, and will not have, the NPU required for 12, nor do consumers have the purchasing power to upgrade to hardware that does. This is a reality MSFT just does not seem to want to deal with as they push the technology onto consumers: the push isn't based on the reality of the install base they're dealing with, but on trying to justify their strategic investment in AI in the B2C space without doing the product-market-fit work to back it up.
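To make the tradeoff being argued about here concrete, a minimal sketch of capability-gated routing; every name in it (has_npu, the run_* functions, the fallback flag) is hypothetical, not any real Windows or Copilot API:

    # Hypothetical sketch only: none of these functions correspond to a
    # real Windows/Copilot API; they just name the routing decision.
    def has_npu() -> bool:
        """Placeholder for an on-device hardware capability probe."""
        return False  # the majority of today's install base, per the above

    def run_local_small_model(prompt: str) -> str:
        """On-device inference: private, but needs Copilot+ class hardware."""
        return f"[local NPU] {prompt}"

    def run_cloud_model(prompt: str) -> str:
        """Runs anywhere, but ships the prompt (and your data) off-device."""
        return f"[cloud] {prompt}"

    def answer(prompt: str, allow_cloud_fallback: bool) -> str:
        if has_npu():
            return run_local_small_model(prompt)
        if allow_cloud_fallback:
            return run_cloud_model(prompt)  # the privacy/security tradeoff
        raise RuntimeError("no NPU and cloud fallback disabled")

Without the NPU branch, the whole install base lands on the second or third path, which is exactly the dilemma described above.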
- "summarize the discussions on hacker news of last week based on what I would find interesting".
- "Plan my summer vacation with my family, suggest different options"
- "Look at my household budget and find ways to be more frugal."
There are thousands of things I can think of where an agentic OS would work better than the current screen-and-keyboard paradigm. All of these I could do now with Claude or Codex, and some of them I already do with those tools.
>What specifically does an agentic OS UX look like beyond giving claude access to local files and a browser?
Providing the structure of a unified framework: APIs, safeguards, routing to the appropriate model or pipeline, and controlled access to devices and data. The capability is already there. What's missing is a sane permission system that operates at the level of intent; having used OpenClaw, IMO that's the gap. It's a fun experience, but in its current state I would not trust it to autonomously run any meaningful part of my life.
UX-wise, chat is kind of a crutch. It's slow and inherently limiting. I imagine something closer to a natural, ongoing conversation paired with an execution layer: some sort of approval or review dashboard where planned actions sit ready for approval, or get returned for refinement, before they happen. Probably with a conservative moderator agent in the loop that flags things based on preferences and hard-coded policies.
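For what it's worth, a minimal sketch of what "permissions at the level of intent" plus a review queue could look like; Intent, Policy, Verdict and the thresholds are all invented for illustration, not part of any existing framework:

    # Illustrative only: Intent, Policy and Verdict are made-up names,
    # not any real agent framework's API.
    from dataclasses import dataclass, field
    from enum import Enum

    class Verdict(Enum):
        ALLOW = "allow"          # matches a standing user preference
        NEEDS_REVIEW = "review"  # parked in the approval dashboard
        DENY = "deny"            # violates a hard-coded policy

    @dataclass
    class Intent:
        action: str              # e.g. "purchase", "send_email"
        target: str              # what the action applies to
        cost_usd: float = 0.0    # estimated irreversible cost, if any

    @dataclass
    class Policy:
        """Hard-coded rules plus user preferences, checked before execution."""
        denied_actions: set = field(default_factory=lambda: {"delete_files"})
        auto_approve_under_usd: float = 5.0

        def judge(self, intent: Intent) -> Verdict:
            if intent.action in self.denied_actions:
                return Verdict.DENY
            if intent.cost_usd <= self.auto_approve_under_usd:
                return Verdict.ALLOW
            return Verdict.NEEDS_REVIEW  # anything costly waits for a human

    def route(intent: Intent, policy: Policy, review_queue: list) -> Verdict:
        verdict = policy.judge(intent)
        if verdict is Verdict.NEEDS_REVIEW:
            review_queue.append(intent)  # surfaces in the review dashboard
        return verdict

    if __name__ == "__main__":
        queue, policy = [], Policy()
        print(route(Intent("purchase", "socks", cost_usd=3.99), policy, queue))    # ALLOW
        print(route(Intent("purchase", "laptop", cost_usd=1200.0), policy, queue)) # NEEDS_REVIEW

The conservative moderator agent from the paragraph above would sit in front of judge(), translating each planned action into an Intent and erring toward review when unsure.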
Calling it an OS isn’t accurate, I agree. But that's how people will perceive it. Most people already think of the application layer on Android as "the OS," not the kernel or drivers. This will be the first-class interface on your device, so that’s what it gets called. It doesn’t mean browsers or dedicated applications go away.
Three years ago I would not have thought the IDE would stop being the application I spend most of my time in. Now it’s mostly a passive code viewer and Git browser.
Compare that to everyday workflows. Researching anything still feels incredibly antiquated. Buying a phone, planning a vacation, comparing options means opening dozens of tabs, copy-pasting specs or prices into spreadsheets, reading through fine print, dealing with low-quality or honestly untrustworthy reviews, checking distances manually on maps. It’s boring and tedious work.
Meanwhile, in professional life, these systems already behave like a team of secretaries: always available, reasonably competent, and scalable. Not perfect, but easily good enough to offload a huge amount of cognitive overhead.
What I'm trying to say is that the long path is "get shit done". No work is completed by reading AI summaries of informative content. It's just productivity porn.
Even theoretical AI still has the "other minds" problem from economics.
Communicating and predicting desires, preferences, thoughts, feelings from one mind to another is difficult.
Fundamentally the easiest way of getting what you want is to be able to do it yourself.
Introduce an agent, and now you get the same utility issues as trying to guess what gifts to buy someone for their birthday. Sure, every now and then you get the marketer's "surprise and delight", but the main experience is relatively middling, often confusing, and, if you have any skill or knowledge in the area or the ability to do it yourself, ultimately frustrating.
We've already been through this: a decade ago, people thought voice was the future of the computer.
When that didn't work at all, we thought augmented reality was the future of the computer, which also didn't pan out.
You need a screen to verify what you're doing (try shopping on Amazon without one), which means you also need a UI around it, which means voice (and, by extension, agents, which also function by conversation) is slower and dumber than the UI, every time.
Meanwhile, I have yet to see any brand excited to be integrated with ChatGPT or Claude. Unlike a consumer, a purely "reasoning-based" agent is most likely to ignore everything aesthetic and pick the bottom-of-the-barrel cheapest option in any category. How do you convince an AI to show your specific product to a customer? You don't.
"Agentic typewriters are the future of typewriting. The idea is that something intelligent understands what you want to type and types it for you. Unless you really think we've reached the pinnacle of typewriter interfaces with repetitive key taps and carriage returns."
See how that sounds a bit silly? It's because it presents a false dichotomy: that our choice is between the current state of interfaces and an agentic system that strips away your autonomy and does everything for you.
We’ve had computing technology that clearly understands what the user wants to do. It’s called a command line interface. No guessing, no recommendations, no dark patterns, no bullshit.
The author of this commitment is the same person (Pavan Davuluri) who is spearheading the move of Windows toward an agentic OS: https://www.windowscentral.com/microsoft/windows-11/windows-...