Apple needs on-device AI to do chores for me with the apps I have installed. Apple has everything it needs:
* Apps are already logged in, so no extra friction to grant access.
* Apps mostly use Apple-developed UI frameworks, so Apple could turn them into AI-readable representations, instead of raw pixels. In the same way a browser can give the AI the accessibility DOM, Apple could give AIs an easier representation to read and manipulate.
* iPhones already have specialized hardware for AI acceleration.
I want to be able to tell my phone to a) summarize my finances across all the apps I have, b) give me a list of new articles on a certain topic from my magazine/news apps, and c) combine internet search with on-device files to generate personal reports.
All this is possible, but Apple doesn't care to do this. The path not taken is invisible, and no one will criticize them for squandering this opportunity. That's a more subtle drawback with only having two phone operating systems.
> iPhones already have specialized hardware for AI acceleration.
This really is the problem. Why do I spend hundreds of dollars more for specialized hardware that’s better than last year’s specialized hardware if all the AI features are going to be an API call to ChatGPT? I am pretty sure I don’t need all of that hardware to watch YouTube videos or scroll Instagram/web, which is what 95% of users do.
> Apple needs on-device AI to do chores for me with the apps I have installed
Never mind that. iOS just needs to reliably be able to play the song I’m telling it to without complaining “sorry, something went wrong with the connection…”
This is all possible, but an absolutely terrible idea from a security point of view, while prompt injection attacks are still a thing, and there's little evidence they will stop being a thing soon.
I think on-device AI will show up more front and center but in a few more years.
A big issue to solve is battery life. Right now there's already a lot that goes on at night while the user sleeps with their phone plugged in. This helps to preserve battery life because you can run intensive tasks while hooked up to a power source.
If apps are doing a lot of AI stuff in the course of regular interaction, that could drain the battery fairly quickly.
Amazingly, I think the memory footprint of the phones will also need to get quite a bit larger to really support the big use cases and workflows. (I do feel somewhat crazy that it is already possible to purchase an iPhone with 1TB of storage and 8GB of RAM).
2TB microSDXC cards have been available for a year or so, and 1TB cards have been available for several years and are even quite affordable. They work in many Android phones, including my cheap Motorola. So it's Apple's sky-high premiums that have made their 1TB phones surprising.
I agree completely, it's really unfortunate how AI on Apple devices has been going. The message summarization is borderline useless and widely mocked, and meanwhile their giant billboard ads for it are largely stupid and uncompelling. Let me choose to give it access to my data if I want to do really useful stuff with on-device processing. They've been leaning into the privacy thing, so do the stuff that would be creepy if it left my device: generate push notification reminders for stuff I forgot to put in the calendar, or track my location and tell me I'm going to the wrong airport. Suggest birthday gifts for my friends and family, idk.
Edit: And add strong controls to limit what it can and cannot access, especially for the creepy stuff.
Apple is generally anti market hype. It is a smart PR move to avoid mentioning AI after the Apple Intelligence fiasco, their researchers leaving, and the bubble sentiment at the moment.
> In the same way a browser can give the AI the accessibility DOM, Apple could give AIs an easier representation to read and manipulate.
Apps already have such an accessibility tree; it's used for VoiceOver and you can use it to write UI unit tests. (If you haven't tested your own app with VoiceOver, you should.)
They have a super fast and slick file storage app. Some of the features that are natural additions to that feature set work quite well, like document scanning. But so much of what Dropbox does seems like they can't stay put and be happy with their core offering. Of course, they have to do this to increase revenue, for fear of becoming a mere commodity. It's tough.
Random sampling over time is substantially as effective as having someone enforce the law 100% of the time. It's something like how randomized algorithms can be faster than their purely-deterministic counterparts, or how sampling a population is quite effective at finding population statistics.
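To illustrate the sampling half of that analogy, here is a minimal sketch. The population (100,000 drivers, 30% speeding) and the 2% sample size are made-up numbers, and this only shows that a small random sample estimates a rate well; it says nothing about deterrence itself.

```python
import random

def estimate_rate(population, sample_size, rng):
    """Estimate the share of True values by inspecting only a random sample."""
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 drivers, 30% of them speeding.
population = [i < 30_000 for i in range(100_000)]
true_rate = sum(population) / len(population)

# Check only 2,000 drivers (2% of the population) chosen at random.
estimate = estimate_rate(population, 2_000, rng)
print(f"true rate: {true_rate:.3f}, estimate from 2% sample: {estimate:.3f}")
```

The estimate should land within a couple of percentage points of the true rate, which is the sense in which checking a random few can stand in for checking everyone.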
It feels less fair though. When everyone is driving x mph over the limit but only you get pulled over, it sucks. So I agree for efficiency of enforcement, but I'd rather see 100% enforcement (automated if possible), with more warnings and lower penalties.
That's a pretty extreme example, maybe the idea doesn't hold as much there. But yeah, if 99% of murders weren't prosecuted, the 1% who get charged might feel like they were singled out (and maybe they were, because of some bias or discrimination). Again, 100% enforcement is better.
It doesn't just "feel" less fair, it often is -- because it's not truly random, it's selective enforcement, which leads to things like "driving while black".
Unpopular opinion, but I actually like traffic enforcement cameras. They don't know what race you are, and they never end up escalating to using lethal force.
The problem with 100% enforcement is that it doesn't allow law enforcement any discretion, and then you end up having to actually officially change the speed limit, which would probably never happen.
Definitely true in practice, but I don't think we want discretion. What I mean though is as a deterrent, you can either have a "fair" fine that's enforced 100% of the time, or 2x the "fair" amount with 50% enforcement, etc. When it's 100x the "fair" amount with 1% enforcement, and you see everyone else not being enforced, it feels unfair.
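The arithmetic behind that trade-off can be sketched in a few lines. The $100 "fair" fine is a hypothetical number picked for illustration; the three schemes are the ones named above (1x at 100%, 2x at 50%, 100x at 1%).

```python
def expected_fine(fine: float, enforcement_rate: float) -> float:
    """Average cost per violation: the fine times the chance of being caught."""
    return fine * enforcement_rate

FAIR_FINE = 100.0  # hypothetical "fair" amount, not from any real statute

# (multiplier on the fair fine, share of violations actually ticketed)
schemes = [(1, 1.0), (2, 0.5), (100, 0.01)]

for multiplier, rate in schemes:
    fine = FAIR_FINE * multiplier
    print(f"fine={fine:>8.2f}  rate={rate:>5.0%}  expected={expected_fine(fine, rate):.2f}")
```

By construction the expected cost per violation is the same in all three schemes. What changes is the variance: under the last one, almost nobody pays anything and the unlucky 1% pay 100x, which is where the feeling of unfairness comes from.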
Traffic rules do require some discretion though. For example, if you don’t allow crossing a double yellow line but a car is broken down blocking the lane, does that mean the road is now effectively unusable until that car is towed? Lots of examples.
But I’m with you on more enforcement. I’m totally fine with automated traffic cameras and it was working great when I was in China - suddenly seemingly overnight everyone stopped speeding on the highways when I was in Shanghai, as your chances of getting a ticket were super high.
All this points to "personality" being a big -- and sticky -- selling point for consumer-facing chat bots. People really did like the chatty, emoji-filled persona of the previous ChatGPT models. So OpenAI was ~forced to adjust GPT-5 to be closer to that style.
It raises a funny "innovator's dilemma" that might happen: an incumbent has to serve chatty consumers, and therefore gets little technical/professional training data, while a more sober workplace chatbot provider is able to advance past the incumbent because it has better training data. Or maybe in a more subtle way, chatbot personas give you access to varying market segments, and varying data flywheels.
I'm sure we're already there with musicians as well. If you take Taylor Swift's annual revenue and multiply it by 5-10, you'll easily cross a billion. But she does require tons of staff for the stadium shows that provide the bulk of the revenue, etc. Not literally one person, but it's a company with a bus factor of one.
And I'm willing to bet all that staff is contracted. The number of her personal employees (people she pays a regular fixed salary to) is probably in the single digits, or possibly even 0.
Just complaining that the world is bad is a good way to waste your energy and end up being a cynic.
So why isn't all information organized in structured, open formats? Because there's not enough of an incentive to label/structure your documents/data that way. That's if you even want to open your data to the public - paywalls fund business models.
There have been some smaller successes with semantic web, however. While a recipe site might not want to make it easy for everyone to scrape their recipes, people do want Twitter to generate a useful link preview from their sites' metadata. They do that with special tags Twitter recognizes, and other sites can use as well.
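As a rough sketch of what reading that preview metadata looks like, the snippet below pulls `twitter:*` and Open Graph `og:*` meta tags out of a page using only Python's standard library. The sample page, its URL, and the `CardMetaParser` name are made up for illustration; a real crawler would fetch the HTML over the network and handle many more edge cases.

```python
from html.parser import HTMLParser

class CardMetaParser(HTMLParser):
    """Collects <meta> tags whose name/property starts with 'twitter:' or 'og:'."""

    def __init__(self):
        super().__init__()
        self.cards = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Twitter Cards use name="...", Open Graph uses property="..."
        key = attr.get("name") or attr.get("property") or ""
        if key.startswith(("twitter:", "og:")) and attr.get("content") is not None:
            self.cards[key] = attr["content"]

# Hypothetical recipe page, invented for this example.
SAMPLE_PAGE = """
<html><head>
  <title>Weeknight Lentil Soup</title>
  <meta name="twitter:card" content="summary_large_image">
  <meta name="twitter:title" content="Weeknight Lentil Soup">
  <meta property="og:image" content="https://example.com/soup.jpg">
  <meta name="viewport" content="width=device-width">
</head><body>...</body></html>
"""

parser = CardMetaParser()
parser.feed(SAMPLE_PAGE)
print(parser.cards)
```

Note how little the site has to expose: a title, an image, a card type. The recipe itself stays unstructured.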
The good news is that LLMs can generate structured data from unstructured documents. It's not perfect, but has two advantages: it's cheaper than humans doing it manually, and you don't have to ask the author to do anything. The structuring can happen on the read side, not the write side - that's powerful. This means we could generate large corpuses of open data from previously-inaccessible opaque documents.
This massive conversion of unstructured to structured data has already been happening in private, with efforts like Google's internal Knowledge Graph. That project has probably seen billions in cumulative investment over the years.
What we need is for open data orgs like Wikipedia to pick up this mantle. They already have Wikidata, whose facts you can query with a graph querying language. The flag example in the article could be decomposed into motifs by an LLM and added to the flag's entry. And then you could use SPARQL to do the structured query. (And that structured query can be generated from LLMs, too!)
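Here is a hedged sketch of what such a query might look like, built as a string in Python. `wdt:P31` ("instance of") and `wdt:P180` ("depicts") are real Wikidata properties, but modeling flag motifs as "depicts" statements is my assumption, the `flags_depicting` helper is hypothetical, and the item IDs passed in below are placeholders rather than real lookups.

```python
def flags_depicting(flag_class_qid: str, motif_qid: str, limit: int = 20) -> str:
    """Build a SPARQL query for items of a class that depict a given motif.

    wdt:P31 is Wikidata's "instance of" and wdt:P180 is "depicts". Whether flag
    entries actually carry motif-level P180 statements is an assumption here.
    """
    for qid in (flag_class_qid, motif_qid):
        # Basic sanity check so arbitrary text is not spliced into the query.
        if not (qid.startswith("Q") and qid[1:].isdigit()):
            raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return f"""
SELECT ?flag ?flagLabel WHERE {{
  ?flag wdt:P31 wd:{flag_class_qid} ;
        wdt:P180 wd:{motif_qid} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}
""".strip()

# Placeholder IDs: substitute the real item IDs for "flag" and the motif you want.
print(flags_depicting("Q0", "Q00"))
```

The resulting string is the kind of thing you would paste into the Wikidata Query Service. It only returns useful rows once someone (or some LLM pipeline) has actually added the motif statements to the flag entries.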
Many companies face model regressions on actively used workflows. Microsoft is the cloud provider who won’t force you to upgrade to new models. This has driven enterprises facing model regressions to Microsoft, not just for workflows facing this problem, but also new workflows just to be safe and not have to migrate clouds if there is a regression.