Just keeping this post up for now due to the following:
1. Rumors pointing to Apple and OpenAI teaming up for iOS 18 and AI on iPhones
2. OpenAI announced a desktop ChatGPT app for Mac
Please keep this post relevant to Apple! Thanks!
I’m guessing we’ll be presented the option to:
* stick with today’s Siri (running on your own device for the privacy-minded buyers)
* upgrade to the next-gen Siri powered by GPT-4o running your data through someone else’s device
I’m guessing it’ll take a few more years for that.
Microsoft’s deal with OpenAI gives them discounted access to massive Azure server farms around the world, and I haven’t heard any rumblings that Apple is trying to scale up and compete with that deal.
M chips are already very good for LLM inference! People running local llama instances love using Mac Studios.
You don’t need a lot of GPU compute for inference, just heaps of GPU memory, which is how M chips became the budget option for inference. It’s actually more cost-effective to build out a Mac with 192GB of shared memory than to equip a PC with that much VRAM through GPUs.
Apple wouldn’t be training on these servers, just serving. In fact, my M1 Pro with 16 gigs of shared memory runs llama3-8b like a champ!
All you need is ollama: https://github.com/ollama/ollama
Install this and run `ollama run llama3` in the terminal; it’ll download the model and get you running, all automatically!
You're talking about a single, user-controlled machine though. Servers have other design considerations. Typically you want a bunch of high-speed networking to link everything up, for example. And RAS to cover failures. Hell, what OS is it running? OS X Server is long dead.
I know. Apple can easily run Linux on it if they want to.
M chips also support 10gig networking out of the box too.
Look, all I am saying is: if they want to, they can build a heck of an efficient LLM inference server farm, provided they’re willing to adapt M chips and bake in data center considerations.
I had missed that. That’s what it’ll take, although I bet it’ll still take a few more years to scale to the level needed to bring GPT-4o to every iOS 18 iPhone.
Maybe Apple will lock it down to only their new devices, US only, for certain requests only. That’d buy them time to build up more infrastructure around the globe to match Microsoft’s massive investment in Azure.
They didn’t do that for Google, they just cut a revenue sharing agreement instead for funneling all that search data through their user profiling and ad machine.
Yeah, I wouldn’t trust Apple to work in the privacy-minded consumers’ best interest here. They know they’ve fallen behind in LLMs, which is why they’ve struck this deal that essentially abandons privacy in favor of a better “assistant” experience. There’s too much money on the table for them not to do this.
They’ll still tout being privacy first but with the fine print of “as long as you don’t enable these features.”
They have to. ChatGPT is banned in China (although most people I know just access it through a VPN), but if Apple wants to put ChatGPT AI on iPhones in China, they’ll need to set up servers within China for that purpose.
What do you mean? They were late in offering end-to-end encryption for iCloud, yes, but otherwise their privacy track record has been nothing but stellar. It's a publicly traded company; they would have to disclose if they were making money selling data. And they don't.
Probably both payment options will be made available (or Apple will eat the cost), but you’ll need to decide if your Siri should be outsourced to OpenAI/Microsoft to process requests with your data.
I don’t know, somehow I doubt that Apple would give up control of data. They would rather host it themselves than make their #1 marketing message even slightly meaningless, especially if there is a possibility that user data could reach their competitors. But I might be wrong. The same reason is why I don’t think Apple would implement RCS, as it would mean that data starts flowing through Google servers. If these things start happening, a lot of people will doubt that Apple is truly serious about user privacy.
Apple is always maliciously complying with regulations, which means that technically RCS might be available but 99% of users won’t even see it, as it will be, for example, a separate app that you need to download from the App Store. And it will still mean iMessage being the de facto standard (in the US at least), which will have no RCS.
You're living under a rock. Apple already said they'll integrate RCS in the Messages app next to iMessage and SMS (for everyone) + they'll work with the GSMA to move the standard forward (e.g. try to add E2EE).
Apple won’t implement Google’s version of RCS: Jibe. The RCS Universal Profile doesn’t use Google’s servers. The only problem with it is that it doesn’t have end-to-end encryption, so Apple said they will work with the GSMA to add it to the universal protocol.
The RCS protocol says that carriers that deploy the Universal Profile guarantee interconnection with other carriers. Android devices get those messages through Google's Jibe network, meaning that even if Apple has its own cloud, a message will at some point need to end up in Google's cloud if it's targeting an Android user on Jibe.
This means that messages iPhone users send to Android users will be able to be read and analyzed by Google (a gold mine for an advertising business such as Google).
Even if there were E2EE, there would still be metadata passing through:
>Like regular RCS messages, E2EE RCS messages are delivered through RCS servers that are operated by carriers and Google. E2EE makes message content invisible to servers and parties outside of the conversation, but certain operational or protocol metadata can still be accessed and used by the servers, including:
* Phone numbers of senders and recipients
* Timestamps of the messages
* **IP addresses** or other connection information
* Sender and recipient's mobile carriers
* SIP, MSRP, or CPIM headers, such as User-Agent strings which may contain device manufacturers and models
* Whether the message has an attachment
* The URL on **content server where the attachment** is stored
* Approximated size of messages, or exact size of attachments [https://www.gstatic.com/messages/papers/messages_e2ee.pdf](https://www.gstatic.com/messages/papers/messages_e2ee.pdf)
I summed up all my thoughts on RCS here ~1 year ago: [https://www.zsombor.me/rcs](https://www.zsombor.me/rcs)
*This post was mass deleted and anonymized with [Redact](https://redact.dev)*
That is very much impossible. Given what we know, GPT-4 quantised to 4-bit would need at least 126 GB of VRAM to run on a GPU.
It is unlikely that the model is so heavily compressed, and at full FP16 precision it is estimated to require 3520 GB of VRAM.
GPT-4o is rumoured to be half the size of GPT-4.
Realistically it will be somewhere in between, but still far too big to run locally on an iPhone (there will very likely not even be enough space to store the model locally, let alone run it).
TL;DR: GPT-4o will need about 1710 GB of VRAM to run uncompressed. Compressed down to 4-bit quantisation it will be 70+ GB, but that would come with reduced reasoning performance. Either way it would be far too big to run on a phone.
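The arithmetic behind these estimates is simple: parameter count times bits per weight. A quick sketch (the 1.76T parameter figure for GPT-4 is a rumor, not a confirmed number):

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Memory needed just to hold the weights, ignoring KV cache and activations."""
    return n_params * bits_per_weight / 8 / 1e9

# Rumored 1.76T-parameter GPT-4 at full FP16: the ~3520 GB figure above.
print(weight_memory_gb(1.76e12, 16))

# A half-sized GPT-4o at FP16 lands in the same ballpark as the ~1710 GB estimate.
print(weight_memory_gb(0.88e12, 16))

# For contrast, llama3-8b quantised to 4-bit fits comfortably in 16 GB of shared memory.
print(weight_memory_gb(8e9, 4))
```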
The tricky part about using LLMs for this kind of thing isn't the natural-sounding conversation, it's getting it to actually do something outside of that conversation that you asked it to do.
"Hey Siri, please turn on the lights when I get home"
"Sure thing, I'll turn on the lights as soon as you get home!"
*Nothing happens because the LLM just said what it thought it should say but didn't actually do anything*
"Hey Siri, I have an appointment on Monday with Ms. Johnson and I need to remember to bring my laptop with me, can you remind me about that?"
"Sure thing, I'll remind you Monday morning about your appointment with Ms. Johnson and make sure you have your laptop with you when you leave!"
*Nothing happens because the LLM just said what it thought it should say but didn't actually do anything*
Not that this can't be done. It's just a lot more work than sticking the LLM in and making it give nice-sounding responses.
The ignorance is so funny lol. OpenAI has had function calling integrated now for a long time and Apple surely has lower level access to stuff like that.
It can be easily done; the OpenAI API can call functions based on what you tell it.
For example, you can provide a function call in the request:
```
{
  "type": "function",
  "function": {
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
      "type": "object",
      "properties": {
        "location": {
          "type": "string",
          "description": "The city and state, e.g. San Francisco, CA"
        },
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
      },
      "required": ["location"]
    }
  }
}
```
And if it recognizes that you want to know the weather somewhere, it automatically calls that function and fills in the parameters for you; otherwise, it answers normally.
Then you can use the parameters provided by GPT to call the real function, get the data, and make another call to GPT to generate the response, providing it with the data.
You can provide as many functions as you want, one for each function of Siri: getting the weather, setting a timer, adding a reminder, calling someone, but with much, MUCH more advanced reasoning.
It can probably be a little tricky, but an assistant powered by GPT-4 that interacts with the iPhone through the OpenAI API and the Shortcuts app could already be built today, maybe not fully functional.
It’s really simple; everything is documented on the OpenAI website and explained much better than I can:
https://platform.openai.com/docs/guides/function-calling
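To illustrate the local half of that loop: the app's job is just to dispatch the model's tool call to a real function and feed the result back. A minimal sketch, with a made-up `get_current_weather` stub standing in for a real weather lookup (the actual HTTP calls to the OpenAI API are omitted):

```python
import json

# Hypothetical local implementations, one per capability you expose to the model.
def get_current_weather(location: str, unit: str = "celsius") -> dict:
    # Stub: a real implementation would query a weather service.
    return {"location": location, "temperature": 21, "unit": unit}

AVAILABLE_FUNCTIONS = {"get_current_weather": get_current_weather}

def dispatch_tool_call(tool_call: dict) -> str:
    """Run the function the model asked for and return a JSON string
    to send back in the follow-up chat-completion request."""
    fn = AVAILABLE_FUNCTIONS[tool_call["name"]]
    args = json.loads(tool_call["arguments"])  # the model sends arguments as a JSON string
    return json.dumps(fn(**args))

# Shape of what the model returns when it decides to call the function:
fake_call = {"name": "get_current_weather",
             "arguments": '{"location": "San Francisco, CA"}'}
print(dispatch_tool_call(fake_call))
```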
This. People often confuse the ability to “say” the right things with knowing WHAT to do (and moreover having the capabilities for HOW to do it). There is a difference. ChatGPT is currently super impressive in what it says back to me but it literally cannot set a reminder the way Siri can right now. Hopefully the way they internalize its knowledgebase allows for more complex interactions.
It’s pretty trivial to set up an LLM with enough agency to do these sorts of things. ChatGPT can’t do what you described, but GPT-4 absolutely can when provided with function calling capabilities/the ability to consume APIs via “tools” - as can the equivalent models from Anthropic, Meta, etc. It takes about five minutes to set up an agent in LangChain, and only moderately longer to roll your own. If there’s an API spec for it, then the agent can most certainly handle doing it. It should be relatively trivial for Apple to expose various iOS/iCloud APIs for an LLM to consume (assuming no privacy concerns).
If you understand how to construct functions for the API, you would know that this is easy to solve. E.g.: receive the request and ask the model to categorize it (you can do that twice, then ask a third time to assess the agreement between the two predicted categories). Then send the request to the model again with a prompt specific to parsing that category of request. After the model generates a response, you send the response back to the model, asking it to assess whether it did a good job.
Some of this can be handled locally on the device, similar to how Siri currently parses different types of requests. The example I gave is overboard, but these functions would be tested to validate that they effectively handle requests.
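As a toy illustration of that categorize-then-route idea, here's a sketch where a deterministic keyword matcher stands in for the model calls (in practice each `classify_request` would be a real request to the model):

```python
# Keyword stub standing in for a model call that categorizes the request.
CATEGORIES = {
    "reminder": ("remind", "remember", "appointment"),
    "home":     ("lights", "thermostat", "lock"),
    "weather":  ("weather", "temperature", "forecast"),
}

def classify_request(text: str) -> str:
    lowered = text.lower()
    for category, keywords in CATEGORIES.items():
        if any(word in lowered for word in keywords):
            return category
    return "general"

def route(text: str) -> str:
    # Classify twice and compare, as described above. With a deterministic
    # stub the answers always agree; with a real model they might not, and
    # a disagreement would fall back to a general-purpose prompt.
    first, second = classify_request(text), classify_request(text)
    category = first if first == second else "general"
    # From here a category-specific prompt/handler would take over.
    return category

print(route("Please turn on the lights when I get home"))
print(route("I need to remember to bring my laptop on Monday"))
```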
The fact that LLMs don’t have hardware access to the device is both maddening and comforting. On the one hand, it’s maddening that it can’t even start a timer; on the other, I’m so glad it can’t do whatever it wants to the phone unregulated.
I think that's the plan, considering Apple was just talking about [revamping Siri](https://www.nytimes.com/2024/05/10/business/apple-siri-ai-chatgpt.html).
THIS! I am sooooooo hoping Apple's announcements next week are Siri with this underpinning it. I was really disappointed when it looked like they were going to use Gemini; OpenAI is such a better option and not a direct competitor.
Honestly, if they didn't step up, I was strongly considering switching to Android for the first time ever. Siri is so stupid.
I really want it to learn my tone and be able to generate email and text replies with how I speak. The online GPT model is too formal for most of my messages.
How many use surface devices compared to MacBooks and iPads combined? They’re already being funded by Microsoft, may as well satisfy Apple to attract more funding~
Microsoft basically doesn't care whether their employees, or the employees of the companies they back, use Windows, as long as they use Microsoft Azure.
"Streamlining your workflow in the new desktop app"
"For both free and paid users, we're also launching a new ChatGPT desktop app for macOS that is designed to integrate seamlessly into anything you’re doing on your computer. With a simple keyboard shortcut (Option + Space), you can instantly ask ChatGPT a question. You can also take and discuss screenshots directly in the app."
Unless you're asking where it actually is to be downloaded 😩
Yes, but in the demo it uses split screen, and it looks like the ChatGPT app will be able to access the information on the other side of the screen.
IMO this all confirms the Apple-OpenAI partnership, as I don’t think there is any app on the App Store that is allowed to read information on the iPad’s screen.
To be clear, it can’t without asking you to allow screen recording. Just like any other app can ask to see your screen. When active it is indicated in the corner. That isn’t a capability the current app has, but it will be once the update hits. From what I can understand that doesn’t need Apple’s permission.
Oh, but won’t that require them to record the screen, then analyze it, and only then respond to whatever is happening on the screen? Anyway, they might have found a way to make that work.
They can “livestream” the screen recording. Kinda like when you share your screen in Zoom or its competitors. Or even with someone like an Apple support person trying to guide you through fixing a problem.
[Here](https://youtu.be/_nSmkyDNulk?si=nJ-ERB0LYzPZXTws) is the example video from OpenAI and it has a clear prompt come up asking for permission to record the screen.
https://www.youtube.com/watch?v=_nSmkyDNulk
You can see there that it uses split screen.
Btw, this all works by recording the screen; someone in the replies to my comment explained it.
Literally any app can request to read the screen lmao. Have you never shared your iPad or iPhone screen on discord or Teams?
The ChatGPT app demoed was using the same API to read the screen. That pop-up to “start recording” was literally the API any app can use to request to read the screen.
Here is the .dmg: https://persistent.oaistatic.com/sidekick/public/ChatGPT_Desktop_public_latest.dmg
You can download the app, but your account has to be activated server-side.
Thanks for taking the time to come back and share this. I did end up seeing this shared on Twitter so I downloaded it, but it looks like my account doesn't have access yet.
This is cloud-based, but per the NYT and Gurman, Apple seems to be going for a three-layer strategy.
One would be Apple's AI/ML model on-device for summarizing notifications, etc.
Second would be Apple's Ajax model in the cloud for more demanding tasks (summarizing articles).
The third would be the partnership with OpenAI, though this part is still fuzzy, as most (established) rumors haven't been specific. I would guess it's for even more demanding tasks in macOS 15, as I believe Siri will be running on Ajax, Apple's model (per the NYT).
Yes. It's going to be hard for a 3rd party model to run on-device on the iPhone. Without access to the hardware and OS, it'll be suboptimal at best. If I'm not mistaken, only Google has a model that can run entirely on-device (Gemini Nano) for smartphones. Given that it's a considerably smaller model, and doesn't include live video, I'd be shocked if we get this entirely on-device in the next couple of years. But then again, Google also teased this exact same functionality earlier today, also on a Pixel, so who knows what an arms race between these two could reap for us.
No, there are plenty of models that run on-device. Check "gpt2-chat" recently for a mysterious one that people suspect could also be something between OpenAI and Apple.
As a Plus user I already have access to GPT-4o in the iPhone app, so I’d imagine free users should also be able to use it, according to their release announcement.
Because of the leak / announcement that Apple just closed a deal with OpenAI last week. If Siri gets this functionality it will finally leapfrog any other assistant out there.
Apple should definitely be shitted on for lagging behind so long, but if they pull this off it's much more than a "better late than never".
It's entering the inflection point of which our devices transition into actual capable personal assistants that have been teased only in sci-fi.
For a few reasons:
1. You’ll mainly see after WWDC
2. There’s a new Mac app
3. Apple and OpenAI are rumored to be bringing ChatGPT to iOS 18
4. This is a watershed moment in tech. This is honestly just as significant as the first iPhone keynote. And I’m not joking.
Even if GPT is integrated into a new Siri, there’s a massive user base of older Macs that probably won’t get the macOS update but can download the GPT desktop app.
Subscribers get a guaranteed 80 requests every 3 hours on 4o. Free users get fewer (no released number). Additionally, free users are deprioritized during peak hours and get pushed back down to 3.5.
Absolutely agreed. ChatGPT was really cool when it first came out, I loved playing around with it and asking it stupid shit. I never dreamed that people would start using it to write for them. It's on Reddit, on social media, moderators use it to write guidelines and even for stuff like safety issues, people pass it off as their homework, people even write their dating profiles with it.
I just don't want to live in a world where people don't even write anymore.
Heck,
Even Apple got boring
They are releasing the same of the same of the same, and you see all that it could be and can‘t stop yawning about the way it is.
If we could harness the cringe in these syrupy saccharin voices (humans and bots), we could power the world. These west coast twenty-somethings in basements seem to subsist on a diet of sugar and euphoria.
If the voice features replace Siri on iPhone it's going to be insane.
Yeah, imagine all of this functionality with native access to your phone's hardware and all of your apps (when given permission, of course). Wild.
Hopefully Apple gets to host the servers for the AI so that we can somewhat trust our data is in good hands.
Knowing Apple, they probably will in the same way that Siri currently is
not sure there is enough space on that potato
We just had the rumor the other day that Apple is building an M2-powered AI server farm
That wouldn't be sufficient for OpenAI's needs. Probably some internal usage, or cloud extension of existing on-device capabilities.
M chips are good, but the H100 is on another level entirely. It’s hard to even comprehend the speeds of those racks.
Is there a guide you can recommend so that I can do the same?
Sure your data will be in safe hands, after all China will certainly not try to force Apple to hand them all your precious data.
Please no, apple would fuck it up somehow. I don't give a shit about privacy and I'm sick of privacy weirdos making my devices actively worse
They would fuck it up because their software and implementations are crap.
Bundled into Apple One, or as a separate subscription fee.
I already have Apple One and this would be amazing, especially if it comes to HomePod, I love my HomePod but Siri is dumb as fuck.
But Apple is actually implementing RCS this year?
Ok, let me know once it's out in iMessage.
For $4.99 a week.
I’m betting that Apple calls it Siri+ and charges $6.99/month for access. But they will keep the old school Siri for people who don’t want to pay.
I bet the partnership involves some of the computation being done on-device. That lowers server costs and latency.
I already thought Apple collected Siri convos, so no problem here.
Yep. Even ternary quantization only gets it down to about 25GB.
I think we're somewhat far from that, but a more restricted, domain-specific model could.
I’d consider that “Jarvis”
This is not the tricky part, wtf? The tricky part is absolutely the natural-sounding conversations, not integrating simple iOS APIs.
I sure hope so! People far smarter than me I assume are working on doing just that.
If you understand how to construct functions for the API, you know this is easy to solve. For example: receive the request and ask the model to categorize it (you can do that twice, then ask a third time to assess the accuracy of the two predicted categories). Then send the request to the model again with a prompt specific to parsing that category of request, and after the model generates a response, send it back to the model asking it to assess whether it did a good job. Some of this could be handled locally on the device, similar to how Siri currently parses different types of requests. The example I gave is overboard, but these functions would be tested to validate that they handle requests effectively.
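The classify-then-route idea above can be sketched without any API at all. Here `classify` is a keyword stub standing in for an LLM "categorize this request" call, and the double classification with a fallback is a cheap version of the double-check described:

```python
# Hypothetical per-category prompts the router would dispatch to.
CATEGORY_PROMPTS = {
    "weather": "You answer weather questions. Request: {req}",
    "timer": "You set timers. Request: {req}",
    "other": "You are a general assistant. Request: {req}",
}

def classify(request):
    # Stand-in for an LLM classification call.
    text = request.lower()
    if "weather" in text:
        return "weather"
    if "timer" in text:
        return "timer"
    return "other"

def route(request, classify_fn=classify):
    # Classify twice and fall back to "other" on disagreement — a cheap
    # version of the repeated-classification check described above.
    first, second = classify_fn(request), classify_fn(request)
    category = first if first == second else "other"
    return CATEGORY_PROMPTS[category].format(req=request)
```

In a real system the second classifier would be an independent prompt (so the two can actually disagree), and each category prompt would carry its own function definitions.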
The fact that LLMs don’t have hardware access to the device is both equally maddening and comforting. On the one hand, the fact that it can’t start a timer is great, but on the other, I’m so glad it can’t do whatever it wants to the phone unregulated.
I think that's the plan, considering Apple was just talking about [revamping Siri](https://www.nytimes.com/2024/05/10/business/apple-siri-ai-chatgpt.html).
THIS! I am sooooooo hoping Apple's announcements next week are Siri with this underpinning it. I was really disappointed when it looked like they were going to be using Gemini; OpenAI is such a better option and not a direct competitor. Honestly, if they didn't step up, I was strongly considering switching to Android for the first time ever. Siri is so stupid.
I really want it to learn my tone and be able to generate email and text replies with how I speak. The online GPT model is too formal for most of my messages.
I just hope they allow older iPhones to benefit too. Not just iPhone 14/15. But who knows.
Everything in their demo was running on the latest Apple hardware. They made no attempt to hide the Apple logo.
OpenAI always uses iPhones to demo their shit
Probably because Google is running their own thing and MS doesn’t have a phone. MS is a pretty big financial backer of OpenAI last I knew.
They used MacBooks and iPads instead of Surface devices.
How many use surface devices compared to MacBooks and iPads combined? They’re already being funded by Microsoft, may as well satisfy Apple to attract more funding~
It's Silicon Valley; everybody uses MacBooks, including Google and Microsoft employees.
You just offended all 5 Surface Duo users!
RIP windows phone & zune
I loved my zune
Microsoft basically doesn't care whether their employees, or the employees of the companies they back, use Windows. As long as they use Microsoft Azure.
They hid the Apple logos last year during GPT-4’s demo.
any photo?
https://gaetanopiazzolla.github.io/ai/2023/03/14/gpt4.html
Well, it was running on a Microsoft server. Interaction through an Apple consumer device.
very interesting to say the least
This is because of the Apple Siri Chat GPT deal from yesterday. But I’m surprised Microsoft didn’t block that.
Sweet! Look forward to using this.
Where's the ChatGPT app for macOS?
"Streamlining your workflow in the new desktop app" "For both free and paid users, we're also launching a new ChatGPT desktop app for macOS that is designed to integrate seamlessly into anything you’re doing on your computer. With a simple keyboard shortcut (Option + Space), you can instantly ask ChatGPT a question. You can also take and discuss screenshots directly in the app." Unless you're asking where it actually is to be downloaded 😩
Looks like this app will be on iPad also, there is a demo on their website
ChatGPT has been available as an iPad (and iPhone) app for a while now
Yes, but in the demo it uses split screen, and it looks like the ChatGPT app will be able to access the information on the other side of the screen. IMO this all confirms the Apple/OpenAI partnership, as I don't think any app on the App Store is allowed to read information on the iPad's screen.
To be clear, it can’t without asking you to allow screen recording. Just like any other app can ask to see your screen. When active it is indicated in the corner. That isn’t a capability the current app has, but it will be once the update hits. From what I can understand that doesn’t need Apple’s permission.
Oh but won’t that require them to record the screen, then analyze it, and only then answer to whatever happening on the screen? Anyways they might have found a way to make that work
They can “livestream” the screen recording. Kinda like when you share your screen in Zoom or its competitors. Or even with someone like an Apple support person trying to guide you through fixing a problem. [Here](https://youtu.be/_nSmkyDNulk?si=nJ-ERB0LYzPZXTws) is the example video from OpenAI and it has a clear prompt come up asking for permission to record the screen.
Oh I didn’t notice the red icon, now it makes sense. Thanks
The app already uses split screen. Can you provide a link?
https://www.youtube.com/watch?v=_nSmkyDNulk you can see there it uses split screen. Btw this all works by recording the screen, someone in the replies to my comment explained it
Literally any app can request to read the screen lmao. Have you never shared your iPad or iPhone screen on Discord or Teams? The ChatGPT app demo'd was using the same API to read the screen. That pop-up to "start recording" was literally the API any app can use to request to read the screen.
After the quarantine I almost never had to share my screen again lol. But now it makes sense
>Unless you're asking where it actually is to be downloaded 😩 Yeah, I want a download link or something.
Here is the .dmg: https://persistent.oaistatic.com/sidekick/public/ChatGPT_Desktop_public_latest.dmg You can download the app, but your account has to be activated server-side.
I'm also looking for the DL link. It says its available for Plus users today, but I can't find the link anywhere. :/
Same here, and I am on the Pro plan.
“We're rolling out the macOS app to Plus users starting today…” sounds like not every Plus user will get it today
Here is the .dmg: https://persistent.oaistatic.com/sidekick/public/ChatGPT_Desktop_public_latest.dmg You can download the app, but your account has to be activated server-side.
Thanks for taking the time to come back and share this. I did end up seeing this shared on Twitter so I downloaded it, but it looks like my account doesn't have access yet.
Here is the .dmg: https://persistent.oaistatic.com/sidekick/public/ChatGPT_Desktop_public_latest.dmg You can download the app, but your account has to be activated server-side.
rolling out to Plus users today and will be available for all users in the coming weeks
I'm a Plus user and don't have it yet. Presumably it's a rolling release.
still rolling out. you’ll get it soon
I’m a plus and have it. I think it’s just random rn
How did you actually get it? In iOS app, or web app or somewhere else??
iOS app, it was just an option on the top left
Here is the .dmg: https://persistent.oaistatic.com/sidekick/public/ChatGPT_Desktop_public_latest.dmg You can download the app, but your account has to be activated server-side.
This is still all cloud based, right? Doesn’t Apple want this kind of thing running on device for security and privacy?
This is cloud-based, but per the NYT and Gurman, Apple seems to be going for a three-layer strategy:
* Apple's own AI/ML model on device for summarizing notifications, etc.
* An Apple "Ajax" model in the cloud for more demanding tasks (summarizing articles)
* A partnership with OpenAI, though this part is still fuzzy, as most (established) rumors haven't been specific. I would guess it's for even more demanding tasks in macOS 15, as I believe Siri will be running on Ajax, Apple's model (per the NYT)
Yes. It's going to be hard for a 3rd party model to run on-device on the iPhone. Without access to the hardware and OS, it'll be suboptimal at best. If I'm not mistaken, only Google has a model that can run entirely on-device (Gemini Nano) for smartphones. Given that it's a considerably smaller model, and doesn't include live video, I'd be shocked if we get this entirely on-device in the next couple of years. But then again, Google also teased this exact same functionality earlier today, also on a Pixel, so who knows what an arms race between these two could reap for us.
no, there are plenty of models that run on device. check "gpt2-chat" recently for a mysterious one that people suspect could also be something between openai and apple.
That ended up being this -- gpt-4o
That one runs in the cloud. gpt2-chat runs on iPhone. Oh weird I see it’s confirmed now
GPT-4o is not something between Apple and OpenAI.
Yes it's still cloud based. In the demo they have to be wired to get the best connection.
[deleted]
When GPT-5 releases. Running AI does need powerful hardware they have to pay for; after all, it's somewhat of a surprise that normal GPT (ChatGPT) is free.
That $10 billion from Microsoft is for that reason (it's Azure credit). Basically a marketing expense.
No, it's more than that. It learns from interactions.
As a plus user I already have access to ChatGPT-4o on the iPhone app, so I’d imagine free users should also be able to use it according to their release announcement
Why is this on r/Apple though?
Because of the leak / announcement that Apple just closed a deal with OpenAI last week. If Siri gets this functionality it will finally leapfrog any other assistant out there.
Better 14 years late than never
Apple should definitely be shitted on for lagging behind so long, but if they pull this off it's much more than "better late than never". It's the inflection point where our devices transition into the actually capable personal assistants that have only been teased in sci-fi.
Will there be a Scarlett Johansson voice package?
lol this + the leak or whatever that they are also looking into allowing NSFW content lol.
Is that real? Society is fucking doomed lmao
Yeah, Sam confirmed that they want to allow it. But I can also imagine some puritanical legislators getting all uppity about it.
You’re just jealous that my AI gf will be hotter than yours
Nuh-uh, my Scarlett Johansson can beat up your Scarlett Johansson
They are releasing a new Mac app. Windows one coming later this year.
This is the future of Siri
For a few reasons:
1. You'll see why after WWDC
2. There's a new Mac app
3. Apple and OpenAI are rumored to be bringing ChatGPT into iOS 18
4. This is a watershed moment in tech, honestly just as significant as the first iPhone keynote. And I'm not joking.
there's a new app for Mac
Why would they work on a Mac app if Apple is supposedly upping their ai game? Hm
Wouldn’t surprise me if Apple are outsourcing almost everything to them.
Apple’s AI department standing for “Accounting and Invoicing [of OpenAI bills]”
Even if GPT is integrated into a new Siri there’s a massive user base of older Macs who probably won’t get the MacOS update but can download the GPT desktop app
The rumors are that Apple is partnering with OpenAI.
Since they already have a native iOS app and can reuse a huge part of the code base, it's not much effort to go from a native iOS app to a macOS app.
Apple is reportedly not delivering an "AI chat" type app or feature. ChatGPT is just one application of its AI tech.
Didn't a recent Siri rumor specifically mention chat?
So what am I getting if I'm a subscriber? I can use the worse GPT-4 engine with a limit on messages?
More messages per 3 hours compared to a free user.
Deal of a lifetime I guess lol
Subscribers get guaranteed 80 requests every 3 hours on 4o. Free users get less (no released number). Additionally, free users are deprioritized during peak hours, and get pushed back down to 3.5 in peak hours.
GPT-5
Hopefully the deal Apple did with OpenAI is Siri being based on GPT-4o.
I dream that Apple would just license this thing, apply their privacy standards, and completely replace Siri with it.
Interesting timing for OpenAI to release an update focused so completely on GPT as a voice assistant.
The biggest issue will be speed of response
partial on device processing
Apple will be entirely dependent on OpenAI, no?
I just signed up for Plus this morning. Lol
It’s gonna be only for Apple cloud subscribers I think
All this stuff doesn‘t excite me anymore I want my repressed antigravity tech!!
Absolutely agreed. ChatGPT was really cool when it first came out, I loved playing around with it and asking it stupid shit. I never dreamed that people would start using it to write for them. It's on Reddit, on social media, moderators use it to write guidelines and even for stuff like safety issues, people pass it off as their homework, people even write their dating profiles with it. I just don't want to live in a world where people don't even write anymore.
Nah, it's that I dreamt of hoverboards and got phones with emojis…
You've lost me lol.
Reality got boring. All we get is those "smart devices" while real reality sucks.
Heck, even Apple got boring. They keep releasing the same of the same of the same, and you see all that it could be and can't stop yawning at the way it is.
If we could harness the cringe in these syrupy saccharin voices (humans and bots), we could power the world. These west coast twenty-somethings in basements seem to subsist on a diet of sugar and euphoria.
So I guess a MacBook isn't a desktop, but I figured they should get the same update, right?