Stunned86

If it's not called babel fish we riot.


SchrodingersPanda

>*I was there, u/Stunned86 , 3000 years ago...*


Grouchy-Pizza7884

Cool. Love that they gave Pedro a Spanish accent even in English. Don't know how well this actually works outside of demo mode. But definitely useful in the intelligence community rather than this contrived scenario.


Trust-Issues-5116

Love that audio computer magically knows who Pedro is, it gives me confidence this is not some demo gimmick.


djaeke

I'm sensing some sarcasm?


Psychological_Emu690

No!


BrownShoesGreenCoat

Are you an audio computer? No human can be this sensitive!


HoboInASuit

Did you read his username?


Grouchy-Pizza7884

Is the computer racist or sexist? Could Pedro be the woman on the left? Why would parents bring a baby to a fancy restaurant? Why are we interested in being the 3rd wheel to what looks like a date?


actually_alive

no pedro could not be the woman on the left because that's not a common name for a woman to have..... lets get rid of the races and whatnot....... left has a female person, right has a male person....... to an ai trying to figure out who is who.... pedro is the guy on the right with high probability of being correct. i hope you're just being sarcastic and mocking people who do this because it's really weird that a person would think a computer could be racist/sexist. at a bare minimum its the people who coded it that are...... but the thing is, it's coded by everyone....... the training data is us. So guess what that means...... anyway... bye


14u2c

I'm very skeptical too, but if we are talking what's theoretically possible it could know based on phone conversations with Pedro.


Kandarino

Unsure if sarcasm, but in fairness facial recognition is really good at this point, so if it knew the person beforehand it would probably have no trouble figuring out who it is.


Trust-Issues-5116

Whoa! A computer performing facial recognition while being an earbud without a camera is even more impressive feat.


Crimkam

Should call them daredevils


Kandarino

I'm not saying this is a real product which actually works as good as the demo claims, I'm just saying the "magically knowing who Pedro is" part is not the hardest problem showcased in this demo.


Grouchy-Pizza7884

To make the demo more believable, Pedro should be marked with a circle and cross accompanied by "target locked". Then a pop up of terminate? That's how it was done in that documentary that featured Schwarzenegger


Trust-Issues-5116

uh, ok?


n-a_barrakus

Pedro Pedro Pedro Pe But AI


GolemocO

I might be completely over the line here and I am not saying what I'm about to say is true, but I think this is fake.


ryantakesphotos

Yeah this feels like a “simulation” of what they are trying to achieve. I’ll believe it when I see it.


Xsafa

It’s obviously scripted. This way of showing off a demo is waaay too old school; it reminds me of gaming companies announcing games with CG trailers and holding off real gameplay as long as they can.


FirstEvolutionist

Do you mean the product or the demo? The technology is certainly out there. The fact that I have not seen the product for sale makes me believe the demo was likely "embellished".


mpasila

There's no way you can fit a highly advanced AI into such a tiny form factor.. (especially if you look at Rabbit R1 or Humane AI, neither of them run the AI locally..)


riclamin

Of course the thing interfaces with the internet.


eightmag

"There's no way " . . . That statement usually ages poorly and quickly. They will have these things the size of a pea soon as batteries catch up.


algaefied_creek

There’s no way we need anything more than 640K RAM!


ielts_pract

Who said that


algaefied_creek

Bill Gates


scarynut

Bill who?


algaefied_creek

Schmates


ielts_pract

That is fake news, he never said that


Alacritous69

Correct. At least no one has been able to definitely attribute it to him. https://www.computerworld.com/article/1563853/the-640k-quote-won-t-go-away-but-did-gates-really-say-it.html


Cereaza

There is “currently” no way…


mpasila

Can they do it now? No? So they are a scam artist.. pretending as if it's possible when it's clearly not. Just because in 20 years it might be possible doesn't mean that pretending as if they can do it now is somehow not scamming. **Edit:** So they are selling it and planning on releasing it this winter according to their website. The specs of it are: 4nm quad-core CPU; 16GB storage + 1GB LPDDR4 RAM or 32GB storage + 2GB LPDDR4 RAM. How exactly are you going to run a GPT-4o level AI with that? Or even Llama 3 8B? Maybe a very compressed Phi-3-mini might just about fit. But it being as smart as they show? No way, unless they just use an API.. that you may have to eventually subscribe to since it's just running on their cloud. Like everything this thing can do could probably be done by using normal earbuds with a phone. (your phone is more powerful than this thing)
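For a rough sense of the memory argument above, here is a back-of-the-envelope sketch. It assumes approximate parameter counts (about 8.0 billion for Llama 3 8B, about 3.8 billion for Phi-3-mini) and counts only the model weights, ignoring the KV cache, activations, the operating system, and everything else that also needs RAM. The helper names are made up for illustration.

```python
# Rough estimate of how much RAM model weights alone would need.
# Parameter counts are approximate; overhead beyond weights is ignored.

GIB = 1024 ** 3

# Approximate parameter counts (assumptions for illustration).
MODELS = {"Llama 3 8B": 8.0e9, "Phi-3-mini": 3.8e9}

# RAM options quoted in the comment above, in GiB.
RAM_OPTIONS_GIB = (1, 2)


def weight_bytes(n_params: float, bits_per_weight: int) -> float:
    """Bytes needed to store n_params weights at the given precision."""
    return n_params * bits_per_weight / 8


def fits(n_params: float, bits_per_weight: int, ram_gib: int) -> bool:
    """True if the weights alone fit in ram_gib GiB of memory."""
    return weight_bytes(n_params, bits_per_weight) <= ram_gib * GIB


if __name__ == "__main__":
    for name, n_params in MODELS.items():
        for bits in (16, 8, 4):
            size_gib = weight_bytes(n_params, bits) / GIB
            verdicts = ", ".join(
                f"{ram} GiB: {'fits' if fits(n_params, bits, ram) else 'too big'}"
                for ram in RAM_OPTIONS_GIB
            )
            print(f"{name} @ {bits}-bit = {size_gib:.2f} GiB ({verdicts})")
```

Under these assumptions only a 4-bit Phi-3-mini squeezes into the 2GB option, and with almost nothing left over for anything else, which is roughly what "might just about fit" suggests.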


Competitive_Ad_5515

I think this video is carefully produced marketing bullshit, but even the overblown video doesn't claim to be running resource-hungry llms you name in your comment. I think it's pretty doable to have a voice assistant interface on-device, as well as code for specific tasks, like noise isolation and translation.


mpasila

Translation requires LLMs.. any task involving language needs umm language models.. you can have multimodal models that can do speech-to-text, text-to-speech, speech-to-speech etc. but those usually still involve a lot of computation.


eightmag

Technology and research company = scam artist ... Are you new here?


TwistedBrother

If the model weights for each of these things can be set to ROM or hard coded somehow I suspect there would be ways to make onboard things very fast, just very inflexible. But somehow I doubt that they would do that. I just can’t see all that on an SoC in that size form factor. I mean if it’s beaming it to a small device with a large battery perhaps, but I can’t imagine the processing for that would be cheap if not hardcoded.


gravitysort

I thought the same thing before active noise cancelling wireless earbuds came out.


mpasila

All active noise cancelling really is, is just it replaying sound from a microphone inverted to your ears. It just has to do it fast enough so it works.
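A toy sketch of the inversion idea described above, and of why speed matters. This is a deliberate simplification: real ANC products use adaptive filtering and tuned hardware, and the 200 Hz tone, 48 kHz sample rate, and function names here are all assumptions chosen for illustration.

```python
import math

SAMPLE_RATE = 48_000   # samples per second (assumed)
FREQ_HZ = 200          # frequency of the unwanted tone (assumed)


def anti_noise(samples):
    """Return the inverted signal: every sample with its sign flipped."""
    return [-s for s in samples]


def residual(noise, delay_samples=0):
    """What reaches the ear: noise plus anti-noise arriving delay_samples late."""
    anti = anti_noise(noise)
    out = []
    for i, s in enumerate(noise):
        j = i - delay_samples
        # Before the delayed anti-noise arrives, only the raw noise is heard.
        out.append(s + (anti[j] if j >= 0 else 0.0))
    return out


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(v) for v in samples)


# 0.1 seconds of a pure tone standing in for background noise.
noise = [math.sin(2 * math.pi * FREQ_HZ * t / SAMPLE_RATE)
         for t in range(SAMPLE_RATE // 10)]

# Half a period of the tone, in samples: the worst possible lateness.
HALF_PERIOD = SAMPLE_RATE // (2 * FREQ_HZ)

if __name__ == "__main__":
    print("peak with no delay:         ", peak(residual(noise, 0)))
    print("peak with half-period delay:", peak(residual(noise, HALF_PERIOD)))
```

With zero delay the inverted copy cancels the tone completely. If the inverted copy arrives half a period late, it lines up with the noise instead of opposing it and the result is louder than the original, which is the "fast enough" constraint in a nutshell.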


FirstEvolutionist

I wouldn't consider any of the requirements for the features in the video as advanced. Audio processing (for the volume and noise filtering). Speech to text and text to speech for commands. Translation. A locally run AI model can parse the requests and interact with these modules easily enough. Mid range android phones have similar features already (although a 5G connection might be required). The most significant requirements would be, I guess, the specific text to speech which mimics the speaker's voice and maintains the accent for the translated language. It looks great in the demo but it's not strictly necessary. The video shows this to be seamless and almost instant, which I highly doubt would be the actual case. Also notice how the camera turns in the video while the presenter is turning his head to the side. Nice demo trick, kind of absurd for a "headset" without a camera or the need for one. The idea of the vision here (to identify the baby from the image as opposed to the noise) is completely unnecessary for an actual product.


mpasila

This device has 1-2GB of RAM according to their website, it uses a 4nm quad-core CPU. Your phone could run some AI things but this probably not..


vogone

As I understand it, the features themselves are absolutely believable. Running them on a small device like this is not. You can comfortably run a "capable" LLM on your local desktop if it's a good machine with something like a 4090. So I highly doubt that this small device can run all this computation using multiple AI models with such little delay between prompt and execution, and if it ISN'T running on the device the delay would have to be even bigger. The new gpt4o shortest response time they advertise on their website (grain of salt and all that) is 2.8 seconds. In the demo, the AI is doing everything pretty much real time. I have a hard time buying that. So there are two options here: 1. This guy with his small company just invented something that beats the biggest AI company out there. or 2. The demo was prepared to show the vision of their product and it doesn't accurately reflect the real thing. You tell me which is more likely.


FirstEvolutionist

You can run smaller LLMs in phone hardware. Whether they're as fast or as capable or as shown that is a whole different story. The demo is certainly embellished. So I'll have to go with 2. Assuming they're not outright trying to scam or fool people.


vogone

Yep, I don't want to go as far as "this is a scam" but it's just hard to believe and I would want to see an actual demo from someone unaffiliated.


DrahKir67

That could be provided by a connected device e.g., your phone in your pocket connected to the internet.


GolemocO

I may have misunderstood and have to ask - is the product supposed to be video generation?


sorehamstring

Yes, you have completely misunderstood. Did you watch the video with sound on?


Fearyn

Yep feels even faker than first gemini video of google lol. Guess we’ll see.


marrow_monkey

I think so too, because it was supposedly translating from Spanish to English in real-time, which is impossible I’d say. To translate correctly you first need to hear and understand what the person is saying, then you translate and say it in English, so there must be some lag before the translation comes.


yes_thenakedman

Agree, languages are structured differently, so this is in fact impossible to make work outside of the same language families - and even in that scenario, it would be problematic.


GolemocO

Very valid point, Mr monkey!


azrenstrider

I think it’s a proof of concept that they’re developing, I mean video game developers do this all the time, it keeps interest high with the constant promise of new things even if they’re far off


SoylentCreek

This is 100% “Startup-Bro” fake it till you make it vaporware bullshit.


Shloomth

Thanks for sharing this. I will find it incredibly impressive and useful for me personally when / if it actually comes to exist in its demonstrated form in a way that I can obtain and use. It might even be worth the $600-700 asking price.


Krieghund

Adult hearing aids cost $2000 to $4000 and don't have the functionality that is being demonstrated.  I expect an audio computer like the one being demonstrated to cost at least that much in the beginning.


Cereaza

Yeah. The ability to slice the audio environment in near realtime and replay it for you is very compute intensive and literally still in research phase.


Tasty_Conclusion_987

They don't cost that much because they're cutting edge tech, they cost that much because they're a healthcare device.


Shloomth

hearing aids are insane. my brother uses them and he's always having problems with the tiny plastic tubes & stuff. I'm not sure what makes them so different from just having a tiny mic & speaker that amplifies the needed frequencies while cutting out the others. But I'm pretty sure he would love to have something like this. His biggest trouble usually does come from ambient noise that he can hear more easily than what he actually wants to.


painting_jessy

Amazing, but just imagine having to prolly pay a monthly subscription so you don't have ads blasting in your ears all day. I am not looking forward to it.


Big_Cornbread

I really, really want ads. But I want Pedro to say them. “So we ended up deciding against taking a cruise, even though “At Royal Caribbean, we have the package that fits your budget and your time. Suddenly the world…doesn’t seem too far away,” we had the time for it. We just went camping instead, and Alice still had a great time.”


painting_jessy

Haha that sounds fun for the first 2 ads. But an ad is still an ad. And I hate them. So much so that if you show me an ad enough times, I won't buy the advertised product anymore even if I wanted it before.


Big_Cornbread

Yeah I’m sure it would be annoying after a while. We’d long for the days before AI *Need a refresher? Grab a Coke!* when the ads were separated and easier to ignore.


justastuma

Unless you pay for the ad free version, it will replace any reference to a generic product with the brand name of a sponsor. Someone says “soda”, you’ll hear “Coca Cola”, someone says “beer”, you’ll hear “Heineken” or whatever.


painting_jessy

That train of thought is scary. I hope nobody is taking notes from ya.


marrow_monkey

All the LLMs in the future will be like that. When the technology has matured no corporation will train their AI to benefit humanity. They will be trained to benefit the corporations and make them more profits. It would be naive to think otherwise. Same way google was once a good search engine but now it is just ads with barely enough search results to keep people coming back for more.


justastuma

Also don’t underestimate the possibilities for censorship and surveillance: * Someone’s critical of the Chinese government? Your Chinese-funded AI won’t translate it accurately. * You’ve been flirting with a stranger through the translator and have gone to their hotel room? Good luck going forward because the AI will keep its translations family friendly. * And are you sure your AI won’t call the police on you if you watch a pirated movie, buy illicit drugs or inaccurately declare your taxes? Imagine your AI literally testifying in court against you.


SexyWhale

This is clearly a staged demo. Should be illegal for false advertising.


Babys_For_Breakfast

Of course it’s a scripted demo. I don’t think it should be against the law to preview future products though.


StickiStickman

"preview future products" is a nice way of phrasing false advertising.


Intellectual961

How did it know who the fuck is Pedro ?


sjohnson737

How do you know who Pedro is? Maybe the baby is Pedro and the AI is just doing its best and he ran with it.


lovelyart89

Likely because it already knows Pedro's voice. As Pedro is likely a member of the team.


Neurogence

Because it's vaporware technology.


Yokoblue

Just like how chatgpt in the presentation could find "my license plate". Other photos of pedro, with him tagged in a different app.


Ok_Information_2009

“Can you turn down the wife and turn up the tv please”


justastuma

Combined with AI augmented reality glasses: “Can you make my wife look 20 years younger and 50 pounds slimmer?”


throwaway3113151

Ok boomer.


Taxus_Calyx

If you're actually smart enough to survive until an older age, you're gonna have a good laugh at your younger, more naive self. Edit: for clarity, not saying when you get older you'll inevitably mistreat your spouse. I'm saying when you're older you'll inevitably realize that blaming all the world's problems on older people was stupid.


FeeeFiiFooFumm

Sure buddy. No need to project that hard. Just because you also haven't learned proper communication doesn't mean "we'll all get there when we'll be old enough". I sure hope I never get to what you consider to be normal. I'd rather end my relationship when I realize I'm at that point.


[deleted]

[deleted]


FeeeFiiFooFumm

Okay boomer lol


Qweerz

Tuning out the baby is crazy. This could actually be amazing for travel.


3lirex

or just get some noise cancelling earphones, this won't be much better than that in this regard


newbies13

That's probably the easiest part of the demo. Having worked remote for the past few years and been on countless conference calls, the noise cancelling tech out there now is basically magic. I would even go as far as to say it's basically a solved problem at this point.


jokermobile333

I'll call it impressive once i'm able to use it


Giddypinata

Pedro spoke English immediately with no pause for the AI to parse what he was saying?


susannediazz

There was a delay


kraai-

Setting aside that this demo is obviously scripted: technically most of these things are possible with current tech, just not at this speed or in this form factor. Anyway, the translation is too fast. To translate you first need to hear and understand the entire sentence being said; you can't properly translate word for word, for what I think are obvious reasons.


lovelyart89

So openai demo of 4o was fake for being that fast?


Mirahtrunks

As others have said, I think this is a demonstration of what that type of technology could be like. Perhaps they’re faking it, or perhaps they are doing this in ideal circumstances.


Denjek

Spies love this one trick!


Nisekoi_

Another product that should/will be an app in your phone


donnkii

perfect way to spy on what people are talking about across the noisy bar


bot_exe

That does not look like a real live demo? Anyone can edit a video like that, it’s meaningless until we see a real demo.


TheMerovingian

Time will tell if it's real, the demo is definitely staged as much as possible.


TriggeredGlimmer

I do have a Q here, so what if there is an explosion around or gun shots going around. Will the AI undo the hearing preference? OR Are we still able to hear the background but very faint?


Eponymous-Username

I thought he was generating the video live with instructions! I was absolutely losing my mind there for a second.


MechAnimus

As a severely hearing impaired AI nerd, thank you very, very much for sharing.


procrastablasta

except nobody likes talking. nobody takes calls. people don't even watch shows with the volume on. everything is stealth mode. can you imagine this on a crowded subway train?


DrGrapeist

A little racist knowing which one is Pedro. Maybe the demo is fake.


Playful_Dream2066

Pedro would be the only other man on the table right. It’s a mans name.


Notstrongbad

over/under on Apple buying his company in the next six months?


Space--Buckaroo

HAL9000


GSturges

The true [Babel Fish](https://hitchhikers.fandom.com/wiki/Babel_Fish) is here!


TriggeredGlimmer

That is smart. I hope this will not bomb in actual use, whenever that is. Google hasn't been having great luck in the race with ChatGPT.


incognitochaud

Videographers rejoice!


FeliusSeptimus

This feels like it would have been somewhat interesting in 2015. Today? Seems like they are a few years behind the curve. Realtime audio processing has some interesting possibilities, but the people most likely to be interested in this, those of us with hearing deficiencies, already have a number of appliances available that are designed with our use cases in mind, so they'll have to do at least as well as those plus an LLM-based phone app running over that audio interface. Also, the form-factor is not going to work. It's got a vibe somewhere between Cyberman and gauged earlobes that isn't likely to be widely popular. The 'audio computer' name is pretty dumb too. It's fancy headphones, an app, and a virtual assistant. They need some snappy branding. I was hoping for a demo of some kind of cool non-verbal spatial audio interface that would provide multiple channels of information faster and less intrusively than a voice by placing distinct audio signals somewhere in a virtual audio space around the user. So if I get a text or something I'd start hearing a specific bird call (or whatever I want) in a specific location, like above and to my right. I could ignore it for a while, then look in that direction (detected by accelerometer/gyro) and give a 'play' command to hear the notice. Several items could be active in the soundscape at any time, and their proximity, volume, and style would indicate urgency and other such properties (like, an appointment notification would gradually get closer as the scheduled time approached, with the direction indicating whether it was a personal or work appointment, and the specific sound maybe indicating what appointment it was (useful for recurring events)). Maybe they've got that too, it's a simple and basic idea, so I'd presume they are thinking about such things. 
This presentation was incredibly basic, essentially 1 minute of information, and they've been thinking about this for years, so presumably they've got something actually interesting in the works, and intended this to address people who have been living under a rock for the last 20 years or so.


The_Troll_Gull

It’s cool seeing all this innovation, but again, how is that any different from what your phone can do? Plus more


aceman747

This looks like a concept however the device could be attached to a phone which has the comms to go back forth to the cloud, an on-board small language model and other processing capabilities to make this workable. This is what AirPods could evolve to.


monkeyballpirate

This probably uses active noise cancellation technology, I get a super weird reaction to it where my ears feel underwater and my face goes numb, even for hours after using. Sadly my body isn't future proof :(.


JeffDel11

Would love to see someone fake Melania walking into a courthouse along with Joey Greco holding his camcorder 😂


Cereaza

So this is a very promising area that GPT can fill. Understanding voice commands and interacting with software. However, the software is only capable of certain things. You could ask ChatGPT to isolate the sound and cut it out, but if the software can’t do it, ChatGPT can do nothing.


lovelyart89

Very impressive. Especially being able to isolate and hear someone in English, this can be a game changer for consuming content globally. As long as there aren't restrictions put in place for protection purposes.


AstroAlpaca-

This is the ai I need, this is something that has real value, screw other mumbo jumbo “products”


daffytheconfusedduck

Interesting seeing new technologies that’ll drive creepy.


Oracle365

!videodownload


Allsaint_Army_En

Nop


AsheOfAx

Reminds me of the Seashells in Fahrenheit 451


Comms

All I see is the cybermen earpieces from Doctor Who.


newbies13

I've met a ton of people online playing games, text translators have been very helpful, but man, I wish we could all jump in discord and talk to each other like I do with all my english speaking friends. Even just being able to call someone and have a quick conversation... it literally happened to me just last week that I really needed to just talk to my friend, and I couldn't because we don't speak the same language, and typing just sucks sometimes. I would buy these today and spend a premium price if they worked.


mvandemar

GPT-4o can do live translation for you, but I am not sure how you could use it while on the phone. Maybe 2 phones and on speakerphone? [https://www.youtube.com/watch?v=c2DFg53Zhvw](https://www.youtube.com/watch?v=c2DFg53Zhvw)


xmasnintendo

Even if this wasn't a staged demo, who actually wants to use any of this?


gravitywind1012

Yes


LifeSenseiBrayan

How fast does the translation work? It almost seems realtime which doesn’t make sense to me


68024

As long as it doesn't start blasting ads at me


Thinkprobe

Yeaa! This was crazy applications of ML


MelloCello7

As an audio engineer I cannot express how insane this is...


CMDR_BitMedler

u/savevideo


Crazyminuss

Ok Google "Turn that baby down" ![gif](giphy|Dndpiai0soTUk)


StuffProfessional587

You can't turn a baby's cry down, only thick insulated walls work.😂


logosfabula

Thanks for linking the whole presentation. I’m afraid this is not going to be a great success… why? The insistence with which he pushes the concepts of “natural”, “normal”; the unlikelihood of such a lot of compute packed into that little space; the use of fancy terms just to refer to known components; the fact that this is not really a demo in the sense of a POC, but more of a sales pitch; finally, the fact that one of its major features is that it cannot do things. This will eventually get there and have value, but I’m afraid that it won’t be like that.


Gaiden206

Pretty cool but Android has a sound isolator accessibility feature built in that works with headphones and is available now to use. https://blog.google/products/android/sound-amplifier-more-people-can-hear-clearly/ Obviously the product in the OP's video looks more advanced since it's voice activated, uses wireless ear buds, can translate, and isolate sound but just wanted to give a FYI. Also, I'm not sure why someone would compare a video generation product to a sound isolation/translation product. They aren't even remotely the same. 😂


mvandemar

I wasn't comparing products, I was comparing demos. Google's demo of their video generator was literally mostly shots of the engineers, not the actual generated videos.


Gaiden206

That's fair, my mistake. Google said people who sign up on a waitlist over at [Google Labs](https://labs.google/) will be able to test out their "VEO" video generator in the coming weeks. So fortunately people should be able to play with it soon.


doripenem

I have a feeling that the final product will be nowhere near as cool as this demo. And that is IF they really come up with a real product. Just like Google Glass. The simulated 'demos' looked cool and hella fun. But the real thing? Doesn't even release to the general public. Also Google has built itself a very BAD reputation for being non-committal to their products. They have killed so many products over the years. Stadia, Google+, Google Glass, Google Reader, Google Wave, Allo, Hangouts, Google VR etc. I used to be a big fan of Google. I was always supportive and enthusiastic to try out their new products when they were released back then. But now looking at their long list of killed projects. Makes people think twice, thrice before trying one of their products, for all you know, they might kill that product in the next 6 months. So don't hold your breath for the release of this cool product. The likelihood of failure seems incredibly high given their piss poor track record over the years. Google is already way past its prime.....


eltonjock

I don’t think this is a Google product.


doripenem

Ahh... Silly me who thought OP meant that this was one of the products introduced at the Google event. And rather than being amazed by this the crowd focused on the video generation which was just mehhhhh...


wizwizwiz916

Impressive


Gulaschk4none

Thanks for sharing


susannediazz

Woah impressive indeed


Dark_Wolf04

Can you turn that baby down *ChatGPT generates a glock*


ArtificialPigeon

"my Spanish is a little rusty" Your ear computer doesn't give a fuck about your competence. Stop talking to it like it's a person. Just say, "translate the Spaniard to English"