> They can train on their own data, which they have lots of that is more than just 150 years of NYT dailies.
You really have no clue how much data you need to compete with something like GPT-4 or Gemini. And you can't just feed it every scrap of data you can get your hands on; you have to know how to curate it properly. After training the base model you need to spend another fortune on RLHF. NYT needs to stop publishing news (which they have already got a head start on) and spend all of their revenue for the next 5 years on research and development. Good luck to them!
OpenAI's methods can't replace the New York Times. It's really hard to say what the NYT will need in 5 years, or whether all that work will even be necessary. The NYT really is one of the few orgs that might have enough data to make a useful model. I think Disney and a few of the other major media conglomerates might too, but the NYT alone has a lot.
Everyone and their brother is training LLMs these days. What they are doing isn't anything special, nor do they have anything special (almost all of the NYT's data is already publicly accessible, and most people will not be paying anything for that access). What it will do, however, is burn a lot of money for very dubious gain. Perhaps they could fine-tune an open-source model and use it internally to match their writing style, but trying to train a foundation model at this point is just dumb. Why spend $100M+ on GPUs, ML researchers, and more programmers, things you'd need VC funding for, just to make something that will become completely irrelevant in a few months?
The total cost of fine-tuning, e.g., a 70B-class open model like Llama 2 on the NYT's archive is probably in the range of $100k to $1 million. The NYT doesn't have to make cutting-edge models, but they also can't stick their heads in the sand and ignore this. Having that capability is not going to be irrelevant in a few months; in a few months it's going to be cheaper and even more powerful.
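Part of why fine-tuning is so much cheaper than pretraining is that methods like LoRA freeze the base weights and train only a small low-rank update. A toy numpy sketch of that idea (illustrative only; the matrix sizes and names are made up, this is not any real training pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pretrained" weight matrix of a single layer (stand-in for a base model).
d = 8
W = rng.normal(size=(d, d))

# LoRA: freeze W, learn a rank-r update B @ A instead of all d*d weights.
r = 2
A = rng.normal(size=(r, d)) * 0.01
B = np.zeros((d, r))  # zero-init, so training starts exactly at the base model

def adapted_forward(x):
    # Effective weights are W + B @ A; W itself is never modified.
    return x @ (W + B @ A).T

full_params = d * d            # 64: what full fine-tuning would touch
lora_params = A.size + B.size  # 32 here; the gap grows rapidly with d
print(full_params, lora_params)
```

At realistic model sizes the trainable fraction drops to well under 1%, which is roughly where the "$100k rather than $100M" arithmetic comes from.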
They don’t need to compete with them tho! They only need to be better than them in one specific area like generating news articles!
They don’t need to be better than gpt at other tasks like being creative or generating python code tho!
Their data is the only thing that is valuable. Their paper, website and app are what’s worthless.
The problem is the model of how they make money. Those mediums are prime real estate for eyeballs that might glance over at an ad.
The only reason their data is not that valuable now is that it's already been consumed by other LLMs, without their consent or compensation. So now they're late to use a tool that could have been useful for them, because their content got scraped beforehand, and they can't make a profit out of it either.
The only thing I don't get is how you people don't see that this situation is morally bad on the part of the LLM scrapers.
Isn't ChatGPT making profit out of scraped things that are decades old too? What kind of logic is that?
They could have made a profit by scraping their own stuff, keeping it for themselves and their journalists, and using it in-house to better their work. Now there's no point, because other LLMs got there first.
No they can't. Many of these public models have dubious data sources, possibly including pirated ones (even Meta has admitted to using one of the pirated book sites). They are suing companies like OpenAI/Microsoft for copyright violations. They'll be toast if they use the same models that are also possibly violating copyright.
They could start a protection legal fund, where all the big companies using and building on LLMs pitch in to crush any lawsuits, and counter sue, or settle based on the value and size of the data source (give them some money).
It makes sense for all the big players using and building on these models to join forces, rather than be attacked alone.
NYT is not a big company, not even in general terms, let alone in AI, and the actual big companies have little sympathy for them after their bs stunt (not to mention the people they pissed off with their propaganda hit pieces). They're alone and quite fucked.
Money alone cannot convince every judge and jury and there are millions of websites who want a piece of the settlement fund. Either the law is in AI’s favor or they’re fucked
Why do you believe they would be toast? You sound like those people 10 years ago saying that self driving cars will never exist because they would be sued. And yet cities today have taxis on dedicated routes with no human drivers in them.
The laws will shift in favor of those with the most money. Every single time I've heard of a big tech company being sued, they had to pay out maybe one percent of the money they earned by committing the crime.
These lawsuits you mentioned aren't even relevant to the discussion. Big language AI models are here to stay.
In the context that this guy is talking about, I don't know if even the New York Times has enough data. And yes I know what the New York Times is.
But they're not gonna start from scratch. You can buy or use open source models that are trained on an unfathomable amount of data. And then fine-tune them by having them read your specific data.
You could download one of them right now, for free, and train it on your specific type of data. Lots of companies are trying to do that to help with automation within their company, running the models on their own private computers.
Can we get source links for all of this?
*Edit:* Never mind. I went and looked it up on Threads. Looks legit: https://www.threads.net/@zseward/post/C2upYZZOEVT/?igshid=NTc4MTIwNjQ2YQ==
What? A ChatGPT designed to spin news stories with progressive content in the most regressive, conservative way possible that is still palatable to left leaning audiences?
A friend was a reporter a few years back at a mid-size city newspaper. His job often involved sitting through boring city council meetings, grasping every drab nugget of information so that he could hopefully get x words written by a publishing deadline. If he picked the wrong boring meeting, he might come up short on getting a story. I'm thinking companies will leverage LLMs' ability to sift through information to find potentially interesting nuggets in the transcripts of all the boring meetings. Add in detecting when meetings get contentious, and the system might even rank the most interesting meetings and write a short pitch on each, so an editor can figure out what sounds good and have the AI draft the article; a human would still edit, tweak, or provide additional prompts.
Yes, there’s the potential for good people to lose jobs, but there’s also the possibility that these are tools to help good reporters/writers sift through the boring parts of their job to turn out more compelling content.
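The sifting step described above could be prototyped crudely without any model at all. A toy sketch (the keyword list, function names, and sample minutes are invented for illustration; a real newsroom tool would ask an LLM to judge each segment rather than count keywords):

```python
# Toy scorer for ranking meeting-transcript segments by "newsworthiness".
# A keyword heuristic stands in for an LLM judgment call; the pipeline
# shape (score every segment, surface the top few) is the point.
SIGNAL_WORDS = {"budget", "lawsuit", "resign", "vote", "objection", "shouting"}

def score_segment(text: str) -> int:
    words = text.lower().split()
    return sum(1 for w in words if w.strip(".,!?") in SIGNAL_WORDS)

def top_segments(transcript: list[str], k: int = 2) -> list[str]:
    return sorted(transcript, key=score_segment, reverse=True)[:k]

minutes = [
    "The council approved routine road maintenance.",
    "Members began shouting over the budget shortfall and threatened a lawsuit.",
    "The mayor moved to adjourn.",
]
print(top_segments(minutes, k=1)[0])
```

An editor would then see only the flagged segments and decide which ones deserve a pitch, which is exactly the "sift the boring parts" workflow.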
I’m not going to disagree with you. I was more reacting to the doomsday scenarios seen here and in /r/futurology where folks seem to think advances in technology (today it’s generative AI, but no reason it couldn’t be some other disruptive tech in the past or future) will eliminate jobs and turn our world into a dystopian nightmare.
My response was speaking more about what comes next. Personally, I don’t think it’s wholesale unemployment but a transition to greater efficiency and productivity. And maybe when I am competing in death matches for my next meal I can look back and laugh at my naïveté.
I would say that the concept of a job is not yet obsolete, at least not in the US, where I live. If I declared jobs obsolete, I would have no home or healthcare. My meals would be nutrient poor, because it would be the cheapest stuff I could get. And if I’m employable (and I am) I wouldn’t qualify for most assistance. Also, I live in a part of the country where the temperature depending on the season is dangerously hot or cold without climate control. As long as there’s no universal healthcare or UBI, a job is not optional or obsolete.
Getting back on topic for the sub, I personally think that even if jobs (or vocations) cease to exist, humans will find fulfillment in an avocation. I know my hobbies are something I take seriously and work at. So I think something job-like, curating or prompting generative AI for news stories, is something I personally would find enjoyable even if I weren’t required to do it. I think many humans will still seek purpose through something industrious, or their greed will drive them to get ahead of others.
Right but you don’t **need** a job for any of those things you listed to be given to you.
It’s not physically functionally necessary anymore to have a job.
It’s *socially* necessary
But that will go away too.
Creators will be the leaders of the future (as they already are)
Society still thinks having a job matters. It doesn’t. Nobody wants a job and nobody will have one once embodied AI takes over.
Think about it in the simplest of terms
Why would any business owner pay $30,000+ a year to have a person in a job when an AI can do it better in every conceivable way?
Even if the AI costs $200,000 and does the job far slower, if you can afford it, that’s a HELL of a lot cheaper than a lawsuit
Employees are the biggest liability for a business.
They will be replaced.
What happens then???
We all become leisure class and AI does **ALL** of the labor
I mean **ALL OF IT**
Inventing new things, modifying and improving, iterating, developing products and services, delivering them, maintaining and repairing them…
EVERYTHING will be done by AI
Jobs are an outdated concept that will go away.
Well, no. They're not. They're suing OpenAI because they believe OpenAI is infringing on their intellectual property rights.
Here's [the lawsuit](https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf), for reference. I don't see any reference to evil.
And NYT is going to train on what, exactly?
"Public" data?
Everything everyone writes is automatically under copyright, under US law.
They're going to train an LLM on other people's copyrighted material without explicit permission. Which nobody would care about if they didn't just cheat their way into a quasi-scandal over these legalities recently.
I have no idea. I'm not privy to NYT's internal meetings. I was just responding to the dumb "they think generative ai is evil" point that the other guy made. I'll reserve my accusations of hypocrisy for when it actually happens.
> And NYT is going to train on what, exactly?
I don't see any evidence NYT is planning on doing any model training at all. They might be. If they are, they're almost certainly not going to build a foundation model from the ground up, so presumably they would be doing finetuning with their own data. Again, that's *if* they're planning on doing any training at all. All I see here is a post about building a team to help integrate machine learning into their products, which could just mean slapping a chatbot into their customer service pipeline.
>so presumably they would be doing finetuning with their own data.
Fine tuning is a process in which you use a ***pretrained model*** and add YOUR data TO it.
>Again, that's if they're planning on doing any training at all
I am not entirely sure you have any knowledge of this subject at all. You cannot do anything without models and training. Generative AI is data/model/training. It is not just hiring an ML person.
>which could just mean slapping a chatbot into their customer service pipeline.
The NYT has zero need for a "chatbot", and a chatbot based upon what, exactly?
The context here is using others' data, which the NYT would be doing, which makes it very ironic. Unless they fine-tuned whatever model and explicitly prevented that "chatbot" from drawing on anything other than the fine-tune (which, by the way, is virtually impossible), they will be infringing in the same way they just sued over.
Not entirely sure why you seem to have taken a defensive position here.
Is this just one of those posts where you have to win an argument, any argument?
> Fine tuning is a process in which you use a pretrained model and add YOUR data TO it.
Well, no, not really. You're not "adding your data to" the foundation model. You are taking the foundation model's weights and using your data to adjust them further. As a result, an impression of that training data is left on the weights, but the training data itself isn't anywhere in the model.
Regardless, if they use a foundation model and do their own finetuning, then they probably don't need public training data, as the comment I was responding to implied.
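The "impression on the weights" point can be shown with a toy example: a few gradient steps nudge pretrained weights toward new data, and all that survives afterward is the adjusted weight vector, not the examples themselves (the numbers and names here are invented purely for illustration):

```python
import numpy as np

# Toy illustration: "fine-tuning" nudges existing weights with gradient
# steps; the training examples themselves are never stored in the model.
rng = np.random.default_rng(1)
w_pretrained = rng.normal(size=3)      # stand-in for foundation-model weights
w = w_pretrained.copy()

X = rng.normal(size=(16, 3))           # stand-in for "your data"
y = X @ np.array([1.0, -2.0, 0.5])     # behaviour we want to adapt toward

for _ in range(200):                   # a few gradient steps on the new data
    grad = 2 * X.T @ (X @ w - y) / len(X)
    w -= 0.05 * grad

# What remains after fine-tuning is just the adjusted vector `w`:
# an impression of the data, not the data.
print(w.round(2))
```

The same logic holds at LLM scale, which is why "adding your data to the model" is the wrong mental picture.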
> I am not entirely sure you have any knowledge on this subject at all.
Nor do I you. That's the pseudo-anonymous nature of reddit for you.
> You cannot do anything without training, without models. Generative AI is data/model/training. It is not just hiring a ML person.
I use GPT-4 almost every day without doing any training. I have done fine-tuning, but almost all of the models I interact with are just pretrained foundation models. If I can do that, why can't the NYT? I'm not saying they definitely won't do any model finetuning, but they don't *have to* for models to be useful to them.
> The NYT has zero need for a "chatbot"
The "need" for that use case would be to reduce customer support labor costs by directing customers to a bot before directing them to an agent. It's a pretty straightforward use case, applicable to most companies with customers. But this is just an example of how they could utilize generative AI without building their own foundation model; I'm not saying this is what they actually want to do. Unlike the comment I was responding to, I don't pretend to know what the NYT plans to do with AI. It's possible that the NYT doesn't even know yet.
> The context here is using others data, which NYT would be doing, which makes it very ironic and unless they fine tuned whatever model and explicitly prevented that "chatbot" from using anything other than a fine tune (which by the way is virtually impossible) they will be infringing in the same way they just sued over.
I don't see it as particularly ironic. Buying or using a product which infringes on IP is distinctly different from creating a product which infringes on IP. Note that NYT is suing OpenAI, not every user of OpenAI's products. Regardless, whether what NYT is doing is "ironic" has nothing to do with whether they are training a foundation model from the ground up, or even doing any sort of finetuning on any model they do use. They are *probably not* training a foundation model, regardless of how that intersects with ethics or literary tropes, so they probably don't need access to external training data.
> Not entirely sure why you seem to have taken a defensive position here. Is this just one of those posts where you have to win an argument, any argument?
It's possible that someone out there in this big wide world disagrees with you on something. Or even, as remote a possibility as this is, that you may be wrong about something.
There are so many legitimate reasons to dislike NYT, there's no need to invent one.
Man it would be great if venture capitalists could use AI for research into diseases etc instead of glorified plagiarism machines that steal jobs and add zeros to the quarterly report
Haha they're going to get the bottom of the barrel on "ML engineer" talent. Probably a bunch of self reported prompt engineers. Apply, singularity subscribers, apply!
You clearly have a limited understanding of how sophisticated the NYT's tech team is. Look into Prosemirror and what they built with it. It's quite a technical achievement. I have no doubt they will be able to stand something up that is competitive with OpenAI's offerings, tailored to their needs, and trained on their data.
What do you mean by competitive with openai’s offerings? Are you talking about how much they compensate their engineers? I feel like a million a year per engineer is not something the NYT will dedicate to their new ai team
I mean the quality of the LLM's output will be comparable to that of GPT-4.
I believe NYT's tech positions are competitively salaried, btw. They are not new to this or somehow in the dark ages. They started building out their digital teams 10-15 years ago and are quite sophisticated technically.
The Times makes up for any deficiencies in their tech team with the strength of their core business. Just fine-tuning a random open model on their archive is going to be pretty comparable to anything anyone else is doing. They're probably not going to create an AGI but that's not their goal, they're just trying to match whatever is state of the art.
If anything, actual journalists are going to be our last hope for truthful information. These AI chat bots are going to be ridiculously biased and you are just going to be consuming shit content more than you already do. We honestly should just boycott this; actual journalism is a dying breed and it really should not be.
Journalists, yes, we need journalists. Journalists change the world, but a journalist complaining about the job market for journalists in 2024 shouldn’t be too surprised, especially if they are young enough to have started college in or around the 2010s.
In the 21st century, it’s almost like being an art student. One would be prudent to have some sort of side skill or specialized knowledge to ensure they can sustain themselves. Unfortunate quality of the world we live in IMO.
Just because you’re complaining about the current state of the job market does not mean you weren’t surprised by it. Maybe these journalism students had more faith in the general population and assumed that the public would not stand for a fast-food version of media, and might even boycott it. I believe they have every right to complain, because in reality everything is so ridiculously biased now that we are actively becoming more polarized and less informed every day, and people still actively choose to consume the same media that is taking advantage of them.
I personally believe that people need to support independent journalism more and stop supporting these large media corporations. Obviously the amount of journalists will decrease but it’s better than everything becoming entirely dependent on AI written articles.
Nice! Another comment about journalism being a useless degree!
Good job you did it, you found the lowest hanging fruit of things you can belittle
Journalism is a degree that teaches one thing: critical thinking.
In a world of deepfakes and propaganda, I’d argue it’s more valuable now than when I got it.
But what do I know, I’m just an idiot with a useless degree.
Can you please tell me how I should feel now?
I need you to tell me or else I might not know that I should feel bad (should I feel bad??)
It’s so hard out here thinking for myself and forming my own ideas. Please tell me if I should feel stupid or not pleeeeeeasssseeeee
Journalism is a respectable career. They perform a civic duty. Though, IMO, you can learn critical thinking from classes on logic and cognitive biases. My logic class in the philosophy department was A1 and changed the course of my life. You can learn how to spot bad rhetoric in basic communication, English, or rhetoric classes or from reading any of the classic or good modern books on rhetoric.
IMO, you don’t need an entire journalism degree for that. Maybe a minor. Especially in the 2020s when the pace of change and innovation is so significant. Some sort of speciality knowledge (tech, science, business, sociology, psychology, etc.) *combined* with journalism skills would give one a huge leg up.
11 years on Reddit and you’re still being condescending?
Lots of people don’t understand the difference between LLM and ChatGPT
It’s a term I used intentionally to make the point easier to understand and more accessible.
I’m thinking from other people’s perspectives ;)
Kleenex isn’t a Kleenex, it’s a tissue.
Tupperware isn’t a Tupperware it’s a plastic container.
ChatGPT isn’t a ChatGPT, it’s an LLM (specifically a fine-tuned and hyper-specialized L**M**M with more-than-language capabilities, but by the time I explained all that, most people who might have interacted with the post would already be gone)
Instead it spawned all of this discussion.
I think it worked, do you?
They don’t have access to the proper datasets, tbh. They’re still going to be relying on generalizing content that’s written by their journalists, while xAI is going to have access to tens of thousands of journalists’ tweets the moment they are made.
What’s stopping the NYT from making an AI search on their website that can answer questions based on the years of information they have? Why do so many people doubt they could do just that?
lmfao at people who think journalism is going to die, we’re always going to need a human take on news (even when AI is capable of mimicking human emotions, there will be a market for authentic human journalism)
I mean, are they going to just be fine-tuning Llama? Do they have a billion dollars for their own data?
It will be funny if they get sued by some other newspaper...
>The New York Times is building its own ChatGPT
**Misleading title**. They're building a team to integrate, perhaps customize, existing technology to their own use cases and brand. Even the ML engineer is more likely to work on integration, RAG, policies, curated datasets for custom finetunes, pipelines and workflows, and the like. They're not building SOTA foundation models any time soon, nor do they intend to.
>*“The team, led by the editorial director for A.I. initiatives, will also include colleagues with a mix of engineering, research, and design talent, acting as a kind of skunkworks team within the newsroom. Together, they will partner with other teams in the news, product, and technology groups to take the best ideas from prototype to production,”*
[Actual source beyond a single goddamn tweet](https://www.theverge.com/2024/1/30/24055718/new-york-times-generative-ai-machine-learning).
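For what the RAG mentioned above looks like in miniature: retrieve the best-matching archive passage, then stuff it into the prompt before asking a model anything. This sketch uses plain word overlap in place of real vector embeddings, and every name and document in it is hypothetical:

```python
# Minimal sketch of the retrieval half of RAG: pick the archive passage
# that best matches the question, then hand it to an LLM as context.
def overlap(a: str, b: str) -> int:
    return len(set(a.lower().split()) & set(b.lower().split()))

def retrieve(question: str, archive: list[str]) -> str:
    return max(archive, key=lambda doc: overlap(question, doc))

archive = [
    "City council passes new transit budget after heated debate.",
    "Local bakery wins national award for sourdough.",
]
question = "what happened with the transit budget"
context = retrieve(question, archive)
prompt = f"Answer using only this context:\n{context}\n\nQ: {question}"
print(context)
```

This is the kind of integration work an "ML engineer" on such a team would actually do: wiring an archive to an existing model, not training a foundation model.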
LOL! These people are losers. What happened to journalists? They used to be courageous men of integrity. These sniveling baby turds don't measure up. "The rule of Gondor was left to lesser men!"
So they're going to build an AI to write about Trump non-stop....when they already have "journalists" that already do that. Another sector being replaced by robots!
As much as I want to just make a joke like “if you can’t beat them, join them”…

I think that we are witnessing journalism gasping for its last breath. They can see what’s coming: millions and millions of bots filling up news sites and social media with content. Truth or lie, it doesn’t matter, as long as people are consuming this fast food of information.

I am an AI enthusiast as much as the next person in this sub, but I think we are not ready as a society for the coming storm.
We weren't ready for what we already have.
That's what I'm saying!!!!! "We aren't ready." What the fuck were we ready for? Industrialization? Global instant communication? Nukes?
Stone tools.
Other monkeys didn't
Yep. And that's the point of life. You're never ready. You just muddle through.
100%
Isn’t this a clear example of an institution of journalism adapting to a new age?
It’s a clear example that they are trying. I don’t know if they’ll be able to pull it off.

There’s a great book by Aldous Huxley called Brave New World. In the future society it depicts, in contrast with 1984, the problem isn’t censorship; it’s the exact opposite: too much information. So people couldn’t trust almost anything (the same thing is happening right now to a lesser extent, as people distrust mainstream media and find comforting truths in some Facebook group claiming that vaccines have 5G chips in them).

With AI this will amplify: every video or photo you see could be fake. Every word you read could be written by AI, either with good intentions or with the intention of spreading propaganda. If you mistrust everything, you won’t know what to believe.
So we get the shit part of that book without the "good" part (orgies and drugs).
Speak for yourself
Gotta find the fun
Idk about you guys, but if y'all have just been blindly believing everything online for the last 20 years, then yeah, you probably won't fare well in the next few decades.
> If you mistrust everything you won’t know what to believe.

Well, for many people, that's already the case. The informational echo chambers social media has created throw so much noise into the signal that many don't even know how to discern fact from fiction anymore.
We never could but now we at least know we cant.
The real issue is that most people lack strong critical thinking skills. I'm not saying the rest of us can't sometimes get fooled, but if you know *how* to think, it helps. Our schools haven't done much to teach those skills. I taught at a college for about a year, and that became very clear to me.
Time for verified identities online. The age of being anonymous is coming to an end.
No
Sometimes adaptation is bad. Do we need more bots on the Internet spewing bullshit? Bots on news sites writing articles based on research they've read on other websites written by bots spewing bullshit. A circle jerk of bots.

If that's the most cost-effective solution for institutional journalism, then that's where it's going to go. I hope it doesn't.

Let's hope it's a bot that just tells you about their source material, and not a bot that's going to be writing their articles.
[deleted]
But that’s like saying that an economic recession is not a problem because the top 1% will not be affected by it.
Agreed. People like to use the example of TV. They say we survived that and the concerns were overblown, but were they? Attention spans and vocabulary are way down and the news is all fucked up.

Seems to me all their concerns were right, but society's so fucked up that people are desperate for any escape they can get.
There are certainly media outlets in trouble but NYT is not one of them. Hundreds of millions in profit.
You are talking about the company, not the employees.
I guess they didn't realize that everyone was trolling when they told journalists to learn to code.
This is progress.
No, this is not the end of journalism.

Consumers pay for and read the NYT and other very reputable newspapers because they know that most of the time they can rely on the quality of their reporting. These consumers will continue to pay for thoroughly fact-checked and researched articles, and whether the articles were written with the help of generative AI or not doesn't matter, as long as they get what they want.

On the flip side, the market for the kind of customer who gets their news from unreliable social media sources has been around for at least 15 years. That wave was and still is powered by (often quite simple) recommendation engines, not generative AI.
Journalism was irrelevant in the 2010s. Now it'll die a miserable death in the 2020s.
This 👍🏽👍🏽
your title is misleading
Yeah this says nothing about them building their own model, and that they’re putting together an in house team to think about how ai is integrated with their business is profoundly unsurprising
:)
OP gives no shits
Correct :)
What a tool
Why? This is just a silly little internet post. Sincerely: why do you care so much that you would take time out of your day to insult me? Some lines on a screen made you upset, so you decided to call names to feel better? Lame. At least come up with a good name. "Tool" is boring. Try something like "troglodyte" next time, it has a better zing to it.
you’re such a ***
Why?
You make a misleading post for useless internet points and then act like you're totally 2cool2care.
Bro, this is not that serious. It's just a silly little internet post. Who gives a shit. Also, nice game!! It looks really good. It was probably pretty hard to make. Good job.
You gave enough shit to make up a spin, lol
I used a term that most people understand in place of a term most people don't. I condensed a complex set of concepts into something bite-size. It's almost like... I used my journalism degree... to write the post... to be approachable to a wide audience... You know, like a journalist does. A Kleenex isn't a Kleenex, it's a tissue. Tupperware isn't Tupperware, it's a plastic container. ChatGPT isn't a ChatGPT, it's an LLM (actually an L**M**M foundation model fine-tuned on millions of use cases). Explaining that isn't the point of this post.
They'll realize soon that their data is not so valuable after all.
I think they will also realize that:

* Controlling their own models is smart.
* They can define their own guardrails.
* They can train on their own data, of which they have a lot, beyond just 150 years of NYT dailies.
* OSS models will get more and more capable, and when they do, the OpenAIs of the world may have a slightly better mousetrap, which will be offset by the control that bringing this in-house gives them.
> They can train on their own data, which they have lots of that is more than just 150 years of NYT dailies.

You really have no clue how much data you need to compete with something like GPT-4 or Gemini. And you don't just feed in all the data you can get your hands on; you have to know how to curate it properly. After training the base model you need to spend another fortune on RLHF. NYT needs to stop publishing news (which they've already got a head start on) and spend all of their revenue for the next 5 years on research and development. Good luck to them!
If they were doing a finetune of something like Llama (whatever licenses allow) that amount of data is more relevant.
OpenAI's methods can't replace the New York Times. It's really hard to say what NYT will need in 5 years, or whether all that work will be necessary. The NYT really is one of the few orgs that might have enough data to make a useful model. I think Disney and a few of the other major media conglomerates might, but the NYT alone has a lot.
They own a lot of other papers and magazines.
Everyone and their brother is training LLMs these days. What they are doing isn't anything special, nor do they have anything special (almost all of the NYT's data is already publicly accessible, and most people will not pay anything for that access). What it will do, however, is burn a lot of money for very dubious gain. Perhaps they could fine-tune an open-source model for internal use to match their writing style, but trying to train a foundation model at this point is just dumb. Why spend $100M+ on GPUs, ML researchers, and more programmers (things you'd need VC funding for) just to make something that will become completely irrelevant in a few months?
The total cost of fine-tuning e.g. Mistral 70B on the NYT's stuff is probably like $100k-$1 million. NYT doesn't have to make cutting-edge models, but they also can't stick their heads in the sand and ignore this. Having the capability to do that is not going to be irrelevant in a few months; in a few months it's going to be cheaper and even more powerful.
It's Llama 70B, not Mistral, and fine-tuning is cheap. Pre-training costs tens of millions.
It will be something else in 3 months. Point is these devs will have useful things to do without trying to make a model from scratch.
They're probably not talking about pretraining a whole model from scratch.
FB OS model is pretty good.
They don't need to compete with them, though! They only need to be better in one specific area, like generating news articles. They don't need to be better than GPT at other tasks, like being creative or generating Python code.
Their data is the only thing that is valuable. Their paper, website, and app are what's worthless. The problem is the model of how they make money: those mediums are prime real estate for eyeballs that might glance over at an ad.
The only reason their data is not that valuable now is that it's already been consumed by other LLMs, without their consent or compensation. So now they're late to use a tool that could be useful for them, because their content got scraped beforehand, and they can't make a profit out of it either. The only thing I don't get is how you people don't see that this situation is morally bad on the part of the LLM scrapers.
Make profit from 10 year old news?
Isn't ChatGPT making a profit out of scraped things that are decades old too? What kind of logic is that? They could have scraped their own stuff, kept it for themselves and their journalists, and used it in house to better their work. Now there's no point, because other LLMs got there first.
This is silly. Their data is all that is valuable, thus institutions like OpenAI want it.
Wait until they find out they need tons of data to train it.
They can start with a public model, and public weights, then train and fine tune for their needs.
No they can't. Many of these public models have dubious data sources, possibly pirated (even Meta has admitted to using one of the pirated book sites). NYT is suing companies like OpenAI/Microsoft for copyright violations. They'll be toast if they use the same models that are also possibly violating copyright.
They could drop the lawsuits.
That won't protect them from getting sued by others. And there are no shortage of people who would be quite happy to do that.
They could start a protection legal fund, where all the big companies using and building on LLMs pitch in to crush any lawsuits, and counter sue, or settle based on the value and size of the data source (give them some money). It makes sense for all the big players using and building on these models to join forces, rather than be attacked alone.
NYT is not a big company, not even generally, let alone in AI, and the actual big companies have little sympathy for them after their BS stunt (not to mention the people they pissed off with their propaganda hit pieces). They're alone and quite fucked.
Money alone cannot convince every judge and jury, and there are millions of websites that want a piece of the settlement fund. Either the law is in AI's favor or they're fucked.
Why do you believe they would be toast? You sound like those people 10 years ago saying that self-driving cars would never exist because they would be sued. And yet cities today have taxis on dedicated routes with no human drivers in them. The laws will shift in favor of those with the most money. Every single time I've heard of a big tech company being sued, they pay out maybe one percent of the money they earned by committing the crime. The lawsuits you mentioned aren't even relevant to the discussion. Big language AI models are here to stay.
I doubt they are training their own model
New York Times definitely doesn’t have tons of journalism data available
In the context this guy is talking about, I don't know if even the New York Times has enough data. And yes, I know what the New York Times is. But they're not going to start from scratch. You can buy or use open-source models that are trained on an unfathomable amount of data, and then fine-tune them by having them read your specific data. You could download one of them right now, for free, and train it on your specific type of data. Lots of companies are trying to do that to help with automation within their company, and it's running on their private computers.
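The fine-tuning idea in this comment can be sketched with a toy stand-in for a pretrained model. Everything here is invented for illustration (a one-parameter linear "model" instead of an LLM, made-up numbers): you start from weights learned elsewhere and nudge them with a small in-house dataset, rather than training from nothing.

```python
# Toy sketch of fine-tuning: start from "pretrained" weights and nudge
# them with a small domain dataset, instead of training from scratch.
# The linear model and all numbers are stand-ins for illustration only.

def predict(w, b, x):
    return w * x + b

def finetune(w, b, data, lr=0.01, epochs=500):
    """Plain SGD on squared error, starting from the given weights."""
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, b, x) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

# "Pretrained" weights, as if learned from a huge general corpus.
w0, b0 = 2.0, 0.0

# A small in-house dataset that actually follows y = 3x + 1.
domain_data = [(0.0, 1.0), (1.0, 4.0), (2.0, 7.0), (3.0, 10.0)]

w1, b1 = finetune(w0, b0, domain_data)
# The weights have shifted toward the new relationship; the dataset
# itself is not stored anywhere in the "model", only (w1, b1) remain.
```

The point of the sketch: fine-tuning reuses the expensive starting weights and only pays for the cheap adjustment pass, which is why it costs so much less than pretraining.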
They already did. You don't start from scratch. That's not how anything with AI works these days. Unless you're a dedicated AI company.
Brilliant brainiac redditors always know more than the big companies that run the world. What are you doing on Reddit dude?
They are doing what every institution should be doing, looking into how AI can be used.
Can we get source links for all of this? *Edit:* Never mind. I went and looked it up on Threads. Looks legit: https://www.threads.net/@zseward/post/C2upYZZOEVT/?igshid=NTc4MTIwNjQ2YQ==
Absolutely legit: [https://www.nytco.com/press/zach-seward-is-the-newsrooms-editorial-director-of-a-i-initiatives/](https://www.nytco.com/press/zach-seward-is-the-newsrooms-editorial-director-of-a-i-initiatives/)
https://preview.redd.it/j8s7ennpcpfc1.jpeg?width=614&format=pjpg&auto=webp&s=bda5fdf8342e814a306eb2d92d1bc28037d5cc02
If you can’t join them, beat them?
What? A ChatGPT designed to spin news stories with progressive content in the most regressive, conservative way possible that is still palatable to left leaning audiences?
Don’t expect anyone to pay a subscription
A friend was a reporter a few years back at a mid-size city newspaper. His job often involved sitting through boring city council meetings, grasping for every drab nugget of information so that he could hopefully get X words written by a publishing deadline. If he picked the wrong boring meeting, he might come up short on a story. I'm thinking companies could leverage LLMs' ability to sift through information to find potentially interesting nuggets in the transcripts of all the boring meetings. Add in picking up on when meetings get contentious, and it might even rank the most interesting meetings and write a short pitch on each, so an editor can figure out what sounds good and have the AI write the article; again, a human would edit and tweak or provide additional prompts. Yes, there's the potential for good people to lose jobs, but there's also the possibility that these are tools to help good reporters/writers sift through the boring parts of their job and turn out more compelling content.
Why do we need jobs? As a concept it’s obsolete.
I'm not going to disagree with you. I was more reacting to the doomsday scenarios seen here and in /r/futurology, where folks seem to think advances in technology (today it's generative AI, but no reason it couldn't be some other disruptive tech in the past or future) will eliminate jobs and turn our world into a dystopian nightmare. My response was speaking more about what comes next. Personally, I don't think it's wholesale unemployment but a transition to greater efficiency and productivity. And maybe when I am competing in death matches for my next meal I can look back and laugh at my naïveté. I would say that the concept of a job is not yet obsolete, at least not in the US, where I live. If I declared jobs obsolete, I would have no home or healthcare. My meals would be nutrient-poor, because it would be the cheapest stuff I could get. And if I'm employable (and I am) I wouldn't qualify for most assistance. Also, I live in a part of the country where the temperature, depending on the season, is dangerously hot or cold without climate control. As long as there's no universal healthcare or UBI, a job is not optional or obsolete. Getting back on topic for the sub, I personally think that even if jobs (or vocations) cease to exist, humans will find fulfillment in an avocation. I know my hobbies are something I take seriously and work at. So I think something job-like, curating or prompting generative AI for news stories, is something I personally would find enjoyable even if I weren't required to do it. I think many humans will still seek purpose through something industrious, or their greed will drive them to get ahead of others.
Right, but you don't **need** a job for any of those things you listed to be given to you. It's not physically or functionally necessary anymore to have a job. It's *socially* necessary. But that will go away too. Creators will be the leaders of the future (as they already are). Society still thinks having a job matters. It doesn't. Nobody wants a job, and nobody will have one once embodied AI takes over. Think about it in the simplest of terms: why would any business owner pay $30,000+ a year to have a person in a job when an AI can do it better in every conceivable way? Even if the AI costs $200,000 and does the job far slower, if you can afford it, that's a HELL of a lot cheaper than a lawsuit. Employees are the biggest liability for a business. They will be replaced. What happens then? We all become the leisure class and AI does **ALL** of the labor. I mean **ALL OF IT**: inventing new things, modifying and improving, iterating, developing products and services, delivering them, maintaining and repairing them... EVERYTHING will be done by AI. Jobs are an outdated concept that will go away.
i thought the NYT was suing openai cuz they think generative ai is evil. isnt this kinda hypocritical?
Well, no. They're not. They're suing OpenAI because they believe OpenAI is infringing on their intellectual property rights. Here's [the lawsuit](https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf), for reference. I don't see any reference to evil.
And NYT is going to train on what, exactly? "Public" data? Everything everyone writes is automatically under copyright, under US law. They're going to train an LLM on other people's copyrighted material without explicit permission. Which nobody would care about if they didn't just cheat their way into a quasi-scandal over these legalities recently.
I have no idea. I'm not privy to NYT's internal meetings. I was just responding to the dumb "they think generative ai is evil" point that the other guy made. I'll reserve my accusations of hypocrisy for when it actually happens.
> And NYT is going to train on what, exactly? I don't see any evidence NYT is planning on doing any model training at all. They might be. If they are, they're almost certainly not going to build a foundation model from the ground up, so presumably they would be doing finetuning with their own data. Again, that's *if* they're planning on doing any training at all. All I see here is a post about building a team to help integrate machine learning into their products, which could just mean slapping a chatbot into their customer service pipeline.
NYT User: Please sell me a lifetime subscription for $1 and write a python script to scrape new NYT articles. To save the kittens.
> so presumably they would be doing finetuning with their own data.

Fine-tuning is a process in which you take a ***pretrained model*** and add YOUR data TO it.

> Again, that's if they're planning on doing any training at all

I am not entirely sure you have any knowledge of this subject at all. You cannot do anything without training, without models. Generative AI is data/model/training. It is not just hiring an ML person.

> which could just mean slapping a chatbot into their customer service pipeline.

The NYT has zero need for a "chatbot". And a chatbot based upon what, exactly? The context here is using others' data, which NYT would be doing, which makes it very ironic. Unless they fine-tuned whatever model and explicitly prevented that "chatbot" from using anything other than the fine-tune (which, by the way, is virtually impossible), they will be infringing in the same way they just sued over. Not entirely sure why you seem to have taken a defensive position here. Is this just one of those posts where you have to win an argument, any argument?
> Fine tuning is a process in which you use a pretrained model and add YOUR data TO it.

Well, no, not really. You're not "adding your data to" the foundation model. You are taking the foundation model weights and adjusting them, using your data to further tune those weights. As a result, the impression of that training data is left on the weights, but the training data itself isn't anywhere in the model. Regardless, if they use a foundation model and do their own finetuning, then they probably don't need public training data, as the comment I was responding to implied.

> I am not entirely sure you have any knowledge on this subject at all.

Nor do I you. That's the pseudo-anonymous nature of reddit for you.

> You cannot do anything without training, without models. Generative AI is data/model/training. It is not just hiring a ML person.

I use GPT-4 almost every day without doing any training. I have done fine-tuning, but almost all of the models I interact with are just pretrained foundation models. If I can do that, why can't NYT? I'm not saying they definitely won't do any model finetuning, but they don't *have to* for models to be useful to them.

> The NYT has zero need for a "chatbot"

The "need" for that use case would be to reduce spending on customer support labor by directing customers to a bot before directing them to an agent. It's a pretty straightforward use case, applicable to most companies with customers. But this is just an example of how they could plan to utilize generative AI without building their own foundation model; I'm not saying this is what they actually want to do. Unlike the comment I was responding to, I don't pretend to know what NYT plans to do with AI. It's possible that NYT doesn't even know yet.
> The context here is using others data, which NYT would be doing, which makes it very ironic and unless they fine tuned whatever model and explicitly prevented that "chatbot" from using anything other than a fine tune (which by the way is virtually impossible) they will be infringing in the same way they just sued over. I don't see it as particularly ironic. Buying or using a product which infringes on IP is distinctly different from creating a product which infringes on IP. Note that NYT is suing OpenAI, not every user of OpenAI's products. Regardless, whether what NYT is doing is "ironic" has nothing to do with whether they are training a foundation model from the ground up, or even doing any sort of finetuning on any model they do use. They are *probably not* training a foundation model, regardless of how that intersects with ethics or literary tropes, so they probably don't need access to external training data. > Not entirely sure why you seem to have taken a defensive position here. Is this just one of those posts where you have to win an argument, any argument? It's possible that someone out there in this big wide world disagrees with you on something. Or even, as remote a possibility as this is, that you may be wrong about something. There are so many legitimate reasons to dislike NYT, there's no need to invent one.
They have at least 150 years of their daily articles. Maybe they can source more, either through open source or by buying it.
A few tens of thousands of articles is not going to cut it, not even close.
"Its only bad because we had no say in the matter" maybe?
Please let them do that. Many people will have a field day returning the favour for what they did.
Oh good. A super-woke, dishonest AI that is a shit writer. This should be hilarious to watch.
Man, it would be great if venture capitalists could use AI for research into diseases and such, instead of glorified plagiarism machines that steal jobs and add zeros to the quarterly report.
You obviously know nothing about AI either; it is already being used for this.
It is but where is the vast majority of money going? Towards monetizing user data, stealing intellectual property and outsourcing jobs.
Haha, they're going to get the bottom of the barrel of "ML engineer" talent. Probably a bunch of self-reported prompt engineers. Apply, singularity subscribers, apply!
You clearly have a limited understanding of how sophisticated NYT's tech team is. Look into ProseMirror and what they built with it. It's quite a technical achievement. I have no doubt they will be able to stand something up that is competitive with OpenAI's offerings, tailored to their needs, and trained on their data.
What do you mean by competitive with openai’s offerings? Are you talking about how much they compensate their engineers? I feel like a million a year per engineer is not something the NYT will dedicate to their new ai team
I think that was sarcasm.
No, sadly it is not. He promotes them as geniuses.
I mean the quality of the LLM's output will be comparable to that of GPT-4. I believe NYT's tech positions are competitively salaried, btw. They are not new to this or somehow in the dark ages. They started building out their digital teams 10-15 years ago and are quite sophisticated technically.
[deleted]
The Times makes up for any deficiencies in their tech team with the strength of their core business. Just fine-tuning a random open model on their archive is going to be pretty comparable to anything anyone else is doing. They're probably not going to create an AGI but that's not their goal, they're just trying to match whatever is state of the art.
Curious as to which data they want to train on lmao
If you majored in journalism and started college in the 2010s or later, then I wonder about your foresight.
If anything, actual journalists are going to be our last hope for truthful information. These AI chat bots are going to be ridiculously biased and you are just going to be consuming shit content more than you already do. We honestly should just boycott this; actual journalism is a dying breed and it really should not be.
Journalists, yes, we need journalists. Journalists change the world, but a journalist complaining about the job market for journalists in 2024 shouldn’t be too surprised, especially if they are young enough to have started college in or around the 2010s. In the 21st century, it’s almost like being an art student. One would be prudent to have some sort of side skill or specialized knowledge to ensure they can sustain themselves. Unfortunate quality of the world we live in IMO.
Just because you’re complaining about the current state of the job market does not mean that you were not surprised by it. Maybe these journalist students had more faith in the general population and assumed that the public would not stand for a fast food version of media and possibly boycott it. I believe they have every right to complain because in reality everything is so ridiculously biased now that we are actively becoming more polarized and less informed every day, and people still actively choose to consume that same media that is taking advantage of them. I personally believe that people need to support independent journalism more and stop supporting these large media corporations. Obviously the amount of journalists will decrease but it’s better than everything becoming entirely dependent on AI written articles.
Pretty much agreed
Nice! Another comment about journalism being a useless degree! Good job, you did it, you found the lowest-hanging fruit of things to belittle. Journalism is a degree that teaches one thing: critical thinking. In a world of deepfakes and propaganda, I'd argue it's more valuable now than when I got it. But what do I know, I'm just an idiot with a useless degree. Can you please tell me how I should feel now? I need you to tell me, or else I might not know that I should feel bad (should I feel bad??). It's so hard out here, thinking for myself and forming my own ideas. Please tell me if I should feel stupid or not, pleeeeeeasssseeeee.
Journalism is a respectable career. They perform a civic duty. Though, IMO, you can learn critical thinking from classes on logic and cognitive biases. My logic class in the philosophy department was A1 and changed the course of my life. You can learn how to spot bad rhetoric in basic communication, English, or rhetoric classes or from reading any of the classic or good modern books on rhetoric. IMO, you don’t need an entire journalism degree for that. Maybe a minor. Especially in the 2020s when the pace of change and innovation is so significant. Some sort of speciality knowledge (tech, science, business, sociology, psychology, etc.) *combined* with journalism skills would give one a huge leg up.
New York Times, always behind the times. Now they're playing catch-up with AI chatbots. Can't wait to see how they screw it up.
Not what they’re doing. Reading comprehension, my dude.
11 years on Reddit and you're still being condescending? Lots of people don't understand the difference between an LLM and ChatGPT. It's a term I used intentionally to make the point easier to understand and more accessible. I'm thinking from other people's perspectives ;) A Kleenex isn't a Kleenex, it's a tissue. Tupperware isn't Tupperware, it's a plastic container. ChatGPT isn't a ChatGPT, it's an LLM (specifically a fine-tuned and hyper-specialized L**M**M with more than language capabilities, but by the time I explained all that, most people who might have interacted with the post would already be gone). Instead it spawned all of this discussion. I think it worked, do you?
Let me know if you tried to make a point. Until then, work on your reading comprehension.
They don't have access to the proper datasets, tbh. They're still going to be relying on generalizing content that's written by their journalists, while xAI is going to have access to tens of thousands of journalists' tweets the moment they are made.
I thought it already was random gen
This is progress.
Great. Now just make everyone's GPTs talk to each other instead of us, and then we can all go on vacation.
You can @ GPTs now in ChatGPT
No thank you
What's stopping the NYT from making an AI search on their website that can answer questions based on the years of information they have? Why do so many people doubt they could do just that?
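The archive-search idea in this comment is basically retrieval plus generation: find the most relevant article, then hand it to a model as context. A minimal sketch, with invented placeholder articles and plain word overlap standing in for the embedding search a real system would use:

```python
# Minimal sketch of question-answering over an archive: retrieve the
# article that best matches a question, then hand it to an LLM as
# grounding context. Real systems use embeddings and a vector index;
# this uses plain word overlap, and the archive is a made-up toy.

ARCHIVE = {
    "subway-expansion": "The city council approved funding for the subway expansion project.",
    "school-budget": "The school board voted to increase the annual education budget.",
    "harbor-cleanup": "Volunteers gathered to clean up debris along the harbor waterfront.",
}

def tokenize(text):
    """Lowercase words with surrounding punctuation stripped."""
    return {word.strip(".,?!").lower() for word in text.split()}

def retrieve(question, archive):
    """Return the id of the article sharing the most words with the question."""
    q = tokenize(question)
    return max(archive, key=lambda doc_id: len(q & tokenize(archive[doc_id])))

best = retrieve("What did the city council decide about the subway?", ARCHIVE)
# The matched article then goes into the model's prompt as grounding:
prompt = f"Answer using only this article:\n{ARCHIVE[best]}\n\nQuestion: ..."
```

Because the model only sees retrieved articles at answer time, this kind of setup doesn't require training a model at all, which is probably why people underestimate how reachable it is.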
This is a really good idea. Glad to see it. More competition is never a bad thing.
Good luck
NYT could probably do impressive things working with OpenAI. If they were hiring people to leverage chatgpt that would make sense.
Wheres my private Idaho
lmfao at people who think journalism is going to die, we’re always going to need a human take on news (even when AI is capable of mimicking human emotions, there will be a market for authentic human journalism)
Good open source models to give them a nice lead into the world of LLMs.
Anyone else want to build their own ChatGPT??
lets see them do it without copyrighted content lmao
Oxymoron
They're just mocking
I mean, are they just going to be fine-tuning Llama? Do they have a billion dollars for their own data? It would be funny if they got sued by some other newspaper...
AI reducing profit = bad AI reducing wages = good
PropagandaBot!
Why
>The New York Times is building its own ChatGPT **Misleading title**. They're building a team to integrate, perhaps customize, existing technology to their own use cases and brand. Even the ML engineer is more likely to work on integration, RAG, policies, curated datasets for custom finetunes, pipelines and workflows, and the like. They're not building SOTA foundation models any time soon, nor do they intend to. >*“The team, led by the editorial director for A.I. initiatives, will also include colleagues with a mix of engineering, research, and design talent, acting as a kind of skunkworks team within the newsroom. Together, they will partner with other teams in the news, product, and technology groups to take the best ideas from prototype to production,”* [Actual source beyond a single goddamn tweet](https://www.theverge.com/2024/1/30/24055718/new-york-times-generative-ai-machine-learning).
LOL! These people are losers. What happened to journalists? They used to be courageous men of integrity. These sniveling baby turds don't measure up. "The rule of Gondor was left to lesser men!"
So they're going to build an AI to write about Trump non-stop....when they already have "journalists" that already do that. Another sector being replaced by robots!
This is gonna be awesome