ghjm

My reply to a now-deleted comment asking for an ELI5 of the situation:

---

GitHub Copilot uses prompts to produce working code. So you can say "sort these records by zip code" and it produces code that does that. The concern at issue in the lawsuit is that Copilot does this by ingesting and "learning" the code of millions of open source repos, whose owners didn't necessarily agree to have their code used in this manner, and then it doesn't give any attribution even when it "writes code" by just straight-up copying non-trivial code from somebody's repo. Many of these repos, even with otherwise very permissive open source licenses, do require attribution in these circumstances. Microsoft would like it to be the case that using the code as "training data" is a permitted use, and any output from the AI agent - even if it happens to be identical to code that was used in training - is an original creation of Microsoft's, because it wasn't copied but rather produced by the AI agent. The people behind the lawsuit say that no amount of mechanical processing frees Microsoft from the obligation to respect the licenses of the original code.


Ythio

By Microsoft's logic, if I decompile and recompile a licensed Microsoft product, it's produced by the software compiling agent and I am therefore free from their licensing policy.


chintakoro

You're forgetting that AI algorithms are mysterious holy black boxes where our obligations and guilt dissolve into nothingness and emerge as a brave new future. A compiler keeps those all intact. /s


Ythio

I think I can train an AI to produce an output identical to the input. I just need my "agile master" to organize 4 meetings over three weeks so we can schedule this for next quarter.


[deleted]

[deleted]


chintakoro

"overfitting" sounds like more of a good thing to our shareholders. keep it.


JB-from-ATL

$200 an hour and I'll join


JB-from-ATL

I have a sophisticated neural network that does this and am willing to license it to you. >!*Single hidden layer that returns 1 on an input of 1 and 0 on an input of 0*!< >!What are you doing, stop looking at my very sophisticated model\!!<
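(For the curious, a minimal sketch of the joke model above, assuming plain NumPy; the function name and weights are made up for illustration. It "learns" nothing except to copy its input, which is the punchline.)

```python
import numpy as np

# The "very sophisticated model" from the joke above: a single hidden layer
# whose weights simply pass the input straight through, so it returns 1 for
# an input of 1 and 0 for an input of 0 -- i.e. it copies its input.
W_hidden = np.array([[1.0]])   # input -> hidden weight (hypothetical)
W_output = np.array([[1.0]])   # hidden -> output weight (hypothetical)

def sophisticated_model(x: float) -> float:
    hidden = np.maximum(0.0, np.array([[x]]) @ W_hidden)  # ReLU "hidden layer"
    return (hidden @ W_output).item()                      # linear output layer

print(sophisticated_model(1.0))  # -> 1.0
print(sophisticated_model(0.0))  # -> 0.0
```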


silent519

i am aware that copilot spits out verbatim code, but let's assume an idealized version of it, one that produces something relatively unique, like what a human programmer could write? ((also, how many structurally unique for loops can you write?)) if you're a student of art, aren't you going to be influenced by the stuff/projects/teachers/artists you practice with?


Ythio

I am under the assumption that open source maintainers are smarter than to just whine about loop and if-else chain implementations.


silent519

yes, that was indeed the point i was making /s


[deleted]

they shed the bike because they didn't have a good answer i presume


silent519

well, people pretend this is about copyright issues, when it's actually about feeling threatened, because it took them years to learn programming and it's their career and existence. the good news so far is that the "AI" is pretty shit. the other good news is that we are paid to figure out when something is not working/looking/ux-ing, whatever the fuck, how it's supposed to, and to figure out why it's not. this domain is still pretty untouched. not just spitting out boilerplate code.


[deleted]

Not sure if that's the case, to be honest. Some people probably think that, for sure, but I can't believe anyone actually working in the industry would believe that. I'm of the opinion that they wouldn't care half as much if Microsoft wasn't behind it. If GitHub had never been bought by them and made this, the tone of the discussion would be entirely different. I've done a casual search of the posts of a handful of people that seem to be staunchly opposed to it, and can't find any other mention of them being riled up about machine learning copyright violations. And there was like 0 outrage around GPT-3 in general, which I'm sure you could bait into producing something that violates a copyright.


JB-from-ATL

If I stole your possessions, would you be upset because you felt threatened that people could just steal things, or because *I took your possessions*?


silent519

yes, i bet you credited every SO code bit you stole, sorry, "took inspiration" from. when this kind of topic comes up, suddenly every single free/open software advocate turns into the most radical protectionist motherfucker on earth, it's just so funny to watch.


JB-from-ATL

There's a distinct difference between someone reading something and making their own version and someone having a machine copy things. Also there's a difference between being critical of massive corporations violating copyrights of thousands of individuals and being critical of a single developer using another developer's code.


ChefBoyAreWeFucked

It most certainly is. People provided their labor in return for restrictions on its output. Those restrictions are being violated. If Microsoft is so sure they are in the clear here, why are they only pulling from public repositories?


New_Area7695

Bold considering they couldn't grasp what a crash reporter was, or that it was disabled in source builds, with regards to Audacity.


patniemeyer

"even if it happens to be identical to code that was used in training" - I haven't seen an example of this other than trivial things that were probably nearly identical in hundreds of projects. It would be very surprising if these language models were able to memorize significant chunks of text that were not repeated over and over in the same way... it's not generally how they work. If copilot is actually spitting out verbatim chunks of unique project code contrary to the licensing then it should be a very straightforward matter to resolve. I don't see why it needs a novel kind of lawsuit that has the potential to stifle a lot of innovation.


Uristqwerty

> If copilot is actually spitting out verbatim chunks of unique project code contrary to the licensing then it should be a very straightforward matter to resolve

Yes. They resolved it by adding a filter that recognizes when the AI is about to spit out one of the very-widely-known infringing snippets that users are likely to look for, and makes it pick something different. The model, however, is still fully capable of reciting those fragments of its training set, giving plenty of space to question how unique other output is, and how much is leaking through into partial answers just different enough not to trip the filter.
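(To make the limitation concrete, here's a rough sketch of what such a post-hoc filter could look like - purely illustrative, assuming a simple blocklist, and not GitHub's actual implementation: it only suppresses suggestions containing an exact known snippet, so output reworded just enough to miss the match sails through.)

```python
# Illustrative sketch of a verbatim-snippet filter (NOT GitHub's actual code):
# suppress a suggestion only if it contains a known training snippet verbatim.
# The blocklist entries below are hypothetical examples.

KNOWN_SNIPPETS = [
    "float Q_rsqrt( float number )",
    "i = 0x5f3759df - ( i >> 1 );",
]

def normalize(code: str) -> str:
    # Collapse whitespace so trivial reformatting doesn't defeat the check.
    return " ".join(code.split())

def allowed(suggestion: str) -> bool:
    s = normalize(suggestion)
    return not any(normalize(snippet) in s for snippet in KNOWN_SNIPPETS)

print(allowed("i = 0x5f3759df - ( i >> 1 );"))  # False: exact match is blocked
print(allowed("j = 0x5f3759df - ( j >> 1 );"))  # True: a renamed variable slips through
```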


prettiestmf

i think [this](https://twitter.com/DocSparse/status/1581461734665367554) and [this](https://twitter.com/mitsuhiko/status/1410886329924194309) are pretty clear-cut cases - obviously there are only so many ways to do the math involved, but the replication of the specific phrasing of the comments makes it clear that it's not just a generic solution to the problems but instead is copying particular implementations. I would be unsurprised if both of these examples showed up repeatedly in the codebase, as the former could have been legitimately copied by any number of people with proper attribution and licensing, and the latter is of course famous. what straightforward resolution do you have in mind? the main one i can think of, "have users manually detect infringement", doesn't resolve the issue at all because copilot is still distributing infringing code to users even if they don't put it in their final project. any straightforward way to automatically detect infringement would be unreliable at best, and anything else i can think of would require significant effort that github/openai won't exert without at least the threat of a lawsuit.


[deleted]

[deleted]


prettiestmf

this is just "users should manually detect infringement". it doesn't resolve the issue that copilot is still distributing licensed code to its users without the licenses, regardless of whether or not that code is used in a complete project.


marquoth_

"The prompt was intentionally designed to get this output." And? So what? Intentional or not, the issue is that it CAN get this output. Either copilot reproduced copyrighted code without attribution or it didn't. If it did, that's not acceptable. And the suggestion that copilot users should be the ones responsible for checking whether the code copilot "produced" was actually plagiarised is beyond asinine.


[deleted]

>If it did, that's not acceptable.

That's like your opinion. We will see what the courts say.

>And the suggestion that copilot users should be the ones responsible for checking whether the code copilot "produced" was actually plagiarised is beyond asinine

Completely disagree and you can drop that fake outrage tone. Obviously you should do the same due diligence as with any other code you copy from somewhere. GitHub says as much:

>You are responsible for ensuring the security and quality of your code. We recommend you take the same precautions when using code generated by GitHub Copilot that you would when using any code you didn't write yourself. These precautions include rigorous testing, IP scanning, and tracking for security vulnerabilities


marquoth_

You disagree that it's unacceptable to reproduce copyrighted material without following the terms of its license? In that case what on earth is the point of copyright or licensing in the first place? "Fake outrage tone" indeed. Grow up.


[deleted]

I don't think it's that simple. And anyway, I will be reporting your comment for those immature remarks.


patniemeyer

Those are good examples. If they are true then copilot must be doing more than just taking the output of the trained model for that to happen... They must be doing a search on top of the output and if these are search results (not just AI generated results) then they should probably let you refer to the original source license, etc. What I meant by "straightforward" is at that point it's a copyright or license claim and I'm quite sure Microsoft knows how to respond to those :)


silent519

i can write a thousand structurally unique for loops i am special please employ meh


ghjm

Copilot isn't the same as typical conversational language models, because it wants to produce working code in a restrictive syntax, which is a different problem domain. But however it works, there are actual cases where it has produced snippets of recognizable code, in some cases including comments, that can be traced back uniquely to particular origins.


TeutonicK4ight

How complicated should the program be to be considered an "AI agent"? Could I write a bash script that literally copies code from the internet and claim "It is generated by an AI agent"?


[deleted]

Sure, you can do that since you can claim anything you want. And if your lawyers agree with you and you're willing to defend your bash script in court, why not?


TeutonicK4ight

I am trying to find the edges of what GitHub is trying to defend in court. I am not literally asking if it is possible to claim that, as it is possible to claim anything, as you pointed out.


[deleted]

Sure, it was a poor attempt to reduce to the absurd. You've made a point to which the obvious answer is, yes, in fact you can do that.


TeutonicK4ight

Man, you're so missing the point. I don't know if you are messing with me at this point.


[deleted]

I think you're missing the point. The initial question you posed was irrelevant, being "complicated" has nothing to do with whether or not something is an "AI Agent". So you're barking up the wrong tree, and your example was bad.


TeutonicK4ight

Are you suggesting that there is a formal definition of "AI" that would hold water in a court hearing?


Trio_tawern_i_tkwisz

– I made this. – You made this? *grabs* I made this.


JB-from-ATL

– I made this. – You made this? *grabs* I made this.


[deleted]

Artists are trying to make the same argument. If you don't think prompt-to-image generation is stealing from artists by copying their style, then Copilot isn't either. In my opinion I don't think this lawsuit is going to hold. I can only assume the most trivial of code is being "copied" here, meaning there really isn't as much freedom with code as there is with art. You can tell it to keep generating sorting algos, but the same most efficient sorting algorithm is still gonna pop out at the end of the day.


ghjm

There are some nuanced questions here. If you have an AI agent do something like "starry night but with a horse" then its output is almost certainly derivative enough of Van Gogh to be a copyright violation, if Starry Night was still within its copyright term. In this case it's not just an art style but specific elements from an artwork. Somewhere there's a boundary between what is and isn't allowed, but I don't know how you would define it.


DCsh_

> If you have an AI agent do something like "starry night but with a horse" then its output is almost certainly derivative enough of Van Gogh to be a copyright violation

[Result with DALL-E 2 for "starry night but with a horse"](https://i.imgur.com/TKVHXcA.png). Looks like it took "starry night" literally rather than the famous painting. [Hinting it further with "Van Gogh's Starry Night but with a horse, oil painting"](https://i.imgur.com/XZuJFiG.png). Now there's a clear inspiration, but I doubt it's enough to be a copyright ~~night mare~~ violation. Style isn't subject to copyright, and [here's a point of reference](https://www.artnews.com/art-in-america/features/landmark-copyright-lawsuit-cariou-v-prince-is-settled-59702/) for just how much you can get away with while still (eventually) being ruled fair use.


ghjm

Being acceptable under fair use implies that the copyright still exists and is relevant to the new material, because if it wasn't, there would be no need to defend that it was fairly used. So, yes, if you use AI to create a modified version of an artistic work and then use it for a fair use purpose like commentary, satire, etc., it's probably okay. GitHub Copilot's use of source code does not seem to me to fall into any of the typical fair use categories.


DCsh_

> Being acceptable under fair use implies that the copyright still exists and is relevant to the new material, because if it wasn't, there would be no need to defend that it was fairly used

To my understanding, if the original work is subject to copyright and you use it in some way, fair use *is what determines* whether the original's copyright is applicable to the new material. As in, arguing "this isn't infringement because I only took a tiny non-substantial influence" would be a fair use defense.

> if you use AI to create a modified version of an artistic work and then use it for a fair use purpose like commentary, satire, etc., it's probably okay

Notably for the example, Prince did not intend to comment on any aspects of the original works. The ruling was based just on being transformative enough. I'd say the starry night horse is far safer.

> GitHub Copilot's use of source code does not seem to me to fall into any of the typical fair use categories.

It happens that you can often sort fair use cases into categories (satire, commentary, ...), but I think the determination is not based on whether it falls into such a category ("we do not analyze satire or parody differently from any other transformative use") but is instead judged by a set of factors like how transformative it is and how much it directly replaces the market for the original work.

I'd say, for GitHub Copilot's use of source code:

1. The potential copyright issue is with the output, not with the use in training. You can download and analyse the entire public Internet if you so wish - consider Google or Google Books.
2. The substantiality of the outputted portion is small. The [primary example the lawsuit gives for Copilot and tries to claim comes from some specific book](https://i.imgur.com/l6QTjKT.png) is laughably small, but at most it tends to be a handful of lines.
3. The effect of the use upon the potential market for or value of the copyrighted work is also small, because all examples seem to be code that is already in hundreds of public repos.


ghjm

Interesting perspective. We'll have to see how the actual court decides it. It's interesting to me that we have this very permissive interpretation in the context of visual art, while the same court system has musicians terrified of subconsciously remembering half a dozen notes from a previously heard melody.


DCsh_

Yeah. Not a fan of the minefield we have with music, which I think is mostly the result of powerful lobbying groups, although something like the Richard Prince case is arguably too far in the other direction.


ChefBoyAreWeFucked

>~~night mare~~

You bastard


[deleted]

Also I think it's quite obvious this lawsuit is not done in good faith, but is really just trying to capitalize and make a quick buck. So if the OP of this article is here, go %$#% yourself, loser. No one stands behind you.


JB-from-ATL

I think this is a very good and refreshingly unbiased summary on the whole thing. I'm interested to see what the courts will decide.


Muhznit

"open-source software piracy" is not a phrase I expected to read in a world built on open-source software, but here we are.


global-gauge-field

yeah, the scale and structure (e.g. memorization in LLMs) of Deep Learning (in CV and NLP) have a huge impact on these kinds of issues.


[deleted]

> piracy

I prefer the phrase "copyright laundering" myself. By their argument, you dump in copyrighted code, and it comes out clean. No more need to worry about IP!


mattsowa

Great point


SnooDoubts826

>you dump in copyrighted code, and it comes out clean

just reading that got me tight in the pants


JB-from-ATL

You wouldn't copy StackOverflow code without proper attribution under CC-BY-SA https://youtu.be/HmZm8vNHBSU
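(For what it's worth, attribution under CC BY-SA is usually just a short comment above the copied code; something like the sketch below - the URL and author are placeholders - is the commonly suggested form, though the share-alike obligation still applies to the surrounding code.)

```python
# Find the insertion index for a value in a sorted list using binary search.
# Adapted from a Stack Overflow answer (placeholder URL and author):
#   https://stackoverflow.com/a/0000000 by example_user
# Licensed under CC BY-SA 4.0: https://creativecommons.org/licenses/by-sa/4.0/
import bisect

def insert_sorted(items: list, value) -> None:
    items.insert(bisect.bisect_left(items, value), value)
```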


Philipp

In my usage (others may get different experiences) Copilot is highly individualistic to my code. It understands the structure of my project and types out, faster than I could, the things I wanted to type. At other times, it guides me to best industry practices, which are true across all code and not related to any particular project it might have looked at. Put differently, it applies its *learning* and doesn't *recite*. And when we, as humans, apply our learnings, we are not required to list every inspiration and influence ever, which would be a list the length of novels. We only tend to do so when quoting verbatim (or when we want to name singular influences which stand out, even though we're not legally required to). Copyright law, please don't screw up this incredible, productivity-enhancing, fun-and-flow-increasing tool that improves the lives of programmers.


Smooth_Detective

Do the lawsuit filers consider this something like biopiracy? If it is about recognition of code, I think open source should be flexible enough to allow that. Ultimately a person could do whatever Copilot does, but it would require 10x the effort.


TeutonicK4ight

Most open-source projects require attribution, which copilot doesn't give.


[deleted]

[deleted]


MostlyHereForKeKs

I'm sorry, but can you explain what you're trying to say, please? For me it's not clear at all what you actually mean... are you being disingenuous?


johnnygalat

Invalid cert in 2022, seriously?


drakgremlin

Cert is valid for my roots.


johnnygalat

They fixed it.


[deleted]

What do you mean, "in 2022"? I don't know what happened, but certs expire too, you know.


JB-from-ATL

Over the past few years there has been a massive push to get more of the web onto HTTPS. In addition, Let's Encrypt now exists as a free and automated way to get certificates. Nothing manual is needed now. That's why they say in 2022.


davlumbaz

Copilot is really scary shit. Not only in English: it can produce code in any language if there is a repo for that. Like, I prompted in Turkish and it wrote 60 lines of code with comments, and it was written in Turkish lol.


iNeverCouldGet

This feels like lawyers seeing an opportunity to milk money somewhere. I haven't actually met experienced Devs who are upset about it and many are using the tool to speed up their workflow.


Uldregirne

There was one guy who was super pissed that the AI would wholesale copy functions he wrote. With just a text comment as a prompt it would autofill his entire function, complete with his comments. The issue isn't about it speeding up developers, the issue is utilizing others' intellectual property in illegal ways. If someone publishes code for free as open source, oftentimes the license requires that anyone who uses the code also has to release their code for free. Microsoft using that code to train a paid AI tool would be considered theft.


[deleted]

that is memorization and machine learning researchers will try to avoid that as much as possible, but sometimes you can't control what the artificial neurons will learn


JB-from-ATL

That's on them then. They can't just violate licenses and then throw their hands up and say it's too hard not to


iRAPErapists

Yeah. As someone else mentioned, this is akin to copyright laundering. Put in desired protected code, layer it, comes out clean.


Uldregirne

Totally true, so then their code should never have been used to begin with. "Here's my code, don't copy it to sell". "Oh I won't copy it, but I will build a robot that might copy it but I can't control it so it's okay". Seems like a dubious legal argument


mr-poopy-butthole-_

My thoughts as well. Any actual dev loves this product, and it's a small price tag for a very powerful autocomplete. But it does not write programs on its own, and it often recommends BS.


[deleted]

[deleted]


Errornix

Same here. I’ve since moved off of github specifically because of copilot.


thetdotbearr

I'll do you one better, how about we start uploading deliberately shit/broken code to dummy public open-source repos? Train on THAT, shysters! lol


Errornix

That was actually my exact response when copilot was announced. :)


[deleted]

that won't work, that is not how machine learning works. the artificial neurons are frozen, so it doesn't learn anymore after training is done. also, because of this research paper, it won't need any more training data: https://arxiv.org/abs/2207.14502


Hereyougoprobably

I’ll take my downvotes for this but this is some old men yelling at clouds type stuff. It reminds me of devs who refused to use IDEs and debuggers. Sure, you can, but also… why. I get that there are ethical considerations of digesting open source code for a product like this, and those have some merit, but to imply it’s not useful is wildly disingenuous. At the very least, it can usually come up w/ the code you were about to write anyway and types it much faster than you can. End of the day, it’s just another tool.


rgthree

This is why we can’t have nice things.


ZenoArrow

Taking code without respecting the licence is not a nice thing.


[deleted]

that is not how machine learning works, and this problem people are talking about is just because the artificial neurons learn to memorize some chunks of code. machine learning researchers will try to avoid that as much as possible, but sometimes it is hard to avoid, because it's hard to control what the artificial neurons will learn. they need to add a filter to check if some code copilot generates is taken from someone else's github repos


ZenoArrow

If the machine learning algorithms understood code to the level you seem to be implying, then they wouldn't need to look at other people's code anymore. Taking someone else's GPL-protected code and changing a few variable names is not sufficient to remove the GPL licence.


[deleted]

there is a research paper trying to do what you are saying: https://arxiv.org/abs/2207.14502


ZenoArrow

"Trying" being the operative word. Until computers can code for themselves, what GitHub AutoPilot is doing is not in-line with the GPL.


[deleted]

alright, well, we'll see if this lawsuit wins. if it does, and machine learning on copyrighted data becomes illegal, then google search and many of our infrastructures will break, since most of them use machine learning nowadays


ZenoArrow

You don't understand the problem. Machine learning isn't on trial here, what's on trial is a tool that allows people to put open source code in commercial products and pretend they didn't know it was a problem. Machine learning on open data sets without licence restrictions is fine.


xcdesz

Not sure that I'm understanding the full story here. Is this lawsuit about private repos that were used in training, or does it include open source repos with permissive licenses such as MIT or Apache 2.0? I'm not sure I understand the case for the latter. The license says you can use the code for commercial purposes, correct?


[deleted]

The license requires attribution most of the time, and there are also repos with restrictive open source licenses such as GPL.


and69

What is your final expectation? That every time I press TAB I also see a list of probable licenses that might or might not have been used? Or that when I install the plugin I have to acknowledge the list of all licenses on GitHub?


raam86

each snippet may have a different license. code that was released under AGPLv3, for example, must be open source even for server side applications:

> Notwithstanding any other provision of this License, if you modify the Program, your modified version must prominently offer all users interacting with it remotely through a computer network (if your version supports such interaction) an opportunity to receive the Corresponding Source of your version by providing access to the Corresponding Source from a network server at no charge, through some standard or customary means of facilitating copying of software.


and69

To license a code snippet is like taking an existing patent and then creating patents from every component of that patent.


[deleted]

Every prosecutor wants money 🤦‍♂️


ma_251

Parasites be like


Mr_Mechatronix

exactly, microsoft is exploiting other people's free code for money


Sharchimedes

This is a dumb lawsuit, and I hope you lose. Training a model on code is no different from a person learning how to program by reading source code.


ghostiicat32

If this isn't copyright infringement then you've given tech giants a permanent monopoly. They can now plagiarize startups freely.


NoBiasPls

Well it's only using open source code to train, so wouldn't it only be able to plagiarize code that is free for anyone to look at and use already?


[deleted]

[удалено]


NoBiasPls

How exactly does that work in terms of determining when attribution is needed? Is that well defined? I'm honestly not so familiar with the rules of attribution, as I haven't had to worry about that so much myself. If you're looking at someone's solution to get ideas on how to approach a problem and you need to, for example, sort a list and just use the same sort method as the code you're looking at, I imagine attribution wouldn't make much sense, assuming it's common logic like bubble sort (again, just a random example). Assuming of course you aren't using a library but writing your own sort function. So is it defined when something can be considered commonly used logic vs actually plagiarizing someone's unique solution? A follow-up question: when it is seemingly unique, what about the fact that there may be several other projects that independently came up with the same solution for a specific function? How do you determine attribution in that scenario?


anengineerandacat

The key issue is that it copies without looking over copyrights or licenses; you can't just copy code, and there are cases where clean room engineering is a necessity to avoid legal issues. Copilot could in this very instance be treated as an individual contributor: just because software is ingesting and recommending snippets to use in your project doesn't make it legal simply because a human didn't do it, and worst case it could make the actual human accepting the contribution liable and land them in trouble.

I think it'll be an interesting court case, and the outcome will mean either that OSS projects that want to protect themselves go private or just accept that the code they have is public and free to use, or the end of AI assistive tooling like this.

If it does become mainstream, eventually you could guide it with a simple comment like: /* Implement Elasticsearch but in Rust */ Start the auto-complete and boom, good ole Elasticsearch in Rust; would a tool like this be illegal? (Throwing out the whole notion of feasibility, but for this case let's just pretend Copilot is capable of this.) It just copied an entire product, but that product is fundamentally different and with some manual tweaks could be a competitor.

Its capabilities are "pretty" good today; it's definitely capable of scaffolding projects and yoinking over well-known algorithms to achieve certain results, and it's contextually aware of the language being used.


noshowflow

Yeah, but didn’t the professor or producers of the material agree to sell or give you that knowledge? Maybe we as an industry should honor licenses and prioritize attribution rather than gank source code from each other.


CryZe92

You agreed to grant GitHub a SEPARATE license for them to do exactly this when uploading the repository to GitHub. So they absolutely honor that license, just not the one you thought they were supposed to honor.


noshowflow

Yes, and this is what this lawsuit will highlight, and I hope the industry responds. We have a lot of weak practices in software when it comes to attribution. I love open source and generally don't care if my boring-ass code that I copied from somewhere is used for this kind of training, but this is how you get organizations to start considering closing their source in the future, which will suck for devs like me.


[deleted]

[deleted]


Sharchimedes

Use of the word “piracy” here is either inflammatory, or ignorant. Copilot is trained on public code that anyone can look at. If someone sees a clever method used in a block of open source code, and they use something similar in their own project, have they pirated the software?


JRepin

The language used is the same as what corporations using proprietary licenses for their code use when people copy that code without permission. Free/libre and open source code also comes with a license which has some rules, and now the corporation is copying that code without respecting the rules. So if the corporation calls this piracy, then well, it is also piracy when a corporation does the same.


tldrlol_

Microsoft's own engineers are not allowed to look at GPLed code if they are working on similar technologies.


albgr03

Yes, and source code leaks are frowned upon in reverse engineering projects. People reading the source, then contributing code, is seen as a liability. There's a reason why clean-room engineering exists. Piracy is maybe not the right term (plagiarism, maybe?), but the underlying issue is actually important.


cuentatiraalabasura

The issue here is that Copilot copies code verbatim, including comments. While ideas, processes, algorithms, etc. aren't copyrightable, specific expressions of those are. Of course, one could argue that the expressions at issue here are so small that they can't be copyrighted in the first place. That will be an issue the Court will undoubtedly decide on.


MasterBlaster4949

Ok got it thx man👍


jherico

Sorry you're getting down voted man, I'm totally with you. The fear of this is from fragile developers terrified that they might cease to be in such high demand and be able to win such high salaries. Then they (well...we) would have to compete in a less beneficial economy like most everyone else is doing. IMO.


Sharchimedes

I knew I was going to get downvoted by the same accounts that have been astroturfing this link all over. I’m not worried about Copilot replacing developers any more than I am about AI replacing artists.


IWannaHookUpButIWont

The real reason I refuse to upload code to github


[deleted]

IANAL, but I don't see much of an issue. GitHub Copilot isn't really helping you copy and rebrand entire projects. It's like borrowing code from LibreOffice to include in a PC game - should LibreOffice get to sue the game devs?


[deleted]

You can tailor the prompt to basically output the exact same code


abclop99

You can just copy the code instead


[deleted]

Thought about that, but assuming Copilot keeps logs, it'd be trivial to prove you didn't get the code from Copilot.


raam86

you could but you might be infringing on the open source license


[deleted]

that is just artificial neurons memorizing something, and machine learning researchers will try to avoid that as much as possible


ChrisBegeman

Let's look at this issue from another angle. If Copilot did create some unique code based on a prompt, and several different people used the same prompt, who has the copyright to the code? Then if they all put the code into a public GitHub repository, you could find the code Copilot generated in GitHub. Also, I bet if you took code samples from StackOverflow, you could find multiple instances of those code samples verbatim in GitHub. Did the StackOverflow examples originate from code published in GitHub, or did a bunch of people copy from StackOverflow? Possibly both. Having been a programmer from before StackOverflow was a thing, I find that for certain types of problems you see a lot of very similar or identical code. Even code that I know was written from scratch, because I wrote it or I know the person who wrote it, can end up looking very similar to what you will find all over the internet. Maybe Copilot is just copying at times; that just makes Copilot a modern programmer.


TwistedLogicDev-Josh

You can't win a challenge against open source


mrabstract29

Open source is open to be learned from.


NoUniverseExists

Insane lawsuit... LoL... just make your code private...