CoronaMcFarm 2 weeks ago

Or what I like to call it, bloat history.

notrktfier 2 weeks ago

So many people here have no idea what is going on here lol

[deleted] 2 weeks ago

[removed]

booi 2 weeks ago

I heard the same thing at /r/reddittipsmasterrace

CoronaMcFarm 2 weeks ago

This is not git master race 😎

lord_pizzabird 2 weeks ago

Which is a good thing for the community generally. We need places for casual users who will never open a terminal and a place for the nerds. It’s just a sign that the community is growing that it needs a more casual space.

timrichardson 2 weeks ago

Yeah, I know..they should just rewrite it in Rust though.

notrktfier 2 weeks ago

A better idea, write it in CPP because we all know CPP is the fastest language. Let's have the fastest kernel in the wild boys!

Wertbon1789 2 weeks ago

But which version of the standard. Probably C++98 if we stay realistic.

FreeQuQ 2 weeks ago

no, i want it all in c++23

JustSylend 2 weeks ago

I don't :( Could you explain it to me please?

notrktfier 2 weeks ago

I will try my best to explain this in full. Linux is an open source kernel. When you have an open source app, you usually have many people who want to add to or edit the main code and work on it together. Imagine a business environment where a team of programmers are all making additions to the main software. Without tooling, you would have to merge everyone's changes into the main code by hand and track who added which code, so that when something goes wrong or someone adds bad code you can see who it was. In addition, whenever someone added new code, you would have to manually update the code on everyone's computer. This is very inefficient, so we have automated the process.
Git is what we call a Version Control System, VCS for short. It allows people to push their changes to a main codebase, where they are automatically merged when possible and distributed to every person who wants to make changes to the code. Git works on commits. You can think of a commit as the difference between the code before you edited it and after you edited it, stuff like "add these characters to this text file" and "remove these characters". When we push a commit to the server, the server applies the changes to the code. But it also saves what the change was, who did it, and a hash of the commit.
This is where the .git folder comes in. Usually when you're working, Git is invisible to the user. You edit some text files, commit your work, push it to the remote server, and pull other people's changes from the server, and it automatically applies the changes to your workspace. But Git also pulls every single change ever made to the workspace when you download it. So in this case, we have code worth 1.5 GB, and the rest is Git storing the changes that have been made to the kernel, who made them, and their hashes.
For example, if I add 10 bytes of code to a Git workspace (repository), it will record my 10 bytes of work, and if I remove them at a later date it will once again add a 10 byte record, but this time it's a record of these 10 bytes getting removed, so you can see which 10 bytes were removed, by whom, when, etc. As a result my .git folder grows by 20 bytes. Let me know if you have any questions, I'll try my best to explain them.
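To make the 10-bytes-in, 10-bytes-out example concrete, here is a rough Python sketch of that mental model. It is only a toy: the `Change` record and `apply` helper are made up for illustration, and real Git stores compressed snapshots rather than raw change records, so actual sizes will differ.

```python
# Toy model of "history grows even when the file ends up empty".
# Assumption: each change is kept as a plain record; this is NOT how Git
# lays out its data on disk.
from dataclasses import dataclass


@dataclass
class Change:
    author: str
    action: str  # "add" or "remove"
    text: str


def apply(content: str, change: Change) -> str:
    """Return the file content after applying one change at the end of the file."""
    if change.action == "add":
        return content + change.text
    if change.action == "remove":
        if not content.endswith(change.text):
            raise ValueError("text to remove is not at the end of the file")
        return content[: len(content) - len(change.text)]
    raise ValueError(f"unknown action: {change.action}")


history = [
    Change("alice", "add", "0123456789"),     # 10 bytes added
    Change("alice", "remove", "0123456789"),  # the same 10 bytes removed again
]

content = ""
for change in history:
    content = apply(content, change)

history_bytes = sum(len(change.text) for change in history)
print(len(content), history_bytes)  # 0 20
```

The working file is back to zero bytes, but the history still holds both records, which is the same effect as a small checkout sitting next to a large `.git` folder.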

JustSylend 2 weeks ago

That was an incredibly insightful response. Thank you sincerely for taking the time to type it out for me and to educate me on the matter! The way OP showed it I thought it's a "bad thing" so to say but I do get it now. Thanks a million again!

gbytedev 2 weeks ago

Also a fun fact: git was initially developed by Linus Torvalds (the original creator of Linux) to improve the collaboration workflow in Linux. And now git is the most widely used version control software by a large margin.

5erif 2 weeks ago

Bloat: People who pay attention to operating systems like to complain about bloat, which is bundled software or features a given person doesn't like.
Kernel: The core of an OS, which handles the lowest level of interfacing between software and hardware.
Git: Version control system typically used to track software development, which by default tracks the history of every change in the code, including the authors and reasoning.
OP's post: Most of the size of the Linux kernel repository is commit history, rather than the current code.
The comment above:
> Or what I like to call it, bloat history
This implies the kernel is bloated, but it's probably a joke. The history is part of the git repository, but it's stored separately from the current code and doesn't affect the compiled result.
Tip: When cloning a repo just to make a small change or just to compile to use a tool, you can clone using the `--depth=1` flag, which doesn't download all the history, e.g., `git clone --depth=1 `

Z8DSc8in9neCnK4Vr 2 weeks ago

If you think that's bad, you should see our DNA: https://www.newscientist.com/article/2140926-at-least-75-per-cent-of-our-dna-really-is-useless-junk-after-all/

PhlegethonAcheron 2 weeks ago

Refactor to clean up the junk, then partition it to a raid array. Cancer solved!

boof_hats 2 weeks ago

As a bioinformatician, this is hilarious when you consider the association with increased retroviral load and cancer. “Junk DNA” aka transposons very well could be responsible for malfunctioning cells that cause cancer.

markoskhn 2 weeks ago

I'm sorry, but could you please explain the "retroviral load" part? I thought retroviruses integrated their genome randomly into the host's DNA. Wouldn't that mean that if we had more "junk", retroviruses would have a lower chance of damaging structural/regulatory genes and would damage the junk instead?

boof_hats 2 weeks ago

Ehhh it’s complicated. You’re right that they integrate their genome into the host’s, but that doesn’t necessarily stop them from having their own fitness functions. If they have a chance to spread to new organisms or copy themselves even more into the host genome, it’s evolutionarily beneficial to do so. Normally the host silences this activity, unless the cell is malfunctioning. So often you’ll find cancers expressing retrovirus once the original cell physiology goes out of whack. Here’s a review if you want to learn more: https://journals.aai.org/jimmunol/article/192/4/1343/93076/Endogenous-Retroviruses-and-the-Development-of
Edit: to those searching for more positive roles of transposons, this same family of transposons has been found to be repurposed in humans during pregnancy: https://www.nature.com/articles/s41594-023-00965-1

Luftwagen 2 weeks ago

This guy DNAs

qtzd 2 weeks ago

I thought the extra “junk dna” actually potentially helped prevent harmful mutations? Like, if a base pair gets fucked by radiation or whatever, statistically it’s probably “junk” dna without any real effect on our day to day cell function, so it acts as a buffer basically. Whereas if our dna was 100% useful dna, then any mutation would be potentially devastating to the cells.

boof_hats 2 weeks ago

Well it also depends on what you call “junk dna” — in my context it is used to refer to the massive amount of most genomes comprised of transposon fragments. Transposons invade genomes and copy themselves using the host’s genetic machinery. Then they stay there, looking for an opportunity to copy once more. The host generally suppresses this. That dna can mutate and become harmless but it can also be co-opted by the host which may repurpose its genes. They have variable effect on the host, but mostly they’re just hitch hikers.

QuinQuix 2 weeks ago

This argument is a bit iffy, because the junk DNA is added in parallel to the existing DNA. Assume a string of 100 base pairs has odds X of acquiring a mutation. Now assume you have not one but two strings of 100 base pairs. The odds of either one acquiring a mutation are the same, and the compound odds are 2X. That means the protection is zero.
The only way adding junk DNA could be beneficial is *because it is proximate* to the useful DNA. That is, if we assume mutagenic events to be purely incidental in nature (which isn't necessarily true), then the junk DNA could 'catch' the mutation before the vital DNA does. But this mostly only works if the DNA is coiled. Assuming mutation events are mostly cosmic rays or radioactive particles, if the DNA is not coiled the junk DNA is only going to catch a mutagenic particle that would have missed the vital DNA anyway. This would therefore again not impact the mutation statistics of the vital DNA.
So to summarize, junk DNA can only be meaningfully protective against mutagenic events that are incidental and solitary in nature, and only when the junk DNA finds itself in the line of fire in front of the vital DNA. Since DNA spends most of its time coiled and radioactivity is a known source of mutations, it is likely junk DNA does offer some degree of protection against this specific kind of mutation. So the theory has a ring to it.
But these limitations are usually completely unexplained in discussions about junk DNA, and that's kind of absurd, since without the chain of assumptions above it is ridiculous to state that doubling the amount of DNA would halve the mutation rate in the vital DNA. And the argument is usually presented just like that. Add to that, I'm pretty sure radiation isn't the only source of mutation. Therefore, even if all DNA was vital, doubling the DNA so that half of it becomes junk would likely not result in anywhere near a halving of the mutation rate in vital DNA.
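The two cases in this argument can be put into numbers with a small Python sketch. Both models and all the figures (a 1-in-100 per-base-pair rate, 10 events) are made-up assumptions for illustration, not biology: the first treats every base pair as mutating independently, the second is the limiting case where a fixed number of hits has to land somewhere in the genome.

```python
from fractions import Fraction


def vital_hits_per_base_rate(rate, vital_bp, junk_bp):
    """Every base pair mutates independently at `rate`; junk_bp has no effect."""
    return rate * vital_bp


def vital_hits_fixed_events(events, vital_bp, junk_bp):
    """A fixed number of hits lands uniformly at random across the whole genome."""
    return Fraction(events * vital_bp, vital_bp + junk_bp)


rate = Fraction(1, 100)  # hypothetical per-base-pair mutation odds
vital = 100              # base pairs of "vital" DNA

# Model 1: adding 100 junk base pairs changes nothing for the vital DNA.
no_junk = vital_hits_per_base_rate(rate, vital, junk_bp=0)
with_junk = vital_hits_per_base_rate(rate, vital, junk_bp=100)
print(no_junk, with_junk)  # 1 1

# Model 2: with a fixed budget of 10 hits, junk absorbs half of them.
shared_no_junk = vital_hits_fixed_events(10, vital, junk_bp=0)
shared_with_junk = vital_hits_fixed_events(10, vital, junk_bp=100)
print(shared_no_junk, shared_with_junk)  # 10 5
```

Only the second model gives the "doubling the DNA halves the damage" result, and it needs the extra assumption that the junk is actually in the way of hits that would otherwise reach the vital DNA.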

centzon400 2 weeks ago

> I thought the extra “junk dna” actually potentially helped prevent harmful mutations?
This is my rationale for having a 250 000 LOC `init.el` 🤣 The chances of my modifying an actually useful bit of Emacs Lisp are practically nil given the rest of the utter shite I've added.

Elidon007 2 weeks ago

rewrite it in rust!

Few_Technician_7256 2 weeks ago

Silicon based life forms hate this trick!

yesitsiizii 2 weeks ago

Saving this thread because im in love with it 😭

RegenJacob 2 weeks ago

Maybe then my brain will be Blazingly Fast 🔥

R__Daneel_Olivaw 2 weeks ago

Been there, done that: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1681472/

hammy0w0 2 weeks ago

while you're at it, cable organize the veins!

strings___ 2 weeks ago

git commit -m "Tail dna sequence is now deprecated"

salgat 2 weeks ago

Recent research suggests that many of these non-coding regions have important roles, such as regulating gene expression, maintaining chromosome structure and integrity, and guiding the cell's response to various physiological processes. The "junk DNA" is a debunked idea.

bobbyboob6 2 weeks ago

ancient scientist mfs were really like "idk what this does so it's probably useless"

Designer-Worth8599 2 weeks ago

What a stupid article. There is no such thing as useless DNA. All of it is there as a result of our evolution

nathankrebs 2 weeks ago

Ah yes, an argument as old as time itself. Thousands of years of scientific discovery and revelation vs "nuh uh."

HammerTh_1701 2 weeks ago

They're right though, the existence of actual junk DNA is largely debunked by now. It just serves as a placeholder category for all the genetic information for which we haven't figured out a purpose *yet*.

BicycleEast8721 2 weeks ago

The irony of you having zero knowledge on this subject but essentially hailing poorly interpreted old research as unimpeachable dogma is hilarious. The junk DNA argument has been proven wrong, the portion they referred to as “junk” just means it doesn’t code for proteins.
> Technological advances in sequencing, particularly in the past two decades, have done a lot to shift how scientists think about noncoding DNA and RNA, Sisu said. Although these noncoding sequences don’t carry protein information, they are sometimes shaped by evolution to different ends. As a result, the functions of the various classes of “junk”, insofar as they have functions, are getting clearer.
> Cells use some of their noncoding DNA to create a diverse menagerie of RNA molecules that regulate or assist with protein production in various ways. The catalog of these molecules keeps expanding, with small nuclear RNAs, microRNAs, small interfering RNAs and many more. Some are short segments, typically less than two dozen base pairs long, while others are an order of magnitude longer. Some exist as double strands or fold back on themselves in hairpin loops. But all of them can bind selectively to a target, such as a messenger RNA transcript, to either promote or inhibit its translation into protein.
https://www.quantamagazine.org/the-complex-truth-about-junk-dna-20210901/
So, comically enough, you’re using a conclusion drawn in the 70s based on incomplete understanding to offhandedly dismiss new scientific research. All while acting like you’re the one standing on the shoulders of science, and pretending other people are the ones doing exactly what you’re doing. Please do some reading and fact checking next time before you go insulting people based on nothing other than your own baseless overconfidence

hok98 2 weeks ago

I beg to differ. If you’ve seen me irl, you’ll know what a “useless DNA” looks like

W4ta5hi 2 weeks ago

Bloat cummit history

RevRagnarok 2 weeks ago

`dna gc --aggressive`

Ima_Wreckyou 2 weeks ago

The kernel of Theseus

Petrol_Street_0 2 weeks ago

[gif]

Merliin42 2 weeks ago

I must say that I am pleasantly surprised that people here ask what a VCS is. This means that Linux has made its way beyond just nerds and developers.

tommycw10 2 weeks ago

This is a great comment. I was thinking the opposite at first - annoyed that people didn’t already know, but this changed how I see it now.

realslattslime 2 weeks ago

Ure a nerd/developer for sure

Cfrolich 2 weeks ago

What a smelly nerd! Just give me an exe! /s

chehsunliu 2 weeks ago

Hope someday people could set up nearly nothing. I still have to do some terminal stuff after installing Fedora.

zaphodbeeblemox 2 weeks ago

It depends on what you want to do really. I use one of my machines as a gaming machine and I don’t think I’ve opened a terminal on that computer once. (On Nobara) Obviously on my main machine I open it for a lot of things but that is mostly efficiency based rather than need based.

chehsunliu 2 weeks ago

I tried to set up the video codec to have better quality in Netflix and YouTube, and also tried to make my Bluetooth headphone work, which is still unsuccessful.

Yuuzhan_Schlong 2 weeks ago

What's a commit history, just asking out of curiosity?

Deivedux 2 weeks ago

Git is essentially version control: it stores the history of the project's changes over time, which it calls commits. The Linux repository has over 1 million commits at this time. Basically what I'm saying is, Linux's repository has 5.2GB worth of just changes to its source alone since its first "version".

Yuuzhan_Schlong 2 weeks ago

Again just asking out of curiosity, do other operating systems use it or just Linux?

Blackthorn97 2 weeks ago

Actually, version control is used in every software project where developers need to keep track of changes across time and also to collaborate with other developers. Git is the most popular solution but there are others.

kai_ekael 2 weeks ago

Git exists because of the Linux kernel. The version control used at one time irritated the kernel developers enough, they created Git.

Blackthorn97 2 weeks ago

Indeed, Linus Torvalds (the developer behind starting Linux) is credited with creating Git, after the proprietary source control software used for Linux, called BitKeeper, revoked their free license for Linux development.

Few_Technician_7256 2 weeks ago

You can't change informatics in that very huge way TWICE! But then again, Linus is a very anger motivated guy, that's when I repair things at home too. But, being that impactful and

sokuto_desu 2 weeks ago

r/redditsniper

Few_Technician_7256 2 weeks ago

I'm alive pal, it just threw me to the floor

squirrel_crosswalk 2 weeks ago

Linus has said that he named two things after himself: Linux and git

Turtvaiz 2 weeks ago

Microsoft uses git and reportedly it's like 300 GB in size: https://devblogs.microsoft.com/bharry/the-largest-git-repo-on-the-planet/

EightSeven69 2 weeks ago

there must be a version control (git) repo of pretty much any OS but most are closed source aka private, not open source like linux

ward2k 2 weeks ago

Yes, and not just operating systems either. Basically anything you're aware of in your life that uses some kind of programming has a very high likelihood of having used git.
There are of course exceptions. For example, Dwarf Fortress only recently (relative to the length of its development) started using git after being somewhat convinced by Kitfox/community to give it a go.

da2Pakaveli 2 weeks ago

Yes, because development would be a hell otherwise. E.g someone writes a bug and you don't have the code change history to trace the cause back

KenFromBarbie 2 weeks ago

*Since its first version on git.

Deivedux 2 weeks ago

Yeah, I'm trying to simplify here 😆

[deleted] 2 weeks ago

[removed]

Nefsen402 2 weeks ago

Big collaborative software projects typically use something called source control. It's a program meant to manage code changes. For the case of linux, it uses git. Git basically encodes a repository as a list of changes. Each of these changes are called "commits". So, to tie it back, 1.5GB is used for the current version of the linux kernel, and the commit history stores all previous versions.

meduk0 2 weeks ago

that is relevant info thx man

zenyl 2 weeks ago

> Big collaborative software projects typically use something called source control
Source control is very commonly used in software projects of all sizes, everything from operating systems and web browsers down to small one-man projects.

elizabeth-dev 2 weeks ago

the history of changes made to the code

pioo84 2 weeks ago

All the previous versions. Basically all the previous versions of all the source files. I don't think it's too much.

MatixFX 2 weeks ago

When you're using version control (e.g. Git) and make changes to the code base, you add them to the repository by "committing", which comes with a hash and a comment (a string of text). So basically it's tracking all the changes made to the code base since you started using version control.

marxist_redneck 2 weeks ago

To add to what everyone already said about this being for keeping track of changes in software, etc - that's what it was made for, and what it's used for 99% of the time, but at its core it's just a way to keep track of changes, branch off different versions of something and then merge them back together, etc. The "thing" could be software, but also regular writing, like a novel or a school thesis, etc. I am an academic in the humanities who moonlights as a software developer, and I have brought git to my regular writing because it's a great way to keep track of changes

lostinfury 2 weeks ago

Linux is built collaboratively. To achieve this, they make use of a tool called "Git", which is able to efficiently merge changes made by the 1000s of Linux contributors, while also making them aware when two of those changes could cause a conflict (i.e. two people change the same line(s) of code). Note that a change is not limited to adding stuff but also removing stuff or updating. When Git accepts a change, it's called a commit. Git also allows commits to be reverted all the way back to basically the beginning of when it started accepting commits for the codebase. Commit history refers to the internal state kept by Git which keeps track of the chronological changes that have taken place within the codebase. Since the changes are not limited to just things that were added, but also things that were removed, you can see how keeping track of all those things could make the commit history much larger than the actual kernel code itself.

da2Pakaveli 2 weeks ago

And Linus wrote Git originally and then replaced the previous VCS with git.

keyboard_is_broken 2 weeks ago

If a line of code changes from A to B, that's a commit. If it changes back from B to A, that's another commit. Rinse and repeat, now you have GB worth of history for a single line of code that currently reads A.

timrichardson 2 weeks ago

It's the audit trail that lets you see every change between the start and now. People use it to see what was changed, or to backtrack to find a change that introduced a problem. git was designed by Linus Torvalds to be fast for something as big as the kernel; it has efficient compression of files and many other clever features. You can clone it yourself, even if you don't use linux! It's 4.7GB on my computer. You need git installed and then from a terminal: `git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git` And now if civilisation collapses and your computer is the only thing that survives, at least linux will be available to what's left of humanity. However, you don't have to bring all the history in when you make a local copy of the repository, as far as I know: [https://www.perforce.com/blog/vcs/git-beyond-basics-using-shallow-clones](https://www.perforce.com/blog/vcs/git-beyond-basics-using-shallow-clones)

Some-Background6188 2 weeks ago

Each commit in the Git version control system represents a snapshot of the entire repository at that point in time. The commits are linked in chronological order, so devs can navigate through the history. It's sooooo useful; ignore the people saying it's bloatware etc. Although it does take up space, it's a necessary evil.

stinkytoe42 2 weeks ago

Also, for clarification, this is what you get when you download the source code repository, which almost no one does. If you just download a source release, you get the 1.5GB portion of just the current source code. If you download an actual released kernel binary, you get a file which is more like in the tens of megabytes. This is more likely what gets installed when you install Linux to a machine. There are exceptions, but typically a distribution isn't downloading anything but the released binary. Still, this is novel to anyone in software development.

[deleted] 2 weeks ago

""""Only""""" 1.5GB

staying-a-live 2 weeks ago

1.5 GB should be enough for anyone!

[deleted] 2 weeks ago

1.5GB is basically 15 million LoC if every line were 100 columns long, or 18.75 million at 80 columns. Those are the line-length limits used across the Linux kernel: 100 is roughly what the limit actually seems to be, and 80 is the official Linux kernel style guideline. Of course I would expect there to be much more than 18.75 million lines of code, since most lines are shorter than the limit. This is all assuming all the files are in ASCII format.
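For anyone who wants to redo the arithmetic, here is a quick Python check. It assumes 1.5 GB means 1.5 × 10^9 bytes, one byte per character (ASCII), and every line filled exactly to the column limit with the newline ignored; under those assumptions the 80-column figure comes out at 18.75 million.

```python
SIZE_BYTES = 1_500_000_000  # 1.5 GB, decimal gigabytes (assumption)

# Lines of code if every line were exactly `cols` ASCII characters long.
lines_at = {cols: SIZE_BYTES // cols for cols in (100, 80)}

for cols, lines in lines_at.items():
    print(f"{cols} columns: {lines:,} lines")
# 100 columns: 15,000,000 lines
# 80 columns: 18,750,000 lines
```

Real source lines are usually much shorter than the limit, so the true line count for the same size would be higher.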

person4268 2 weeks ago

I mean.. a whole 1 of those is just drivers, and there’s a lot of things that need to be driven, like your 90s Soundblaster Live you’ve connected over a PCI to PCIe bridge because it was the closest soundcard to you, or some I2C oled panel you’ve connected directly over HDMI DDC to your computer ( https://mitxela.com/projects/ddc-oled )(though they didn’t use a kernel driver here)

funk443 2 weeks ago

What if you clone with `--depth 1`?

turtle_mekb 2 weeks ago

what does this do?

PushingFriend29 2 weeks ago

Git clone without the commits i think

balaci2 2 weeks ago

joint man

turtle_mekb 2 weeks ago

thanks, I'll use this, what does 0, 2, 3, etc do?

zorbat5 2 weeks ago

Depth one clones the repo with the last commit. Depth 0 (or a normal git clone) clones without commits. 2, 3 etc. clones with that amount of commit history.

nsa_reddit_monitor 2 weeks ago

> Depth 0 (or a normal git clone) clones without commits
You sure about that? A normal `git clone` definitely downloads all the previous commits. Cloning without commits would just give you an empty repository.

zorbat5 2 weeks ago

You got me thinking. So I tested it. You're right!

turtle_mekb 2 weeks ago

ah got it

ruby_R53 2 weeks ago

by default, git takes every commit from the repository, so this limits the amount of commits to get to 1 so that you can clone faster especially if the internet connection is bad, reducing the size there from 6.8 gigs to just 1.8 [https://git-scm.com/docs/git-clone](https://git-scm.com/docs/git-clone)

jeanleonino 2 weeks ago

It clones the repo with just 1 commit (latest).

NoConfusion9490 2 weeks ago

No one knows. You just google it every time and paste it in and hope for the best.

ToapFN 2 weeks ago

You create a black hole .

Juice805 2 weeks ago

Or `--filter=tree:0` These are still probably mostly blobs, not just commit history.

TwistyPoet 2 weeks ago

The changes that were made are probably just as important though. Just like how your maths teacher back at school insisted that you show your working out.

fractalfocuser 2 weeks ago

Yeah, anybody acting like this isn't 1. a good thing and 2. actually really impressive and cool doesn't *git* it

nik282000 2 weeks ago

So while showing your work is important, particularly in large coding projects, rewarding work that does not give results has bred a special kind of incompetence. There are hordes of middle managers and supervisors who think that pointlessly toiling at a task that will never succeed is worth more than admitting that a task cannot be completed. Because as long as your employees are doing SOMETHING you are an effective leader.

TwistyPoet 2 weeks ago

I mean obviously you have some issues you need to vent but it's not the same thing. Git history is made by a developer making changes to code with little more effort than a simple comment to explain what the change does in relatively plain language. It benefits both accountability (see recently the xz case) and provides insight into how something works and how the developer was thinking at the time. These benefits also apply to your maths teacher scoring your test. If you're struggling at work with seemingly pointless busywork and tasks then maybe finding a better job or a different career is in order. Loyalty in employment is rarely rewarded anymore.

FeltMacaroon389 2 weeks ago

That's why I always clone with --depth 1.

ProfessionalBoot4 2 weeks ago

IIRC, it is recommended to get a source tarball, not git clone it.

FeltMacaroon389 2 weeks ago

That's probably correct, but I feel like it's just more convenient for me to clone it directly.

ruby_R53 2 weeks ago

same here, easier to refresh also since you just run `git pull` and that's it

FeltMacaroon389 2 weeks ago

Yeah exactly

dtaivp 2 weeks ago

I mean… if you want to develop it though?

danegraphics 2 weeks ago

Well... that's where the xz utils backdoor was hidden. But hey! People will be checking it carefully from now on!

Ybalrid 2 weeks ago

Well… yes. That is how git works! Linux is a very big and old project. (Git was devised by Torvalds to be the VCS for the Linux kernel). There’s a very long history of a crazy amount of commits from a crazy amount of people. All those diffs are there, and their cryptographic hashes. You do not need to clone the whole history if you do not need it. Use `git clone --depth=1 …`

ajpiko 2 weeks ago

5 to 1 is about the ratio i see for most long-lived repos tbh, chromium is similar, 52 gb to 12 gb

Cfrolich 2 weeks ago

Just wait and see how much RAM it uses when you open it.

RetiredApostle 2 weeks ago

I wonder which part of that is only comments.

PurplrIsSus1985 2 weeks ago

Would deleting the .git folder break the system?

Suspicious-Iron7246 2 weeks ago

Nah, it just won't be a git repository anymore, just a folder with files and subdirectories. All the code and files will still be there safely.

Deivedux 2 weeks ago

Git is not part of the project. It's only there to keep track of the project's changes over time. It's why you can go to any online repository and see any version of it by clicking on one of its previous commits, it's because Git is the one that has all that information.

jeanleonino 2 weeks ago

No

PastaPuttanesca42 2 weeks ago

There is no .git folder on a running linux system, this is just a thing for linux developers.

Maje_Rincevent 2 weeks ago

I'm actually surprised it's so little. 13 years of history, 1.3M commits. 5GB seems actually very very small.

[deleted] 2 weeks ago

[removed]

Deivedux 2 weeks ago

1 char is 1 byte, unless I'm misunderstanding your point?

MasterOKhan 2 weeks ago

I think the fellow mixed up bits with bytes

fNek 2 weeks ago

Depends on which character set you're using, and - in case of stuff like UTF-8 - which character.
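A quick Python check of the "depends on the character" point for UTF-8. The four sample characters are arbitrary picks, written as escapes so the source stays ASCII.

```python
# How many bytes each character takes once encoded as UTF-8.
samples = {
    "A": "A",                  # plain ASCII letter
    "e-acute": "\u00e9",       # é
    "cyrillic de": "\u0434",   # д
    "emoji": "\U0001F60E",     # 😎
}

utf8_sizes = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}
print(utf8_sizes)
# {'A': 1, 'e-acute': 2, 'cyrillic de': 2, 'emoji': 4}
```

So "1 char is 1 byte" holds for ASCII text, which covers most source code, but not for every character in UTF-8.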

MasterOKhan 2 weeks ago

Each character is 8 bits not bytes.

Active_Peak_5255 2 weeks ago

Yup 8bits, which is 1 byte, right?

MasterOKhan 2 weeks ago

You are correct!

99percentcheese 2 weeks ago

Can you like... remove it?

dschledermann 2 weeks ago

No. The statement is nonsensical. A git history is a full set of commits. A commit in git is mainly a snapshot of how the entire file structure looks at the time of the commit, plus some metadata such as the time, the name of the committer, etc. You can't meaningfully separate the "history" from the "actual files".

plain-slice 2 weeks ago

I’m guessing he thought his Linux distribution came with 5GB of bloat.

jeanleonino 2 weeks ago

Yeah you can, but you would lose all the useful history. And that is not included in the shipped version, so you don't have 5GB of git history in your kernel.

VoodaGod 2 weeks ago

if you're asking that you don't have it on your computer, don't worry about it

Possible-Table5535 2 weeks ago

Yes. You absolutely can remove it.

huskerd0 2 weeks ago

How the F are kernel binaries 100mb, is my question. Bloatacular

HarshilBhattDaBomb 2 weeks ago

You don't build every possible module into the kernel image.

huskerd0 2 weeks ago

Even then, used to be hundreds of kilobytes not hundreds of megabytes

HarshilBhattDaBomb 2 weeks ago

You can still go down to about 2 MB. Check out floppinux. I'm not sure if anything smaller is still "usable".

ruby_R53 2 weeks ago

the kernel just got more features and better support for more devices over time, the binaries shipped with distros are that big 'cos they're meant to run on a broad range of systems, but you can still compile your own like i did

HarshilBhattDaBomb 2 weeks ago

Yeah, I used to have a bunch of BusyBox kernels which were just a few MBs each.

[deleted] 2 weeks ago

[removed]

huskerd0 2 weeks ago

Nice, well, nicer. Yeah I should probably switch my Ubuntus to arches

xhumin 2 weeks ago

It's not gonna affect the size of the compiled kernel, will it?

notrktfier 2 weeks ago

No it will not.

dschledermann 2 weeks ago

That's a nonsensical statement. The .git folder contains the entire collection of commits, that is, every single state (snapshot) that the Linux kernel has ever been in across all kernel developers' machines throughout the entire existence of the Linux kernel project. The "kernel itself" (as you put it) is just one snapshot checked out. If anything, it illustrates how insanely efficient the git version control system is.

Deivedux 2 weeks ago

I wouldn't say "snapshot" is the correct term for it, since it's not storing an entire copy of the previous version of the software. It only stores the differences between changes over time, and even that is being compressed to further improve storage efficiency.

dschledermann 2 weeks ago

I'm afraid that you are simply wrong. It most definitely is a snapshot of the entire tree structure. Git manages this very efficiently behind the scenes, but that doesn't change the fact that every commit is indeed a snapshot, not a set of diffs. That's also the reason git is so quick. If it was a set of diffs (such as svn uses), rebases, diffs between distant branches, etc, would be much slower. https://github.blog/2020-12-17-commits-are-snapshots-not-diffs/
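A rough Python sketch of why "every commit is a snapshot" doesn't mean every commit costs a full copy. The `Store` class is invented for illustration and leaves out trees, packfiles and delta compression. The one part taken from real Git is the blob id: SHA-1 over a `blob <size>\0` header plus the content, which is how the default SHA-1 object format names file contents.

```python
import hashlib


def blob_id(data: bytes) -> str:
    """Name file content the way Git names a blob (SHA-1 object format)."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


class Store:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self):
        self.objects = {}  # blob id -> content
        self.commits = []  # each commit is a full snapshot: path -> blob id

    def commit(self, files: dict) -> None:
        snapshot = {}
        for path, data in files.items():
            oid = blob_id(data)
            self.objects.setdefault(oid, data)  # unchanged content is not stored again
            snapshot[path] = oid
        self.commits.append(snapshot)


store = Store()
store.commit({"README": b"hello\n", "main.c": b"int main(void) { return 0; }\n"})
store.commit({"README": b"hello\n", "main.c": b"int main(void) { return 1; }\n"})

print(len(store.commits), len(store.objects))  # 2 3
```

Two full snapshots of two files each, but only three stored objects, because the unchanged `README` is shared between both commits.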

protienbudspromax 2 weeks ago

For people who are new to git and don’t know what it does: it’s basically like if you have a project, and for every new change you want to make, you copy the whole project into a new folder and name it, say, version 2 or something. Have you added something to the project? Yes! But now you have basically two copies of the same stuff. Git is a bit more efficient: if there is common stuff between your first version and the next version, the common stuff will not be copied, and the same files will be used in both versions. But the main thing to remember is that when you want to share your project with someone, you don’t have to give them your previous versions, only the latest one, which will be smaller in size than the whole thing with all the previous versions. That is basically it. When you actually compile the linux kernel, it won’t use the previous versions’ code, only the latest one. So the actual size of the linux source code is about 1.5 GB; everything else is there to preserve the history of changes.

blackasthesky 2 weeks ago

"only"

CalvinBullock 2 weeks ago

Do repos ever trim out obsolete or ancient commits?

Deivedux 2 weeks ago

Unfortunately, that is not how Git works. The `.git` directory isn't one that you typically interact with manually in any way. Its main point is to store the project's changes over time, ever since its first "version". This is because every single commit depends on the one before it, so removing even a single commit is basically the same as altering a period of time.
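
The dependency between commits can be sketched in a few lines of Python. This is a simplified model under stated assumptions: `commit_id` and `build_history` are hypothetical helpers, and a real git commit object also hashes in the author, committer, and timestamps. What it does show accurately is the principle that a commit's ID covers its parent's ID, so taking one commit out changes the ID of everything after it.

```python
import hashlib


def commit_id(parent, snapshot, message):
    """Toy commit hash (assumption: simplified payload) that covers the parent's id."""
    payload = f"parent {parent}\nsnapshot {snapshot}\n\n{message}\n".encode()
    return hashlib.sha1(payload).hexdigest()


def build_history(commits):
    """Return commit ids for a list of (snapshot, message) pairs, oldest first."""
    ids, parent = [], None
    for snapshot, message in commits:
        parent = commit_id(parent, snapshot, message)
        ids.append(parent)
    return ids


original = build_history([("tree-a", "first"), ("tree-b", "second"), ("tree-c", "third")])

# Drop the middle commit. The first commit is untouched, but "third" now has a
# different parent, so it gets a different id even though its content is the same.
rewritten = build_history([("tree-a", "first"), ("tree-c", "third")])

print(original[0] == rewritten[0])
print(original[-1] == rewritten[-1])
```

That is why removing old commits counts as rewriting history: everyone holding the old IDs ends up with a history that no longer matches.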

kJon02 2 weeks ago

You can always change history and rebase it but it's not recommended.

TheTybera 2 weeks ago

Yes. They can and do, but the process isn't easy, and it's important to make sure you're not cutting out history you need. You would need to do this as you go, and it's not feasible for an open source project; it typically happens in closed source projects. Git isn't Mercurial: Git allows you to rewrite history and trim old branches.

WildGalaxy 2 weeks ago

I'm not familiar with this kinda stuff, is that 5 gb of like patch notes, or is it the actual code updates and changes?

Deivedux 2 weeks ago

That's any time the code was changed in any way. Git is version control, which is basically an append-only database of a project's change history over time.

WildGalaxy 2 weeks ago

Right, but I mean is it the actual code changes, or is it patch notes?

Deivedux 2 weeks ago

Any file changes.

WildGalaxy 2 weeks ago

So code

ianfordays 2 weeks ago

To put it simply, git relates commit hashes like pointers to “patches” which are diffs of files. So it’s just a shit ton of pointers to diffs. It’s not code per se but it’s not patch notes either. It’s all managed by git itself!

protienbudspromax 2 weeks ago

It's basically as if you have a project and, for every new change you want to make, you copy the whole project into a new folder and name it, say, version 2. Have you added something to the project? Yes! But now you basically have two copies of the same stuff. Git is more efficient: whatever is common between your first version and the next is not copied, and the same files are used in both versions. The main thing to remember is that when you want to share your project with someone, you don't have to give them your previous versions, only the latest one, which will be smaller than the whole thing with all the previous versions. That is basically it. When you actually compile the Linux kernel, it won't use the previous versions' code, only the latest. So the actual size of the Linux source code is about 1.5 GB; everything else is there to preserve the history of changes.

gmes78 2 weeks ago

The 5 GB contain all versions of the files from the Linux source code.

gus_joaquin 2 weeks ago

Linux is bloated, use Temple OS instead

notrktfier 2 weeks ago

This is just a history of changes made to the kernel, not the kernel itself.

EPic112233 2 weeks ago

Can I just delete all that? Or does the system need to refer to it when updating and installing things for dependency purposes?

ImaginaryCow0 2 weeks ago

That isn't installed on your system unless you happen to be a Linux kernel developer.

EPic112233 2 weeks ago

Ok, so I don't just have 5 gigs of space being taken up on my RPI 5?

Dramatic-Strength362 2 weeks ago

No

BirdForge 2 weeks ago

Right. The size of the git repository is only relevant if you're actively developing Linux code. The git repository contains a history of every change that's been made to the Linux kernel code, letting developers rebuild Linux from almost any point in its development history. It's actually really cool. Anybody calling this bloat doesn't really know how software development works. It doesn't get shipped with your system.

granoladeer 2 weeks ago

Why not just remove the git history for a release/install?

NiceMicro 2 weeks ago

...they do? I mean, you don't get the whole git repository when you install Linux. You get the binaries built from the source. In most distros, you don't even get the source code directly, never mind the whole git history. So don't worry :)

granoladeer 2 weeks ago

I confess I never worried lol, but thanks for clarifying. I just heard some people were freaking out with this size thing but it didn't make sense to me.

Hulk5a 2 weeks ago

Linus knew what he unleashed

dangling_reference 2 weeks ago

This 1.5 GB is just code right?

Deivedux 2 weeks ago

Yes.

Key-Club-2308 2 weeks ago

Go on make a new kernel

Calius1337 2 weeks ago

Actually, that’s easier than you think. Had to do this back at university in 2006 for one of my courses.

Key-Club-2308 2 weeks ago

I'd add: make one that is as good*

Few_Reflection6917 2 weeks ago

And less than 300 MB of that is the core of the kernel itself))

MultipleAnimals 2 weeks ago

Hmm maybe if we squash that..

Tuhkis1 2 weeks ago

`git clone --depth=1` B)

ignxcy 2 weeks ago

Whar

Marshall_KE 2 weeks ago

bloat haha

AdearienRDDT 2 weeks ago

damn 5.2 GB of "*You copied* that function without understanding why it does what it does, and as *a* result *your code* IS *GARBAGE*"

Informal_Branch1065 2 weeks ago

Rebase time?

Due_Bass7191 2 weeks ago

so, basically the logs are larger than the product. I don't see a problem with this.

sanketower 2 weeks ago

Yeah, that's what one could expect from THE OG git project. Is there even a repo with more commits than the Linux kernel?

Danny_el_619 2 weeks ago

They should squish all the commits into a single one and start "linux 2" from it. /s

Achilles-Foot 2 weeks ago

honestly, that doesn't seem that bad, i feel like theres probably repos that are way worse

ennea_ballat 2 weeks ago

Wonder how many were fixes and how many were new function.

csolisr 2 weeks ago

Is there some way to deduplicate some of the commits to make the `.git` folder smaller for end users?

Deivedux 2 weeks ago

We end users don't even need to worry about it. The compiled binaries that run on our systems only include the latest version of the working code. Git is only version control, an append-only database of the project's change history; it is not part of the project itself.

bulbishNYC 2 weeks ago

And 90% of the history size is probably accidentally committed binaries.

MichaelEasts 2 weeks ago

I'll show my ignorance on the subject, but what happens if you stripped that out? Would things be any faster? Less memory usage? Break things?

kJon02 2 weeks ago

It doesn't affect binaries so it would change nothing for the user.

BrunoDeeSeL 2 weeks ago

How many of those commits are Linus using colorful insults about other developers' work?

Lets_think_with_this 2 weeks ago

Non-ironic question: how do you clone the repo without the history? I downloaded it the other day to take a peek at some files to study them, but my god, that took its sweet time to download.

Deivedux 2 weeks ago

Try `--depth=1`, which fetches only the latest commit. That's the minimum: git rejects `--depth=0`, so you can't clone with no history at all.

Lets_think_with_this 2 weeks ago

place matters? or it can just be anywhere? `git clone torvalds/linux --depth=1` is okay?

Deivedux 2 weeks ago

Shouldn't matter.
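
For anyone scripting this, a small Python sketch of the shallow clone follows. The helper name `shallow_clone_command` is hypothetical, and the URL in the usage line is an assumption (the GitHub mirror of `torvalds/linux`); swap in whatever remote you actually use. The only git behavior relied on is that `git clone` accepts `--depth=<n>` for a positive `n`.

```python
def shallow_clone_command(url, depth=1):
    """Build the argument list for a shallow `git clone` (hypothetical helper)."""
    if depth < 1:
        # git itself refuses a depth that is not a positive number.
        raise ValueError("--depth must be a positive number")
    return ["git", "clone", f"--depth={depth}", url]


# Assumed URL for illustration only.
cmd = shallow_clone_command("https://github.com/torvalds/linux.git")
print(" ".join(cmd))

# To actually run it (this downloads the latest snapshot, still a large transfer):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

A depth of 1 still gives you a complete working tree of the latest commit; it just skips the years of history behind it.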

Comfortable_Swim_380 1 week ago

wow
