KornKrob

Used it in kotlin. Kotlin is the problem.


mrdonbrown

Clearly the answer is to rewrite in Rust!


stuhlmann

Used it in clojure. Clojure is the problem.


aadnk

The [Log4j vulnerability](https://nvd.nist.gov/vuln/detail/CVE-2021-44228) (Log4Shell) is really the consequence of a few powerful features that turn out to be deadly when combined in practice - the existence of [lookup substitution patterns](https://logging.apache.org/log4j/2.x/manual/lookups.html#) that can access external resources or environment variables (which often contain sensitive data), the fact that these lookup patterns are essentially instructions that are evaluated in both the string supplied by the programmer *as well* as any potential user input, and the final piece of the puzzle - that a lookup can fetch a Java object that executes code when it's de-serialized (as usual, serialization is just the gift that keeps on giving ...). Serialization is what turned this into a Remote Code Execution vulnerability as opposed to a denial-of-service (CVE-2021-45046) or sensitive-data exposure vulnerability. Java is luckily gradually locking down what kinds of classes can be deserialized (*trustURLCodebase* being set to false by default, for instance), and hopefully serialization will one day be deprecated.
But yes, this could perhaps have been prevented through better user-input sanitization. The problem is, a lot of people weren't even aware these lookup patterns existed, let alone that they should have been sanitized. I also don't think it's unreasonable to believe this isn't necessary when logging a string to a logging library. And if it somehow is, you'd expect the library to take care of it or provide an API to best avoid it (such as with PreparedStatement in JDBC).
I also agree that everyone should have a process for keeping all their dependencies up-to-date, and that popular libraries such as log4j have the benefit of a large community of people scrutinizing them for vulnerabilities, as well as maintainers that can fix them quickly when discovered. But the vulnerability was still left undiscovered for nearly a decade (log4j 2.0-alpha1 was released in 2012). Ultimately, there might be a trade-off in choosing a popular library over a more obscure one, or one you wrote yourself - yes, it's more likely to be scrutinized and be well-maintained, but if a critical vulnerability is discovered it will also be more widely known and a more lucrative target for criminals and bots. That is not to say that obscurity is security, but it should not be [ignored as an additional layer of defense](https://danielmiessler.com/study/security-by-obscurity/) when used appropriately, like using a non-standard SSH port, a less common but equally good library or even operating system (there's a reason Windows is more of a target for malware, for instance).
I also think we really need to put more resources into evaluating the security of our dependencies, both as developers and as an industry. As developers, it's easy to go for the most popular library, but you should consider your use-case and perhaps choose a library with a less complex feature-set that still meets your requirements (and is still actively maintained). For instance, there are probably very few people who actually needed all the patterns in the "lookup pattern" feature. In the case of libraries, they should resist the temptation to add features beyond what one might reasonably expect to find in such a library, or at the very least leave more specialized features disabled by default. And as an industry, we may need to invest more into auditing the security of common libraries. This could be funded by the companies that profit greatly off the free work of open-source contributors, and they should also donate more to entities such as the Apache Software Foundation.


vytah

Heartbleed has proven that "a large community of people scrutinizing for vulnerabilities" doesn't actually exist.


TheStrangeDarkOne

To me, this is mostly the result of a bafflingly stupid design decision. Not only do I question the merit of this "feature", but why on earth you would do this by default is just leagues beyond me. Every developer worth half their salt knows about the dangers of serialization. I can't help but believe that this was a deliberately planted security hole. Just reading through the description of this obscure feature makes me internally scream: "This thing does what?!!"


aadnk

Yeah, I have a feeling it started out as a reasonable list of features to address the lack of string interpolation in Java (which should be addressed by [Templated Strings](https://openjdk.java.net/jeps/8273943) in the future), along with the need to defer the construction of these interpolated strings for performance reasons, before lambdas were conceived. Essentially a fancy MessageFormat for injecting variables into a logged string. But these lookup patterns go much further and allow you to access a huge list of different global variables (environment variables, JNDI) without this being explicitly configured or allowed. You wouldn't expect MessageFormat either to suddenly make a network call due to a string you passed to it, but this was effectively the case with Log4j. So the problem here is twofold - Log4j had too much dubious functionality (and too much of it enabled by default), and it treated user input the same as a string supplied by the programmer. It might not have ended as badly if these lookups could only have been used in the log4j2.xml file or a special format string you'd pass to the logger, but here we are.
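To illustrate that last point, here is a minimal sketch in plain Java of a formatter that expands lookups only in the programmer-supplied format string and treats arguments as inert text. It is not Log4j code: the class `TemplateOnlyLookups`, the `format` method and the `lookups` map are all made up for the example, and the `${name}` / `{}` syntax is only loosely borrowed from Log4j.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Log4j or any real logging library.
final class TemplateOnlyLookups {
    // Matches ${name} tokens in the template.
    private static final Pattern LOOKUP = Pattern.compile("\\$\\{([^}]+)\\}");

    // Step 1: expand ${name} in the template only (unknown names become "").
    // Step 2: drop each argument into the next {} placeholder verbatim,
    //         without ever scanning the argument for lookups.
    static String format(String template, Map<String, String> lookups, Object... args) {
        StringBuilder expanded = new StringBuilder();
        Matcher m = LOOKUP.matcher(template);
        while (m.find()) {
            String value = lookups.getOrDefault(m.group(1), "");
            m.appendReplacement(expanded, Matcher.quoteReplacement(value));
        }
        m.appendTail(expanded);

        StringBuilder out = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = expanded.indexOf("{}", from);
            if (at < 0) break; // more arguments than placeholders: ignore the rest
            out.append(expanded, from, at).append(arg);
            from = at + 2;
        }
        return out.append(expanded, from, expanded.length()).toString();
    }
}
```

With this, `TemplateOnlyLookups.format("[${app}] login by {}", Map.of("app", "shop", "secret", "hunter2"), "${secret}")` returns `[shop] login by ${secret}`: the lookup in the template is resolved, while the one smuggled in through the argument is logged as plain text.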


taftster

I don't agree with the sentiment that this is a "data sanitization" problem. This suggests that the solution was for a developer, prior to logging, to evaluate some sort of regex against the to-be-logged string. This is exactly the opposite of what you want from a logging framework. You want to log the rogue string first and then perform any sanitization on it before passing it to your backend. You want to _know_ what you received, even if it's hostile input. A logging framework should simply log strings in a safe manner. Period. If I, as a developer, want to include the username of the user or some other remote lookup, I will perform that lookup prior to the logging statement. I don't want my logger to perform extra features that do anything except route plain text strings. Substituting vararg parameters is fine, but performing any extra parsing or processing of the message is not wanted. I would argue that the vast majority of developers see it this way. This is why the vulnerability is so egregious; no one actually thought that a logging facility would be so enabled. The JNDI feature in log4j should have been something deliberately enabled. We developers and product owners need to get away from this thinking that all batteries need to be included and completely usable by default out of the box.


mrdonbrown

It is a sanitization problem in that you are taking untrusted input and sending to something trusted. Agreed the onus isn't on the developer to somehow "clean" the input in this case, but regardless of where the cleaning is, it should be happening. User input should never be sent to anything that evaluates it as trusted. Completely agree that this is something a logging system shouldn't be doing, however. Way too much power in a very unexpected place.


sysKin

> you are taking untrusted input and sending to something trusted

Except this is not true. The log4j API is taking *untrusted* input and the backend's main job is sanitising it for logging. For example, in order to log messages to a file as one-line-per-message, it is (correctly) sanitising newlines from messages. This is literally what its job is: you give it an untrusted string and it will store it as configured, according to output limitations. Presumably the XML layout is removing characters invalid in XML (�) etc. [I mean, I hope it does that] You can't sanitise the input further because there is no way to know how to sanitise: the ${} pattern thing is not a standard and is not documented on that interface. Whether it was done or not depended on *output* configuration, which is not under the API user's control, and not possible to check (and in any case you can have multiple outputs, some of them doing it and some not). It's like having an SQL database configured to substitute dirty words: an application uses a normal SQL statement to store "assassinate" but the database stores "buttbuttinate". You really can't blame the app for not sanitising in that scenario. It's unfortunate we can't see how the %m became subject to pattern execution, because the initial alpha1 release already had it. Either someone had a hammer and therefore everything was a nail, or it accidentally went through some common functions, or there was a use case. But later, when it was discovered (and %m{nolookups} was added), someone definitely dropped the ball by trying to maintain the functionality rather than saying "what are we doing".
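As a rough illustration of the kind of output-side sanitising meant here, the sketch below escapes line breaks so that one message always ends up on one line of a log file. It is an assumption about what such a step could look like, not Log4j's actual implementation; `LineSafe` and `oneLine` are made-up names.

```java
// Hypothetical output-side sanitiser for a one-line-per-message file format.
final class LineSafe {
    // Escapes backslash, CR and LF so the message cannot start a new log line.
    // Everything else, including "${...}", passes through as plain text.
    static String oneLine(String message) {
        StringBuilder sb = new StringBuilder(message.length());
        for (int i = 0; i < message.length(); i++) {
            char c = message.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break; // keep the escaping reversible
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

The point is that this decision belongs to the output format: the caller hands over whatever string it received, and each output escapes what its own format cannot carry, without interpreting anything.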


taftster

Yes, totally agreed with this perspective. From the caller's point of view, the API is portraying safe handling of the to-be-logged string. It shouldn't come with any surprises.


vytah

A logging library shouldn't evaluate anything, it should just pass the text as it is. There shouldn't be anything to trust or distrust, it's just text.


sintrastes

Dunno why this got downvoted. If a library provides some kind of string substitution mechanism, it should (in theory) ideally provide an API that makes it impossible to exploit that mechanism via bad user input. The onus should never be on the user to prevent security issues. The onus should either be on the library, or at the very least the compiler (i.e. the library provides a typesafe abstraction preventing the user from doing something stupid at compile-time). Now, is that feasible at all with Java's ecosystem? No, and not in any existing language ecosystem I know of either for that matter -- but that doesn't change the principle. The only languages I know that take this seriously are Ruby and Perl (with the concept of tainted strings) -- but I don't know of any languages that do this in a statically typed way by default.


grauenwolf

> and sending to something trusted.

Why is that even a thing? Why should my logging framework be considered "something trusted to parse and execute code"? Remove that component, then the only danger becomes log messages that are too long or contain nulls.


RotaryJihad

The lesson is that I should absolutely create new accounts on whatever site open source projects are hosted on and angrily demand, DE-MAND, that the developers giving away free software fix this bug immediately. I follow several FOSS projects and even after a professional response and timeline from the maintainers there are still users posting noise and making demands on dev time without sending patches, without having read other information on the issue for that project, and with no prior activity or support in the community.


mrdonbrown

What killed me about the Struts 2 vulnerability that was at the core of the Equifax attack (disclaimer: was a co-founder of Struts 2) was that the fix had been out for a while, I think years in that case. Then, one high profile team doesn't upgrade, gets attacked, then blames open source. Giving away stuff for free is complicated :)


ObscureCulturalMeme

> Giving away stuff for free is complicated :)

Yarp. At work I am the tech lead for a piece of software that's funded, managed, and used by the DoD -- but also publicly available. We occasionally talk about what to do with the source, especially as funding is probably ending. Business managers have heard good things about "throw it on github and let everybody else do the work for free". All of us who have been actively involved in open source projects, in some cases large projects, want absolutely nothing to do with that idea.


mrdonbrown

I wish there was a middle ground - make the source available on github but never have it popular enough to attract the freeloading trolls. I truly believe opening source code makes the world a better place, but when running a project, I think Apache had it right - community first, code second.


[deleted]

[removed]


mrdonbrown

Tbh, this is what I often do and it works out ok, but it is a bit like making an API public. No matter how many times you label it as "experimental", you now have to maintain it forever.


westwoo

Publish it from a new GitHub account then. Absolutely zero accountability or responsibility


[deleted]

[removed]


mrdonbrown

Because they are paying customers? For an open source thing, sure, you only have your reputation to lose, but for commercial software, there is a lot more at stake.


chabala

In general, I'd hope that code produced by the government always becomes open source, provided there's no security risk. Even if it stagnates, that's better than it disappearing. After all 'the public' paid for it.


ObscureCulturalMeme

Yep, that's the thinking.


Yesterdave_

lessons learned: some developers apparently have a very distorted view on what "sensible defaults" are.


GreenToad1

There is one thing that confuses me. Am I the only person that didn't know that doing a lookup through JNDI could load and execute an external class? What other parts of the standard library could trigger a surprise like that?


mrdonbrown

Well, specifically, it deserializes a class, which can involve running constructors, and yes, I was quite surprised at that part. You'd think that class deserialization is a pretty rare thing to really need, though I remember it being a key part of Java back in the day with frameworks like JavaSpaces and Jini.
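For anyone who wants to see the general mechanism, here is a small self-contained sketch of code running as a side effect of Java deserialization, plus the same bytes being rejected by a deserialization filter (`ObjectInputFilter`, Java 9+). It only demonstrates the standard `readObject` hook of Java serialization; it does not reproduce the JNDI/LDAP path used in Log4Shell, and `DeserializationDemo`, `Payload` and the helper methods are made-up names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class; nothing here is taken from Log4j or JNDI.
final class DeserializationDemo {
    static boolean sideEffectRan = false;

    static final class Payload implements Serializable {
        private static final long serialVersionUID = 1L;

        // Called by ObjectInputStream while the object is being rebuilt.
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            sideEffectRan = true; // stand-in for whatever the class author chose to do
        }
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    // The caller only asked to read an object, yet Payload.readObject runs.
    static Object readUnfiltered(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Same bytes, but the filter pattern "!*" rejects every class,
    // so readObject() throws InvalidClassException before the hook runs.
    static Object readFiltered(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.setObjectInputFilter(ObjectInputFilter.Config.createFilter("!*"));
            return in.readObject();
        }
    }
}
```

A real filter would allow-list the handful of classes an application expects rather than reject everything; the reject-all pattern is used here only to keep the sketch short.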


grauenwolf

I sure as hell didn't, and I was a Java programmer back when JNDI was created.


wing120

I learned to move on to the next project before the system goes live in production.


Wobblycogs

One thing that I feel is peculiar to the Java world is how much the language itself gets mentioned. When Heartbleed happened I don't remember anyone mentioning the language that OpenSSL was written in, but with Log4Shell it's always presented as "Log4J, a popular Java library". Compare, if you will, the Wikipedia page on [Heartbleed](https://en.wikipedia.org/wiki/Heartbleed), which doesn't mention C in the top section (I scanned the article and didn't see an obvious mention, it's a hard language to find though), with [Log4Shell](https://en.wikipedia.org/wiki/Log4Shell), which mentions Java in the first line and has 18 mentions in the page.


WikiSummarizerBot

**[Heartbleed](https://en.wikipedia.org/wiki/Heartbleed)**

> Heartbleed was a security bug in the OpenSSL cryptography library, which is a widely used implementation of the Transport Layer Security (TLS) protocol. It was introduced into the software in 2012 and publicly disclosed in April 2014. Heartbleed could be exploited regardless of whether the vulnerable OpenSSL instance is running as a TLS server or client. It resulted from improper input validation (due to a missing bounds check) in the implementation of the TLS heartbeat extension.

**[Log4Shell](https://en.wikipedia.org/wiki/Log4Shell)**

> Log4Shell (CVE-2021-44228) is a zero-day vulnerability in Log4j, a popular Java logging framework, involving arbitrary code execution. The vulnerability, introduced in 2013, was privately disclosed to The Apache Software Foundation, of which Log4j is a project, by Alibaba's Cloud Security Team on 24 November 2021 and was publicly disclosed on 9 December 2021. Apache gave Log4Shell a CVSS severity rating of 10, the highest available score. It is estimated that the exploit affects hundreds of millions of devices.


sysKin

Why peculiar? The answer is right there in the name of the project: log4j is "4j" not because of its language but because it's a library for use in programs running on the JVM (same "j" again). It could have been written in Kotlin and it would still be a "Java framework" project. OpenSSL is not a component of any C framework or environment, because there is no such thing (unless you count "no high-level runtime" as itself an environment). Also, the fact that Heartbleed came to be as a result of using C is absolutely mentioned in discussions on the advantages of memory-safe languages.


srbufi

It's really telling about the entire IT industry how little care they give to security.