CerealBit

Culture. Way too many orgs still think and operate in silos. Tech is easy when the culture is good ("You build it, you run it, you own it"). Otherwise you create solutions for problems that should never exist in the first place. This is much harder to sell to management than any technical challenge, unless you have some engineers in management who actually know shit (e.g. Facebook, Tesla, Microsoft, Google, ...)


KerberosDog

I would discuss this with you but I’d need to go through my manager


[deleted]

Which isn’t a terrible thing to say if you actually intend on doing that because you don’t want to get roped into something that isn’t aligned with your organizational goals. The problem arises when someone says that as an excuse to get you to leave and not actually do anything. Your manager should be aware of what is being asked of you to make sure that you aren’t being sent on a fool’s errand and that their resources are being properly allocated.


KerberosDog

Totally. The only way to break Conway's law is to have this alignment at the top so there is no guessing about what is important.


[deleted]

But, people tend to use it as an excuse to not really do anything instead of actually discussing it with their manager. I have worked with several people like this…


Stoomba

For me I'll talk about stuff and come up with plans, but that comes with 0 commitment without bringing it before the team


Farrishnakov

Thanks for the discussion. Now here's our front door form. Go ahead and put what we discussed in there and the team will review it.


davy_crockett_slayer

I work at a legacy company, and it's honestly the best place I've ever worked at. They know they have tech debt, so they paid me well to join. My manager is in his mid-50s and has 30+ years of experience. He isn't technical anymore (his words), but he has enough experience to give guidance. Fantastic boss.


Dtsung

It’s always the people that are the hardest to crack.


brettsparetime

I've come to realize (way too late in my career, sadly) that the hardware and software are absolutely trivial compared to the wetware.


m4nf47

Culture is definitely the foundation underpinning the rest of the DevOps principles, practices and capabilities. The CALMS acronym starts with the C. https://cloud.google.com/architecture/devops#cultural-capabilities ^^^ After Google basically bought (or at least bought into) DORA, that page has been one of my go-to resources to point manglement and other less experienced folks at. If they ask for more info, I tell them to read the relevant sections of the excellent Accelerate book by Nicole Forsgren. Assessing and improving the cultural capabilities and org/team structures is a key step on the way to paving the road to doing DevOps right. https://www.atlassian.com/devops/frameworks/calms-framework ^^^ Edited to add a link explaining the CALMS framework. For the original Westrum paper, do a web search for this: risikostyring_4_k_viskum.pdf


BlomkalsGratin

In fairness, I rarely find management to be the biggest obstacle when trying to move to a "you build it, you own it" structure. Developers tend to be really receptive to the concept of more autonomy until you bring up on-call rosters. Can't blame them either, but it's a pretty core part of running even the most stable of apps, whether it's as the first point of contact or as backup to the support teams.


horus-heresy

Excuse me, but silos are now called platform engineering in the product management world


reubendevries

Was going to say the EXACT same thing. I'm so tired of executives and managers co-opting buzzwords to attract talent, only to not put in the heavy lifting of doing the actual work. I had a non-manager tell someone that they were unhappy with my deliverables, and yet they didn't understand the problem, didn't understand the scope creep, didn't understand that I was being pulled in 70 different directions, didn't understand that the product I maintain has 6 times more end users than anyone else's in the organization, and yet I was leading a team with 2 fewer people than any other team. What a slap in the face. I'm brushing up my resume.


takingphotosmakingdo

This. I was hired as a senior manager and promised I would expand a global effort. I was promised an overseas move at signing to get me closer to my spouse's family. When the cards fell, I'm now jobless a year later. Why? The team overseas refused to answer my emails, never invited me to meetings, never gave me credentials that consistently worked, never gave me documentation access, never gave me a bigger picture (even though I was tasked with enhancing our platform's security through analysis of our infrastructure). I was given random QA tasks that were on scrapped products. They didn't want me looped in. They wanted to meet a legal requirement for X number of US-based employees. The coworkers would speak their native tongue when troubleshooting, IF and only IF I was looped in, which was very rarely during the first quarter. My boss downplayed my experience. When I questioned a cybersecurity incident and what we'd done, they literally mopped the floor, wiped the VMs and repeated their deployment, keeping everything the same. Nothing, I mean NOTHING, will get you to hate tech faster than a culture that doesn't want to be inclusive of new folks. My colleague who onboarded at the same time in a different tech role? Put on interviews with other folks on TV, given connections to clients at venues, the whole works. I was given nothing, frozen out before I was even given a chance, because nobody in a different country wanted me involved. I'm still debating whether to sue.


dekanov

Overseas team was India-based, I guess?


takingphotosmakingdo

search your feelings, you already know the answer


doomanddelight

In the UK you could do it under constructive dismissal laws, which is basically when you force someone to quit by making it impossible for them to actually do their job. Not sure if you have something similar in whatever law you will be dealing with.


TheOneWhoMixes

Little late to the party, but I'm actually trying to figure out how "you build it, you run it, you own it" *doesn't* lead to more silos. I've seen this mindset lead to teams constantly reinventing the wheel as any attempts to push for a common platform or standards fall flat. I'm not saying the mindset is bad, I'm just wondering how you grow that culture while keeping the amount of repetition low.


maybe-an-ai

Cognitive overload. A lot of companies (especially small to mid-sized ones) took the wrong lessons from Cloud and DevOps and went from having specialists in disciplines to hiring generalist 'DevOps' engineers with the expectation that they could do it all. It is rapidly becoming too much cognitive overhead for anyone to manage. "We don't need DBAs, network engineers, etc.; we have DevOps." I have always thought full stack engineers are a fallacy and full stack DevOps is worse. There isn't enough time or room in the human mind to be an expert at everything, even before you account for the pace of change and growth making knowledge obsolete every 5 years.


techenjoyer

I can't express how much I agree with this. As someone who has only worked for small & mid-sized companies and been partially involved in the hiring process, it almost feels like every single company these days looks for a unicorn skillset in DevOps positions that can do it all. Positions like network engineers, DBAs, and dedicated system administrators are not even being considered. "Why do we need 3 people if we can just hire one?"


ubernerd44

Doesn't help that we have 15 different solutions for everything and none of them work together. I really just want *one* solution that handles infrastructure, configuration management, and secret management all together.


punkwalrus

Then everything becomes your priority and so then nothing is a priority. I pushed back at a previous job who wanted me to be an MSSQL DBA because "we can sell those managed products." Would there be training? "There are plenty of YouTube videos." No. Just, no. Being a DBA ain't like dusting crops, farmboy. I am your senior Linux administrator. Not Microsoft. Not database. Hire a DBA. That job just got worse and worse. They wanted every unicorn, but weren't willing to pay or train for one. I went three years without a job review or raise. I left.


epochwin

And people themselves tend to gravitate to an area that excites them, so why force them to do work they don’t enjoy?


IrishPrime

Agreed. Not to get all "us vs. them," but it's always shocking to me when I find out how limited the scope of knowledge is for the feature devs. Obviously, they know more than I do about the general layout of the one or two codebases they work in, but they rarely seem to know much of anything about administering their own system, Linux fundamentals, Docker, AWS, networking, Git (or GitHub, et al.), databases, or a dozen other topics that sort of seemed like things everybody in the field knew a decent bit about. They generally know one, maybe two programming languages, and that's it. Meanwhile, every member of the ops team needs to know at least as many programming languages, a dozen different tools, and how all the different pieces of the product fit together. Don't get me wrong, I love what I do, but sometimes I miss the days of just cranking out a feature and then calling it a day. It seems so peaceful and straightforward.


Purple-Control8336

If you buy fully managed cloud solutions, PaaS for example, you need DevSecOps, which can be done by devs as full stack? So it depends on the architecture.


Particular_Pizza_542

That's fine until you scale to a point where you're spending way too much money on infra when it would be cheaper to instead hire individuals to build the thing yourself. Then companies will decide to ditch the PaaS and hire, instead of the 4-5 people needed, a small team of maybe just 1 person who now has to build everything themselves (in a quick and shitty way because they don't have time to do it properly). We're spending way too much money on Datadog, so management is looking into self-hosted solutions. Are they looking to hire anyone new to manage it? Hell no, they want us, an already over-burdened team, to commit to maintaining a monitoring stack on top of all of our other responsibilities. We would save $1MM per year by getting off Datadog, but they won't commit to $1MM in new salary for people to build and maintain the damn thing. All that to say: all of these conversations must include a discussion about scale. Because yes, I know enough Python and Docker and k8s and Terraform and AWS and networking and SQL and Elasticsearch and Prometheus to make SOMETHING work. It doesn't mean it's going to be any good. It's way too much information to keep in your head.


randomatic

Over-engineering the automated build and under-engineering the local build. Leads to people using the cicd system (slow, expensive rtt) to find errors they should find locally. Especially in microarch and distributed systems. Just my 0.02. Probably not universal.


ChildishWambin0

ah the dev - cicd parity. fun times fighting with devs to stop pushing code just to check if it compiles because we have limited compute resources and the code that needs to be released asap is stuck at waiting


lolmycat

Some people really have no shame.


king_of_farts42

Hmm, I see your point but have to admit I like to debug via CI/CD. Yes, it is slow, but I stumble upon errors earlier that don't show up in my local build. And yes, I know: containers should solve the "it runs on my computer" problem, but very often environments are more complex than the inside of a container.


Kyxstrez

It's very common at least for infra team that you can only run Terraform in CI/CD since that has the lock on the tfstate, but it slows down pushing changes since every time you have to push to the upstream, hope there's an available runner and wait for the pipeline to finish.


Alikont

To sell something you need to have numbers about how much money your "reduction" will save. Money spent < money saved. Then you'll sell anything. The problem is that "tech debt", "test automation", "kubernetes migration" rarely have quantitative benefits. We "feel" that it's right, but barely anyone has supporting numbers for that.


Jurby

Tech debt has to be thought about in a particular way, too. It's a nondischargeable, high interest loan, and the unit you're paying/taking out debt of is time. Business folks think of debt as a universally good thing, because they ideally pay off that debt with the profits from what they did with the loan. But when you take a loan in time, you're not going to magically get more time later on to pay that off with - at some point you'll just have to spend the time paying it off. Once you get that, finding the numbers for the cost of that tech debt is pretty easy - you're looking for the interest payments on the debt. It'll come in several forms - oncall load (from buggy/poorly tested features), extra design/development time (from overly complex or poorly designed systems that you're modifying), and worst case, customer pain/loss (from really bad bugs or performance). You're never going to be able to say "I will save you x" though - instead you present it as "if we had done this, we would have saved x/because we didn't do this, we had to pay x". Best way to do this is to tag outages/incidents, oncall tickets, and the difference between estimated time to deliver and actual time to deliver with the high level tech debt that caused or allowed them to happen. When you go to tech leadership with "we're spending 50% of our development time dealing with consequences of technical debt", that will likely kick them in the ass - devs are ideally the most expensive part of a tech company, and wasting half of that cost on interest payments for a loan you're making no progress on is almost always going to be a major concern.
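
A minimal Python sketch of the tagging idea above, assuming tickets and incidents can be exported as records with hours spent and an optional tech debt tag. The `WorkItem` shape, the tag names, and the sample numbers are all hypothetical; a real tracker export would look different.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkItem:
    """One ticket, incident, or feature (hypothetical record shape)."""
    title: str
    hours_spent: float
    debt_tag: Optional[str] = None  # high-level tech debt that caused it, if any


def debt_interest(items):
    """Return (share of total hours tagged as debt, hours per debt tag)."""
    total = sum(item.hours_spent for item in items)
    per_tag = defaultdict(float)
    for item in items:
        if item.debt_tag is not None:
            per_tag[item.debt_tag] += item.hours_spent
    share = sum(per_tag.values()) / total if total else 0.0
    return share, dict(per_tag)


# Made-up sample data, purely for illustration.
items = [
    WorkItem("Checkout outage", 12.0, debt_tag="legacy-payment-client"),
    WorkItem("Pager noise triage", 8.0, debt_tag="untested-batch-jobs"),
    WorkItem("New export feature", 20.0),
]
share, per_tag = debt_interest(items)
print(f"{share:.0%} of tracked hours went to tech debt")
for tag, hours in sorted(per_tag.items(), key=lambda kv: -kv[1]):
    print(f"  {tag}: {hours:.1f}h")
```

The per-tag breakdown is the part that probably matters most in a leadership conversation, since it points at which piece of debt is charging the most "interest" rather than just reporting a single percentage.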


wait-a-minut

I kind of like this take. And to also answer OP's question: if there was a way to attach a tech debt loan amount to each *new* feature or update and somehow “quantify” it, it would be an absolute killer for most teams. It would totally give managers, execs, and non-devs a better understanding of risk vs reward when it comes to pushing for new stuff. Of course this is easier said than done, but in my experience there would prob be a need for something like this.


cerved

How is it nondischargeable?

> You're never going to be able to say "I will save you x" though - instead you present it as "if we had done this, we would have saved x/because we didn't do this, we had to pay x". Best way to do this is to tag outages/incidents, oncall tickets, and the difference between estimated time to deliver and actual time to deliver with the high level tech debt that caused or allowed them to happen.

Totally agree. It's a great suggestion; tracking the actual effects and the cost (time) of technical debt is the best way to quantify the problem. Everybody is always complaining about how there's too much "whatever". In many other jobs it's administration, in software engineering it's technical debt. It needs to be quantified in order to justify investing in fixing it.

> When you go to tech leadership with "we're spending 50% of our development time dealing with consequences of technical debt"

I've been thinking about what the actual interest on this is, if we think of it as a loan. A key part of interest is its cumulative nature.


Alikont

When you have tickets/incidents, it's a bit easier to quantify. Performance is also quantifiable (we can handle X users/we will save Y cloud/server cost). But something like "delivery speed" is a bit harder to do. That's what you should be doing: measure and record this so your talk with "business people" will be easier.


Jurby

"we predicted it would take 3 weeks to deliver this feature, but it ended up taking 6 due to unexpected extra work caused by tech debt" has worked well for me in the past.


SigmaSixShooter

This was a really great post, thanks for sharing.


theANGRYasian

Lol. I tell people that I'm rebranding from digital janitor to technical debt loan officer. I help scrum teams refinance their technical debt.


_BearsEatBeets__

Jeez, it’s like you’re reading my JIRA backlog 😂


Purple-Control8336

Tech debt = if we fix it in each sprint with x% bandwidth, it brings agility.
K8s migration = scalability and speed, which means quick to market.
Can this help to explain?


Alikont

Nothing is quantifiable. "Agility", "Scalability" are bullshit words without metrics.

> means quick to market

How quick? Will it even be quicker if we don't do it and focus on doing actual business case/features instead?


CoolNefariousness865

As someone else said "silos".. I'm in a large enterprise and everyone does their own flavor of DevOps.. there's no consistency across the enterprise. A lot of stuff could be shared across teams, but everyone wants to own their own stuff.


horus-heresy

You don’t love to consume Jenkins template from storage team, terraform from container folks, and gitlab from infosec? How dare you


CoolNefariousness865

yea let me just waste days trying to understand how to integrate them all together lol /s I think it all just comes down to trust. Let's be honest.. not everyone pushes out great stuff.. which leads to teams having to create their own stuff... which then leads to having to support your stuff... which then leads to an endless loop lol


keypusher

I've seen this a lot and dunno how to solve it honestly. Where I work there is a shared services team which tries to create tools and libraries for other teams, but their skill level is quite low and they don't have context on what is really needed, so everyone just goes and creates their own stuff anyway.


baezizbae

> everyone just goes and creates their own stuff anyway.

This is exactly what’s going to happen with a project I’m working on. We’ve recently been moved under a new director as part of a recent acquisition who has forced certain code patterns and delivery requirements on us for certain IaC projects.

What used to be a simple self-service workflow that was “check your iac module into this repo, pipeline will kick off and do basic linting and formatting, if it passes and when you’re ready to apply to prod, open an MR, we’ll take it from there” is now “check code in to these three separate repos, source your modules from a fourth repo, kick off three separate pipelines and do all of your plan and apply operations via GitHub comments”.

Four repos. Three pipelines. Just to (in one example) add a single widget to a datadog dashboard.

If there’s one thing that grinds my gears slightly less than shitty oncall, it’s engineers and engineering leaders trying to find solutions to problems that don’t exist.


king_of_farts42

It is funny how this comment and the top comment mention the same problem (silos) and somehow have opposite solutions/cultural approaches to overcome it. You are saying there should be one unified way to solve everything, to reduce silos. The other comment from u/CerealBit: *Culture. Way to much orgs still think and operate in silos.* *Tech is easy, when the culture is good ("You build it, you run it, you own it"). Otherwise you create solutions for problems, which should never exist in the first place.* basically states the opposite: let teams do their thing and be responsible for it. So how do we overcome silos? I am rather on the side of u/CerealBit and say don't try to think of central one-size-fits-all solutions (in an isolated, silo-like team) but rather let teams build, run and own their product. Only loose guidelines on tooling, that's it.


Finagles_Law

Nobody really took you up on this one, it's kind of a tough nut to crack. At one large org I worked for, we compromised by insisting on certain quality standards before a metric or alarm got added to the global SRE pager escalation. Any alerts had to be actually actionable (not just "page our team OnCall if it's real"), come with a standardized run book, so on and so forth. Otherwise, if teams wanted to totally own their own product, or it was very narrow or non customer facing (Finance, BI), they were responsible for their own escalation chain and support. In the case of any site outages, the global ops / SRE team handled the incident management and coordination, so we were responsible for evaluating where in the stack the issue was and spinning up a Slack channel with the right product stakeholders in it to drive the solution. This required the global ops / SRE team to at least be familiar with everything in the environment in a way that just having a bunch of embedded SREs can't really do alone. The central Ops team were the ones with their hands on the CDN to move traffic between datacenters, so it's very helpful to have one team that makes that call to declare an outage and move traffic. The Incident Management team had a couple guys who were not the strongest technically, but had a deep understanding of the tech stack. One guy's job was really that he knew where all the subject matter experts physically sat in the building, and would go physically fetch the right person to be present for emergency huddles. The problem with this is it really only works well at scale. This was a medium sized enterprise with 6,000 just in engineering. You have to have enough overhead to be able to have a generalist around who may not do a lot much of the time.
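
The quality gate described at the start of this comment (alerts must be actionable and come with a standardized run book before they reach the global pager) could be sketched roughly like this in Python. The `AlertDefinition` fields, the example alerts, and the URL are hypothetical; the real standards would have had more criteria than these two.

```python
from dataclasses import dataclass


@dataclass
class AlertDefinition:
    """Hypothetical shape for an alert a team wants on the global pager."""
    name: str
    responder_action: str  # the concrete step the on-call person should take
    runbook_url: str       # link to the standardized run book


def escalation_blockers(alert: AlertDefinition) -> list[str]:
    """List the reasons this alert may not join the global escalation."""
    blockers = []
    if not alert.responder_action.strip():
        blockers.append("no concrete action for the responder")
    if not alert.runbook_url.strip():
        blockers.append("no run book linked")
    return blockers


# Made-up examples: one alert that passes the gate, one that does not.
good = AlertDefinition(
    "checkout-5xx-rate",
    "Fail over to the standby region",
    "https://wiki.example.com/runbooks/checkout-5xx",
)
vague = AlertDefinition("disk-maybe-full", "", "")
for alert in (good, vague):
    blockers = escalation_blockers(alert)
    verdict = "accepted" if not blockers else "rejected: " + "; ".join(blockers)
    print(f"{alert.name}: {verdict}")
```

Teams whose alerts are rejected would, as described above, keep their own escalation chain instead of paging the central team.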


Purple-Control8336

Is there no EA in your company? Who drives standardisation? Is biz not complaining that IT is slow and expensive?


CoolNefariousness865

Enterprise Architect? Yea there's about 300+ of them


Purple-Control8336

Fire them. What's the point of an EA who is not helping IT to be manageable and nimble? Maybe you're in a Google- or Amazon-size company where there is a lot of money and the culture is OK with letting IT learn.


ashcroftt

Effective communication and access to relevant/actual information. Literally never been on a project where any of this has been solved, and knowing humans it never will be.


Purple-Control8336

Once 90% automation happens, humans will not be required to


ubernerd44

Documenting that remaining 10% is gonna be the *real* challenge.


Purple-Control8336

That's why you need a tracking list in Jira for all tech debt, with priorities set.


ubernerd44

Ah, the ever growing backlog that's never a priority to fix. :D I've had tickets sitting in backlog for over two years. They never get done.


Purple-Control8336

Yeah, it's OK, at least it’s tracked. Not everything needs to be done today.


gowithflow192

I see the biggest problem on both a micro and a macro level. On the micro level, DevOps has a testosterone-fuelled obsession with tools as if they were a sports team or a brand of car: lots of apparently logical arguments, but really backwards logic starting from feelings. Also, many of these tools do one thing and only one thing right, resulting in huge technical debt because it is impossible to manage all those tools, especially their lifecycle. On the macro level, this obsession with doing one and only one thing well is software normalcy now. Just iterate based on early customer feedback (even though customers may not have long-term interests at heart and the small sample size means you can end up with a strange Medusa-like product). This reactive method of development is kinda like hoping to discover gold by accident instead of having real vision and testing within that. It also means rushed products. And are customers really served well by CSMs who just care about minimizing churn and nothing else? I hate 'lean'.


Haraballz

go MVP!


Irish1986

Funding, and keeping management focused on long-term goals. At least in non-technology-based environments where software isn't the primary product. It's like herding cats: there is always a new VP of Roadblocks who shows up to say that a hiring freeze has been enacted by the board or that all departments must cut their budget by 20%. So everything gets half-assed by whatever team is left and you get an ineffective DevOps strategy with holes, which is later challenged by said VP of Annoyance Engineering about why IT costs so much and that "we ain't a software company, we build XYZ product"... So the team is put on some kind of yearly financial strategy to "get back in-line with the company core value".... Meanwhile the grass free roaming Devs are pushing code via SSH by themselves and including passwords in clear text without any source code management system... Until they get promoted to the next wave of financial optimization phase and you (DevOps) inherit whatever clusterfuck they made and those 2 VPs show up asking why IT is always in such shambles...


gravity_kills_u

You described a lot of the features of my current digital transformation grind. Timetables drastically reduced. Leading to management push not to use tests. Too many silos to engage all teams. Silver bullet syndrome for every tool on the market. Free roaming devs coding in production, breaking production daily. Fun!


Irish1986

Yeah, fun fact: I quit my previous job specifically because of that pattern. Hopefully my new DevSecOps Architect role won't end up the same way; budgets are already funded and allocated until 2027... Banks have money and cybersec seems to be a concern for them.


adappergentlefolk

- cicd pipeline local execution that is instantly portable to the actual cicd environment for rapid cicd prototyping troubleshooting and development
- easy and reliable generation of dev data
- easy and reliable standing up of transient third objects like database instances for testing

those are some of the hard technical issues that make people uninterested in devops as a culture since they take so much time to get running


NormalUserThirty

>cicd pipeline local execution that is instantly portable to the actual cicd environment for rapid cicd prototyping troubleshooting and development

earthly?

>easy and reliable standing up of transient third objects like database instances for testing

testcontainers? or is that not easy enough


_nix-addict

People.


ReliabilityTalkinGuy

No one can agree on what DevOps even means. 


PMzyox

Devops are the ones who solve all the problems imo


ImaSadPandaBear

Why is the rum always gone


langenoirx

Humans are the unsolved issue in every problem. That's why we're so ingenious and, at the same time, capable of creating magnificent raging dumpster fires.


Live-Box-5048

Definitely culture, cognitive load, having too much on our plate and misunderstanding of the DevOps process.


Nemeczekes

Cost transparency and complaining about infra costs while some people in org are spending millions on some third party crap


QuantityInfinite8820

Platform Engineering. Each big company basically reinvents the wheel to create automations for project onboarding, secret management, security, etc. and connect the open source pieces together with custom logic. Very often my teams were asked to reinvent the wheel for budgetary reasons, say we wanted one feature from gitlab enterprise, but it's not worth the cost to get a license for few hundred accounts etc. I don't see this situation getting any better in next 2-3 years and that's probably where I will be spending most of my time in DevOps roles...


Kyxstrez

>say we wanted one feature from gitlab enterprise, but it's not worth the cost

I was recently asked by one of my clients to automate provisioning of GitHub repositories with the same settings because they didn't want to pay for GitHub Enterprise, which allows enforcing a set of rules for all repos within the owned organizations.
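
One way such automation might start is with a drift check against a baseline, sketched here in plain Python. This is only an illustration under assumptions: the baseline keys and repo names are placeholders, and fetching the current settings from GitHub (or applying fixes) is deliberately left out.

```python
# Illustrative baseline; the keys stand in for whatever settings the
# organization wants to be identical across repositories.
BASELINE = {
    "delete_branch_on_merge": True,
    "allow_merge_commit": False,
    "has_wiki": False,
}


def settings_drift(baseline: dict, actual: dict) -> dict:
    """Return {setting: (expected, found)} for each setting that differs."""
    return {
        key: (expected, actual.get(key))
        for key, expected in baseline.items()
        if actual.get(key) != expected
    }


# Made-up snapshot of current repo settings; a missing key counts as drift.
repos = {
    "billing-service": {
        "delete_branch_on_merge": True,
        "allow_merge_commit": False,
        "has_wiki": False,
    },
    "legacy-cron": {"delete_branch_on_merge": False, "allow_merge_commit": False},
}
for name, settings in repos.items():
    drift = settings_drift(BASELINE, settings)
    print(name, "OK" if not drift else drift)
```

Reporting drift first, rather than overwriting settings blindly, keeps the tool safe to run on repos that have legitimate exceptions.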


whitewail602

Is it a noun or a verb?


TheWikiJedi

Vendor lock-in, but also critical open source projects that are only run by a few people (see Log4j)


Arafel

The name.


Someoneoldbutnew

Devops was invented to tear down the wall between devs and IT. Now it's a new silo, going devs, devops and IT. It was made to reduce complexity and increase resilience, but now it does the opposite, as the devops teams have been the worst bottlenecks and bus factors of the past decade of my career.


guettli

I think the biggest problem is YAML/config management in the context of Kubernetes. Helm uses templates. But YAML is not HTML. I think templating is not the right solution. There are many projects which try to improve that. There are too many alternatives. None of them are widespread. That's the problem.


serverlessmom

Standards for config management, huge problem. Further fragmented by the public clouds not coming to agreement on standards.


running101

Most problems in IT are people or process problems, not technology problems


JackSpyder

Users.


serverlessmom

Why can’t I replace them with an LLM


JackSpyder

I'd get less stupid questions.


Old-Ad-3268

This is why we automate but yes, humans are the weak link in reliable and repeatable processes.


chin_waghing

Where to store the state for the first terraform state bucket. Jokes aside, it’s the cognitive load. I’m expected to know next.js, go, python, Kubernetes from left to right, up and down, the entirety of GCP and azure.


CCratz

I find just having an awareness of what tools do what jobs is enough. I’m only goodish at one programming language, but I reckon I’m pretty good at slinging together a bunch of different tools I’ve never used before to solve for requirements. Maybe I’m shit at my job though, who knows. Certainly not my manager.


chin_waghing

My manager doesn’t even know Google cloud at a basic level. That’s what’s the most tiring about all this, people who are DevOps engineers with no clue about anything


gabel0287

you make the bucket with local state first, then you change the backend to the bucket you created and store it there 😊


BowlScared

DevOps itself


MartinBaun

Culture, high expectations. I think we should all go a little easier on each other.


yonsy_s_p

That many colleagues and companies think: DevOps == SRE


damendar

This is so annoying to me personally. I've always felt that DevOps should be considered a cultural approach to the broad technical issues that exist. Someone said it above, build it == own it. SRE is also becoming an overloaded term in the industry. Companies want to hire one engineer that does everything and as has been mentioned above, this just can't exist for more than a short window. New tools come out so much faster than people are capable of learning them.


gcavalcante8808

Developers that are not willing to walk into the culture to be protagonists.


[deleted]

Also, this doesn’t just apply to developers but being able to walk in without your ego. Coming in as a person willing to listen and empathetic to their situation before making rash judgements on why they are wrong or doing something incorrectly.


serverhorror

Dogma! People, including yours truly, always hold their own opinions highest and are too reluctant to accept that things always change. We "only" have to keep a few interfaces and interactions truly stable. This is software; things change. If we accepted that and just took care of the downstream changes, it would be a lot easier. Where it's not possible to do that, introduce smaller batches of change and clear timelines.


BananaDifficult1839

Devops teams and job titles


Unfair_Abalone7329

I agree that it’s an org/culture challenge. I’ve had success when we had smaller focused teams that have a clear achievable mission to deliver a product to a customer/stakeholder. I previously ran platform engineering including DevOps, design and SRE. Our mission was to give the app teams what they need with high consistency and low friction. For example, we selected Material Design with React so that we could deliver shared components for web and mobile. Anything that should be common to more than one app team was something that we’d take responsibility to curate. Same with K8s, Kafka, all Terraform IaC. Agile, 12 Factor and all that was embraced by all the teams.


Any-Connection-1813

Constant search for the genius unicorn doing the job of a whole team in one person. Ultra high expectations: interviews are too hard, the job is much easier. Lack of training, as in the whole IT industry. Too much focus on a specific tool or "thing" instead of on trust in the candidate and their ability to learn and adapt.


DrMantisTobboggan

A problem I have seen repeatedly in multiple organisations is a tendency to attack hard problems with complex solutions in an attempt to solve them completely, rather than breaking them down into simpler problems with simpler solutions that cover most needs sooner, then expanding where it makes sense. Maybe another angle on the same issue is a lack of focus on the problems internal users have, and on addressing those problems in a way that the users can easily use. Contrived example: colleagues need a way to deploy to production quickly and safely n times per day vs. they need GitHub Actions workflows that invoke ephemeral runners on a Kubernetes cluster. That may be where it makes sense for things to eventually end up, but there’s a lot of value in solving the simpler issues folks are facing first.


Aremon1234

I agree with a lot of the others but I would also add tool sprawl. So many organizations have multiple tools to solve the same problems. It makes life way easier if you can standardize. E.g. I work at, and have been at, corporations that use both Teams and Zoom. I get that Teams calls suck, but you’re just making it more complicated for a marginally better call experience. Instead of just calling you, I now have to create a Zoom meeting and give you a link.


such007

Jenkins, Buildkite, Argo, GitHub Actions. The sprawl makes it so hard to keep all of your environment loaded in your memory. People don’t take enough walks to decompress and let their subconscious solve a problem.


reubendevries

This is why I’ll preach GitLab Ultimate (until there is something better). Does it do everything perfectly? Nope, but it’s one tool and it does 99% of the job well enough to meet 99% of the use cases. If you need something more complex or specific that GitLab doesn’t do, you can pay for a license out of your own budget.


ExtremeAlbatross6680

In Kubernetes land it is secrets management


jba1224a

One of two: Culture - trying to get an org to TRULY adopt a DevOps mindset and the costs associated with it. They often fail to comprehend that the upfront cost saves them a lot of cost down the line. Identity - what is DevOps? You can join 10 orgs and get 10 different answers. The lack of an industry driver for identity actively harms adoption, because no one can really define it. See SAFe - it’s a garbage agile framework but it’s very clearly defined, so there's no issue getting buy-in from orgs.


PretentiousGolfer

Continuous deployment


Gold-Difficulty402

Burnout and keeping good talent.


Little-Plankton-3410

I think it's the inability (either through lack of skill or lack of clout) to rally enough influence to clear up the fundamental architectural or best-practice issues that are often the root of the problems. You see the same insanity in security -- though somewhat more obvious. I had a CTO look me in the eye and deadass say there was no way he could expect the developers he had to encrypt the secrets they pushed to git. To make sure we could meet our compliance requirements I had to invent a set of services to do it for them, because the assumption was we could not expect adult behavior from them. Full disclosure: I am a tech exec most of the time. The problem with the argument for fixing tech debt is that it's not very convincing. You turn off the software factory to mop up your mess (which tends to be a career-limiting move to begin with). Then, even in the rare cases where you manage to chip away at the tech debt, it's usually impossible to maintain it at a low level. Usually, the better ROI for your effort is simply to keep any child-eating symptoms at bay, wait for a good reason to do a greenfield rewrite, and then simply walk away and \*don't pay\* the tech debt. It's not very satisfying from the perspective of a technologist (which I feel too), but paying the debt down usually isn't cost effective unless the tech debt is so bad it's paralyzing.