Turning Failures into Success: How Overcoming Challenges Fuels Software Engineer Growth

A few weeks ago, I delivered an extremely challenging project, and now I am using that energy to write this post, which I have been postponing forever. If you want to grow as an engineer, you must take on challenging projects, including the ones most likely to fail.

Why

It may sound counterintuitive, because you need successes in your career to grow as an engineer. But success on trivial projects won’t teach you much. Projects on the verge of failing are where the learning is, and if you can turn one into a success, I guarantee the personal growth will be astonishing.

When I joined this project back in September, the team was demotivated. The Dev and the Quality teams were in an enormous debate because nothing worked as expected. The Dev team was giving impossible ETAs because they were under pressure to deliver quickly. I saw little chance of success. So I was excited to join and help push it to the finish line.

In this project, we needed to migrate more than 700 TB of files and more than 500 GB of MongoDB data from Azure to GCP with no downtime, and eventually flip the DNS from Azure to GCP. It was the biggest migration I have ever done. The bigger challenge was that the two systems had many differences in their database models, yet we had to keep them in sync for a while until the DNS flip.

How does one parachute into a falling project with an established team and still succeed

It is easy to join successful projects and teams. You just pick a task and go do it. However, joining teams that are on the edge of failing is not that easy. A failing project team probably won’t be so open because they are being pressured, and some may think of your arrival as an attempt to take their places. At times like those, conflicts are almost inevitable. But if you want to have a big impact, you must learn how to deal with them.

I never seek conflict, but I don’t avoid it either. In fact, the best solutions often come out of polite conflicts between two smart people. In a situation where the team is defensive, you may need to “invade” the team by choosing a problem to fix and digging into it until the end. Along the way, bring in people from the project as needed to help you understand the problem correctly, and show them what you are up to.

Document the process: “The only thing worse than failure is passing by accident.”

My first step in any project is to collect information. My first task is to create a note named “how-to-PROJECT-PROBLEM-SOLUTION.md” and start storing everything I learn in it.

Then I ask around: “Can you explain to me how this works?” I take notes on every important piece of information and draw my own understanding of the problem. Since people tend to agree more than disagree, I then dig into the code to validate what I have learned. For a project that has been going on for a year, this phase should not take more than a couple of days.

Decide where to start: avoid burnout and prevent regressions by starting with the tests

It is time to decide what task to do first, and how to shape it for delivery. In this case, I suggest starting with a test. “Are you fucking crazy? Join a delayed project and start by implementing a test instead of fixing a known problem??” Yes!

The migration project, for example, was delayed, and there was lots of pressure to deliver. Yet I decided my first step was to write a new test. My test would read every file from the source and reconcile it with the destination to ensure it was migrated correctly. Running this over everything is not easily doable in reality, since there would be billions of files and records to move, but it is a great test for a subset of the data. The test was simple and did not take more than a day to build, and it pointed out that ~35% of the files were left behind at some step of the pipeline. Unbelievable, but in such a complex pipeline with little coverage, a few failures in each of the nine steps added up to that insane percentage in the end.
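A reconciliation test like the one described above can be sketched roughly as follows. This is only an illustration, not the code from the project: the names `reconcile` and `ReconciliationReport` are hypothetical, plain dictionaries stand in for the source and destination storage, and comparing SHA-256 checksums is an assumption about how "migrated correctly" might be checked.

```python
import hashlib
from dataclasses import dataclass, field


def checksum(data: bytes) -> str:
    """Content hash used to compare a source file with its migrated copy."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class ReconciliationReport:
    """Outcome of comparing a sample of source files against the destination."""
    total: int = 0
    missing: list[str] = field(default_factory=list)      # never arrived
    mismatched: list[str] = field(default_factory=list)   # arrived with different content

    @property
    def missing_ratio(self) -> float:
        return len(self.missing) / self.total if self.total else 0.0


def reconcile(source: dict[str, bytes], destination: dict[str, bytes]) -> ReconciliationReport:
    """Check that every sampled source file exists, unchanged, in the destination.

    Both arguments map a file path to its content. In a real migration they
    would be replaced by listings of the two storage systems (an assumption).
    """
    report = ReconciliationReport(total=len(source))
    for path, data in sorted(source.items()):
        if path not in destination:
            report.missing.append(path)
        elif checksum(destination[path]) != checksum(data):
            report.mismatched.append(path)
    return report


if __name__ == "__main__":
    # Tiny made-up sample: two files were left behind, one was corrupted.
    source = {"a.txt": b"alpha", "b.txt": b"beta", "c.txt": b"gamma", "d.txt": b"delta"}
    destination = {"a.txt": b"alpha", "b.txt": b"BETA"}
    report = reconcile(source, destination)
    print("missing:", report.missing)
    print("mismatched:", report.mismatched)
    print(f"missing ratio: {report.missing_ratio:.0%}")
```

Run on a sample of the data, a check like this turns "nothing works" into a number that everyone can see. For a sense of scale, and assuming every step independently drops the same fraction of files (a simplification), losing about 4.7% at each of nine steps compounds to roughly 35% overall, since 1 - 0.953^9 ≈ 0.35.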

When you find problems like that, it is an opportunity to build trust with the QA (test) team. At that point, I usually call them privately and show them the issue, explaining it and providing context. The biggest point of tension between Dev and QA is the failure to acknowledge problems. When you bring in the quality team and show them your failures, they will trust you, because they know you won’t try to hide problems and that you value their point of view about them.

The same goes for management: communicating directly and clearly gives them the right level of confidence in the project. In that situation, I immediately told them about the 35% of missing files, that I did not yet know where they had failed, and that my next step would be to investigate.

Using this test as a check for my work, I could dig into each component of the stack with much more confidence, and I reported and documented each failure I found. The problems may be one-off or systematic. In my particular case, we had a critical problem: all of the processes lacked monitoring and observability. No one can manage a complex process with nine steps processing billions of records without visibility. But not only that, most of the steps also needed retries, proper error handling, and alerting in case of failures.
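To make the retries, error handling, and alerting point concrete, here is a minimal sketch of a wrapper for a single pipeline step. It is an assumption of how such a wrapper might look, not what the project used: `run_with_retries` and its parameters are hypothetical names, the backoff is a simple exponential one, and the alert is just a callable (by default a critical log line) standing in for a real paging or chat integration.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("migration")


def run_with_retries(
    step_name: str,
    step: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    alert: Callable[[str], None] = logger.critical,  # stand-in for a real alerting hook
    sleep: Callable[[float], None] = time.sleep,     # injectable so tests need not wait
) -> T:
    """Run one pipeline step, logging every attempt and alerting if all of them fail."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            result = step()
            logger.info("%s succeeded on attempt %d", step_name, attempt)
            return result
        except Exception as exc:
            logger.warning("%s failed on attempt %d/%d: %s",
                           step_name, attempt, max_attempts, exc)
            if attempt == max_attempts:
                # Out of attempts: make the failure visible, then re-raise it.
                alert(f"{step_name} failed after {max_attempts} attempts: {exc}")
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")  # the loop always returns or raises


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    attempts = {"count": 0}

    def flaky_copy() -> str:
        # Made-up step that fails twice before succeeding.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise ConnectionError("transient network error")
        return "copied"

    print(run_with_retries("copy-files", flaky_copy, base_delay=0.1))
```

The point of the design is that a failure is never silent: every attempt leaves a log line, and a step that exhausts its retries both raises and alerts, so lost records show up when they happen rather than in a reconciliation report later.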

Delivering rather than debating

Once you have a good test backing you up, you may find that people on the team will want to team up to solve the problems it finds. When a failing test shows the problem, there won’t be much debate about whether the problem exists. The core point here is to be careful not to spend all your time on one tree and lose sight of the forest. Make sure to work on the problems with the biggest impact and document the smaller issues for later.

Build trust with the project’s team

As you dig into the problems, bring in the original code owners to help, and make it clear to the other team members that you are working together. They will trust you because you are showing everyone that they are also part of the solution.

When joining failing projects, it is easy to point blame. Instead, be the one who points to solutions and shares the glory of success. A blameless approach will make people trust you more quickly. And trust is the most valuable resource a team has.

Communicate uncertainty

When working on projects that you are not familiar with, you will face a lot of uncertainty. The core tip here is to share your level of uncertainty with management and with your peers. If something is a source of uncertainty to you and some of your colleagues are not worried about it, ask for their point of view to enrich your knowledge about it. Either they or you may be looking at the problem from the wrong perspective. It is important to reduce uncertainty as much as possible, as soon as possible.

Conclusion

Success on trivial projects will teach you too little. Join challenging, failing projects for real growth. Document the process, start with tests, communicate clearly, and deliver results, and your growth will be faster than you think.

Written on April 9, 2023