Mentoring Software Engineers

By: Matt Brown

Nothing has advanced my skill and career more than being around programmers who are much better than I am at programming. There's something awe-inspiring when you first meet people who are better at coding than you. They are like wizards, commanding all of nature to bend to their will. In an hour, they sling out code it would take you weeks to get finished. Maybe you've gotten a chance to work with these masters of computation and hoped that some of their skills would rub off on you.

The first time I experienced this was when I started my new job at a game company. You hear stories about the game industry: it's hard work, lots of hours, competition can be tough, etc. All of that can be true, but one thing that caught me by surprise is how incredibly smart every single person who works in games is. And this was especially true for the lead engineer, Ryan.

When I met him, he was like a programming god. He knew everything about all of the application’s code: server, client, infrastructure code, you name it. He would write in 3 or 4 languages in the course of the day without breaking a sweat.

I wanted these powers for myself, so Ryan mentored me through what were probably the two most productive years of my life.

Since I started my career as a software engineer, I've intentionally sought out participation in both sides of the mentoring process. I've watched my father (who's also a software engineer) transform complete newbies into effective and highly respected engineers. From him and others I've learned that good software mentors stress writing more code, encourage finding your own answers, provide challenges, and most importantly are respectful of those less skilled than them.

Write code

I read this all the time (and I'm sure you do too), but it bears repeating: the best way to get better at programming is to write code. If someone you're mentoring is complaining about not getting better, ask her how much new code she’s written recently. There are plenty of resources available for developers to start writing new code regardless of their level of programming skill.

  • Code Jams: Code jams are a great way to learn about your coding ability. Usually happening over a weekend with some specific coding goal in mind, they are a whole development process crammed into a couple of days. The Global Game Jam provides information on where you can meet up with other developers to participate.
  • Working on an open source project: There are countless open source projects that need bug fixes and features developed. Look through GitHub with your mentee to find a project that he could support.
  • Emulating a simple project: Rewriting a simple application, such as a game, can give your mentee a focused way to practice coding, as he won't have to worry about requirements.

Get them to RTFM

When I was a kid, I would often go up to my parents and ask them how to spell some word or what a word means, and they would almost always tell me the same thing: look it up. It's not that they didn't know the answer (although I'm sure they didn't always know the answer despite both of them having English degrees), they were trying to teach me something.

Trivial knowledge, such as API calls or language syntax, is the sort of thing that is very quick and easy to look up. Encourage the person you're mentoring to Google for answers to simple questions; for example, if she asks you what the syntax for scp is, tell her to check the man page.

Not only will the act of looking things up and reading documentation make your mentee a much better developer, it will seriously cut down on the number of trivial questions you get.

Provide challenges

As a mentor, one of your jobs should be to provide challenges to your mentee. In a mentoring relationship where the mentor is much more experienced than the mentee, there's going to be an implicit challenge for the newbie to try to become better than the expert. Offering explicit challenges to your mentee will help push him beyond his notions of what he is able to accomplish and get him closer to becoming a master.

However, you can’t just give a challenge to someone and then ninja vanish. Make sure that the challenges you give are attainable, and be ready to assist. Try to have a good idea of how long a particular task will take, and check in with your mentee if the task is taking longer than expected.

Ryan was particularly good at challenging people. Often he would give me tasks, and I’d have to google half of the words he used to even know what he was asking for. But knowing he completely believed in my ability to accomplish any challenge he gave me helped make it possible to complete the challenge.

And the sense of pride I felt on finishing that task made me hungry for more.

Be respectful

Arguably the most important skill you can learn as a mentor is how to show respect to your mentee. You may have more experience, but that does not make you smarter or more important than the person you are mentoring. Here are some very simple ways to show respect:

  • Put your screen away: The text/code/uncrushed candy is not more important than the conversation you are currently having. You may think you're good at multitasking, but if you're looking at a screen while talking to someone you are essentially communicating, "I don't think you're more important than whatever is going on here."
  • Keep your time commitments: If you make a meeting commitment with your mentee, do not cancel it (unless you have a legitimate emergency). When you cancel a meeting with someone, you are communicating, "you are less important than whatever I am replacing our time with."
  • Be patient with questions: You are going to get a lot of questions, and sometimes the level or frequency of these questions can be frustrating. Remember that your mentee is asking you a lot of questions because she respects your skill and ability. Getting visibly annoyed or frustrated can make your mentee nervous about seeking your advice in the future.

Why mentor?

Mentoring someone does not just make that person a better engineer, it makes you better too.

By teaching someone else about your craft, you help to further reinforce your understanding. When you challenge someone to move beyond their skill, you think about how you can challenge yourself. Learning to be more respectful towards those with less skill or experience than you just makes you a better person.

Mentoring improves the skills of everyone involved.

Robust and Resilient, Do They Lead to Antifragile Software?

By: Chris King


In the field of software engineering, we as developers are always striving for improvements in our process of building software. We seek to improve the total throughput of our systems, their scalability to performance demands, and the clarity of the codebase. During this search, countless methodologies have been created (Test Driven Development, Agile, Waterfall, Domain Driven Development, Behavior Driven Development, etc), all of which are constantly evolving and changing due to research in our own field, and in others such as behavioral economics, complexity theory, and systems theory.

The key to understanding and learning from any discipline is clear communication of the ideas and concepts being discussed. This article aims to illustrate ways in which a few developers have misinterpreted the research on antifragility, to connect what our discipline already does to achieve robust and resilient systems, and to explain why antifragility does not make sense as a goal.

The concepts of fragility, robustness, resilience, and antifragility are useful models to classify the strengths and weaknesses of a system. By understanding these concepts we can hope to identify how much we should rely on our applications and what we can do to improve our applications.

What is Antifragile?

In Nassim Nicholas Taleb's "Antifragile," he defines antifragile as "beyond resilience or robustness. The resilient resists shocks and stays the same; the antifragile gets better." He continues, helping us identify objects as antifragile or not:

"It is far easier to figure out if something is fragile than to predict the occurrence of an event that may harm it. Fragility can be measured; risk is not measurable (outside of casinos or the minds of people who call themselves “risk experts”). This provides a solution to what I’ve called the Black Swan problem—the impossibility of calculating the risks of consequential rare events and predicting their occurrence. Sensitivity to harm from volatility is tractable, more so than forecasting the event that would cause the harm. So we propose to stand our current approaches to prediction, prognostication, and risk management on their heads. In every domain or area of application, we propose rules for moving from the fragile toward the antifragile, through reduction of fragility or harnessing antifragility. And we can almost always detect antifragility (and fragility) using a simple test of asymmetry: anything that has more upside than downside from random events (or certain shocks) is antifragile; the reverse is fragile."

In the above snippet Taleb references his earlier work on Black Swans and proposes that the goal of antifragile entities is to withstand Black Swan events when they arise and to be made better by them. Taleb's criteria for a Black Swan are:

  1. The event is a surprise (to the observer).
  2. The event has a major effect.
  3. After the first recorded instance of the event, it is rationalized by hindsight, as if it could have been expected; that is, the relevant data were available but unaccounted for in risk mitigation programs. The same is true for the personal perception by individuals.

Before continuing it would be useful to define all of the core terms and how they will be used through this article:

  • Fragile - (of an object) easily broken or damaged (Merriam-Webster)
  • Resilient - The capacity of a system, enterprise, or a person to maintain its core purpose and integrity in the face of dramatically changed circumstances. (Resilience, p. 7)
  • Robust - A system or entity that has been hardened so that it is not easily broken, while lacking the recovery abilities of a resilient system. (Resilience, p. 13)
  • Antifragile - "Beyond resilience or robustness. The resilient resists shocks and stays the same; the antifragile gets better." (Antifragile)

Given how many people have struggled with fragile software, antifragile software seems like an amazing goal to have, but what would that actually look like? Can code actually become stronger when chaos and failures are hurled at it?

Antifragility and Software

With antifragility established as a goal to strive for, many developers who read Taleb's book started to float around the idea of antifragile software, or software that becomes more powerful when a Black Swan arises.

In Vikas Singh's blog post Anti-fragile Software he begins with the definitions for fragility, robustness, and antifragility, then proceeds to discuss how software has been stuck in the mindset of robustness being the end goal.

Singh's definitions, as mentioned above (these may be confusing):

  • Fragile : This is something that doesn’t like volatility. An example will be a package of wine glasses you’re sending to a friend.
  • Robust : This is the normal condition of most of the products we expect to work. It will include the wine glasses you’re sending to the friend, our bodies, computer systems.
  • Antifragile: These gain from volatility. Its performance thrives when confronted with volatility.

Singh later writes of the challenges with building more fault tolerant systems:

"Traditionally we have been designing software systems trying to make them robust and we expect them to work under all conditions. This is becoming more challenging as software is becoming much more complex and the number of components is increasing."

Within a few paragraphs he then makes the following statement:

"We spend a great deal of effort in making a system robust but [not] much in making it antifragile. The rough equivalent of antifragile is resilience in common language - it is an attribute of a system that enables it to deal with failure in a way that doesn’t cause the entire system to fail."

Referring to the earlier definition from Taleb we see that antifragility MUST be beyond resilience, and as Singh's work continues, the only solutions brought forward simply increase resilience or robustness, not antifragility at all.

Singh's solutions that will be discussed later:

  • Create fault tolerant applications
  • Regularly induce failures to reduce uncertainty

So what would antifragile software actually look like? Ideally it would be a system that meets all the criteria for a robust and resilient system, yet is also able to dynamically change itself and all of its metrics autonomously. Such systems have been created, but they are incredibly complex and generally fall outside the purview of most applications; examples include spell-checking algorithms and random ranking algorithms. Because these specialized algorithms do not encompass the mission of the software most developers spend their time writing, this article will focus on the majority of software being created, not the outliers.

Finding Fragility

The journey towards a robust and resilient application starts with understanding how software can be fragile, and how to harden it from fragile to robust. Below is a simple Python function to illustrate fragility.

def add_numbers(numbers):
    """Take a simple array of numbers, iterate over each one adding
    them together, and return the result."""
    total = 0
    for number in numbers:
        total += number
    return total

If the array of integers [1,2,3] is passed, this function will output 6, as expected. Surprisingly, Python even handles [1.99,2.5,3.0] correctly, yielding 7.49. However, if ["one", "two", "three"] is supplied, an exception is thrown: "TypeError: unsupported operand type(s) for +=: 'int' and 'str'".

Providing unit tests would help prevent a developer from checking an error like this into production as the test would fail. That solution is not antifragility; it is plain robustness. If code using this function was checked into the codebase with incorrect inputs that caused a failure, the code would not become stronger during the failure, it would simply fail. That is not antifragility at all, that is robust yet fragile.
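Such a test might look like the following sketch, using Python's built-in unittest module (the test names are illustrative, not from any real codebase):

```python
import unittest

def add_numbers(numbers):
    """Return the sum of a list of numbers."""
    total = 0
    for number in numbers:
        total += number
    return total

class TestAddNumbers(unittest.TestCase):
    def test_integers(self):
        self.assertEqual(add_numbers([1, 2, 3]), 6)

    def test_floats(self):
        self.assertAlmostEqual(add_numbers([1.99, 2.5, 3.0]), 7.49)

    def test_strings_raise_type_error(self):
        # Documents the fragile case: strings fail loudly with a
        # TypeError instead of producing a nonsense result.
        with self.assertRaises(TypeError):
            add_numbers(["one", "two", "three"])

# Run the suite programmatically rather than via unittest.main()
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAddNumbers)
unittest.TextTestRunner(verbosity=0).run(suite)
```

A continuous integration server running this suite on every check-in would stop the bad input before it ever reached production.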

Robust Yet Fragile (RYF)

Robust Yet Fragile is a term used to describe systems that are resilient in the face of anticipated dangers, but highly susceptible to unanticipated threats. The term was coined by California Institute of Technology research scientist John Doyle. (Resilience, p. 27)

Our world is filled with examples of RYF systems. The electrical grid, for example, handles many isolated failures every day, but if failures start to cluster, the result is a widespread power outage. A software example of RYF could be a web application served by two servers in a simple failover configuration: if both nodes experience a failure, the service is knocked offline.
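The two-server failover scenario can be sketched in a few lines of Python (a toy model with hypothetical names, not a real failover implementation):

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down")
        return f"{self.name} served {request}"

def serve(request, servers):
    """Try each server in order: a simple primary/failover scheme."""
    for server in servers:
        try:
            return server.handle(request)
        except RuntimeError:
            continue  # anticipated failure: fail over to the next node
    # Unanticipated case: every node is down, so the service is offline.
    raise RuntimeError("service offline")

primary, backup = Server("primary"), Server("backup")
print(serve("/index", [primary, backup]))  # primary handles the request

primary.healthy = False
print(serve("/index", [primary, backup]))  # backup takes over: robust

backup.healthy = False  # both nodes down: the "yet fragile" part
```

The single anticipated failure is handled gracefully, but once both nodes are down, serve simply raises and the service is offline.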

RYF is easy to achieve with software, and is certainly a great starting point for a service, but what can we do to enable an application to actually be resilient?

Resilient Software

The path to producing resilient software starts with identifying how one can leverage the tactics of "Sense, Scale, and Swarm" (Resilience, p. 61). Each of these tactics opens up feedback loops, or channels for communication, so one's application and those who support it can know more about what is really going on. Once the feedback loops have been established, it should be easy to identify metrics which indicate how the application is performing.

For this example we will use a web application and illustrate how leveraging technology from Amazon Web Services, New Relic, or OpenStack can help facilitate the feedback loops of sense, scale, and swarm.

Just as one relies on one's senses before performing many actions, the other steps of scaling and swarming cannot be performed intelligently without sensing exactly how things are changing in one's application. Using monitoring services like New Relic, application developers can sense when performance-impacting conditions are occurring. Such conditions could be a simple traffic spike, a memory leak, an infrastructure failure, or a bad deployment. With real-time performance metrics constantly available, it becomes easy to quickly identify the cause of new problems and to determine how best to react to them.

Scaling can often resolve sudden traffic spikes or resource issues, and with AWS or OpenStack it is easy and in many cases can be fully automated based on information from the sensing infrastructure. In AWS this is called Auto Scaling, and more can be learned from the official documentation.
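The sense-then-scale feedback loop can be illustrated with a toy capacity calculation (the function name and thresholds here are hypothetical, not any provider's API):

```python
import math

def desired_capacity(total_load, target_per_server,
                     min_servers=2, max_servers=10):
    """Sense the total load, then scale: run enough servers that each
    stays at or below its target load, clamped to a sane range."""
    needed = math.ceil(total_load / target_per_server)
    return max(min_servers, min(needed, max_servers))

print(desired_capacity(total_load=150, target_per_server=50))  # quiet day: 3
print(desired_capacity(total_load=900, target_per_server=50))  # spike, capped: 10
```

A real auto-scaling group performs essentially this calculation continuously, driven by metrics flowing in from the sensing layer.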

With sensing focusing attention on the important metrics, and scaling increasing the supply of computing power (or talented developers), the only remaining step is intelligent use of that information. Combining these pieces of information and resources to tackle a problem is swarming. Netflix is a well-known example of using continual swarming behavior to ensure its service operates as intended.

A few years ago Netflix started a program called Chaos Monkey. This tool lets Netflix test its ability to sense, scale, and swarm automatically by destroying virtual machines running core bits of Netflix's infrastructure. With small random pieces of their production infrastructure constantly being removed, they could quickly see which pieces of their infrastructure healed themselves automatically and which needed more work. The program was incredibly successful in helping Netflix reduce downtime, and they have since extended it to many more specialized chaos-inducing applications. An even more destructive entity, called Chaos Gorilla, has also gained a bit of popularity. Like its monkey counterpart it deletes components of the Netflix infrastructure, but instead of individual virtual machines it simulates the outage of entire availability zones. A full list of their simian agents can be found here. When the infrastructure of an application can survive the removal of an entire geographic region of servers and continue to operate as intended, it is safe to say it is truly a resilient application.
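The idea behind this kind of chaos testing can be sketched with a toy cluster that senses missing capacity and heals itself (the names and behavior are illustrative, not Netflix's actual tooling):

```python
import random

class Cluster:
    """A toy self-healing cluster: terminated instances are sensed
    and replaced to restore the desired capacity."""

    def __init__(self, size):
        self.size = size
        self._next_id = 0
        self.instances = set()
        self.heal()

    def heal(self):
        # Sense the gap between desired and actual capacity, then scale up.
        while len(self.instances) < self.size:
            self.instances.add(f"i-{self._next_id}")
            self._next_id += 1

    def terminate_random(self, rng):
        # The "monkey": kill one instance at random.
        victim = rng.choice(sorted(self.instances))
        self.instances.remove(victim)
        return victim

    def is_serving(self):
        return bool(self.instances)

rng = random.Random(42)
cluster = Cluster(size=5)
for _ in range(20):
    cluster.terminate_random(rng)
    cluster.heal()                # a resilient system recovers each strike
    assert cluster.is_serving()
```

Running the monkey constantly in production, as Netflix does, turns recovery from a rare emergency into a routinely exercised code path.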

As mentioned earlier, Singh's two components of an antifragile software entity are fault tolerance and the regular induction of failures. Netflix has accomplished both and is still vulnerable to a Black Swan event, so while resilient, it is not antifragile.


Antifragile, self-modifying software may become more commonplace as artificial intelligence methods improve, but we seem far from that now. Until then, resilience provides the best means of ensuring the continued success of our products. Even if a Black Swan should arise and destroy the industry an application exists for, it does not necessarily matter: the skills to understand problems are in the heads of developers, and the lessons learned building a resilient platform can be used to build whatever software the new world demands. In that way we as developers are certainly antifragile ourselves, even if our software is not.

Further Reading:

  1. Antifragile: Things That Gain from Disorder - Nassim Nicholas Taleb
  2. Resilience: Why Things Bounce Back - Andrew Zolli & Ann Marie Healy

Advocates for Antifragile Software
  1. Anti-fragile Software - Vikas Singh
  2. Agile is Antifragile - Stuart Wray
  3. Antifragile Systems: Designing for Agility vs. Stability - Mike Kavis

Criticisms of Antifragile (the book)
  1. Fragile Reasoning in Nassim Taleb’s Antifragile: An Enlightenment Transhumanist Critique - Gennady Stolyarov II
  2. Nassim Taleb Is Annoying, but Antifragile Is Still Worth Reading - John Horgan
  3. Anti-Fragile Book: Why We Should Eat Like Cavemen, Embrace Religion, and Hate Bankers - Marni Chan

Not Invented Here

By: Matt Brown

The past few weeks I've been thinking a lot about the "Not Invented Here" bias, where a person subconsciously avoids innovations not created by them or their organization. The stated reasons for the bias vary from cost concerns to fear of anything you didn't write yourself. Recently I caught myself exhibiting this bias, and learned to overcome it.

At my previous employer, we couldn't use anything that cost money. All of our monitoring, databases, and infrastructure were homespun (through a variety of open source and self-managed servers). If a hosted solution was brought to my manager, he would sneer and say something like, "Why would you pay someone else to do that?" We had this mentality of penny pinching drilled into us so much that, without realizing it, I developed a bias against any piece of ops infrastructure that is not self-hosted. I consider this a subset of "Not Invented Here" called "Not Hosted Here" bias.

As part of the interview for my current job, I promised to add infrastructure and application performance monitoring to the platform to prepare for their impending launch to production. Once I joined the team, I set my estimate for two weeks to instrument the code, spin up some monitoring servers, automate the configuration of these servers, and do any other configuration required from the hodgepodge of open source tools I had planned to use for the task (shout out to graphite).

The CTO told me "Yeah, that's cool, but before you start implementing all that, I want you to try out a hosted solution."

Internally I balked. "But, those cost money! They are going to have data about our app!"

For the next couple of days I tried various hosted solutions for monitoring while fighting with the part of me trained to sneer at paying other people to do this.

I was not prepared for how EASY it all was.

Not only was it easy and relatively cheap (cheap enough that I would pay the cost out of pocket just to not have to manage the tools myself), I could instantly see the time I would save both in configuration and upkeep.

Our CTO wasn't blinded by "Not Invented Here". He knew that it'd be a much better use of time to work on actual issues with the app than to build a castle of monitoring solutions. Suddenly free from all of that work, we now have several extra weeks to run load tests and make improvements we otherwise wouldn't have had time to make before launch. By properly taking into account the bias of "Not Invented Here", we are able to focus on the code that is unique and important to our business and not get distracted by problems other people have solved.

I don't want to give the impression that I think all monitoring should be outsourced. "Not Invented Here" was a bias that was subconsciously affecting the way I was making important infrastructure decisions. Now that I am aware of my preference for avoiding other people's software, I can make smarter and more informed decisions.

Welcome to KilnCode!

By: Chris King

The focus of KilnCode is to provide a portal for the writings of Chris King and Matt Brown.