Monday, April 6, 2020

Code is Engineering; Types are Science

Programming is a diverse activity that requires reasoning in many different ways. Sometimes one has to think like an engineer—find the best solution for a problem under multiple constraints. Other times, one has to think like a scientist—observe the data points you have in order to establish general rules that help you attain your goals. These patterns of thinking have very different natures.

In this blog post, I will explain these patterns through the theory of reasoning of Charles Sanders Peirce. Peirce divides reasoning into three complementary processes[1]: deduction, abduction and induction. In the following sections we will go through these logical processes and see how they relate to software.

But before starting, I want to make explicit the underlying subject of this post: plausible reasoning. Plausible reasoning does not imply certainty or truth—it better reflects the concept of an educated guess. At minimum, these guesses must not be contradicted by the information we have at hand. At best, with more information, we can choose the guess with the most chance of being true. With that out of the way, let's dive into our definitions.

Deduction

Deduction is the process of reasoning that we acknowledge the most. It is what we learn in mathematics, and also what we think about when talking about logic. The basic schema is the following: if we know a fact A, and we know that A implies B, then we know the fact B (modus ponens). We will represent this process with the following diagram:

In philosophy books, it is very common to identify the sides of this triangle with the following phrases:

  • Socrates is human (left).
  • Every human is mortal (right).
  • Therefore Socrates is mortal (bottom).

which can be useful when trying to identify the reasoning patterns we will see next.

The practical utility of deduction is that it is able to reduce the scope of a problem: if we want to achieve B, and we know that A implies B, we can change our goal to A if we think that is useful.

One particularity of deduction is that the result of a deduction is the certainty of a fact: if we achieve A, we are certain to achieve B. As we will see, this is not the case for the other reasoning processes.

In Haskell programming, for example, deduction is omnipresent. Every time you apply a function of type a -> b to a value of type a in order to produce a value of type b, you apply deductive reasoning.

Abduction

A client sends you a bug report describing some anomalous behavior in the software you work on. You run it locally, and observe the same behavior. After some changes in the code you do not observe the offending behavior anymore, and declare the bug as solved.

The first thing to notice is that the reasoning above is not logically correct. Indeed, we can simplify it as follows:

  • I don't see the behavior anymore.
  • If the bug is solved, I should not see the behavior.
  • Therefore, I solved the bug.

It is possible that the anomalous behavior was the result of two interacting parts, and your manipulation of one of them made the underlying cause of the bug, present in the other component, not visible anymore. However this is plausible reasoning—given the constraints you have (the observation or not of bugs locally), you managed to propose a solution that is coherent, non-contradicting, that has a chance of being correct. And as developers, we know that very often, this works.

The mechanism of abduction is the following: if we know a fact B, and we know that A implies B, then we abduce A as a plausible thing. This can be represented in the following diagram:

Abduction is the process that we identify with writing code and engineering, since it is about solving a problem under constraints. This is not the only component of engineering, but it is one of its defining aspects. In the case of the diagram above, for example, the constraint is knowing that A implies B.

We can now connect abduction with writing code in a typed language. What happens when we try to make an action plan based on abductive reasoning? If we take A to correspond to "code works" and B to correspond to "code typechecks", we can act using abduction as follows:

  • I want to write working code.
  • My only information is that working code must typecheck.
  • Therefore, I try to write code that typechecks so that it can work.

This reasoning can be successful or unsuccessful depending on the context. It has almost no chance of working in the C language, but has a very good chance of working if you doing something simple in Idris. It can also describe the experience of a beginner in a Haskell codebase, and the expression "just follow the types". This will not necessarily lead to working code, but will allow the user to run the code and obtain more information about it.

In summary, abduction corresponds to using all the information one has in order to obtain a reasonable, or if the context permits, the best solution to a problem. This is typically the job of the engineer, and something that any programmer can relate to: do the best you can under the constraints you have.

Induction

Your company's server crashed at a given time last night. You read the log files and see that there is an "out of memory" error timestamped to the moment of the server crash. Therefore, you diagnose this error as the culprit for the server crash.

The first thing to notice is that the reasoning above is not logically correct. Indeed, we can simplify it as follows:

  • The out of memory error happened at time T.
  • The server crashed at time T.
  • Therefore, the out of memory error caused the server to crash.

Indeed, it can be that case that this error was safely caught and the server crashed because of an electricity problem. However this is plausible reasoning — given two facts that have the same origin (the same moment in time), you managed to establish a causality that is coherent, non-contradicting, that has a chance of being correct. And as developers, we know that very often, this works.

The mechanism of induction [2] is the following: if we know a fact B and we know a fact A, then we induce A implies B as a plausible thing. This can be represented in the following diagram:

Induction is the process that we identify with writing types and science, especially with natural sciences, because it is about establishing constraints and general rules that limit possible behaviors. This is not the only component of science, but it is one of its defining aspects. In the case of the diagram above, for example, the constraint is the established constraint is that A implies B.

We can now connect induction with writing types for a library. What happens when we try to make an action plan based on inductive reasoning? If we take A to correspond to "code is wrong" and B to correspond to "code does not typecheck", we can act using induction as follows:

  • I know some code is wrong, and I want to avoid it in production.
  • Code that doesn't typecheck doesn't run.
  • Therefore, I write types so that this code does not typecheck.

This corresponds to the post-hoc type modelling of an existing codebase, that tries to preserve important invariants and prevent future errors. The establishment of a type system is akin to the creation of a general rule that allows some desirable code examples to be valid, and undesirable ones to not exist.

In summary, induction corresponds to establishing an environment of constraints and rules so that others, while working under these conditions, produce the outcomes that are expected. This is typically the job of managers, and present in many other activities, like teaching and parenting. This is also one of the roles of scientific theories, to constrain the space of possibilities when developing an engineering project.

Conclusion

Here's the takeaway:

  • Even the most well-informed decisions we take are not logical conclusions of facts - they are most often just plausible theories. At minimum, they must not be false, but if possible, we must use the information we have at hand to make the best decision.

  • Abduction and induction are simple mechanisms yet omnipresent. Being able to identify whether we are using one of them makes it easier to know how to find the weak points and to better justify our arguments. Abductions and inductions always require arguments.

  • Being able to identify where plausible reasoning was used in some decision process can show where possible errors can happen, or where new data can show problems that were not foreseen before.

In another note, what about writing tests? This is also an inductive activity like writing types, they only differ in breadth and power—while types can be imposed not only to developers, but also to users of a library, tests can impose constraints that are unavailable to a type system.

Globally, the message is that programming involves activities of different natures. Sometimes you are just plumbing functions together and you have to think like a mathematician. Sometimes you are writing functions and have to work under constraints and think like an engineer. And sometimes you have to establish frameworks and give appropriate constraints to yourself and your coworkers, work with the evidence you have and think like a scientist. And no part in this game is more or less important than the others. They are different, yet complementary.


  1. Benjamin S. Peirce, Philosophical writings of Peirce, 1955 ↩︎

  2. Mathematical induction, the technique for proving theorems about integer numbers, for example, is not an example of induction as written here. They share the same name, but they are not related. ↩︎



from Hacker News https://ift.tt/3aXKfTL

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.