Literate programming is a software development paradigm that structures code and its development process like a conventional book. What you read is what you get. Donald E. Knuth introduced the paradigm in the 1980s while he was working on TeX. He considered that, of all his work on typography, the idea of literate programming had the greatest impact on him. That is a strong statement, and one that history seems to have underestimated.
In a literate programming framework, the developer writes plain text files (think of Markdown with code boxes). They write about the problems they are trying to solve, explain why they are approaching their problems in a particular way, and then write the computer code to implement a solution to the problem.
Once the “literate files” are finalized, they are tangled to organize all the code into a packaged and executable form. They are also woven to generate documentation for the software application (the book).
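To make the two steps concrete, here is a minimal sketch in Python. It assumes the literate file is Markdown-like text with fenced code blocks; the names `tangle`, `weave`, and `SOURCE` are hypothetical and do not come from any particular tool. Real literate programming tools do more than this (for instance, named code chunks that can be reordered), so treat it as an illustration of the idea rather than a reference implementation.

```python
import re

FENCE = "`" * 3  # Markdown code fence marker, built here to keep this listing simple

# A fenced block: opening fence with an optional language tag, a body, a closing fence.
BLOCK = re.compile(rf"^{FENCE}(\w*)\n(.*?)^{FENCE}[ \t]*$", re.MULTILINE | re.DOTALL)


def tangle(literate_text: str, lang: str = "python") -> str:
    """Collect the code blocks tagged with `lang`, in document order."""
    chunks = [body for tag, body in BLOCK.findall(literate_text) if tag == lang]
    return "\n".join(chunks)


def weave(literate_text: str) -> str:
    """Build reader-facing text: prose is kept, code becomes indented listings."""
    def as_listing(match: re.Match) -> str:
        body = match.group(2)
        return "".join("    " + line for line in body.splitlines(keepends=True))
    return BLOCK.sub(as_listing, literate_text)


# A tiny, made-up literate file: the prose explains the why, the code gives the how.
SOURCE = "\n".join([
    "We want a greeting, kept in one function so wording changes stay local.",
    "",
    FENCE + "python",
    "def greet(name):",
    "    return f'Hello, {name}!'",
    FENCE,
    "",
    "Calling it once proves the wiring works.",
    "",
    FENCE + "python",
    "message = greet('world')",
    FENCE,
    "",
])

if __name__ == "__main__":
    print(tangle(SOURCE))  # code only, ready to be written to a .py file
    print(weave(SOURCE))   # prose plus listings, ready to be rendered as documentation
```

Both outputs come from the same source file, which is the point: the explanation and the executable code cannot drift apart without someone noticing.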
Why?
I can guarantee that this software development paradigm looks silly to almost everyone who is introduced to it. Many times, developers have told me that they were wasting their time, that their productivity was much lower than before, and that it was needlessly complex with no real benefits.
In a way, they were correct, but they were stuck in a local development optimum: they wanted to write code quickly to see themselves as productive developers. However, my goal was for them to write higher quality code and more thoughtful architecture to ultimately increase the project’s future velocity and overall reliability.
Literate programming allowed me to introduce additional friction by forcing developers to slow down their coding pace and spend more time thinking about the actual problem. At the same time, it improved the quality of code reviews and tests, the quality and coverage of monitoring, and the overall quality (and freshness) of documentation, among other benefits.
Code is the How: Literate is the Why
Code shows me how certain things have been implemented by one or multiple developers. If I’m lucky, I’ll have access to a well-documented change history for a piece of code, in the form of git commits.
However, I’m almost certain that none of this will tell me anything about why things were implemented in that particular way. To me, this is the most crucial part of the development process. Why would I invest time in developing anything if I can’t articulate and communicate why I need to develop it in a specific manner?
As a code reviewer, I want to understand why something has been implemented (the why) in a certain way (the how). If the code and its logic differ from the description of the why, that raises questions during code review that should lead to improvements in the code, correction of logic errors, enhancement of documentation, etc. The discrepancy between the two needs to be resolved one way or another.
Next
I’ve been fortunate enough to professionally use the literate programming paradigm since 2016 to develop multiple knowledge graph and machine learning projects, both independently and with skilled engineering teams.
In the coming months, I will take the time to write a series of blog posts discussing different aspects of this programming paradigm and sharing what I’ve learned from it: important insights, unexpected benefits, its tooling, and more.