From now on, I commit to writing everything new I produce as literate code.
The Open Source Revolution
For about a decade, we have been experiencing a kind of Open Source revolution with the rise of Git (and all its free online hosting services such as GitHub, Bitbucket and GitLab). At the same time, tech juggernauts such as Google, Microsoft, Facebook, Baidu, Twitter and probably all the others are increasing their commitment to releasing several of their internal projects as open source software. There is also a myriad of young and vibrant communities forming around new programming languages such as Clojure, Scala, R and many others. More and more code is available for people to look at and for developers to use.
My company Structured Dynamics and I have been participants in that movement for more than fifteen years, producing open source software, ontologies and datasets, and participating in other open source projects by fixing bugs and adding functionality.
Human Experience of Coding
However, in a [Brave] New World driven by technologies and social networks that encourage their users to write quickly, succinctly and often in a hurry, where the latest is more valued than the best, it is normal to find code that reflects that reality.
I am often confronted with the human experience of digging into a new computer project created by others, or even an older project of my own. The code is hard to read, poorly commented or completely uncommented, the project itself is undocumented, and the variables, functions, classes, source files, packages, etc. have dubious names. Debugging, improving and upgrading such a project is nearly impossible without dedicating a substantial amount of time just to understanding what the code is supposed to be doing. Everything is created for machines, not humans. The problem is that it is still humans who have to create and maintain these things, so we had better make sure that humans can read and understand such a computer project.
In a world where open source computer projects are becoming the norm, we had better make sure that we create projects that are as easy as possible to maintain and extend by more than a few people (note that the same applies to proprietary projects).
I am not a particularly good writer. The last thing I wanted to do at school was certainly writing. The only writing experience I have is writing this blog for the last 12 years in English, which is not my native language. However, as a developer I always thought that it was important to have clean and well-documented code. It was important because I wanted to be able to re-read that code a few months after I wrote it and still know what it was doing, but also because I wanted to make sure that the code I was publishing was as easy as possible for other developers to read and reuse.
It was after starting to work with my partner Mike, who is a scientist and a writer but not a software developer, 8 years ago that I began paying particular attention to the code I was writing, to commenting it properly and to documenting the whole development process. I had to make sure that a non-coder could review the work I was doing from a higher level, to understand the workflows and the processing. I tried my best over the years to commit to that.
I was happy with this commitment until now. I am now starting to feel that it was an undercommitment and that I could do much better.
Donald Knuth wrote his Literate Programming paper in 1984, 3 years after I was born, but he started to work on the idea as early as 1978 and first released WEB in 1981. I first read about literate programming about a decade ago. It had an immediate impact on how I thought about my code and on how I commented and documented it. However, I never really wrote literate code.
I commented my code, I wrote external documentation on wikis and other media, I wrote API documentation with Doxygen and tried to generate some documentation pages using some of its features, but the process was always siloed.
I started to question my early commitment when I started to write all our applications in Clojure. Clojure led me to read a lot of code in Clojure, Java and Scala. I often had to reuse poorly documented and poorly commented code. I had to find and fix bugs, and I had to spend too much time in a debugger for my taste. The problem was that I couldn’t easily read the code, and I had nothing written to help me understand the general data structures, workflows and processing of the applications.
At the same time, Mike always looks at my code to try to understand the data processing workflows that I write in Clojure to process the data the way he wants. He often tells me “you really write quite beautiful code”, but I am not convinced that he is right. Maybe I write “beautiful code”, but I am not sure that the code I write is all that “readable”, or that it is always well commented and well documented.
This is why I am once again committing a substantial amount of time to exploring the process of writing literate code.
Do Everything at the Same Time
The problem with writing readable code that is well commented, well documented and well tested is that ideally we would have to focus on all these aspects at the same time, but given the development environments used by most people, that is not possible. You will plan an aspect of your program and write the code. Then, if you are really lucky, you will find (or take) the time to write some documentation and create some unit tests. The problem is that each of these tasks is siloed: they are performed in isolation, in 4 different states of mind, at 4 different times and hopefully within 4 weeks. The worst happens when you start fixing bugs or improving the code: comments, documentation and unit tests will often remain unchanged.
This is what Literate Programming is for me: a way to perform all these tasks at once, with the same state of mind, at the same time. The difficulty is putting in place a process, a new way to work, that enables you to do all of this at once.
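As a rough illustration of the idea only, here is a minimal sketch that assumes Python and its standard doctest module rather than any particular literate programming tool. The function name and the sample labels are hypothetical. The point is simply that the explanation, the code and its tests sit in one place and are written in one sitting:

```python
import doctest


def normalize_labels(labels):
    """Clean a list of raw labels before they enter a data workflow.

    Why: raw labels often differ only by case or surrounding spaces,
    which would create duplicate entries downstream.

    How: strip whitespace, lowercase, drop empty strings, and keep the
    first occurrence of each label so the original order is preserved.

    The examples below are documentation and tests at the same time:

    >>> normalize_labels(["  Clojure", "clojure ", "Scala"])
    ['clojure', 'scala']
    >>> normalize_labels(["", "   "])
    []
    """
    seen = set()
    cleaned = []
    for label in labels:
        key = label.strip().lower()
        if key and key not in seen:
            seen.add(key)
            cleaned.append(key)
    return cleaned


if __name__ == "__main__":
    # Run every example embedded in the docstrings of this module.
    print(doctest.testmod())
```

If the code changes and the examples in the docstring are not updated, running the file reports the mismatch. This is far from a full literate program, but it is one small way to keep comments, documentation and tests from going stale independently of the code.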
In the past I could never commit to Literate Programming for that reason: I couldn’t find a way to put such a process in place. However, this recently changed. About a year and a half ago I started to work with Emacs for programming in Clojure. Then I was introduced to Org-mode a few months ago. Since then, I have been creating a new development process that will enable me to finally write my software in a literate way.
The learning curve is steep and the time to invest is significant, but the reward is big and satisfying. There is nothing free in this world, even if many try to convince you otherwise. This is why I still marvel at coding: there is always a way to learn new things and to improve the quality of your work. I don’t think the process and the experience are any different from what professional writers experience.
I have the feeling that it will become more and more important to write readable code. Much of the code we are writing these days is code that manipulates and transforms data, code that implements [machine learning] workflows and such. This is the kind of code that would benefit from being readable by many people other than the ones who write it.
This is why I am now making the commitment to develop all my software as literate code. I have yet to find my style, and things will evolve over the next few months and years, but this is the commitment I am making to become a better developer, a better writer, a better communicator and a better contributor [of open source software]. I can’t force people to do what I think is best, but I can force myself, in the hope of influencing others to do the same.
In the coming weeks and months I will write a series of blog posts about literate programming, and more particularly about my own process for practicing it. I will write about the development environment I am using, the way I am using it and how I customized it to work the way I need.