Experience Report: Doing a PhD in Theoretical Computer Science

7th Mar 2021
51 min read
Tags:
personal,
essay,
academia

Abstract

In this essay I will describe what life was like for me in the heights of the ivory tower, how I got there, and what it is like to do theory. I will also share some insights I gained and explain why I personally decided to leave academia (or at least the researcher role) behind, even though I initially thought that I would stay. To clear up any confusion - science is not perfect and has its share of problems, but this is not another story uncovering its warts and flaws. On the contrary, this is a friendly “it’s not you, it’s me” break-up letter from my three-year-long relationship with academic research.

Everyone has their own unique personal story and sometimes you want to hear subjective impressions and experiences, but I do not see too many written down. I would have been interested in reading such reports from people with a similar background before I started, so I provide one myself now and hope this helps someone to decide whether he or she wants to embark on this trip. This is a story of finding your way through the possibilities before you, expectations clashing with reality, and honest self-reflection.

I won’t talk about the actual topics I was working on, but rather focus on my personal path, feelings and insights about the whole process, how it all evolved and what I learned. This is a rather long essay, but I think it is appropriate as a summary of a three-year-long journey.

The road from ignorance into the ivory tower

From hobby coder to theoretician

I knew for a long time that I was going to be a “software developer”, as I picked up programming as a hobby early in my childhood. So it was clear to me that I was going to study computer science after graduating from high school. I was always quite good at math, but I would not call myself “brilliant” at it, and initially it was never my main interest. Originally, I went to university for the “deeper” computer science topics, motivated to learn the theory behind compiler construction, machine learning and other quite applied topics that I could relate to in some way.

Like everyone else, I had my problems in the first math courses when the homework was to prove some very simple properties. How detailed is a proof supposed to be? How does one even go about it? But after getting the hang of it, it was even kind of fun. The main insight at that point was that a proof has atomic structures that build up to more complex ones, just like a program is built up from the atomic control flow supported by the language. But instead of if-then-else, for- and while-loops you have general proof techniques like induction, proof by contradiction, case splitting, and some more topic-specific mechanisms and patterns that you can learn and use (this was still before I learned about the Curry-Howard isomorphism and proof assistants, which use this similarity in a quite literal way).
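
To make the analogy a bit more tangible, here is a minimal sketch in Lean 4, a proof assistant chosen purely for illustration. The names `N`, `plus`, `or_swap` and `zero_plus` are made up for this example. Case splitting shows up as pattern matching and induction as recursion, which is roughly the Curry-Howard view in miniature.

```lean
namespace Sketch

-- Case splitting is pattern matching: from a proof of "A or B"
-- we build a proof of "B or A" by handling each case separately.
theorem or_swap {A B : Prop} : A ∨ B → B ∨ A
  | Or.inl a => Or.inr a
  | Or.inr b => Or.inl b

-- A tiny copy of the natural numbers, so nothing depends on library lemmas.
inductive N where
  | zero : N
  | succ : N → N

-- Addition, defined by recursion on the second argument.
def plus : N → N → N
  | m, N.zero => m
  | m, N.succ n => N.succ (plus m n)

-- Induction is recursion: the proof for `succ n` calls the proof for `n`.
theorem zero_plus : ∀ n : N, plus N.zero n = n
  | N.zero => rfl
  | N.succ n => congrArg N.succ (zero_plus n)

end Sketch
```

The base case and the step case of the induction are literally the two branches of a recursive function, and the induction hypothesis is the recursive call.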

Then came the courses in theoretical computer science. This is the moment where I thought, “Yes, now that’s the stuff I came here for!” At this time I noticed that I quite enjoy learning about theory, especially concerning logic and discrete maths, and started falling in love with it. Still, I felt more like a slightly more illuminated programmer than anything else. I was often the guy explaining complicated topics to others, so I was often told ridiculous things like “you would be a great professor”, which gave me the first thoughts about staying in university for more than just the master’s degree.

Quite by chance it came to be that I wrote my bachelor’s thesis about a purely theoretical topic. This is where I learned for the first time how this “proof” business really works. It clicked for me that in the larger picture, using results that others proved is like using a library. Lemmas are like helper functions for easier reading and modularity, etc. But unlike programming languages, mathematical notation has no limits: you define your own syntax, hopefully with a good sense for conventions. It was also my first exposure to scientific papers.
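
The library picture can be sketched in Lean 4 as well. This is only an illustration: `swap_inner` and `reorder` are hypothetical names invented here, while `Nat.add_comm` (commutativity of addition, n + m = m + n) is an existing result from Lean's core library that plays the role of the result someone else proved.

```lean
namespace LemmaSketch

-- Helper lemma: swap the two inner summands, using the library fact.
theorem swap_inner (a b c : Nat) : a + (b + c) = a + (c + b) :=
  congrArg (fun x => a + x) (Nat.add_comm b c)

-- Main result: chain the helper with one more use of the library fact.
theorem reorder (a b c : Nat) : a + (b + c) = (c + b) + a :=
  (swap_inner a b c).trans (Nat.add_comm a (c + b))

end LemmaSketch
```

The main theorem stays short because the helper lemma hides one step, much like a well-named helper function does in a program.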

During my master’s I seriously began to ponder doing a PhD in some interesting field close to or in theory. Even though I had good grades, I had doubts: “Am I really smart enough for this?” I always regarded myself as a quite good programmer who is able to understand and apply even complicated theory that I learn, but I think no one gives out PhDs just for that. Will I be able to come up with enough smart ideas myself?

Reading a lot on the internet about burnt-out doctoral students did not really help, but I quickly understood that the conditions of PhD students are wildly different and reports of students in e.g. the US really say nothing about the situation in Germany. Also, I was encouraged by the advisor of my bachelor’s and master’s thesis (an almost finished PhD student himself) to go for it, as he judged me to be more than capable enough for this. After hearing about his experiences and getting some valuable advice I decided that I had to at least try. In a few years I would rather tell the story that I tried and failed than the story that I probably could have, but didn’t try. And after leaving academia, most people never come back.

Additionally, in my idealistic view, doing research and trying to understand things not out of necessity, but rather out of curiosity and just for its own sake, is at the same time one of the most absurd and most human activities one can partake in. I just could not live with the thought of missing out on something like that. So in my final master’s semester I started to look for positions and send out applications.

PhD funding in every flavour

For doing a PhD in Germany you have multiple options. I will briefly sketch the main possibilities, as I understand them. In principle, you can write a dissertation while working at a company, but this seems to be rather rare, at least in computer science, and I cannot say anything about this.

For the possibilities where you are actually staying in university, multiple sources of funding exist. The worst option is being a doctoral stipend holder. As I understand it, you will work as much as all the others do, but get a laughable stipend (compared to what you usually earn with a master’s in CS) and no social security benefits etc., because you still count as a student and not as an employee of the university. This was an option I did not even really consider, following one piece of advice that I got - know what you are worth.

In the remaining options you are usually employed by the university for three years, with different possible sources of funding (after three years, you will have to get some other funding). Either your position is state-funded, funded from a research grant, sponsored by some industry partner or you get a position in either a graduate school or research training group (RTG), where the latter two differ in that a graduate school is a permanent or long-running institution, whereas an RTG is one-time funding for a number of positions. In many scientific disciplines, you usually will only get a 1/2 or 2/3 position if you are lucky, while in computer science full positions are the default, as there are several other attractive employment opportunities for computer scientists outside of academia.

The contract of a state-funded position officially assigns a certain percentage of your time to organizing and assisting the teaching done at that chair, but it is also the most unrestricted option concerning the topics you can investigate. On the other end of the spectrum, industry-funded positions are often bound to some narrow specific area of interest which the company hopes to eventually turn into a product (and the company might offer you a job afterwards, if you are successful), but I think this option does not really exist in purely theoretical disciplines.

Funding from a research grant is also given expecting results in the specific topic it was granted for, but this is possible for any kind of discipline. An RTG or graduate school is also focused on some (rather broad) field, but additionally is a structured program, which means that you are expected to attend some courses and to finish your dissertation ideally within the three years of funding, but you are (officially) freed from teaching obligations. Details may vary depending on your field and institution. In Germany, research grants, RTGs and graduate schools are usually funded by the German Research Foundation (DFG), which is effectively also funded by the state, but in a more indirect way, because the DFG, which is essentially the self-governing organization of research in Germany, decides about funding autonomously.

In practice, the differences in funding probably will not influence your day-to-day experience very much and each chair has people with various funding sources. Usually, tasks related to teaching are split up so that every PhD student will participate in some way. Also, for almost every funding possibility, the project outline is formulated sufficiently broadly to leave you enough freedom to find a good dissertation topic.

Finding a good position - a constraint satisfaction problem

It should probably be said that my path to a PhD position was rather exotic (so I was told), as the usual path goes like this – you write your master’s thesis, at some point one of your professors asks you whether you would like to stay for a PhD (or you ask them), then the professor finds a way (one of those sketched above) to fund you and then you often will just continue with your topic, branching out in all possible directions, until you have produced enough novel research for a dissertation.

While that might be the usual modus operandi, I am not sure that it always leads to the best results. I don’t like surprises; I prefer to choose consciously what exactly I am getting into and to pick the most appropriate option for me. Also, I was not fully invested in some narrow and specific topic, hence I was flexible in this regard. So I spent quite some time understanding the advantages and disadvantages of the myriad of possibilities and then started looking, trying to optimize for multiple variables.

My main constraints were: staying in northern Germany, if possible without needing to move, of course a full position, in a good research group, with a topic I am or can become genuinely interested in and a benevolent advisor I can talk to regularly if I want or need to. You often cannot control or even know all of those variables, because of course there is luck involved in whether there are appropriate positions available. In the end I dropped the no-relocation constraint and extended my acceptable radius, as there were not that many interesting positions that checked enough of the boxes for me. I applied for one state-funded position and two RTGs. I got a rejection for the state-funded position, as my master’s would not be finished soon enough for them, but I got accepted by both RTGs, from which I chose the one that looked more attractive to me.

The applications you send out usually describe your academic path, interests and your motivation to join the specific group. This often must be supported by a letter of recommendation from one or two professors and, of course, by good grades. In the case of RTGs, if you pass the first round, you get invited to an interview. Before the actual interview you probably will give a 20-minute talk about your previous research, e.g. what you did in your master’s thesis. Some days after this you get notified whether you are accepted or not.

I have often read the advice that the main aspect in deciding where to apply is that your future advisor should be both an experienced, active researcher and also actually a good advisor. If the advisor is not doing his job well, you could be in for a bad time, as this person is the biggest single factor determining your success, which I believe is true. But this information is hard to come by and you either obtain it directly in private by talking to people at your university (if you want to stay there), or indirectly by checking the publications of the advisors and how many people actually left that chair and advisor with a completed PhD. I had the privilege of getting some useful direct hints and I am glad that everything turned out really well for me.

Inside the holy halls – the ascension

How did I get here?!

So there I was, a fresh member of the RTG, in a nice two-person office with a gigantic whiteboard, two screens and a couch, friendly coworkers and an endless supply of coffee, and was essentially paid for reading a lot and then hopefully coming up with some good ideas of my own.

My first thought was: This is the best job ever, I am living the dream! My second thought was: It is surreal – now I am getting paid for finding solutions to problems and producing ideas concerning concepts that literally 99.999% of people cannot even not care about, as they don’t even know about them. Even more – I almost felt guilty for having a bigger salary than all those people working in underpaid, but genuinely important jobs. And for what? For solving riddles that I find more or less interesting (just to be clear – I would not pay scientists less, but rather pay people in underpaid but socially valuable jobs more, but wage economics is not the topic here).

The combination of those two thoughts led to the following resolution dictated by my guilt of privilege – I better be good at the riddle-solving. On the one hand, the organizers of the RTG expect this from me. If I fail, I risk the funding for the next RTG (as certain promises were made to the funding agency). On the other hand, I owe it to all the people who are not as lucky as I am. So I decided: what I lack in smarts I am going to compensate for with sheer motivation, work ethic and perseverance, and in the end I am going to rock my dissertation! Sometimes being able to pressure yourself into action can be quite useful.

Learning to swim and finding a rhythm

In the first two weeks, besides trying to find my way through the building, learning the names of coworkers and finding out how things work, I was mainly trying to digest about 2-3 sets of lecture notes about topics that I needed to (or should) know working there, but had just a rudimentary background in. It felt like a learning marathon before an exam, with the difference that I was actually seeing the content for the first time, but trying to learn it as if I had already seen it before.

As this obviously cannot work, I eventually decided to give up on rigor and go for a good-enough approach – just getting a rough picture of all important results and relationships and not wasting time understanding all proofs in detail. The other option would have been to spend a few months going through all of this rigorously, but with the pressure I created for myself this did not feel acceptable. Of course, this phase would not have been such a jump into cold water if I had stayed at my original university like a sane person (which I could have, but it was an inferior option), and therefore would already have had the perfect background. But this small initial disadvantage was no big deal anyway, as learning new things all the time is part of the job.

I never had real problems with procrastination or organizing myself and in the first year I was one of the first people at the office in the morning and one of the last ones leaving in the afternoon. Anything less, and I would feel that I was lazy and not working enough. As time went on and motivation waned a bit, I got more and more sloppy with this, because 9-to-5 is against my biology. But at the beginning, my willpower was strong, I wanted to get results as quickly as possible and the only thing you can do is never stop looking. That’s why it’s called research, obviously. In the best case, your advisor sends you in a good direction by throwing you a bone in the form of a rough idea. In the worst case, you are on your own. In any case there is much work to do to sculpt that starter bone into something worth writing down and publishing, so this is swimming – alone, in deep, open waters.

After the first serious talk with my advisor, I tried to parse the papers he handed me, each corresponding to one of about three rough directions I could go in with my research. I was supposed to make up my mind about what I’d rather do. After reading one paper and a bit of thinking, I decided against one of the topics, where (a recurring motif!) I was not sure what I would be getting into, and I am not brave enough to jump right into unfamiliar territory. The other two topics were not mutually exclusive and closer to topics that I had a background in, and I chose the more familiar one. I am glad that I had these possibilities laid out before me, because I’ve heard that many PhD students spend the first year just finding out what they are going to do as their main project (luckily for them, they are usually not expected to graduate in three years, but any funding will end eventually).

I think that joining an RTG, where your research already has some roughly outlined direction, was the perfect fit for me. I feel that I would not have been able to deal with the uncertainty of complete freedom. Learning to swim is hard enough, so it’s nice to have a beacon.

Back to primary school – reading and writing

Part of learning to swim and not drown in academia is to learn to judge what is worth your time. There will be piles of papers, old and new, that are in some way related or close to your field of research. Which ones should you read in detail? Which ones should just be skimmed for the main results? Do you need to understand this section? When you spend half your day reading papers (like you probably will, in the beginning), you better do it efficiently.

When learning as a student, I usually did not skip anything. I only felt prepared for an exam if I understood and was ready for every topic that was in the course. At least, as ready as I could be. Well, in the “real world”, there are (obviously) no packed-up, prechewed bites of information that you can try to digest fully. There is this vast sea of knowledge that you have to learn to navigate through. And you have to decide how long to stay at which place and keep the destination in mind.

The knowledge in papers is usually densely compressed. There are papers where you stare at one page for hours and just don’t get what the author is doing and why (and if you are unlucky, there are mistakes in the paper slowing you down and it is badly written). It takes time to figure out how much hard staring at the same paragraph or sentence is appropriate before giving up and just pushing on, in the hope that you can understand enough without fully understanding a definition or result (sometimes you can, but often not), or whether it is better to return to the paper later, after reading up on other things it is building on, or at least with a refreshed mind. So sometimes, it’s like working your way through a brick wall, with bare hands.

Generally, I tend to skim text a lot. I think that I am good at getting an intuitive, vague idea of almost anything quickly and, usually, skimming a text is sufficient for this. The downside is that it is sometimes hard for me to slow down for extended periods of time to read attentively, word by word. Which you must do, if you are trying to understand a math-heavy paper deeply. Any sentence you skip might contain a definition or equation that is crucial a few lines (in the best case) or many pages later. So if I catch myself skimming a section that I was actually trying to understand, it just means that I’m exhausted and need a break and/or a coffee. I guess all these woes are what helps build up the mythical skill called “mathematical maturity”, which is allegedly required to understand some topics.

When reading through the zoo of symbols, it is often helpful to create concrete and visual examples for yourself wherever they are missing in the paper. If the paper has some nice examples already included, consider yourself lucky. In my opinion, examples and visual representations are crucial for understanding. You can read as many definitions as you want, but it is hard to understand the full consequences and get an intuition without seeing some instances and non-instances. The reading definitely will become easier over time, after being exposed to different notations and knowing the “writing tropes” of your field. Eventually you will just read the beginning, the end, the main results, and then only switch to attentive linear reading when there is a need for it. Reading papers on unfamiliar topics will probably always remain more difficult, though.

After much reading, in the end you also have to do some writing, which consists of well-formulated theorems with sufficiently (but again, not over-the-top) rigorous proofs. And when you have enough of those, they must be crammed into some paper, at some point. Usually students have gained experience from writing their two theses, but writing papers is a different story. Eventually, you have to master the quite specific skill of producing a paper with a limit of 12-15 pages from a draft of 30+ pages without losing too much substance. To publish the results from my master’s thesis at a conference, I had compressed my master’s thesis down from over 70 to about 25 pages, which my thesis advisor then miraculously shrank down further to 15, including references. As I was just starting out at that time, this seemed like an impressive achievement, but later I learned how to do this myself.

If coming up with good ideas is the art of science, then reading, writing and presenting are the craft of science and hopefully your advisor helps you to become a good craftsman. Well, coming up with good ideas that are worth looking into is also a craft, but this craft is only practiced by the experienced researchers and is much more difficult to master.

Getting used to being on the other side

At first, it was a weird feeling to sit with a coffee in my hand in a seminar talk held by a student. As a student, it always felt as if those PhD guys already knew everything I was going to present better than I did. I quickly learned that this impression of mine was quite wrong. Suddenly, I understood that the people sitting there are just normal people too. After a particularly bad talk, I asked my coworker ironically whether she understood as much of the proofs as I did. She just said: “I think that almost nobody understood any of it, it was horrible”. It might be mundane, but at the same time this reply was enlightening and liberating.

I am not a self-confident person and tend to always feel as if the people I work with are much better than me. In a healthy dose, this is a good thing and motivates me to work harder. A scientist needs some humility and there is a saying that goes like – if you are the smartest person in a room, you are in the wrong room. In an unhealthy dose, this just ignites impostor syndrome – feeling that I was just lucky and actually didn’t belong there. So sometimes, it is nice to be reminded that the others are, and probably feel, not that different.

The first year - new experiences and first victories

The first half of my three years doing my PhD was exciting and felt like being on a winning streak. From the beginning I had some ideas from my advisor about what to work on and had my first not too hard but still neat results right after 2-3 months, which was enough material for a first paper. While it seemed that this would be slightly off-topic from what my main focus was supposed to be, it turned out to be important later on.

In the beginning everything was fresh and new. I attended two summer schools (one in Warsaw and one in Paris) on topics close to my research area, a conference (in Tokyo!) to present the results of my first paper and one workshop in Berlin. When you are just starting out and full of motivation, almost every topic is new and interesting, as there is a lot going on in research that is not taught in a regular university course, which can lag a few years or decades behind the current frontier of knowledge, depending on the topic and whether the dust has already settled. Travelling internationally to learn new things, present your research and meet other PhD students and also some big names of your field is without doubt an exciting and fun experience.

But the day-to-day “routine” consists of reading a lot, thinking more, and scribbling much nonsense, hopefully with something useful in between, on the whiteboard. A few times a week there were some compulsory and some optional lectures and talks. I gave a few presentations of my own and I took part in a reading group about various topics (because we could not decide on one field we were all equally interested in, we decided to dip our toes in many things, one after the other). But other than that, the day consisted of (often fruitless) thinking, reading, writing and coding. Even though I still often felt like the dumbest guy on our floor, I just tried not to think too much about it. I told myself that I was academically brought up at a different university and probably know other things than my colleagues, most of whom spent all their academic life there.

I did appreciate my privilege of having to do almost no teaching assistance, because as I have learned, not everybody from our RTG was really freed from these tasks (even if we all were, according to the contract). I do enjoy teaching and had many student jobs doing it, but basically organizing the whole lecture, coming up with exercises and exams and handling all the bureaucracy connected with it is a whole different story. I can’t really imagine completing my thesis in 3-3.5 years if I had had to invest a third or half of my time in such activities. Luckily for me, I was mainly just advising half a dozen students who were writing their bachelor’s or master’s theses on topics somewhat related to my research area. Over a span of three years this is really not that much of a burden, especially when having the pleasure of dealing with somewhat capable students.

About a year in, I was still very motivated, but maybe a little bit less “hyped” – I no longer started working before 9am as I had done for a while, and I had far fewer sleepless nights, as I somehow finally managed to learn to take breaks from thinking about work (another important skill to learn). This was much easier to do when doing implementation work and wearing my “software developer hat”, as writing code is not the kind of thing that I can’t stop thinking about and that keeps me awake like some unsolved theoretical riddle.

I liked the fact that I had both theoretical and also implementation work to do, because, occasionally, I hated one or the other, depending on where I was currently stuck. Some days I thought that I’d like to go full theoretician, live in the perfect platonic world of mathematical abstraction and never write code again, as code is always messy, unforgiving and full of bugs. On other days I thought that I am not made of the right material and lack the talent to do serious theory and should just accept that eventually I will have to join the ranks of software developers (which ultimately is what happened).

The paper-writing business - writing for human beings

I learned much in the process of writing my first paper. My original approach was completely backwards – I always had the 12-page restriction of the conference in the back of my mind. So I tried to compress as much information as possible into those 12 pages. My thinking was more or less: “Papers are dense and hard to read, this is normal. So I just shrink all the technical details and fit them into the required format. If I understand this stuff, then those much smarter guys will understand it too”. But I underestimated the fact that those smarter guys are not working on the stuff I do and hence are not so intimately familiar with it, because they did not lose nights of sleep over it. Fortunately, I had a patient advisor who pointed out all the problems with my first draft. I quickly learned that many papers are only hard to read because they are written badly.

The main problem was that the storytelling in my draft was really lacking and I had the wrong idea of what belongs in the 12 pages of a conference paper. My first draft was too technical and more or less unreadable; the final version had a much longer introduction and was only as formal as necessary to precisely state the results and briefly sketch the proofs. Previously I didn’t know that my proofs did not necessarily need to go into the proceedings (the book collecting the papers presented at a conference) and could be provided in the appendix only for the reviewers to judge the correctness of the results. I mentioned in the beginning how proofs are similar to programs, and my mistake was taking this metaphor too literally and forgetting that proofs in papers are read by human beings. Writing for humans means that sometimes you can leave out tedious but easy details, but you have to explain the difficult part especially well.

It turns out that you can’t have interesting content, be understandable and also provide full proofs in 12 pages, and good style is to prefer “nice to read” over “complete”, because the purpose of a conference paper is to communicate your results and the main ideas on how to get them as clearly as possible. Of course, the complete version with all the details is to be provided in an extended journal version (which is rarely done) or by some other means.

Thinking about it now, I would say that the secret to good writing is empathetic writing. You need to write for someone like you, but before you learned what you know now. You should anticipate the questions the reader might have and preemptively discuss and answer them. You must illustrate complex ideas and constructions with enough clarifying examples, remind the reader of earlier sections that could already be forgotten and guide them through your document like a good teacher guides a student.

So the best approach to writing a readable paper is writing as if it were a thesis – elaborating all proofs in detail, writing a lot of prose, giving enough intuitive explanations, adding many examples and only then shrinking it down by removing as many details as necessary to fit into the constraints set up by the conference. The two weeks I was working on my first paper were quite stressful and I was really happy that the resulting version did convince the reviewers.

Hubris and pipe dreams - learning rigorous thinking

At some point in my first year, my advisor gave me an interesting connection to look into which later became the main topic of my dissertation – two superficially quite different solutions to the same problem that can suddenly look quite similar, when you look hard enough and in a certain way. It was a connection that no one had spotted yet and that only someone deeply experienced in the field (like my advisor) would notice. But once you saw it, it was screaming at you. So he said that it would be nice if I formalized and fleshed out the connection, and of course even better if I could find a more general formulation where the two known approaches to the problem come out as special cases, and he gave me some ideas about what might work. Such a tasty treat! Instantly I was hooked, and quickly started running with full force from one wall into the other.

Of course, I thought, just fleshing out the connection is boring. I ambitiously hoped to find a way to carve out the grand unifying idea behind the two approaches, in its most general and purest form. The one ultimate construction for this problem that all other approaches can be reduced to, and no less! So right after superficially understanding both constructions and just enough of the proofs, I jumped into my usual mode of thought – question everything.

Maybe the previous researchers had a solution that is too complicated, maybe I can see it more clearly than they did, if I think long enough? At first I thought it would be nice to develop this most general solution to the problem cleanly from scratch, equipped with the knowledge about the two approaches. I started with my understanding of what needs to be taken care of and tried to formulate it with as few assumptions and mechanisms as possible – all just to arrive right back at an even more contrived description of one of the approaches, with different nomenclature for various elements and without any idea how to prove my description correct. After reinventing a weird version of the existing approaches myself, I was stuck and had to go back to the start.

Ok then, if I cannot work from zero to the awesome most general version, then I will just work my way back. So I tried to deconstruct every aspect of the constructions that looked incidental and not essential to me. One day, I thought that a complicated aspect could be more or less left out and the process would still work. Excited, I talked to my advisor, only for him to quickly point out the case where my simplified idea did not work. If I had looked more thoroughly, I would have noticed it myself.

There was a second similar moment where I thought that I had found a simpler solution. I did not want to annoy my advisor with obviously wrong ideas again, I thought. So I tried to prove my idea first. I was very close to succeeding, or so I believed. But one, just one case was missing in the proof. It was a crucial case. While trying to argue that my approach works I was forced to think it through so thoroughly that I noticed where it goes wrong. My feelings about it were a mix of disappointment about yet another failure and pride that this time I noticed it myself and did not embarrass myself in front of my advisor.

So a few weeks of thinking about this problem were like a carousel of thinking that I am close to or even have discovered some beautiful, simple truth underneath it all, and then the obligatory painful fall back to earth. It’s all too easy to give in to wild speculations and quite hard to critically assess your ideas, especially when the problems usually hide in aspects you are not aware of.

My greatest mistake at this point was that I started to over-abstract the problem and define stuff without having a proper understanding of what is going on, without being sure that I really see the whole picture. I wanted too much, too quickly. Building the right intuitions is crucial. Each time I thought that I really got it, a quick conversation with my advisor opened up other flaws in my thinking. Each time I thought that I gained some deep insight, it was just what my advisor meant all along and I just finally reached him there, already standing and waiting for me on the top of what looks to me like a mountain, but really is just a small hill on the way. With my over-ambitious attitude, it was in hindsight inevitable that in the beginning I was just reinventing the wheel, but while it was not yet leading to a result, at least this way I eventually soaked up all nuances of the problem.

Using math to understand abstract concepts

Of course it was a naive way of approaching things. Mathematically, it was not clear what the “essence” I was looking for even was and I wasted some time chasing mirages. Down this path lies only madness and pseudoscience. On one hand, you need to shape your mathematical intuition to “feel” the things you are working with as a theoretician. On the other hand, it is a big mistake to mix up the ineffable intuition in your head with actual math, and I was trying to hunt for something while ignoring the most powerful tool, which is language, or in this case, mathematical formalism. In the end, we can only truly understand and explain that which we can write down, and by constraining our thoughts, formalism also gives them shape and concreteness. I just should have stayed closer to decades of previous work instead of disregarding it.

My advisor of course approached the problem from a more pragmatic and effective point of view, like “What properties do we need so that we can complete a slightly more general variant of the proof?” whereas I was asking “What is the essence of this thing? If I think that I have it, I will try to prove it using this insight.” And probably this is where I had it backward. I underestimated the importance of proofs in the understanding of a problem. Some concepts are so utterly abstract that you maybe cannot understand them in any other way than by the things you were able to prove about them. And even if you manage to dream up something that is right, it does not matter unless you can prove it, so intuitive exploration should not go on for excessive amounts of time without trying to crystallize it into a theorem that you can prove or disprove. Many ideas can be quickly thrown into the garbage by some simple counterexample.

I waited for an inner mental image to eventually emerge such that you can point at it and say “this is that thing”, but probably the human mind is not capable of providing this “visuality” and “intuitiveness” for more complicated matters, at least not mine. You have proofs as little windows of clarity in the vast complexity of some problem or field, and the kind of intuition that one gains with experience is not (always) visual and clear, but rather ineffable and wobbly. You can’t just think about your problem intuitively, waiting for clarity to appear out of thin air. No, you have to try to prove or disprove things, and by doing this you gain some more understanding. Maybe this is obvious to everyone working in theory, but for me this was a new and important insight.

So essentially, doing theory is balancing the playful and intuitive thinking that you need to get a creative spark with the disciplined and attentive work of sculpting and materializing this spark into a watertight proof. As every mind works differently, there is probably no one-size-fits-all way of finding a good (that is, productive) process, so I guess this is one of the many things that can only be learned by doing. Hitting walls many, many times until eventually breaking through is unavoidable, so accepting failure is also part of the process.

Each failed attempt and wrong idea improves understanding and is a piece of the puzzle. This whole phase of gaining intuition for a problem feels like spiralling around some wobbly idea for some time, but never quite landing on it, and trying to grab something slippery, until in the end you hopefully do. There is no shame in starting from existing proofs, because these are the few things that you can really hold on to. Not every result has to be groundbreakingly innovative and probably most research is an incremental improvement obtained by modification or combination of existing ideas, not unlike how evolution works.

Concerning the problem I was working on: in the end I did succeed, but the result was more conservative than what I originally aspired to and in fact quite close to what my advisor had in mind. So another obvious lesson I had to learn the hard way would be – if you do happen to have a good advisor, better trust their approach, because they will know from decades of experience what might work. Even though it is worthwhile and fun to wander around in the wild mindspace sometimes, it is really easy to get lost in the forest of ideas when you run off the path shown by your guide.

Part-time developer - doing it wrong and then a bit better

As far as I know, I was one of the most “practical” guys in our CS theory department, as I was working on a complex piece of C++ software as a part of my PhD project. I silently listened to the occasional “Everyone assumes I am good at programming just because I studied CS, but programming sucks, right guys?!” theoretician snark and thought to myself that it’s not that bad. Of course I don’t mean boring enterprise software that just glues stuff together. But I do think that programming is quite a lot of fun when you solve interesting problems, and programming in academia usually involves just the right kind of problems.

It is known that most software produced in academia consists of “eternal prototypes”, as in the end you are more or less paid to write papers and not to produce highly reliable software for production. The tool I was working on was in my opinion quite a complex beast with many moving parts (various optimizations to be freely combined with each other) and in parallel I was still trying to find the right subset of C++ to use for my personal productivity sweet-spot. This is why after a few months I decided to do a half-rewrite of what I had done. As a Haskeller at heart, I was working at a higher level of abstraction than an inexperienced C++ programmer like me can use productively. I basically tried to write Haskell in C++, using every feature of modern C++ that helped me do it. I tried to write code as general as possible. In Haskell, where you eat polymorphic functions for breakfast, this is easy. But in C++ you quickly get entangled in the darkest of template sorcery.

Yes, for some years now you have been able to write simple functional code in C++ easily. But I’ve experienced myself that the things that Haskell gives you are so much more than that, and even if it is technically possible, I am not crazy enough to try replicating and using these things in C++. While error messages by GHC (the Haskell compiler) are verbose, they are often very useful and pinpoint the problem, often even hinting at a solution. When you mess up some more involved template magic, you get 10 pages of irrelevant error messages that sometimes provide no digestible information at all. Never mind the fact that template code that is not instantiated is barely checked at all. Debugging templates is basically a nightmare. So my new approach became – as general as necessary, instead of as general as possible. And I learned the valuable lesson that one should not write language X like language Y. In the end there is more to a language than just syntax, and if you do not follow its spirit and idioms, you just make your work harder.
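To make the contrast a bit more concrete, here is a minimal illustrative sketch. It is not code from the actual tool, and the names `sum_generic`, `broken` and `sum_ints` are made up for this example. It assumes nothing beyond the C++ standard library. The point is only that a template body that depends on the template parameter is accepted by the compiler as long as nobody instantiates it, while the plain function is checked completely on the spot.

```cpp
#include <numeric>
#include <vector>

// "As general as possible": accepts any container with begin()/end() and any
// accumulator type. Mistakes in how it is used only surface at the call site,
// once concrete types are plugged in.
template <typename Container, typename T>
T sum_generic(const Container& xs, T init) {
    return std::accumulate(xs.begin(), xs.end(), init);
}

// This template compiles as long as it is never instantiated: the call to
// no_such_member() depends on T, so it is only checked once T is known.
template <typename T>
int broken(const T& x) {
    return x.no_such_member();
}

// "As general as necessary": one concrete type, fully checked right here,
// and any error message points at this function.
int sum_ints(const std::vector<int>& xs) {
    int total = 0;
    for (int x : xs) {
        total += x;
    }
    return total;
}
```

Calling `sum_ints({1, 2, 3})` or `sum_generic(std::vector<int>{1, 2, 3}, 0)` both work, whereas something like `broken(42)` would only then make the compiler complain, far away from where the mistake was written.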

I would say that my first implementation attempt was still not completely worthless: it was necessary to understand where I need some flexibility and where I do not, and how the pieces should fit together. In the end though, I reimplemented my algorithms one last time in modern Java, using a nice framework into which they could be neatly integrated, because I hoped that the framework would live longer than my C++ prototype, which was already starting to decay (the first GitHub issue was opened after a year by someone complaining about compilation problems that appeared because a library I was using had changed its API). This final reimplementation was a much nicer experience than fighting with C++, in fact so nice that I do not have much to say about it. Maybe it was the combination of a saner language with actually knowing what I was doing when implementing it for the third time. At least now I know that for many reasons I would very much prefer a Java job over a C++ job (if it has to be one of those two), so this helped guide my job search preferences later on.

Slowly getting rid of the training wheels

I wrote two papers with my advisor about the correspondence that I mentioned and about some extensions and improvements that I implemented in my prototype tool, which is based on our neat theoretical result. Then, after having learned all those valuable lessons about “the process” of doing theory, in my second year I was working mostly alone and in a more open-ended direction, and I was starting to see my advisor much less often.

This time, my advisor did not give me any concrete ideas to work out and just told me to look into a topic myself and find something interesting to do. Apparently I had successfully graduated from academic primary school, so I was let loose a bit more. I actually actively avoided involving my advisor in my new endeavour until almost the very end, because I wanted to prove to myself that I can find a worthwhile idea to work on and complete it without much help.

Again I was lucky, because I quickly found an interesting thing to investigate thanks to the first paper we had written, which initially looked just neat, but not that exciting. My research agenda in the new topic I was diving into was to take some ideas and a certain classification that we presented in the first paper and see whether using these ideas has any effect in the different setting I was now working in. It turned out that it did, and in my opinion, the results from our first paper became much more interesting in this new context, because they have further consequences that they do not have in the original context. I worked it all out mostly on my own and could proudly show it to my advisor with almost the whole puzzle already completed. With some help, I was able to finish the few proofs I was stuck on, so that this work of mine became the central part of our fourth paper. I completed the draft by late summer, and the final version was eventually accepted at a conference in winter, by the end of my second year.

Falling back to earth

Hitting more walls

Unfortunately, this turned out to be my last academic success. After uncovering and working out the connections that we published in the fourth paper, my advisor suggested some more things I could look into in the same area. I worked on them for almost half a year, twisting ideas around and thinking in spirals and circles, but the solution, a working proof for statements that both my advisor and I conjectured to be true, did not come any closer.

After having such a successful streak of four papers, my own expectations and what I imagined to be the expectations of others put me under a nagging pressure, as I felt that I should have something new to deliver soon. When you are working in theory, the papers you publish are the only tangible proof of your work, and when you have no new results, then from the outside it is hard even to see whether you are working at all (except for a growing pile of useless notes with approaches and ideas that do not work).

After having a boost in self-esteem due to the successes I had so far, my doubts were coming back as time went on. I had no new publishable results, and the writing of the dissertation and the planning of my future after it appeared on the horizon. My frustration was growing and each week it was becoming harder to stay focused and try to find still unexplored approaches to solve my problems. The four papers were enough for writing my dissertation, but I still had time and of course I really wanted to keep up the pace, producing more results and publications. I was still not sure whether to stay in science or leave for industry, but with some colleagues producing twice as much output, I thought that if I cannot keep up with producing papers on a regular basis, I am wasting my own and everybody else’s time and should clear the area for those obviously better people.

To PostDoc or not to PostDoc

By that time, my advisor had asked me whether I would like to stay for a year longer as a PostDoc after completing my PhD, as he was writing a new proposal for another PhD student, which could be combined with a year of additional funding for me. Usually, you work in PostDoc positions for 3-6 more years until you hopefully get something permanent. Unfortunately, there are not many permanent positions and essentially this path leads either to eventually becoming a professor, or to leaving for industry after wasting some more years of your life in stressful PostDoc conditions. Unlike during a PhD, where you are not yet working in a niche of a niche of science, as a PostDoc you already have “your thing” and are firmly rooted in a certain subdiscipline. As I understand it, when you are offered a position, you cannot afford to decline if you want to stay in business, so relocating to whatever institution wants to fund you is the norm.

Having seen and met people at various stages of their research career and the banners of “excellence” decorating the competitive air inside the ivory tower, I have come to the conclusion that to succeed on this path you must:

be ridiculously good at research so that your bright future is beyond question, or
be a reasonably good researcher and also a good networker, building a publication record and a job safety net out of the people you could work with, or
have a high flexibility concerning your working conditions and high tolerance for uncertainty concerning your future and life planning.

The prospect of a nomadic and extremely competitive PostDoc life, combined with my more recent inability to produce new results and the understanding that my advisor will not always be around to help me out and spoon-feed me good ideas, all of this was scaring me. Besides forever lengthening your list of publications, you also become responsible for writing applications to get funding. You have to come up with larger-scale projects and show results to the funding agency. And if you eventually are able to get a professorship, you essentially become a scientific manager and have to do various kinds of bureaucratic chores. Is this the life I would really want? Am I willing to take the risk of failure on this path many years later?

After weeks and months of introspection I came to the conclusion - No. I love the idea of being a scientist, like many people love the idea of having a dog. When you think about having a dog, you often overlook all the inconvenience and work that having an animal causes. Or stated differently, if you feel that those inconveniences are really a problem for you, then you probably just should not have a dog, if you are a responsible person. I think it is even more so with the inconveniences of working in science.

I have seen people who made it and people trying to make it, and I just cannot imagine walking in their shoes. I do not fit any of the three profiles that I outlined above. Yes, you have as much intellectual freedom as is possible to have, and like an artist creates art you can follow your research interests (influenced by research trends and funding opportunities, but nothing is perfect). But also like an artist, you live from gig to gig, from funding to funding during a significant part of your life. And there is more to the job than research and teaching, many things to do that you are not getting paid for, like writing reviews, organizing conferences, doing administrative work at your institution, and the list goes on.

To be a good artist you need to have real passion for your work, you need to love your work more than yourself, and it is the same in science. I lack that passion and love. I enjoyed working on my topic, but I am not dreaming about doing similar things for the rest of my life. I am interested in many different things, but in no thing so deeply that I would agree to suffer for it. If I could have a permanent position being a PhD student forever, I would probably take it, but because this phase must come to an end and I am not ready for the next step, it means that I have to go.

Concerning the PostDoc that my advisor wanted to offer me: I agreed, thinking that I could use this additional year as a kind of extension of my otherwise fast-paced PhD that was shaped by the rules and structure of the RTG. I thought that with one more year I could find out another thing or two and maybe write a last paper, as after all, I do enjoy research. Leaving a year later would not change much in the grand scheme of things and this extension would demand no additional effort or risk-taking from me. But life decided that I would not get this gift. Even though the proposal of my advisor got positive reviews, apparently the requested funding was cut for other reasons and my one-year PostDoc was not granted. So after defending my dissertation my academic journey will finally come to an end.

Life in academia - a summary

Even though it might not read that way, because I focused much on the struggle I had and mistakes I made, I want to point out that all in all I had a really great time during my PhD. I was very lucky to have this journey in a supportive and friendly environment surrounded by amazing people and have really nothing to complain about. In the end, this was an important chapter of my life that taught me many lessons and enabled much personal growth. All the time I was the only one standing in my way. I just eventually realized that the romantic idea of being a professional researcher is not what it is like in practice and that I cannot imagine continuing on this path due to lack of strength, confidence and willingness to suffer for an unknown amount of time for an unsure reward (a permanent researcher position) for which I am not even sure anymore whether I really would want it.

I will really miss:

The artist-like freedom and creativity. No one is pressuring you to deliver results next week and you can think about a problem for months, until you either solve it or pick a different target to attack. No one cares how you spend your time as long as your responsibilities are being done.
The high that you feel after finally finding a solution to a problem, after obsessing over it for a long time. These moments are rare and at the same time what keeps you going.
Being around awesome and very smart people. Talking about nerdy math things as casual small talk during lunch breaks. The kind of people you are surrounded with was for me one of the best parts of the whole experience.
International travel. While this is a somewhat controversial topic from an ecological point of view, all the travelling in connection with scientific exchange (which I would say is necessary, because virtual meetings have a different feel to them and lack spontaneous interactions) allowed me to visit very interesting places.

I will definitely not miss:

Fighting alone with my problems. I had more fun working on the topics where my advisor was more involved and we talked more regularly than when working on the topics that I researched mostly on my own (even though the accomplishment of solving something without much help from the advisor feels great). For me, no collaborations except with my advisor emerged, so it was mostly lonely work.
Sitting in boring and/or horrible talks. While I was somewhat enjoying listening to other talks in the beginning, this changed over time. At conferences, there are often sessions on topics I really don’t care about and talks that are just really bad. Both my general interest in arbitrary technical topics and my tolerance for bad presentations decreased a lot by the end.
Reviewing papers and advising students when the writing is rather bad. It is quite an exhausting and somewhat daunting task to read someone’s work about a sometimes quite unfamiliar topic and be required to assess its correctness, especially if you always doubt your own abilities.
Impostor syndrome, feeling like a failure, and sleepless nights. Everyone has their own weaknesses of character, I am tired of fighting with mine. I think that this kind of job would not be healthy for me in the long term. The gut feeling of dread I felt when thinking about the challenges of pursuing a scientific career was the tipping point for deciding that this kind of life is simply not for me, because I need more security than this career path can offer.

If you do a PhD in a theoretical field, you (hopefully) will:

understand how the scientific community is organized and how doing research works,
understand what it takes to be a successful professional researcher,
learn to think like a theoretician of your discipline,
learn to read advanced technical writing effectively,
learn to write down and present your insights and results understandably,
learn to organize and motivate yourself in the long-term and grow as a person, and
have some fun doing research and being a little cog in this glorious machinery producing knowledge out of coffee (or whatever your favorite beverage is).

I enjoyed my time as a theoretical computer scientist in training and think that I learned many skills that will also be helpful outside of active research. So if you happen to have a good opportunity (according to the criteria I explained in the beginning) and are looking for a challenging, but also uniquely rewarding life experience, just give it a try and you will probably not regret it.