Sander van der Burg's blog: Software engineering fractions, collaborations and "The System"

As I have explained many times on this blog, software engineering is complicated. Quite often, in order to solve problems or to gain knowledge, people collaborate with each other by various means. Nowadays, we have a wide range of means to share information and to collaborate. A few examples are:

Academic research publication libraries: ACM Digital Library, IEEE Xplore, SpringerLink
Conferences:
- Academic: ICSE, ASE, ISSRE, SPLASH
- Industrial:
- Free and Open-Source software related: FOSDEM, EclipseCon, LinuxCon
Internet services:
- Question and answer websites: Stack Overflow
- Technology related websites: Slashdot, Phoronix, Ars Technica
- Social news websites: Reddit
- Source code sharing and collaboration: SourceForge, GitHub
- (Micro)blogs: Twitter, Blogger
- Messaging: IRC, Mailing Lists

As you may notice, some of these means are quite common in certain fractions of the software engineering community and very uncommon in others. I can roughly divide the software engineering community into three fractions, having a number of distinct characteristics, interests and peculiarities (beware that I'm using stereotypes here and these fractions are not necessarily mutually exclusive):

Academic fraction

This is the group to which I currently belong. Academics are people working for a university and their main goal is doing scientific research. As I have explained earlier, scientific research within the software engineering domain is a bit strange, because there is no clear consensus on what this actually means. Earlier, I have dived into the literature and I have found a definition, which I will rephrase here once more:

Research in engineering is directed towards the efficient accomplishment of specific tasks and towards the development of tools that will enable classes of tasks to be accomplished more efficiently.

The "deliverables" that academic people produce are, in principle, papers published in peer-reviewed academic conference proceedings and scientific journals. Paper submissions are typically in competition with each other and only papers that are good enough are eligible for publication. The software engineering domain is a bit exceptional compared to many other scientific disciplines, because conference papers are more popular than journal papers.

Academic conferences are primarily about presenting research papers. Most of the attendees of these conferences are other academic people. According to some sources, there used to be a high degree of industry participation in the past, but this is no longer true, unfortunately.

Industry fraction

The primary goal of industry is (not surprisingly) to develop and sell software (as a product or as a service) or to provide IT services and make a profit, which is (preferably) as high as possible.

In order to achieve that goal, industry typically wants to be as cost efficient as possible. Therefore, companies don't want to invest too much time and effort in secondary goals, such as developing software tools, as these tools cost money and do not immediately give them any profits. Rather, they want to focus as much as possible on their primary goals.

The conferences that industry people attend are often related to the technology they are using. For example, companies using Java typically attend JavaOne, and companies using Microsoft technology may attend Microsoft TechEd. Companies may also participate in "trade show" conferences, such as CeBIT, to advertise their products and to attract potential new employees.

Community project fraction

Another fraction is what I call the "community project" fraction. Typically, a lot of people refer to this fraction as "Open-Source projects", but I don't want to refer to them like that. While most community projects are distributing their software under free and open-source licenses, there are also commercial parties doing this, without outside involvement. I have written an earlier blog post about Free and Open Source software explaining what this is all about.

Community projects are usually formed by various individuals with various affiliations, who share a common interest and work together on common goals. There are many prominent examples of community projects around, such as the Linux kernel and KDE.

Another notable trait of community projects is that nearly all contributors are also users of the software. Many community projects have an informal organization structure and copyright is owned by each individual contributor. Some community projects are also governed by a legal entity owning the entire codebase, such as the Free Software Foundation, Apache Software Foundation and the Eclipse Foundation.

There are also a number of free and open source conferences around, such as FOSDEM and LinuxCon. Most attendees of these conferences are either users of or contributors to community projects and are very much interested in new capabilities of software.

"The System"

The industrial and academic fractions have "targets" which they have to reach within a certain period of time. Usually this period is short term. These fractions also want to grow, improve and perform better than the competition.

People in both fractions are periodically assessed by some kind of measurement standard, indicating whether the results (according to this standard) are satisfactory and have increased enough. As a consequence, people in these fractions have a tendency to do as much as possible to improve these numbers according to this standard, rather than doing what has to be done. I call this phenomenon "The System".

Implications of "The System" on the academic fraction

In the academic world, publication records are used as the main productivity / efficiency measurement unit. This often means that the more papers you produce, the better you are considered to be as a researcher. Furthermore, various other publication quality attributes are typically taken into account, such as the ranking of the conference or journal and the number of citations that you have. Some metrics that measure a researcher's productivity are the g-index and the h-index.
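
To give an impression of how mechanical these numbers are, here is a minimal sketch in Python of how the two indices can be computed from a list of citation counts. It assumes the commonly used definitions: the h-index is the largest h such that h papers have at least h citations each, and the g-index is the largest g such that the g most cited papers together have at least g² citations (capped here at the number of papers, which is one of several variants). The function names and the sample citation counts are made up for illustration and do not come from any real bibliometric tool.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers together have >= g*g citations.

    Assumption: g is capped at the number of papers.
    """
    ranked = sorted(citations, reverse=True)
    g = 0
    total = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical citation counts for five papers.
papers = [10, 8, 5, 4, 3]
print(h_index(papers), g_index(papers))
```

Note that one heavily cited paper raises the g-index but hardly affects the h-index, which is one reason why both numbers are sometimes reported side by side.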

Because publications are the primary (or sometimes the only) measurement unit of research, many researchers primarily work to improve these numbers. In my opinion, this is a bad thing.

As a consequence, many researchers spend most of their time aiming at a collection of academic conferences and journals. Each conference and journal has particular requirements, boundaries, traits and peculiarities, such as the allowed subjects, page limits, evaluation methods, and topics that you know will be well received by the Program Committee members and topics that will not. One of the "tactics" researchers use to get their paper accepted is 'identify the champion', which means that you have a look at the list of Program Committee members and write your paper to please at least one of them, so that he or she will probably vote in favor of accepting your paper.

Sometimes, I get the impression that doing research this way looks like a darts game, in which you keep aiming at a fixed number of sections, and keep modifying your arrows until you hit the right score.

If you look at the definition of 'research in engineering' I have given earlier, writing publications and improving publication records is not the only thing that needs to be done. For example, sometimes "uncommon" aspects also have to be investigated, which do not necessarily produce great results, but are nonetheless worth knowing.

Furthermore, new software engineering concepts typically result in tools which have to be developed. Eventually, ideas developed in the academic world have to reach a broader audience and I think tools are the primary way to achieve that goal.

A lot of researchers refrain from taking these steps, because "The System" pushes them to do so. I have heard some people saying: "You shouldn't spend so much time on development. Just develop a prototype and then move on to your next goal!". As a consequence, lots of papers tend to become "forgotten knowledge", and the rest of the software engineering community doesn't care and perhaps reinvents the wheel some day, but implements it in a much crappier way.

Implications of "The System" on the industry fraction

In industry, quite often developers are seen as "code production units" and they have to be used as efficiently as possible. In order to be as efficient as possible, it is desired to reduce as many costs as possible and employees should focus on the primary goals of the company as much as possible. Some of the 'solutions' companies implement are to outsource labour to countries in which salaries are lower, to delegate certain tasks to other companies, or to rely completely on an external vendor to provide a solution.

In my opinion this is a bad thing for the following reasons (I have taken these arguments from my colleague Rini van Solingen's video log: vlog episode 2 (in Dutch)):

Developing software is not necessarily a process that merely costs some amount of money. Software development is an investment and also gives benefits. The benefits are often overlooked by a lot of managers. Reducing costs, e.g. by hiring cheaper employees with less knowledge, may typically result in fewer benefits.
In order to reach your primary goals, you also have to reach secondary goals. For example, banks are not really software companies, but they have to become software companies because their organization depends completely on software.

I think the same is true for software companies doing software engineering. People developing financial software may not want to think about build management complexity issues, but they have to, because otherwise it is impossible to properly engineer systems for end-users.

Many companies have the tendency to delegate secondary problems as much as possible, with the assumption that others can do it better, cheaper and more efficiently. In my opinion this is not always true. Sometimes secondary problems are so specific to a particular industry that no external vendor, or anybody else, is willing to provide a general solution for them.

In such cases, you have to solve these secondary problems yourself, but many organizations refrain from doing so. They decide to keep living with the burden and prefer to be inefficient. I have worked for several companies (which I'm not going to mention here) and I can speak from experience. I had several unconventional ideas in mind back then, and the only thing I was doing was fighting resistance, while I could have solved many secondary problems already.

Another trait I have frequently encountered is that some companies are afraid to participate in other communities, because of the potential advantages their competitors could get. I think for most secondary goals this is not really an issue.

Implications of "The System" on community projects

As far as I can tell, there is no "System" for community projects, as these projects are typically not bound to deadlines or formal assessment procedures. Community projects basically have to keep themselves and the community as a whole happy. Furthermore, these projects are not composed of members of a single organization, but of various individuals all over the world. However, community projects also have a few peculiarities that I'd like to mention.

Quite often, because the developers are also the users of a particular application, it is difficult for non-technical users to get involved and have their problems solved. Sometimes applications delivered by community projects are seen as very unfriendly by non-technical users.

Also, there are "social issues" in community projects, such as developers who immediately feel offended when they receive criticism (either about them personally or about the project in general), and developers pissing off non-technical users by claiming that they are stupid and don't need a particular feature. The Linux Haters Blog often elaborates on this issue. Another famous phenomenon is "bikeshedding", in which big discussions are held over relatively minor problems, while important bigger problems are overlooked.

Discussion

"The System" of each of these fractions has arisen with the intention of improving that fraction, but personally I think that these "Systems" actually conflict with each other and make things worse, not better.

In principle, the academic fraction (which investigates software engineering techniques) solves secondary problems for the industry fraction. But in order to completely "solve" a secondary problem, usable tools have to be developed, which academic people refrain from doing because it is a waste of time according to their "System".

Second, industry has to focus on its primary goal (because its "System" requires that) and doesn't want to spend too much effort on secondary goals, such as working together with academics to successfully apply research, or getting involved with community projects to share knowledge.

It almost looks a bit like a prisoner's dilemma to me. Collaboration between these fractions obviously requires several small sacrifices, but it also offers all parties benefits. I also see community projects as a good means of collaboration between academia and industry (and anyone else who is interested). Although this is obvious, all fractions stick to their "System" and, as a consequence, they diverge from each other and don't benefit from each other at all.

I have a few concrete examples of this:

The Dutch government, as well as the industry in the Netherlands, spends a relatively small amount of money on research, while they want our country to be a 'knowledge economy'. It's actually the least of all countries in the European Union. The government is planning to reduce these investments even more. They expect that companies invest more in research, but in my experience, apart from a few notable exceptions, most companies are reluctant or have no clue what is going on in the research world. We have very good researchers in the Netherlands, but their work isn't applied that well in industry. In my opinion, that's a shame and a waste. (I have used this blog post as a reference (in Dutch))
As mentioned in an earlier blog post, industry participation at academic conferences is low. Industry people often have different interests than academic people. Nowadays, they would rather attend technology related conferences. For industry people, the application and benefits of tools and technology are important. More important than a mathematical proof or an evaluation showing numbers, which they don't understand.

I have encountered several boring presentations at academic conferences, showing lots of "Greek symbols" and all kinds of complicated things I didn't understand. I'm pretty sure that if attendees from industry were in the audience, they would have no clue what the presenter was talking about and would quickly lose interest.
Sometimes people reinvent the wheel, but in a crappier way, which they have to maintain themselves. In my research, for example, I have seen many custom build systems that have a significant maintenance burden. People usually stick to these suboptimal solutions for a long time, while there are many solutions available that are more convenient, more powerful and easier to maintain.

Recommendations

In order to ease the struggle of diverging fractions, I think all fractions have to cooperate better with each other (which is obvious of course) and let go of their "System" to some degree (or better: make sure that the "System" changes). I have a few recommendations:

I think "The System" of academic researchers shouldn't be merely about publications. Furthermore, it shouldn't be merely about this fixed collection of conferences/journals, each having their own "borders". Publishing stuff is not a bad thing in my opinion, as concepts must be properly explained, evaluated and peer-reviewed. But concepts are useless without any deliverables that can be applied.

"Playing darts" by only aiming at "sections on a dart board" is bad for research. For example, Edsger Dijkstra, one of the most famous computer scientists, published a lot, but most of his publications were his "EWDs": manuscripts which he wrote about any subject he wanted, whenever he wanted, without aiming at anything or keeping some kind of "System" satisfied.
As an academic software engineering researcher, you are typically investigating stuff for some kind of "audience" (e.g. developers or testers) and it may possibly be related to a certain kind of technology (e.g. Java, Eclipse, Linux etc.). Therefore, I think it's also important to directly work with people from these fractions, address them regularly (e.g. visit them and participate in their technology-related conferences) and see whether you can make your work interesting for them.
It's also a good thing to make your tools available by some means. Perhaps joining a community project or starting your own community project can be a good thing.
Companies must know that developing software is an investment and that secondary goals have to be reached in order to achieve primary goals. Some secondary goals cannot be reached efficiently by third parties, as they are too specific to the domain of the company.
Companies should not be afraid to participate in other fractions' community means. In fact, they should be more eager to find out what other fractions have to offer them.

These recommendations may look challenging, but I think the barrier to building better "bridges" between software engineering fractions isn't that high. I have outlined a list of various means at the beginning of this blog post that may help you, and I think they are relatively easy to apply without any additional costs.

For example, besides publishing academic papers, I also have this blog and I use Twitter to regularly report about results, findings and other stuff. Furthermore, the tools we are developing are made available to everyone through a community project, called the Nix project.

Apart from academic conferences, I have also presented at an industrial conference as well as at FOSDEM, the biggest Free and Open-Source conference in Europe. Actually, our work has been very well received there. Far better than at any academic conference I have attended so far.

Why am I writing this?

With the work that I'm doing as a PhD student, I'm trying to address the complete software engineering community, not just a small subset. It's also a shame if 4 years of hard work becomes forgotten knowledge that nobody cares about.

Furthermore, I think that for many reasons, building "bridges" between these software engineering fractions is essential and gives all fractions benefits. But the "Systems" of all these fractions (which are basically there to improve them) drive these fractions apart.

Besides publishing papers, I have spent a considerable amount of time on the development of tools, case studies and examples. For example, according to the COCOMO estimation method of ohloh.net, I have spent 2 man-years of effort on Disnix (and I'm the sole author). Furthermore, I have also developed several extensions to Disnix and I have also made many contributions to other Nix projects. Apart from development, I'm also maintaining this blog in which, apart from my research, I report about several other technical issues and fun projects.

While all this work is very much appreciated by the people I work with and talk with, it is actually a waste of time according to "The System" of my fraction. If I had stuck to "The System", then this blog would never have existed. Moreover, I would never have produced the following blog posts, because I don't know how to submit them to any academic conference or journal:

These blog posts are very useful and appreciated. Perhaps not for academic people, but certainly for the other fractions! Finally, these blog posts have attracted many more readers than all my academic papers combined.

P.S. If anyone knows how to "sell" this stuff "scientifically" and knows how to make a "dart" out of this which I can throw in a good "section", please let me know! I'd like to integrate this stuff in my PhD thesis, which is primarily about research! ;)

Conclusion

In this blog post, I have identified three fractions within the software engineering community. Each fraction has its distinct characteristics. The academic and industrial fractions each have a "System", which has arisen to improve the individual fraction, but drives the fractions apart from each other. I have proposed a few recommendations to "bridge" these fractions, but in order to achieve that goal, they have to let go of their "System", which is not easy.

As a final point, I'd like to point out that I have used stereotypes to describe these fractions. These descriptions do not always accurately reflect what happens in the real world. In practice, not every academic researcher is completely focused on writing papers. I know many researchers besides me who write tools and make them publicly available. Actually, my supervisors encourage me to work on tooling and they appreciate the work I'm doing. Furthermore, many of my colleagues also have very good collaborations with other fractions, and often present at non-academic events, which I'm very happy about.

I also said that academic presentations are boring. While I have attended quite a number of boring presentations, I have also seen many good ones, which I liked very much. It's a shame that the rest of the software community does not know about these.

Furthermore, companies aren't necessarily completely driven by making profits. They also care about their customers to some degree and think about improvements in technology. And yes: there are collaborations between these fractions which sometimes produce good results.

But nonetheless, although the real world is a bit better than the "stereotype" world I have described, I still see a lot of room for improvement in "bridging" these fractions.

Tuesday, April 17, 2012