Something has been bothering me for quite some time, so I think I’ll climb up on my soapbox for a bit here.
The modern scientific and mathematical community relies heavily on mathematical software for research and computation of pretty much everything. Unfortunately, from what I’ve seen, most of these tools seem to be closed source. MathWorks’ MATLAB, Maplesoft’s Maple, and Wolfram’s Mathematica are all giant, widely used programs at the heart of a great deal of research. Don’t get me wrong here: I don’t intend this as an all-out blast on these companies, and I happen to think these tools do their job pretty well. However, they are all closed source, and I find this to be unacceptable. Why? The reasons are simple, and boil down to basic scientific tenets, like transparency in published work and the philanthropic ideal of spreading enlightenment to whoever is willing to learn.
Transparency in every aspect of your work is paramount in conducting and publishing research. If I wrote some 400-page research article that concluded that the universe was fundamentally made up of Wendy’s Baconator sandwiches, but on page 232 I made some outrageous statement and backed it up with “trust me on this one, I’m right”, the whole paper would be moot. I would not have provided the transparency others need to critique and inspect my conclusion. The closed source nature of mathematical programs allows for an unfortunate opaque step in the scientific process that is no different from my omission on page 232. It is saying, “Just trust me, MathProgram2008 is correct”. Sure, the big names in mathematical computation all have very well written and extensively tested programs. They aren’t perfect, though (no software is), and the necessary peer criticism and scrutiny stops as soon as data enters MathProgram2008.
Imagine this scenario. Scientist Bob devises some giant mathematical simulation that requires enormous computation. It could be computing the effect of the magnetosphere on global weather, or simulating a fission chain reaction; the exact nature of the computation doesn’t matter. Scientist Bob, having been taught at his alma mater to use a closed source tool when doing extensive calculations, codes up the simulation and runs it. Let’s say that, unbeknownst to Bob, his closed source tool has a bug that manifests itself in the quadrillionth iteration of the simulation. Bob’s code is flawless, but the closed source program has this funky, obscure bug that fundamentally alters the conclusion of the experiment. He has some major revelation based on false information and publishes his work. This wrong revelation influences public policy, or changes the business plan of a company, before a smarter scientist is able to find out that Bob was wrong. Would using an open source program have averted the false conclusion? Maybe not, but at least the peers reviewing Bob’s work would have been able to say, “This conclusion is wrong because of an off-by-one error that occurs in line 343 of the program’s cosine.c on the quadrillionth iteration.” Furthermore, some of the work might be salvageable, because the reviewing peer is able to find the exact logic flaw that caused the false conclusion and make appropriate corrections to the work. In a closed source math tool world, the reviewing peer can only say, “You’re wrong, Bob, and we don’t know exactly why.” Open source code in computational tools clearly aids the scientific process better than closed source, even though 100% correctness is not possible in either model.
There’s also another huge thorn in my side with these tools: they cost an arm and a leg to buy. Personally, I want every human to have the ability to advance scientific knowledge. Is there a minimum economic and educational bar that you have to meet in order to be able to make meaningful scientific strides? Sure there is, but we should aim to set this bar as low as possible so as to include a greater subset of humanity in the scientific process. The more minds that consider a problem, the better the chance that the problem will be solved. If CompanyX charges $400 for what is quickly becoming the de facto computational tool of the scientific process, poor college students (like me) as well as budding Somali intellectuals with their $100 OLPCs are economically denied a tool that they need to do their potentially valuable work.
The most common retort to this argument is always, “These tools are complicated and tough to make, and need a company behind them to support! We need to charge $400 to be able to provide this service!” The software community has proven time and time again that a community approach to complexity is indeed a feasible and stable way of handling intricate problems. The community approach would be supported by the brightest minds the world over. Scientists from giant corporations, government labs, the halls of universities, and the basement of the next Einstein would all be working collaboratively to advance and solidify the computational tools. What single company could provide that depth and breadth of experience?
Mathematical software, closed source or open source, is not perfect. It’s also infeasible to say that we should do all calculations by hand for a thesis to be valid. But should we, as scientists, be denied basic, source-level transparency in our fundamental scientific tools? I think not. Call for transparency when a research article publishes the result of extensive data mining. Reject closed source data processing tools on the grounds that they deny access to the less affluent and that they create an opaque black box in any scientific research that uses them. Look to open source mathematical programs like Scilab, gnuplot, Octave, and the like. Appeal to your colleagues to do the same. Science and math are fundamentally an “open source” endeavour; their tools need to be as well.