Reject Closed Source Mathematical and Scientific Programs

There’s been something that’s been bothering me for quite some time, so I think I’ll climb up on my soapbox for a bit here.

The modern scientific and mathematical community relies heavily on mathematical software for research and computation of pretty much everything. Unfortunately, from what I’ve seen, most of these tools are closed source. MathWorks’ MATLAB, Maplesoft’s Maple, and Wolfram’s Mathematica are all giant, widely used programs that researchers depend on extensively. Don’t get me wrong here: I don’t intend this as an all-out blast on these companies, and I happen to think these tools do their job pretty well. However, they are all closed source, and I find this unacceptable. Why? The reasons are simple, and boil down to basic scientific tenets, like transparency in published work and the philanthropic ideal of spreading enlightenment to whoever is willing to learn.

Transparency in every aspect of your research is paramount when conducting and publishing it. If I wrote some 400-page research article that concluded that the universe was fundamentally made up of Wendy’s Baconator sandwiches, but on page 232 I made some outrageous statement and backed it up with “trust me on this one, I’m right”, the whole paper would be moot. I would not have provided the transparency others need to critique and inspect my conclusion. The closed source nature of mathematical programs introduces an opaque step into the scientific process that is no different from my omission on page 232. It is saying, “Just trust me, MathProgram2008 is correct”. Sure, the big names in mathematical computation all have very well written and extensively tested programs. They aren’t perfect though, no software is, and the necessary peer criticism and scrutiny stops as soon as data enters MathProgram2008.

Imagine this scenario. Scientist Bob devises some giant mathematical simulation that requires enormous computation. It could be computing the effect of the magnetosphere on global weather, or simulating a fission chain reaction; the exact nature of the computation doesn’t matter. Scientist Bob, having been taught at his alma mater to use a closed source tool when doing extensive calculations, codes up the simulation and runs it. Let’s say that, unbeknownst to Bob, his closed source tool has a bug that manifests itself in the quadrillionth iteration of the simulation. Bob’s code is flawless, but the closed source program has this funky, obscure bug that fundamentally alters the conclusion of the experiment. He has some major revelation based on false information and publishes his work. This wrong revelation affects public policy, or changes the business plan of a company, before a smarter scientist is able to find out that Bob was wrong. Would using an open source program have averted the false conclusion? Maybe not, but at least the peers reviewing Bob’s work would have been able to say, “This conclusion is wrong because of an off-by-one error that occurs in line 343 of the program’s cosine.c on the quadrillionth iteration.” Furthermore, some of the work might be salvageable, because the reviewing peer can find the exact logic flaw that caused the false conclusion and make appropriate corrections to the work. In a closed source math tool world, the reviewing peer can only say, “You’re wrong, Bob, and we don’t know exactly why.” Open source code in computational tools clearly aids the scientific process better than closed source, even though 100% correctness is not possible in either the closed source or the open source model.

There’s also another huge thorn in my side with these tools: they cost an arm and a leg to buy. Personally, I want every human to have the ability to advance scientific knowledge. Is there a minimum economic and educational bar that you have to meet in order to be able to make meaningful scientific strides? Sure there is, but we should aim to set this bar as low as possible so as to include a greater subset of humanity in the scientific process. The more minds that consider a problem, the better the chance that the problem will be solved. If CompanyX charges $400 USD for what is quickly becoming the de facto computational tool of the scientific process, poor college students (like me) as well as budding Somali intellectuals with their $100 OLPCs are economically denied a tool that they need to do their potentially valuable work.

The most common retort to this argument is always, “These tools are complicated and tough to make, and need a company behind them to support! We need to charge $400 to be able to provide this service!” But the software community has proven time and time again that a community approach to complexity is a feasible and stable way of handling intricate problems. The community approach would be supported by the brightest minds the world over. Scientists from giant corporations, government labs, the halls of universities, and the basement of the next Einstein would all be working collaboratively to advance and solidify the computational tools. What single company can provide that depth and breadth of experience?

Mathematical software, closed source or open source, is not perfect. It’s also unfeasible to say that we should do all calculations by hand for a thesis to be valid. But should we, as scientists, be denied basic, source-level transparency in our fundamental scientific tools? I think not. Call for transparency when a research article publishes the result of extensive data mining. Reject closed source data processing tools on the grounds that they deny access to the less affluent and that they create an opaque black box in any scientific research that uses them. Look to open source mathematical programs like scilab, gnuplot, octave, and the like. Appeal to your colleagues to do the same. Science and math are fundamentally an “open source” endeavour; their tools need to be as well.


11 Responses to Reject Closed Source Mathematical and Scientific Programs

  1. Christopher says:

    Thank you for the post! I might also add (as I think that you mentioned before) that code itself is a form of knowledge and scientific research. How are all those basement supergeniuses going to create improved fission reaction calculators if they can’t see the code to the work already done, and if they aren’t legally allowed to make improvements anyway? (Aside from all of them applying for a job at that company…)

  2. Fred says:

    You could add SAS to that list. I try to use R in all the research I can because SAS is closed source.

    I strongly agree with your article, though. It is unacceptable to require peer review on one hand and use non-peer-reviewable methods on the other.

  3. kostasan says:

    Great post! Knowledge and science should be free for all humanity. Imagine ancient Greeks refusing to reveal (or patenting, to speak in modern terms!) mathematics, geometry, physics or philosophy!! We would be in the dark ages for millenniums to come.
    And yes, I am Greek, as you have figured out already :)

  4. Niels Egberts says:

    As a student I’m using Maple to help me with my differential equations. I don’t know of an open source project that can replace Maple. Do you know of any?

  5. Stefan says:

    Hi,
    thanks for that post. Over the last few years I have thought the same every now and then. Since this month I have been actively trying to teach my fellow researchers that they can use python scipy/numpy/matplotlib instead of IDL and MATLAB (IDL is heavily used in astrophysics and is closed source).
    I would love to see more scientists switch to scipy (http://www.scipy.org/) because it is really good and open.

  6. JsonRenlan says:

    I’m sorry, but I disagree. Yes, it would be nice to have access to the source code for these exceptional programs, but that is out of the question. After all, you can understand why: many are very complex and represent serious investments for the companies that develop them, and simply offering everything to the general public is just not feasible. In this situation, since there are no open-source alternatives … well, not any good ones anyway, you have to use the tools that you have at your disposal.

  7. lorrain says:

    You can use SAGE. Read all the posts of ‘waf’ in

    http://worldforum.pardus-linux.nl/

    or

    http://worldforum.pardus-linux.nl/index.php?board=12.20;sort=starter;desc

    and you’ll find a lot of math free software. I use Pardus:

    http://www.pardus.org.tr/eng/

  8. Fred says:

    @Niels Egberts:
    Maxima (it has a gui frontend called wxmaxima) is one I used in my multivariate calculus class.

    @JsonRenlan:
    Most medicines (drugs) are very serious investments by drug companies. Yet the law only allows them to keep the patent to the drug for a few years. The reason why is because the company gets the time to recoup costs and maybe make some profit, and then the drug is available to generic drug makers so that all may benefit, researchers and patients alike. It makes no sense to put software research on a pedestal. If I said something like “And then we did something (I won’t reveal what since this method is a serious investment for my sponsors, so I can’t just give it away to the general public!) and then cold fusion happened!” (excusing the fact that my writing in that sentence is not the style of research reports), nobody would accept that. They would expect the methods to be published in full transparency. I don’t trust research that isn’t peer-reviewed. It can easily be full of snake oil because there’s no way to prove it right or wrong. We have *every* right to expect that from software, because it is a *part of the research method*.

  9. mWo says:

    It’s true. Whenever you want to do some research you mostly have no choice but to use commercial software. And it does not matter what kind of calculations you do. I often read medical papers, and in the majority SAS or SPSS is used for the simplest things, such as calculating means and standard deviations or performing a t-test or some correlation. I don’t remember any paper published in a top journal in which R, scipy, octave or open office was used just to calculate these simple parameters.

    My point is that using commercial software seems to be preferred. Why? I don’t know.

  10. Math Help says:

    Hello fellow blogger! I’m new to blogs but I just wanted to say that I like your blog here on math help differential equations. It kept me reading all the way to the end… And then I went and searched for some more posts after that. 🙂 Keep up the good work, I’m always looking to learn more about Math Help, in particular.
