Lorena A. Barba group

A short lecture on Open Licensing


Also on SpeakerDeck, for nicer viewing.

A lecture as part of the workshop "Essential skills for reproducible research computing," at Universidad Técnica Federico Santa María (January 2017).

[slide 3]
Syllabus of the workshop.

[slide 5]
In an October 2015 interview with mathematics professor and blogger Robert Talbert, I answered this question: “Why do you advocate so strongly for open-source technology in research and education?” I started by clarifying what we mean by “open.”

[slide 6]
"Free and open-source software (FOSS) is a human invention of tremendous impact. It poses an alternative to intellectual-property instruments that are limiting and want to control how a creative work is used. 
Open-source licenses allow people to coordinate their work freely, within the confines of copyright law, while making access and wide distribution a priority. 
I’ve always thought that this is fundamentally aligned with the method of science, where we value academic freedom and wide dissemination of scientific findings. […]"

[slide 7]
Open licensing gives us freedom—contrary to intellectual-property instruments that want to control how creative works are used.
Freedom is power. In this case, the power of open-source software comes not just from being able to read the source code, but from being able to contribute to and build from it. 
For this power to be realized, it's not sufficient to make the source public to read. We must attach a license that allows others to modify and distribute the code.

[slide 9]
So let’s be clear about this: 
“Open data and content can be freely used, modified, and shared by anyone for any purpose.” (The Open Definition)

[slide 12]
Stanford professor David Donoho and co-workers appear to be the first ones to publicly state that reproducibility depends on open code and data.
They define reproducible computational research as that "in which all details of computations—code and data—are made conveniently available to others." 
They took inspiration from geophysics professor Jon Claerbout, who said that in computational science "the actual scholarship is the complete software development environment and the complete set of instructions which generated the figures."

Donoho, D.L.; Maleki, A.; Rahman, I.U.; Shahram, M.; Stodden, V. (2009) Reproducible Research in Computational Harmonic Analysis, Computing in Science & Engineering 11(1): 8–18

[slide 13]
Everyone developing software in an academic setting should have working knowledge of software licenses. 

[slide 14]
Screenshot of
Morin, A., J. Urban and P. Sliz (2012) A Quick Guide to Software Licensing for the Scientist-Programmer, PLoS Comput. Biol. 8(7): e1002598 

[slide 15]
The first thing to understand is that simply making the source code public does not make your project open source. 
Software is a creative work, and copyright is automatically attached to it. 
Without a license, your software is in a legal limbo where readers don't know how they can use it, if at all. A well-informed reader will opt not to use your software, as the only safe behavior.
The license is a contract between the authors of software and the users. It gives software authors the power to share with users, and to collaborate with other developers.

[slide 16]
Always add a license to software you plan to make public.
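
As a small aid for following this advice, here is a minimal sketch of a check you could run before publishing a repository. It only looks for a license-like file at the top level of a project directory. The function name `find_license_file` and the list of file names are assumptions chosen for illustration; they are not a standard, and the check says nothing about whether the license text itself is appropriate.

```python
from pathlib import Path

# Common file names for a license; an illustrative list, not a standard.
LICENSE_NAMES = ("LICENSE", "LICENSE.txt", "LICENSE.md", "COPYING", "COPYING.txt")


def find_license_file(project_dir):
    """Return the first license-like file at the top of project_dir, or None."""
    root = Path(project_dir)
    for name in LICENSE_NAMES:
        candidate = root / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_license_file(".")
    if found is None:
        print("No license file found: add one before making the code public.")
    else:
        print(f"Found license file: {found.name}")
```

Running the script from the root of a project prints a reminder when no license file is present.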

[slide 17]
Free and open-source software is under a license that grants the users freedom: to access, use or modify the software for their purposes. 
The most important distinction among the various FOSS licenses is whether they are permissive or copyleft. These terms are often confused.

[slide 18]
A permissive license gives more freedoms: the only restriction on use may be that the original authors receive credit in any distribution of the software or any derivative works. Even commercial use, or incorporating the software into other proprietary (closed) works, is allowed. Academic software benefits most from permissive licensing. In fact, well-known permissive licenses originated at academic institutions, including the Berkeley Software Distribution (BSD) License and the MIT License. The Apache License is another widely used permissive license.

[slide 19]

[slide 20]
A copyleft license restricts the use of the software by requiring that any derivative works also be released under the license of the original. Another word for this model is "share-alike." Many developers want to ensure open access to their work and all its derivatives in perpetuity. This may be considered virtuous in some circles, but we should recognize that it is achieved by placing restrictions on the use of the software. The typical copyleft license is the GNU General Public License (GPL).

[slide 21]

[slide 22]
License compatibility
Compatible licenses allow source code from different works to be combined to make new software. Not all licenses are compatible!
Because incompatible terms can arise from subtle wording, the Open Source Initiative (OSI) strongly recommends using an existing OSI-approved license, instead of attempting to craft a custom license. 

[slide 23]
Note that compatibility is directional: it depends on whether a piece of code is being built into another work, or built from one. (See the illustration below from Morin et al.)
Staff at university or laboratory technology offices may be ignorant of these issues, making it even more important for researchers to be well informed.
Figure 2 of Morin et al. (2012). Illustrates compatibility of licenses: permissive licenses (BSD, MIT) are forward-compatible with any other license, whereas copyleft licenses are only forward-compatible with themselves.
Directional compatibility of licenses is the reason why you should always be aware of the licensing terms of any code that you are reading and hoping to build upon.
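
To make the directionality concrete, here is a minimal sketch of the rule summarized in the figure caption above: permissive code can flow into a work under any license, whereas copyleft code can only flow into a work under the same license. The function `can_incorporate`, the `FAMILY` table, and the license identifiers are assumptions made for illustration. This is a deliberately simplified model, not legal advice; real license compatibility has many more cases.

```python
# Simplified model of the two license families discussed in the lecture.
# The family assignments below are illustrative, not legal advice.
FAMILY = {
    "MIT": "permissive",
    "BSD-3-Clause": "permissive",
    "GPL-3.0": "copyleft",
}


def can_incorporate(source_license, target_license):
    """Can code under source_license be built into a work under target_license?"""
    if FAMILY[source_license] == "permissive":
        # Permissive code may flow into a work under any other license.
        return True
    # Copyleft code may only flow into a work under the same license.
    return source_license == target_license


if __name__ == "__main__":
    print(can_incorporate("MIT", "GPL-3.0"))   # MIT code into a GPL work
    print(can_incorporate("GPL-3.0", "MIT"))   # GPL code into an MIT work
```

The two calls at the bottom ask the same question in opposite directions and get different answers, which is the point: the pair of licenses alone does not settle compatibility, the direction of reuse does.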

[slide 24]
For academic and research software, you are likely to want a simple license that is as permissive as possible. For example, you can use the MIT License. Most of the research code in our group has been released under MIT.

[slide 25]
Illustrating how subtle language variations matter, we recently decided to change our pick to the BSD 3-clause License. It is very similar to the MIT License, but there is a small difference in the wording about attribution …

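The MIT License requires that the copyright and permission notice be included in "all copies or substantial portions of the Software." This language implies that a user could copy some of an MIT-licensed code (as long as the portion is deemed "not substantial") without attribution. In academia, we always prefer full attribution of any portions of copied works, and BSD 3-clause is more precise in this.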

[slide 26]
Bonus advice:
Write into your grant proposals that your research software will be released under an OSI-approved license. If you're lucky enough to have your grant funded, the condition of open-source code release becomes part of the contract with the university.

This can save you some grief when trying to explain to staff at the tech office why your software needs to be open source. They often just don't understand the field, and want to default to a proprietary license, imagining some commercial value may exist in your research code.