If you license it, it’ll be harder to steal it. Why we should license our work
Posted by Robert Haase, on 6 May 2023
TL;DR: When we publish work such as manuscripts, code and data, the platforms we publish on typically ask to select license. If we find a nice figure on the internet, are we allowed to reuse it? If there is no copyright statement, am I stealing? If code is available open-source, can we incorporate it into our code, which might become a commercial product at some point? I’m writing this blog post to give an overview about different licenses and concepts, and I will explain why we should choose the license wisely.
What is “Copyright”?
When we talk about copyright we commonly refer to giving others the right to copy our work under given conditions. These conditions are often refered to as license. If a published work has no license and no copyright statement, legally nobody is allowed to copy, reuse and republish the work. The most simple form of copyright statements are “(C) Robert Haase” or “(C) Journal of XYZ”. These statements are basically made to tell the reader: If you want to reuse this work, you need to contact Robert or the Journal of XYZ who have the exclusive copyright. Otherwise, you are not allowed to make a copy. If a license and/or copyright statement is put on published work, people can reuse and publish derivatives under the conditions mentioned in the copyright statement – without need to ask the original authors. That’s why we add copyright statements to our work. We publicly say: Please reuse my work (!) under these conditions…
It is good practice to check for the copyright of materials before copying them. If you copy a figure from the internet into your electronic lab notebook and years later you want to use it in a talk, you do not want to figure out if you had the right to copy years ago. I recommend only copying materials where the copyright is clear. Also keep a link to the original work so that you can later cite the source properly. I also regularly open github-issues asking people to put licenses on their code. I hope nobody feels offended by this. It’s just me who would like to reuse cool stuff under clear conditions.
Restrictive versus permissive licensing and implications for exploitation
Licenses and copyright statements can be permissive, when they allow you to do things with the licensed work, and restrictive, when they limit what you can do with the licensed work. Commonly, employers and funding agencies have more or less explicit rules in place telling their employees and grantees how they should license their work so that exploitation of the work can be ensured the right way. The two extremes are characterized like this: If you have restrictive license, for example the GNU General Public License (GPL) which is often applied to open source software, you are limited in this sense: If you use GPL-licensed open-source code in your work, you must publish your code under a GPL licenses as well. You are not allowed to make a commercial product out of it and without publishing the code. This is why companies typically avoid working with GPL-licensed code. Also some universities discourage their scientists from using GPL licenses because if the authors of the code later decided to found a company based on successful scientific open source software, they cannot exploit the software due to the limitations introduced by the GPL license. Other academic institutions encourage their employees to use the GPL-license because it ensures that publicly funded open source code development becomes available open access to the public community as enforced by the GPL license.
Other open source code licenses are more permissive, such as BSD, MIT and Apache licenses. They vary in details but basically all say, if you want to reuse this work, go ahead. If you redistribute it, mention the original authors and the original license agreement. These licenses enable software developers, also in a commercial setting, to produce software based on open source software, that is not open source and can be exploited commercially.
Platforms such as github.com, f1000 and Zenodo show the license of an open-source code repository typically on the right. Again, if no license is specified, you are technically not allowed to make copies and redistribute them.
Creative Common Licenses
Some copyright statements link to a license, such as “(C) Robert Haase, this work is shared under a CC-BY 4.0 license.” The Creative Common (CC) licenses typically come with a link to the website where the license is explained in detail in simple language. CC licenses are recommended for materials such as manuscripts, data, figures and presentations. It is not recommended to apply them to source code as the other licensed mentioned above are state-of-the-art.
There are different levels of CC licenses, from very open and permissive to restrictive licenses.
- Public domain (CC0): If something is published in the public domain, everyone can do anything with it. Public domain licenses cannot be revoked. You cannot share something under a public domain license today, others reuse it tomorrow and next week you change the license and sue them.
- Attribution (CC-BY): CC-BY is the basis for a couple of licenses. All CC-BY-based licenses have in common that if you want to copy or redistribute materials, you must mention the authors, indicate if you made changes to the reused materials and provide a link to the license. Do I really have to do all these three? Yes, that’s what the license agreement explicitly says. Example:
- Share-Alike (CC-BY-SA): The Share-Alike license limits redistribution of work enforcing that derivatives are also available under the same license. It is a form of restrictive licensing similar to GPL.
- Non-Commercial (CC-BY-NC): The name says it: If you want to sell a book or prepare materials for paid courses, you are not allowed to reuse materials that come with a non-commercial license.
- No Derivatives (CC-BY-ND): That’s a tricky one: You are not allowed to derive any other work from materials shared under a no-derivatives license. That means you can download and redistribute a PDF that’s shared under CC-BY-ND, e.g. to students in a course. But you are not allowed to take a figure out of that PDF and use it in your slides. Also this is a form of restrictive licensing.
CC licenses can also be combined, e.g. to CC-BY-NC-ND meaning that neither commercial usage nor derivatives are allowed.
Source: https://creativecommons.org/licenses/by/4.0/
The Unlicense
On the one end of the scale permissive-versus-restrictive licensing sit licenses that force people to make things openly available under specific licenses. On the other end there sits The Unlicense, a statement that makes your work available public domain and also explains why: Because nobody, in particular no lawyers, should ever earn money with arguing about reusing and/or licensing of open-source software.
Disclaimers
Another form of statement that is commonly added to open source software are disclaimers. Disclaimers ensure that authors are not responsible for harm potentially caused by their work. This is the disclaimer that commonly comes with BSD licenses:
THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> AS IS AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL <COPYRIGHT HOLDER> BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
In my opinion, we should always add such a statement to all of our code because we are unable to predict what other might do with our software. And we should certainly not be responsible for what others do with our code.
Choosing the right license
Most importantly, you should make the choice about what license to use as early as possible. If you decide it on your own at the beginning of your project, only people will join the project who agree on the license. If 20 people are involved in your project and you have to make the decision together, things might become complicated.
When choosing the license for your work, think about these aspects:
- Do you want to make your work reusable by as many people as possible? Use permissive licenses such as BSD, MIT, Apache or CC-BY.
- Do you want to enable others to tweet about your work and show your figures in their slides? Do not use CC-BY-ND.
- If you write plugins for a software platform where one specific license is very common, consider using the same license. If different plugins are license-compatible, it makes reusing work simple. If multiple different licenses contradict each other in how stuff can be reused, it might not be reused.
- Check what rules your employer has in place. If you work at an academic institution, ask for their Open Science Policy.
- Does your work contain other work that it licensed GPL or CC-BY-SA? Then you are forced to publish under the same license.
There are also online platforms such as choosealicense.com giving guidance for which license to use in different scenarios.
Read more
- Licensing a repository, Github docs
- Sharing research data with Zenodo
- choosealicense.com
- Slide deck about Licensing and Sharing, licensed CC-B
Reusing this material
This blog post is open-access. Figures and text can be rused under the terms of the CC BY 4.0 license unless mentioned otherwise.