This repository has been archived by the owner on Aug 11, 2022. It is now read-only.

SPDX and Non SPDX License Support #10479

Closed
scriptjs opened this issue Nov 19, 2015 · 22 comments

Comments

@scriptjs

I am opening this issue because #8918 was closed while discussion of an appropriate solution was still ongoing. That discussion has run for months in #8918 as well as #8291, #8557, #8773, and #8795, so it has clearly touched a nerve. I urge NPM to listen and collaborate on an appropriately considered solution that will work for everyone.

The latest recommended solution is shown below. It has the following benefits:

  • backwards compatible
  • may be validated with SPDX
  • is open and inclusive
  • may be validated against other possible license databases/registries in future, i.e. XYZ('Apple Software License')
  • may use "OR" on non SPDX licenses if "license" property cannot be a list
  • may emit a warning if it detects SPDX licenses that ought to be enclosed in SPDX()

Valid SPDX licenses

"license": "SPDX(MIT)"
"license": "SPDX(ISC OR GPL-3.0)"

Non SPDX licenses

"license": "Oculus VR Inc. Software Development Kit License"
"license": "Artistic 2.0 OR StrongLoop Subscription Agreement"
"license": "WTFPL"

May Emit Warning
Backwards compatible but a SPDX License.

"license": "MIT"
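
A minimal sketch of how a checker might treat the three cases above, assuming the proposed SPDX(...) wrapper were adopted. The names classifyLicense and KNOWN_SPDX_IDS are hypothetical, the id set is a tiny illustrative subset, and splitting on " OR " stands in for a real SPDX expression parser.

```javascript
// Hypothetical subset of SPDX identifiers, for illustration only.
const KNOWN_SPDX_IDS = new Set(['MIT', 'ISC', 'GPL-3.0', 'Apache-2.0']);

// Returns true when every operand of a simple "A OR B" expression is a known id.
function allKnownSpdxIds(expression) {
  return expression
    .split(' OR ')
    .every((id) => KNOWN_SPDX_IDS.has(id.trim()));
}

// Classifies a "license" value under the proposed scheme (not current npm behavior).
function classifyLicense(value) {
  const wrapped = /^SPDX\((.+)\)$/.exec(value);
  if (wrapped) {
    return allKnownSpdxIds(wrapped[1]) ? 'valid-spdx' : 'invalid-spdx';
  }
  // A bare value made up only of SPDX ids keeps working, but could trigger a warning.
  if (allKnownSpdxIds(value)) {
    return 'warn-unwrapped-spdx';
  }
  // Anything else is accepted as a free-form, non SPDX license name.
  return 'non-spdx';
}

console.log(classifyLicense('SPDX(ISC OR GPL-3.0)'));
console.log(classifyLicense('MIT'));
console.log(classifyLicense('Artistic 2.0 OR StrongLoop Subscription Agreement'));
```

Under these assumptions, only values that explicitly opt in with SPDX(...) can fail validation; everything else is either accepted as free-form text or flagged with a warning.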

My recommendation is to inform NPM users of the change of the license property and give module developers some time before driving everyone crazy with SPDX warnings as has been done when you imposed it. Perhaps blog about the change first to allow voluntary revisions until a certain date where warnings could be emitted. One way or the other, I urge you to engage users before disturbing software and build systems with noise.

I have not heard anyone come out against SPDX, only against the way you have chosen to implement it, which is not backwards compatible with about 5 years of data, excludes non SPDX licenses from package metadata, and creates a non standard SPDX description of "SEE LICENSE IN" that makes the language of the metadata awkward, i.e.

"license": "SEE LICENSE IN LICENSE"

Metadata is a source of truth and these types of phrases are meaningless and only require more investigation into a repo or package.

@othiym23
Contributor

Thanks for taking the time to put this together! The team and community have spent a significant amount of time considering the various trade-offs involved in how the license field is validated and checked, and we've already made one pretty disruptive migration to the current format.

I believe the current format is the best, or least worst, tradeoff between encouraging people to use a standard license declaration while allowing them to do something more complex or customized when necessary. As such, I'm not going to reopen this discussion, and with the exception of landing changes that make it easier to combine SPDX license identifiers with SEE LICENSE IN <file> or bug fixes, will not be landing changes to how the license field is validated.

@GerHobbelt

we've already made one pretty disruptive migration to the current format.

Pity that one such move is stopping you from another - which can be nicely sold as "an improvement on the first, moving forward using new insights obtained through additional user feedback" as that first change is still very recent and both address the same topic.

Anyhow, this is your call, your decision, not ours to make, and you have made up your mind clearly.
I respect that.

Thank you for all the hard work, both in the past and in the future, on npm, while I agree to disagree on this particular issue. ;-)
Also thanks for your efforts in the exchanges regarding this issue.

Now, life goes on.

Salute,
Ger Hobbelt

@scriptjs
Author

@othiym23 @GerHobbelt Honestly the level of collaboration with the community on this issue was not great. Anyone reviewing the dialog will see that NPM's response to the issue (speaking of the people from NPM that participated) was greeted with "what is wrong with what we have". This flavour of dialog is not collaboration, particularly when the conversation is shut down. It is not responsible.

Atm you have a captive audience and seem to be emboldened by this fact which is frustrating. Perhaps changes in the manifest should have been first vetted by the Technical Steering Committee before forcing such changes upon our software that diminish the utility of the metadata without adequately considering consequences or alternatives. I will file this as an issue for the committee.

Evolving to something that ensures that license metadata is available to developers and search tools from the manifest is in everyone's interest (regardless of the license applied to the software).

@othiym23
Contributor

Perhaps changes in the manifest should have been first vetted by the Technical Steering Committee before forcing such changes upon our software that diminish the utility of the metadata without adequately considering consequences or alternative. I will file this as an issue for the committee.

npm doesn't have a Technical Steering Committee, and isn't a part of the Node.js project (aside from being distributed in its installers), nor do we presently participate in Node's TSC. When it comes to npm CLI product decisions, the buck does stop with me, for now, but the team tends to make product decisions by consensus, and we both value and accommodate community input wherever possible.

In this case, in fact, the move to SPDX license IDs was suggested (and the code written) by a member of the community, and both SEE LICENSE IN and UNLICENSED were added in response to user feedback. It was a pretty major bit of cheese-moving for some of our users (in particular, I greatly appreciate the feedback and forbearance of @sam-github at StrongLoop, even though I know he shares some of your concerns about where things ended up), and a big part of my reluctance to make further changes here is because the amount of churn in that piece of npm has already been disruptive.

Evolving to something that ensures that license metadata is available to developers and search tools from the manifest is in everyone's interest (regardless of the license applied to the software).

We're in complete agreement here. The CLI has gotten a bit ahead of the web site here (the web team has been busy with other things), but one of these days this patch will land, and there have already been discussions about how to surface the license text on the web site as well.

@yetzt

yetzt commented Nov 21, 2015

i'm really sad about this breaking change that disables my choice, breaks my code from several years, discriminates against licenses not favoured by spdx and is not open for alternative license repositories. please give me at least the option to disable spdx-checking in .npmrc, so i will not be nagged while i'm on the search of an open and inclusive package manager.

@othiym23
Contributor

@yetzt

i'm really sad about this breaking change that disables my choice, breaks my code from several years,

How does it break your code?

discriminates against licenses not favoured by spdx and is not open for alternative license repositories.

Why isn't it acceptable to have SEE LICENSE IN <file> and include a copy of the license in your packages? It can lead to some redundancy across packages, but many distributors (like Fedora) already require a license file before packaging software for redistribution.

I can't speak entirely to the motivations of @kemitchell (hi!) in choosing SPDX over other license vocabularies, but it is an enterprise, industry standard with a decent amount of support behind it. I'm not bound to it specifically, nor do I think npm is. It might be possible to create a superset of acceptable licenses for use as IDs, as long as they can be mapped to URLs that link to some kind of normative resource for the license.

What I can say is that I'm unwilling to inflict more work on the package maintainers who've already moved to the new syntax, and feel pretty strongly that any solution that results in breaking changes for packages that are currently passing validation is an unacceptable level of churn, and that there isn't enough broken in the current validation to warrant the scope of changes proposed both on this issue and in #8918.

I do believe that the current solution is a significant improvement over the freeform situation that was in place before; I've had to deal with the technical side of license conformance and validation before, and want to ensure that the time spent by software engineers writing license checkers and validators is as small as possible. The current behavior strikes a balance between encouraging good behavior and preventing people from getting stuff done – it has pretty good coverage of licenses (based on a standard), and at worst it prints a warning, which can be noisy, but isn't an error.

@scriptjs
Author

I have posted the issue for the Node Technical Committee to examine. The way this has been handled feels unilateral. Truly it should not be this way. This affects every node developer since we have no choice but to prepare manifests and use NPM that is bundled with node. The current solution degrades search and discovery and adds more work for someone to investigate a non-SPDX license.

I see that StrongLoop's solution was not to include the second license (their proprietary license) in their dual license packages – to exclude it from their manifests. As you can see, even from this limited example, this is not working. It is not helpful to defend this when it is possible to move to a better approach. Alternatives are available to ensure the inclusion of license types in an agnostic way while still respecting your desire to validate against SPDX.

We are both well intentioned with respect to the issue but I feel strongly that this has not been adequately resolved. I have referenced this for the Node Technical Committee here:

nodejs/node#3949

@scriptjs
Author

@othiym23 The solution above affects only those who have had to move to SEE LICENSE IN, and lets them revert to including the license of choice. Everything else is backwards compatible. I am certain there will be more satisfaction with a balanced solution. You could best determine whether or when you wish to enforce SPDX() for validation with a warning. I am recommending you make developers aware over some time before introducing noise with validation warnings.

I will add that changes to NPM's website will not satisfy as a solution. The package.json goes beyond NPM and includes discovery in other tooling including private package repositories, search tools and sites. The package manifest is much more than it was five years ago and used for much more than package fetching itself.

@kemitchell
Contributor

Hi @othiym23. Special greetings also to @scriptjs, @GerHobbelt, and @yetzt. Nice to meet more folks who care about this!

There's been a lot written here, which has been the trend for issues on this topic, since the very beginning. I will also follow nodejs/node#3949.

As far as I know, I was the first to PR validation of SPDX license IDs, in #8179 and related PRs linked there. I've since become affiliated with npm, but wasn't at the time. What I do with and for CLI remains wholly on my time and dime. I suppose I have a slight insider advantage, in that I can corner @othiym23 when I visit npm's offices, and he's too polite to kick me out or flee.

In return, he @-mentions me on PRs he knows I can't leave alone 😉. Indeed, I'm happy to answer any questions about my motivations or thought process behind adding validation, or to point out past conversations where it's come up. Long story short, it was about making license audit easier for my clients, who are both companies and freelancers. The last straw was an audit of existing license values with dat-npm, which revealed just how unusable free-form license data on the registry was.

I can't take credit for spec'ing SPDX in metadata, though. Nor can anyone at npm. The SPDX language in the package.json docs came from similar guidelines for RubyGems. RG recently accepted a PR of mine to add validation, too. Other language repo docs mention it; I'm not sure whether they validate.

As for the PR itself:

A lot of thought here. Much respect! It's not an easy problem. But before diving into all the careful details, I'd step back, because I think the details obscure the fundamental question being asked: Should npm default to expecting a machine-readable license value or free-form text?

Before, license was free-form. Currently, npm expects it to be machine-readable, and warns when it's not. This PR would make license free-form by default, with edge cases for opting into machine-readable validation. Implementing any approach that supports validation means implementation warts. We've got some now.

I very strongly support license value validation by default, because I think the value of giving the majority who want to use a common standard license the power to do so unambiguously, with typo checks, lifts every boat in the community. I very strongly oppose causing any more disturbance than a warning when a license value fails, as is the case with name or version. I also strongly support putting any necessary free-form text on license terms in a file, rather than packing into package.json. This is because, as a lawyer, a full document's worth of license text is what I need to help my clients make decisions. That can come from a standard repository of license files---SPDX has one, mirrored on GitHub---or from the owner.

@scriptjs
Author

@kemitchell Hello. I agree with validation by default, but this does not need to be a situation where we have validation via SPDX to the exclusion of a license type that is not in SPDX.

In fact you can have both and backwards compatibility with what has been proposed. Further, other sources of license validation may also be included if desired.

The objective of license validation is a good one but it should not be done to the exclusion or preference for any type of license. This is metadata.

That said, this is an important issue, since it impacts much more than npm: it affects the way we all describe licenses in our manifests and an increasing amount of tooling for discovery and search.

@kemitchell
Contributor

First and foremost: @scriptjs, I hope you know, more than anything, that I respect your thought on this, and am just really excited to run into someone else who cares about licensing!

I worry about opening up to a variety of validation schemes, because from an audit point of view, a diversity of ways to express the same standard license makes using the metadata without human intervention impractical. This is especially true since MIT, the BSD licenses (often ambiguously specified), ISC (thanks, npm init), and Apache 2 are by far the runaway leaders in the public registry. The ideal way to avoid that trap is a standard, any standard, for mapping strings to specific license text.

As string-to-license maps go, SPDX is by far the most inclusive. It includes many near-variants and even vanity licenses like JSON. I think it's vitally important that it has WTFPL and Unlicense, too. As of the most recent release, with "license expressions", it's also the only standard I know of with dual-license support. It's the only game in town that actually refers to itself as a "standard". The backing of the Linux Foundation carries weight.

The current npm approach is:

  1. SPDX expressions, but none with LicenseRef, which is part of the broader SPDX spec for RDF-based metadata files.
  2. Magic values that point to a file distributed with the package.
  3. Magic values (like UNLICENSED) that don't have any licensing meaning, and just hush the warning.

Going back to my generalization of the problem, this combination has a few advantages:

  1. Validation by default.
  2. As simple as possible for unambiguously choosing the predominant open-source license terms, including multi-licensing on standard forms.
  3. Free-form text is still possible (in a referenced LICENSE file).
  4. Authors can opt out of validation with UNLICENSED.
  5. Automated tools can review packages and determine that each falls into one of the following categories:
    1. Doesn't validate. Beware.
    2. Validates, but doesn't provide any license terms (UNLICENSED). Run for the hills.
    3. Has unambiguous license text you can send to your lawyer or review yourself, whether from the SPDX library of standard texts or a file included in the package, and confirmation that those terms are in fact a well known standard form, if that's the case.
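
The three audit categories above could be sketched roughly as follows. This is only an illustration of the triage, not npm's validator: triageLicense and SPDX_IDS_SAMPLE are hypothetical names, the id list is a tiny sample, and the expression check covers only simple OR/AND combinations with optional outer parentheses.

```javascript
// Tiny illustrative sample; a real tool would load the full SPDX license list.
const SPDX_IDS_SAMPLE = new Set([
  'MIT', 'ISC', 'Apache-2.0', 'BSD-2-Clause', 'WTFPL', 'Unlicense',
]);

// Very rough check for expressions such as "MIT" or "(MIT OR Apache-2.0)".
function isSimpleSpdxExpression(value) {
  const inner = value.replace(/^\((.*)\)$/, '$1');
  const ids = inner.split(/ (?:OR|AND) /);
  return ids.every((id) => SPDX_IDS_SAMPLE.has(id));
}

// Sorts a package.json "license" value into the audit categories listed above.
function triageLicense(value) {
  if (typeof value !== 'string' || value.length === 0) {
    return { category: 'does-not-validate' };
  }
  if (value === 'UNLICENSED') {
    // Validates, but grants no license terms.
    return { category: 'no-license-terms' };
  }
  const fileRef = /^SEE LICENSE IN (.+)$/.exec(value);
  if (fileRef) {
    // License text comes from a file shipped with the package.
    return { category: 'has-license-text', source: 'file', file: fileRef[1] };
  }
  if (isSimpleSpdxExpression(value)) {
    // License text comes from the standard SPDX texts.
    return { category: 'has-license-text', source: 'spdx', expression: value };
  }
  return { category: 'does-not-validate' };
}

console.log(triageLicense('(MIT OR Apache-2.0)'));
console.log(triageLicense('SEE LICENSE IN LICENSE.md'));
console.log(triageLicense('UNLICENSED'));
console.log(triageLicense('Public Domain'));
```

Note that UNLICENSED (the magic value) and Unlicense (the SPDX identifier) land in different categories, which is exactly the distinction an audit tool needs to keep.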

There are certain edge cases that no automated system is going to avoid completely, like packages that include conflicting license terms or metadata. Detecting and handling those in software is an npm-scale project unto itself.

@scriptjs
Author

@kemitchell I hope you realize I already agree with more than 90% of what you have written. The notion of validation against other schemes is only a selling point of the suggested scheme since it is open to this possibility in the future should something else come up.

I realize SPDX is a decent set of open licenses. That said, it is a subset. Being a subset of open licences excludes every proprietary license and open license that has not been submitted to the license review committee of SPDX.

This set me off at first because warnings in NPM on build tools and fetching must be handled. No one wants this occurring in their code. I appreciate the effort you have made in a SPDX validator but this does not negate the fact that any other form of licensing metadata is now excluded from the license property. They are now exchanged for statements that do not make the metadata as useful. It would have been helpful to have consulted at this stage but I feel this was pushed out without considering the impact fully.

This is occurring at a time where the package.json is only growing in importance for other tools for configuration and private package management. Thus the impact is more far reaching than NPM.

Anyway I have been in general disagreement with an approach that cannot include license types for all software. Metadata is for discovery by humans and machines. This comes unglued with references and not data. There will be nothing to handle the references to infer any license type when it is not a SPDX licence and that is a sad day for anyone outside the Linux Foundation.

@kemitchell
Contributor

@scriptjs, yes, I think we understand each other. So glad to know that.

A few last thoughts:

If a competitor standard for standard open-source license metadata cropped up---I'd say it's unlikely, but possible---I might support switching to it, but I'd never support running two "standards" for exactly the same semantics in parallel. One or the other should win out, to make it easy for programs consuming the JSON.

SPDX covers the vast majority of open-source licenses folks want to use, based on current behavior in the public registry. The number of packages that don't use license terms with assigned SPDX identifiers is very small. Even just "MIT", the BSDs, Apache, and ISC, all on OSI's much more limited list, cover a huge percentage.

SPDX as a whole does not exclude any licenses. It doesn't assign identifiers for every license, proprietary or arguably open-source. (I churn out several new custom software licenses per month.) But arbitrary license text is supported. This is so because SPDX is in fact a broader standard for freestanding XML metadata files, distinct from package manager metadata, that may reference non-standard, arbitrary license term content with special identifiers (LicenseRefs) in potentially complex license expressions.

npm uses only the subset of SPDX license expressions without LicenseRef. In lieu of LicenseRef, which doesn't make sense in package.json, npm substitutes the SEE LICENSE IN ... magic values for the same purpose. The only difference in expressiveness is that SPDX license expressions can do things like (MIT OR LicenseRef-SomeCustomLicense) with a definition of LicenseRef-SomeCustomLicense somewhere else, while npm asks for SEE LICENSE IN LICENSE.md with both MIT and the custom license terms in LICENSE.md. (This, sadly, is what bit StrongLoop.)
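
To illustrate that difference in expressiveness, here is a rough sketch that flags expressions relying on LicenseRef, which the subset described above leaves out. findLicenseRefs is a hypothetical helper, and its pattern only approximates the identifier syntax.

```javascript
// Finds LicenseRef-style identifiers in an SPDX license expression.
function findLicenseRefs(expression) {
  return expression.match(/\bLicenseRef-[A-Za-z0-9.-]+/g) || [];
}

const expression = '(MIT OR LicenseRef-SomeCustomLicense)';
const refs = findLicenseRefs(expression);
if (refs.length > 0) {
  // Under the convention described above, the custom terms would instead live
  // in a file shipped with the package, referenced as "SEE LICENSE IN <file>".
  console.log(`Not expressible in the npm subset: ${refs.join(', ')}`);
}
```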

When I'm reviewing license terms, even npm's approach is too much for my taste. If a package isn't clearly marked with a single, OSI-approved, SPDX-identified form license, I want to inspect the whole package.

This really goes to my measure of usefulness: JSON is for programs, the rest is for people. Whatever software can't do reliably on the basis of structured data, people (ehem, lawyers) will do. The metadata and even the LICENSE file, if there is one, become just two of a hundred factors indicating how risky it is to use the code.

I'm sorry to hear the error messages may have caused you pain. I'd be lying if I let on I didn't see some pain like that coming. In my defense, LTS establishes for the first time a real, accountable right to expect stability from Node and the npm that comes with it.

More importantly, though, we're about to hit a quarter-million packages in the npm public registry. That registry serves a package manager whose clear priorities are making it fast, space-efficient, and easy to construct massively nested deps trees. In my mind, it wasn't just "Could npm be the first open-source community to make clear licensing the norm?". It was also "If the npm community doesn't start taking machine-readable licensing seriously now, will using npm in a natural way ever be safe when licensing is a concern?".

@scriptjs
Author

@kemitchell, @othiym23 I wish JSON were for machines but sadly most programmers do their share of reading and writing it manually, particularly in initializing app and module development. Again, I am sold on all the reasons for license validation and SPDX as an appropriate choice for validation. You also understand where I am coming from: non SPDX license types should somehow be included in the license property of the manifest, otherwise we all lose human and machine discovery. References alone obfuscate the issue if the license is not a SPDX license. We don't need to degrade the capability or quality of the manifest to achieve our goals.

It is rather sad that an open manifest standard has never come about. You can imagine we would not be debating anything if this were so. As a programmer, the number of manifests and pieces of configuration can make one sick these days and more noise only makes things worse.

I seem to have determined a solution that will work in the interim. It will validate without throwing warnings and allows a non-SPDX license within the metadata without conflicting with the SPDX validation scheme. If we can agree on this solution and commit to a decent discussion about how this might evolve for future, I would be satisfied.

The solution is as follows and currently requires no changes to NPM or to the validation scheme.

"license": "SEE LICENSE IN LICENSE.md (Your non-SPDX license here)"

Examples

"license": "SEE LICENSE IN LICENSE.md (Apple Software License)"
"license": "SEE LICENSE IN LICENSE.txt (WTFPL)"
"license": "SEE LICENSE IN LICENSE (Oculus VR Inc. Software Development Kit License)"

This alleviates the primary concern that non SPDX licenses cannot be included in the package.json metadata for developers and machines to discover. It will still validate free form and what is within brackets could be harvested by tools. Can we agree on this as a recommended solution? i.e. You may optionally include the non-SPDX license(s) following the reference.
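
A minimal sketch of how a tool might harvest the bracketed label from such a value. parseSeeLicenseIn is a hypothetical helper, and it assumes the referenced file name contains no whitespace.

```javascript
// Splits a value like "SEE LICENSE IN LICENSE.md (Apple Software License)"
// into the referenced file and the optional human-readable label.
// Assumption: the file name contains no whitespace.
function parseSeeLicenseIn(value) {
  const match = /^SEE LICENSE IN (\S+)(?: \((.+)\))?$/.exec(value);
  if (!match) {
    return null; // not a "SEE LICENSE IN" value at all
  }
  return { file: match[1], label: match[2] || null };
}

console.log(parseSeeLicenseIn('SEE LICENSE IN LICENSE.md (Apple Software License)'));
console.log(parseSeeLicenseIn('SEE LICENSE IN LICENSE.txt'));
```

A value without a bracketed label still parses, with label set to null, so existing SEE LICENSE IN values would not need to change.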

@yetzt

yetzt commented Nov 21, 2015

@othiym23 i'm putting most of my code in the public domain. therefore Public Domain is my default in my .npmrc. but Public Domain is invalid. this is (especially in my jurisdiction of Germany) not the same as Unlicensed (because everything unlicensed is copyrighted by default and therefore no permission whatsoever is granted). there is no Public Domain in SPDX, PDDL is for open data (which would make no sense at all for software packages, however not all open definition approved licenses are included in SPDX), and SAX-PD is not Public Domain and valid for SAX only (quote "SAX is free!"). since the public domain is not a license, SEE LICENSE IN is not appropriate. i just want everyone to be able to use my code in every way they want. this policy impedes my choice to do so.

i can appreciate the need for machine-readable metadata, working mostly on open data projects myself. but making changes that make the existing metadata of five years of published code invalid and stipulating one closed and restricted repository of licenses (SPDX) as the only, once and forever choice for licensing your code (with some awkward escape hatch as an afterthought) contradicts and defies all the openness, egality, inclusiveness and beauty of node and npm i once fell in love with.

i suggested to create an (npm-style) open repository of all licenses with no discrimination (which could possibly include a graph of license compatibility for more automation, yay!).
i suggested to not make SPDX the only thing ever in place with SPDX(spdx-expression), so someone could create SPDZ, write their own SPDZ(spdz-expression) validator, make a pull request and go on a hugging spree.
and since all those clever ideas were dismissed, i just begged for the ability to easily turn all this off in my own environment. (for now i just use npm init -y and edit all package files manually while feeling sad).
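
The SPDX(spdx-expression) / SPDZ(spdz-expression) idea amounts to dispatching on a prefix. A minimal sketch, with everything in it hypothetical: the validators registry, the validateTagged function, and the toy checks registered for each scheme.

```javascript
// Registry of validators keyed by scheme name; all entries are illustrative.
const validators = new Map([
  // Toy SPDX check against a hypothetical three-entry list.
  ['SPDX', (expression) => ['MIT', 'ISC', 'CC0-1.0'].includes(expression)],
]);

// Validates values of the form "SCHEME(expression)" with the registered validator.
function validateTagged(value) {
  const match = /^([A-Z]+)\((.+)\)$/.exec(value);
  if (!match) {
    return { scheme: null, valid: null }; // free-form value, nothing to check
  }
  const [, scheme, expression] = match;
  const validator = validators.get(scheme);
  if (!validator) {
    return { scheme, valid: null }; // unknown scheme, cannot be checked
  }
  return { scheme, valid: validator(expression) };
}

// Someone else could register another scheme without touching the dispatcher.
validators.set('SPDZ', (expression) => expression.length > 0);

console.log(validateTagged('SPDX(MIT)'));
console.log(validateTagged('SPDZ(some-spdz-expression)'));
console.log(validateTagged('Public Domain'));
```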

please make openness and inclusion a priority instead of an afterthought.

@kemitchell
Contributor

We've gone back-and-forth pretty quick so far, but I want to take some time to think through your proposal.

A couple things I have in mind:

  1. With my coder of audit tools hat on: "SEE LICENSE IN LICENSE.txt (WTFPL)" is semantically equivalent to "WTFPL", but with added risk that LICENSE.txt doesn't contain standard WTFPL. "SEE LICENSE IN LICENSE.txt" where LICENSE.txt is exactly WTFPL has the same effect as "WTFPL", too, but can't be misleading, as can "SEE LICENSE IN LICENSE.txt (WTFPL)" where LICENSE.txt is actually something else. If there's one problem I'm hoping SPDX will help me avoid, it's having to inspect LICENSE file content and try to model whether it's a match for a standard form.
  2. With my lawyer hat on: I will ignore the label Apple Software License in "SEE LICENSE IN LICENSE.md (Apple Software License)", and any other such labels, entirely. Even if I recognize the label used as a form I'm familiar with, and even if it states a revision of that form I know, I still have to read any included terms. At best, having reviewed the terms, I can look back and see the label in package.json wasn't leading me astray. I'll never rely on it.

I'll let it roll around at least a few days before weighing in again.

@yetzt

yetzt commented Nov 21, 2015

@kemitchell with my "i'm trying to break everything" hat on, i will write a valid SEE LICENSE IN package.json and send any lawyer into an infinite loop. ;)

@kemitchell
Contributor

@yetzt: ROFL. Lawyers are inherently lazy. Like programmers 😉

[Edit: Miss the functional programming pun? 😄]

@kemitchell
Contributor

@yetzt: Much love to German hackers! Really sad I could not go to CCC this summer.

I can't give legal advice on licensing and how the public domain works over the Internet---especially under German law---but I assure you there are many great options for making your work as open as possible with npm right now. License metadata validation is in fact all about making sure folks can be sure they have the rights to use your work without thinking about it, using software. If you want to give others all the rights in the world to use your stuff, the idea is that npm should help you make that as clear as possible.

If you're interested in the public domain, please do take time to read up on how works actually get into the public domain. I strongly recommend you start at Wikipedia. Many great hackers misunderstand the public domain!

I'd also strongly recommend you skim some "licenses":

  1. CC0 (SPDX: CC0-1.0) is very long, but CC also provides explanation. I use it all the time for legal work, software, and artwork. All the modern Creative Commons licenses are designed for worldwide use.
  2. You may also like The Unlicense (Unlicense, not UNLICENSED). At least read the first sentence!

The SPDX legal working group is a little old-school, but definitely not closed. See the new SPDX license request process, inclusion principles, and wiki tracker. They track copies of all the licenses they accept in a Git repository, too. Members of the SPDX tech group recently worked together with me to make structured data for the license IDs available as JSON on their website, which is how both npm and RubyGems now find out about them. There are plans to make license texts available in a structured way, too.

Really: Who wants to tend a list of form contracts? The SPDX people have done it for years with little recognition.

I'd also like to point out, AFAIK, the only practical effect a bad license string is supposed to have on any npm package is a warning to stderr. npmjs.com is not showing some licenses correctly at the moment, but the PR to fix that is on its way.

@scriptjs
Author

@kemitchell There will always be a potential mismatch between metadata and actual files as long as there are human beings, regardless of SPDX. In practice a mismatch is unlikely and, if discovered, can be corrected by module authors. Obviously there is an expectation the license identified will be included in the package when it is identified in the package metadata.

What I prefer to see in place is the proposal that started this thread that would leave the license metadata fully intact for developers, leaving things backwards compatible rather than undermining this resource that we use in our work. Developers are the main users of NPM and I sincerely hope you appreciate this fully.

I have suggested the second proposal in an attempt to mitigate the impact of eliminating license types from the manifest completely (that is my primary concern). The problem you are suggesting to avoid is the problem you create for every developer that would have to inspect a file that is not a SPDX license. I note that the second proposal does not adequately address the issue that @yetzt has raised, but it does work with the current long term release of node without modifications.

Please, I don't want to get into the semantics of interpreting licenses. For practical purposes, packages generally contain one or more licenses. It is up to an author to identify the license and include it if it has not been identified by reference. Let's not go there. The issue that we must resolve is one of metadata and discovery by humans and machines that is in jeopardy as the result of the changes.

Lastly, I am confused somewhat by your role in this. I am looking at this as a developer. As one of thousands of developers that access and discover modules and data on multiple systems every day. I tend to work more with a private package repository than NPM most days. The package.json is the source of truth about the package. I tend to write as many or more packages than I consume from NPM. Metadata we have been able to assess quickly is about to become increasingly ambiguous for good across our ecosystem. This is something I feel everyone needs to be concerned about. Even worse is being painted into a corner and forced to write awkward, unhelpful text as metadata that assists no one just for the sake of silencing warnings while degrading the information I am passing to users of our software.

@GerHobbelt

For the record, my current position/hoped-for compromise:

  • using SPDX as a single source (no alternatives) is fine. That's our machine database.
  • machine-readable first, human-readable second
  • exceptions handled by (...) extension of the SPDX query expression as suggested by @scriptjs above:

The solution is as follows and currently requires no changes to NPM or to the validation scheme.

"license": "SEE LICENSE IN LICENSE.md (Your non-SPDX license here)"

This I estimate is the least disruptive & minimal change to what currently is. (Without having looked at the npm code itself; estimated by looking at the proposed change itself.)
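
To make the shape of that value concrete, here is a minimal sketch of how a tool might split it into the referenced file and the parenthesized license name. `parseSeeLicenseIn` is a hypothetical helper written for this illustration; it is not part of npm, and the parenthesized name is only the extension proposed in this thread.

```javascript
// Hypothetical helper (not an npm API): splits a license value of the form
// "SEE LICENSE IN <file> (<name>)" into its parts. The "(<name>)" suffix is
// the extension proposed in this thread and is treated as optional here.
function parseSeeLicenseIn(value) {
  const match = /^SEE LICENSE IN (.+?)(?: \((.+)\))?$/.exec(value);
  if (!match) return null; // not a "SEE LICENSE IN" value at all
  return { file: match[1], name: match[2] || null };
}

console.log(
  parseSeeLicenseIn("SEE LICENSE IN LICENSE.md (Your non-SPDX license here)")
);
```

A value without the parenthesized part, such as "SEE LICENSE IN LICENSE.txt", still parses and simply yields `name: null`, so existing values would keep working under this sketch.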


P.S.:

A decent alternative solution to the same problem, as suggested in #8918 (comment), is also fine but may be more work? (Though I'd prefer that one from an unambiguous information dissemination perspective.)

Or should we kick this up to SPDX themselves to get something like that included in their specification so as to cover commercial and other 'unsupported / one-off' licenses in their spec and thus line it up for inclusion in npm (and others)?

Looking at http://spdx.org/sites/spdx/files/SPDX-2.0.pdf section: Appendix IV: SPDX License Expressions (pages 81-88) however indicates that this might be counter to their process as they already have another way to potentially 'solve' this issue at least from their own perspective: 'DocumentRef-'.

@kemitchell
Contributor

@othiym23, this has brought one interesting use case to mind. I can summarize, then I'm afraid I've given all I can here. The upshot for me is that while I am not in favor of any changes to how npm handles license metadata at this time, there was good input here.

The use case:

  • DevCo licenses the package node-cashcow from LibCo on proprietary terms.
  • LibCo delivers node-cashcow as a tarball.
  • node-cashcow's package.json license prop is "SEE LICENSE IN README.md (LibCo Production Use License)".
  • DevCo agrees to pay LibCo $500 per month during which node-cashcow is in active production use.
  • DevCo puts its copy of node-cashcow in its private npm registry.
  • DevCo counsel reviews and announces to the dev team that "LibCo Production Use License" means five Benjamins off your budget per month.

At this point, "LibCo Production Use License" has all the meaning it needs at DevCo. It's been pseudo-standardized, in that it's now a string identifier with an unambiguous connection to a particular legal outcome.

DevCo programmers cruising an index of DevCo's private registry, which looks at npm-standard package.json props, know the consequences of using packages with that license value---perhaps other libs from LibCo, too---and they know they can trust the metadata alone. DevOps can roll a shell script that checks for "SEE LICENSE IN README.md (LibCo Production Use License)" in build artifacts and cats a line to deployment logs for record keeping.
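
The record-keeping check described above could look roughly like the following, written as a Node script rather than a shell script. It is a sketch under assumptions: the function names, the directory layout, and the log line format are all made up for illustration, and the match is a plain exact string comparison on the license prop.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// In-house license value taken from the example above (hypothetical).
const WATCHED = "SEE LICENSE IN README.md (LibCo Production Use License)";

// Recursively collect every package.json under a directory.
function findManifests(dir) {
  const found = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) found.push(...findManifests(full));
    else if (entry.name === "package.json") found.push(full);
  }
  return found;
}

// Build one log line per package whose license prop matches exactly.
function licenseLogLines(dir, watched = WATCHED) {
  const lines = [];
  for (const file of findManifests(dir)) {
    let manifest;
    try {
      manifest = JSON.parse(fs.readFileSync(file, "utf8"));
    } catch {
      continue; // skip unreadable or malformed manifests
    }
    if (manifest.license === watched) {
      const stamp = new Date().toISOString();
      lines.push(`${stamp} ${manifest.name}@${manifest.version} ${watched}`);
    }
  }
  return lines;
}
```

A deployment step could then append the result to its log, for example `fs.appendFileSync("deploy.log", licenseLogLines("./build").join("\n") + "\n")`, where both paths are placeholders.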

Current "SEE LICENSE IN ..." forces these users to dive into the LICENSE files. For proprietary projects built entirely of proprietary components, every single license prop might be "SEE LICENSE IN LICENSE.txt". It's plausible, if minor, pain, but it scales with dep tree leaf count, which I understand you encourage. I am not sure this is the pain @scriptjs has in mind.

The prospect of this pain doesn't change my view, however, for three reasons:

  1. The most important use case for all npm metadata is exchange of software from unfamiliar third parties, probably on common permissive license terms, that has not already been vetted by the receiving developer's organization. If you want to license under WTFPL, there should be exactly one "correct" value for your license prop, to make that unambiguously clear.
  2. In a controlled environment, there are ample alternatives. You can bolt on a new package.json property for license values coded in house, like legaltag, and even translate valid npm-standard license values into whatever other data format you use internally. You can script analysis of LICENSE files, since you're sure they will be predictable. You can even use SPDX, the whole of it, as SPDX intended.
  3. package.json files travel. The "shorthand" name for a familiar proprietary license in one organization may mean something very different in another, where a copy of the package ends up. Nobody wants to officiate parceling out a new namespace for arbitrary custom and proprietary license terms. The solution to that problem is content addressing---shameless plug---and house hash-to-nickname mappings. It's on my bucket list.
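
For what "content addressing and house hash-to-nickname mappings" in point 3 might mean in practice, here is one possible sketch. It is an illustration only and not a description of any existing tool: the whitespace normalization rule, the function names, and the placeholder license text are all assumptions.

```javascript
const crypto = require("node:crypto");

// Content address of a license text: SHA-256 over the text with runs of
// whitespace collapsed. The normalization rule is an assumption of this sketch.
function licenseHash(text) {
  const normalized = text.replace(/\s+/g, " ").trim();
  return crypto.createHash("sha256").update(normalized, "utf8").digest("hex");
}

// Look up the in-house nickname for a license text, or null if unknown.
// `nicknames` is a Map from hex digest to nickname, maintained per organization.
function nicknameFor(text, nicknames) {
  return nicknames.get(licenseHash(text)) ?? null;
}

// Placeholder text standing in for a real license file's contents.
const placeholderText = "Placeholder terms for a proprietary license.";
const houseNicknames = new Map([
  [licenseHash(placeholderText), "LibCo Production Use License"],
]);

console.log(nicknameFor(placeholderText, houseNicknames));
```

The point of the design is that the hash travels with the package and means the same thing everywhere, while the nickname stays local to the organization that assigned it.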

To say it another way: In my mind, npm package metadata for license terms should be a lingua franca among all developers with npm installed. That lingua franca should be optimized for packages that do the most traveling, meaning open-source packages. It should be as simple as possible to parse, interpret, and translate. It should have one escape hatch for "the license vocabulary all npm users share can't express what's going on here".

Organizations will always develop their own "dialects" of license-talk, with their own lists of non-standard license nicknames that local speakers understand. This should be encouraged; it has to be accepted. But nobody else should have to know any of these dialects to understand content in valid npm-standard license properties. That's exactly what SPDX(), SPDZ(), and SEE LICENSE IN ... () would require, especially if they say SPDX(BSD-3-Clause), SPDZ(New BSD License), and SEE LICENSE IN BSD.txt (Modified BSD License), respectively.
