Eclipse IP Policy: Reviewing Third Party Content

I’ll start this discussion with some background…

Under the original Eclipse Foundation Intellectual Property (IP) Policy, every bit of third party content needed to be thoroughly reviewed before it could be used by an Eclipse project. And the reviews were thorough: license scan, provenance check, scan for anomalies, … Reviews of third party content literally took days, weeks, and months. All of that review needed to be complete before the project team could commit any code that made any reference to that third party content.

As you might imagine, the time required to engage in all of that analysis was somewhat inconvenient for project teams, so we introduced the notion of Parallel IP. The idea behind the introduction of Parallel IP was that the IP Team could perform a cursory check of the content and grant checkin, thereby authorizing the project team to commit code into their repository that leverages the content, while the IP Team engaged in their more thorough review in parallel. The project team needed to wait until that thorough review was complete before engaging in a release. Initially, Parallel IP was available only to Eclipse projects in the Incubation Phase; it was later extended to Eclipse projects in the Mature Phase, but only for new versions of third party content that had already been reviewed and approved.

Parallel IP made the process better. Committers only had to wait for a day or two before they could leverage new third party content. It worked well for those Eclipse projects that engaged in annual releases, but as more and more Eclipse projects increased their release frequency, the time required to complete reviews of third party content (and the lead time required to engage with the IP due diligence process ahead of a release) became a problem (it’s likely more accurate to say that the existing problem became more acute).

To accommodate those projects that needed to move quickly, we introduced the notion of license check only IP due diligence and gave Eclipse project teams the ability to decide what sort of IP due diligence they’d like to engage. We helpfully labeled the new type of IP due diligence as Type A (license check only) and the classic thorough IP due diligence as Type B (license check, provenance check, and anomalies scan). We introduced some automation that leveraged open source tools to scan and automatically approve third party content submitted for Type A review. Based on the rate of adoption, Type A was very successful.

It’s worth pointing out that our Type A IP due diligence, even though it is less thorough than our Type B, is still far more than any other open source organization does. In fact, our Type A provides a far more thorough review of third party content than most organizations engage in.

While the introduction of Type A made the IP due diligence process flow faster for Eclipse project teams, it didn’t address the underlying problem: that the process required that every single bit of third party content must be reviewed before it can be used in any capacity.

The October 2019 updates that we introduced to the Eclipse Foundation’s IP Policy change an important definition that lets us turn the process around. The definition of the term Distributed Content had previously forced us to implement a process that required the review of all third party content before any use or reference to it could be committed to an Eclipse project’s source code repository. By narrowing the definition of Distributed Content to refer only to content that is included in a release, Eclipse project teams may now push commits that reference third party content without first checking with the IP Team during a development cycle. It’s only when it comes time to release that we need to certify that the third party content included in and referenced by the release is license compatible.

This change shifts some of the onus onto the project team. Before pushing a commit that leverages third party content, a committer will need to (at least informally) check to see if the license on that content is compatible with the project license. We’re not expecting that project committers do any sort of deep analysis, only that they review the licensing terms on the content (if the third party content’s license is on the Eclipse Foundation’s Approved Licenses list, then you’re probably okay). If there’s any question, or a committer feels that shenanigans might be afoot, they can engage with the Eclipse Foundation’s IP Team for help. Primarily, committers need to provide some scrutiny to the license of the content that they adopt to avoid surprises when it comes time to certify compliance ahead of a release.

With no requirement to review content in advance, there is no requirement to engage with the Eclipse IP Team via contribution questionnaires (CQs) for every single piece of third party content. Instead, we can leverage the vast database of intellectual property metadata that we’ve assembled over the years to validate an entire dependency list as a unit (I posted about this a couple of weeks ago). For those readers who have been part of our community for a while, this means that there is no longer any requirement to create piggyback CQs. Further, we’re leveraging other sources of intellectual property metadata (e.g., ClearlyDefined), which means that there is generally no longer any requirement to create CQs of any kind. In practice, we will continue to use CQs to engage the Eclipse IP Team to research and vet content for which information is not already available, or to investigate content when we detect shenanigans (we will also continue to use CQs to track project code and third party content that includes cryptography).

Our intent is to make as many of the changes work as possible using the process and infrastructure that we currently have in place. In 2020, we will start researching and evaluating new tools; our hope is that we will be able to implement our updated process using existing open source tools. In the meantime, we have some tools that we’ve been using internally to validate license compliance of third party content and are working on making these tools available in a form that can be leveraged by committers (we’ve just started capturing requirements and will track progress using Bug 553016).

We’ve only just gotten approval for the policy changes and have only just started implementing process changes. We appreciate your patience as we work to make all of this happen. We’re tracking our progress on Bug 552967.

Posted in EDP, Intellectual Property | Leave a comment

Update to the Eclipse IP Policy

Sharon Corbett drafted this message that the IP Team has been posting on some of our CQs. I’d like to share it more broadly.

The Eclipse Board of Directors approved changes to the Eclipse Intellectual Property Policy on October 21, 2019.  The most significant change relates to how we will perform due diligence of leveraged Third Party Content (Section IV B).

Motivation and Background:

The Eclipse IP Policy and Procedures date back to 2004.  While we have made significant changes over time, we cannot support agile development nor continuous delivery.  Further, it’s impossible to scale to modern day technology such as Node.JS, Electron, NPM, etc.  Additionally, our process for third party content is burdensome and lacks automation.

License Compliance of Third Party Content:

The change removes the requirement to perform deep copyright, provenance and scanning of anomalies for Third Party Content unless it is being modified and/or if there are “special” considerations regarding the package. Instead, the focus for this content will be on license compliance only, which many of you know today as Type A. Reminder, please note that this will not impact how we handle project code and/or modified third party code.  Full due diligence review will remain in place for this content.  This change applies to THIRD PARTY Content only.

Futures:

We are currently working on tooling to provide to Eclipse projects so that in the near future, Eclipse projects will perform these license compliance checks allowing projects to only check in with the IP Team to file a CQ when/if the project encounters IP violations or restrictions with respect to its Third Party Content.  We estimate this tooling will be available by the end of the year. 

We ask for your patience while we use existing infrastructure to roll out these changes with further automation being planned.  We will need to change documentation, front end changes, etc.  More information will be forthcoming on this topic.

These efforts are in support of our community and we will work with you as we continue to move forward with this new modernization effort.

Posted in EDP, Intellectual Property | Leave a comment

Eclipse Project Licenses

While it’s true that most Eclipse projects use the Eclipse Public License, many Eclipse open source projects use alternative licenses either alone or in combination.

The chart below shows the relative use of various license schemes by Eclipse open source projects:

Note that we use SPDX expressions. In SPDX, license combinations are expressed from the consumer’s point of view, so dual licensing is expressed using the disjunctive “OR” operator. For example, “EPL-2.0 OR Apache-2.0” expresses dual licensing of content under the Eclipse Public License 2.0 or Apache License 2.0. Further, in SPDX, secondary licenses (as they are supported by the Eclipse Public License 2.0) are expressed as dual licensing; from the consumer’s point of view, our form of secondary licensing is equivalent to dual licensing. So, “EPL-2.0 OR GPL-2.0 WITH Classpath-exception-2.0” indicates that the source code may be distributed under the Eclipse Public License 2.0, or the secondary license’s terms (there’s more information about secondary licensing in the FAQ).
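To make the consumer’s point of view concrete, here is a minimal Python sketch that splits a flat SPDX expression into the alternatives a consumer may choose from. The function name license_choices is made up for illustration, and the sketch assumes a simple expression with no parentheses and no AND operator; a real SPDX parser handles far more than this.

```python
import re


def license_choices(expression: str) -> list[str]:
    """Split a flat SPDX expression on its top-level OR operators.

    Assumption: the expression contains no parentheses and no AND
    operator, so every OR separates two complete alternatives.
    """
    return [choice.strip() for choice in re.split(r"\s+OR\s+", expression.strip())]


# Dual licensing: the consumer picks one of two licenses.
print(license_choices("EPL-2.0 OR Apache-2.0"))

# Secondary licensing looks the same to the consumer; the WITH clause
# stays attached to the license that it modifies.
print(license_choices("EPL-2.0 OR GPL-2.0 WITH Classpath-exception-2.0"))
```

In both cases the result is a list with two entries, one per license option.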

When we filter out all but the Eclipse Public License, the graph simplifies to this:

Many of our older projects still use the Eclipse Public License 1.0. Projects have been updating to version 2.0 over time. For those projects that haven’t updated yet: I’m coming for you.

Posted in Eclipse 101, EDP, Intellectual Property | Leave a comment

Revising the Eclipse IP Policy: Third Party Content

The Eclipse Foundation is in the process of making a major update to our Intellectual Property Policy. A big part of this update is a change in the way that we will manage third party content. 

In the context of the Eclipse IP Policy, “third party content” is content that is leveraged by the Eclipse open source project, but not otherwise produced or managed by an Eclipse open source project. A library produced by, say, an Apache open source project, is considered to be third party content. Today, the IP Policy requires that all third party content must be vetted by the Eclipse IP Team before it can be used by an Eclipse Project. Pending approval from the Eclipse Board of Directors, we’re planning to turn this around.

Upon approval of these updates, project teams will be able to introduce new third party content during a development cycle without first checking with the IP Team. That is, a project team may commit build scripts, code references, etc. that refer to third party content to their source code repository without first creating a contribution questionnaire (CQ) to request IP Team review and approval of the third party content. At least during the development period between releases, the onus is on the project team to ensure, with reasonable confidence, that any third party content that they introduce is license compatible with the project’s license. Before any content may be included in any formal release, the project team must validate that the third party content licenses are compatible with the project license.

Traditionally, releases are preceded by a release review and, as part of that release review, the Eclipse IP Team engages in a review of the project’s record of intellectual property contributions and third party content use (the IP Log). It is during that IP Log review that the IP Team will validate the state of license compatibility. I say “traditionally”, because we changed tradition, or more specifically, we changed the Eclipse Development Process in late 2018, to make it so that a project team may engage in any number of major and minor releases for an entire year following a successful release review. In the case where a release does not require a review (and so there is no trigger to engage in an IP Log review), the onus is on the project team to ensure the license compatibility of all referenced third party content. But we’re not leaving project teams high-and-dry: the IP Team can still help with the validation, even when a formal review is not required.

We’ve been experimenting with processes and tools to help us with license compatibility validation. These tools will be used by the IP Team during their evaluation, and will be made available to project teams as well. Ideally, project teams will integrate the license compatibility validation tool into their builds so that the tool may identify content that requires further scrutiny, and present it to the IP Team to resolve early in their development cycle so that–by the time we run the tool to validate the content at the end of the release cycle–all identified content will already have been resolved and the IP Team can just “rubber stamp” it. 

This should provide significant flexibility for project teams to experiment with different libraries and versions, while also making the IP due diligence process more streamlined and predictable.

An important part of making this work is leveraging existing databases of information. Over the years, we’ve accumulated a significant amount of knowledge about a great many libraries, but others have also done a great deal of work. The new process will leverage other trusted sources of information (more on this in a future post). We’re going to get out of the business of scanning through every single bit of source code ourselves, and instead trust our own database and other sources of information (and contribute to these other sources of information).

Our prototype tool focuses on a bill of materials. Each entry in the bill of materials identifies a particular third party library. To identify a particular third party library, we’ve decided to adopt the ClearlyDefined project’s five-part identifiers, which include the type of content, its software repository source, its namespace, its name, and its version. ClearlyDefined coordinates are roughly analogous to Maven coordinates, which unambiguously identify a particular piece of software by groupId, artifactId, and version (e.g., org.junit.jupiter:junit-jupiter:5.5.2 unambiguously identifies content that is known to be under the EPL-2.0). The ClearlyDefined coordinate system adds the type of the content as “maven” and its source as “mavencentral”, so org.junit.jupiter:junit-jupiter:5.5.2 becomes maven/mavencentral/org.junit.jupiter/junit-jupiter/5.5.2 (note that, at least theoretically, the source could be a different Maven repository). We selected ClearlyDefined coordinates at least in part because we have projects that use languages that are not Java and software repositories that are not Maven Central; using these coordinates, we can also identify, for example, NPM content (e.g., npm/npmjs/@babel/generator/7.6.2).
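The mapping from Maven and NPM identifiers to ClearlyDefined coordinates is mechanical enough to sketch in a few lines of Python. This is an illustration only, not the prototype tool’s code: the function names are invented, and the “-” placeholder for NPM packages that have no scope is my understanding of the ClearlyDefined convention rather than something taken from the tool.

```python
def maven_to_clearlydefined(gav: str, source: str = "mavencentral") -> str:
    """Translate 'groupId:artifactId:version' into a ClearlyDefined coordinate."""
    group_id, artifact_id, version = gav.split(":")
    return f"maven/{source}/{group_id}/{artifact_id}/{version}"


def npm_to_clearlydefined(package: str, version: str) -> str:
    """Translate an NPM package name and version into a ClearlyDefined coordinate.

    A scoped name such as '@babel/generator' already carries its namespace.
    For unscoped names we assume '-' stands in for the missing namespace.
    """
    if package.startswith("@") and "/" in package:
        namespace, name = package.split("/", 1)
    else:
        namespace, name = "-", package
    return f"npm/npmjs/{namespace}/{name}/{version}"


print(maven_to_clearlydefined("org.junit.jupiter:junit-jupiter:5.5.2"))
# maven/mavencentral/org.junit.jupiter/junit-jupiter/5.5.2

print(npm_to_clearlydefined("@babel/generator", "7.6.2"))
# npm/npmjs/@babel/generator/7.6.2
```

The source parameter is there because, as noted above, the content could at least theoretically come from a Maven repository other than Maven Central.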

A bill of materials, then, is a list of ClearlyDefined coordinates (the prototype tool automatically translates a Maven dependency list or a Node package-lock file into this coordinate system).

The Maven Dependency plugin can be used to generate a list of dependencies (as Maven coordinates):

$> mvn dependency:list -DoutputFile=project.deps -DappendOutput=true

The output from the Maven Dependency plugin takes the following form:

The following files have been resolved:
   org.apache.commons:commons-lang3:jar:3.4:compile
   org.slf4j:slf4j-api:jar:1.7.21:compile
   org.slf4j:slf4j-log4j12:jar:1.7.21:compile
   log4j:log4j:jar:1.2.17:compile
   commons-logging:commons-logging:jar:1.1.1:compile
   ...
The following files have been resolved:
   org.apache.commons:commons-lang3:jar:3.4:compile
   com.clearspring.analytics:stream:jar:2.9.6:compile
   ...

The prototype tool generates a bill of materials that looks something like the following:

maven/mavencentral/org.apache.commons/commons-lang3/3.4, Apache-2.0, approved
maven/mavencentral/org.slf4j/slf4j-api/1.7.21, MIT, approved
maven/mavencentral/org.slf4j/slf4j-log4j12/1.7.21, MIT, approved
maven/mavencentral/log4j/log4j/1.2.17, Apache-2.0, approved
maven/mavencentral/commons-logging/commons-logging/1.1.1, Apache-2.0, approved
maven/mavencentral/com.clearspring.analytics/stream/2.9.6, Apache-2.0, approved
...

The actual output also includes a handful of URLs (e.g., pointers to the source code when they’re known), but I’ve removed them to focus on the important bits. You’ll also notice that content that is repeated in the dependency list only appears once in the output.
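The translation and de-duplication steps can be sketched in Python as follows. This is not the prototype tool itself: the function name is hypothetical, the source is assumed to be Maven Central, and entries that carry a classifier (which adds a sixth field) are simply skipped. The license and approval columns come from a metadata lookup that is not shown here.

```python
import re

# Matches five-field entries such as "   org.slf4j:slf4j-api:jar:1.7.21:compile"
# (groupId:artifactId:packaging:version:scope). Header lines and "..." do not match.
DEP_LINE = re.compile(r"^\s*([^:\s]+):([^:\s]+):([^:\s]+):([^:\s]+):([^:\s]+)\s*$")

SAMPLE = """\
The following files have been resolved:
   org.apache.commons:commons-lang3:jar:3.4:compile
   org.slf4j:slf4j-api:jar:1.7.21:compile
   org.slf4j:slf4j-log4j12:jar:1.7.21:compile
   log4j:log4j:jar:1.2.17:compile
   commons-logging:commons-logging:jar:1.1.1:compile
The following files have been resolved:
   org.apache.commons:commons-lang3:jar:3.4:compile
   com.clearspring.analytics:stream:jar:2.9.6:compile
"""


def clearlydefined_coordinates(dependency_list: str) -> list[str]:
    """Return unique ClearlyDefined coordinates in order of first appearance."""
    seen: dict[str, None] = {}  # a dict keeps insertion order and drops repeats
    for line in dependency_list.splitlines():
        match = DEP_LINE.match(line)
        if match is None:
            continue
        group_id, artifact_id, _packaging, version, _scope = match.groups()
        # Assumption: everything was resolved from Maven Central.
        seen.setdefault(f"maven/mavencentral/{group_id}/{artifact_id}/{version}", None)
    return list(seen)


for coordinate in clearlydefined_coordinates(SAMPLE):
    print(coordinate)
```

Run against the sample dependency list, this prints six coordinates, with commons-lang3 listed only once.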

For each library, the tool determines the license and whether or not it is approved for use (I’ll discuss how it works in a future post).  Any content that is listed as restricted instead of approved must be reviewed and resolved by the IP Team before it can be included in any official project release (I’ll discuss this in a future post). 

At present, the prototype tool only identifies licenses and assesses compatibility to determine whether or not the content is approved. The output is easily parsed to identify problematic content, but our intent is to make it more immediately helpful without requiring advanced bash-fu skills. There’s plenty of opportunity for further automation, including rolling the prototype into a Maven plugin to incorporate directly into a build and support for other build systems.
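As a rough idea of what parsing the output involves, here is a small Python sketch that picks out entries that are not marked as approved. It assumes that the first three comma-separated fields are the coordinate, the license, and the status, as in the sample above; the function name is invented, and the com.example entry is made up purely to show a flagged line.

```python
def needs_review(bill_of_materials: str) -> list[tuple[str, str, str]]:
    """Return (coordinate, license, status) for every entry that is not approved."""
    flagged = []
    for line in bill_of_materials.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 3:
            continue  # skip blank lines and anything that is not an entry
        coordinate, license_id, status = fields[:3]  # ignore any trailing URLs
        if status != "approved":
            flagged.append((coordinate, license_id, status))
    return flagged


SAMPLE_BOM = """\
maven/mavencentral/org.apache.commons/commons-lang3/3.4, Apache-2.0, approved
maven/mavencentral/org.slf4j/slf4j-api/1.7.21, MIT, approved
maven/mavencentral/com.example/made-up-library/1.0.0, unknown, restricted
"""

for coordinate, license_id, status in needs_review(SAMPLE_BOM):
    print(f"{coordinate} ({license_id}): {status}")
```

A build could fail, or just warn, whenever this list is not empty; which of the two makes sense is a project decision and not something the prototype dictates.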

This bill of materials becomes the third-party content part of the project’s IP Log. This has the added benefit of being 100% accurate without the need for things like piggyback contribution questionnaires (CQs). At least in the short term, the IPzilla system and CQs will remain the main means by which project committers interact with the IP Team, but only for content that requires investigation.

Posted in EDP, Intellectual Property | 2 Comments

Eclipse Contributor Agreement 3.0

The Eclipse Contributor Agreement (ECA) is an agreement made by contributors certifying that the work they are contributing was authored by them and/or that they have the legal authority to contribute it as open source under the terms of the project license.

The Eclipse Foundation’s IP Team has been working hard to get the various agreements that we maintain between the Eclipse Foundation and community updated. Our first milestone targeted the ECA, and we’re happy to report that a very significant number of our community members have successfully updated theirs. Today, we retired all of the rest of them. Specifically, we’ve revoked all ECAs that predate the ECA version 3.0.

We’re confident that we’ve managed to connect and update the ECA for everybody who still wants to be a contributor, so there should be no interruption for anybody who is actively contributing. If we missed you, you’ll be asked to sign the new ECA the next time you try to contribute. Or you can just re-sign it now.

We’ve made some changes with the new agreements that make contributing easier (but explaining harder). Committers who have signed the Individual Committer Agreement (ICA) version 4.0 or work for a company that has signed the Member Committer and Contributor Agreement do not require an ECA.

Contact emo_records@eclipse.org if you’re having trouble with an agreement.

Posted in Uncategorized | Leave a comment

Specification Scope in Jakarta EE

With the Eclipse Foundation Specification Process (EFSP), a single open source specification project has a dedicated project team of committers to create and maintain one or more specifications. The cycle of creation and maintenance extends across multiple versions of the specification, and so while individual members may come and go, the team remains and it is that team that is responsible for every version of that specification that is created.

The first step in managing how intellectual property rights flow through a specification is to define the range of the work encompassed by the specification. Per the Eclipse Intellectual Property Policy, this range of work (referred to as the scope) needs to be well-defined and captured. Once defined, the scope is effectively locked down (changes to the scope are possible but rare, and must be carefully managed: changing the scope requires approval from the Jakarta EE Working Group’s Specification Committee).

Regarding scope, the EFSP states:

Among other things, the Scope of a Specification Project is intended to inform companies and individuals so they can determine whether or not to contribute to the Specification. Since a change in Scope may change the nature of the contribution to the project, a change to a Specification Project’s Scope must be approved by a Super-majority of the Specification Committee.

As a general rule, a scope statement should not be too precise. Rather, it should describe the intention of the specification in broad terms. Think of the scope statement as an executive summary or “elevator pitch”.

Elevator pitch: You have fifteen seconds before the elevator doors open on your floor; tell me about the problem your specification addresses.

The scope statement must answer the question: what does an implementation of this specification do? The scope statement must be aspirational rather than attempt to capture any particular state at any particular point-in-time. A scope statement must not focus on the work planned for any particular version of the specification, but rather, define the problem space that the specification is intended to address.

For example:

Jakarta Batch describes a means for executing and managing batch processes in Jakarta EE applications.

and:

Jakarta Message Service describes a means for Jakarta EE applications to create, send, and receive messages via loosely coupled, reliable asynchronous communication services.

For the scope statement, you can assume that the reader has a rudimentary understanding of the field. It’s reasonable, for example, to expect the reader to understand what “batch processing” means.

I should note that the two examples presented above are just examples of form. I’m pretty sure that they make sense, but defer to the project teams to work with their communities to sort out the final form.

The scope is “sticky” for the entire lifetime of the specification: it spans versions. The plan for any particular development cycle must describe work that is in scope; and at the checkpoint (progress and release) reviews, the project team must be prepared to demonstrate that the behavior described by the specifications (and tested by the corresponding TCK) cleanly falls within the scope (note that the development life cycle of a specification project is described in Eclipse Foundation Specification Process Step-by-Step).

In addition to the specification scope, which is required by the Eclipse Intellectual Property Policy and EFSP, the specification project that owns and maintains the specification needs a project scope. The project scope is, I think, pretty straightforward: a particular specification project defines and maintains a specification.

For example:

The Jakarta Batch project defines and maintains the Jakarta Batch specification and related artifacts.

Like the specification scope, the project scope should be aspirational. In this regard, the specification project is responsible for the particular specification in perpetuity. Further, the related artifacts, like APIs and TCKs, can be in scope without actually being managed by the project right now.

Today, for example, most of the TCKs for the Jakarta EE specifications are rolled into the Jakarta EE TCK project. But, over time, this single monster TCK may be broken up and individual TCKs moved to corresponding specification projects. Or not. The point is that regardless of where the technical artifacts are currently maintained, they may one day be part of the specification project, so they are in scope.

I should back up a bit and say that our intention right now is to turn the “Eclipse Project for …” projects that we have managing artifacts related to various specifications into actual specification projects. As part of this effort, we’ll add Git repositories to these projects to provide a home for the specification documents (more on this later). A handful of these proto-specification projects currently include artifacts related to multiple specifications, so we’ll have to sort out what we’re going to do about those project scope statements.

We might consider, for example, changing the project scope of the Jakarta EE Stable APIs (note that I’m guessing a future new project name) to something simple like:

Jakarta EE Stable APIs provides a home for stable (legacy) Jakarta EE specifications and related artifacts which are no longer actively developed.

But, all that talk about specification projects aside, our initial focus needs to be on describing the scope of the specifications themselves. With that in mind, the EE4J PMC has created a project board with issues to track this work and we’re going to ask the project teams to start working with their communities to put these scope statements together. If you have thoughts regarding the scope statements for a particular specification, please weigh in.

Note that we’re in a bit of a weird state right now. As we engage in a parallel effort to rename the specifications (and corresponding specification projects), it’s not entirely clear what we should call things. You’ll notice that the issues that have been created all use the names that we guess we’re going to end up using (there’s more information about that in Renaming Java EE Specifications for Jakarta EE).

Posted in EDP, EFSP, Jakarta EE, Java, Uncategorized | Leave a comment

Renaming Java EE Specifications for Jakarta EE

It’s time to change the specification names…

When we first moved the APIs and TCKs for the Java EE specifications over to the Eclipse Foundation under the Jakarta EE banner, we kept the existing names for the specifications in place, and adopted placeholder names for the open source projects that hold their artifacts. As we prepare to engage in actual specification work (involving an actual specification document), it’s time to start thinking about changing the names of the specifications and the projects that contain their artifacts.

Why change? For starters, it’s just good form to leverage the Jakarta brand. But, more critically, many of the existing specification names use trademarked terms that make it either very challenging or impossible to use those names without violating trademark rules. Motivation for changing the names of the existing open source projects that we’ll turn into specification projects is, I think, a little easier: “Eclipse Project for …” is a terrible name. So, while the current names for our proto-specification projects have served us well to-date, it’s time to change them. To keep things simple, we recommend that we just use the name of the specification as the project name. 

With this in mind, we’ve come up with a naming pattern that we believe can serve as a good starting point for discussion. Again, the project will use the same name as the specification unless there is a compelling reason to do otherwise.

The naming rules are relatively simple:

  • Replace “Java” with “Jakarta” (e.g. “Java Message Service” becomes “Jakarta Message Service”);
  • Add a space in cases where names are mashed together (e.g. “JavaMail” becomes “Jakarta Mail”);
  • Add “Jakarta” when it is missing (e.g. “Expression Language” becomes “Jakarta Expression Language”); and
  • Rework names to consistently start with “Jakarta” (“Enterprise JavaBeans” becomes “Jakarta Enterprise Beans”).
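The four rules are mechanical enough to express as a short Python sketch. This is only an illustration of the rules as stated (the function name is invented), and it covers none of the judgment calls about wordiness or the term “API”.

```python
import re


def jakarta_name(name: str) -> str:
    """Apply the four naming rules to an existing specification name."""
    # Add a space where names are mashed together ("JavaMail" -> "Java Mail").
    name = re.sub(r"\bJava(?=[A-Z])", "Java ", name)
    # Replace "Java" with "Jakarta".
    words = ["Jakarta" if word == "Java" else word for word in name.split()]
    # Add "Jakarta" when it is missing, and move it to the front when it is not first.
    words = ["Jakarta"] + [word for word in words if word != "Jakarta"]
    return " ".join(words)


for old_name in ("Java Message Service", "JavaMail", "Expression Language", "Enterprise JavaBeans"):
    print(f"{old_name} -> {jakarta_name(old_name)}")
```

For the four examples given in the rules, this produces “Jakarta Message Service”, “Jakarta Mail”, “Jakarta Expression Language”, and “Jakarta Enterprise Beans”.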

This presents us with an opportunity to add even more consistency to the various specification names. Some, for example, are more wordy or descriptive than others; some include the term “API” in the name, and others don’t; etc.

We’ll have to sort out what we’re going to do with the Eclipse Project for Stable Jakarta EE Specifications, which provides a home for a small handful of specifications which are not expected to change. I’ll personally be happy if we can at least drop the “Eclipse Project for” from the name (“Jakarta EE Stable”?). We’ll also have to sort out what we’re going to do about the Eclipse Mojarra and Eclipse Metro projects which hold the APIs for some specifications; we may end up having to create new specification projects as homes for development of the corresponding specification documents (regardless of how this ends up manifesting as a specification project, we’re still going to need specification names).

Based on all of the above, here is my suggested starting point for specification (and most project) names (I’ve applied the rules described above; and have suggested tweaks for consistency by strike out):

  • Jakarta APIs for XML Messaging
  • Jakarta Architecture for XML Binding
  • Jakarta API for XML-based Web Services
  • Jakarta Common Annotations
  • Jakarta Enterprise Beans
  • Jakarta Persistence API
  • Jakarta Contexts and Dependency Injection
  • Jakarta EE Platform
  • Jakarta API for JSON Binding
  • Jakarta Servlet
  • Jakarta API for RESTful Web Services
  • Jakarta Server Faces
  • Jakarta API for JSON Processing
  • Jakarta EE Security API
  • Jakarta Bean Validation
  • Jakarta Mail
  • Jakarta Beans Activation Framework
  • Jakarta Debugging Support for Other Languages
  • Jakarta Server Pages Standard Tag Library
  • Jakarta EE Platform Management
  • Jakarta EE Platform Application Deployment
  • Jakarta API for XML Registries
  • Jakarta API for XML-based RPC
  • Jakarta Enterprise Web Services
  • Jakarta Authorization Contract for Containers
  • Jakarta Web Services Metadata
  • Jakarta Authentication Service Provider Interface for Containers
  • Jakarta Concurrency Utilities
  • Jakarta Server Pages
  • Jakarta Connector Architecture
  • Jakarta Dependency Injection
  • Jakarta Expression Language
  • Jakarta Message Service
  • Jakarta Batch
  • Jakarta API for WebSocket
  • Jakarta Transaction API

We’re going to couple renaming with an effort to capture proper scope statements (I’ll cover this in my next post). The Eclipse EE4J PMC Lead, Ivar Grimstad, has blogged about this recently and has created a project board to track the specification and project renaming activity (as of this writing, it has only just been started, so watch that space). We’ll start reaching out to the “Eclipse Project for …”  teams shortly to start engaging this process. When we’ve collected all of the information (names and scopes), we’ll engage in a restructuring review per the Eclipse Development Process (EDP) and make it all happen (more on this later).

Your input is requested. I’ll monitor comments on this post, but it would be better to collect your thoughts in the issues listed on the project board (after we’ve taken the step to create them, of course), on the related issue, or on the EE4J PMC’s mailing list.

 

Posted in EDP, EFSP, Jakarta EE, Java, Uncategorized | 4 Comments