Governance document is out of line with Mission Statement #23

Open
mcgibbon opened this issue Oct 18, 2018 · 4 comments

Comments

@mcgibbon
Contributor

mcgibbon commented Oct 18, 2018

The Governance document begins with:

The Pangeo Project (The Project) is an open source software project. The goal of The Project is to develop open source software and related technology for the analysis of large scientific datasets. The Project endeavors to extend the broader scientific software ecosystem.

The Mission statement is:

Our mission is to cultivate an ecosystem in which the next generation of open-source analysis tools for ocean, atmosphere and climate science can be developed, distributed, and sustained. These tools must be scalable in order to meet the current and future challenges of big data, and these solutions should leverage the existing expertise outside of the geoscience community.

This is further specified with three goals:

  1. Foster collaboration around the open source scientific python ecosystem for ocean / atmosphere / land / climate science.
  2. Support the development with domain-specific geoscience packages.
  3. Improve scalability of these tools to handle petabyte-scale datasets on HPC and cloud platforms.

The Governance document description of The Project omits what the first goal states and the mission statement alludes to: fostering collaboration among scientists around the software ecosystem. This includes distributing and spreading the word about tools, getting users of tools to give feedback to developers, and connecting scientists who can collaborate on tools.

The Governance document description of The Project also adds in the specification that the project is solely "for the analysis of large scientific datasets".

For example, holding a conference to network scientists working on various open-source scientific projects for collaboration and brainstorming of new projects would clearly fall under the Mission Statement, but not necessarily under the Governance document description. Under the Governance document description, you'd instead expect a conference of scientists specifically working on Pangeo projects to meet to work on those Pangeo projects (which are for analyzing large scientific datasets).

Here I have to take a detour to explain why these differences matter to me.

The initial meeting of The Project connected me with @JoyMonteiro, and we spent much of that meeting and the following months developing Sympl and CliMT, ostensibly as Pangeo-affiliated projects. In the two years that followed, my impression was that the group as an online entity had become defunct (there was no memo that everything was moved to Github). As a result, Sympl and CliMT have grown apart from Pangeo.

Those projects were made with Pangeo in mind, in the following ways:

  • They leverage xarray for internal model state storage.
  • They give potential opportunities to take advantage of dask and tensorflow within Earth system models.
  • They encourage model code to be easy to understand and manipulate, fostering reproducibility and collaboration between scientists.
  • They make model code interoperable between models, again fostering collaboration between model developers.
  • They allow for easy online analysis of model data, and for that online analysis code to be shared with other models.
  • They make model development more accessible for new and inexperienced programmers.
  • Ideally, Pangeo would provide an ecosystem that networks scientific software developers who would be interested in using the aforementioned features and contributing to Sympl and CliMT, both by developing software and by commenting on or disagreeing with the software design choices.

Notice that the above goals of our projects have nothing to do with analyzing large datasets.

Coming back from that detour.

Ideally now that I know Pangeo is still here, I'd like to bring Sympl (and with @JoyMonteiro's blessing, CliMT) back into the Pangeo fold. However, that brings us back to the conflict between the Mission Statement (which reflects the original intention of The Project), and the Governance document (which, recently drafted, reflects at least someone's current understanding of what The Project is supposed to be).

Should the Governance document be revised to reflect the original intention of The Project, or should the Mission Statement be updated to reflect a newer state of The Project? This is related to the question, do Sympl and CliMT have a place in Pangeo?

@rabernat
Member

> do Sympl and CliMT have a place in Pangeo?

Absolutely, yes! We should broaden the statement in the governance document to encompass these efforts. I think "interoperability" trumps any other specific focus.

The NSF Earthcube award obviously steered things in a certain direction. Having specific deliverables for which we are accountable to NSF to produce has made us very focused. Now is a great time to zoom out and look at the broader landscape.

> In the two years that followed, my impression was that the group as an online entity had become defunct (there was no memo that everything was moved to Github)

Jeremy, I don't feel that this is fair. You can find the memo right here:
https://groups.google.com/forum/#!topic/pangeo/pFKILby3cuI
In this email, I said:

We plan to conduct all of our work via the pangeo-data GitHub organization. In particular, we have a new “pangeo-discussion” repo we are using just as an issue tracker and wiki:
https://github.com/pangeo-data/pangeo-discussion
Please join in these discussions freely!

@mcgibbon
Contributor Author

mcgibbon commented Oct 18, 2018

Good to hear! With your blessing I'll work on a PR to broaden the opening statement in the coming week, unless someone else wants to take that responsibility.

I'll also think about how to approach re-integrating Sympl and CliMT into The Project.

You're right @rabernat, I was somewhat misinformed. I missed that memo. In my defense, it was really easy to miss and to misunderstand! That e-mail (subject "Announcing the Pangeo NSF Earthcube Award!") was clearly about something entirely different from deprecating the mailing list. The move to GitHub was only mentioned in the second-to-last paragraph of a reasonably long announcement, and it's not clear that "we have a new discussion repo" means "important announcements will no longer be posted to the mailing list". That very discussion repo says:

For now, community discussion is happening on the GitHub issues page or on the pangeo google group.

(emphasis my own)

We're missing the point, though: nobody is to blame for the mailing list becoming defunct, and I am not laying any blame (I'm certainly not blaming you!). I am simply saying that this is why I fell out of touch with what was going on in the project.

@JoyMonteiro

Yes, it would definitely be great to include climt along with the other tools that use xarray. I was unsure whether and how sympl/climt would fit into the pangeo ecosystem, but thinking about it in terms of interoperability as @rabernat mentioned makes a lot of sense!

@mcgibbon
Contributor Author

While doing this modification, I'm also noticing that the explanation of Contributors in the governance document is a little programming-focused. I'm going to include some edits in the PR to bring it more in line with the idea that community building, education, discussion, outreach, etc. are important ways to be a Contributor (pending your comments on those edits, of course).
