Topic 6: XPublish Enhancements
This topic will focus on improving several aspects of the XPublish project. Pull requests will be submitted to the XPublish project itself, to its plugin repositories, or both.
Initially we will discuss the long-term vision for the project to help determine the milestones and architecture. From there we can split into subgroups to tackle specific problems. The group is not necessarily limited to addressing these goals, and other XPublish-related work is welcome.
Goal 1: Platform Standardization Improvements
Create a standardized method for deploying and hosting XPublish, including documentation.
- Produce a basic Dockerfile for running XPublish
- Write supporting documentation for running on different platforms
- Investigate and address additional considerations for running XPublish (logging, scaling, loading data, etc.)
- Improve configuration management
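As one possible starting point for the configuration item above, deployment settings could be read from environment variables, which keeps the same container image usable across platforms. This is only a sketch: the `ServerSettings` class, the `load_settings` helper, the `XPUBLISH_*` variable names, and the default values are all hypothetical and are not part of XPublish.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSettings:
    """Hypothetical deployment settings; none of these names come from XPublish."""

    host: str = "0.0.0.0"
    port: int = 9000
    log_level: str = "info"
    dataset_path: str = "data/example.zarr"


def load_settings(env=None):
    """Build settings from environment variables, falling back to the defaults."""
    # Accept any mapping so the function can be exercised without touching os.environ.
    env = os.environ if env is None else env
    defaults = ServerSettings()
    return ServerSettings(
        host=env.get("XPUBLISH_HOST", defaults.host),
        # Environment values are strings, so the port is converted explicitly.
        port=int(env.get("XPUBLISH_PORT", defaults.port)),
        log_level=env.get("XPUBLISH_LOG_LEVEL", defaults.log_level).lower(),
        dataset_path=env.get("XPUBLISH_DATASET_PATH", defaults.dataset_path),
    )
```

A Dockerfile or platform-specific launcher would then only need to set these variables, and the server entry point could call `load_settings()` once at startup.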
Goal 2: Benchmark and Enhance XPublish Performance
Address limitations in the underlying XPublish architecture when serving large datasets, so that data delivery scales and maintains high performance.
- Create and document methods for measuring XPublish performance
- Identify bottlenecks and prototype potential solutions
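A simple way to begin measuring performance, sketched below under stated assumptions, is to time repeated requests against a running server and summarize the latencies. The helper names (`time_calls`, `summarize`, `http_fetcher`) are hypothetical, only the Python standard library is used, and the nearest-rank 95th percentile is just one of several reasonable summary choices.

```python
import math
import statistics
import time
import urllib.request


def time_calls(fetch, repeats=10):
    """Call fetch() repeatedly and return the wall-clock timings in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fetch()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Reduce a non-empty list of timings to a few headline numbers."""
    ordered = sorted(timings)
    # Nearest-rank 95th percentile: smallest value with at least 95% of samples at or below it.
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "count": len(ordered),
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }


def http_fetcher(url, timeout=30):
    """Return a zero-argument callable that downloads url and discards the body."""

    def fetch():
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()

    return fetch
```

For example, `summarize(time_calls(http_fetcher("http://localhost:9000/datasets"), repeats=50))` would report latencies for a locally running server; the host, port, and path here are assumptions and should be replaced with the endpoint under test.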
Expected Outcomes:
The goal of this working group is to introduce new users to the XPublish ecosystem, improve the baseline deployment and documentation, and support innovations that enable additional features and workloads.
Skills required:
- Python
- Basic understanding of Pangeo stack (xarray, dask, zarr)
- Familiarity with web service API concepts
Difficulty:
Contributors of all experience levels will be able to take part. We will have both novice and advanced topics to work on.
Relevant links:
- XPublish GitHub: https://github.com/xpublish-community/xpublish
- XPublish Experiments: https://github.com/xpublish-experiments/
- Existing XREDS Xpublish host: https://github.com/asascience-open/xreds
- Live XPublish Prototype: https://nextgen-dev.ioos.us/xreds/
Focus areas
During the code sprint, the group primarily worked on the following repositories and tools:
- Serving Intake catalogs from Xpublish: https://github.com/xpublish-community/xpublish-intake
- Loading Intake catalogs to Xpublish: https://github.com/xpublish-community/xpublish-intake-provider
- Redis caching for fsspec: https://github.com/mpiannucci/redis-fsspec-cache
- Pixi (package manager for conda and PyPI packages): https://pixi.sh/latest/
- Unstructured grid subsetting: https://github.com/asascience-open/xarray-subset-grid