Make ACCESS-NRI Intake Catalog more portable to other computing environments (sort-of)#371
marc-white wants to merge 13 commits into main
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #371 +/- ##
=======================================
Coverage 99.03% 99.04%
=======================================
Files 15 15
Lines 1353 1361 +8
=======================================
+ Hits 1340 1348 +8
Misses 13 13
charles-turner-1
left a comment
Sorry, somehow never got round to reviewing this.
I think we can use environment variables to make this quite a bit more straightforward to port. Otherwise looks good!
| @@ -8,4 +8,16 @@ | |||
| __version__ = _version.get_versions()["version"] | |||
|
|
|||
| CATALOG_LOCATION = "/g/data/xp65/public/apps/access-nri-intake-catalog/catalog.yaml" | |||
There's a trick we use for managing config files, common in Django/web development, which might be relevant/useful here:
import os
CATALOG_LOCATION = os.environ.get("CATALOG_LOCATION", "/g/data/xp65/public/apps/access-nri-intake-catalog/catalog.yaml")
Basically, it lets us override the default with an environment variable, if found, falling back to the hardcoded default. It strikes me that it might be useful here, although we'd probably want to prefix all the environment variable names as a namespacing strategy, e.g.:
CATALOG_LOCATION = os.environ.get(
    "ACCESS_NRI_INTAKE_CATALOG_LOCATION", "/g/data/xp65/public/apps/access-nri-intake-catalog/catalog.yaml"
)
| CATALOG_LOCATION = "/g/data/xp65/public/apps/access-nri-intake-catalog/catalog.yaml" | ||
| """Location for 'live' master catalog YAML.""" | ||
|
|
||
| USER_CATALOG_LOCATION = str(Path.home() / ".access_nri_intake_catalog/catalog.yaml") |
| USER_CATALOG_LOCATION = str(Path.home() / ".access_nri_intake_catalog/catalog.yaml") | |
| USER_CATALOG_LOCATION = str(Path.home() / ".access_nri_intake_catalog/catalog.yaml") | |
| USER_CATALOG_LOCATION = os.environ.get( | |
| "ACCESS_NRI_INTAKE_USER_CAT_LOCATION", str(Path.home() / ".access_nri_intake_catalog/catalog.yaml") | |
| ) |
| USER_CATALOG_LOCATION = str(Path.home() / ".access_nri_intake_catalog/catalog.yaml") | ||
| """Location where user can place a master catalog YAML to override standard 'live' version.""" | ||
|
|
||
| STORAGE_ROOT = "/g/data" |
| STORAGE_ROOT = "/g/data" | |
| STORAGE_ROOT = os.environ.get( | |
| "ACCESS_NRI_INTAKE_STORAGE_ROOT", "/g/data" | |
| ) |
| STORAGE_ROOT = "/g/data" | ||
| """Root storage location for catalog experiments""" | ||
|
|
||
| STORAGE_FLAG_PATTERN = r"gdata/[a-z]{1,2}[0-9]{1,2}" |
| STORAGE_FLAG_PATTERN = r"gdata/[a-z]{1,2}[0-9]{1,2}" | |
| STORAGE_FLAG_PATTERN = os.environ.get( | |
| "ACCESS_NRI_INTAKE_STORAGE_FLAG_PATTERN", r"gdata/[a-z]{1,2}[0-9]{1,2}" | |
| ) |
| STORAGE_FLAG_PATTERN = r"gdata/[a-z]{1,2}[0-9]{1,2}" | ||
| """Pattern for matching 'storage flags' - related to Gadi file access system""" | ||
|
|
||
| STORAGE_LOCATION_REGEX = r"^/g/data/(?P<proj>[a-z]{1,2}[0-9]{1,2})/.*?$" |
| STORAGE_LOCATION_REGEX = r"^/g/data/(?P<proj>[a-z]{1,2}[0-9]{1,2})/.*?$" | |
| STORAGE_LOCATION_REGEX = os.environ.get( | |
| "ACCESS_NRI_INTAKE_STORAGE_LOCATION_REGEX", r"^/g/data/(?P<proj>[a-z]{1,2}[0-9]{1,2})/.*?$" | |
| ) |
| contents of :code:`config/metadata-sources` are archival copies of live experiment :ref:`metadata`; | ||
| you will not need to replace these on your system.) | ||
|
|
||
| 3. The command-line scripts in :code:`bin/` contain PBS commands and file paths specific to Gadi. You will need |
We should then be able to just update all this to reflect that the defaults can be overridden with environment variables.
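For illustration, here is a minimal sketch of how the override pattern might be wrapped up, assuming the `ACCESS_NRI_INTAKE_` prefix suggested above. The helper names `env_or_default` and `load_storage_regex`, the `/scratch/...` override pattern, and the check that an overriding regex still defines the `proj` named group are all hypothetical and not part of this PR; the check assumes downstream code relies on that group, as the default pattern suggests.

```python
import os
import re

ENV_PREFIX = "ACCESS_NRI_INTAKE_"  # namespacing prefix suggested in the review


def env_or_default(name: str, default: str) -> str:
    """Return the prefixed environment variable if set, else the default."""
    return os.environ.get(ENV_PREFIX + name, default)


def load_storage_regex(default: str) -> re.Pattern:
    """Compile the storage-location regex, checking it still exposes 'proj'.

    Assumption: code using this regex reads the 'proj' named group, so an
    override that omits it is rejected early with a clear error.
    """
    pattern = re.compile(env_or_default("STORAGE_LOCATION_REGEX", default))
    if "proj" not in pattern.groupindex:
        raise ValueError("STORAGE_LOCATION_REGEX must define a named group 'proj'")
    return pattern


# Default taken from the diff above (Gadi-specific).
GADI_DEFAULT = r"^/g/data/(?P<proj>[a-z]{1,2}[0-9]{1,2})/.*?$"

# Hypothetical override for a non-Gadi system; normally set in the shell.
os.environ["ACCESS_NRI_INTAKE_STORAGE_LOCATION_REGEX"] = (
    r"^/scratch/(?P<proj>[a-z0-9]+)/.*?$"
)

regex = load_storage_regex(GADI_DEFAULT)
match = regex.match("/scratch/abc123/experiment/output.nc")
print(match.group("proj"))  # prints "abc123"
```

On a real system the variable would be exported in the shell or job script before importing the package, rather than set from Python as done here for demonstration.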
As noted by @marc-white in #437, this PR will also require decoupling the hard requirement on the telemetry package.
Closes #363.
This PR is my first pass at trying to extract out as much of the Gadi dependency as I can from the main source code. I've also included a quick write-up of what I think would be required to get the code working on a different HPC system.
Doubtless I've missed something, so I'm marking this as a draft for others to look over.