https://github.com/doi-usgs/lake-temperature-out

outputs and summaries from lake modeling pipelines

https://github.com/doi-usgs/lake-temperature-out

Science Score: 18.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    11 of 13 committers (84.6%) from academic institutions
  • Institutional organization owner
    Organization doi-usgs has institutional domain (www.usgs.gov)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.1%) to scientific vocabulary

Keywords from Contributors

usgs
Last synced: 6 months ago · JSON representation

Repository

outputs and summaries from lake modeling pipelines

Basic Info
  • Host: GitHub
  • Owner: DOI-USGS
  • License: gpl-3.0
  • Language: R
  • Default Branch: main
  • Size: 337 KB
Statistics
  • Stars: 1
  • Watchers: 4
  • Forks: 6
  • Open Issues: 2
  • Releases: 0
Created about 6 years ago · Last pushed almost 3 years ago
Metadata Files
Readme Contributing License Code of conduct

README.md

lake-temperature-out

outputs and summaries from lake modeling pipelines

Running models on USGS clusters

Yeti quickstart

Once everything is set up (see below), log into Yeti and get started like this: sh ssh yeti.cr.usgs.gov cd /cxfs/projects/usgs/water/iidd/data-sci/lake-temp/lake-temperature-out

Configure Yeti session to not change file permissions

One limitation we have discovered with default configurations in Yeti is that any time someone modifies a file, the file permissions change so that others are locked out. This made it really difficult to work collaboratively in one directory on Yeti, especially with git repos because whoever last pulled from upstream became the "owner" of any files modified from that pull. To get around this, you can set a configuration in your Yeti session so that any changes you make to a file will not modify the file's permissions. You will need to run the command below in every Yeti session that you start up. Note that this is not something you want to do if you are working on files that only you should be able to change, e.g. your id_rsa keys. sh umask 002

If you forget to do this in a session and end up modifying a file and change the permissions, you can restore permissions to others by running the code below to give read/write access to users in your group. sh chmod g+w <filename or directory>

Using Interactive:

make sure you belong to the watertemp group -or- use iidd (or cida) in place of it below:

sh salloc -A iidd -n 4 -p normal -t 7:00:00 this ☝️ gives you 4 cores on normal for 7 hours. You probably want way more than 4, but this is a start. Then ssh into the node you are given, and from there, go to the working directory sh ssh n3-98 cd /cxfs/projects/usgs/water/iidd/data-sci/lake-temp/lake-temperature-out Load modules as before: sh module load legacy R/3.6.3 tools/nco-4.7.8-gnu tools/netcdf-c-4.3.2-intel then start R sh R Now you are in R but on a big cluster, so the number of cores you have available is much greater than on your own machine (unless you only asked for 4 cores...) sh library(scipiper) sbtools::authenticate_sb('cidamanager') scmake('3_summarize/out/annual_metrics_pgdl.csv')

Note that 3_summarize.yml will need to be modified to take advantage of loop tasks by specifying the number of cores to use. If the job fails or you are kicked off Yeti, no worries, as remake/scipiper will pick back up where you left off in the task table. 🎉

Lastly, the glm2 pb0 runs have tons of lakes and were successfully run by NOT using an interactive R session. So, instead of starting R as it says in the steps above, add an R script called run.R that contains that following script and kick off at the command line by running nohup Rscript run.R &. The difference is that you can't use this method to log in to SB, so make sure 1_fetch steps are complete first.

```r

To kick this off, run the following in the command line instead of R

nohup Rscript run.R &

message("Build with no login to SB") library(scipiper)

message(sprintf("starting 3summarize for glm2pb0 at %s", Sys.time())) scmake("3summarize/out/annualmetricsglm2pb0.csv")

```

Editing files on the cluster - Launching Jupyter Lab

You can use vim to edit files locally.

You can also use the Jupyter interface to edit files via a browser-based IDE. See https://hpcportal.cr.usgs.gov/hpc-user-docs/Yeti/GuidesandTutorials/how-to/LaunchJupyterNotebook.html for more.

Once you have set up a script to launch Jupyter Lab for the project and created the jlab environment for the user (see instructions below), follow these steps:

  1. In a new terminal window (call this one Terminal #2, assuming you'll keep one open for terminal access to Yeti): sh ssh yeti.cr.usgs.gov cd /cxfs/projects/usgs/water/iidd/data-sci/lake-temp/lake-temperature-out module load legacy module load python/anaconda3 salloc -J jlab -t 2:00:00 -p normal -A iidd -n 1 -c 1 sh launch-jlab.sh and copy the first line printed out by that script (begins with ssh). Note that this terminal is now tied up.

  2. In another new terminal window (call this one Terminal #3), paste the ssh command, which will look something like this: sh ssh -N -L 8599:igskahcmgslih03.cr.usgs.gov:8599 hcorson-dosch@yeti.cr.usgs.gov Enter the command. Note that this terminal is now tied up.

  3. Copy one of the final URLs printed out by the launch-jlab.sh script in Terminal #2, and paste it into a local browser window. Will look like this: http://igskmncmpshtl01:8528/?token=962bc58cf87016fa35075ecd64cec5597e805bd1ecbce0ca Be patient as the interface loads. Once you're in, you can edit files, create notebooks, etc. with the Jupyter Lab IDE.

Creating a conda Jupyter Lab environment (once per user)

sh module load legacy module load python/anaconda3 conda create -n jlab jupyterlab -c conda-forge

In order to add an R kernel to the Jupyter Lab IDE (so that we can build and run R notebooks in addition to Python notebooks), we need to run the following series of commands: sh module load legacy module load python/anaconda3 conda activate jlab conda install -c r r-irkernel zeromq If you have already set up Jupyter Lab for the project (see below) and launched Jupyter Lab, you will have to re-launch Jupyter Lab (see above) to see the R kernel.

Creating a script to launch Jupyter Lab (once per project)

Save the following script to launch-jlab.sh.

```sh

!/bin/bash

JPORT=shuf -i 8400-9400 -n 1

source activate jlab

echo "ssh -N -L $JPORT:hostname:$JPORT $USER@yeti.cr.usgs.gov"

jupyter lab --ip '*' --no-browser --port $JPORT --notebook-dir=. &

wait ```

Next we need to add the base R library from the Yeti R 3.6.3 module to our .Renviron file, so that it can be accessed by the Rkernel in Jupyter Lab.

In the console, within the project directory, type vim .Renviron. Enter 'i' to enter the insert mode, and paste in the following line:

sh R_LIBS_USER="/cxfs/projects/usgs/water/iidd/data-sci/lake-temp/lake-temperature-out/Rlib_3_6":"/opt/ohpc/pub/usgs/libs/gnu8/R/3.6.3/lib64/R/library" Press 'Esc', then type ':wq' to save and close the file.

Launch Jupyter Lab (see above) and open a new Jupyter Notebook with the R kernel. Run the command .libPaths(). You should see these 3 paths listed in this order: sh '/cxfs/projects/usgs/water/iidd/data-sci/lake-temp/lake-temperature-out/Rlib_3_6' '/opt/ohpc/pub/usgs/libs/gnu8/R/3.6.3/lib64/R/library' '/home/{username}/.conda/envs/jlab/lib/R/library' Now we should be able to load any libraries from our project library folder while in Jupyter Lab, and any necessary dependencies that are not in our project library folder will be loaded from the Yeti R 3.6.3 module library.

Owner

  • Name: U.S. Geological Survey
  • Login: DOI-USGS
  • Kind: organization
  • Email: gs_help_git@usgs.gov
  • Location: United States of America

By integrating our diverse scientific expertise, we understand complex natural science phenomena and provide scientific products that lead to solutions.

GitHub Events

Total
Last Year

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 338
  • Total Committers: 13
  • Avg Commits per committer: 26.0
  • Development Distribution Score (DDS): 0.249
Past Year
  • Commits: 5
  • Committers: 3
  • Avg Commits per committer: 1.667
  • Development Distribution Score (DDS): 0.4
Top Committers
Name Email Commits
Platt l****t@u****v 254
Jordan S Read j****d@u****v 18
Lindsay Platt l****t@i****v 15
Lindsay Platt l****t@i****v 12
Lindsay Platt l****t@i****v 10
Hayley Corson-Dosch h****h@i****v 7
Jesse Ross j****s@u****v 7
Lindsay Platt l****t@i****v 5
Hayley Corson-Dosch h****h@i****v 3
Lindsay Platt l****t@d****v 3
Lindsay Platt l****t@i****v 2
hcorson-dosch 5****h 1
Julie Padilla j****s@g****m 1

Issues and Pull Requests

Last synced: about 1 year ago

All Time
  • Total issues: 19
  • Total pull requests: 45
  • Average time to close issues: 4 months
  • Average time to close pull requests: about 1 month
  • Total issue authors: 3
  • Total pull request authors: 4
  • Average comments per issue: 6.53
  • Average comments per pull request: 3.18
  • Merged pull requests: 43
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • jordansread (9)
  • lindsayplatt (9)
  • jesse-ross (1)
Pull Request Authors
  • lindsayplatt (38)
  • jordansread (4)
  • jesse-ross (2)
  • aappling-usgs (1)
Top Labels
Issue Labels
enhancement (2)
Pull Request Labels

Dependencies

docker/Dockerfile docker
  • rocker/geospatial 4.1.2 build
docker/docker-compose.yml docker
  • jrossusgs/lake-temperature-out v0.2.2