Getting Started with GCP
Creating an account
To gain access to the Google Cloud cluster at the Boulder GPU Hackathon, the simplest and most secure route is to create a Google Cloud account at https://cloud.google.com/freetrial and fill out the registration form. When you create an account, you’ll receive $300 in credit for experimenting with virtual machines. Although GCP will ask for a credit card, it is used for verification purposes only; you will not be charged for your usage of GCP at the Boulder Hackathon.
If you have any questions or concerns, feel free to reach out to Joe at email@example.com.
Once you’ve signed up for GCP and registered, your username will be added to the boulder-gpu-hackathon project and you will be able to ssh in to the Cloud Cluster’s login node.
From your web browser (Cloud Shell, Recommended)
The Cloud Shell is a Unix terminal-like environment embedded in your web browser that comes equipped with the gcloud command-line tools. A good place to start familiarizing yourself with the cloud console is Google’s online documentation.
To log in to the cluster, go to https://cloud.google.com and log in using your GCP credentials. Using the project selector in the upper-left portion of the GCP Console’s top panel, switch projects to “Boulder GPU Hackathon”.
Once you’ve switched to the Boulder GPU Hackathon Project, open the Cloud Shell, which can be found in the upper-right portion of the GCP Console.
This will open a terminal window (Cloud Shell) in your web browser. To log in to the Cloud Cluster, use the following command:
|gcloud compute --project=boulder-gpu-hackathon ssh --zone=us-central1-f login1|
Once logged in to the login node, you can install your software in your home directory and submit jobs to the compute nodes in the cluster.
From your own terminal
To access the Cloud Cluster from your own terminal, you need to install the Google Cloud SDK (gcloud) command-line tools and complete the initial authentication between your GCP account and your system. Complete documentation on installation and setup can be found in GCP’s documentation on Installing Google Cloud SDK.
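If you are on a Debian-based Linux system (e.g. Ubuntu, Mint), first add the Google Cloud package repository as described in the installation documentation, then install the SDK package (named google-cloud-cli in newer releases)
sudo apt-get install google-cloud-sdk
On Red Hat and CentOS systems, after adding the same repository
sudo yum install google-cloud-sdk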
Once gcloud has been initialized (for example with `gcloud init`) and your system has been authenticated, you can ssh into the login node using
|gcloud compute --project=boulder-gpu-hackathon ssh --zone=us-central1-f <user-name>@login1|
Your user-name is the G Suite/Gmail/GCP username you submitted in the Boulder GPU Hackathon registration. For example, if my GCP account is firstname.lastname@example.org, I would use
|gcloud compute --project=boulder-gpu-hackathon ssh --zone=us-central1-f joe@login1|
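Typing the project and zone on every command gets repetitive. One option, sketched here, is to store them as gcloud defaults with `gcloud config set`, after which the `--project` and `--zone` flags can usually be omitted. The variable names are illustrative, and the `command -v` guard is only there so the snippet does nothing harmful on a machine without gcloud installed.

```shell
# Project and zone taken from the ssh commands in this guide.
PROJECT="boulder-gpu-hackathon"
ZONE="us-central1-f"

if command -v gcloud >/dev/null 2>&1; then
  # Store both values in the active gcloud configuration.
  gcloud config set project "$PROJECT" &&
    gcloud config set compute/zone "$ZONE" &&
    echo "Defaults set; try: gcloud compute ssh <user-name>@login1"
else
  echo "gcloud not found; install the Google Cloud SDK first"
fi
```

With the defaults in place, `gcloud config list` shows what is currently stored.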
Moving files onto the Cloud
This article from Google Cloud Platform provides good documentation on moving files from your workstation onto the cloud. To move files onto the cloud via scp, you will need to install Google Cloud SDK on your workstation.
To push a file up to your home directory on the login node
|gcloud compute --project=boulder-gpu-hackathon scp <file-name> <user-name>@login1:|
To pull a file from the cloud to your system
|gcloud compute --project=boulder-gpu-hackathon scp <user-name>@login1:path/to/file path/to/destination/|
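The commands above copy single files. For a whole source tree, `gcloud compute scp` accepts a `--recurse` flag. The following is a minimal sketch, assuming a hypothetical local directory named `my_project` and the example username `joe`; it only attempts the transfer when gcloud and the directory are both present.

```shell
# Hypothetical names: replace with your own directory and GCP username.
SRC_DIR="my_project"
GCP_USER="joe"
DEST="${GCP_USER}@login1:"   # trailing colon means "home directory on login1"

if command -v gcloud >/dev/null 2>&1 && [ -d "$SRC_DIR" ]; then
  # --recurse copies the directory and everything beneath it.
  gcloud compute --project=boulder-gpu-hackathon scp --zone=us-central1-f \
    --recurse "$SRC_DIR" "$DEST"
else
  echo "skipping transfer: gcloud or $SRC_DIR not found"
fi
```

Swapping the last two arguments pulls a directory from the login node back to your workstation.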
Setting up your environment
The table below lists the base installation packages available on the GCP Cluster in addition to the commands needed to bring these packages into your environment.
|Compiler / MPI||OpenMPI 2.1.2||OpenMPI 3.0.1||MPICH 3.2.1|
|spack load email@example.com||N/A||spack load firstname.lastname@example.org %gcc7.3.0||spack load email@example.com %firstname.lastname@example.org|
|module load pgi/17.10||module load openmpi/2.1.2/pgi/17.10||N/A||spack load email@example.com %firstname.lastname@example.org|
|module load pgi/18.4||module load openmpi/2.1.2/pgi/18.4||N/A||spack load email@example.com %firstname.lastname@example.org|
Environment Modules and Spack allow you to add libraries and binaries to your search paths and to remove them again. This is useful given that we have many users, all with different compiler and library needs.
To see a list of available packages on our system, use
To load packages, use
|spack load <package-name>|
This command adds the appropriate paths to your PATH and LD_LIBRARY_PATH.
To clear your PATH and LD_LIBRARY_PATH
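As a rough illustration of the listing and clearing steps, the sketch below uses the standard Environment Modules subcommands (`module avail`, `module list`, `module purge`) and `spack find`. Whether these are the exact commands intended for this cluster is an assumption, and whether `module purge` also undoes a `spack load` depends on how Spack is configured there. The `have` helper is a hypothetical convenience so the snippet is harmless on machines without these tools.

```shell
# Hypothetical helper: succeeds if the named command or shell function exists.
have() { command -v "$1" >/dev/null 2>&1; }

if have module; then
  module avail          # list modules that can be loaded
  module list           # show modules currently loaded
fi

if have spack; then
  spack find            # list packages installed through Spack
fi

# Start from a clean slate, then confirm what the search paths contain.
if have module; then
  module purge          # unload every loaded module
fi
echo "PATH=$PATH"
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-}"
```

Printing the two variables before and after a load is a quick way to see what a given package changed.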
When using the PGI compilers, the CUDA toolkits installed with PGI are brought into your environment. Running `echo $CUDAPATH` after loading either of the PGI compilers will reveal where the CUDA toolkit can be found.
Installs of cuda-toolkit versions 8.0 and 9.0 are provided for using the CUDA toolkit without PGI compilers.
To load version 8.0
|module load cuda/8.0|
To load version 9.0
|module load cuda/9.0|
Submitting Jobs (Slurm)
The GCP cluster we’ve set up uses the Slurm job scheduler to manage workload requests from all of our attendees. To run applications on this system, you must write a job submission script and use the sbatch job scheduling command.
Sample Job Submission Script
# Gain exclusive access to a whole node
# How many nodes are needed
# How many MPI tasks per node
# How many physical CPUs per task
# How long the job is anticipated to run
# The name of the job
pwd; hostname; date
# Load PGI compilers with MPI
module load pgi/18.4 openmpi/2.1.2/pgi/18.4
mpirun -np 2 ./my_exe
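Each comment in the sample script corresponds to a standard `#SBATCH` directive. The sketch below writes one possible complete script to a file and checks its shell syntax. The file name `my_job.sh` and every directive value (one node, two tasks, ten minutes, the job name) are placeholders chosen for illustration, not the settings recommended for this cluster.

```shell
# Write a complete job script to my_job.sh (hypothetical file name).
cat > my_job.sh <<'EOF'
#!/bin/bash
# Gain exclusive access to a whole node
#SBATCH --exclusive
# How many nodes are needed (placeholder value)
#SBATCH --nodes=1
# How many MPI tasks per node (placeholder value)
#SBATCH --ntasks-per-node=2
# How many physical CPUs per task (placeholder value)
#SBATCH --cpus-per-task=1
# How long the job is anticipated to run, HH:MM:SS (placeholder value)
#SBATCH --time=00:10:00
# The name of the job (placeholder value)
#SBATCH --job-name=my_job

pwd; hostname; date

# Load PGI compilers with MPI
module load pgi/18.4 openmpi/2.1.2/pgi/18.4

mpirun -np 2 ./my_exe
EOF

# Check the script for shell syntax errors without running it.
bash -n my_job.sh && echo "my_job.sh looks syntactically valid"
```

Slurm reads `#SBATCH` lines only until the first executable command, so the directives need to stay at the top of the script.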
Additional useful slurm submission scripts can be found at the University of Florida Research Computing’s website.
To submit a job with a Slurm submission script, pass the script to sbatch
|sbatch <script-name>|
Use squeue to check the status of jobs running on the cluster.
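A typical submit-and-monitor sequence might look like the sketch below. It assumes a hypothetical script name `my_job.sh`. On a machine without Slurm it substitutes a stand-in line in the format sbatch normally prints, purely so the job ID parsing can be followed; that line is not real cluster output.

```shell
if command -v sbatch >/dev/null 2>&1; then
  submit_output=$(sbatch my_job.sh)
else
  # Stand-in text in sbatch's usual format so the sketch runs off the cluster.
  submit_output="Submitted batch job 12345"
fi

# The job ID is the last word of that line.
job_id="${submit_output##* }"
echo "job id: $job_id"

if command -v squeue >/dev/null 2>&1; then
  squeue -u "$USER"        # show only your own jobs
  # scancel "$job_id"      # uncomment to cancel the job
fi
```

Filtering with `-u` keeps the listing readable when many attendees have jobs in the queue.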
The login nodes, controller nodes, and compute nodes use the n1-standard machine type on GCP.
For the n1 series of machine types, a virtual CPU is implemented as a single hardware hyper-thread on a 2.6 GHz Intel Xeon E5 (Sandy Bridge), 2.5 GHz Intel Xeon E5 v2 (Ivy Bridge), 2.3 GHz Intel Xeon E5 v3 (Haswell), 2.2 GHz Intel Xeon E5 v4 (Broadwell), or 2.0 GHz Intel Xeon (Skylake). For more information on the n1-standard machines, see GCP Machine Types.
The login node uses the n1-standard-64 setup, giving 64 virtual CPUs and 240 GB RAM. There are no GPUs attached to the login node. The login node is meant for users to edit source code, compile executables, and submit jobs to run on the compute nodes.
Users should not run intense compute jobs on the login nodes.
The compute nodes use the n1-highmem-8 setup, giving 8 virtual CPUs and 52 GB RAM with the addition of one V100 per node. At this time, there are limitations on the number of virtual CPUs allowable for use with each V100; this article provides more details on the allowable GPU-CPU configurations.
The controller node uses the n1-standard-64 setup, giving 64 virtual CPUs and 240 GB RAM. There are no GPUs attached to the controller node. The controller node hosts the /apps/ and /home/ directory space on the compute cluster; these paths are mounted on the login node and all of the compute nodes.
The login, controller, and compute nodes are networked together using Google’s Virtual Private Cloud (VPC) infrastructure. The physical hardware underlying the network system consists of proprietary high-speed “ethernet-like” hardware achieving a max bandwidth of 16 Gb/s.