Globus

Globus can be used to transfer data at high speed in and out of MASSIVE. The Globus software is widely used to transfer data around the world between different endpoints. Endpoints are set up at a wide range of institutions and can also be created on your personal computer.

note

Update: Globus v5 has been installed on the MASSIVE data transfer nodes. The login process is the same, but the endpoint names have changed. The previous endpoints (massive#massive, massive#mern) will be decommissioned in the future. The main endpoint is now called 'MASSIVE M3'. For researchers connected to the Monash eResearch Network, massive#mern will be redeployed as 'MASSIVE on MeRN'.

Getting Started

  1. Go to https://globus.org and click "Log In".

  2. Authenticate to Globus. There are a few options available:

    • if your organisation is in the list, select it and click Continue (this works for Monash and others)
    • use your Google account (this works for Monash)
    • use your ORCiD iD
    • if you have a Globus ID, click on 'Globus ID to sign in'

Choose the most appropriate option and follow the displayed pages.

  3. After authenticating to Globus, the File Manager will be presented.
  • Click on the icon as shown to ensure two panels are displayed.

  4. The next step is to connect to the MASSIVE endpoint.
  • Click in the Collection text box. It does not matter which one, left or right.
  • Enter "MASSIVE M3" to search for the MASSIVE endpoint.
  • Select "MASSIVE M3" from the list.

  5. Click 'Continue'.

  6. You will be redirected to a page running at https://globusdev.massive.cloud.monash.edu/. This machine is running inside MASSIVE.
  • Use your HPC ID username and password to 'Sign In'.

  7. After authenticating successfully, your MASSIVE home folder will be displayed.

  8. Now you need to add another endpoint to transfer data in and out of MASSIVE.
  • Using the other 'Collection' field, enter the name of the required endpoint. This guide uses 'AARNet-Public-Test-Share' as the example.

Personal Globus Endpoint

An Endpoint can be created on your personal computer. This allows you to transfer between your PC and any other endpoint, including MASSIVE.

To install, please choose the correct system and follow the Globus instructions:

Using the endpoint name you chose during installation, search for your endpoint in the Globus File Manager. The process is the same as steps 3 and 4 above.

For the Linux version, you may find the browsable directories are restricted for your Personal Globus Endpoint.

The file ~/.globusonline/lta/config-paths can be altered to fix this.

For full details please read this link: https://docs.globus.org/globus-connect-personal/install/linux/#config-paths
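
As an illustration, each line of config-paths has the form path,sharing flag,R/W flag, where a sharing flag of 1 allows sharing and an R/W flag of 1 allows writing (0 means read-only). The sketch below is one possible way to add a directory. The helper name allow_globus_path and the path /data/project are hypothetical, and the field semantics should be checked against the Globus page linked above.

```shell
# Hypothetical helper: allow Globus Connect Personal to browse a directory.
# Usage: allow_globus_path /data/project [config-file]
allow_globus_path() {
    dir=$1
    config=${2:-"$HOME/.globusonline/lta/config-paths"}
    # Fields: path, sharing flag (0 = no sharing), R/W flag (1 = read-write)
    entry="$dir,0,1"

    # Keep a backup of the existing file before editing it.
    if [ -f "$config" ]; then
        cp "$config" "$config.bak"
    fi

    # Append the entry only if an identical line is not already present.
    if ! grep -qxF -- "$entry" "$config" 2>/dev/null; then
        printf '%s\n' "$entry" >> "$config"
    fi
}

# Example (the path is a placeholder):
#   allow_globus_path /data/project
```

Globus Connect Personal typically needs to be stopped and restarted before a change to this file takes effect.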

Transferring Data

warning

If your dataset contains symbolic links (symlinks), please be aware of how Globus handles them. Read this carefully: https://docs.globus.org/faq/transfer-sharing/#how_does_globus_handle_symlinks

This section assumes two endpoints have been connected to File Manager. If not, please refer to Getting Started above.

  • In the File Manager, the highlighted panel (dark blue menu) is the active one. When the right-hand side is highlighted, any actions will apply to files from the endpoint "AARNet-Public-Test-Share".
  • The possible file actions are listed in the middle of the screen.
  • To apply file actions to the MASSIVE endpoint, click on the left-hand side.

Transferring files from AARNet-Public-Test-Share to MASSIVE

  1. Ensure the destination path is correct, e.g. ~/ab12_scratch.
  2. Click on the right-hand side and select some files.
  3. Click on Start.
  4. To view the transfer progress, click on 'View Details'.

To monitor all transfers click on 'Activity' on the left hand side.

Sharing Data with your Collaborators

note

By default, sharing via Globus is not enabled. To apply for access to share via Globus, log in to the HPC ID system and navigate to the Software section of your profile. Click on Add Software, locate GlobusAllow and apply for access. Once approved you will be able to create a Globus collection following the below link. You can also contact our help desk.

Please refer to the Globus documentation. Sharing using Globus on MASSIVE is enabled only for read-only access.

Additional help

This is a brief introduction on how to use Globus to transfer data. For additional information, please refer to https://docs.globus.org/how-to/.

Globus - Command Line Interface

The Globus CLI software has been installed on MASSIVE.

[lexg@m3-login3 ~]$ module load globus-cli/2.0.0 
Loading globus-cli/2.0.0
Loading requirement: singularity/3.7.1
[lexg@m3-login3 ~]$ globus
Usage: globus [OPTIONS] COMMAND [ARGS]...

Interact with Globus from the command line

All `globus` subcommands support `--help` documentation.

Use `globus login` to get started!

The documentation is also online at https://docs.globus.org/cli/

Options:
  -v, --verbose                  Control level of output
  -h, --help                     Show this message and exit.
  -F, --format [unix|json|text]  Output format for stdout. Defaults to text
  --jmespath, --jq TEXT          A JMESPath expression to apply to json
                                 output. Takes precedence over any specified
                                 '--format' and forces the format to be json
                                 processed by this expression
  --map-http-status TEXT         Map HTTP statuses to any of these exit codes:
                                 0,1,50-99. e.g. "404=50,403=51"

Commands:
  bookmark        Manage endpoint bookmarks
  config          Manage your Globus config file. (Advanced Users)
  delete          Submit a delete task (asynchronous)
  endpoint        Manage Globus endpoint definitions
  get-identities  Lookup Globus Auth Identities
  list-commands   List all CLI Commands
  login           Log into Globus to get credentials for the Globus CLI
  logout          Logout of the Globus CLI
  ls              List endpoint directory contents
  mkdir           Create a directory on an endpoint
  rename          Rename a file or directory on an endpoint
  rm              Delete a single path; wait for it to complete
  session         Manage your CLI auth session
  task            Manage asynchronous tasks
  transfer        Submit a transfer task (asynchronous)
  update          Update the Globus CLI to its latest version
  version         Show the version and exit
  whoami          Show the currently logged-in identity

The CLI allows users to access Globus data movement functionality via the command line. For more details please refer to: https://docs.globus.org/cli/
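
For example, a first session on a login node might look like the sketch below. It assumes the globus-cli module shown above is loaded. find_collection_id is a hypothetical helper name, and --no-local-server is suggested here on the assumption that a login node cannot open a browser for you.

```shell
# One-time setup (run interactively):
#   module load globus-cli/2.0.0
#   globus login --no-local-server   # prints a URL; paste the returned code back
#   globus whoami                    # confirm which identity is logged in

# Hypothetical helper: print the ID of the first search hit for a name.
# The JMESPath output is JSON, so the surrounding quotes are stripped.
find_collection_id() {
    globus endpoint search "$1" --jmespath 'DATA[0].id' | tr -d '"'
}

# Example:
#   m3_id=$(find_collection_id "MASSIVE M3")
#   globus ls "$m3_id:~/"
```

If the search matches nothing, the expression evaluates to null, so it is worth checking the result before using it.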

Globus CLI examples are available at https://docs.globus.org/cli/examples/. They cover finding an endpoint, file manipulation, single-item transfers, batch transfers and deletions, sharing endpoints, and safe resubmissions.
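
As a rough sketch of a single scripted transfer, the function below submits a recursive transfer and then waits for it to finish. The helper name m3_transfer and the label are illustrative, and the collection IDs are placeholders that you would look up with globus endpoint search. ~/ab12_scratch/ is the example destination used earlier on this page.

```shell
# Hypothetical helper: copy a directory between two collections and wait.
# Usage: m3_transfer SRC_ID SRC_PATH DST_ID DST_PATH
m3_transfer() {
    # Submit the transfer; the JMESPath expression extracts the task ID
    # from the JSON response, and tr strips the JSON quotes.
    task_id=$(globus transfer "$1:$2" "$3:$4" \
        --recursive --label "copy to M3" \
        --jmespath 'task_id' | tr -d '"')
    echo "Submitted task: $task_id"

    # Block until the task finishes.
    globus task wait "$task_id"
}

# Example (IDs are placeholders, not real collections):
#   m3_transfer "$src_id" /data/ "$m3_id" '~/ab12_scratch/'
```

The same task also appears under 'Activity' in the web interface, so either view can be used to follow its progress.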

Globus - Platform as a Service

See https://www.globus.org/platform.

The Globus Platform enables developers to utilise Globus functionality within their own research data applications and services.

Globus Jupyter Notebooks

See https://github.com/globus/globus-jupyter-notebooks.

The notebook Platform_Introduction_Native_App_Auth.ipynb contains examples of using the functionality available.

Transfer_API_Exercises.ipynb has examples of transferring files, checking for errors, cancelling transfers under certain conditions, and monitoring transfers.

Data Portals


Sample research data portals can be viewed here: https://docs.globus.org/modern-research-data-portal/.

These illustrate what can be achieved by using the Globus API.

For further information on the functionality available through the Globus API please refer to: https://docs.globus.org/api/

Command Line Interface - Linux and OS X Users

rsync

Use rsync to synchronise file systems and to transfer large numbers of files, with the ability to stop and restart the file transfers. rsync will replicate all files in a folder from one location to another. It first analyses both file systems to find the differences and then transfers only the changes.

A typical command to synchronise files from a local folder to M3 is:

rsync -auv -e ssh adirectory username@m3-dtn.massive.org.au:~/destinationdirectory/

rsync is very powerful and has many options to help transfer data. For example, it can delete unwanted files (--delete), compress data before transfer (-z), or let you see what command options might do without actually executing them (--dry-run). For more information on rsync, try man rsync.
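
As a sketch of how these options combine, the hypothetical wrapper below passes any extra options through to rsync, so a run can be previewed with --dry-run before it is repeated for real. The host name is the one from the command above; sync_to_m3, username and the directory names are placeholders.

```shell
# Hypothetical wrapper around the rsync command shown above.
# Usage: sync_to_m3 SOURCE USERNAME DESTINATION [extra rsync options...]
sync_to_m3() {
    src=$1
    user=$2
    dest=$3
    shift 3
    # Any remaining arguments (e.g. --dry-run, -z, --delete) go to rsync.
    rsync -auv "$@" -e ssh "$src" "$user@m3-dtn.massive.org.au:$dest"
}

# Preview what would be transferred, then run it for real with compression:
#   sync_to_m3 adirectory username '~/destinationdirectory/' --dry-run
#   sync_to_m3 adirectory username '~/destinationdirectory/' -z
```

Note that rsync treats adirectory and adirectory/ differently: without the trailing slash the directory itself is created inside the destination, while with it only the contents are copied.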