How to Upload and Download Datasets and Files from Gradient Notebooks

Notebooks are a web-based Jupyter IDE with shared persistent storage for long-term development and inter-notebook collaboration, backed by accelerated compute.


Uploading Large Files to the File Manager

To upload large files or a large number of files, use a command-line tool such as curl, Wget, or gdown.

For example, you could use wget to download the Stanford Dogs dataset to your notebook:

This command downloads the dataset to your current folder:

!wget http://vision.stanford.edu/aditya86/ImageNetDogs/images.tar
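If you would rather stay in Python than shell out, the same download can be sketched with the standard library. This is a minimal sketch, not a Paperspace API: the helper name download_file and its dest_dir parameter are hypothetical, and it assumes the URL points directly at a file.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def download_file(url: str, dest_dir: str = ".") -> Path:
    """Download url into dest_dir and return the local path (hypothetical helper)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Name the local file after the last path component of the URL.
    filename = Path(urlparse(url).path).name or "download"
    target = dest / filename
    # Stream the response to disk so a large archive is never held in memory.
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

Calling download_file("http://vision.stanford.edu/aditya86/ImageNetDogs/images.tar") from a code cell would save images.tar in the current folder, as the wget command above does.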

Transferring Files From Google Drive

You can bring files and folders from your Google Drive into your notebook using gdown.

From the notebook or terminal, install gdown with: pip install gdown.

Then upgrade it to the latest version with: pip install --upgrade gdown. When you run these commands from a notebook code cell, prefix each one with a !.

In the permissions settings of the files and folders you want to upload, set the permissions to “Anyone with the Link.”

Then, copy the share link and extract the file ID from it. Use one of the following commands, depending on your needs.

For files larger than 500 MB, use: gdown "<file_ID>&confirm=t".

For smaller files, use: gdown <file_ID>.

For folders, use: gdown https://drive.google.com/drive/folders/<file_ID> -O /tmp/folder --folder.
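Pulling the ID out of a share link by hand is error-prone, so the step can be sketched in Python. This is a minimal sketch under assumptions: the helper extract_drive_id is hypothetical, and the three link shapes it recognizes (file, folder, and open?id= links) are common ones rather than an exhaustive list.

```python
import re

# Common Google Drive share-link shapes (an assumption; not exhaustive).
_DRIVE_ID_PATTERNS = [
    re.compile(r"/file/d/([A-Za-z0-9_-]+)"),         # .../file/d/<ID>/view?usp=sharing
    re.compile(r"/drive/folders/([A-Za-z0-9_-]+)"),  # .../drive/folders/<ID>?usp=sharing
    re.compile(r"[?&]id=([A-Za-z0-9_-]+)"),          # .../open?id=<ID>
]


def extract_drive_id(share_link: str) -> str:
    """Return the file or folder ID embedded in a Google Drive share link."""
    for pattern in _DRIVE_ID_PATTERNS:
        match = pattern.search(share_link)
        if match:
            return match.group(1)
    raise ValueError(f"No Google Drive ID found in: {share_link!r}")
```

The returned string is the value to substitute for <file_ID> in the gdown commands above.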

Download Files From the File Manager

To download large files or folders from the notebook, first zip or tar them, which you can do from the notebook or the terminal. Files in shared storage or a dataset can be downloaded by first moving them into the file manager and then following these steps.

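  1. Compress the files and folders using one of the following commands in a notebook code cell or the terminal. If you use a notebook code cell, add a ! before the command and join the two lines with &&, because each ! command runs in its own shell and a cd on a separate line does not carry over.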

    1. tar

      cd /notebooks
      tar -cf [filename].tar [file1] [file2]...
      
    2. zip

      cd /notebooks
      zip -r [filename].zip [file1] [file2]...
      
  2. Refresh the file manager.

  3. Right-click the compressed file you created.

  4. Select the Download option.
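The compression step above can also be done from a Python code cell with the standard-library tarfile module, which avoids shell quoting issues with unusual filenames. This is a minimal sketch: the helper name make_tar is hypothetical, and it writes an uncompressed archive to mirror tar -cf.

```python
import tarfile
from pathlib import Path
from typing import Sequence


def make_tar(archive_path: str, paths: Sequence[str]) -> Path:
    """Bundle the given files and folders into one uncompressed .tar archive."""
    archive = Path(archive_path)
    # Mode "w" writes without compression, like `tar -cf`.
    with tarfile.open(archive, "w") as tar:
        for p in paths:
            # arcname stores only the final path component in the archive;
            # directories are added recursively by default.
            tar.add(p, arcname=Path(p).name)
    return archive
```

For example, make_tar("/notebooks/results.tar", ["/notebooks/outputs", "/notebooks/log.txt"]) would create an archive that then appears in the file manager after a refresh; the paths here are placeholders.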

Shared Storage

You can share data between users on a team and between notebooks that belong to users on a team.

You can access shared persistent storage through code, either via the notebook terminal or via a code cell within a notebook. There is currently no way to access shared persistent storage from the GUI.

Note
Shared storage cannot be accessed across clusters. As a result, data stored in /storage on the Paperspace cluster is not accessible on the Graphcore cluster.

Access Shared Storage

You can access shared persistent storage from a code cell within a notebook using the ! operator and issuing bash commands on a single line connected with the && operator.

For example, you can create a new directory within your persistent /storage directory with the following command:

!cd /storage && mkdir data && cd data
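Because a plain mkdir fails if the directory already exists, re-running the cell above raises an error. A Python equivalent that is safe to re-run can be sketched as follows; the helper name ensure_storage_dir and its root parameter are hypothetical, with /storage taken from the text above as the default.

```python
from pathlib import Path


def ensure_storage_dir(name: str, root: str = "/storage") -> Path:
    """Create root/name if it is missing and return its path."""
    target = Path(root) / name
    # parents/exist_ok make the call idempotent, so the cell can be re-run.
    target.mkdir(parents=True, exist_ok=True)
    return target
```

Calling ensure_storage_dir("data") returns the path /storage/data, which you can pass directly to code that reads or writes files there.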

To access persistent storage in a Paperspace Notebooks terminal, use the cd command to change into the persistent directory /storage.

For example, you can create a new persistent directory called “data”:

cd /storage
mkdir data
cd data