# Use the ML-in-a-Box Template For Machine Learning Applications

Machines are Linux and Windows virtual machines with persistent storage, GPU options, and free unlimited bandwidth. They're designed for high-performance computing (HPC) workloads.

ML-in-a-Box is a Linux-based machine template with a pre-installed data science stack, including popular tools like PyTorch, TensorFlow, Hugging Face Transformers, DeepSpeed, JupyterLab, NumPy, Pandas, XGBoost, and scikit-learn. It comes with full machine learning support (CUDA, cuDNN, NVIDIA Docker). You can use this template to run machine learning software, develop and train models, or install additional tools as needed.

ML-in-a-Box includes command-line tools located in `/usr/bin`, `/usr/local/bin`, `/home/paperspace/.local/bin`, and `/usr/local/cuda/bin`, allowing you to access and run machine learning software or install additional tools. These directories are already included in your system's `PATH`, so no manual configuration is needed.

ML-in-a-Box also comes with Python libraries commonly used in data science, machine learning, and deep learning projects pre-installed.

For more information on ML-in-a-Box, refer to the [ML-in-a-Box GitHub repository](https://github.com/Paperspace/ml-in-a-box).

## Set Up ML-in-a-Box

To use the [ML-in-a-Box](https://github.com/Paperspace/ml-in-a-box) template, you need to choose this template when first [creating your machine](https://docs.digitalocean.com/products/paperspace/machines/how-to/create/index.html.md#template). On the **Create a new machine** page, in the **Machine** section, under the **OS Template** sub-section, click the drop-down menu, type "ML-in-a-Box" in the search bar, then select it.

**Note**: For machines created after 17 January 2024, the ML-in-a-Box template automatically includes an [NCCL](https://docs.nvidia.com/deeplearning/nccl/install-guide/index.html) configuration file to improve performance on hardware like NVIDIA H100s.
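You can check whether this file is already present on your machine before creating it yourself (a quick check; the command prints a short notice instead of an error if the file does not exist):

```shell
# Show the NCCL configuration if present; otherwise print a notice
cat /etc/nccl.conf 2>/dev/null || echo "no /etc/nccl.conf found"
```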
On older machines, you can manually add this configuration by creating a `/etc/nccl.conf` file with the following contents:

```text
NCCL_TOPO_FILE=/etc/nccl/topo.xml
NCCL_IB_DISABLE=0
NCCL_IB_CUDA_SUPPORT=1
NCCL_IB_HCA=mlx5
NCCL_CROSS_NIC=0
NCCL_SOCKET_IFNAME=eth0
NCCL_IB_GID_INDEX=1
```

Afterwards, continue configuring your machine as needed, then click **CREATE MACHINE**.

## Connect to Your Machine

Machines created with ML-in-a-Box only have terminal access, so you need to [connect to your machine](https://docs.digitalocean.com/products/paperspace/machines/how-to/connect/index.html.md) using SSH. After connecting to your machine, your home directory is set to `/home/paperspace` and the shell is set to `/bin/bash`.

To report any issues with the software, provide feedback, or submit feature requests, see the [Paperspace Community](https://community.paperspace.com/) or contact [Paperspace support](https://docs.digitalocean.com/products/paperspace/machines/support/index.html.md).

## Verifying Your GPUs

To verify the GPUs on your machine, run the NVIDIA System Management Interface to display information about the device:

```bash
nvidia-smi
```

This outputs a list of the NVIDIA GPUs on your machine:

```text
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA H100 80GB HBM3          On  | 00000000:00:05.0 Off |                    0 |
| N/A   25C    P0              74W / 700W |    155MiB / 81559MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
...
```

### Verifying PyTorch

If you need to verify the PyTorch environment on your machine, start an interactive Python session by running `python` in your terminal. Your PyTorch environment is set up properly if `torch.cuda.is_available()` returns `True`:

```sh
Python 3.11.7 (main, Dec  8 2023, 18:56:58) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.get_device_name(0)
'NVIDIA H100 80GB HBM3'
```

If needed, run `python -m torch.utils.collect_env` for more information on your machine's environment.

If the PyTorch environment on your machine is not set up properly, you may receive an error indicating that `torch` is not found:

```text
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'torch'
```

If you receive this error, install [PyTorch](https://pytorch.org/).

### Verifying TensorFlow

If you need to verify the TensorFlow environment on your machine, start an interactive Python session by running `python` in your terminal. Your TensorFlow environment is set up properly if `tf.config.list_physical_devices('GPU')` isn't an empty list and `tf.test.is_built_with_cuda()` returns `True`:

```sh
python
>>> import tensorflow as tf
>>> x = tf.config.list_physical_devices('GPU')
>>> for i in range(len(x)):
...     print(x[i])
...
PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')
PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU')
...
>>> tf.test.is_built_with_cuda()
True
```
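As an alternative to the interactive sessions above, both checks can be run non-interactively from the shell (a sketch; it suppresses the traceback and prints a short notice if a library is not installed):

```shell
# Non-interactive GPU-stack check for PyTorch and TensorFlow
python -c "import torch; print('PyTorch CUDA available:', torch.cuda.is_available())" 2>/dev/null \
  || echo "PyTorch not importable"
python -c "import tensorflow as tf; print('TensorFlow GPUs:', len(tf.config.list_physical_devices('GPU')))" 2>/dev/null \
  || echo "TensorFlow not importable"
```

On a correctly configured machine, the first line reports `True` and the second reports a nonzero GPU count.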