
Deployment Spec Reference

Paperspace Deployments are containers-as-a-service that allow you to run container images and serve machine learning models using a high-performance, low-latency service with a RESTful API.


A deployment configuration, or spec, describes the desired state of your deployment. You can view and update the spec through the web console or the Gradient CLI.

  apiVersion: v1 # required, defaults to v1
  name: my cool deployment # the name of your deployment. This must be unique within your project.
  image: paperspace/app-fixture:0.1.3
  enabled: true # Toggle to enable or disable the app
  containerRegistry: my-registry # (optional) name of the container registry to use for the app.

  command: # command to run on startup
    - /bin/sh
    - '-c'
    - |
      while true
      do
        sleep .01
      done

  resources:
    ports:
      - 8000
    replicas: 2 # number of static replicas for your app. We recommend 2 to ensure high availability.
    machineType: A100-80G
    autoscaling:
      enabled: true # toggle for enabling/disabling autoscaling
      maxReplicas: 5 # max replicas for autoscaling
      metrics:
        - metric: cpu
          summary: average
          value: 50 # 50% cpu utilization across all replicas
        - metric: memory
          summary: average
          value: 22 # 22% memory utilization across all replicas
        - metric: requestDuration
          summary: average
          value: 2 # 2 second request duration for the endpoint

  integrations: # List of integrations. Max is 5.
    - type: git-lfs # git-lfs integration type
      name: falcon # unique name of the integration
      path: /models/ # a unique path on the filesystem to mount the integration. In this spec, model files will be located at /models/falcon
      url: https://huggingface.co/tiiuae/falcon-7b # hugging face model url for cloning

    - type: s3 # s3 integration type
      name: my-s3-integration
      path: /some/s3/mount/path
      url: s3://my-integration-bucket/
      region: us-east-1
      accessKeyId: AKIAVWO7J5OJSCWRJ3HJ
      secretAccessKey: secret:secretAccessKey # stored as a project or team secret. Naming is arbitrary.
      

  healthChecks: # health checks allow you to define a set of probes to check the health of your app
    readiness:
      path: /
      port: 8000 # healthcheck port.
      initialDelaySeconds: 5
      periodSeconds: 5
      timeoutSeconds: 5
      failureThreshold: 5
      headers: # (optional) list of headers to pass to the readiness probe
        - name: Authorization
          value: some-token
    liveness:
      ...
    startup:
      ...
  
  basicAuthKey: secret:my_paperspace_secret # protects the endpoint so that only authorized users can access it

  env: # container environment variables
    - name: some-env
      value: some-value
    - name: secret-env
      value: secret:mySecretEnv # stored as a project or team secret. Naming is arbitrary.
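
The comments in the integrations section state three constraints: at most 5 integrations, a unique name for each, and a unique mount path for each. These can be checked before a spec is submitted. The following is a minimal Python sketch, assuming the spec has already been loaded into a plain dictionary. The function name check_integrations is hypothetical and is not part of the Gradient CLI or SDK.

```python
def check_integrations(spec: dict) -> list[str]:
    """Return a list of problems found in spec["integrations"] (empty if none).

    Hypothetical helper: it only checks the limits stated in the spec comments
    (max 5 integrations, unique names, unique mount paths).
    """
    problems = []
    integrations = spec.get("integrations", [])

    # The spec comment states that the maximum number of integrations is 5.
    if len(integrations) > 5:
        problems.append(f"too many integrations: {len(integrations)} (max is 5)")

    # Each integration needs a unique name and a unique mount path.
    for field in ("name", "path"):
        seen = set()
        for item in integrations:
            value = item.get(field)
            if value is None:
                continue  # missing fields are not checked in this sketch
            if value in seen:
                problems.append(f"duplicate integration {field}: {value}")
            seen.add(value)

    return problems


# Usage with the two integrations from the spec above.
example = {
    "integrations": [
        {"type": "git-lfs", "name": "falcon", "path": "/models/"},
        {"type": "s3", "name": "my-s3-integration", "path": "/some/s3/mount/path"},
    ]
}
print(check_integrations(example))  # []
```

The platform may enforce further rules that this sketch does not cover.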

Common Configuration Actions

These are actions you can take by updating the deployment spec either through the console or the CLI/SDK.

  1. Start or stop the deployment: Set enabled to true (on) or false (off) and resubmit the spec.
  2. Update the number of replicas: Change replicas to the desired number and resubmit the spec.
  3. Change the machine type or image: Update machineType and/or image, respectively, and resubmit the spec.
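
The actions above all edit a few fields of the spec before it is resubmitted. Here is a minimal Python sketch of those edits, assuming the spec is held as a plain dictionary with the same keys as the example spec. The function update_spec and its parameters are hypothetical. Resubmitting the result through the console or CLI is not shown.

```python
import copy


def update_spec(spec, enabled=None, replicas=None, machine_type=None, image=None):
    """Return a copy of spec with the requested changes applied.

    Hypothetical helper: arguments left as None keep the current value.
    """
    updated = copy.deepcopy(spec)  # leave the caller's spec untouched

    if enabled is not None:
        updated["enabled"] = bool(enabled)  # action 1: start or stop
    if replicas is not None:
        if replicas < 0:
            raise ValueError("replicas must not be negative")
        updated.setdefault("resources", {})["replicas"] = replicas  # action 2
    if machine_type is not None:
        updated.setdefault("resources", {})["machineType"] = machine_type  # action 3
    if image is not None:
        updated["image"] = image  # action 3

    return updated


# Usage: stop the deployment, then scale a separate copy to 3 replicas.
spec = {
    "enabled": True,
    "image": "paperspace/app-fixture:0.1.3",
    "resources": {"replicas": 2, "machineType": "A100-80G"},
}
stopped = update_spec(spec, enabled=False)
scaled = update_spec(spec, replicas=3)
print(stopped["enabled"], scaled["resources"]["replicas"])  # False 3
```

Returning a copy keeps the previous spec available for comparison or rollback.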