Best Practices for Performance with DigitalOcean Spaces

Spaces Object Storage is an S3-compatible object storage service that lets you store and serve large amounts of data. Each Space is a bucket for you to store and serve files. The built-in Spaces CDN minimizes page load times and improves performance.
Here are some recommendations on how to get the best performance from Spaces based on your use case and application architecture.

Use a Content Delivery Network (CDN)

When Should I Do This?

A CDN, or content delivery network, caches your assets in geographically distributed locations to make downloads and page loads faster for your users.

You should use a CDN in front of Spaces if:

  • Your use case is mostly serving GET requests from the Internet, especially for frequent requests to small files.

For example, web-facing applications and media servers are especially likely to see significant performance improvements with the addition of a CDN.

How Does This Improve Performance?

By integrating a CDN with Spaces, you can distribute content to your users with low latency and a higher data transfer rate than you could get by serving your content directly from Spaces.

A CDN will fetch requested files from Spaces and cache them closer to your end users. By serving future requests for the same files from the CDN’s cache, you reduce the number of GET requests sent to Spaces and therefore reduce the user’s overall latency when interacting with your application.

How Do I Implement This?

You can use the free Spaces CDN, which is available as part of your Spaces subscription at no additional cost.

Several third party CDNs have documentation specifically for Spaces, like Fastly and KeyCDN, and most other CDNs will work with Spaces with minimal configuration. However, to use Cloudflare as a CDN, you need to either use a Cloudflare worker (which is a paid service) or use a tier that supports host header rewrites (which the free tier currently does not).

Warning
We do not recommend or support using multiple CDNs from separate vendors with your Spaces buckets (such as the Spaces built-in CDN and another vendor’s CDN), as it can cause performance issues and be complex to manage.

Name Objects Optimally

When Should I Do This?

We recommend the following object naming convention if you expect to use the ListObject API call and store a substantial number of objects in your Spaces bucket. This threshold might be anywhere between 10,000 and 1 million objects, depending on your specific workload and level of activity.

How Does This Improve Performance?

The ListObject API call, with its delimiter argument, operates considerably faster when object key names in your bucket begin with different characters. This is useful when your Space contains a substantial number of objects, which naturally slows the ListObject call.

How Do I Implement This?

Prefix each object key name in your bucket with 6-8 characters and pass the delimiter argument when calling ListObject. For example, you can prefix each name with random characters, such as abcdef-filename, or, if you don’t upload too many files per day, with the current date, such as 2022-05-25-filename.
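As a sketch of this convention in Python (the helper names here are illustrative, not part of any SDK), you can generate either a random prefix or a date prefix for each key:

```python
import secrets
import string
from datetime import date


def random_prefixed_key(filename: str, prefix_len: int = 6) -> str:
    """Prefix an object key with random characters so key names in the
    bucket start differently, e.g. "abcdef-filename"."""
    alphabet = string.ascii_lowercase + string.digits
    prefix = "".join(secrets.choice(alphabet) for _ in range(prefix_len))
    return f"{prefix}-{filename}"


def date_prefixed_key(filename: str) -> str:
    """Prefix an object key with the current date, e.g. "2022-05-25-filename".
    Suitable when you don't upload too many files per day."""
    return f"{date.today().isoformat()}-{filename}"
```

When listing, pass the delimiter argument (for example, `Delimiter="-"` with an S3-compatible client's ListObjects call) so the service can group keys by their common prefixes.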

Warning
Do not use personally identifiable or other private information in any of your bucket names, bucket metadata (BucketPolicy, BucketLifecycle), object names, or object metadata (tagging, x-amz-meta- headers, etc.). This data is not encrypted using server-side encryption and is not typically encrypted in client-side encryption designs.

Avoid Small Files and Use Multi-Part Uploads for Large Files

When Should I Do This?

You should consider the size of your files and the way you upload them to Spaces if you are:

  • Handling files smaller than 1 MB.
  • Uploading files larger than 500 MB.

How Does This Improve Performance?

Spaces is designed for storing and serving moderate to large files. Files 20 MB to 200 MB in size will give the best performance for both writes and reads. Additionally, combining small files into one larger file will greatly reduce the overall number of requests to your Space compared to handling many small files individually.

How Do I Implement This?

When uploading files larger than 500 MB, you should use multi-part uploads.
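In practice, a well-established S3-compatible client handles multi-part uploads for you (for example, boto3's `upload_file` splits large files automatically once they cross a configurable size threshold). If you are curious what happens under the hood, the part-splitting step looks roughly like this sketch; `iter_parts` is our illustrative name, and each yielded chunk would become one UploadPart call:

```python
import io

# 100 MiB parts keep each part within the 20 MB to 200 MB range that
# performs best with Spaces. Note that S3-compatible multi-part uploads
# require every part except the last to be at least 5 MiB.
PART_SIZE = 100 * 1024 * 1024


def iter_parts(fileobj, part_size=PART_SIZE):
    """Yield (part_number, chunk) pairs suitable for S3-style UploadPart
    calls. S3-compatible APIs number parts starting from 1."""
    part_number = 1
    while True:
        chunk = fileobj.read(part_size)
        if not chunk:
            break
        yield part_number, chunk
        part_number += 1
```

Letting the client library manage the multi-part lifecycle (initiate, upload parts, complete) is usually the right choice, since it also handles retries of individual failed parts.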

We recommend combining files less than 1 MB together into a single, larger file. How you do this will be specific to your particular files and use case, but one example is concatenating daily log files into a monthly file.
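As a minimal sketch of the log-concatenation example (`combine_logs` is a hypothetical helper; real log rotation pipelines are usually more involved):

```python
from pathlib import Path


def combine_logs(daily_dir: Path, monthly_file: Path) -> int:
    """Concatenate all daily .log files in a directory into one monthly
    file, returning the number of files combined."""
    count = 0
    with monthly_file.open("wb") as out:
        for daily in sorted(daily_dir.glob("*.log")):
            # Skip the output file itself in case it lives in the same directory.
            if daily.resolve() == monthly_file.resolve():
                continue
            out.write(daily.read_bytes())
            count += 1
    return count
```

Uploading the single monthly file to your Space then replaces dozens of small PUT requests with one.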

Choose the Right Datacenter for Your Resources

When Should I Do This?

Choosing the right datacenter location for your buckets depends on where the connections to your buckets come from.

If the connections to your buckets are from Droplets, you’ll see the best performance when you:

  • Ideally, put your Droplets and buckets in the same datacenter.
  • Alternatively, put your Droplets and buckets in datacenters connected by DigitalOcean’s global backbone.

If the connections to your buckets are from end users on the Internet, you’ll see the best performance when you use a CDN, regardless of which region your buckets are in.

How Does This Improve Performance?

Droplets and Spaces buckets in the same datacenter will have the least amount of latency, but if your application requires connectivity across multiple regions, choose datacenter locations connected by DigitalOcean global backbone.

Traffic between Spaces buckets in NYC3, AMS3, and SFO2 and Droplets in NYC1, NYC2, NYC3, AMS3, LON1, FRA1, TOR1, SFO1, and AMS2 goes over the global backbone. This provides predictable and stable latency with no packet loss because these connections use our dedicated links instead of the public Internet.

How Do I Implement This?

You can choose the regions for your resources at creation time. For existing infrastructure, you can migrate your Droplets to a new region.

Traffic between buckets in NYC3 and Droplets in NYC1, NYC2, and NYC3 goes over our Northeast regional backbone; traffic between buckets in AMS3 and Droplets in AMS2, LON1, and FRA1 goes over our European regional backbone.

Handle 50x Errors Properly

When Should I Do This?

This recommendation applies any time you upload files to Spaces. Your upload application or library should correctly handle 50x errors.

How Does This Improve Performance?

With correct error handling and retry logic, your dataset will upload faster and require less human intervention. Additionally, without this functionality, any 50x errors during uploads will create gaps in your data.

How Do I Implement This?

Spaces has a very high degree of compatibility with S3, so one option is to use a well-established S3-compatible client or library for your uploads, like s3cmd.

If you’re writing your own code, implement retry logic with exponential backoff to handle 503 Slow Down responses.
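A minimal sketch of that retry loop, assuming your HTTP client surfaces 50x responses as an exception (`TransientError` here is a stand-in for whatever your client actually raises); many clients, including boto3, can also be configured to retry with backoff for you:

```python
import random
import time


class TransientError(Exception):
    """Stand-in for your HTTP client's 50x error (e.g. 503 Slow Down)."""


def with_retries(request, max_attempts=5, base_delay=0.5):
    """Call request() and retry transient failures with exponential
    backoff plus jitter, re-raising after max_attempts failures."""
    for attempt in range(max_attempts):
        try:
            return request()
        except TransientError:
            if attempt == max_attempts - 1:
                raise
            # Sleep base_delay * 2^attempt, plus random jitter so many
            # clients don't retry in lockstep after a throttling event.
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
```

The jitter matters: if every client backs off on the same schedule, the retries themselves arrive in synchronized bursts and can trigger further 503 responses.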

Optimize Your Request Rate

When Should I Do This?

You should consider ways to optimize the number of requests you send to Spaces if:

  • You send more than 150 requests per second.
  • Your requests are being rate limited.
  • You frequently encounter 503 Slow Down responses.

How Does This Improve Performance?

To ensure all customers receive a fair share of Spaces’ available throughput in any region, we apply rate limiting at the Space level. Making sure your uploads stay within the threshold will prevent unexpected throttling. As we continue to improve Spaces, we will also re-evaluate this threshold.

How Do I Implement This?

There are several ways to begin optimizing the number of requests you send depending on your use case.

  • If you’re uploading many small files at a high rate, consider combining small files into larger files.
  • If you’re running into throttling issues, make sure your upload application or library is handling 50x errors correctly.
  • If you plan to push more than 150 requests/second to Spaces (regularly or as part of a one-time upload), open a support ticket so we can help you prepare for the workload and avoid any temporary limits on your request rate.

Use Local or Block Storage Instead

When Should I Do This?

Not all use cases are appropriate for Spaces. Like all object stores, Spaces are best used for static, unstructured data.

You should use local or block storage if:

  • You’re storing dynamic or structured data, like low-latency key/value stores and other databases.
  • You need traditional filesystem access or POSIX-like semantics.

We don’t recommend Spaces for use with filesystem-on-S3 services, like S3FS or S3QL.

How Does This Improve Performance?

Using local storage or block storage will give you better performance in certain cases because the underlying hardware devices provide low-latency I/O. You can visit Object Storage vs. Block Storage Services to learn more.

How Do I Implement This?

You can get started with Volumes, or learn more about Linux storage concepts and terminology first.

If you need a database solution, you can look into a Redis or Cassandra cluster.