RPi NAS: Part 2 Boot Medium

10 January 2024

The latest Raspberry Pis can boot from the built-in SD card slot but also from a USB device connected to one of the USB ports. Which one should we use for our NAS? And why? Let’s find out.

This post is part of a series about building a Network-Attached Storage (NAS) with redundancy using a Raspberry Pi (RPi). See here for a list of all posts in this series.


So we want to select a boot medium for our NAS. There are a few options: SD cards and USB devices like USB sticks, HDDs and SSDs[1]. To be able to decide which one to use we need to know what differentiates them.

As a heads up, this is one of the few posts that contain quite a bit of information that’s not strictly necessary for building this NAS. However, I think what we cover here is both interesting and good to know if you’re into computers.

Device Characteristics

I’m focusing on drive speed and longevity and ignoring factors like (physical) size and noise because they seem less important for this NAS[2].

Speed

From a practical point of view the major characteristics we care about are

  • access time and
  • throughput.

Access time tells us how long it takes from when the OS tells the device to read/write data to when the data is actually transferred. For example, to read data from an idle HDD the disk has to spin up, the physical head that reads (and writes) data has to move to the appropriate track and the disk has to rotate until the head is over the location we want to read[3]. Access times on HDDs thus depend on whether the disk is spinning and on the current positions of the head and the platter. On SD cards, USB sticks and SSDs there are no moving parts so access time will be a lot shorter and more or less constant.
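To put a rough number on the rotation part of HDD access time, here is a small Python sketch. On average the platter has to turn half a revolution before the right location passes under the head, so the average rotational latency follows directly from the spindle speed. The 5400 and 7200 rpm values are common spindle speeds that I’m assuming for illustration; they are not measurements of my drives.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the platter: half a revolution, in milliseconds."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


# Assumed, typical spindle speeds (not taken from any specific drive).
for rpm in (5400, 7200):
    print(f"{rpm} rpm: ~{avg_rotational_latency_ms(rpm):.1f} ms")
```

Seek time (moving the head) and, on an idle disk, spin-up time come on top of this.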

Throughput tells us how quickly data can be read from or written to the device, once the data transfer has started.

How much speed we get from a device depends on the type of data access. Let’s consider two scenarios:

  1. Reading one large file
  2. Reading many small files from random locations

In the first case file access only happens once so access time is less important than in the second case. For example, let’s assume you want to copy a 20GB file and your drive can write at 160MB/s. The copy process will take about 125 seconds (= 20 × 1000 / 160) plus access time[4]. In that case an access time of 3-6 seconds doesn’t seem like a big deal. By comparison, if we want to copy 20,000 files of 1MB each, then access time will have a lot more impact[5].
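The two scenarios can be compared with a deliberately crude model: total time is the pure transfer time plus one access per file. The sketch below uses the 20GB and 160MB/s figures from the example; the 15 ms per-file access time is purely my assumption for a spinning HDD that is already running (so spin-up is not counted per file).

```python
def copy_time_s(total_mb: float, throughput_mb_s: float,
                n_files: int, access_time_s: float) -> float:
    """Pure transfer time plus one access per file (a crude model)."""
    return total_mb / throughput_mb_s + n_files * access_time_s


ACCESS_TIME_S = 0.015  # assumption: 15 ms per file access, disk already spinning

one_large = copy_time_s(20_000, 160, n_files=1, access_time_s=ACCESS_TIME_S)
many_small = copy_time_s(20_000, 160, n_files=20_000, access_time_s=ACCESS_TIME_S)

print(f"one 20GB file:    {one_large:.0f} s")
print(f"20,000 1MB files: {many_small:.0f} s")
```

With that assumed access time, the per-file overhead for the small files (20,000 × 15 ms = 300 s) is larger than the 125 s needed to move the data itself. Real numbers will differ, for instance when files end up in contiguous locations.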

Another factor to consider is caching on the drive. Small devices like SD cards and USB sticks don’t normally have space for caches and for the controller required to manage them. Also, they can’t easily dissipate the heat generated during caching.

Caches, however, are very useful because they are designed to be very fast. So if you write to a drive with a cache the data will first be saved in the cache (which is a very fast operation). Little by little the data is transferred from the cache to the main (but slower) parts of the drive.

For example, if you copy a large file onto an SSD you may notice how the transfer speed drops off once a certain amount of data has been copied. That amount is a rough indication of the size of the SSD’s cache[6].

In addition to the characteristics of the drive we also need to consider how it is connected to our computer (the RPi in this case). For example, the RPi 3b only has USB 2 ports which support a maximum theoretical speed of 480Mb/s (= 60MB/s). Moreover, their connection is shared with the Ethernet port so data sent over Ethernet will reduce the bandwidth available for USB transfers[7]. On the RPi 4b USB and Ethernet don’t share a connection but all four USB ports share a single 4Gb/s (= 500MB/s) PCIe lane. So even though an RPi 4b has two USB 3 ports, which are theoretically able to transfer 5Gb/s each, we will only be able to achieve at most 4Gb/s in total. That is, if multiple drives are attached the bandwidth is shared among all of them[8].

Given these hardware restrictions, a drive with faster access time will still be faster but once the maximum speed of the connection is reached additional throughput won’t speed up transfers.
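As a small sketch of that cap: the throughput a drive can reach is the smaller of what the drive can deliver and its share of the connection. The code assumes that active drives split the link evenly and it ignores protocol overhead, so these are theoretical upper bounds. The 1000MB/s drive is a hypothetical fast SSD.

```python
def mbit_to_mbyte(mbit_per_s: float) -> float:
    """Convert megabits per second to megabytes per second."""
    return mbit_per_s / 8


def effective_mb_s(drive_mb_s: float, link_mbit_s: float,
                   n_active_drives: int = 1) -> float:
    """Upper bound for one drive on a link shared evenly by n active drives."""
    share = mbit_to_mbyte(link_mbit_s) / n_active_drives
    return min(drive_mb_s, share)


print(mbit_to_mbyte(480))    # USB 2 on the RPi 3b        -> 60.0
print(mbit_to_mbyte(4000))   # shared lane on the RPi 4b  -> 500.0
# Hypothetical 1000MB/s SSD, two drives active on the RPi 4b:
print(effective_mb_s(1000, 4000, n_active_drives=2))  # -> 250.0
```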

Longevity

First of all, all storage media we consider here will fail at some point[9]. In fact that’s the main reason for creating a redundant-data NAS.

Which factors affect life expectancy?

A major reason why a device may fail is a manufacturing defect. This can affect any kind of storage medium.

Another major reason is wear-out. HDDs have moving parts. Flash-based media like SD cards and SSDs don’t have moving parts but they are rated for a limited number of write-cycles[10]. So all of them experience some form of wear-out. It may be worth highlighting that flash-based media are not worn out much by reading from them, only by writing to them. HDDs, on the other hand, are affected by both.

Other factors will be the conditions under which you store the device. High heat, extreme cold, mechanical stress, dirt, etc. can all affect the expected lifetime.

So for how long can we expect our devices to last?

The best source I’ve found was an article about Backblaze’s statistics, covering the HDDs and SSDs used as boot drives in their data center (no SD cards and USB sticks I’m afraid :). Apparently in Backblaze’s data center SSDs outperformed HDDs by quite a bit. However, it’s likely that the SSDs haven’t reached their rated number of write-cycles yet so their failure rates may increase (substantially?) once that happens. Check out the article for more information[11].

I didn’t find any good information on SD cards and USB sticks. Lots of websites state the expected number of write-cycles given by manufacturers, e.g. 10k to 100k cycles, but I haven’t found any empirical data backing up those claims.
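To get a feeling for what such a write-cycle rating would mean if it held, here is a back-of-the-envelope sketch. It assumes perfect wear leveling (every cell is written equally often) and a write amplification factor that I simply picked; the card size and the daily write volume are made up as well. Treat the result as an idealized upper bound, not as a prediction.

```python
def ideal_lifetime_years(capacity_gb: float, pe_cycles: int,
                         written_gb_per_day: float,
                         write_amplification: float = 1.0) -> float:
    """Years until the rated cycles are used up, assuming perfect wear leveling."""
    total_writable_gb = capacity_gb * pe_cycles / write_amplification
    return total_writable_gb / written_gb_per_day / 365


# All inputs are assumptions: a 32GB card, 5GB written per day,
# write amplification of 10, and the two ends of the quoted cycle range.
for cycles in (10_000, 100_000):
    years = ideal_lifetime_years(32, cycles, written_gb_per_day=5,
                                 write_amplification=10)
    print(f"{cycles} cycles: ~{years:.0f} years")
```

The model only covers wear-out from writing. Manufacturing defects and storage conditions are not part of it, which is one more reason why empirical data would be more useful.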

Estimating Spin-up Time

Before we move on let’s do a quick and rough estimate of the spin-up time for my 3.5″ HDD and my 2.5″ HDDs. Ideally I’d like to estimate the access time but it’s difficult to measure its components individually. The spin-up time is probably the easiest to isolate (but only roughly). Specifically, for each drive I did the following:

  1. First I tested the time it takes to copy 4GiB of data from the RPi to the disk when the disk is already spinning (so no spin-up time is included in this measurement). I repeated that measurement five times and computed the average copy time.
  2. Second I tested the time it takes to copy 4GiB of data from the RPi to the disk when the disk is in standby (so spin-up time is included in this measurement). Again, this measurement was repeated five times and I then computed the average copy time. FYI, you can manually put an HDD into standby with the command hdparm -y /dev/sdaX.

The difference between those two measurements is a rough estimate of the spin-up time[12]. In the case of my WD HDDs both the 3.5″ and the 2.5″ HDDs took approximately 3.5 seconds to spin up.
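The calculation itself is just a difference of two averages. Here is a minimal Python sketch; the timings in it are made-up placeholders and not my actual measurements.

```python
from statistics import mean


def estimate_spin_up_s(spinning_copy_times_s, standby_copy_times_s):
    """Average copy time from standby minus average copy time when spinning."""
    return mean(standby_copy_times_s) - mean(spinning_copy_times_s)


# Placeholder timings in seconds, five runs each (NOT real measurements).
spinning = [41.8, 42.3, 42.0, 41.9, 42.1]
standby = [45.4, 45.9, 45.3, 45.6, 45.5]

print(f"estimated spin-up time: {estimate_spin_up_s(spinning, standby):.1f} s")
```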

Phases of Operation

We know that the type of access matters for the speed we can get out of our storage devices. Which kinds of access will typically be used on our boot medium? There are two major stages in the operation of the RPi (of most computers, really):

  • the boot process and
  • ongoing operations.

Let’s look at each of them separately.

The Boot Process

Disk access is typically the slowest of all hardware connections (e.g. it’s much slower than using RAM or processor caches). However, the data required to start our system is stored on a disk. Therefore, the boot process has been optimized to account for this bottleneck. This involves:

  1. Copying a compressed kernel image from the drive into memory (i.e. RAM) and decompressing it.
  2. Copying a compressed initramfs image from the drive into memory and decompressing it.

Loading the files as images has the advantage that file access only has to happen once (which saves a lot of access time). The images are compressed so the amount of data that has to be transferred is as small as possible. Decompressing data in RAM is comparatively fast.

After the initial (standardized) part of the boot process we enter user space. From now on we can’t use pre-made images because every system will be different[13]. The services started now (systemd, system daemons, window managers, …) will be loaded from wherever they are located on the disk. So this part is characterized by loading many small files from random locations on the drive.

So the boot process involves both of the scenarios we considered in the previous section. Is one more important than the other? The easiest approach to find out is probably by just testing different boot media. The table below shows average boot times for a freshly installed version of Raspberry Pi OS 64 bit (release date 2023-10-10).

| Device | Connection | Kernel Time (~sequential) | Userspace Time (~random) | Total Time | Ping Time |
|---|---|---|---|---|---|
| SanDisk Extreme Pro | SD | 4.2s | 13.4s | 17.5s | 27.7s |
| Samsung Evo Plus | SD | 4.1s | 13.7s | 17.8s | 27.8s |
| SanDisk Ultra | SD | 4.0s | 14.0s | 18.0s | 28.1s |
| Samsung Portable SSD T7 | USB 3 | 5.3s | 13.6s | 18.9s | 29.2s |
| SanDisk 128GB BP2309001191Z | USB 3 | 4.7s | 14.3s | 19.9s | 29.3s |
| SanDisk 128GB BP200557928Z | USB 3 | 4.7s | 17.5s | 22.2s | 29.6s |
| 2.5″ WD Elements Portable | USB 3 | 11.8s | 80.5s | 92.3s | 103.7s |

Kernel time, userspace time and total time are the outputs of the command ‘systemd-analyze’. Ping time is the time it takes from plugging the RPi into the PSU to when it’s connected to the network and answers pings. The roughly 10s gap between total time and ping time should mostly be due to the firmware and bootloader, which are not reported by ‘systemd-analyze’ on the RPi. All numbers are the average of five boot processes. The standard deviation of all of them is very small, except for the HDD whose userspace times vary widely. This may not be too surprising since access times of HDDs are not constant.
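If you want to repeat this kind of measurement, averaging the ‘systemd-analyze’ output over several boots can be scripted. The sketch below is only an illustration: it assumes the summary line looks like "Startup finished in 4.2s (kernel) + 13.4s (userspace) = 17.6s", which may differ between systemd versions, and the sample lines are invented rather than taken from my runs.

```python
import re
from statistics import mean

_UNIT_SECONDS = {"min": 60.0, "s": 1.0, "ms": 0.001}


def parse_span(text: str) -> float:
    """Convert a time span such as '1min 9.608s' or '812ms' to seconds."""
    parts = re.findall(r"([\d.]+)(min|ms|s)", text)
    return sum(float(value) * _UNIT_SECONDS[unit] for value, unit in parts)


def parse_startup_line(line: str) -> dict:
    """Map each phase name in the summary line to its duration in seconds."""
    pattern = r"((?:[\d.]+(?:min|ms|s)\s*)+)\((\w+)\)"
    return {phase: parse_span(span) for span, phase in re.findall(pattern, line)}


def average_phases(lines) -> dict:
    """Average every phase over several boots (one summary line per boot)."""
    boots = [parse_startup_line(line) for line in lines]
    return {phase: mean(boot[phase] for boot in boots) for phase in boots[0]}


# Invented sample lines in the assumed format (not real measurements).
samples = [
    "Startup finished in 4.2s (kernel) + 13.4s (userspace) = 17.6s",
    "Startup finished in 4.0s (kernel) + 13.8s (userspace) = 17.8s",
]
print(average_phases(samples))
```

One way to collect the lines is to save the first line of the ‘systemd-analyze’ output to a file after each boot and pass the file’s lines to average_phases.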

It’s probably no surprise that the HDD is by far the slowest boot device. It has to spin up before any data can be copied and random access is slow. There is no significant speed difference between the flash-based media.

Ongoing Operations

How are files accessed when the system is up and running? There are two major components:

  • The operating system and
  • applications.

Let’s first look at the operating system. As we mentioned earlier, during the boot process most of the OS gets loaded into RAM. During regular operations very little file access is required. The system will write log files and temporary files and may load config files but these are normally small. If information has to be loaded from disk then it will be characterized by reading from random disk locations.

If an application is started its files first have to be loaded from the disk. What happens afterwards depends on the application but it will typically be characterized by random reads and writes. For example, a web-server will get lots of queries from different users. To satisfy them, typically databases are accessed at lots of different locations.

For our NAS, once the system has been started, very little file access will be necessary on the boot medium. Data that comes in via the network will be saved directly on an external drive. Data that goes out via the network will be retrieved directly from an external drive. None of that data will be saved on the boot medium, not even temporarily.

Decision Time

Our NAS is intended to be always on so booting will only happen very infrequently and the actual operations require little disk access. So based on the previous sections it seems that for our purpose the choice of the boot medium really doesn’t matter much for the performance of our NAS (I know, such a simple conclusion after everything we’ve discussed!).

I think the major deciding factor is that on an RPi 4b the bandwidth of all USB ports together is limited to 500MB/s. It’s probably best to use all of that bandwidth for storage devices. So I use an SD card in my system.


Footnotes:

  1. I’m assuming that we’re not using the PCIe connection on the RPi 4b because that’s quite messy to set up. On the RPi 5 it would be an option but they were all sold out when I wanted to get one so I couldn’t test that.
  2. I’m saying they are not so important because I’m assuming that the NAS will not be placed on your desk or in your bedroom but rather out of the way, somewhere with an Ethernet connection (more on that in the next post).
  3. See https://en.wikipedia.org/wiki/Hard_disk_drive_performance_characteristics for a lot more information on how HDDs work.
  4. This assumes that the 20GB file will be written to a contiguous part of the disk, which is not necessarily the case (e.g. because of disk fragmentation).
  5. Although most probably not every file will require the full access time because they may be written to contiguous spaces on the disk. However, there is additional admin work required when copying a large number of files which will also slow down the process.
  6. It’s an upper bound actually, because some of the data from the cache will have been written to the SSD’s NAND cells by then.
  7. So on the RPi 3b all devices connected via USB and the Ethernet connection together can use 480Mb/s. This is a theoretical bandwidth because it has to cover all communication overheads that are required to set up and maintain connections. The bandwidth available for actual data transfer will be a lot lower.
  8. By comparison, on the RPi 5 the two USB 3 ports can each run at 5Gb/s simultaneously, because the I/O controller that hosts them is connected to the SoC via four PCIe lanes rather than one.
  9. I suppose all storage media will fail eventually but some can last for a long time. For example, clay tablets from ancient Sumeria that were buried underneath the ground for a long time can still be read today. Compare those three millennia to the disappointing 5 to 15 years our current storage media will last …
  10. Actually program-erase (P/E) cycles but we don’t really need those details here.
  11. Note, there are also more up-to-date statistics by Backblaze, e.g. https://www.backblaze.com/blog/backblaze-drive-stats-for-q3-2023/ and https://www.backblaze.com/blog/ssd-edition-2023-mid-year-drive-stats-review/.
  12. With only five samples of each test there’s a very good chance that the other components of access time are not averaged out, so they may still affect the results a little. I think that’s not a big problem for a rough estimate of spin-up time.
  13. Of course you can customize your system so that initramfs includes everything you need but that does require a lot more work and is probably not accessible to the average user.
