
Jan 27 2022

LTO Tape data storage for Linux nerds

The insides of an LTO-5 tape

Tape storage is surprisingly not dead! If you are here then you may be considering using LTO tape as part of your backup or long-term archiving strategy. I’m mostly here to talk you out of it, or at least to make sure that you are aware of what you are walking into.

Is it actually cheaper to use Tape or Disk?

One of the common reasons to look at LTO tape is that it’s much cheaper than hard drives. Where a 12TB SATA drive costs around £18.00 per TB (at the time of writing), an LTO-8 tape with the same capacity costs around £7.40 per TB. That’s a significant price difference, so you may ask: what is the catch?

The drive cost

LTO comes in generations. Roughly every 3 years the LTO Consortium unveils a new one, generally with around double the capacity of the last. Tape cartridges themselves are not forward compatible (a newer tape will not work in an older drive), but drives are generally backwards compatible: they can write one generation back and read two generations back. LTO-8 and LTO-9 drives are the exception, as they only read and write one generation back.

This means that if you buy an LTO-6 drive, you should expect to be able to also read/write LTO-5 tapes, but only read LTO-4 ones.
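
As a rough sketch of that rule, the shell function below lists which cartridge generations a given drive generation should handle. The name lto_compat is made up for illustration, and the special case for LTO-8 and LTO-9 drives (one generation back for both reading and writing) is an addition to the general rule, so check the datasheet of your drive before relying on it.

```shell
#!/bin/sh
# lto_compat GEN: print the cartridge generations an LTO-GEN drive
# should be able to write and read (hypothetical helper, not a real tool).
lto_compat() {
    gen=$1
    write_from=$((gen - 1))
    case $gen in
        8|9) read_from=$((gen - 1)) ;;   # LTO-8/9 only go one generation back
        *)   read_from=$((gen - 2)) ;;   # general rule: read two generations back
    esac
    # There is no generation below LTO-1
    if [ "$write_from" -lt 1 ]; then write_from=1; fi
    if [ "$read_from" -lt 1 ]; then read_from=1; fi

    printf 'write:'
    g=$write_from
    while [ "$g" -le "$gen" ]; do printf ' LTO-%d' "$g"; g=$((g + 1)); done
    printf '; read:'
    g=$read_from
    while [ "$g" -le "$gen" ]; do printf ' LTO-%d' "$g"; g=$((g + 1)); done
    printf '\n'
}

lto_compat 6   # write: LTO-5 LTO-6; read: LTO-4 LTO-5 LTO-6
```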

If you want to buy a factory-new LTO-8 drive (the newest generation that is readily available at the time of writing) then you are looking at around £3,000. You can often find drives much cheaper on eBay if you can tolerate used ones (often thousands of pounds cheaper). Drives do suffer wear and tear (we will get to that later), so a used drive is not quite the same product as a new one. LTO drives do, however, degrade much more slowly than a standard hard disk would.

To illustrate the cost per TB, here is a calculator that you can plug your own numbers into (or use mine from the time of writing) to figure out if tape is actually cheaper for you (assuming you are okay with giving up “instant” access to your data).

The calculator takes the following inputs:

LTO data (defaults are for LTO-7): cost of the LTO tape drive, cost of one LTO tape, and uncompressed LTO tape capacity (TB).

HDD data: size of each HDD (TB), cost per disk, capacity of the HDD chassis (slots), and cost of the HDD chassis.
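
The same comparison can be sketched in a few lines of shell. This is only a guess at the arithmetic behind such a calculator: it assumes you buy one tape drive, whole tapes, whole disks and whole chassis, and it ignores power, spares and redundancy. The function names and the £500, 12-slot chassis are made-up placeholders; the tape and disk prices are derived from the per-TB figures quoted earlier (12TB × £7.40 and 12TB × £18.00).

```shell
#!/bin/sh
# lto_cost DATA_TB DRIVE_COST TAPE_COST TAPE_TB
# Total cost of one drive plus enough whole tapes to hold DATA_TB.
lto_cost() {
    awk -v data="$1" -v drive="$2" -v tape="$3" -v cap="$4" 'BEGIN {
        tapes = int(data / cap); if (tapes * cap < data) tapes++   # round up
        printf "%.2f\n", drive + tapes * tape
    }'
}

# hdd_cost DATA_TB DISK_TB DISK_COST CHASSIS_SLOTS CHASSIS_COST
# Total cost of enough whole disks, plus enough chassis to house them.
hdd_cost() {
    awk -v data="$1" -v size="$2" -v disk="$3" -v slots="$4" -v chassis="$5" 'BEGIN {
        disks = int(data / size);  if (disks * size < data)  disks++   # round up
        boxes = int(disks / slots); if (boxes * slots < disks) boxes++ # round up
        printf "%.2f\n", disks * disk + boxes * chassis
    }'
}

# 120TB on LTO-8: £3,000 drive, £88.80 per 12TB tape
lto_cost 120 3000 88.80 12      # prints 3888.00
# 120TB on disk: £216 per 12TB disk, hypothetical £500 chassis with 12 slots
hdd_cost 120 12 216 12 500      # prints 2660.00
```

With these example numbers the up-front cost of the drive still dominates at 120TB, which is why the amount of data you plan to store matters so much.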

LTO Drive downsides

The drives are noisy, so if you are planning to work next to your LTO drive you should be prepared for noise and vibration. Here is an example of my LTO-6 drive:

Tape cartridges physically stack reasonably well, and if you are buying enough tapes your supplier can barcode them for you so you can easily tell them apart.

If you are planning to use your tapes for long-term archiving, tapes supposedly have a 30 year life (obviously no LTO tape has been around for 30 years to verify this claim), but only if you keep them at their preferred temperature and humidity. The packaging on the tape will generally state what this sweet spot is. My LTO-6 tapes suggest 16C to 25C at 20% - 50% RH.

When buying tapes, as long as the tape generation matches the drive (or falls within the drive’s backward compatibility, as mentioned before) you should be fine. The tape brand and the drive brand do not have to match.

While the write speeds of tape drives are generally quite fast (100MByte/s+ for LTO 4-6, 300MByte/s+ for LTO 7-9), drives can slow down in one direction of tape travel as they age (or as the tape cartridge itself gets older). In some cases, when the drive is writing slower in one direction it will also record less data, meaning the amount of data you can fit on a cartridge may shrink with the drive’s age. You can test for this once you have a drive, as this sort of degradation is generally detected as a drive failure.

Used tapes?

A box of used LTO5 tape

If you are looking for cheap tapes, you may be able to find older-generation LTO tapes (4, 5, 6) from IT equipment recyclers for immensely cheap. I have a stack of 150 LTO-5 tapes that I bought for less than 200 GBP, meaning that the cost per TB was immensely low (hovering around 1.10 GBP per TB).

Not all used tapes are the same, however! Depending on the competency of your recycler, the tapes may be sold unusable. LTO tapes carry special factory-recorded data that the drive uses to align the tape head. If a cartridge has been magnetically erased, that data is gone and the tape is permanently useless, no matter what. You can find out more on mass erasing LTO tapes here.


I have been convinced anyway; I want to buy an LTO drive!

Ok fine. Here is what you need to know if you want to buy a drive.

For this example, we will assume that you are buying an LTO-5 or better drive, since LTO-4 and lower have stranger formats and tools that I have no experience with.

First, the drive type. You can generally buy 4 types of LTO drive:

Types of LTO drive

In general, I would always recommend that you go for an external SAS drive. These take the least effort to get working. They have a C14 power inlet (taking the same C13 lead as most desktops) and a SAS SFF-8088 (rolls off the tongue, I know!) socket on the back. This connects to a PCIe SAS card (generally cheap) in your machine, which will trivially auto-detect the drive.

Other options include the half-height drive (which will fit in a 5.25 inch bay normally used for an optical drive). It has an internal SAS SFF-8482 connector on the back, which looks like a SATA port but is not one.

The final options in the diagram above are (as far as I understand) for autoloaders (better known as tape libraries). The full-height units often come with SAS SFF-8482 connectors as well, while the sleeker ones (bottom of the image) most often come with Fibre Channel (FC) SFP connectors. This is because most tape libraries use FC as the transport between the machine putting data on the tape and the physical drive itself. Fibre Channel cards go pretty cheap on the second-hand market, and I’ve covered some tricks with them before.

You may find that the non-external types need decent cold airflow to work correctly. This will not be an issue if you are integrating a drive into a server chassis or a dedicated tape chassis, but it might be a problem if you are installing it into a regular ATX chassis.

Tape drive and cartridge health

Most tape drives seem to be very similar (if not the same??) to each other, so almost all of the tools for working with them are the same across brands. There are two holy tools I use on Linux for debugging tape drives and tape cartridges: the IBM Tape Diagnostic Tool (ITDT) and the xTalk tool. You will want to download a copy of both.

I personally find that ITDT is great for checking if the drive is working right, and xTalk is great for dumping out information on drive stats and media health.

Here is an example of xTalk’s “Dump All Pages” output for drive health:

Log Page 14
14 - Device Statistics Log

Lifetime media loads:  1300
Lifetime cleaning operations: 41
Lifetime power on hours: 41194
Lifetime media {tape} motion hours: 9425
Lifetime meters of tape motion: 72926447
Media motion hours since last cleaning: 44
Media motion hours since second to last cleaning: 50
Media motion hours since third to last cleaning: 93
Lifetime power cycles: 51
Volume loads since last paramater reset: 4
Hard write errors: 0
Hard read errors: 0
Duty cycle sample time: 2888989
Read duty cycle: 3
Write duty cycle: 0
Activity duty cycle: 4
Volume not present duty cycle: 90
Ready duty cycle: 6
Drive manufacturer serial number: xxx  
Drive serial number: xxx  
Medium removal prevented: 0
Maximum recommended mechanism temperature exceeded: 0

When buying a used drive, the Lifetime media {tape} motion hours is a useful metric to gauge the wear on the drive head.
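
If you save that dump to a text file, a few lines of awk can pull out the numbers worth comparing between drives. This is only a sketch: the function name drive_wear and the file name are made up, it assumes the field labels appear exactly as in the output above, and it does not run xTalk for you.

```shell
#!/bin/sh
# drive_wear FILE: summarise head wear from a saved xTalk "Dump All Pages" log.
drive_wear() {
    awk -F': *' '
        /^Lifetime media .* motion hours/         { motion = $2 }
        /^Lifetime power on hours/                { power  = $2 }
        /^Media motion hours since last cleaning/ { since  = $2 }
        END {
            pct = (power > 0) ? 100 * motion / power : 0
            printf "motion hours: %d (%.0f%% of power-on time)\n", motion, pct
            printf "hours since last cleaning: %d\n", since
        }' "$1"
}

# Example (hypothetical file name):
#   drive_wear xtalk-drive-dump.txt
```

For the drive shown above this reports 9425 motion hours, about 23% of its power-on time, and 44 motion hours since the last clean.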

xTalk can also dump the usage data stored on the RFID chip inside the cartridge. This covers things like how many times the cartridge has been put inside a drive, how many times the tape has passed over a drive head, and what the lifetime reads and writes are on the cartridge itself. This is not too different to S.M.A.R.T. data on a hard drive.

LTO RFID Chip

The xTalk output for a cartridge looks like this (for a tape that has mild issues):

Log Page 17
	17 - Volume Statistics Log

	Page Valid                                        : 1
	Thread Count                                      : 26
	Total data sets written                           : 1579318
	Total write retries                               : 57
	Total unrecovered write errors                    : 0
	Total suspended writes                            : 2
	Total fatal suspended writes                      : 0
	Total data sets read                              : 1060455
	Total read retries                                : 1171
	Total unrecovered read errors                     : 3
	Last mount unrecovered write errors               : 0
	Last mount unrecovered read errors                : 0
	Last mount megabytes written                      : 0
	Last mount megabytes read                         : 1676
	Lifetime megabytes written                        : 3904137
	Lifetime megabytes read                           : 2621487
	Last load write compression ratio                 : 0
	Last load read compression ratio                  : 99
	Medium mount time                                 : 0
	Medium ready time                                 : 0
	Total native capacity                             : 1520000
	Total used native capacity                        : 337522
	Volume serial number                              : MF0WU3YFJ4                      
	Tape lot identifier                               : G5AA135D
	Volume barcode                                    : A11952L5                        
	Volume manufacturer                               : FUJIFILM
	Volume license code                               : U107
	Volume personality                                : Ultrium-5
	Write Protect                                     : 0
	WORM                                              : 0
	Maximum recommended tape path temperature exceeded: 0
	BOM passes                                        : 922
	Middle of tape passes                             : 463
	First encrypted logical object identifiers
		Partition 0                                  : FFFFFFFFFFFFh
	First unencrypted logical object on the EOP side of
	the first encrypted logical object identifier
		Partition 0                                  : FFFFFFFFFFFFh
	Approximate native capacity of partition
		Partition 0                                  : 1520000
	Approximate used native capacity of partition
		Partition 0                                  : 337522
	Approximate remaining native capacity to early warning of partitions
		Partition 0                                  : 1182484
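
When sorting through a pile of used cartridges it helps to boil this page down to a few ratios. The sketch below assumes the page has been saved to a text file with the labels exactly as shown; tape_health is a made-up name, and the script deliberately prints no pass/fail verdict because sensible thresholds are a judgement call.

```shell
#!/bin/sh
# tape_health FILE: summarise retry rates and fill level from a saved
# xTalk Volume Statistics (log page 17) dump.
tape_health() {
    awk '
        /Total data sets written/       { dsw  = $NF }
        /Total write retries/           { wr   = $NF }
        /Total data sets read/          { dsr  = $NF }
        /Total read retries/            { rr   = $NF }
        /Total unrecovered read errors/ { ure  = $NF }
        /Total native capacity/         { cap  = $NF }
        /Total used native capacity/    { used = $NF }
        END {
            printf "write retries per 1000 data sets: %.2f\n", (dsw > 0) ? 1000 * wr / dsw : 0
            printf "read retries per 1000 data sets: %.2f\n",  (dsr > 0) ? 1000 * rr / dsr : 0
            printf "unrecovered read errors: %d\n", ure
            printf "capacity used: %.1f%%\n", (cap > 0) ? 100 * used / cap : 0
        }' "$1"
}

# Example (hypothetical file name):
#   tape_health xtalk-volume-dump.txt
```

For the cartridge above this gives roughly 0.04 write retries and 1.10 read retries per 1000 data sets, 3 unrecovered read errors, and 22.2% of the native capacity used.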

Cleaning

Drives need cleaning from time to time. This can be as infrequent as every 100 or so “motion hours”, or more often. Generally, I do it whenever the drive has more than 50 “motion hours” since its last clean and its performance is questionable.

Cleaning requires a special cartridge that costs about the same as a new tape. As far as I know, these cartridges are compatible with all drives. They are generally “good” for about 50 cleans.

Compression and Encryption

It is wise to encrypt your data going on to tape. Since a tape takes a very long time to erase (you would have to overwrite the whole tape), disposing of one can be risky even if it’s slightly broken. LTO-4 and later drives have built-in hardware encryption, however I would steer away from using it and instead just encrypt the data yourself (possibly with the tool I helped make called age!).

Like most things, you should also consider compressing your data, and doing so before you encrypt it, since encrypted data no longer compresses. LTO tape capacities are often quoted as a “compressed capacity”, which is a little cheeky since it assumes a compression ratio of 2:1 or better. That is not at all likely to be true if you are writing video, images or other already-compressed data to the tape. I generally run my data through zstd to compress and then age to encrypt. Both are quite fast and I’ve not found them to impede performance noticeably.
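
A minimal sketch of such a compress-then-encrypt pipeline is below. The function names, paths, recipient key and key file are all placeholders rather than the exact commands used for this post; the flags themselves (zstd -T0 to use all cores, -c for stdout, -d to decompress; age -r to encrypt to a recipient, -d -i to decrypt with an identity file) are the standard ones for those tools.

```shell
#!/bin/sh
# pack_stream DIR RECIPIENT
# tar a directory, compress it with zstd, then encrypt it with age.
# The result goes to stdout so it can be piped on towards the tape.
pack_stream() {
    tar -cf - -C "$1" . | zstd -T0 -c | age -r "$2"
}

# unpack_stream IDENTITY_FILE DEST
# The reverse: read the stream from stdin, decrypt, decompress, unpack.
unpack_stream() {
    age -d -i "$1" | zstd -d -c | tar -xf - -C "$2"
}

# Example (hypothetical paths and key material):
#   pack_stream /srv/photos "$AGE_RECIPIENT" | mbuffer -P 80 -m 6G -s 524288 -o /dev/nst0
#   dd if=/dev/nst0 bs=512k | unpack_stream ~/age-key.txt /srv/restore
```

Compression has to come first in the pipeline: once the data has been through age it looks random, so neither zstd nor the drive's own compression can shrink it.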

Actually writing data to the tape

root@testtop:~# ls -alh /dev/tape/by-id/
total 0
drwxr-xr-x 2 root root 120 Dec 27 17:10 .
drwxr-xr-x 4 root root  80 Dec 27 17:10 ..
lrwxrwxrwx 1 root root   9 Dec 27 17:10 scsi-xxxxxx -> ../../st0
lrwxrwxrwx 1 root root  10 Dec 27 17:10 scsi-xxxxxx-nst -> ../../nst0

Tape drives show up in Linux as two character devices, /dev/st0 and /dev/nst0 (the number depends on how many tape drives the system has detected). Unless you are writing one huge thing (i.e. the whole tape) in one go, you will want to use /dev/nst0, as it does not automatically rewind the tape when the program writing the data closes the file descriptor.

Unlike disks, these devices do not enjoy seeking of any kind, so you generally end up writing streaming file formats to tape. Unsurprisingly, this is exactly what the Tape ARchive (.tar) format is for. If you are unable to use tar for whatever reason and really need something that looks like a file system, there is LTFS. I have never attempted to use LTFS myself, and would likely only attempt it if I were running Windows.
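
As a minimal sketch, writing a directory straight to the drive with tar could look like the following. The function names and the TAPE variable are made up for illustration. The -b 1024 blocking factor is a choice made for this sketch: tar counts it in 512-byte units, so 1024 gives 512KiB records, in line with the large block sizes tape prefers.

```shell
#!/bin/sh
# Device to use; the non-rewinding node, so several archives can follow each other.
TAPE=${TAPE:-/dev/nst0}

# tape_write_dir DIR: write DIR to the tape as one tar archive
# using 512KiB records (1024 x 512 bytes).
tape_write_dir() {
    tar -b 1024 -cf "$TAPE" -C "$1" .
}

# tape_list: list the archive at the current tape position.
# Read with the same blocking factor that was used to write.
tape_list() {
    tar -b 1024 -tvf "$TAPE"
}

# Example (hypothetical directory):
#   tape_write_dir /srv/photos
#   mt -f "$TAPE" rewind
#   tape_list
```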

Need for speed

Networking faster than 1Gbit/s is useful if your data arrives at the tape machine over the network. If you can get cheap 10Gbit/s Ethernet (even if it’s just point to point), it might be worth it.

Writing out a full tape can take quite a long time, even at full speed. As LTO capacity has gone up, the write speed has not kept pace: an LTO-5 tape takes around 3 hours to write, but an LTO-8 tape takes a whopping 9 hours!
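
Those figures fall out of a simple division of capacity by speed. The sketch below assumes native (uncompressed) speeds of 140MB/s for LTO-5 and 360MB/s for LTO-8, which are the commonly quoted figures rather than measurements from this post, and the function name is made up.

```shell
#!/bin/sh
# full_tape_hours CAPACITY_TB SPEED_MB_PER_S
# Hours needed to fill a tape when streaming at a constant speed.
full_tape_hours() {
    awk -v tb="$1" -v mbs="$2" 'BEGIN { printf "%.1f\n", tb * 1000000 / mbs / 3600 }'
}

full_tape_hours 1.5 140   # LTO-5, 1.5TB: prints 3.0
full_tape_hours 12 360    # LTO-8, 12TB:  prints 9.3
```

For comparison, a 1Gbit/s link tops out at 125MB/s, which is below both of those drive speeds.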

Block size woes

Another issue to keep in mind with tape is keeping the drive well fed with large blocks of data. While most standard disks have block sizes of 512 bytes or 4096 bytes, tapes enjoy a much larger block size of 512KB or higher. In addition, a drive (and the tape inside it) takes more wear and tear if it has to keep stopping and restarting. Running your backups through mbuffer, which buffers a chunk of your data in RAM before writing it out to tape, is therefore a good way to “smooth out” the gaps (even short ones) in data being delivered to the drive.

Later-generation drives are able to deal with slower input rates by physically slowing down the tape. However, I generally try to avoid this happening and instead have mbuffer hold a lot of data (6GB) and write out as much as it can at full speed. Here is what I use (to buffer incoming network data):

ncat -l -p 1337 |  mbuffer -P 80 -m 6G -s 524288 -o /dev/nst0

It is worth pointing out that if you write to tape with large block sizes, you may find you are unable to read the tape back, getting a cryptic “Cannot allocate memory” error. This is because the program reading from the tape device is not using a big enough read buffer. You can work around this by reading with dd, setting bs=512k, and piping the output into the desired program.
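
As a small sketch of that workaround, the function below (the name and the TAPE variable are made up) reads the next section of the tape with a 512KiB buffer and hands it to whatever is on the other side of the pipe.

```shell
#!/bin/sh
# Device to read from; the non-rewinding node by default.
TAPE=${TAPE:-/dev/nst0}

# tape_read_file: stream the next "file" on the tape to stdout,
# using 512KiB reads so large tape blocks fit in the buffer.
tape_read_file() {
    # dd reports its transfer statistics on stderr; hide them here
    dd if="$TAPE" bs=512k 2>/dev/null
}

# Examples:
#   tape_read_file | tar -tvf -       # list a tar archive stored on the tape
#   tape_read_file > section.bin      # or just save the raw section
```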

If you are streaming more than one thing to a tape, you will want to write an EOF marker (a filemark) between them. This lets you read out a whole “file” from the tape device (think dd) in one go: when that section ends, the reading program exits cleanly, and when you start it again (assuming you are using /dev/nst0) you get the next section. You can write the marker using the mt command: mt -f /dev/nst0 weof 1.

root@testtop:~# echo "Hello world!" > /dev/nst0
root@testtop:~# mt -f /dev/nst0 weof 1
root@testtop:~# dmesg > /dev/nst0
root@testtop:~# mt -f /dev/nst0 weof 1
root@testtop:~# mt -f /dev/nst0 rewind
root@testtop:~# dd if=/dev/nst0 bs=524288 status=progress > FileA 
0+1 records in
0+1 records out
13 bytes copied, 0.0106532 s, 1.2 kB/s
root@testtop:~# dd if=/dev/nst0 bs=524288 status=progress > FileB
0+0 records in
0+0 records out
0 bytes copied, 0.00441687 s, 0.0 kB/s
root@testtop:~# dd if=/dev/nst0 bs=524288 status=progress > FileB
0+15 records in
0+15 records out
61257 bytes (61 kB, 60 KiB) copied, 0.0415833 s, 1.5 MB/s
root@testtop:~# head FileA
Hello world!
root@testtop:~# head FileB
[    0.000000] Linux version 5.7.8-benjojo (root@airmail) (gcc version 8.3.0 (Debian 8.3.0-6), GNU ld (GNU Binutils for Debian) 2.31.1) #1 SMP Wed Jul 15 20:29:31 BST 2020
...

Note the zero-byte read between FileA and FileB. A likely explanation is that the Linux st driver already writes a filemark of its own when the device is closed after a write, so each explicit weof adds a second one, and the extra filemark reads back as an empty file.

Once you are done with a tape, you can ask the drive to rewind and eject it by running mt -f /dev/nst0 offline.


If you want to stay up to date with the blog you can use the RSS feed or you can follow me on Twitter

Until next time!