[geeks] Disks: recommendations?

Jonathan Patschke jp at celestrion.net
Sun Nov 1 14:07:31 CST 2020


On Fri, 30 Oct 2020, Mouse wrote:

>> Unlike spinning media, though, SSDs do now have mechanical
>> degradation at rest, as there's no lubrication to dry up.
>
> s/now/not/ I assume?

Correct.

> Maybe not, but they do have stored charge to leak.  It's not
> *mechanical* degradation, but it amounts to something similar
> operationally.

It's not as if it's a grid of plate capacitors.  Anything can fail, but
there's not anywhere for the charge to leak *to* at rest.

>> SSD lifetimes are rated on total-device-writes-per-day over a given
>> number of years, assuming an end-to-end wear pattern.
>
> ...which is completely unrealistic in almost every real scenario.

Not when you consider where SSDs are placed and how they're generally
used.  In a modern copy-on-write filesystem, full-device writes are a
closer approximation than rewriting hotspots.  The same is true of
tablespace storage in a database, the fast caching layer of a hierarchical
storage system, short-lived storage for containers, etc.
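
For a sense of scale, here's a back-of-envelope conversion of such a
rating into total bytes written.  The figures (a 1 TB device, 1 DWPD,
a 5-year warranty) are made up for illustration, not taken from any
particular datasheet:

    /* Rough sketch: convert a hypothetical drive-writes-per-day (DWPD)
     * rating into total terabytes written over the warranty period. */
    #include <stdio.h>

    int
    main(void)
    {
        const double capacity_tb = 1.0; /* hypothetical device size, TB */
        const double dwpd        = 1.0; /* hypothetical rated DWPD      */
        const double years       = 5.0; /* hypothetical warranty period */

        /* 1 TB/day * 365 days/year * 5 years = 1825 TB written. */
        printf("rated endurance: about %.0f TB written\n",
               capacity_tb * dwpd * 365.0 * years);
        return 0;
    }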

>> If you totally fill a device, and then rewrite the last block
>> infinitely, you'll exhaust the spare block pool faster, but that's a
>> pathological use case.
>
> Not all that pathological; it's a lot like what will happen if it's
> used with a filesystem that doesn't do TRIM.

Which is why one ought not use such filesystems with flash memory.  Some
controllers are smart enough to treat an all-0s block write as a DSM
command (indeed, many of the embedded controllers for CF/SD cards in
camera applications do this), but the command exists for good reasons.

>> Remember to issue discard/dsm (a.k.a. "trim") commands for the unused
>> portions of the media, and they'll last ages.
>
> This sounds like "you need SSD-aware filesystem code to get decent
> lifetime out of them".  If so, that's another reason for me to avoid
> them; FFS *long* predates TRIM.

But it's not hard to add DSM to FFS.  The FreeBSD and NetBSD folks have
already done it.  When you unmap a block, send a DSM opcode with that LBA
(and an optional run-length).  Simple!
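
As a rough illustration of what "send a DSM for that LBA range" looks
like from userland, here's a minimal sketch using FreeBSD's DIOCGDELETE
ioctl, which the kernel turns into a BIO_DELETE (TRIM/UNMAP) for the
underlying device.  The device path and byte range are placeholders;
this is the idea, not the actual FFS code:

    /* Minimal sketch: hint to a FreeBSD block device that a byte range
     * is unused, via DIOCGDELETE (the kernel issues a BIO_DELETE, i.e.
     * TRIM/DSM).  Device path and range are hypothetical. */
    #include <sys/types.h>
    #include <sys/ioctl.h>
    #include <sys/disk.h>

    #include <err.h>
    #include <fcntl.h>
    #include <stdio.h>

    int
    main(void)
    {
        const char *dev = "/dev/da0";          /* hypothetical device */
        off_t range[2] = { 1024 * 1024,        /* offset: 1 MiB       */
                           64 * 1024 * 1024 }; /* length: 64 MiB      */

        int fd = open(dev, O_RDWR);
        if (fd == -1)
            err(1, "open %s", dev);

        /* This is what a filesystem does (in-kernel) when it frees a
         * block: tell the device the LBAs no longer hold live data. */
        if (ioctl(fd, DIOCGDELETE, range) == -1)
            err(1, "DIOCGDELETE");

        printf("discarded %lld bytes at offset %lld on %s\n",
               (long long)range[1], (long long)range[0], dev);
        return 0;
    }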

FFS predates quite a lot of things in modern storage.  It still assumes
that sector remapping doesn't happen, that drives don't have multiple
storage-density zones, etc.

> And...is there any reasonably simple way someone like me, not in the
> storage industry in any form, can tell whether I'm looking at something
> like that or something worth putting data on?

Sure.  Buy Intel, Micron, Western Digital, or Samsung.  Or look at the
datasheet for a device and see if it uses the same controller or flash as
a well-recommended device.

> I was, naïvely (and apparently unrealistically), expecting that the
> firmware would be able to tell whether it's got good blocks and, when it
> runs out of good space, would know it.

You don't know what space is good until it fails to write correctly or
fails to read back the data you sent to it.  This is a problem with
all storage media.  Either you have to preassign a write-endurance to each
block and forbid overwriting past that count, or flag the error upon read.
Neither is a great compromise.

It's also a big part of why ZFS is such a good technology stack.  Pity
about the patent and licensing quagmire.
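
To make the "flag the error upon read" option concrete: keep a checksum
with every block when it's written and recompute it on read-back; a
mismatch is the only way you learn the space wasn't good after all.
What follows is just that idea (the kind of end-to-end verification ZFS
does with its block checksums), not any real on-disk format:

    /* Minimal sketch of read-back verification: checksum a block at
     * write time, recompute at read time, and flag any mismatch. */
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Simple Fletcher-style checksum over a block of bytes. */
    static uint64_t
    block_checksum(const uint8_t *buf, size_t len)
    {
        uint64_t a = 0, b = 0;
        for (size_t i = 0; i < len; i++) {
            a += buf[i];
            b += a;
        }
        return (b << 32) | (a & 0xffffffff);
    }

    int
    main(void)
    {
        uint8_t block[512];
        memset(block, 0xA5, sizeof(block));

        /* At write time: remember the checksum of what we sent. */
        uint64_t written_sum = block_checksum(block, sizeof(block));

        /* Simulate silent corruption of the stored copy. */
        block[100] ^= 0x01;

        /* At read time: recompute and compare. */
        uint64_t read_sum = block_checksum(block, sizeof(block));
        if (read_sum != written_sum)
            fprintf(stderr, "checksum mismatch: block failed read-back\n");

        return 0;
    }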

>> For what it's worth, I off-site with LTO, but that's only because of
>> the price of SSDs and speed doesn't matter for disaster-recovery in
>> my use case.
>
> LTO?  Isn't that a tape technology?

It is.

> I'd be tempted, but my (severely limited, of course) experience is that
> tape media is even more likely to lose bits than disk drives - and is
> substantially more expensive in dollars-per-byte as well.

I've read 9-track tapes that are older than I am, and DLTs that were
written last century, and they still pass the checksum.  People on this
list still routinely recover machines from QICs written in the 1980s.
It's all in how they're stored.  There were a few really *awful* tape
formats (4mm and 8mm) that were also unfortunately the most popular for
low-end systems, but their fragility was related to how much physical
contact the tape had during operation--not the medium itself.

LTO is expensive upfront, but if someone drops my box of LTO tapes, it's
unlikely any of them will be damaged, and there's no disk-drive equivalent
to a tape robot.  The expense keeps me a few generations behind current,
but I don't mind taking a small box of tapes to the bank instead of just
one.

By buying used, I got a tape robot and 16 tapes (enough for three rotating
sets of offsites) for the cost of 2 or 3 hard drives.

>> I'm slowly migrating from spinning rust to SSDs for active data
>> because it's hard to argue with a seek time of nearly zero.
>
> Oh, I don't argue with it.  It's just that, for me, performance isn't
> important enough to override the factor-of-over-two price difference.

For data at rest, no.  For live data, the decreased power draw, greater
resistance to vibration, reduced weight, and security features are all
compelling even before we talk about eliminating seek time, and (in the
case of NVMe) ridiculously parallel storage operation.

-- 
Jonathan Patschke   |   "The more you mess with it, the more you're
Austin, TX          |    going to *have* to mess with it."
USA                 |                            --Gearhead Proverb

