Users should update the firmware to prevent a crash bug occurring after 32,768 hours of use.
Hopefully the patch doesn't make it fail at 65,536 hours of use (which would be outside warranty).
edit: Ooh, I speculated an overflow as soon as I saw the 32768 number, guess that makes me an expert!
Last edited by DanceswithUnix; 27-11-2019 at 02:24 PM.
It isn't usually the overflow that directly kills your code though; it is usually some secondary effect, like using the resulting -32768 value from the overflow to search/index into a table which doesn't have any entries suitable for negative numbers. Given that power-on hours isn't usually considered that important a metric, I can imagine it not being that heavily tested either.
OTOH, if it was something like using the top bit as a debug flag then someone needs to be taken out and shot
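To illustrate the secondary-effect idea, here is a minimal sketch in Python. It is purely hypothetical: `to_int16`, `wear_bucket` and the 8-entry table are made up for illustration and say nothing about what the real firmware does. It only shows how a signed 16-bit hour counter wraps to -32768 and then produces an index no table has an entry for.

```python
def to_int16(value: int) -> int:
    """Wrap an integer the way a signed 16-bit (two's complement) variable would."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def wear_bucket(hours: int, table_len: int = 8) -> int:
    """Map power-on hours to an index into a hypothetical 8-entry table.

    Python would silently accept a negative list index, so the range is
    checked explicitly. Firmware written in C without such a check would
    simply read whatever sits in memory before the table.
    """
    index = hours // 4096  # 32768 hours split into 8 buckets (assumed layout)
    if not 0 <= index < table_len:
        raise IndexError(f"no table entry for index {index} (hours={hours})")
    return index


last_good = to_int16(32767)      # still positive
wrapped = to_int16(32767 + 1)    # one more hour wraps to -32768

print(wear_bucket(last_good))    # valid index into the table
try:
    wear_bucket(wrapped)
except IndexError as exc:
    print("secondary failure:", exc)
```

The overflow itself is harmless here; the damage only shows up once the wrapped value is used as an index.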
Ha ha, chemical sheds and the ditches!
It is very interesting that the drive is completely inoperable/irrecoverable when this value is hit, which definitely follows your logic of the secondary effect. Maybe the time is used in a SMART calculation, SMART crashes and takes the controller with it?
Edit: to qualify my thought, the flipped bit would make a negative time so the calculations, if uncaught, will just drop out of range. Why they're counting time using a signed 16-bit integer is a little bit odd...
Last edited by Tabbykatze; 27-11-2019 at 04:31 PM.
Thinking about it, there is a good chance they aren't, and this isn't an overflow...
Imagine you store that value in a word of flash, then every hour you erase the page it is in and re-write it with the new value one higher. Counting through a full 16-bit range that way is 65535 writes to a page just to store one thing, where a page has an endurance in modern flash devices of about 3000 writes. Just to count.
Now imagine you choose a 4 KB page of flash, that's 32768 bits in total. Flash is written by erasing an entire page to all 1 bits (as in each byte 0xff) and then clearing the bits you want cleared to get the value you want stored. So you can actually zero a bit in flash at any time without erasing it first (flash programming fun fact!); you only need to erase to flip a zero back into a one. On first ever power up you erase the page so all the bits are 1s. Every hour, you clear one bit. Now you get about 3.7 years of counting hours (32768 / 8760) before you have to erase to count the next 3.7 years, so your 3000 erase endurance gets you roughly 11,000 years of counting. Handling the 3.7 year boundary would take some careful testing though (the number of cycles you had been through would have to be stored elsewhere).
That's probably how I would do it anyway, and given storage devices use a 4K filesystem page that fits nicely.
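The bit-clearing scheme described above can be sketched as a small Python simulation. This is an illustration under stated assumptions, not how any real drive firmware works: the class `FlashPageCounter`, its `cycles` field (standing in for the cycle count "stored elsewhere") and the low-bit-first clearing order are all invented for the example.

```python
PAGE_BYTES = 4096
PAGE_BITS = PAGE_BYTES * 8  # 32768 bits, one per hour


class FlashPageCounter:
    """Simulated hour counter that clears one flash bit per hour."""

    def __init__(self) -> None:
        self.page = bytearray(b"\xff" * PAGE_BYTES)  # erased page: all 1s
        self.cycles = 0  # completed pages; assumed to live somewhere else

    def bits_cleared(self) -> int:
        # Bits are cleared in order, so the page is a run of 0x00 bytes,
        # then at most one partly cleared byte, then untouched 0xff bytes.
        rest = self.page.lstrip(b"\x00")
        full_bytes = PAGE_BYTES - len(rest)
        partial = 8 - bin(rest[0]).count("1") if rest else 0
        return full_bytes * 8 + partial

    def tick(self) -> None:
        """Record one more hour."""
        cleared = self.bits_cleared()
        if cleared == PAGE_BITS:
            # Page is full: this is the only point where an erase is needed.
            self.page[:] = b"\xff" * PAGE_BYTES
            self.cycles += 1
            cleared = 0
        byte, bit = divmod(cleared, 8)
        self.page[byte] &= 0xFF & ~(1 << bit)  # 1 -> 0 needs no erase

    def hours(self) -> int:
        return self.cycles * PAGE_BITS + self.bits_cleared()


counter = FlashPageCounter()
for _ in range(PAGE_BITS + 1):  # run one hour past the page boundary
    counter.tick()
print(counter.hours(), counter.cycles)
```

The interesting case is the tick that finds the page full: the erase, the cycle increment and the first bit of the new page all have to happen consistently, which is exactly the boundary that would need careful testing.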
Hmm, so now I don't think it is an overflow. Will have to hand my expert title back
someone found out, they had to do something about it... simple as that
Why don't they just build the dam higher? Or stop putting water in the drive full stop? Sounds a bit silly to me.