9

I'm copying 100 GB from a Windows 7 workstation to 2 external drives (if you have only one backup, then you have none). All files have an MD5 checksum. I've verified all MD5 checksums on the external disks after the copy, and they were all correct.

I have a D:\ partition that holds all the files I want to back up: mp3s, documents, videos, etc. I'm moving to a Mac, so software preferences aren't needed. My bookmarks are at delicious.com.

My question is: is this really a safe approach to avoid incorrect copies or corrupted files on my external disks? I'm going to format the machine and give it away to my brother, so I copied all files this way.

2 Answers

5

If the checksums match, you can be very nearly 100% sure the files were copied correctly, provided the external drive does not fail.
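For anyone who wants to repeat the check by script, here is a minimal sketch of the verification step in Python. The function names `file_md5` and `compare_trees` are made up for illustration, and the sketch assumes both the source folder and the backup are still readable; it is not the tool the asker used.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source_root, backup_root):
    """Return (missing, mismatched) lists of relative paths.

    missing:    files present under source_root but absent from backup_root
    mismatched: files present in both places whose MD5 digests differ
    """
    source_root, backup_root = Path(source_root), Path(backup_root)
    missing, mismatched = [], []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = backup_root / rel
        if not dst.is_file():
            missing.append(rel.as_posix())
        elif file_md5(src) != file_md5(dst):
            mismatched.append(rel.as_posix())
    return missing, mismatched
```

Calling `compare_trees("D:/", "E:/backup")` (both paths are placeholders) should return two empty lists when every file arrived intact. The `missing` list also covers the other risk raised in this thread: files that were never copied at all.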

Pylsa
KCotreau
  • @KCotreau: That's why I copied to TWO different external drives, and I'm going to copy the really important files to DVDs – Somebody still uses you MS-DOS Jun 05 '11 at 19:34
  • 11
    Actually, if the checksums match you're 99.9999...% safe, but that's probably close enough to 100% for most purposes. The bigger problem is that you may have missed some important files entirely, or be losing some important meta-info that ties things together. There is no good way to back up a Windows box with reasonable confidence that it can be restored to working order (at least not without considerable effort). – Daniel R Hicks Jun 05 '11 at 19:39
  • @DanH: I don't need the meta info. It's just pictures (EXIF is internal), mp3s (ID3 tags, internal as well) and some documents. What I'm looking for is MD5 flaws in this scenario (I know MD5 is already flawed with respect to collisions, but for detecting corruption in this situation I think it's still useful, since it's fast and well supported). – Somebody still uses you MS-DOS Jun 05 '11 at 19:53
  • 3
    You have a chance of `2^-128` that you generate the same hash for another file... :) – Tamara Wijsman Jun 05 '11 at 19:59
  • @DanH: Actually, there is a really good way to back up a Windows drive -- I use this solution all the time (it only copies used sectors by default, and takes advantage of Windows' built-in snapshot technology if it's available so you can perform the copy while using the computer): http://www.drivesnapshot.de/ – Randolf Richardson Jun 05 '11 at 20:12
  • 4
    The chance of two *valid* files having the same hash through a bad copy is near zero, so if the file looks right, and the hashes match, that's well beyond reasonable doubt! – Phoshi Jun 05 '11 at 21:38
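To put a number on the `2^-128` figure from the comments, here is a rough back-of-the-envelope sketch. It assumes the corruption is random rather than deliberate, so that a damaged file's MD5 behaves like a uniformly random 128-bit value, and it uses a simple union bound over all files. The function name `collision_bound` and the file count of one million are made up for illustration.

```python
from fractions import Fraction


def collision_bound(file_count, digest_bits=128):
    """Upper bound on the chance that at least one of file_count randomly
    corrupted copies still produces the same digest as its original."""
    return file_count * Fraction(1, 2 ** digest_bits)


# Hypothetical backup of one million files, every one of them corrupted.
bound = collision_bound(1_000_000)
print(float(bound))  # on the order of 1e-33
```

Even in that worst case the bound is vanishingly small, which is why a matching checksum is good evidence against accidental corruption. It says nothing about someone crafting a collision on purpose, which is where MD5's known weakness lies.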
2

There is no 100% guarantee that the target and the source will be exactly the same. However, if the hash values match, the possibility of a corrupted file is very, very low. It also depends on the length of the hash value: a SHA-256 digest (256 bits) is twice as long as an MD5 digest (128 bits), so a false match is even less likely with SHA-256 than with MD5. I know an application called HashCopy. It supports both MD5 and SHA and automatically verifies the hash values. It can be downloaded from www.jdxsoftware.org.
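The digest lengths are easy to check with Python's standard `hashlib` module. This is only a small sketch; the helper name `digest_bits` and the sample bytes are made up for illustration.

```python
import hashlib


def digest_bits(algorithm):
    """Return the digest length in bits for a hashlib algorithm name."""
    return hashlib.new(algorithm).digest_size * 8


sample = b"example file contents"  # stand-in for real file data
for name in ("md5", "sha1", "sha256"):
    print(name, digest_bits(name), "bits:", hashlib.new(name, sample).hexdigest())
```

MD5 reports 128 bits, SHA-1 160 bits and SHA-256 256 bits, so "twice as long" only holds for SHA-256. For spotting accidental copy errors any of the three is more than enough; the longer digests matter mostly when deliberate tampering is a concern.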