Checksum and Hash padding

Randall Maas 5/8/2010 6:19:47 PM

I think people tend to learn to use MD5, SHA, and CRC calculations by mimicking others. This isn't bad, but I think it explains why I see a certain class of mistakes become common. The first mistake is forgetting that these algorithms work on a fixed number of bits at a time, so the input needs to be padded out to a whole number of blocks for them to deliver their power and other good qualities.

Have 7 bytes you want to CRC32? You need to pad it out to 8 bytes. 10,007? You need to pad it to 10,008. Just append the proper number of zeros.

Have 27 bytes you want to feed to SHA128? Pad it out to 32 bytes. 1027 bytes for SHA512? Pad it out to 1088 bytes.
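The arithmetic behind those numbers is just "round up to the next multiple of the block size." Here is a minimal sketch in Python. The helper names `padded_length` and `pad_with_zeros` are made up for illustration, and the block sizes in the usage note (4, 16, and 64 bytes) are assumptions read off the examples above, not taken from any specification. A real implementation should take its block size, and its padding rule, from the definition of the algorithm it is using.

```python
def padded_length(length: int, block_size: int) -> int:
    """Round length up to the next multiple of block_size (hypothetical helper)."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Ceiling division using only integers: -(-a // b) == ceil(a / b).
    return -(-length // block_size) * block_size


def pad_with_zeros(data: bytes, block_size: int) -> bytes:
    """Append zero bytes until len(data) is a multiple of block_size."""
    missing = padded_length(len(data), block_size) - len(data)
    return data + bytes(missing)  # bytes(n) is n zero bytes
```

With an assumed 4-byte block, `padded_length(7, 4)` gives 8 and `padded_length(10007, 4)` gives 10008. With assumed 16-byte and 64-byte blocks, `padded_length(27, 16)` gives 32 and `padded_length(1027, 64)` gives 1088. Input that is already a whole number of blocks is left alone. Keep in mind that the padded bytes become part of what is checked, so whoever verifies the result has to pad the same way.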

The second mistake is that people usually get the core of a CRC algorithm correct, but they forget that the initial value in the CRC must be all ones. I've heard that this has always been learned the hard way. It doesn't have to be.
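To see why the initial value matters, here is a small sketch of a bit-at-a-time CRC-32 in Python. It assumes the common reflected CRC-32 parameters (polynomial 0xEDB88320, final XOR of all ones), which are the ones Python's `zlib.crc32` uses. The function name `crc32_bitwise` and its `init` and `xor_out` parameters are my own, for illustration only. It is written for clarity, not speed.

```python
import zlib

POLY = 0xEDB88320  # reflected form of the CRC-32 polynomial 0x04C11DB7


def crc32_bitwise(data: bytes, init: int = 0xFFFFFFFF,
                  xor_out: int = 0xFFFFFFFF) -> int:
    """Bit-at-a-time reflected CRC-32 (illustrative, not optimized)."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; fold in the polynomial if a 1 fell off.
            if crc & 1:
                crc = (crc >> 1) ^ POLY
            else:
                crc >>= 1
    return crc ^ xor_out


# With the all-ones initial value this agrees with zlib.
assert crc32_bitwise(b"123456789") == zlib.crc32(b"123456789")

# With a zero initial value (and no final XOR), leading zero bytes vanish:
# the register stays at zero until the first nonzero byte arrives.
assert (crc32_bitwise(b"\x00\x00hello", init=0, xor_out=0)
        == crc32_bitwise(b"hello", init=0, xor_out=0))

# With the all-ones initial value, the same two messages are told apart.
assert crc32_bitwise(b"\x00\x00hello") != crc32_bitwise(b"hello")
```

The middle assertion is the hard way in miniature: a CRC that starts at zero cannot notice extra or missing zero bytes at the front of a message.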