What is a Checksum?

There are two slightly different meanings of the term "checksum".

The first is more general and means almost the same as "hash value", i.e. the value that any hash function or hash algorithm produces from some input data. Examples are CRC32, MD5 and SHA-1. CRC32 belongs to a family of checksum algorithms based on Cyclic Redundancy Codes; it is used in the SFV checksum file format produced by many checksum utilities, but it is not secure and can easily be forged. MD5 is a cryptographic checksum (message digest, hash) algorithm; RFC 1321 contains its description together with a reference implementation, and it is used by the MD5SUM utility. SHA-1 (Secure Hash Algorithm) was designed by the National Security Agency (NSA) as the algorithm of the Secure Hash Standard (SHS, FIPS 180); the standard was first published in 1993 and revised in 1995 to specify SHA-1.

The second meaning refers to the hash values produced by the simplest form of hash function, which simply adds up the basic components of the original input, typically the bytes. This form is rarely used now because it cannot detect several types of errors, such as reordering of the bytes in the input, insertion or deletion of zero-valued bytes, and multiple errors that cancel each other out.
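The weaknesses listed above can be reproduced with a few lines of Python. This is only a sketch: the function name additive_checksum and the choice to reduce the sum modulo 256 are assumptions made for illustration, not a description of any particular tool.

```python
def additive_checksum(data: bytes) -> int:
    """Add up all bytes and keep the low 8 bits (illustrative width)."""
    return sum(data) % 256


original = b"checksum"
reordered = b"hcecksum"      # first two bytes swapped
padded = b"check\x00sum"     # a zero-valued byte inserted
# One byte raised by 1 and another lowered by 1: the two errors cancel out.
cancelling = bytes([original[0] + 1, original[1] - 1]) + original[2:]

for label, variant in [
    ("reordered", reordered),
    ("zero byte inserted", padded),
    ("cancelling errors", cancelling),
]:
    same = additive_checksum(variant) == additive_checksum(original)
    print(f"{label}: data changed, checksum unchanged = {same}")
```

All three variants differ from the original data, yet each one yields the same sum.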

From here on, we use the term "checksum" in the first meaning, i.e. as a synonym for "hash value".

How does it work?

Simply put, a piece of information (a JPG image, an MP3 music file, or in general any file) is run through a hash function, for example MD5. The result is a relatively short string of digits (128 bits, or 16 bytes, for MD5) that is very likely to be unique.

If you change even one byte in the file and then run it through the function again, the result will almost certainly be different. This makes it possible to verify whether a file has been altered.
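As a small illustration, the sketch below hashes two inputs that differ in a single byte, using Python's standard hashlib module. The sample strings are arbitrary and chosen only for the example.

```python
import hashlib

original = b"The quick brown fox jumps over the lazy dog"
altered = b"The quick brown fox jumps over the lazy cog"  # one byte changed

digest_original = hashlib.md5(original).hexdigest()
digest_altered = hashlib.md5(altered).hexdigest()

print("original:", digest_original)
print("altered: ", digest_altered)
print("digests differ:", digest_original != digest_altered)
```

Both digests are 32 hexadecimal digits (128 bits) long, but they have nothing visibly in common.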

Hash values usually have a fixed length (typically from 16 to 256 bits) that does not depend on the size of the original input.

No matter what the size of the source file is, 1 kilobyte or 100 megabytes, the checksum is always small: for example, 32 bits for CRC32 or 128 bits for MD5.
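The fixed output size is easy to check with Python's standard zlib and hashlib modules. In this sketch the inputs are synthetic, and the larger one is 10 megabytes rather than 100 simply to keep the example quick.

```python
import hashlib
import zlib

small = b"x" * 1024                  # 1 kilobyte
large = b"x" * (10 * 1024 * 1024)    # 10 megabytes

for label, data in [("1 KB", small), ("10 MB", large)]:
    crc = zlib.crc32(data)                   # integer in the range 0 .. 2**32 - 1
    md5_digest = hashlib.md5(data).digest()  # raw digest bytes
    print(f"{label}: CRC32 = {crc:08x}, MD5 length = {len(md5_digest) * 8} bits")
```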

These features make a checksum a very handy tool for verifying the integrity of files or other pieces of information.
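A typical integrity check records a file's checksum once and compares it with a freshly computed value later. The sketch below shows one way to do that in Python; the helper names file_md5 and verify_file, the chunk size, and the temporary sample file are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


def file_md5(path: str, chunk_size: int = 65536) -> str:
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: str, expected_hex: str) -> bool:
    """Compare the file's current checksum with a previously recorded one."""
    return file_md5(path) == expected_hex.lower()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.bin")
        with open(path, "wb") as handle:
            handle.write(b"example payload")
        recorded = file_md5(path)
        print("unchanged file verifies:", verify_file(path, recorded))
        with open(path, "ab") as handle:
            handle.write(b"!")  # alter the file by appending one byte
        print("altered file verifies:", verify_file(path, recorded))
```

Reading in chunks keeps memory use constant, so the same function works for very large files.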

Cryptographic hash function and Message Digest

Ordinary checksums are useful for detecting accidental modification, such as corruption of stored data or errors in a communication channel. However, they provide no security against a malicious agent, because their simple mathematical structure makes them trivial to circumvent. Protecting integrity against deliberate tampering requires a cryptographic hash function, for example MD5 or SHA-1. (Practical collision attacks have since been published against both MD5 and SHA-1, so newer functions such as SHA-256 are preferred where collision resistance matters.)
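One concrete example of this "simple mathematical structure" is that CRC32 is affine with respect to XOR: for three inputs of equal length, the CRC32 of their XOR equals the XOR of their CRC32 values. The sketch below checks this with Python's zlib.crc32; the helper xor_bytes and the sample messages are made up for the illustration.

```python
import zlib


def xor_bytes(*blocks: bytes) -> bytes:
    """XOR several byte strings of equal length together."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all inputs must have the same length")
    result = bytearray(length)
    for block in blocks:
        for index, value in enumerate(block):
            result[index] ^= value
    return bytes(result)


a = b"pay alice  $100"
b = b"pay mallory $10"
c = b"some other text"

combined = xor_bytes(a, b, c)
predicted = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)

# The checksum of the combined data is predicted without hashing it.
print("prediction matches:", zlib.crc32(combined) == predicted)
```

Because the effect of any change on the CRC can be predicted in this way, an attacker can compensate for it elsewhere in the data. A cryptographic hash function is designed so that no such shortcut is known.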

In cryptography, the input data for a hash function is usually called the "message", and the hash value for that message is called the "message digest".

The main features of a cryptographic hash function can be described by two terms: "one-way" and "collision-free".

"One-way" means that it is computationally infeasible to find a (previously unseen) message that matches a given digest, and "collision-free" means that it is computationally infeasible to find two different messages that have the same message digest.

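To get a feel for what a collision is, the sketch below deliberately weakens SHA-1 by keeping only the first 2 bytes (16 bits) of each digest and then searches for two different messages with the same truncated value. The truncation, the function names and the counter-based messages are assumptions made purely for illustration; with full-length digests the same search would be computationally infeasible.

```python
import hashlib


def truncated_sha1(message: bytes, n_bytes: int = 2) -> bytes:
    """Return only the first n_bytes of the SHA-1 digest (deliberately weak)."""
    return hashlib.sha1(message).digest()[:n_bytes]


def find_truncated_collision(n_bytes: int = 2) -> tuple[bytes, bytes]:
    """Try the messages b"0", b"1", b"2", ... until two share a truncated digest."""
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode("ascii")
        tag = truncated_sha1(message, n_bytes)
        if tag in seen:
            return seen[tag], message
        seen[tag] = message
        counter += 1


first, second = find_truncated_collision()
print("messages:", first, second)
print("truncated digests equal:", truncated_sha1(first) == truncated_sha1(second))
print("full digests equal:", hashlib.sha1(first).digest() == hashlib.sha1(second).digest())
```

With only 65,536 possible truncated values, a match is guaranteed within 65,537 attempts and usually appears after a few hundred.
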
List of some hash functions

The usual (non-cryptographic) hash functions:

The cryptographic (secure) hash functions:

About AccuHash 2.0

AccuHash 2.0 (a Windows utility for protecting the integrity and verifying the accuracy of data files) currently supports CRC32, MD5 and SHA-1 checksums. The latest version of AccuHash is 2.0.18, ah2setup.exe (968 KB), released November 06, 2008.