Digest (computing)
Overview
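In computing, a digest (also referred to as a hash or hash value) is a short, fixed-size value derived from a piece of data of arbitrary length, such as a string of text or a file. A digest is not guaranteed to be unique: there are more possible inputs than possible digests, so two different inputs can share one digest (a collision), although a well-designed function makes this very unlikely in practice. The primary use of a digest is to compare pieces of data. Instead of comparing two pieces of data byte by byte, a digest of each is computed and the digests are compared. Computing a digest still requires reading all of the data once, so the saving in time and computational resources comes when a digest is already stored or can be transmitted in place of the data, particularly when dealing with large volumes of data.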
Digest Function
A digest function is a type of hash function that takes an input (or 'message') of any length and returns a fixed-size string of bytes, the digest. Different inputs usually produce different digests, but because the output size is fixed, collisions must exist. Changes to the input, even small ones, will ideally produce such drastic changes in output that the new digest appears uncorrelated with the old digest (the avalanche effect).
Digest functions are deterministic, meaning that the same input will always produce the same output. They are designed to be fast to compute and to produce a digest that looks random. Cryptographic digest functions are additionally designed so that it is infeasible to recover an input from its digest or to find two different inputs with the same digest.
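The properties described above (fixed output size, determinism, and a large change in output for a one-bit change in input) can be observed directly. The following is a minimal sketch using SHA-256 from Python's standard hashlib module; the helper names sha256_hex and differing_bits are illustrative choices, not part of any library.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg = b"hello world"
# Flip the lowest bit of the first byte to get a one-bit-different input.
flipped = bytes([msg[0] ^ 0x01]) + msg[1:]

d1 = hashlib.sha256(msg).digest()
d2 = hashlib.sha256(flipped).digest()

print(sha256_hex(msg) == sha256_hex(msg))  # True: same input, same digest
print(len(d1))                             # 32 bytes, whatever the input length
print(differing_bits(d1, d2), "of 256 bits differ")
```

For a well-behaved digest function, roughly half of the output bits are expected to differ between the two digests, although the exact count varies with the input.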
Uses of Digests in Computing
Digests have a variety of uses in computing, particularly in the areas of data retrieval, data comparison, and security.
Data Retrieval
In databases, digests are often used to retrieve data. Instead of scanning the stored records and comparing each one with the search key, the database computes the digest of the key and uses it to locate the matching entry in an index, such as a hash table. This is particularly useful in large databases where a direct data search would be time-consuming.
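The idea of looking up a digest rather than the data itself can be sketched with an in-memory index keyed by digest. This is only an illustration: the class DigestIndex and its methods are hypothetical names, SHA-256 is an arbitrary choice, and real database indexes are considerably more elaborate.

```python
import hashlib

class DigestIndex:
    """Toy store that finds records by their SHA-256 digest (illustrative only)."""

    def __init__(self):
        self._by_digest = {}

    def put(self, record: bytes) -> str:
        """Store a record and return the digest that now identifies it."""
        key = hashlib.sha256(record).hexdigest()
        self._by_digest[key] = record
        return key

    def get(self, key: str):
        """Return the record stored under a digest, or None if absent."""
        return self._by_digest.get(key)

    def contains(self, record: bytes) -> bool:
        """Check for a record by hashing it once instead of scanning all records."""
        return hashlib.sha256(record).hexdigest() in self._by_digest

index = DigestIndex()
key = index.put(b"a large record ...")
print(index.get(key) == b"a large record ...")  # True
print(index.contains(b"some other record"))     # False
```

The lookup cost depends on the size of the digest and the record being checked, not on how many records are stored.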
Data Comparison
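Digests are also used to compare pieces of data. Instead of comparing the data directly, the digests of the data are compared. Different digests prove that the data differs; equal digests mean that the data is almost certainly identical, with a collision as the only exception. Comparing digests is fast and requires few computational resources once they have been computed and stored, which is particularly valuable when dealing with large volumes of data or with data held on different machines.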
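A comparison by digest might look like the sketch below, which hashes data in chunks so that a large file never has to be held in memory at once. The function names digest_of_chunks, file_digest, and probably_same are hypothetical, and SHA-256 and the 64 KiB chunk size are assumptions made for the example.

```python
import hashlib

def digest_of_chunks(chunks) -> str:
    """Feed an iterable of byte chunks into SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()

def file_digest(path: str, chunk_size: int = 65536) -> str:
    """Digest a file by reading it in fixed-size chunks."""
    with open(path, "rb") as f:
        return digest_of_chunks(iter(lambda: f.read(chunk_size), b""))

def probably_same(path_a: str, path_b: str) -> bool:
    """Equal digests mean the files are almost certainly identical."""
    return file_digest(path_a) == file_digest(path_b)
```

Hashing in chunks gives the same digest as hashing the whole input at once, so a stored digest can be checked against a file of any size.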
Security
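In the field of cryptography, digests are used to check data integrity. When data is sent over a network, it can be corrupted in transit or intercepted and modified by a third party. To detect this, the sender can create a digest of the data and send it along with the data. The receiver can then create a digest of the received data and compare it to the received digest. If the two digests match, the data was very likely not changed by accident during transmission. A plain digest does not stop a deliberate attacker, who can modify the data and replace the digest to match. Protecting against tampering requires the digest to be authenticated, for example with a keyed digest (HMAC) or a digital signature, or to be obtained over a separate trusted channel.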
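Because anyone who can alter the data in transit can also recompute a plain digest, integrity checks against tampering are commonly built on a keyed digest. The sketch below uses HMAC with SHA-256 from Python's standard hmac module; it assumes the sender and receiver already share a secret key, and the function names make_tag and verify are illustrative.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: compute a keyed digest (HMAC-SHA-256) of the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Receiver side: recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), received_tag)

shared_key = b"example key agreed in advance"  # assumption: shared out of band
message = b"transfer 100 to account 42"
tag = make_tag(shared_key, message)

print(verify(shared_key, message, tag))                         # True
print(verify(shared_key, b"transfer 900 to account 42", tag))   # False
```

Without the key, a third party cannot produce a matching tag for a modified message, which is what a plain digest sent alongside the data fails to provide.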
Digest Algorithms
There are several well-known digest algorithms, including MD5, SHA-1, and SHA-2. Each of these algorithms takes an input of any length and produces a fixed-size digest, which is usually written as a string of hexadecimal characters.
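The digest sizes of these algorithms can be checked with Python's standard hashlib module, as in the minimal sketch below. The list ALGORITHMS and the helper digest_bits are illustrative names; the sketch also assumes that the local Python build still provides MD5, which some restricted (for example FIPS-mode) builds do not.

```python
import hashlib

# Algorithm names as accepted by hashlib.new(); SHA-2 is represented by
# four of its variants.
ALGORITHMS = ["md5", "sha1", "sha224", "sha256", "sha384", "sha512"]

def digest_bits(name: str, data: bytes = b"") -> int:
    """Return the digest length in bits for the named algorithm."""
    return len(hashlib.new(name, data).digest()) * 8

sizes = {name: digest_bits(name) for name in ALGORITHMS}
for name, bits in sizes.items():
    print(f"{name}: {bits} bits")
```

The digest length is the same for every input, so hashing the empty byte string is enough to measure it.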
MD5
MD5 (Message Digest Algorithm 5) is a widely used hash function that produces a 128-bit (16-byte) digest. It is still commonly used as a checksum to detect accidental data corruption. However, MD5 is not collision-resistant: practical attacks can produce two different inputs with the same digest. As such, it is not suitable for security-sensitive uses such as SSL certificates or digital signatures.
SHA-1
SHA-1 (Secure Hash Algorithm 1) produces a 160-bit (20-byte) digest and was long considered to be very secure. However, in February 2017 researchers from Google and CWI (Centrum Wiskunde & Informatica) in Amsterdam announced the first practical SHA-1 collision, having created two different PDF files with the same digest. This demonstrated that the algorithm is no longer safe wherever collision resistance is required, such as in digital signatures and certificates.
SHA-2
SHA-2 (Secure Hash Algorithm 2) is a family of six cryptographic hash functions: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, and SHA-512/256, with digest lengths of 224, 256, 384, or 512 bits. It is currently considered to be secure and is used in several security protocols and systems, including TLS and SSL, PGP, SSH, IPsec, and Bitcoin.