This plugin for Lightroom allows you to validate images and check for file corruption or “bit rot”. It works by computing a hash for each file and then comparing it to a previously stored value to see if your file has changed unexpectedly.
Note that this plugin is under development and should be considered a work in progress. It requires Lightroom 4 or later, runs on both Mac OS X and Windows, and works with the current version of Lightroom (v5.7).
Note: Please read the section “Writing XMP to Metadata” before using this plugin.
How it works
For each file, Validator reads the entire image and computes a hash value based on every bit in the image file. You can think of this as a digital fingerprint: just as no two humans have exactly the same fingerprints, no two files should have the same hash. (Mathematically speaking, there is a very small probability that two different files will have the same hash, but it’s close enough to zero that we can ignore it.)
The hash algorithm works in such a way that if any bit in the file changes, the hash value (the fingerprint) also changes. For example, if I take my picture of the Arc de Triomphe and compute a hash with the MD5 algorithm, I get a value of 2642a5f5f7468c115e63f7e35c67c20a.
Now if I test this by making a small change to the file, for example taking a single pixel in the sky and modifying its value from RGB = (40,41,80) to RGB = (40,41,81), I get a totally different hash: 435070d0f98d682502ed058833780e96.
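To try this effect yourself, here is a minimal Python sketch using the standard hashlib module. The byte values are only a stand-in for real image data (they are not the Arc de Triomphe file), so the hashes it prints will differ from the ones quoted above.

```python
import hashlib

# Stand-in for image data: a few pixels with RGB = (40, 41, 80).
pixels = bytearray([40, 41, 80] * 4)
original_hash = hashlib.md5(pixels).hexdigest()

# Change the blue channel of the first pixel from 80 to 81.
pixels[2] = 81
modified_hash = hashlib.md5(pixels).hexdigest()

print("original:", original_hash)
print("modified:", modified_hash)
```

Even though only one byte differs, the two printed hashes should have nothing obvious in common.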
The main idea is that if your data is corrupted, we can detect it by recomputing the hash and comparing it to the original value stored when we first brought the image into Lightroom. Detecting these changes is critical because the last thing we want is for an image file to be silently damaged and then copied into all of our backups.
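The comparison itself is simple. The sketch below shows the idea in Python; it is not Validator's actual code, and the function names hash_file and has_changed are made up for illustration. It assumes MD5, as in the example above, and reads the file in chunks so that large images do not need to fit in memory at once.

```python
import hashlib

def hash_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of the whole file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def has_changed(path, stored_hash):
    """True if the file's current hash differs from the stored one."""
    return hash_file(path) != stored_hash
```

A file whose bytes are untouched gives the same hash every time, so has_changed only returns True when something in the file is different.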
Writing XMP to Metadata
Under the catalog settings in Lightroom there is an option to “Automatically write changes into XMP”. (On a Mac, choose Lightroom > Catalog Settings and click the Metadata tab. On Windows, choose Edit > Catalog Settings and click the Metadata tab.) With this option turned on, for file types with publicly documented formats (i.e., TIFF, JPEG, PSD, and DNG), Lightroom writes metadata such as captions and keywords directly into the file. Since the actual file changes, the hash value changes as well.
I recommend leaving this option turned off. Since I often go back and alter captions or add keywords to an image, this behavior of writing to the image file is undesirable as it will cause the hash to change even though the image data is still good. There are three main ways of dealing with this issue:
- Turn off writing XMP metadata into files. Lightroom will store this information in the catalog and your image files won’t change even if you change the metadata.
- Run Validator only on RAW files, which Lightroom does not modify. For these files Lightroom puts the metadata into a sidecar file (with a .xmp extension).
- Run Validator on TIFF files (and other file types that Lightroom modifies) but update the hashes whenever you alter the metadata or make direct changes to the file, such as retouching in Photoshop.
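As a rough illustration of the distinction between these file types, the Python sketch below separates files whose hash can change after a metadata edit (when automatic XMP writing is on) from files that get a sidecar instead. The extension list is my assumption based on the formats named above, and the function names are hypothetical; this is not how Validator itself decides.

```python
from pathlib import Path

# Assumed extensions for the formats Lightroom can write XMP into directly.
EMBEDDED_XMP_EXTENSIONS = {".tif", ".tiff", ".jpg", ".jpeg", ".psd", ".dng"}

def metadata_edit_changes_file(path):
    """True if a metadata edit may alter the image file itself."""
    return Path(path).suffix.lower() in EMBEDDED_XMP_EXTENSIONS

def sidecar_path(path):
    """Name of the .xmp sidecar that would sit next to a RAW file."""
    return str(Path(path).with_suffix(".xmp"))
```

For example, metadata_edit_changes_file("IMG_0001.TIF") is True, while a RAW file such as "IMG_0001.CR2" (an example extension) returns False and has the sidecar "IMG_0001.xmp".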
Workflow with Validator
The commands for Validator can be found under Library > Plug-in Extras. Here’s how I use Validator with my images:
- I run Generate Hashes on all of my RAW files and any TIFF master images (client ready images that have all necessary edits and retouching done).
- Every few months, I select all of my images (RAW + TIFF masters) and run Verify Files.
- Verify Files identifies any image files that have changed and places them into a collection (the default is validator_changed).
- I manually verify the changed files:
  - If the image file is okay, I run Accept Changed Hashes to update the hash values.
  - If an image has been corrupted (this has only happened to me once), I replace it with a working version from my image backups.
In most cases, if Verify Files turns up a change, it is a TIFF image where I performed some additional editing in an external program like Photoshop. Lightroom never modifies RAW files, so their hash values should always stay the same.
The commands can be found under the menu item Library > Plug-in Extras. There are five commands: Generate Hashes, Verify Files, Accept Changed Hashes, Clear Hashes, and Help.
The Generate Hashes command generates a hash value for each selected image and stores the value in the plug-in specific metadata fields Archive Hash and Archive Date. The command brings up a dialog box where you can select the types of files for which you want to generate hashes, and whether you want to include virtual copies (the hash for a virtual copy is the same as for the original image).
Generate Hashes will run as a background task in Lightroom. On completion, you will see a dialog box with summary statistics. In addition, the log file will store a detailed record of the actions taken for each image and indicate if there were any errors (such as problems reading an image file).
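To make the bookkeeping concrete, here is a small Python sketch of what a Generate Hashes style pass might look like. It is an illustration only, not the plugin's implementation: the records dictionary stands in for Lightroom's plug-in metadata, the function name and summary keys are hypothetical, and MD5 is assumed as in the earlier example.

```python
import hashlib
import logging
from datetime import datetime, timezone

log = logging.getLogger("validator_sketch")

def generate_hashes(paths, records):
    """Store an archive hash and date per readable file; return summary counts."""
    summary = {"hashed": 0, "errors": 0}
    for path in paths:
        try:
            with open(path, "rb") as f:
                digest = hashlib.md5(f.read()).hexdigest()
        except OSError as exc:
            # Unreadable or missing files are logged and counted, not fatal.
            log.error("could not read %s: %s", path, exc)
            summary["errors"] += 1
            continue
        records[path] = {
            "archive_hash": digest,
            "archive_date": datetime.now(timezone.utc).isoformat(),
        }
        log.info("hashed %s", path)
        summary["hashed"] += 1
    return summary
```

The returned summary plays the role of the dialog box statistics, and the logger plays the role of the detailed per-image log file.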
For images that have an Archive Hash, the Verify Files command checks whether the file has changed by computing a new hash and comparing it to the existing value. Files that have changed are added to the specified collection.
Once the command finishes, a dialog box summarizes the results, including how many files have changed.
Verify Files will store the new hash in the field Last Hash. It will not change the Archive Hash.
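The behavior described here (record the new hash as Last Hash, leave Archive Hash alone, collect the files that differ) can be sketched in a few lines of Python. This is a simplified model with hypothetical names: records stands in for the plug-in metadata, and current_hashes is assumed to hold freshly computed hashes for each file.

```python
def verify_files(records, current_hashes):
    """Update last_hash for archived files and return the names that changed."""
    changed = []
    for name, rec in records.items():
        # Only files that already have an archive hash are verified.
        if rec.get("archive_hash") is None or name not in current_hashes:
            continue
        rec["last_hash"] = current_hashes[name]
        if rec["last_hash"] != rec["archive_hash"]:
            changed.append(name)  # would go into the "changed" collection
    return changed
```

Note that archive_hash is never written here, which is what lets you compare the two values afterwards and decide what to do.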
Accept Changed Hashes
If there are changed files (Last Hash does not equal Archive Hash), you should manually check whether this is because (1) you made edits to the file or its metadata or (2) the file is damaged or corrupted. In the former case, you can accept the new hash value by selecting the appropriate images and running the command Accept Changed Hashes. This will update the Archive Hash and Archive Date with new values.
Note: this command ignores selected images where there is no change in hash values.
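In the same spirit, the accept step can be modeled as copying Last Hash over Archive Hash for the selected images, skipping any image whose hashes already agree. The Python below is a sketch under that assumption; the function and field names are hypothetical and records again stands in for the plug-in metadata.

```python
from datetime import date

def accept_changed_hashes(records, selected, today=None):
    """Promote last_hash to archive_hash for selected images that changed."""
    today = today or date.today().isoformat()
    accepted = []
    for name in selected:
        rec = records[name]
        last = rec.get("last_hash")
        # Ignore images that were never verified or whose hash did not change.
        if last is None or last == rec.get("archive_hash"):
            continue
        rec["archive_hash"] = last
        rec["archive_date"] = today
        accepted.append(name)
    return accepted
```

Only run a step like this after you have looked at the image and are satisfied that the change was intentional.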
The Clear Hashes command clears all hash values from the selected images. Normally you should not need to use this command.
Running the Help function will bring up this webpage in a browser window.
Validator creates 5 custom metadata fields to track image hashes:
- Archive Hash: the hash value of the image
- Archive Date: the date the Archive Hash was computed
- Last Hash: the most recent hash value computed by running Verify Files
- Last Date: the date the Last Hash was computed
- Status: takes on one of four values: MATCH, CHANGE, NEW, N/A
Note that these fields are read-only and cannot be edited directly. You can change the fields only by executing the various Plug-in menu commands.
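One way to picture these fields is as a small read-only record. The Python sketch below uses a frozen dataclass to mimic the read-only behavior. The meaning of each Status value is not spelled out on this page, so the mapping in the status property (N/A when there is no Archive Hash, NEW when a hash exists but has not been verified yet) is purely my assumption for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)  # frozen: attributes cannot be assigned directly
class ValidatorFields:
    archive_hash: Optional[str] = None
    archive_date: Optional[str] = None
    last_hash: Optional[str] = None
    last_date: Optional[str] = None

    @property
    def status(self) -> str:
        # Assumed mapping; the real plugin may define these differently.
        if self.archive_hash is None:
            return "N/A"
        if self.last_hash is None:
            return "NEW"
        return "MATCH" if self.last_hash == self.archive_hash else "CHANGE"
```

Trying to assign to a field of a ValidatorFields instance raises an error, much as the Metadata panel will not let you type into these fields.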
You can show the fields by setting the Metadata panel to display “Validator”.