When a box is finished, do the following:
Run checkimages and correct any issues with images.
Run mods -c and correct any metadata issues.
Copy the images and metadata to r:/credo/metadata/checkbox and run image_view to spot check metadata and scans. Correct any errors found in the 'staging' directory.
Run mods -Iwbu to add box numbers and item IDs to MODS records.
Usage:
checkimages is used to quality control TIFF images. To run the script, type the following in the directory of images you would like to check:
checkimages 300 *.tif
If any images are not 300 dpi or are not compressed with LZW, their filenames will be printed out. The script checks each image's EXIF metadata. If the EXIF metadata cannot be read for some reason, the script will tell you and the image will have to be quality controlled manually.
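To make the two checks concrete, here is a minimal sketch of how resolution and compression can be read from a TIFF file. It is not the checkimages source: the function names are hypothetical, it reads the TIFF tags directly with the standard library, and it assumes the resolution unit is inches.

```python
import struct

# Tag numbers from the TIFF 6.0 specification.
TAG_COMPRESSION = 259     # a value of 5 means LZW
TAG_X_RESOLUTION = 282    # stored as a RATIONAL (numerator, denominator)
LZW = 5


def read_tiff_tags(data):
    """Return {tag: value} for the first IFD of a TIFF held in `data` (bytes)."""
    if data[:2] == b"II":
        endian = "<"          # little-endian file
    elif data[:2] == b"MM":
        endian = ">"          # big-endian file
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file")
    (count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    tags = {}
    for i in range(count):
        start = ifd_offset + 2 + 12 * i
        tag, ftype, n = struct.unpack(endian + "HHI", data[start:start + 8])
        value_field = data[start + 8:start + 12]
        if ftype == 3 and n == 1:      # single SHORT, stored inline
            (tags[tag],) = struct.unpack(endian + "H", value_field[:2])
        elif ftype == 4 and n == 1:    # single LONG, stored inline
            (tags[tag],) = struct.unpack(endian + "I", value_field)
        elif ftype == 5 and n == 1:    # single RATIONAL, stored at an offset
            (offset,) = struct.unpack(endian + "I", value_field)
            num, den = struct.unpack(endian + "II", data[offset:offset + 8])
            tags[tag] = num / den if den else None
    return tags


def check_tiff(data, expected_dpi):
    """Return a list of problems; an empty list means the image passed."""
    tags = read_tiff_tags(data)
    problems = []
    if tags.get(TAG_X_RESOLUTION) != expected_dpi:
        problems.append("resolution")
    if tags.get(TAG_COMPRESSION) != LZW:
        problems.append("compression")
    return problems
```

To try it on a file, read the file in binary mode and pass the bytes along with the resolution you expect, for example check_tiff(open(path, "rb").read(), 300).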
If you want to check against a different resolution, simply change the first argument:
checkimages 600 *.tif
This script offers several other image QC features. If an image is not compressed with LZW, it will copy the image to c:/image_sandbox/compress. This will allow you to run a Photoshop batch process to compress the files. For images that were not scanned at 24-bit color, it will copy the images to c:/image_sandbox/bitdepthissues. Make sure you create these directories before you run the script for the first time.
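If you would rather not create the sandbox directories by hand, a short script can do it. This is only a sketch: the function name make_sandbox is hypothetical, and the default base path is taken from the paths mentioned above.

```python
from pathlib import Path

# Subdirectory names taken from the sandbox paths described above.
SANDBOX_SUBDIRS = ("compress", "bitdepthissues")


def make_sandbox(base="c:/image_sandbox"):
    """Create the sandbox subdirectories if they do not already exist."""
    created = []
    for name in SANDBOX_SUBDIRS:
        path = Path(base) / name
        # parents=True also creates the base directory;
        # exist_ok=True makes the call safe to repeat.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Running make_sandbox() a second time is harmless, so it is safe to call before every QC session.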
Usage:
mods provides a variety of tools to quality control MODS records. Examples of the most common uses are:
The -c option checks all the MODS records in a given directory to see if there is missing text and then prints the element and filename for the suspect record. The final argument is the starting directory. This script will recursively go through all subdirectories of the starting directory.
mods -c box001
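The kind of check that -c performs can be sketched roughly as follows. This is not the mods source; the function names are hypothetical, and the sketch assumes that records are .xml files and that any childless element without text is suspect (some MODS elements may be legitimately empty).

```python
import os
import xml.etree.ElementTree as ET


def find_empty_elements(root):
    """Return the local names of leaf elements that contain no text."""
    empty = []
    for elem in root.iter():
        is_leaf = len(elem) == 0
        has_text = bool(elem.text and elem.text.strip())
        if is_leaf and not has_text:
            # Drop the "{namespace}" prefix ElementTree adds to tag names.
            empty.append(elem.tag.split("}")[-1])
    return empty


def check_directory(start_dir):
    """Walk start_dir recursively and print element and filename for suspects."""
    for folder, _subdirs, files in os.walk(start_dir):
        for name in files:
            if name.lower().endswith(".xml"):   # assumed file extension
                path = os.path.join(folder, name)
                root = ET.parse(path).getroot()
                for element in find_empty_elements(root):
                    print(element, path)
```

Calling check_directory("box001") would walk box001 and every subdirectory beneath it, in the same spirit as the command above.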
The -Iwbu options automatically add the item ID, box numbers, and URLs to MODS records:
mods -Iwbu box001
To make a global change to the text value of an element, use the -e option. You'll also have to use the -x option to specify an XPath expression for the element you want to edit. N.B. The -w option will write over the original file with the edited version. Make sure you test your script well before you set it loose on an entire set of records. You will need to use NetDrive to access the server to make changes. It's also a good idea to ask Aaron Addison to run a backup of oubliette before you make a massive change:
mods -w -e "New element text" -x "/mods:mods/mods:accessCondition[@type='Use and reproduction']" mums312
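For testing an XPath expression before running it against real records, a small stand-alone sketch like the one below can help. It is not how mods is implemented; set_text is a hypothetical helper built on Python's standard ElementTree, which supports only a subset of XPath and does not accept absolute paths on an element, so the path is written relative to the record's root.

```python
import xml.etree.ElementTree as ET

# The MODS version 3 namespace, bound to the same prefix used above.
MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}


def set_text(root, path, new_text):
    """Set the text of every element matching `path`; return how many changed."""
    matches = root.findall(path, MODS_NS)
    for elem in matches:
        elem.text = new_text
    return len(matches)


record = (
    '<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">'
    '<mods:accessCondition type="Use and reproduction">old text'
    '</mods:accessCondition>'
    '</mods:mods>'
)
root = ET.fromstring(record)
changed = set_text(
    root,
    "mods:accessCondition[@type='Use and reproduction']",
    "New element text",
)
```

Because nothing is written back to disk here, the sketch is a safe way to confirm that an expression matches the elements you expect and nothing else.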
You can also do a search and replace, editing only the text values that match your search:
mods -w -r "Du Bois" -S "DuBois" -x "//mods:namePart[@type='family']" mums312
This search and replace will look for every <namePart> with the family name misspelled as "DuBois" and replace it with the properly spaced "Du Bois".
To add an element, use the -a option. Create the element exactly as you want it to appear in the records. You don't need to add the mods prefix to the elements; that will be done automatically. Add -w at the beginning of the command to overwrite the existing file. If you want to put the element somewhere specific in the record, you'll have to pass the element index to -i; otherwise it will insert the element at the bottom of the record. The index count probably starts at 0, so titleInfo would be 0, name would be 1, and so on. Make sure you test the script on a small number of records before setting it loose on a bunch.
mods -w -a "<relatedItem><identifier>mums312-s01</identifier></relatedItem>" -i 16
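To see what inserting at an index means, here is a rough sketch using Python's standard ElementTree. It is an illustration, not the mods source: add_element is a hypothetical helper, and the zero-based counting shown is ElementTree's behavior, which may or may not match what -i does.

```python
import xml.etree.ElementTree as ET

MODS_URI = "http://www.loc.gov/mods/v3"


def add_element(root, fragment, index=None):
    """Insert `fragment` (an un-prefixed XML string) as a child of `root`."""
    new_elem = ET.fromstring(fragment)
    # Put the new element and its descendants into the MODS namespace.
    for node in new_elem.iter():
        node.tag = "{%s}%s" % (MODS_URI, node.tag)
    if index is None:
        root.append(new_elem)           # default: bottom of the record
    else:
        root.insert(index, new_elem)    # zero-based child position
    return new_elem


record = ET.fromstring(
    '<mods xmlns="http://www.loc.gov/mods/v3">'
    '<titleInfo/><name/><genre/></mods>'
)
add_element(
    record,
    "<relatedItem><identifier>mums312-s01</identifier></relatedItem>",
    1,
)
order = [child.tag.split("}")[-1] for child in record]
```

With index 1, the new relatedItem lands between titleInfo and name, so order becomes titleInfo, relatedItem, name, genre.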
If you wish to log in to Credo directly, there is an instance of mods installed there. All of the commands are the same, but you have to add .py to the command name. The search text is treated as a regular expression, so characters that have a special meaning in regular expressions, such as parentheses, must be escaped with a backslash to be matched literally, like so:
mods.py -w -r "Broadsides (notices)" -S "Broadsides \(format\)" -x "//mods:genre[@authority='aat']"
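The need for the backslashes can be demonstrated with Python's re module. This is a small sketch under the assumption that the search text is handled like a Python regular expression; replace_literal is a hypothetical helper, not part of mods.py.

```python
import re


def replace_literal(text, search, replacement):
    """Replace every literal occurrence of `search`, escaping regex symbols."""
    pattern = re.escape(search)
    # A function replacement keeps backslashes in `replacement` literal.
    return re.sub(pattern, lambda _match: replacement, text)


genre = "Broadsides (format)"

# Unescaped parentheses form a group, so this pattern looks for
# "Broadsides format" and does not match the text above.
unescaped = re.search("Broadsides (format)", genre)

# Escaped parentheses match the literal characters.
escaped = re.search(r"Broadsides \(format\)", genre)

fixed = replace_literal(genre, "Broadsides (format)", "Broadsides (notices)")
```

Here unescaped is None, escaped is a match, and fixed holds "Broadsides (notices)". re.escape produces the same backslashed form that is typed by hand in the command above.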
Also check out http://misc.yarinareth.net/regex.html to learn more about regular expressions.
There are more options as well, but they get a little tricky, so stick with the ones listed unless you feel like fearlessly exploring. You can type mods -h to get a full set of options.
Usage:
image_view allows you to view metadata and images side-by-side so you can spot check both and confirm that they correspond.
To use, copy images and metadata to r:/credo/metadata/checkbox. Go to r:/credo/mods-search/image_view and double-click on image_view.exe. A black window should open up and print http://0.0.0.0:8080.
Point your browser to http://localhost:8080.
N.B. This tool needs IE's security loopholes to work. I use the IETab plugin for Firefox or Chrome. Once I've loaded the page that lists the item IDs, I change to IE view.
Once you're all set, click on an ID to view the image and the corresponding metadata.