Differences

This shows you the differences between two versions of the page.

hathitrust [2016/04/14 15:55]
rapeterson [Submitting OCA digitized content]
hathitrust [2017/02/01 16:19]
mbanach
Line 1: Line 1:
 ===== HathiTrust ===== ===== HathiTrust =====
 +
 +**Link to ETD Workflow Google doc:**
 +
 +https://docs.google.com/document/d/1Qs_uIWPjDfUIcQhxJiMlgvgN8XpexZn2CTs0eSGTOxo/edit
  
 HathiTrust Support: [[feedback@issues.hathitrust.org]] HathiTrust Support: [[feedback@issues.hathitrust.org]]
Line 14: Line 18:
   - Enter your password and click "SIGN IN"   - Enter your password and click "SIGN IN"
   - You will get the Shibboleth login screen.   - You will get the Shibboleth login screen.
-  - Create a new folder "[current year] holdings"+  - Create a new folder "[current year] holdings" inside the "UMass" folder.
   - Upload files into the folder that you just created.   - Upload files into the folder that you just created.
  
Line 22: Line 26:
 The Libraries contributes content digitized by the Open Content Alliance (Internet Archive) to the HathiTrust. HathiTrust (HT) already works with the Internet Archive (IA), which facilitates the process. We only need to provide HT with MARC records that include content specific to IA, the IA Identifier and the [[https://en.wikipedia.org/wiki/Archival_Resource_Key|ARK Identifier]], in the 955 tag. More information on submitting bibliographic records is available here: https://www.hathitrust.org/bib_specifications The Libraries contributes content digitized by the Open Content Alliance (Internet Archive) to the HathiTrust. HathiTrust (HT) already works with the Internet Archive (IA), which facilitates the process. We only need to provide HT with MARC records that include content specific to IA, the IA Identifier and the [[https://en.wikipedia.org/wiki/Archival_Resource_Key|ARK Identifier]], in the 955 tag. More information on submitting bibliographic records is available here: https://www.hathitrust.org/bib_specifications
  
-On the Coral server, there is a PHP script (www/HathiTrust/get_IA_data2.php) which takes a file of comma separated Aleph bib numbers and IA Identifiers and creates the appropriate MARCXML for submission to HathiTrust. It FTPs the file to HT and sends the email notification that is required by HT. It works fairly well, but occasionally encounters a problem that needs to be addressed. +On the fcweb.library.umass.edu server, there is a PHP script (www/html/ht/get_IA_data2.php) which takes a file of comma separated Aleph bib numbers and IA Identifiers and creates the appropriate MARCXML for submission to HathiTrust. It FTPs the file to HT and sends the email notification that is required by HT. It works fairly well, but occasionally encounters a problem that needs to be addressed. 
  
-The file of comma separated values can be created using the Pick List that is returned from OCA. Talk to Lisa Persons about where the latest files are, typically they are located in W:\Open Content Alliance\Pick lists\Completed picklists. The first 2 columns on the Pick List should contain the bib number and the IA identifier. If there are errors, a column will be inserted between the first and second column where the error is noted. Errors should be deleted from the Pick List and the empty column should be deleted. Delete all other columns of the Pick List and save the file as "ialist_YYYYMMDD_[local identifying information].csv". Follow the below steps to complete the processing:+The file of comma separated values can be created using the Pick List that is returned from OCA. Talk to Lisa Persons about where the latest files are, typically they are located in W:\Open Content Alliance\Pick lists\Completed picklists. The first 2 columns on the Pick List should contain the bib number and the IA identifier. If there are errors, a column will be inserted between the first and second column where the error is noted. Errors should be deleted from the Pick List and the empty column should be deleted. Delete all other columns, and the header, of the Pick List and save the file as "ialist_YYYYMMDD_[local identifying information].csv" (no spaces). Follow the below steps to complete the processing:
  
-  - Upload the file created from the Pick List to the www/HathiTrust/ directory on the Coral server [Talk to Steve Bischof if you need FTP access to the Coral Server.]+  - Upload the file created from the Pick List to the www/html/ht/ directory on the fcweb.library.umass.edu server [Talk to Steve Bischof if you need FTP access to the FCWeb Server.]
   - Edit the get_IA_data2.php file.   - Edit the get_IA_data2.php file.
   - Change the $fileId variable to reflect the date and local identifying information, e.g., "20160408_darktruck1"   - Change the $fileId variable to reflect the date and local identifying information, e.g., "20160408_darktruck1"
   - Update the $my_email variable as necessary. Separate multiple emails with a comma.   - Update the $my_email variable as necessary. Separate multiple emails with a comma.
   - From the command line run: ''php get_IA_data2.php''   - From the command line run: ''php get_IA_data2.php''
-  - The email address(es) entered in $my_email should receive and email after the file has been uploaded to HT.+  - because sendmail is not installed on the fcweb server, the php script will not successfully send; create manual msg to cdl-zphr-l@ucop.edu with current file name, size in bytes, and nbr records (use previously sent msg as template) 
 +  - The email address(es) entered in $my_email should receive an email after the file has been uploaded to HT.
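
The manual notification step in the newer revision needs three facts about the submitted file: its name, its size in bytes, and the number of records. The following is a minimal Python sketch for collecting them, not part of the documented workflow. It assumes the file produced by get_IA_data2.php is well-formed MARCXML in which each bibliographic record is a ''record'' element. The function names and the layout of the message body are hypothetical; the real wording should still come from a previously sent message.

```python
import os
import xml.etree.ElementTree as ET


def summarize_marcxml(path):
    """Return (file name, size in bytes, number of MARC records) for a MARCXML file."""
    size_bytes = os.path.getsize(path)
    record_count = 0
    # Stream the file so a large batch is not held in memory all at once.
    for _event, elem in ET.iterparse(path, events=("end",)):
        # Drop any XML namespace prefix, e.g. "{http://www.loc.gov/MARC21/slim}record".
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "record":
            record_count += 1
            elem.clear()  # release the finished record's children
    return os.path.basename(path), size_bytes, record_count


def notification_body(path):
    """Build a plain-text summary to paste into the manual email.

    The three-line layout is an assumption for illustration only.
    """
    name, size_bytes, record_count = summarize_marcxml(path)
    return (
        f"File name: {name}\n"
        f"File size (bytes): {size_bytes}\n"
        f"Number of records: {record_count}\n"
    )
```

Calling ''print(notification_body(path))'' with the path to the MARCXML file gives the values to copy into the message for cdl-zphr-l@ucop.edu. The sketch only reads the file; it does not send mail or upload anything.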
  
 For digitized content other than digitized theses and dissertation,​ you can also create the file of comma separated values by printing the 856 from Aleph in Aleph sequential format for the records that you want to upload to HT. This will provide you with the bib number and IA identifier, but you will need to massage the data to get it into the proper format. This has been useful for Massachusetts Documents records, because many of the print records never had an OCLC number. For digitized content other than digitized theses and dissertation,​ you can also create the file of comma separated values by printing the 856 from Aleph in Aleph sequential format for the records that you want to upload to HT. This will provide you with the bib number and IA identifier, but you will need to massage the data to get it into the proper format. This has been useful for Massachusetts Documents records, because many of the print records never had an OCLC number.
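
Whichever way the comma separated file is produced, it can help to check it before running the PHP script. The sketch below is an illustration under stated assumptions, not part of the documented workflow. It checks the file name against the ''ialist_YYYYMMDD_[local identifying information].csv'' rule (no spaces) and flags rows that are not exactly a bib number followed by an IA identifier. It assumes bib numbers consist only of digits, so a leftover header row is reported as an error. The function names are hypothetical.

```python
import csv
import io
import re

# Naming rule from the text: ialist_YYYYMMDD_<local identifying information>.csv, no spaces.
FILENAME_RE = re.compile(r"ialist_\d{8}_[^\s/\\]+\.csv")


def check_filename(name):
    """Return True if the file name follows the ialist naming rule."""
    return FILENAME_RE.fullmatch(name) is not None


def find_bad_rows(csv_text):
    """Return (row number, reason) pairs for rows that are not 'bib number,IA identifier'."""
    problems = []
    for row_no, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        cells = [cell.strip() for cell in row]
        if len(cells) != 2:
            # Extra cells usually mean an error column or other columns were left in.
            problems.append((row_no, f"expected 2 columns, found {len(cells)}"))
        elif not cells[0] or not cells[1]:
            problems.append((row_no, "empty bib number or IA identifier"))
        elif not cells[0].isdigit():
            # Assumption: bib numbers are all digits, so this is likely a header row.
            problems.append((row_no, "bib number is not numeric"))
    return problems
```

An empty list from ''find_bad_rows'' means every row has the two expected columns. Anything it reports should be fixed in the file by hand, following the Pick List cleanup rules above.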
  
  
hathitrust.txt · Last modified: 2019/01/07 17:20 (external edit)