

Batch Uploading OAIs from Scholarworks into OCLC and Aleph

CHANGE TITLE? ETDs (Current) Processing ScholarWorks OAIs

NOTE: My helpful “hints” will appear in Italics.

Introduction

The Graduate School emails “Packing Lists” dated February, May and September (the ends of semesters) listing new dissertations, theses, MFA theses and, occasionally, LARP theses. There may be a lag between these dates and when the ETDs become available on ScholarWorks. I try to process them after a couple of months have passed, to ensure that they will be picked up in the Crosswalk harvest.

Preparation

  • Have a handy copy (either online or a printout) of the Packing List-in-process. NOTE: It's a good idea to save copies of these in appropriate folders. Example: PackingListReport_Feb2019diss.xlsx in [Drive]:\OAI\Dissertations\2019\ (i.e., the folder for that year), or the matching folder under OAI\Theses, ThesesMFA or ThesesLARP.
  • Open MarcEdit. (NOTE: Make sure your MarcEdit XSLT engine is set to SAXON.NET. On the MarcEdit home page, click Tools (found at the top), then Preferences, then MARCEngine, and select SAXON.NET under XSLT Engine.)

Harvesting from ScholarWorks

  1. Click on Harvest OAI Records (found either on the MarcEdit home page or under Tools/OAI Harvester Tools at the top) and set the following:
    • Set name (for dissertations): publication:dissertations_harvesting (IMPORTANT NOTE: Because of software changes made in 2018, Erin Jerome needs to be informed before running a Crosswalk on dissertations only! Before they can be pulled, they need to be transferred from “publication:dissertations_2” to a special harvesting subset.)
    • Set name (for theses): publication:masters_theses_2
    • Set name (for MFAs): publication:englmfa_theses (NOTE: This series is only for English MFAs; MFAs for art etc. are included in masters_theses_2.)
    • Set name (for LARPs): publication:
    • Metadata type: dcq (NOTE: This is not included in the MarcEdit drop-down, but needs to be typed in. It's a “modified” version of Dublin Core.)
    • Crosswalk path: C:\Crosswalk\XML1\OAIDCtoMARCXMLmodified.xsl (NOTE: This program needs to be loaded onto your personal C: drive.)
    • Start date (for May, in this format): 2019-06-01
    • End date (for May, in this format): 2019-08-31 (NOTE: Using August avoids Sept. lists. Occasionally these dates have to be tweaked to include everything on the appropriate Packing List.)
    • Hit “OK” and let it run. A green bar will appear if it is working. (NOTE: This function is a little cranky. Recently it didn't work for me because I entered 2019-11-31 instead of 2019-11-30. Everything has to be entered precisely! If no amount of tweaking resolves the issue, contact Erin Jerome, Aaron Rubinstein, or bepress (Digital Commons), which occasionally blocks ScholarWorks harvesting for security purposes.)
    • Once the harvesting is finished, a MarcEdit list will open up, containing the harvested records in raw form. I like to save this immediately into the appropriate OAI folder, as (example) umdissertations_sept.mrk
  2. Check harvested records against the Grad School's packing list
    • Hint: In MarcEdit, click Edit/Find/enter =100 in “Find what” window/click Find All. This will produce a list that can be saved to the clipboard, and copied into Excel or another program. (NOTE: When working in MarcEdit, click File/Save after every change!! Do NOT Save if no changes are made.)
    • IMPORTANT NEW STEP, added 2020: Go to ScholarWorks/Dissertations and Theses and log onto “My account”, scroll down to the appropriate series (e.g., DOCTORAL DISSERTATIONS (dissertations_2)), then choose Manage Dissertations/Batch revise Excel/Generate a spreadsheet of current data. See “Changing one year campus titles to open access in ScholarWorks” for instructions on generating ScholarWorks spreadsheets.
    • If extra names appear in the MarcEdit file, check the generated spreadsheet to make sure they are NOT dated in the range requested. Any harvested record that is NOT on the Packing List and has a different date (check degree_year and award_month), or that belongs to a different series (such as English MFAs), can be removed from the MarcEdit file.
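Two of the checks in the steps above are easy to get wrong by hand: typing an impossible date (such as 2019-11-31) and spotting harvested names that are not on the Packing List. The following is only a rough Python sketch of both checks, not part of the MarcEdit workflow itself. It assumes the =100 lines look like MarcEdit's usual mnemonic (.mrk) format with the name in subfield $a, and that names are written the same way in both lists; the function names are made up for illustration.

```python
from datetime import date


def check_harvest_window(start: str, end: str) -> tuple[date, date]:
    """Parse two YYYY-MM-DD strings; raise ValueError if either is impossible
    (e.g. 2019-11-31) or if the start falls after the end."""
    first = date.fromisoformat(start)
    last = date.fromisoformat(end)
    if first > last:
        raise ValueError(f"start {start} is after end {end}")
    return first, last


def author_headings(mrk_text: str) -> list[str]:
    """Return the subfield $a text of every =100 line in a .mrk file."""
    names = []
    for line in mrk_text.splitlines():
        if not line.startswith("=100"):
            continue
        # Subfields are introduced by "$"; keep only the first subfield a.
        for chunk in line.split("$")[1:]:
            if chunk.startswith("a"):
                names.append(chunk[1:].strip().rstrip(","))
                break
    return names


def not_on_packing_list(harvested: list[str], packing_list: list[str]) -> list[str]:
    """Return harvested names that do not appear on the packing list,
    ignoring upper/lower case differences."""
    wanted = {name.casefold() for name in packing_list}
    return [name for name in harvested if name.casefold() not in wanted]
```

Any name returned by not_on_packing_list is only a candidate for removal; it still needs to be checked against the ScholarWorks spreadsheet as described above.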

Once our dissertations and theses have been OAI harvested from Digital Commons (bepress) via MarcEdit, run through MarcEdit's Task list, and loaded into Connexion, the bib records are ready to be batch uploaded to OCLC and then batch exported to Aleph.

To upload from Connexion to OCLC:

After importing the bib records file from MarcEdit:

(For example purposes, we will use the Connexion file for February 2016 Dissertations, which can be opened via Cataloging → Search → Local Save File → T:\\oclcapps\Connexion\Theses\2016_Feb_Dissertations.bib.db)

  • Highlight all records in the file and Validate (Edit → Validate or Shift+F5). This will generate a report of results. Note which records did not validate and make the necessary corrections. Re-validate as needed.
  • Highlight all records in the file and Update Holdings (Action → Holdings → Update Holdings or F8). OCLC record numbers will begin appearing in the file as each record is uploaded.

To export from Connexion to Aleph:

  • Go to Tools → Options and click on the Export tab. Highlight the Prompt for filename option then check off the box for Display report for immediate export results. Click on Apply then Close.
  • Open the Local Save file you want to export (2016_Feb_Dissertations - See path above)
  • Highlight records
  • Export (Action → Export or F5)

This will ask where to put the output file in your C: drive and what name to use. Make sure the filename is in all lower case - for example, feb2016diss. The file will be downloaded into your C: drive as a .dat file. (Example: C:\Crosswalk\Dissertation&Theses\Connexion_Records\feb2016diss.dat)

  • Open MARCTools in MarcEdit.
  • Input the .dat file from your C: drive (feb2016diss.dat) and name the Output file with a .mb extension (feb2016diss.mb). Execute the MarcBreaker.
  • Click on Edit Records. Use Replace to change AUMM to AUMETD.
  • Under MarcEditor → File, click on Compile File into MARC. This will save the file as a .mrc (MARC) file.
  • Open Aleph, Cataloging function
  • Click on Task Manager then [F] Upload/Download files
  • Find where your saved .mrc file is on your C: drive (feb2016diss.mrc) and copy it to the FCL01/Scratch file (from the drop-down menu over the left Remote Files column) by clicking on the left arrow button between the columns
  • In the Aleph menu bar above, click on *_Services → Load Catalog Records
  • Click on Advanced Generic Vendor Records Loader (file_90)
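Before filling in the loader settings, it can be worth confirming that the AUMM to AUMETD replacement made in MarcEdit actually took effect. The sketch below does the same replacement on the text of the .mb file and reports how many occurrences it found. It is an illustration only: the function name is invented, and the 049 field in the sample is an assumption about where the holding code sits in these records.

```python
def replace_holding_code(mnemonic_text: str,
                         old: str = "AUMM",
                         new: str = "AUMETD") -> tuple[str, int]:
    """Replace every occurrence of the old holding code in the mnemonic text.

    Returns the updated text and the number of occurrences that were found,
    so a count of 0 shows that nothing was left to replace.
    """
    count = mnemonic_text.count(old)
    return mnemonic_text.replace(old, new), count


# Example with a made-up record fragment (the 049 field is an assumption).
sample = "=049  \\\\$aAUMM\n=245  10$aAn example title\n"
updated, found = replace_holding_code(sample)
print(found)
print(updated)
```

Running the function a second time on its own output should report a count of 0, which is a quick way to see that the file is ready to be compiled.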

Make sure the following rules are set:

  • Input File name (for this example, feb2016diss.mrc)
  • Default Holding - AUMETD
  • Character Conversion - OCLC_UTF_TO_UTF
  • Fix Routine - UMFIX
  • Match Routine - OCLC
  • Merge Routine - OCLC
  • Update Database - Yes
  • Produce Loading Report - Yes
  • Report file name (for this example, feb2016_report)
  • Click on the Submit button at top right

Once the load has finished, click on Task Manager → [A] Batch Log to view the report.

  • Highlight your file (p_file_90) and click on View Printouts.
  • Under Remote Name, highlight <filename>_report.new (e.g., feb2016diss_report.new)
  • Click on Print to obtain reports. You want the loader-log-report which will show the FCL01 Bib Sys numbers for each record. Copy one and check the bib record which displays for any potential corrections needed.

To Globally Remove the 856 Field from Bib Records:

  • Click on *_Services → Catalog Maintenance Procedures → Global Changes (manage-21)

Set the rules:

  • Input file name <filename>.mrc.bib (i.e., feb2016diss.mrc.bib)
  • Output file name <filename>.mrc856 (i.e., feb2016diss.mrc856)
  • Update Database - Yes
  • Line in Record → Tag → 856; first indicator - #; second indicator - #
  • Delete field - Yes
  • Click on Submit button

The process should now be complete.
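The file names used in the steps above all derive from one lower-case base name (feb2016diss in the example). As a memory aid, here is a small Python sketch that lists them for a given base name and rejects a name that is not all lower case. The function and the dictionary keys are invented for illustration; only the extensions come from the steps above. The loader report name is left out because it is typed in by hand.

```python
def batch_file_names(base: str) -> dict[str, str]:
    """List the file names one batch goes through, given its base name."""
    if base != base.lower():
        raise ValueError(f"file name must be all lower case: {base!r}")
    return {
        "connexion_export": f"{base}.dat",         # exported from Connexion
        "marcbreaker_output": f"{base}.mb",        # produced by MarcBreaker
        "compiled_marc": f"{base}.mrc",            # compiled in MarcEditor, loaded into Aleph
        "global_change_input": f"{base}.mrc.bib",  # input for Global Changes
        "global_change_output": f"{base}.mrc856",  # output after removing the 856 field
    }


for step, name in batch_file_names("feb2016diss").items():
    print(f"{step}: {name}")
```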

Contact persons: Kay Dion or Lucy deGozzaldi


Last modified: 2020/04/28 19:39 by ldegozzaldi