AddPinyin Plugin for MarcEdit

The East Asian Library and the Gest Collection

普林斯頓大學葛思德東亞圖書館 ・ プリンストン大学東アジア図書館 ・ 프린스턴 대학교 동아시아 도서관

Click Here to Download the Plugin Installer (EXE, packaged in a ZIP file)

  • Version 1.1.0, last updated 2016-12-20 (Click here to see version history).
  • Compatible with the Windows version of MarcEdit (version 6).
  • This plugin does not need to be installed as Administrator.  It should be installed while logged in as the user that will be using the software.

Creative Commons License
AddPinyin Plugin for MarcEdit by Princeton University Library is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

The Chinese dictionary data is based on three sources:

  • The Unihan database, copyright 1991-2016, Unicode, Inc. Last updated 2016-06-01.
  • CC-CEDICT, copyright 2016, MDBG. Last updated 2016-12-14.
  • User feedback: To suggest additional characters or phrases for the dictionary, please use the suggestion form.

The AddPinyin plugin takes a set of MARC records and converts Chinese text in selected fields to pinyin. For each converted field, the original text is moved to a corresponding 880 field. A subfield 6 with the appropriate linkage value is automatically added to both the converted field and the 880. 
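
The pairing described above can be illustrated with a minimal Python sketch. This is not the plugin's actual code: the function names, the tuple layout for fields, and the three-character dictionary are all hypothetical, and the sketch assumes the record has no existing linked fields. The linkage format (occurrence number plus the "$1" script code for CJK) follows the general MARC 21 convention for subfield 6 and is an assumption about what "the appropriate linkage value" looks like.

```python
import re

# Toy dictionary for illustration only (hypothetical); the real plugin
# draws on Unihan, CC-CEDICT, and user feedback.
TOY_PINYIN = {"中": "zhong", "国": "guo", "史": "shi"}

# Basic CJK Unified Ideographs block only; a simplification.
CJK = re.compile(r"[\u4e00-\u9fff]")


def romanize(text):
    """Replace each CJK character with its pinyin followed by a space."""
    def repl(match):
        ch = match.group()
        return TOY_PINYIN.get(ch, ch) + " "
    return CJK.sub(repl, text).strip()


def add_pinyin(fields, tags_to_convert):
    """Romanize selected fields and pair each with an 880 field.

    fields: list of (tag, indicators, [(subfield_code, value), ...]).
    Returns a new list: the (possibly converted) fields, then the 880s.
    """
    converted, linked = [], []
    occurrence = 0  # assumes no linked fields exist in the record yet
    for tag, ind, subfields in fields:
        has_cjk = any(CJK.search(value) for _, value in subfields)
        if tag not in tags_to_convert or not has_cjk:
            converted.append((tag, ind, subfields))
            continue
        occurrence += 1
        link = f"{occurrence:02d}"
        roman = [(code, romanize(value)) for code, value in subfields]
        # Subfield 6 on the romanized field points at the 880 ...
        converted.append((tag, ind, [("6", f"880-{link}")] + roman))
        # ... and subfield 6 on the 880 points back at the original tag.
        linked.append(("880", ind, [("6", f"{tag}-{link}/$1")] + list(subfields)))
    return converted + linked
```

Calling `add_pinyin([("245", "10", [("a", "中国史")])], {"245"})` yields a 245 field containing "zhong guo shi" and an 880 field containing the original text, each carrying a subfield 6 that points to the other. Real ALA-LC romanization also handles word division and capitalization, which this sketch ignores.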

The plugin's functionality complements that of the OCLC Connexion Pinyin Conversion Macro; whereas the OCLC Macro is run on individual fields within a single record, the MarcEdit Plugin is designed for batch processing.  Also, since the plugin runs within MarcEdit, it is independent of a specific catalog.

The plugin generates pinyin according to ALA-LC standards. The output is similar to that produced by the OCLC Macro; please see the documentation for that macro for specifics. Because romanization is difficult to automate with 100% accuracy, it is always beneficial to proofread the results manually when practical. Most of the needed adjustments, however, will concern spacing, capitalization, and punctuation rather than the pinyin itself. Efforts have been made to keep even these minor inaccuracies to a minimum. If you notice any errors or would like to suggest new phrases for the dictionary, please submit them using this form. (It is the same form used to suggest phrases for the OCLC Macro.)

This plugin can be run on files containing both Chinese and non-Chinese records. The plugin examines the 008 field to identify Chinese records, and leaves everything else untouched. The program also will not touch records where Chinese text only appears in the 880 fields (i.e., romanization has already been added). If a record has romanization added for some but not all Chinese fields, the plugin can add romanization for the remaining unconverted fields (if desired). The main dialog of the program will tell you how many records contain Chinese text that could potentially be converted.
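
As a rough illustration of this selection logic, the following Python sketch counts the records in an MRK file that would be candidates for conversion. It is an approximation under stated assumptions, not the plugin's implementation: it assumes the language check reads the MARC 21 language code at positions 35-37 of the 008 field and looks for "chi", that records in the MRK text are separated by blank lines, and that Chinese text can be detected with the basic CJK Unified Ideographs range. All function names are hypothetical.

```python
import re

# Basic CJK Unified Ideographs block only; a simplification.
CJK = re.compile(r"[\u4e00-\u9fff]")


def parse_mrk(text):
    """Split MRK text into records, each a list of (tag, data) pairs.

    Each MRK line looks like "=245  10$aTitle": "=", a 3-character tag,
    two spaces, then the field data.
    """
    records = []
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = [(line[1:4], line[6:])
                  for line in block.splitlines() if line.startswith("=")]
        if fields:
            records.append(fields)
    return records


def is_candidate(record):
    """True if 008/35-37 is 'chi' and some non-880 field has Chinese text."""
    lang = next((data[35:38] for tag, data in record if tag == "008"), "")
    if lang != "chi":
        return False  # non-Chinese records are left untouched
    # Chinese text found only in 880 fields means romanization already exists.
    return any(tag != "880" and CJK.search(data) for tag, data in record)


def count_candidates(text):
    """Number of records with Chinese text that could still be converted."""
    return sum(is_candidate(record) for record in parse_mrk(text))
```

A record whose Chinese text appears only in 880 fields is skipped, while a partially romanized record still counts because at least one regular field contains Chinese characters.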

To run the plugin, do the following:

  1. Create a backup of the MARC record file to be converted.
  2. Convert the file to MRK format (using the MarcBreaker tool in MarcEdit) and open it in MarcEditor. The file may be encoded in either UTF-8 or MARC-8, but the program takes much longer to run on MARC-8 files. It may be more efficient to use MarcBreaker to convert the file to UTF-8 before running the plugin, then convert it back to MARC-8 afterwards if needed.
  3. Open the "Plugins" menu and select "AddPinyin". A dialog will appear warning you that the conversion cannot be undone and that the MRK file will be saved automatically after conversion. (This is why it is important to back up the file.) Click "OK".
  4. A dialog will appear asking which fields to generate romanization for. Certain fields are selected by default. Use the arrow buttons to specify the fields you would like to convert, then click the "Convert" button.
  5. After romanization is complete, the updated records will be displayed in the MarcEditor. Compile the file back to MRC format by opening the "File" menu and selecting "Compile File into MARC".
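
Because the conversion cannot be undone, the backup in step 1 is worth scripting if you process files regularly. The sketch below is one possible approach in Python, using only the standard library; the function name and the timestamped naming scheme are illustrative choices, not part of the plugin or of MarcEdit.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_file(path):
    """Copy a file to a timestamped sibling and return the backup's path.

    For "records.mrc" the copy is named like
    "records.20161220-093000.bak.mrc" (hypothetical naming scheme).
    """
    src = Path(path)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = src.with_name(f"{src.stem}.{stamp}.bak{src.suffix}")
    shutil.copy2(src, dest)  # copy2 also preserves file timestamps
    return dest
```

Running `backup_file("records.mrc")` before opening the file in MarcEditor leaves an untouched copy next to the original.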