Photo credit: Sameer Khan/Fotobuddy
Written by Brandon Johnson, Communications Strategist
May 29, 2025
How can artificial intelligence improve archivists’ ability to create meaningful and searchable metadata? James Zhang ’25, a computer science graduate, explored just that in his Princeton University Library (PUL) Award-winning project for Princeton Research Day. Now in its fourth year, the Library award recognizes student work that made thoughtful or innovative use of Library resources.
Zhang’s project, “Optical Character Recognition (OCR) with Large Vision-Language Models (LVLMs),” centered on library and archival efforts to digitize items in the name of accessibility. An outgrowth of his work with Wouter Haverals, Mary Naydan, and Brian Kernighan, Zhang’s project focused on improving the Princeton Prosody Archive (PPA). The Archive, a searchable database of thousands of English-language books published between 1532 and 1928, highlights developments in the study of language and poetry, tracking how these fields converge and diverge over time.
“Wouter and Mary first pointed out that the existing OCR was often subpar, which limited the archive’s usability,” Zhang said. “During the spring of my junior year, we tackled the post-OCR correction problem, asking: ‘if all we had was the raw OCR text, could we reconstruct the correct transcription?’”
In pursuit of making PPA even more functional, Zhang questioned whether “recognition” was really the problem that needed solving. Using LVLMs, which process both text and visual data simultaneously, Zhang’s Princeton Research Day project explored how treating documents as visual objects, rather than simply swathes of text, can scale the ability to create rich, informative metadata. “What we needed wasn’t just more accurate transcriptions: it was better representation,” Zhang said.
In addition to working with the Center for Digital Humanities and the Computer Science department, Zhang turned to Princeton University Library, whose staff shaped the way he thought about metadata, digitization, and “useful” information. He added, “Sarah Reiff Conell, Bryan Winston, Dan Linke, Morgan Kirkpatrick, Andy Janco, Amy Vo: their insights grounded my technical work in the realities of library workflows and showed me how much invisible labor goes into making collections accessible.”
For his thesis, Zhang also created “MetaScribe,” an open-source tool that leverages LVLMs to extract information from scanned documents and create metadata based on customizable fields. “It is designed to scaffold the metadata creation process, offering a flexible starting point that archives and libraries can adapt to their own needs,” Zhang said. “With a richer metadata backbone, we can then enable more powerful discovery and computational scholarship.”
A video about Zhang’s project is viewable on Media Central.