TigerData Logo. Credit: Laurel Cantor, Office of Communications.

May 7, 2025
Written by Adam Hadhazy

A powerful new data management service is about to become available to the Princeton research community. Known as TigerData in an homage to Princeton’s mascot, the service enables researchers to work with their data in better and easier ways.

“We are proud to announce that TigerData is opening its doors to the entire research community of Princeton University,” says Curtis Hillegas, co-chair of the faculty-led TigerData steering committee and Senior Associate Dean for Research Computing. “We designed TigerData to empower researchers to do the things they want to do with their data, both for present projects and for future endeavors.”

TigerData represents the culmination of years of planning and development by Princeton University to implement a system where users can efficiently access, organize, and describe their data within a secure ecosystem. TigerData has also been designed to ably deliver sustainable long-term data storage and re-use, whether for users looking for just a handful of gigabytes or those users whose research spans hundreds of terabytes.

“TigerData welcomes all disciplines, from the sciences to the humanities,” says Wind Cowles, co-chair of the TigerData steering committee and the Associate Dean for Data, Research, and Teaching for the Library. “We look forward to the service helping Princeton’s rich range of researchers focus more on their core scholarly work and less on dealing with data management challenges.”

One such researcher who is eager to start using TigerData is Jon Cohen, the Robert Bendheim and Lynn Bendheim Thoman Professor in Neuroscience and a steering committee member. His research delves into the cognitive and neurobiological mechanisms of how people control their attention, thoughts, and actions toward goals.
With a new magnetoencephalography scanner in the Princeton Neuroscience Institute (PNI) soon to be generating prodigious quantities of human brain activity data from study participants, Cohen feels TigerData has come together at the right time.

“Having significantly more data compute and storage capabilities right here on campus is going to make it possible for us to do experiments that we couldn't have done otherwise,” says Cohen. “Princeton is absolutely in the lead amongst our peer institutions in tackling this new data frontier. My colleagues and I give tremendous credit to the university administration for their willingness to take on this bold and challenging effort.”

Modernizing data management

The impetus for TigerData stems from the continuing increase in the complexity, cost, compliance requirements, and security demands of data-intensive research. As a world-leading research institution, Princeton has historically obtained additional storage for its voluminous faculty-associated data on an essentially ad hoc basis across several platforms. Yet as the stacks have grown higher and the users have multiplied, data management has emerged as a serious issue university-wide.

Research Data Lifecycle Model illustrates the stages of data management. Credit: Princeton Research Data Service.

“In talking to faculty about their data management needs, we’ve heard many stories about how difficult it can be to make data available when that eventual outcome had not been planned for from the beginning of the project,” says Hillegas.

“Speaking as a former researcher, I know the experience of going back to a project even a year later when you’ve not been actively working on it, and saying, ‘Where are the data that I need?’” adds Cowles.

The TigerData approach seeks to alleviate these problems by bringing data management and best practices together.
“The essence of what we want to do with TigerData is that we want to change the culture of Princeton from being data storage-focused to being data management-focused,” says Hillegas.

How TigerData works

Fundamentally, TigerData is a suite of software tools and a collection of scalable, tiered data storage repositories. The tiering approach allows for accessible, cost-effective storage of data over its entire life cycle, from generation through to publication, and then for archiving or deletion, as desired.

TigerData storage racks housed at the High Performance Computing Research Center. Photo: Michael Monaghan, Research Computing.

Connecting these data storage architectures is a product called Mediaflux, a comprehensive data management platform developed by Arcitecta, an Australian software company. Mediaflux collects and organizes data for retrievability, making a user’s data seamlessly accessible regardless of where in the storage architecture they actually reside.

When data are initially housed within TigerData, users fill in metadata fields so that researchers can track and understand what data they have. The metadata can, for instance, identify what the data are, when they were created, and who sponsors them. As TigerData expands, the plan is to enable customizable metadata, such as how the data are funded and how long they need to be preserved.
To help in this metadata-tagging effort, the TigerData team will assist in training a data manager for each research group or department to act as a point person for robust data management. “We’ll be making a community effort to show researchers how to create extensible, flexible, customizable metadata,” says Cowles.

In this way, and reflecting the latest industry trends, TigerData follows the FAIR principles (Findable, Accessible, Interoperable, Reusable) to ensure users get the most out of the system. “We are giving users tools and teaching them practices that let them make their data FAIR so it is as easy as possible to work with now and in the future,” says Cowles.

The TigerData advantage

In addition to these benefits, TigerData offers key advantages over existing third-party data storage and management systems such as Dropbox or Google Drive. Through TigerData, the university will be able to respond to user needs in ways that offsite Big Tech companies simply cannot. On-campus technical and data management teams will support users directly, helping researchers implement optimal data management solutions and responding to evolving user requests.

TigerData design, showing how it manages storage with multiple ways of access. Credit: Irene Kopaliani, Research Computing.

“TigerData will continue to be tuned by the needs of the research community, and we'll change the service over time to meet those needs,” says Hillegas. “It's not an ‘if you build it, they will come’ approach. As we further deploy TigerData and people say they require capabilities that aren't there, we'll add those capabilities.”

One example involves high-performance computing and “hot storage,” which keeps frequently used data quickly accessible. Typically, available legacy options slow down when researchers work with very large files or large numbers of files.
TigerData can avoid such issues more nimbly because its team can help users make sure their data are where they need to be, when they need to be there, for the computing resources they require.

Significantly, the leadership group behind TigerData is also building collaborations with other key data and computational players at the university, along with academic and administrative departments, to effectively implement and support the service. Relationships are in place among Princeton Research Computing, the Office of Information Technology Enterprise Infrastructure Services (OIT EIS), the Princeton Institute for Computational Science and Engineering (PICSciE), the Princeton University Library, and the Support for Computing in Academic Departments/Department Computing Support (SCAD/DCS) program.

As TigerData’s rollout continues, the service will remain provisionally free to all small- and medium-scale users. As the service expands, contributions from the largest data users may become prudent to maintain the system’s overall resiliency, in keeping with general large-scale data management practices.

Looking down the road, the TigerData team anticipates evolving the service to meet the needs of a rapidly changing research landscape and enable new opportunities for leveraging the power of institutional data.

“With TigerData, we’ll constantly stay attuned to longer-term shifts regarding how we work with data as researchers,” says Cowles. “We’re still at an early stage of what TigerData eventually can be. We're at the beginning of something bigger.”