VIVO at the University of Idaho

Information about VIVO at the University of Idaho

Published on February 27, 2014

Author: anniegaines



In 2012, the University of Idaho Library began implementing VIVO, an open-source Semantic Web application, both as a discovery layer for its fledgling institutional repository and as a database to describe, visualize, and report university research activity. The presenters will detail some of the challenges they encountered developing this resource, while discussing the tools and techniques they used for obtaining, editing, and uploading institutional data into the RDF-based VIVO system.


What is VIVO?
- An open-source Semantic Web application
- Built on RDF (Resource Description Framework) triples, which are controlled subject-predicate-object expressions that produce consistent relationships, and on data harvesting procedures
- Data is structured so that it can be shared and reused using Linked Data practices and standards
- Freely available, with a community of librarians and web developers
- Collects, ingests, and publishes (public/private) data in batches to create a searchable, browsable, and reusable network of information on research and researchers
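To illustrate the subject-predicate-object structure described above, here is a minimal Python sketch that holds two triples as tuples and writes them out in N-Triples syntax. The individual URI under example.org and the name "Doe, Jane" are hypothetical; they are not taken from the University of Idaho instance.

```python
# Each RDF triple is a (subject, predicate, object) expression.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"

# Hypothetical URI for one researcher (an assumption for illustration).
person = "http://example.org/individual/n1001"

# Objects are tagged as either another resource ("uri") or a plain literal.
triples = [
    (person, RDF_TYPE, ("uri", FOAF_PERSON)),
    (person, RDFS_LABEL, ("literal", "Doe, Jane")),
]

def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples as N-Triples lines."""
    lines = []
    for s, p, (kind, value) in triples:
        if kind == "uri":
            obj = f"<{value}>"
        else:
            # Escape backslashes and double quotes inside the literal.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)

print(to_ntriples(triples))
```

Because both statements share the same subject URI, any system that reads them can tell they describe the same person, which is what makes the data reusable as Linked Data.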

Early History of VIVO
- 1997-2005: VIVO Network idea developed at Cornell for the life and social sciences. Intended to provide a view of sciences and research “across disciplinary and administrative boundaries.”
- 2005: Released for Life Sciences
- 2007: Expanded to all of Cornell University (through the Library)
- 2009: $12.2 million NIH grant provided to develop a national version with several other partners
- 2010 – Present: More and more institutions adopting and developing VIVO instances
From “VIVO: Enabling National Networking of Scientists”

VIVO at the University of Idaho
Spring 2012 – Fall 2012
- Approached by Idaho INBRE (a biomedical researcher network in Idaho) with a question about possibly installing a VIVO instance
- Installed VIVO and began setting up and learning the system while gathering feedback from INBRE and other stakeholders
- Garnered approval from INBRE faculty to publish their information in the system
- Harvested INBRE-related information from public resources: PubMed and the NIH and NSF grants databases

VIVO at the University of Idaho
Spring 2013
- Began to pursue an expanded VIVO
- Received approval from the institutional IT evaluation group to go forward
- Re-branded the instance
- Presented VIVO to library faculty and administration as a possible project going forward
- Presented the instance and a proposal for a new position to the VP of Research

VIVO at the University of Idaho
Summer 2013
- VP approved expanded use of VIVO for research groups on campus and funding for the position
- Annie Gaines begins as Scholarly Communication Librarian
- Ingest, ingest, ingest
- Added three additional research groups, as well as the Law School, and associated faculty
- Added thousands of grants, publications, and people to the system

VIVO at the University of Idaho
Fall 2013
- Presented VIVO publicly on campus for the first time
- VIVO goes live (accessible from off campus)
- Additional organizational descriptions added (department, college, grant structures, etc.)
- Gained approval and access to use the campus database system, Banner

VIVO at the University of Idaho
VIVO Today
- Beginning to explore VIVO as a front end for historical documents
- Adding all University faculty
- Creating applications and access points for data
- Cleaning, always cleaning …
- Using this presentation as a prompt for further development of the application, as well as for further defining:
  - the system’s presentation
  - our data’s preservation
  - our mission and goals in using the system

Hosting
- Provided by the Northwest Knowledge Network (NKN)
  - NKN focuses on providing technical support to researchers
  - Division of UI’s Office of Research
  - Strong relationship with the UI Library (they are in the building)
- Data is replicated to a data center at Idaho National Laboratory
- Presents future opportunities for integrating VIVO’s information with other research-related tools/systems

Technical Specs
- Our installation
  - Apache Web Server
  - MySQL
  - Red Hat Linux
  - Tomcat
- Current version of VIVO
  - 1.5.2
  - Probably upgrade to 1.6 in March 2014

Building VIVO – Two Approaches
Approach #1 – the high-resource approach (ideal)
- Requires
  - Available programmers and developers
  - Discrete IT department
  - Formal IT project management
- Advantages
  - Advanced customization and configuration
  - High level of integration into existing systems/services
  - Reasonably short time from inception to production
- Disadvantages
  - Red tape
  - Represents a large commitment by the unit

Building VIVO – Two Approaches
Approach #2 – the low-resource approach (practical)
- Requires
  - Experimental mindset
  - Minimum recommended staff identified in the VIVO implementation guide
  - Viewing VIVO as a series of small projects, rather than one large integration into university activities
- Advantages
  - Simple
  - Manageable
- Disadvantages
  - Time (takes much longer)
  - Integration with existing services
  - Creation of custom data ingest tools

Implementation Goals
- Start with low-hanging fruit; it is easier to collect
- When considering custom tools and processes, our priorities are:
  1. Re-use from the community or locally
  2. Buy if possible
  3. Build as needed
- Build institutional interest in the existing data before soliciting more resources to further our development
- Investigate third-party solutions (Symplectic Elements) as alternatives to custom-building internal methods of collecting data

Data Ingestion - General
Typical workflow:
1. Receive data in source format
2. Convert to RDF (usually RDF/XML or Turtle)
3. Associate with VIVO ontology (as needed)
4. Reconcile against existing database
5. Load into the application
6. Re-index if needed
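The first four workflow steps above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the CSV columns, the example.org namespace, and the award records are all invented, and the output is N-Triples rather than RDF/XML or Turtle to keep the example short. Loading and re-indexing (steps 5 and 6) happen inside VIVO and are not shown.

```python
import csv
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
VIVO_GRANT = "http://vivoweb.org/ontology/core#Grant"
BASE = "http://example.org/individual/"  # hypothetical local namespace

# Step 1: source data (an in-memory CSV stands in for a real export).
SOURCE = "award_id,title\nA-1,Soil Study\nA-2,River Survey\n"

def convert(source_text):
    """Steps 2-3: turn each row into triples typed with a VIVO class."""
    triples = []
    for row in csv.DictReader(io.StringIO(source_text)):
        uri = BASE + "grant-" + row["award_id"]
        title = row["title"].replace("\\", "\\\\").replace('"', '\\"')
        triples.append((uri, RDF_TYPE, f"<{VIVO_GRANT}>"))
        triples.append((uri, RDFS_LABEL, f'"{title}"'))
    return triples

def reconcile(triples, existing_subjects):
    """Step 4: drop triples whose subject is already in the database."""
    return [t for t in triples if t[0] not in existing_subjects]

def serialize(triples):
    """Produce N-Triples text that is ready to be loaded (step 5)."""
    return "\n".join(f"<{s}> <{p}> {o} ." for s, p, o in triples)

# Pretend award A-1 was loaded in an earlier batch.
existing = {BASE + "grant-A-1"}
new_triples = reconcile(convert(SOURCE), existing)
print(serialize(new_triples))
```

Reconciling on the subject URI is the simplest possible check; real data usually also needs matching on names and identifiers, since the same grant or person can arrive from several sources.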

Data Ingestion - Sources
- Public sources
  - NSF, NIH, USDA awards
  - PubMed
- Commercial sources
  - Web of Science
  - Must remove “intellectual effort”
- CVs, publication lists
  - Must have some means of soliciting them
- Local databases (central university, research groups)
  - Several institutional sources
  - Must work through the gatekeepers of each
  - Need data security review to ensure that institutional concerns are met before public exposure

Data Ingestion - Tools
- VIVO Harvester
  - Extract, Transform, and Load (ETL) tool that takes data from a source and loads it into VIVO automatically
- OpenRefine
  - Data cleaning tool
  - Very flexible for different datatypes
  - Extension enables export in RDF format
  - Reconciliation service allows us to match and deduplicate entries before export
- Custom conversion tools (in Python)
  - Used for CRIS reports output, as well as other consistent but unusual formats
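As a rough idea of what matching and deduplicating entries involves, the sketch below groups name variants by a normalized key. It is a simplified, standard-library stand-in modeled loosely on fingerprint-style key clustering; it does not call OpenRefine or its reconciliation service, and the names are invented.

```python
import string
from collections import defaultdict

def fingerprint(name):
    """Normalize a name so trivially different spellings share one key."""
    cleaned = name.strip().lower()
    # Remove ASCII punctuation such as commas and periods.
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, drop duplicate tokens, and sort them.
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)

def cluster(names):
    """Group names by fingerprint; keep only groups with 2+ spellings."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return {key: vals for key, vals in groups.items() if len(vals) > 1}

# Invented sample names for illustration.
names = ["Doe, Jane", "jane doe", "Jane  Doe.", "Roe, Richard"]
print(cluster(names))
```

A key like this catches differences in case, punctuation, and word order, but not initials or misspellings, so candidate matches still need human review before export.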

Ontology Extensions
- Custom University of Idaho model prefixed with “uidaho:”
- Goals with our extensions
  - Establish the local need before creating
  - Re-use as much as possible
  - Always associate classes within the VIVO hierarchy so that data is not fully reliant on uidaho for context
- Examples
  - Members of Idaho EPSCoR, Idaho INBRE, REACCH-PNA
  - Non-UI/Courtesy Faculty
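One way to picture the goal of keeping local classes inside the VIVO hierarchy is a small script that writes out Turtle declaring each uidaho: class as a subclass of an existing class. This is a sketch only: the namespace URI bound to uidaho: and the local class names (CourtesyFaculty, InbreMember) are hypothetical and are not the ones used in the actual University of Idaho model.

```python
VIVO_NS = "http://vivoweb.org/ontology/core#"
UIDAHO_NS = "http://example.org/ontology/uidaho#"  # hypothetical namespace URI

# Hypothetical local classes, each mapped to (parent class, label).
LOCAL_CLASSES = {
    "CourtesyFaculty": ("vivo:FacultyMember", "Courtesy Faculty"),
    "InbreMember": ("foaf:Person", "Idaho INBRE Member"),
}

def extension_turtle(classes):
    """Emit Turtle that anchors every local class under a parent class."""
    lines = [
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .",
        f"@prefix vivo: <{VIVO_NS}> .",
        f"@prefix uidaho: <{UIDAHO_NS}> .",
        "",
    ]
    for name, (parent, label) in sorted(classes.items()):
        lines.append(f"uidaho:{name} a owl:Class ;")
        lines.append(f"    rdfs:subClassOf {parent} ;")
        lines.append(f'    rdfs:label "{label}" .')
    return "\n".join(lines)

ttl = extension_turtle(LOCAL_CLASSES)
print(ttl)
```

Because each local class has an rdfs:subClassOf link to a shared class, a consumer that knows nothing about uidaho: can still interpret the individuals as faculty members or people.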

Data Re-use - Fuseki
- Apache Jena - Fuseki project
  - Enables external access to VIVO data
  - Without Fuseki, data re-use is limited to those authenticated with the system
- Created examples of data re-use to assist in marketing efforts
  - Goal: to establish the added value of putting data in VIVO
  - Example: Labs that need to report the results of their research by creating publication lists, or by displaying spatial, temporal, or conceptual aspects of UI research to stakeholders or students, could use this feature

Data Re-use - Fuseki
Example 1: A very simple way to look at awards data. This presents the number of awards by agency. It uses a JavaScript library called sgvizler to turn JSON data from Fuseki into a Google Charts visualization.
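The slide's example runs in the browser with sgvizler; the Python sketch below only illustrates the same idea from a script: send a SPARQL query to a Fuseki endpoint and flatten the JSON results into award counts per agency. The endpoint URL is hypothetical, the class and property names in the query (vivo:Grant, vivo:grantAwardedBy) are assumptions about how awards are modeled, and the sample response contains invented values.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical Fuseki query endpoint; the real URL is not given in the slides.
ENDPOINT = "http://example.org/fuseki/vivo/query"

# Class and property names are assumptions about the award data model.
QUERY = """
PREFIX vivo: <http://vivoweb.org/ontology/core#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?agency (COUNT(?grant) AS ?awards)
WHERE {
  ?grant a vivo:Grant ;
         vivo:grantAwardedBy ?org .
  ?org rdfs:label ?agency .
}
GROUP BY ?agency
ORDER BY DESC(?awards)
"""

def build_request(endpoint, query):
    """Prepare (but do not send) a SPARQL protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})

def awards_by_agency(results_json):
    """Flatten SPARQL JSON results into (agency, count) pairs."""
    rows = json.loads(results_json)["results"]["bindings"]
    return [(r["agency"]["value"], int(r["awards"]["value"])) for r in rows]

# Invented response in the standard SPARQL JSON results layout.
XSD_INT = "http://www.w3.org/2001/XMLSchema#integer"
SAMPLE = json.dumps({
    "head": {"vars": ["agency", "awards"]},
    "results": {"bindings": [
        {"agency": {"type": "literal", "value": "Agency A"},
         "awards": {"type": "literal", "datatype": XSD_INT, "value": "12"}},
        {"agency": {"type": "literal", "value": "Agency B"},
         "awards": {"type": "literal", "datatype": XSD_INT, "value": "5"}},
    ]},
})

request = build_request(ENDPOINT, QUERY)
print(awards_by_agency(SAMPLE))
```

Against a live endpoint, the prepared request could be passed to `urllib.request.urlopen` and the decoded response body handed to `awards_by_agency`; the resulting pairs are the same shape of data a charting library needs.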

Data Re-use - Fuseki
Example 2: Another simple view using sgvizler. This shows a comparison of two variables – awards and publications – for personnel in a specific research group. It would need work to serve as a formal graph, but it points to the way that the data can be reused.

Data Re-use - Fuseki
Example 3: Another simple example of data re-use, using a JavaScript/Ajax technique to display a list of journal titles and faculty within a specific research group. Each faculty member’s name links to their VIVO profile.
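The example above is built with JavaScript in the browser. As a server-side analogue, and not the presenters' code, the Python sketch below turns SPARQL JSON result bindings into an HTML list in which each name links to a profile URI. The variable names (person, name) and the sample rows are hypothetical.

```python
from html import escape

def faculty_list_html(bindings):
    """Render SPARQL JSON bindings as a list of linked faculty names."""
    items = []
    for row in bindings:
        # Escape both values so names or URIs cannot break the markup.
        name = escape(row["name"]["value"])
        url = escape(row["person"]["value"], quote=True)
        items.append(f'<li><a href="{url}">{name}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"

# Invented rows in the shape of SPARQL JSON result bindings.
bindings = [
    {"person": {"type": "uri", "value": "http://example.org/individual/n1001"},
     "name": {"type": "literal", "value": "Doe, Jane"}},
    {"person": {"type": "uri", "value": "http://example.org/individual/n1002"},
     "name": {"type": "literal", "value": "Roe & Poe, Richard"}},
]

page_fragment = faculty_list_html(bindings)
print(page_fragment)
```

Because the profile URI is already part of the query result, no extra lookup is needed to build the link.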

VIVO as Institutional Repository

Background
- When Annie was brought on for Scholarly Communications, one of her tasks was to develop an IR for the UI.
- Some potential platforms for the UI IR:
  - CONTENTdm – too flat
  - Bepress – too expensive
  - VIVO?

‘Institutional repositories’
“A set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members.”
Clifford Lynch, ARL Bimonthly Report 226, Feb. 2003.
“Digital collections that capture and preserve the intellectual output of university communities.”
Raym Crow, The Case for Institutional Repositories, SPARC, 2002.

‘Institutional repositories’
- Are:
  - Collection of scholarly work
  - Institutionally defined and managed
  - Both cumulative and perpetual
  - Open
- Provide:
  - Long-term preservation
  - Wide dissemination
  - Showcase for scholars and the institution

Challenges
- Copyright issues, varying access
- Buy-in from faculty, voluntary submissions
- Getting people to care

VIVO as IR?
- Not your typical IR interface
  - Interconnectedness in a large network
  - Includes diverse materials, not just article pre-prints
  - Includes citations for all works, not just the ones hosted in the IR
  - Dynamic browsing and searching
- Linked data format allows for reuse of data for a variety of purposes
- The following page shows a thesis document in VIVO

Theory vs. Practice
- Although VIVO can act as a front end, the documents must be hosted elsewhere
- We deposit our documents in CONTENTdm and link to the PDF in VIVO
- This makes things easier, but also more complicated
- See the example of the same thesis document in CONTENTdm on the next page

Theory vs. Practice
We wanted to close this presentation by asking the group some questions. If you have any advice for us on this project, we would love to hear from you!
- Are more access points better or more confusing?
- Should we include historical documents in the VIVO IR?
- Which page should be the main collection?
- Should we provide links to all collections? Or link from one into the other?
- What are best practices for unusually constructed IRs?

Thank you!
