Published on March 4, 2014
Utilitarian Aggregation of Open Data
Srinath Srinivasa, Sweety Agrawal, Chinmay Jog, Jayati Deshmukh
IIIT Bangalore
OSL@IIITB
● Started in 2002
● Current strength: 3 PhD students, 5 MS (by Research) students, 24 MTech project students
● Research Graduates: 4 PhDs, 1 MS
● Part of the Intel PlanetLab grid from 2003 to 2006
● Broad research areas: data, systems and cognition
● Specific research areas over the years:
● Data models for graph databases
● Distributed query processing
● Data management in ad hoc networks
● Semantics mining from text data
● Analytics of online social spaces
● Community knowledge management
● Linked Open Data
● Computational Cognition
Current focus areas
OSL Releases
Agama: A graph database for storing large undirected graphs for efficient traversal (not structure-based retrieval). Currently, Agama powers a co-occurrence graph of all noun phrases from Wikipedia articles, hosted in OSL, managing tens of millions of nodes and hundreds of millions of edges.
OSL Releases
Topical Anchors: Given a list of noun phrases, identifies a semantic topic for these terms. Powered by the Wikipedia co-occurrence graph hosted by Agama. Web APIs enable use of Topical Anchors in third-party applications.
OSL Releases
Topic Expansion: Given a term, expands it into semantically relevant topical clusters with different senses. Uses co-occurrence datasets from Wikipedia 2006 or 2011. Web APIs enable use by third-party applications.
OSL Releases
Silverfish: A social space for managing and discussing research papers. Supports automatic indexing, recommendations and social networking features.
Utilitarian Aggregation of Open Data
Open Data
● Data hosted publicly for use and republication under a free or open license
● Usually comprises structured datasets in the form of tables
● Major government, NGO and corporate players are active in the open data space
Open Data in India: A Summary [Agrawal et al. 2013]
Sandesh
A “semantic data mesh” over Indian Open Data
Connecting elements from different datasets under an overarching semantic structure
Challenges
● Open data is about no single topic in particular, and fits into no single ontology
● Contextual boundaries of open data assertions cannot be modelled using Linked Data standards
● The problem of “open-ended” data
Challenges in Open Data Aggregation
Fragmentation
Challenges in Open Data Aggregation
Bounded validity of utilitarian data
Consider the following RDF statements:
(Einstein, HasWon, NobelPrize): encyclopedic knowledge
● Valid everywhere, without contextual boundaries
● No immediate or specific utility
(Wheat, PricePerKilo, 50): utilitarian knowledge
● Valid only within specific contextual boundaries (market, place, time, etc.)
● Has immediate and/or specific utility
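The contrast between the two statements can be sketched in code. This is only an illustration, not the authors' model: the `Assertion` class, the `holds_in` function, and the market and date values are hypothetical names and data introduced here. An assertion with an empty context is treated as encyclopedic (valid everywhere); one with a non-empty context is utilitarian and holds only where every boundary matches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    subject: str
    predicate: str
    value: object
    # Contextual boundary as (key, value) pairs; empty means "valid everywhere".
    context: tuple = ()


def holds_in(assertion: Assertion, situation: dict) -> bool:
    """True if every contextual boundary of the assertion is met by the situation."""
    return all(situation.get(key) == val for key, val in assertion.context)


# Encyclopedic: no contextual boundary.
einstein = Assertion("Einstein", "HasWon", "NobelPrize")

# Utilitarian: bounded by a (hypothetical) market and date.
wheat = Assertion(
    "Wheat", "PricePerKilo", 50,
    context=(("market", "ExampleMandi"), ("date", "2014-03-04")),
)

print(holds_in(einstein, {"market": "AnotherMandi"}))                      # True
print(holds_in(wheat, {"market": "ExampleMandi", "date": "2014-03-04"}))   # True
print(holds_in(wheat, {"market": "AnotherMandi", "date": "2014-03-04"}))   # False
```

A plain triple has no slot for the boundary, which is why the price statement is misleading when read without its market and date.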
Challenges in Open Data Aggregation
● The “divergent” nature of utilitarian aggregation
● The “convergent” nature of encyclopedic aggregation, like Wikipedia articles
Challenges in Open Data Aggregation
The “divergent” nature of utilitarian aggregation
Utilitarian aggregation involves the creation of several “utility worlds”, each of which combines a given dataset with different other datasets for different utilitarian goals.
Challenges in Open Data Aggregation
Open Data and Credibility
Open data portals hosting utilitarian data (e.g. Data.gov.in) require credential checks on data sources to establish trust. This is less critical for open data portals hosting encyclopedic data (e.g. Wikipedia).
Challenges in Open Data Aggregation
The problem of “open-ended” data
● Data containing private information about an entity p, but which may need to be (legitimately) disseminated to and used by several entities unrelated to p
● The owner (p) of the data may not have knowledge of or control over its consumers, but trusts the system to disseminate the data to legitimate consumers
Example case studies:
● ICSE marks data
● BPL data
Many Worlds on a Frame (MWF)
A trusted, distributed middleware for utilitarian aggregation and dissemination of open data
● Users as knowledge elements
● Aggregated knowledge in utilitarian “worlds”
● Formal model of MWF developed independently, but representable as a superposition of two Frames in Kripke Semantics
[Figure: Users, MWF, Datasets]
MWF: Conceptual World
Conceptual World: A semantic context to host data about something
[Figure: example worlds Place, Person, Institution, Crop]
MWF: Frame
Every conceptual world has a “type” and a “location”, specified by an “is-a” parent and an “is-in” parent respectively. The data structure formed by the is-a and is-in connections is called the Frame.
[Figure: Place, State and City worlds connected by is-a and is-in edges]
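A minimal sketch of the Frame as a data structure follows. The `World` class and `chain` helper are hypothetical names, not part of MWF itself. The arrangement of Place, State and City is one assumed reading of the slide's figure, and treating the topmost world as having no parents is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class World:
    name: str
    is_a: Optional["World"] = None   # "type" parent
    is_in: Optional["World"] = None  # "location" parent


def chain(world: World, relation: str) -> list:
    """Follow one relation ('is_a' or 'is_in') upward and list the names visited."""
    names = []
    parent = getattr(world, relation)
    while parent is not None:
        names.append(parent.name)
        parent = getattr(parent, relation)
    return names


# Assumed arrangement: Place is the topmost world in this toy Frame.
place = World("Place")
state = World("State", is_a=place, is_in=place)
city = World("City", is_a=place, is_in=state)

print(chain(city, "is_a"))   # ['Place']
print(chain(city, "is_in"))  # ['State', 'Place']
```

Each world carries exactly two upward pointers, so the Frame is two hierarchies laid over the same set of worlds.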
MWF: World Structure and Participation
[Figure: Institution, Person and Place worlds; labels: Components, Associations, Member, Office Location, Heads, ReportsTo]
MWF: Privileges
● Credentials of a Person (User) are defined by the roles played by the Person in different worlds
● Credentials determine the privilege level in a target world:
● Administrator
● Schema Manager
● Data owner
● Casual user
● Public
[Figure: User :: Person, Institution]
MWF: Inheritances
The is-a hierarchy inherits:
● World structure
● Attributes
● Participations
● Constraints
The is-in hierarchy inherits:
● Privilege levels
● Visibility
● Construction
● Destruction
[Figure: Place, City and State worlds with is-a and is-in edges]
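The two inheritance paths can be illustrated with a small sketch. The slides do not specify the actual resolution rules, so everything here is an assumption: the `World` class, the rule that attributes accumulate up the is-a chain, the rule that the nearest privilege found up the is-in chain wins, the fallback to "public", and the user names are all hypothetical.

```python
class World:
    def __init__(self, name, is_a=None, is_in=None, attributes=(), privileges=None):
        self.name = name
        self.is_a = is_a                           # type parent
        self.is_in = is_in                         # location parent
        self.attributes = list(attributes)         # declared locally
        self.privileges = dict(privileges or {})   # user -> level, declared locally

    def all_attributes(self):
        """Attributes declared here plus those inherited up the is-a chain."""
        inherited = self.is_a.all_attributes() if self.is_a else []
        return inherited + [a for a in self.attributes if a not in inherited]

    def privilege_of(self, user):
        """Nearest privilege level found walking up the is-in chain, else 'public'."""
        world = self
        while world is not None:
            if user in world.privileges:
                return world.privileges[user]
            world = world.is_in
        return "public"


place = World("Place", attributes=["name", "area"])
state = World("State", is_a=place, is_in=place, privileges={"asha": "administrator"})
city = World("City", is_a=place, is_in=state, attributes=["mayor"])

print(city.all_attributes())      # ['name', 'area', 'mayor']
print(city.privilege_of("asha"))  # administrator
print(city.privilege_of("ravi"))  # public
```

In this reading, what a world looks like comes from its type, while who may act on it comes from where it is located.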
MWF: Other Features (ongoing work)
● Constraints
● Uniqueness constraints
● Dual Associations
● Bulk loading of data
● Cognitive gap-fillers
Query semantics
● Selectin: Answer a query by matching the query condition inside a world and its contained worlds
● Selecton: Answer a query by matching the query condition on a set of worlds of a given type
● Selectworld: Answer a query about the participation of a given world in other worlds
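The three query forms can be sketched over a toy set of worlds. This is a rough illustration under stated assumptions, not MWF's query engine: the function names `select_in`, `select_on` and `select_world`, the `World` class, the world names and the population figures are all hypothetical.

```python
class World:
    def __init__(self, name, is_a=None, is_in=None, data=None):
        self.name, self.is_a, self.is_in = name, is_a, is_in
        self.data = data or {}
        self.participates_in = []  # (role, other_world) pairs


def is_within(world, container):
    """True if world is the container or lies below it on the is-in chain."""
    while world is not None:
        if world is container:
            return True
        world = world.is_in
    return False


def has_type(world, type_world):
    """True if type_world appears on the world's is-a chain."""
    parent = world.is_a
    while parent is not None:
        if parent is type_world:
            return True
        parent = parent.is_a
    return False


def select_in(worlds, container, condition):
    # Match inside a world and its contained worlds.
    return [w.name for w in worlds if is_within(w, container) and condition(w.data)]


def select_on(worlds, type_world, condition):
    # Match on all worlds of a given type.
    return [w.name for w in worlds if has_type(w, type_world) and condition(w.data)]


def select_world(world):
    # Report the participation of a world in other worlds.
    return [(role, other.name) for role, other in world.participates_in]


city = World("City")
region = World("RegionA")
c1 = World("CityOne", is_a=city, is_in=region, data={"population": 900})
c2 = World("CityTwo", is_a=city, is_in=region, data={"population": 100})
c3 = World("CityThree", is_a=city, data={"population": 500})
institute = World("InstituteX", is_in=c1)
c1.participates_in.append(("OfficeLocation", institute))

worlds = [region, c1, c2, c3, institute]
big = lambda d: d.get("population", 0) > 200

print(select_in(worlds, region, big))  # ['CityOne']
print(select_on(worlds, city, big))    # ['CityOne', 'CityThree']
print(select_world(c1))                # [('OfficeLocation', 'InstituteX')]
```

The first query scopes by location, the second by type, and the third reads a world's roles in other worlds.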
MWF: Future Work
● Distributed MWF with proxy worlds
● From privileges and constraints to an integrity management subsystem
References
[Agrawal et al. 2013] Sweety Agrawal, Jayati Deshmukh, Srinath Srinivasa, Chinmay Jog, Sri Sayi Bhavani Kakarla, Rahul Dhek, Sneha Deshpande, Sana Javed and Vikas Mohandoss. A Survey of Indian Open Data. Proceedings of IBM ICARE 2013. ACM Press. New Delhi, India. Oct 2013.
[Srinivasa et al. 2014] Srinath Srinivasa, Sweety Agrawal, Chinmay Jog, Jayati Deshmukh. Characterizing Open Utilitarian Knowledge. Proceedings of CoDS 2014. New Delhi, India. March 2014.