HIEDS: A Generic and Efficient Approach to Hierarchical Dataset Summarization


Published on July 14, 2016

Author: gcheng

Source: slideshare.net

1. HIEDS: A Generic and Efficient Approach to Hierarchical Dataset Summarization. Gong Cheng, Cheng Jin, Yuzhong Qu. National Key Laboratory for Novel Software Technology, Nanjing University, China. Websoft.

2. Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/

3. Scenario: browsing a dataset in an open data portal (https://data.europa.eu/euodp/en/data/dataset/dgt-translation-memory). "I need some insight into the contents, not just metadata."

4. Meeting the challenge with a dataset summary, i.e., an automatically generated, small-sized, high-level abstraction of the data that summarizes the contents of a dataset for quick inspection.

5. Expected features of a dataset summary • To provide a multigranular abstraction of the data that can be explored incrementally • To preserve the structural nature of a dataset • To be comprehensible

6. Constitution of a dataset summary (an example) • A hierarchical grouping of entities • Relations connecting sibling groups • A property-value pair that differentiates a group of entities from its sibling groups
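The three ingredients listed on this slide can be pictured as a small tree structure. The following is only a minimal sketch: the class name `Group`, its fields, the `render` helper, and the sample values `Type:Film` and `actedIn` are hypothetical illustrations, not names taken from HIEDS.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Group:
    """One node of a (hypothetical) hierarchical dataset summary."""
    # Property-value pair that sets this group apart from its siblings;
    # None for the root, which stands for the whole dataset.
    label: Optional[Tuple[str, str]]
    entities: Set[str]
    children: List["Group"] = field(default_factory=list)
    # Relations between sibling groups, stored on their parent as
    # (index of source child, relation name, index of target child).
    sibling_relations: List[Tuple[int, str, int]] = field(default_factory=list)


def render(group: Group, depth: int = 0) -> List[str]:
    """Return an indented, line-by-line view of the hierarchy."""
    name = "ALL" if group.label is None else f"{group.label[0]}:{group.label[1]}"
    lines = [f"{'  ' * depth}{name} ({len(group.entities)} entities)"]
    for child in group.children:
        lines.extend(render(child, depth + 1))
    return lines


# Toy data, invented for illustration only.
actors = Group(("Type", "Actor"), {"e1", "e2"})
films = Group(("Type", "Film"), {"e3"})
root = Group(None, {"e1", "e2", "e3", "e4"}, [actors, films], [(0, "actedIn", 1)])

print("\n".join(render(root)))
```

Storing sibling relations on the parent keeps each level self-contained, which suits the incremental, level-by-level exploration described on slide 5.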

7-12. Quality of a dataset summary • Coverage of data: large subgroups, frequent relations • Height of hierarchy: moderate-sized subgroups • Cohesion within groups: informative (i.e., less frequent) property-value pairs • Overlap between groups: controllable overlap • Homogeneity of groups: different values of the same property
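The slides name these quality criteria without giving formulas. As a rough illustration of how some of them could be quantified, the sketch below uses simple proxies that are assumptions of this example, not the measures defined by HIEDS: coverage as the fraction of parent entities that fall in some subgroup, overlap as Jaccard similarity, informativeness as the self-information of a pair's relative frequency, and homogeneity as "all labels share one property".

```python
import math


def coverage(parent, subgroups):
    """Fraction of the parent's entities contained in at least one subgroup."""
    covered = set().union(*subgroups) & parent
    return len(covered) / len(parent)


def overlap(a, b):
    """Jaccard similarity between two subgroups (0.0 means disjoint)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def informativeness(group, parent):
    """Self-information in bits: rarer property-value pairs score higher."""
    return -math.log2(len(group) / len(parent))


def homogeneous(labels):
    """True if all (property, value) labels use the same property."""
    return len({prop for prop, _ in labels}) <= 1


# Toy data, invented for illustration only.
parent = {1, 2, 3, 4, 5, 6, 7, 8}
a, b = {1, 2, 3, 4}, {4, 5}

print(coverage(parent, [a, b]))
print(overlap(a, b))
print(informativeness(b, parent))
print(homogeneous([("Type", "Actor"), ("Type", "Chair")]))
```

These proxies pull in different directions: a large subgroup raises coverage but lowers informativeness, which is the kind of trade-off the criteria above have to balance.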

13. Problem formulation: multidimensional knapsack problem (MKP) • maximizing moderateness of each subgroup • maximizing cohesion within each subgroup • disallowing large overlap between subgroups • selecting ≤k subgroups • (optionally) disallowing different properties
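For reference, the textbook form of the 0-1 multidimensional knapsack problem is shown below. The reading given after it is an assumption of this note; the slides do not state the exact coefficients.

```latex
\begin{aligned}
\max_{x} \quad & \sum_{j=1}^{n} p_j \, x_j \\
\text{s.t.} \quad & \sum_{j=1}^{n} w_{ij} \, x_j \le c_i, \qquad i = 1, \dots, m, \\
& x_j \in \{0, 1\}, \qquad j = 1, \dots, n.
\end{aligned}
```

Under this reading, $x_j = 1$ would mean that candidate subgroup $j$ is selected, $p_j$ would be a profit combining its moderateness and cohesion, and the $m$ capacity constraints would encode the limit of at most $k$ subgroups together with the overlap restrictions.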

14. Problem solution • A greedy strategy is used (sorting candidates by a ranking formula lost in extraction), but its efficient implementation is non-trivial.
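To make the greedy idea concrete, here is a naive sketch. The ranking criterion is not given on the slide, so the example assumes each candidate arrives with a precomputed `score`. The Jaccard overlap test, the `max_overlap` threshold, the `same_property` flag, and the `Occupation` property in the toy data are likewise assumptions of this sketch. It is not the authors' efficient implementation.

```python
def jaccard(a, b):
    """Jaccard similarity between two entity sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_select(candidates, k, max_overlap=0.5, same_property=False):
    """Pick at most k subgroup labels, best score first.

    candidates: list of (label, entities, score), where label is a
    (property, value) pair and score is an assumed, precomputed ranking value.
    """
    chosen = []
    for label, entities, score in sorted(candidates, key=lambda c: c[2], reverse=True):
        if len(chosen) == k:
            break
        # Optional constraint: every selected subgroup uses the same property.
        if same_property and chosen and label[0] != chosen[0][0][0]:
            continue
        # Skip candidates that overlap too much with an already chosen subgroup.
        if any(jaccard(entities, other) > max_overlap for _, other, _ in chosen):
            continue
        chosen.append((label, entities, score))
    return [label for label, _, _ in chosen]


# Toy data, invented for illustration only.
candidates = [
    (("Type", "Person"), {1, 2, 3, 4, 5, 6}, 0.4),
    (("Type", "Actor"), {1, 2, 3}, 0.9),
    (("Type", "Chair"), {7, 8}, 0.6),
    (("Occupation", "Actor"), {1, 2, 3}, 0.8),
]

print(greedy_select(candidates, k=2, max_overlap=0.4))
```

In this toy run the second-ranked candidate is rejected because it duplicates the entities of the first, so the next non-overlapping candidate is taken instead. Each candidate is compared against every chosen subgroup, which is simple but slow; a scalable implementation would need a faster way to evaluate the overlap constraint.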

15. Experiments • Baseline: LODeX (ISWC'14): flat grouping; biased towards coverage (e.g., Type:Person); redundant information (e.g., Type:Person and Type:Chair) • Advantages of HIEDS: hierarchical grouping; trade-off between coverage and cohesion (e.g., Type:Actor); controllable overlap

16. Details can be found in our poster!
