The Human Face of the Web of Data: A Cross-sectional Study of Labels

Research & Innovation

Labels are the key element through which humans access the Web of Data. We analyze seven diverse datasets drawn from the Web of Data: a collaborative knowledge base, open governmental data, and GLAM (galleries, libraries, archives, and museums) data. The analysis gives insight into the current state of labels and multilinguality on the Web of Data. We evaluate the datasets against a set of metrics that includes completeness, unambiguity, multilinguality, labeled object usage, and monolingual islands. Comparing differently sourced datasets can help data publishers understand what they can improve and which other ways of collecting data they could adopt. Overall, the centrally published datasets are more comprehensive, although they also lack data. The community-maintained dataset shows the best human accessibility, owing to its constraints and its dedicated community.
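
As an illustration of how metrics of this kind could be computed, the following is a minimal sketch, not the method used in the study. It assumes simplified definitions that the abstract does not spell out: completeness is taken as the share of subjects with at least one rdfs:label, and unambiguity as the share of labeled subjects with at most one label per language. The function name `label_metrics` and the four-element tuple format are hypothetical.

```python
from collections import defaultdict

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def label_metrics(triples):
    """Compute simple label statistics.

    `triples` is an iterable of (subject, predicate, object, lang) tuples,
    where `lang` is a language tag or None. This input format is an
    assumption made for the sketch.
    """
    subjects = set()
    # subject -> language tag -> number of labels in that language
    labels = defaultdict(lambda: defaultdict(int))

    for s, p, o, lang in triples:
        subjects.add(s)
        if p == RDFS_LABEL:
            labels[s][lang] += 1

    labeled = len(labels)
    # Assumed definition: a subject is unambiguous if no language
    # has more than one label for it.
    unambiguous = sum(
        1 for langs in labels.values() if all(c == 1 for c in langs.values())
    )
    languages = {lang for langs in labels.values() for lang in langs if lang}

    return {
        "completeness": labeled / len(subjects) if subjects else 0.0,
        "unambiguity": unambiguous / labeled if labeled else 0.0,
        "languages": sorted(languages),
    }


# Small made-up example data, for illustration only.
example = [
    ("ex:a", RDFS_LABEL, "Berlin", "en"),
    ("ex:a", RDFS_LABEL, "Berlin", "de"),
    ("ex:b", RDFS_LABEL, "Paris", "en"),
    ("ex:b", RDFS_LABEL, "City of Paris", "en"),
    ("ex:c", "ex:population", "1000", None),
    ("ex:d", RDFS_LABEL, "Vienna", "en"),
]
result = label_metrics(example)
```

A real analysis would read the triples from an RDF dump and would likely consider labeling properties other than rdfs:label as well.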

Register for SEMANTiCS conference