Data and Metadata Management with Semantic Technologies

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Information Systems".

Deadline for manuscript submissions: closed (15 March 2022) | Viewed by 14246

Special Issue Editors


Guest Editor
Istituto di Matematica e Tecnologie Informatiche “Enrico Magenes” (IMATI) - Consiglio Nazionale delle Ricerche (CNR), 16100 Genova, Italy
Interests: (linked) open data and quality; linked data consumption; data and metadata analysis and standardization

Guest Editor
Semantic Arts, Fort Collins, CO 80524, USA
Interests: ontologies and other semantic technologies in business and government enterprises and in national information infrastructures

Special Issue Information

Dear Colleagues,

Linked data and semantic web technologies that have emerged over the last 20 years have reached a good level of maturity. IETF and W3C standards at the fundamental level, such as HTTP, RDF, OWL, and SPARQL, provide a technological layer to publish, harmonize, and consume data. The underpinning linked data and semantic web principles and practices have impacted the sharing, discovery, and integration of (meta)data beyond the specific field of research, inspiring the management and consumption of information using explicit semantics even beyond the web. Novel research questions can still arise from implementation experiences and from intersections with other academic disciplines.

The purpose of this Special Issue is to present emerging research issues and the latest developments in the management, harmonization, and consumption of (meta)data with semantic technologies.

Investigators in the field are invited to contribute original, unpublished work. Both research and review papers are welcome.

Topics of interest include but are not limited to:

  • Approaches to publish, harmonize, and consume (meta)data;
  • Management and exploitation of provenance and workflows;
  • Management of the evolution and preservation of the data; 
  • Quality assessment, documentation, and improvement;
  • Data matching, enrichment, and cleaning;
  • Ontologies, metadata vocabularies, and standardization; 
  • Semantic technologies applied to data infrastructures and FAIR data management;
  • In-use cases and lessons learned managing (meta)data in industrial and domain-specific applications (e.g., in relation to cultural heritage, e-government, education, environmental, health and medical data, among others).

Dr. Riccardo Albertoni
Dr. Peter Winstanley
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • data management
  • metadata
  • semantic technologies

Published Papers (4 papers)


Research


32 pages, 975 KiB  
Article
DPCat: Specification for an Interoperable and Machine-Readable Data Processing Catalogue Based on GDPR
by Paul Ryan, Rob Brennan and Harshvardhan J. Pandit
Information 2022, 13(5), 244; https://doi.org/10.3390/info13050244 - 10 May 2022
Cited by 3 | Viewed by 4544
Abstract
The GDPR requires Data Controllers and Data Protection Officers (DPO) to maintain a Register of Processing Activities (ROPA) as part of overseeing the organisation’s compliance processes. The ROPA must include information from heterogeneous sources such as (internal) departments with varying IT systems and (external) data processors. Current practices use spreadsheets or proprietary systems that lack machine-readability and interoperability, presenting barriers to automation. We propose the Data Processing Catalogue (DPCat) for the representation, collection and transfer of ROPA information, as catalogues in a machine-readable and interoperable manner. DPCat is based on the Data Catalog Vocabulary (DCAT) and its extension DCAT Application Profile for data portals in Europe (DCAT-AP), and the Data Privacy Vocabulary (DPV). It represents a comprehensive semantic model developed from GDPR’s Article and an analysis of the 17 ROPA templates from EU Data Protection Authorities (DPA). To demonstrate the practicality and feasibility of DPCat, we present the European Data Protection Supervisor’s (EDPS) ROPA documents using DPCat, verify them with SHACL to ensure the correctness of information based on legal and contextual requirements, and produce reports and ROPA documents based on DPA templates using SPARQL. DPCat supports a data governance process for data processing compliance to harmonise inputs from heterogeneous sources to produce dynamic documentation that can accommodate differences in regulatory approaches across DPAs and ease investigative burdens toward efficient enforcement.
(This article belongs to the Special Issue Data and Metadata Management with Semantic Technologies)

13 pages, 621 KiB  
Article
Revolutions Take Time
by Peter Wittenburg and George Strawn
Information 2021, 12(11), 472; https://doi.org/10.3390/info12110472 - 16 Nov 2021
Cited by 2 | Viewed by 1725
Abstract
The 2018 paper titled “Common Patterns in Revolutionary Infrastructures and Data” has been cited frequently, since we compared the current discussions about research data management with the developments of large infrastructures in the past believing, similar to philosophers such as Luciano Floridi, that the creation of an interoperable data domain will also be a revolutionary step. We identified the FAIR principles and the FAIR Digital Objects as nuclei for achieving the necessary convergence without which such new infrastructures will not take up. In this follow-up paper, we are elaborating on some factors that indicate that it will still take much time until breakthroughs will be achieved which is mainly devoted to sociological and political reasons. Therefore, it is important to describe visions such as FDO as self-standing entities, the easy plug-in concept, and the built-in security more explicitly to give a long-range perspective and convince policymakers and decision-makers. We also looked at major funding programs which all follow different approaches and do not define a converging core yet. This can be seen as an indication that these funding programs have huge potentials and increase awareness about data management aspects, but that we are far from converging agreements which we finally will need to create a globally integrated data space in the future. Finally, we discuss the roles of some major stakeholders who are all relevant in the process of agreement finding. Most of them are bound by short-term project cycles and funding constraints, not giving them sufficient space to work on long-term convergence concepts and take risks. The great opportunity to get funds for projects improving approaches and technology with the inherent danger of promising too much and the need for continuous reporting and producing visible results after comparably short periods is like a vicious cycle without a possibility to break out. 
We can recall that coming to the Internet with TCP/IP as a convergence standard was dependent on years of DARPA funding. Building large revolutionary infrastructures seems to be dependent on decision-makers that dare to think strategically and test out promising concepts at a larger scale.
(This article belongs to the Special Issue Data and Metadata Management with Semantic Technologies)

25 pages, 3152 KiB  
Article
RDFsim: Similarity-Based Browsing over DBpedia Using Embeddings
by Manos Chatzakis, Michalis Mountantonakis and Yannis Tzitzikas
Information 2021, 12(11), 440; https://doi.org/10.3390/info12110440 - 23 Oct 2021
Cited by 7 | Viewed by 3359
Abstract
Browsing has been the core access method for the Web from its beginning. Analogously, one good practice for publishing data on the Web is to support dereferenceable URIs, to also enable plain web browsing by users. The information about one URI is usually presented through HTML tables (such as DBpedia and Wikidata pages) and graph representations (by using tools such as LODLive and LODMilla). In most cases, for an entity, the user gets all triples that have that entity as subject or as object. However, sometimes the number of triples is numerous. To tackle this issue, and to reveal similarity (and thus facilitate browsing), in this article we introduce an interactive similarity-based browsing system, called RDFsim, that offers “Parallel Browsing”, that is, it enables the user to see and browse not only the original data of the entity in focus, but also the K most similar entities of the focal entity. The similarity of entities is founded on knowledge graph embeddings; however, the indexes that we introduce for enabling real-time interaction do not depend on the particular method for computing similarity. We detail an implementation of the approach over specific subsets of DBpedia (movies, philosophers and others) and we showcase the benefits of the approach. Finally, we report detailed performance results and we describe several use cases of RDFsim.
(This article belongs to the Special Issue Data and Metadata Management with Semantic Technologies)

Review


30 pages, 29455 KiB  
Review
Review of Tools for Semantics Extraction: Application in Tsunami Research Domain
by František Babič, Vladimír Bureš, Pavel Čech, Martina Husáková, Peter Mikulecký, Karel Mls, Tomáš Nacházel, Daniela Ponce, Kamila Štekerová, Ioanna Triantafyllou, Petr Tučník and Marek Zanker
Information 2022, 13(1), 4; https://doi.org/10.3390/info13010004 - 24 Dec 2021
Cited by 5 | Viewed by 2845
Abstract
Immense numbers of textual documents are available in a digital form. Research activities are focused on methods of how to speed up their processing to avoid information overloading or to provide formal structures for the problem solving or decision making of intelligent agents. Ontology learning is one of the directions which contributes to all of these activities. The main aim of the ontology learning is to semi-automatically, or fully automatically, extract ontologies—formal structures able to express information or knowledge. The primary motivation behind this paper is to facilitate the processing of a large collection of papers focused on disaster management, especially on tsunami research, using the ontology learning. Various tools of ontology learning are mentioned in the literature at present. The main aim of the paper is to uncover these tools, i.e., to find out which of these tools can be practically used for ontology learning in the tsunami application domain. Specific criteria are predefined for their evaluation, with respect to the “Ontology learning layer cake”, which introduces the fundamental phases of ontology learning. ScienceDirect and Web of Science scientific databases are explored, and various solutions for semantics extraction are manually “mined” from the journal articles. ProgrammableWeb site is used for exploration of the tools, frameworks, or APIs applied for the same purpose. Statistics answer the question of which tools are mostly mentioned in these journal articles and on the website. These tools are then investigated more thoroughly, and conclusions about their usage are made with respect to the tsunami domain, for which the tools are tested. Results are not satisfactory because only a limited number of tools can be practically used for ontology learning at present.
(This article belongs to the Special Issue Data and Metadata Management with Semantic Technologies)
