Quality standards for data and metadata addressed to data science applications
DOI:
https://doi.org/10.47909/anis.978-9916-9760-3-6.116
Keywords:
Networked Databases, Metadata, Metadata Quality, Data Science, Information Science
Abstract
This research investigates how models for organizing and representing information and knowledge can be applied in Data Science. We highlight and discuss how data quality standards can provide the conditions for producing curated databases for applications. Methodologically, the research is qualitative, exploratory, and descriptive. Our theoretical discussion is supported by bibliographic research in the fields of Data Science and Information Science. The results indicate that the potential for data reuse in Data Science applications depends on strategies for organizing and representing information and knowledge grounded in the theoretical and methodological scope of Information Science. From the metadata creation stages through the description, cataloguing, classification, and indexing processes, Information Science can make significant contributions to the quality of data for use and reuse in diverse applications.
Copyright (c) 2022 Danielle Carmo, Daniela Lucas da Silva Lemos

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0), which permits copying and redistributing the material in any medium or format, and adapting, transforming, and building upon it, as long as the license terms are followed.