Quality standards for data and metadata addressed to data science applications


  • Danielle Carmo, Universidade de Brasília, Brazil
  • Daniela Lucas da Silva Lemos, Universidade Federal do Espírito Santo, Brazil




Networked Databases, Metadata, Metadata Quality, Data Science, Information Science


This research investigates how models for the organization and representation of information and knowledge can be applied in Data Science. We highlight and discuss how data quality standards can provide the conditions for producing curated databases for applications. Methodologically, this is qualitative, exploratory, and descriptive research. Our theoretical discussion is supported by bibliographic research in the fields of Data Science and Information Science. The results indicate that the potential for data reuse in Data Science applications depends on strategies for the organization and representation of information and knowledge grounded in the theoretical-methodological scope of Information Science. From the metadata creation stages to the description, cataloguing, classification, and indexing processes, Information Science can make significant contributions to the quality of data for use and reuse in diverse applications.







How to Cite

Carmo, D., & Lucas da Silva Lemos, D. (2022). Quality standards for data and metadata addressed to data science applications. Advanced Notes in Information Science, 2, 161-170. https://doi.org/10.47909/anis.978-9916-9760-3-6.116