COVID-19: A scholarly production dataset report for research analysis

Authors:
Resource type:
Research article
Publication date:
2020
Institution:
Universidad de Bogotá Jorge Tadeo Lozano
Repository:
Expeditio: repositorio UTadeo
Language:
eng
OAI Identifier:
oai:expeditiorepositorio.utadeo.edu.co:20.500.12010/14112
Online access:
https://doi.org/10.1016/j.dib.2020.106178
http://hdl.handle.net/20.500.12010/14112
Keywords:
COVID-19
SARS-CoV-2
Pandemic
Data Science
Bibliometrics
Scientometrics
Severe acute respiratory syndrome
Coronavirus
Rights
License
Open (Full Text)
Description
Summary: COVID-19 has been recognized as a global threat, and several studies are being conducted to contribute to the fight against, and prevention of, this pandemic. This work presents a scholarly production dataset focused on COVID-19. It provides an overview of scientific research activities and makes it possible to identify the countries, scientists and research groups most active in the effort to combat the coronavirus disease. The dataset is composed of 40,212 records of articles’ metadata collected from the Scopus, PubMed, arXiv and bioRxiv databases from January 2019 to July 2020. The data were extracted using Python web scraping and preprocessed with Pandas data wrangling. In addition, the pipeline that preprocesses and generates the dataset is versioned with the Data Version Control (DVC) tool and is thus easily reproducible and auditable.
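
The summary does not detail the preprocessing steps, so the following is only a minimal sketch of what one Pandas data-wrangling step might look like when metadata from several databases is combined: records are concatenated and duplicate articles are dropped. The column names (`title`, `doi`, `source`), the function `merge_records` and the sample rows are hypothetical and are not taken from the published dataset or its pipeline.

```python
import pandas as pd


def merge_records(frames):
    """Concatenate per-source metadata frames and drop duplicate articles.

    Assumes (hypothetically) that every frame has 'title', 'doi' and
    'source' columns; the real dataset schema may differ.
    """
    records = pd.concat(frames, ignore_index=True)

    # Normalize DOI and title so trivially different spellings compare equal.
    records["doi"] = records["doi"].str.strip().str.lower()
    records["title_key"] = (
        records["title"]
        .str.lower()
        .str.replace(r"[^a-z0-9]+", " ", regex=True)
        .str.strip()
    )

    # Records that carry a DOI are deduplicated on it.
    has_doi = records["doi"].notna() & (records["doi"] != "")
    with_doi = records[has_doi].drop_duplicates(subset="doi")

    # Records without a DOI fall back to the normalized title, and are
    # dropped when the same title already appears among the DOI records.
    without_doi = records[~has_doi].drop_duplicates(subset="title_key")
    without_doi = without_doi[~without_doi["title_key"].isin(with_doi["title_key"])]

    merged = pd.concat([with_doi, without_doi], ignore_index=True)
    return merged.drop(columns="title_key")


# Illustrative rows only; they are not records from the dataset.
scopus = pd.DataFrame(
    {"title": ["A Study of COVID-19"], "doi": ["10.1000/ABC"], "source": ["Scopus"]}
)
pubmed = pd.DataFrame(
    {"title": ["A study of COVID-19."], "doi": ["10.1000/abc "], "source": ["PubMed"]}
)
arxiv = pd.DataFrame(
    {"title": ["Preprint on SARS-CoV-2"], "doi": [None], "source": ["arXiv"]}
)

merged = merge_records([scopus, pubmed, arxiv])
print(merged)
```

Keeping the first occurrence of each duplicate means that the order in which the source frames are passed decides which database a surviving record is attributed to; a real pipeline might prefer an explicit source priority instead.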