Data science: an emerging discipline

The role of data scientist has been described as the “sexiest job of the 21st Century”. While possibly there is a degree of hype associated with such a claim, there are factors at play such as the unprecedented growth in the amount of data being generated. This paper characterises the already establ...


Authors:
Galpin, Ixent
Resource type:
Publication date:
2016
Institution:
Universidad Santo Tomás
Repository:
Repositorio Institucional USTA
Language:
OAI Identifier:
oai:repository.usta.edu.co:11634/11508
Online access:
http://hdl.handle.net/11634/11508
Keywords:
Data science
Data mining
Data engineering
Big Data
Rights
License
http://purl.org/coar/access_right/c_abf2
Description:
The role of data scientist has been described as the “sexiest job of the 21st Century”. While there may be a degree of hype associated with such a claim, real factors are at play, such as the unprecedented growth in the amount of data being generated. This paper characterises the established disciplines which underpin data science, viz., data engineering, statistics, and data mining. Following this characterisation, data science is found to be most closely related to data mining. However, in contrast to data mining, data science promises to operate over datasets that exhibit significant challenges in terms of the four Vs: Volume, Variety, Velocity and Veracity. This paper notes that the current emphasis, both in industry and academia, is on the first three Vs, which pose mainly scientific or technological challenges, rather than Veracity, which is a truly scientific (and arguably a more complex) challenge. Data science can be seen to have a more ambitious objective than data mining traditionally has: as a science, data science aims to lead to the creation of new theories and knowledge. This paper notes that, ironically, the veracity dimension, which is arguably the one most closely related to this objective, is being neglected. Despite the current media frenzy about data science, the paper concludes that more time is needed to see whether it will emerge as a discipline in its own right.
dc.date.accessioned.spa.fl_str_mv 2018-04-03T15:35:30Z
dc.date.available.spa.fl_str_mv 2018-04-03T15:35:30Z
dc.type.coarversion.fl_str_mv http://purl.org/coar/version/c_970fb48d4fbd8a85
dc.type.coar.fl_str_mv http://purl.org/coar/resource_type/c_2df8fbb1
dc.type.drive.none.fl_str_mv info:eu-repo/semantics/article
dc.format.mimetype.spa.fl_str_mv application/pdf
dc.coverage.campus.spa.fl_str_mv CRAI-USTA Duad
institution Universidad Santo Tomás
bitstream.url.fl_str_mv https://repository.usta.edu.co/bitstream/11634/11508/1/GalpinIxent2016.pdf
https://repository.usta.edu.co/bitstream/11634/11508/2/license.txt
https://repository.usta.edu.co/bitstream/11634/11508/3/GalpinIxent2016.pdf.jpg
bitstream.checksum.fl_str_mv e053ed21f6e1ac6507e756ba9b9d9434
8a4605be74aa9ea9d79846c1fba20a33
47ebe823bbb6e1f9e22231b67d3896f3
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Repositorio Universidad Santo Tomás
repository.mail.fl_str_mv noreply@usta.edu.co