A formalization for mapping discourses from business-based technical documents into controlled language texts for requirements elicitation

Requirements elicitation for software engineering is a process for obtaining, analyzing, and specifying requirements supported by stakeholders—analysts, clients, domain experts, and final users, among others. In this process we generate either textual or graphical descriptions, reflecting the most r...


Authors:
Manrique Losada, Bell
Resource type:
Doctoral thesis
Publication date:
2014
Institution:
Universidad Nacional de Colombia
Repository:
Universidad Nacional de Colombia
Language:
spa
OAI Identifier:
oai:repositorio.unal.edu.co:unal/52152
Online access:
https://repositorio.unal.edu.co/handle/unal/52152
http://bdigital.unal.edu.co/46419/
Keywords:
0 Generalidades / Computer science, information and general works
62 Ingeniería y operaciones afines / Engineering
Requirements elicitation
Natural language
Controlled language
Technical document
Natural language processing
Linguistics engineering
Domain knowledge
Rights
openAccess
License
Atribución-NoComercial 4.0 Internacional
id UNACIONAL2_84f8095df204c7b23afe975c12c24257
oai_identifier_str oai:repositorio.unal.edu.co:unal/52152
network_acronym_str UNACIONAL2
network_name_str Universidad Nacional de Colombia
repository_id_str
dc.title.spa.fl_str_mv A formalization for mapping discourses from business-based technical documents into controlled language texts for requirements elicitation
title A formalization for mapping discourses from business-based technical documents into controlled language texts for requirements elicitation
dc.creator.fl_str_mv Manrique Losada, Bell
dc.contributor.author.spa.fl_str_mv Manrique Losada, Bell
dc.contributor.spa.fl_str_mv Zapata Jaramillo, Carlos Mario
dc.subject.ddc.spa.fl_str_mv 0 Generalidades / Computer science, information and general works
62 Ingeniería y operaciones afines / Engineering
topic 0 Generalidades / Computer science, information and general works
62 Ingeniería y operaciones afines / Engineering
Requirements elicitation
Natural language
Controlled language
Technical document
Natural language processing
Linguistics engineering
Domain knowledge
dc.subject.proposal.spa.fl_str_mv Requirements elicitation
Natural language
Controlled language
Technical document
Natural language processing
Linguistics engineering
Domain knowledge
description Requirements elicitation for software engineering is a process for obtaining, analyzing, and specifying requirements with the support of stakeholders: analysts, clients, domain experts, and final users, among others. This process produces textual or graphical descriptions that reflect the most relevant concepts of the stakeholder domain for developing a software product, together with the related domain knowledge. Well-known elicitation techniques place excessive weight on stakeholder intervention, since interviews, dialogues, and inspection are the most commonly used methods. Such methods, which are subjective and directly dependent on human beings, cause loss of sequence and conciseness and waste elicitation time and cost. Stakeholder intervention also leads to problems related to the use of natural language: large amounts of unstructured, redundant information, overuse of synonyms, and ambiguity, among others. Several approaches have been developed for reducing the language differences between stakeholders and analysts: natural language processing of requirements documents; semi-automated identification of lexical features in requirements specifications; semi-automated optimization of feature identification; controlled English for representing knowledge; and identification of the knowledge domain from documents in several domains. Some progress has been made in solving this problem, but that progress either concerns phases other than requirements elicitation or bears no relation to domain knowledge. Part of it focuses on techniques such as information retrieval methods, identification of regular expressions, and text mining, which can be applied to the technical and methodological aspects of the requirements elicitation process.
Currently, well-known elicitation techniques require: i) high involvement of stakeholders and analysts; ii) an information-to-specification mapping process; iii) a requirements description prior to the design model conversion; and iv) a method for guiding the elicitation process. Other techniques, known as synthetic/analytical techniques, do not depend on human intervention and are useful for considering diverse sources of domain knowledge. The state-of-the-art review identifies several approaches for studying elicitation techniques, but they scarcely use analytical techniques focused on documents. In this research we analyze documents written by the stakeholders in their own domains as a source for requirements elicitation. We propose a model, comprising methodical and structural components, that sets up the elicitation process by applying analytical elicitation techniques based on technical documents (e.g., policies, regulations, and manuals). To deal with language differences, we work with existing controlled languages for specifying requirements and with the natural language of the stakeholders' domain, translated directly from technical documents. We propose a formalization of the mapping that draws on a variety of linguistically informed methods, based on rhetorical discourse analysis and linguistic processing, which treat each document as a potential input for natural language processing aimed at requirements elicitation. Applying the model based on the analytical technique is expected to produce the following results: understanding and structured representation of the business and organizational information; comprehension of the stakeholder domain; and subsequent application of subjective techniques for specifying requirements.
This Ph.D. thesis is concerned with natural language processing for the requirements elicitation process and seeks a model for transforming discourses in technical documents into controlled language texts. The model specifies the transformation process based on functional, structural, and linguistic patterns from source technical documents, in order to obtain organizational domain knowledge and business information useful for the requirements elicitation process.
publishDate 2014
dc.date.issued.spa.fl_str_mv 2014
dc.date.accessioned.spa.fl_str_mv 2019-06-29T13:39:14Z
dc.date.available.spa.fl_str_mv 2019-06-29T13:39:14Z
dc.type.spa.fl_str_mv Trabajo de grado - Doctorado
dc.type.driver.spa.fl_str_mv info:eu-repo/semantics/doctoralThesis
dc.type.version.spa.fl_str_mv info:eu-repo/semantics/acceptedVersion
dc.type.coar.spa.fl_str_mv http://purl.org/coar/resource_type/c_db06
dc.type.content.spa.fl_str_mv Text
dc.type.redcol.spa.fl_str_mv http://purl.org/redcol/resource_type/TD
format http://purl.org/coar/resource_type/c_db06
status_str acceptedVersion
dc.identifier.uri.none.fl_str_mv https://repositorio.unal.edu.co/handle/unal/52152
dc.identifier.eprints.spa.fl_str_mv http://bdigital.unal.edu.co/46419/
url https://repositorio.unal.edu.co/handle/unal/52152
http://bdigital.unal.edu.co/46419/
dc.language.iso.spa.fl_str_mv spa
language spa
dc.relation.ispartof.spa.fl_str_mv Universidad Nacional de Colombia Sede Medellín Facultad de Minas Escuela de Sistemas
Escuela de Sistemas
dc.relation.references.spa.fl_str_mv Manrique Losada, Bell (2014) A formalization for mapping discourses from business-based technical documents into controlled language texts for requirements elicitation. Doctoral thesis, Universidad Nacional de Colombia - Sede Medellín.
dc.rights.spa.fl_str_mv Derechos reservados - Universidad Nacional de Colombia
dc.rights.coar.fl_str_mv http://purl.org/coar/access_right/c_abf2
dc.rights.license.spa.fl_str_mv Atribución-NoComercial 4.0 Internacional
dc.rights.uri.spa.fl_str_mv http://creativecommons.org/licenses/by-nc/4.0/
dc.rights.accessrights.spa.fl_str_mv info:eu-repo/semantics/openAccess
rights_invalid_str_mv Atribución-NoComercial 4.0 Internacional
Derechos reservados - Universidad Nacional de Colombia
http://creativecommons.org/licenses/by-nc/4.0/
http://purl.org/coar/access_right/c_abf2
eu_rights_str_mv openAccess
dc.format.mimetype.spa.fl_str_mv application/pdf
institution Universidad Nacional de Colombia
bitstream.url.fl_str_mv https://repositorio.unal.edu.co/bitstream/unal/52152/1/201160318_2014.pdf
https://repositorio.unal.edu.co/bitstream/unal/52152/2/201160318_2014%20appendices.pdf
https://repositorio.unal.edu.co/bitstream/unal/52152/3/201160318_2014.pdf.jpg
https://repositorio.unal.edu.co/bitstream/unal/52152/4/201160318_2014%20appendices.pdf.jpg
bitstream.checksum.fl_str_mv 2f357504940d612f23db0d02e28638db
aea85a21ba07bf6df165080636a67d2a
c7210c5340c4fc0b7f03364c33bfb515
c5318385b27e7dc9bfa408ad58658993
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositorio Institucional Universidad Nacional de Colombia
repository.mail.fl_str_mv repositorio_nal@unal.edu.co
_version_ 1814089833615720448