A formalization for mapping discourses from business-based technical documents into controlled language texts for requirements elicitation
- Authors:
- Manrique Losada, Bell
- Resource type:
- Doctoral thesis
- Publication date:
- 2014
- Institution:
- Universidad Nacional de Colombia
- Repository:
- Universidad Nacional de Colombia
- Language:
- spa
- OAI Identifier:
- oai:repositorio.unal.edu.co:unal/52152
- Online access:
- https://repositorio.unal.edu.co/handle/unal/52152
http://bdigital.unal.edu.co/46419/
- Keywords:
- 0 Generalities / Computer science, information and general works
62 Engineering and allied operations / Engineering
Requirements elicitation
Natural language
Controlled language
Technical document
Natural language processing
Linguistics engineering
Domain knowledge
- Rights
- openAccess
- License
- Attribution-NonCommercial 4.0 International
Summary: | Requirements elicitation for software engineering is a process for obtaining, analyzing, and specifying requirements with the support of stakeholders (analysts, clients, domain experts, and end users, among others). In this process we generate either textual or graphical descriptions that reflect the most relevant concepts of the stakeholder domain for developing a software product, together with the related domain knowledge. Well-known elicitation techniques overrate stakeholder intervention, since interviews, dialogues, and inspection are their most commonly used methods. Such methods, which are subjective and directly dependent on human beings, cause loss of sequence and conciseness and waste elicitation time and money. Stakeholder intervention also leads to problems related to the use of natural language: large amounts of unstructured, redundant information, overuse of synonyms, and ambiguity, among others. Several approaches have been developed for reducing the language differences between stakeholders and analysts: natural language processing of requirements documents; semi-automated identification of lexical features in requirements specifications; semi-automated optimization of feature identification; controlled English for representing knowledge; and identification of the knowledge domain from documents in several domains. Some progress has been made in solving this problem, but such progress either concerns phases other than requirements elicitation or exhibits no relation to the domain knowledge. Part of this progress focuses on techniques such as information retrieval methods, identification of regular expressions, and text mining. These techniques can be applied to the technical and methodological aspects of the requirements elicitation process.
Nowadays, well-known elicitation techniques require: i) high involvement of stakeholders and analysts; ii) an information-specification mapping process; iii) a requirements description prior to the design model conversion; and iv) a method for guiding the elicitation process. In addition to these techniques, others known as synthetic/analytical techniques, which do not depend on human intervention, are useful for considering diverse sources of domain knowledge. The state-of-the-art review identifies several approaches for studying elicitation techniques, but they scarcely use analytical techniques focused on documents. In this research we analyze documents written by the stakeholders in their domains as a source for requirements elicitation. We propose a model, comprising methodical and structural components, to set up the elicitation process by applying analytical elicitation techniques based on technical documents (e.g., policies, regulations, and manuals). Consequently, to deal with language differences, in this proposal we work with controlled languages that already exist for specifying requirements and with the natural language of the stakeholders' domain, translated directly from technical documents. We propose a formalization of the mapping that comprises a variety of the more linguistically informed methods available, based on rhetorical discourse analysis and linguistic processing, which treat each document as a potential input for natural language processing aimed at requirements elicitation. The application of the model based on the analytical technique is expected to produce the following results: understanding and structured representation of the business and organizational information; comprehension of the stakeholder domain; and subsequent application of subjective techniques for specifying requirements. This Ph.D. thesis is concerned with natural language processing for the requirements elicitation process, seeking a model for transforming discourses in technical documents into controlled language texts. In the model we specify the transformation process based on functional, structural, and linguistic patterns from source technical documents, in order to obtain organizational domain knowledge and business information useful for the requirements elicitation process. |
---|