SALE MX - Model Extraction from Natural Language Texts
Short description
SALE MX aims to extract UML models from natural language (NL) text. To avoid error-prone natural language processing (NLP), SALE MX starts after a (currently manual) annotation of an NL text. The annotation explicitly marks the semantics of the text, thereby documenting a common understanding of the requirements.
The basis of the entire process is SENSE, the Software Engineer's Natural Language Semantics Encoding. SENSE describes how semantics can be encoded and used to process NL texts. SALE (the SENSE Annotation Language for English) is one possible realization of the SENSE process and provides a set of thematic roles with which you can explicitly encode the semantics of texts. Although designed for English, SALE can also be used for other languages such as German, French and Hungarian.
SALE also comes with an ANTLR-based compiler that transforms the annotated text into a graph representation, which can be loaded into GrGen.NET. This graph is the internal discourse model of the text and is the central artifact of our process. Graph rewriting rules are then used to evaluate the structure of the semantics. We also use graph rewriting rules to produce an internal graph representation of a UML document, which can be saved as an XMI document for further processing.
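To make the idea of a discourse graph and rewriting rules more concrete, the following is a minimal sketch in Python. It does not use SALE syntax, ANTLR or GrGen.NET; the node kinds, the role names "AG" and "PAT", the example sentence and both rules are assumptions chosen purely for illustration, and the XMI export step is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    ident: str
    kind: str    # hypothetical node kinds: "entity" or "action"
    label: str


@dataclass
class Edge:
    source: str  # ident of the node the edge starts at
    target: str  # ident of the node the edge points to
    role: str    # hypothetical thematic role name, e.g. "AG" or "PAT"


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, ident, kind, label):
        self.nodes[ident] = Node(ident, kind, label)

    def add_edge(self, source, target, role):
        self.edges.append(Edge(source, target, role))


def build_discourse_graph():
    """Hand-built toy graph for the sentence 'The customer orders a book.'"""
    g = Graph()
    g.add_node("n1", "action", "orders")
    g.add_node("n2", "entity", "customer")
    g.add_node("n3", "entity", "book")
    g.add_edge("n1", "n2", "AG")   # the customer is the agent of "orders"
    g.add_edge("n1", "n3", "PAT")  # the book is the patient of "orders"
    return g


def rewrite_to_uml(discourse):
    """Apply two toy rules and return a dict-based stand-in for a UML model."""
    uml = {}
    # Toy rule 1: every entity node becomes a class.
    for node in discourse.nodes.values():
        if node.kind == "entity":
            uml[node.label.capitalize()] = {"methods": []}
    # Toy rule 2: an action becomes a method of its agent's class.
    for edge in discourse.edges:
        if edge.role != "AG":
            continue
        action = discourse.nodes[edge.source]
        agent = discourse.nodes[edge.target]
        if action.kind == "action" and agent.kind == "entity":
            uml[agent.label.capitalize()]["methods"].append(action.label)
    return uml


uml_model = rewrite_to_uml(build_discourse_graph())
```

In this sketch the rules are plain loops over nodes and edges; a graph rewriting system would instead express each rule as a pattern with a replacement and apply it to the graph.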
Apart from the annotation process, the system works without user interaction and produces UML diagrams. The annotation process can be time-consuming and is currently the bottleneck of our system. We therefore aim to provide a support tool for annotators and to pre-annotate texts automatically.
Publications
-
Text to software: developing tools to close the gaps in software engineering
Tichy, W. F.; Körner, S.
2010. Proceedings of the FSESDP workshop on Future of software engineering research, 2010, Santa Fe, NM, 379–384, Association for Computing Machinery (ACM). doi:10.1145/1882362.1882439
-
Modellextraktion aus natürlichen Sprachen : eine Methode zur systematischen Erstellung von Domänenmodellen. Dissertation
Gelhausen, T.
2010. KIT Scientific Publishing. doi:10.5445/KSP/1000019366
-
Automatic Checklist Generation for the Assessment of UML Models
Gelhausen, T.; Landhäußer, M.; Körner, S.
2009. Models in software engineering : Workshops and Symposia at MODELS 2008 Toulouse, France, September 28 - October 3, 2008; reports and revised selected papers. Ed.: M.R.V. Chaudron, 387–399, Springer-Verlag. doi:10.1007/978-3-642-01648-6_40
-
Customizing GrGen.NET for model transformation
Gelhausen, T.; Derre, B.; Geiß, R.
2008. Proceedings of the third international workshop on Graph and model transformations, GRaMoT’08, Leipzig, Germany, 17–24, Association for Computing Machinery (ACM). doi:10.1145/1402947.1402951
-
Applications and Rewriting of Omnigraphs – Exemplified in the Domain of MDD
Denninger, O.; Gelhausen, T.; Geiß, R.
2008. 3rd International Symposium on Applications of Graph Transformations with Industrial Relevance, AGTIVE 2007, Kassel, Germany, 168–183, Springer. doi:10.1007/978-3-540-89020-1_13
-
Improving Automatic Model Creation Using Ontologies
Körner, S. J.; Gelhausen, T.
2008. Proceedings of the Twentieth International Conference on Software Engineering and Knowledge Engineering : San Francisco, USA, July 1 - 3, 2008, SEKE, Redwood City, Calif., Knowledge Systems Institute Graduate School
-
Automatic Checklist Generation for the Assessment of UML Models
Gelhausen, T.; Landhäußer, M.; Körner, S.
2008. Educators Symposium @ MODELS 2008, Toulouse, France, September 28 - October 3, 2008
-
Thematic Role based Generation of UML Models from Real World Requirements
Gelhausen, T.; Tichy, W. F.
2007. Proceedings of the International Conference on Semantic Computing (ICSC ’07), 282–289, IEEE Computer Society