The current document (D2.1) constitutes the first version of the deliverable “SmartCLIDE Innovative Approaches and Features on Services Discovery, Creation, Composition and Deployment” of the SmartCLIDE project. The aim of the deliverable is two-fold: (a) to describe novel approaches for service discovery, creation, composition, and deployment; and (b) to provide details on the adoption of existing technological approaches for service discovery, creation, composition, and deployment. This first version of the deliverable presents: (a) an approach for service discovery and classification; (b) an initial version of the service registry (the querying process will be finalized in the final version of D2.1); (c) a process for service creation and service specification; (d) an initial approach for source code generation, mostly focused on design pattern selection and skeleton creation; (e) a process for service composition; (f) an initial version of AI support in service composition; (g) an initial process for testing services and workflows; (h) an initial version of the security, maintainability, and reusability assessment of services and workflows; and (i) a process for service and workflow deployment. Given the above, apart from finalizing the aforementioned initial approaches, the final version of the deliverable will contribute: (a) service specification for runtime monitoring and verification; (b) source code autocomplete suggestions; (c) approaches for requirements assessment; and (d) approaches for service monitoring upon deployment.

Deliverable “D3.1 Early SmartCLIDE IDE Design” is part of WP3 and is produced as the main outcome of Task 3.1 “Design, development and unit testing of the User Interface”, Task 3.2 “Design, development and unit testing of the Deep Learning Engine”, and Task 3.3 “Design, development and unit testing of the Backend Services”. The main objective of WP3 is to design, develop, and unit test the three main SmartCLIDE framework components, namely the User Interface (UI), the Deep Learning Engine (DLE), and the Backend Components. The purpose of this deliverable is to report the early design approach and the technical progress achieved until M20. Emphasis is placed on the frontend and backend of the SmartCLIDE Integrated Development Environment (IDE). This deliverable builds on the outcomes of previous WP1 deliverables, namely “D1.4 The SmartCLIDE Concept” and “D1.5 Design of SmartCLIDE Architecture”. Besides, there is a strong link with WP2, where technology providers have performed research-oriented tasks, documented in “D2.1 SmartCLIDE Innovative Approaches and Features on Services Discovery, Creation, Composition, and Deployment”. This is the first version of the deliverable, which will be updated in M30.

This deliverable describes the validation procedures that will be applied to the full prototype of the SmartCLIDE solution. It provides a description of the validation environments that will be used by each of the industrial Pilot Case partners, along with the individual Test Runs that will be executed to determine the degree of achievement in fulfilling the industrial needs for a Cloud-based IDE. Also included are the procedures for Pilot Case partners to report bugs and suggested revisions to the research and development teams for inclusion in the final prototype of the solution delivered at the end of the project.
A final section addresses the overall performance indicators and targets that will form the basis for the Assessment Methodology and associated Assessment Scenarios under Work Package 5, which complement the full prototype validation testing by quantifying the business, operational and other impacts delivered by the project technologies for each of the industrial Pilot Case partners.

According to the predefined rules set out at the beginning of the SmartCLIDE project, the project presentation and brochure are issued in this section. All the images and materials created (brochure, poster, roll-up for conferences, and project templates) can be downloaded and are, of course, open for use under Creative Commons. This report gives an overview of the SmartCLIDE project public website dissemination area (Dissemination Kit, Blog and Follow-up) and of the internal website and collaboration support. The public site (www.smartclide.eu) is designed to present the work of the SmartCLIDE project to the general public, the scientific community, and industry. It was already presented, with the SmartCLIDE logo, in deliverable D6.3. All partners are collaborating in making local and international news about the goals of the consortium, uploading deliverables to the website, and keeping them open for public access. Our collaboration infrastructure will be evaluated and upgraded as necessary during the lifetime of the project. All partners are encouraged and reminded regularly to provide additional suggestions and further information regarding activities related to the SmartCLIDE project, so that these can be properly captured and advertised via the project website, keeping it current with fresh information and material. Partners are also expected to use the materials provided (printed and online) for their own events and for events at which the Consortium has a presence (updated pictures, updated reports, news about the platform, workshop activities, interaction with end users, etc.). This document will have three releases.

This deliverable describes the validation procedures that will be applied for the early prototype testing of the SmartCLIDE solution. It provides a description of the validation environments that will be used by each of the industrial Pilot Case partners, along with the individual Test Runs that will be executed to determine the degree of progress achieved in fulfilling the industrial needs for a Cloud-based IDE. Also included are the procedures for Pilot Case partners to report bugs and suggested revisions to the research and development teams for inclusion in the full prototype development tasks. A final section addresses the overall performance indicators and targets that will form the basis for the Assessment Scenarios under Work Package 5, which complement the final prototype validation testing and quantify the business, operational and other impacts delivered by the project technologies for each of the industrial Pilot Case partners.

This document presents the current status of the dissemination, communication and exploitation plans for the project. It contains the list of current actions and time scales. It details the post-project actions, including the individual partners’ intentions at the point in time at which the deliverable is published. All agreements with regard to the ownership of results and IPR issues will also be included in this document.
The outcomes resulting from the Standardisation, Clustering and Concertation tasks will also be reported in future releases of this deliverable. This document will have three releases.

This document presents the architecture of the SmartCLIDE system. It is the result of the design process, strongly dependent on and complementing the defined SmartCLIDE requirements, use cases and conceptual design of its components.
Consequently, taking into account the results of the requirements work, the specified set of system use cases and pilot scenarios that captured in detail how the envisioned SmartCLIDE system will offer its functionality to users, and the envisioned technical innovations of the SmartCLIDE system as outlined in its conceptual design, this document focuses on detailing the component-based architecture, the information flows and component interactions view, as well as the deployment architecture of SmartCLIDE. The architecture description is also complemented by the delivery plan of the system: the approach to be used and a time plan for the delivery of the system, with specific phases and milestones. All the aforementioned content has been structured following the concepts and terms of the ISO/IEC/IEEE 42010:2011 “Systems and software engineering — Architecture description” standard. Although this document does not fully comply with the standard’s requirements, applying the standard’s principles increases the rigour of the architecture description and the readability of the document itself. It must be noted that the design process and architecture specification of the system is an ongoing process that will continue in the next phases of the project, in light of new deliverables due in the months to follow. Thus, even though the current document, along with D1.3 and D1.4, is a starting point for the specification of the system, modifications will apply over the course of the project in order to record the evolving requirements and the corresponding changes in the architecture specification, following an agile development approach.

The current document presents the SmartCLIDE Concept. The work described in this document is part of T1.4 “Design of SmartCLIDE System Concept” for WP1 – Specification of SmartCLIDE concept and pilot cases. The objectives of this task can be summarised as follows: the document summarises the concept of the envisaged SmartCLIDE solution, based on the requirements of the industrial partners as well as the state of the art, including the basic approach for each of the main software services and components. The main result of these activities is the high-level concept for the SmartCLIDE solution, which will serve as the starting point for the detailed design documents.

This report describes the SmartCLIDE project website from its conception to its first adaptation. It explains its current role in the overall dissemination and communication process. It further describes both the methods and the technologies used to effectively design and build the SmartCLIDE project website. We have also taken advantage of this document to present the other communication channels that we have set up since the start of the project to cover the early dissemination needs of the consortium.

This document describes the initial Open Data Use Plan of the SmartCLIDE project and the initial data sets that have been identified to be utilised or generated by the four Use Case evaluations for industrial validation of the project technologies. This deliverable also outlines how the research data collected or generated will be handled during and after the SmartCLIDE project, describes which methodology for data collection and generation will be followed, and whether and how data will be shared.
The data sets are described in accordance with the European Commission guidelines of the Open Research Data Pilot and include the key attributes of data type, format, metadata, use of standards, and sharing modalities.

The current document constitutes deliverable D1.1 “State-of-the-Art and Market Analysis” of the SmartCLIDE project. The deliverable aims to explore the current state of research and practice in the topics of interest for the project, and to deliver as its main outcome the baseline requirements for the intended framework. The deliverable has been developed using a well-defined strategy and received contributions from almost all partners of the consortium, so as to provide as comprehensive a view as possible of the current state of the art and the market. The deliverable will be provided as input to Task 1.2 “Specification of Requirements”.
- Authors: Nikolaos Nikolaidis, Elvira-Maria Arvanitou, Christina Volioti, Theodore Maikantis, Apostolos Ampatzoglou, Daniel Feitosa, Alexander Chatzigeorgiou, Phillipe Krief
- Location: The Journal of Systems & Software
Abstract:
Service-Oriented Architectures (SOA) have become a standard for developing software applications, including but not limited to cloud-based ones and enterprise systems. When using SOA, software engineers organize the desired functionality into self-contained and independent services that are invoked through end-points (with API calls). The use of this emerging technology has drastically changed the way software reuse is performed, in the sense that a “service” is a “code chunk” that is reusable (preferably in a black-box manner), although in many (especially “in-house”) cases white-box reuse is also meaningful. To confront the reuse challenges opened up by the rise of SOA, in the SmartCLIDE project we have developed a framework (a methodology and a platform) to aid software engineers in the systematic and more efficient (in terms of time, quality, defects, and process) reuse of services when developing SOA-based cloud applications. In this work, we:
- (a) present the SmartCLIDE methodology and the Eclipse Open SmartCLIDE platform; and
- (b) evaluate the usefulness of the framework, in terms of relevance, usability, and obtained benefits.
The results of the study have confirmed the relevance and rigor of the framework, unveiled some limitations, and pointed to interesting future work directions, but also provided some actionable implications for researchers and practitioners.
Keywords: reuse; service-based development; cloud development; platform
- Authors: Zakieh Alizadehsani, Francisco Pinto-Santos, David Alonso-Moro, David Berrocal Macías & Alfonso González-Briones
- Location: Distributed Computing and Artificial Intelligence, 19th International Conference. Cham: Springer International Publishing, 2022.
Abstract:
Distributed computing has been gaining continually increasing interest over the past years in the research and industrial communities. One of the significant objectives of distributed computing is to provide the infrastructure for performing tasks on independent systems. Utilizing this approach in software development can reduce costs. Consequently, there has been an increasing interest in distributed applications. However, distributed applications need to meet key requirements, such as scalability, availability, and compatibility. In this context, service-based systems provide an architecture that can support the mentioned features. Nevertheless, current services use various technologies and languages, which bring complexity to development. This work aims to facilitate web service development by introducing a deep-learning-based code auto-completion model. This model is used in the toolkit called SmartCLIDE, which provides features to accelerate development using Artificial Intelligence and cloud deployment. The contribution of this work falls into two steps: First, the top web APIs from a benchmark web service data-set have been identified. Afterward, a data optimization approach has been proposed to systematically augment and improve available web service code. Second, the service code auto-completion model has been trained, taking advantage of text-generation trends and deep learning methods. The experimental results on web service code demonstrate that the proposed approach outperforms a general-purpose code-completion model.
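To make the idea of text-generation-based code auto-completion concrete, here is a minimal sketch using the Hugging Face transformers library and a generic pretrained model (gpt2). This is not the SmartCLIDE DLE model or its training pipeline; the model choice and prompt are illustrative only.

```python
# Minimal sketch of LM-based code auto-completion (illustrative only;
# not the SmartCLIDE DLE model). Requires the `transformers` package.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # generic stand-in model

prompt = "import requests\n\nresponse = requests.get("
# Sample three candidate completions for the code prefix.
for candidate in generator(prompt, max_new_tokens=16,
                           num_return_sequences=3, do_sample=True):
    print(candidate["generated_text"])
```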
- Authors: Dimitrios Tsoukalas; Nikolaos Mittas; Alexander Chatzigeorgiou; Dionysios Kehagias; Apostolos Ampatzoglou; Theodoros Amanatidis
(Department of Applied Informatics, University of Macedonia, Thessaloniki, Greece)
- Location: Published in IEEE Transactions on Software Engineering (Volume 48, Issue 12, 1 December 2022)
Abstract:
Technical Debt (TD) is a successful metaphor in conveying the consequences of software inefficiencies and their elimination to both technical and non-technical stakeholders, primarily due to its monetary nature. The identification and quantification of TD rely heavily on the use of a small handful of sophisticated tools that check for violations of certain predefined rules, usually through static analysis. Different tools result in divergent TD estimates, calling into question the reliability of findings derived by a single tool. To alleviate this issue, we use 18 metrics pertaining to source code, repository activity, issue tracking, refactorings, duplication and commenting rates of each class as features for statistical and Machine Learning models, so as to classify them as High-TD or not. As a benchmark we exploit 18,857 classes obtained from 25 Java projects, whose high levels of TD have been confirmed by three leading tools. The findings indicate that it is feasible to identify TD issues with sufficient accuracy and reasonable effort: a subset of superior classifiers achieved an F2-measure score of approximately 0.79 with an associated Module Inspection ratio of approximately 0.10. Based on the results, a tool prototype for automatically assessing the TD of Java projects has been implemented.
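As a rough illustration of the classification setup described above, the following sketch trains a classifier on a synthetic stand-in for the 18 class-level metrics and computes an F2 score and a Module Inspection ratio. The data and model choice are assumptions for illustration; the paper's actual dataset and models are richer.

```python
# Minimal sketch of High-TD classification with F2 and inspection ratio.
# Synthetic data stands in for the paper's 18 class-level metrics.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import fbeta_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=18,
                           weights=[0.9], random_state=0)  # High-TD is the minority
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
pred = clf.predict(X_test)

# F2 weighs recall higher than precision: missing a High-TD class is costly.
print("F2 =", round(fbeta_score(y_test, pred, beta=2), 2))
# Module Inspection ratio: the share of classes flagged for manual inspection.
print("Inspection ratio =", round(pred.sum() / len(pred), 2))
```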
- Authors: Ioannis Zozas; Stamatia Bibi; Apostolos Ampatzoglou
- Location: Published in IEEE Transactions on Software Engineering
Abstract:
JavaScript (JS) is one of the most popular programming languages for developing client-side applications, mainly due to allowing the adoption of different programming styles, not having strict syntax rules, and supporting a plethora of frameworks. The flexibility that the language provides may accelerate the development of applications, but can also pose threats to the quality of the final software product, e.g., by introducing Technical Debt (TD). TD reflects the additional cost of software maintenance activities to implement new features, occurring due to poorly developed solutions. Being able to forecast the levels of TD in the future can be extremely valuable in managing TD, since it can contribute to informed decision making when designating future repayments and refactoring budget among a company’s projects. Despite the popularity of JS and the undoubtful benefits of accurate TD forecasting, the literature offers only a limited number of tools and methodologies that are able to: (a) forecast TD during software evolution; (b) provide ground-truth TD quantification to train forecasting, since the available TD tools are based on different rulesets and none is recognized as a state-of-the-art solution; and (c) take into consideration the language-specific characteristics of JS. As the main contribution of this study, we propose a methodology (along with a supporting tool) that supports the aforementioned goals, based on Backward Stepwise Regression and the Auto-Regressive Integrated Moving Average (ARIMA). We evaluate the proposed approach through a case study on 19,636 releases of 105 open-source applications. The results point out that: (a) the proposed model can lead to an accurate prediction of TD; and (b) the number of appearances of the “new” and “eval” keywords, along with the number of “anonymous” and “arrow” functions, are among the features of the JavaScript language that are related to high levels of TD.
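For readers unfamiliar with ARIMA-based forecasting, a minimal sketch of the core step follows, assuming statsmodels and a synthetic TD-per-release series; the paper additionally applies Backward Stepwise Regression and works on real project histories.

```python
# Minimal ARIMA forecasting sketch; the TD series below is synthetic.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
td = np.cumsum(rng.normal(50, 10, size=60))  # hypothetical TD per release

fit = ARIMA(td, order=(1, 1, 1)).fit()  # order chosen arbitrarily here
print(fit.forecast(steps=5))            # TD estimates for the next 5 releases
```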
- Authors: Eleni Polyzoidou, Evangelia Papagiannaki, Nikolaos Nikolaidis, Apostolos Ampatzoglou, Nikolaos Mittas, Elvira Maria Arvanitou, Alexander Chatzigeorgiou, George Manolis, Evdoxia Manganopoulou
- Location: Wiley Online Library
Abstract:
Design patterns are well-known solutions to recurring design problems that are widely adopted in the software industry, either as formal means of communication or as a way to improve structural quality, enabling proper software extension. However, the adoption and correct instantiation of patterns is not a trivial task and requires substantial design experience. Some patterns are conceptually close or present similar design alternatives, leading novice developers to improper pattern selection, thereby reducing maintainability. Additionally, the mis-instantiation of a GoF (Gang-of-Four) design pattern leads to phenomena such as pattern grime or architecture decay. To alleviate this problem, in this work we propose an approach that can help software engineers to more easily and safely select the proper design pattern for a given design problem. The approach relies on decision trees, which are constructed using domain knowledge, while options are conveyed to software engineers through an Eclipse Theia plugin. To assess the usefulness and the perceived benefits of the approach, as well as the usability of the tool support, we have conducted an industrial validation study, using various data collection methods, such as questionnaires, focus groups, and task analysis. The results of the study suggest that the proposed approach is promising, since it increases the probability of the proper pattern being selected, and various useful future work suggestions have been obtained from the practitioners.
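To illustrate the decision-tree idea (not the tool's actual tree), here is a hand-built sketch in which each internal node asks a design question and each leaf names a GoF pattern; the questions and pattern choices below are simplified examples.

```python
# Illustrative decision tree for pattern selection; questions and
# patterns are simplified stand-ins, not the plugin's real tree.
TREE = {
    "question": "Do you need to vary an object's behaviour at runtime?",
    "yes": {
        "question": "Is the behaviour chosen from interchangeable algorithms?",
        "yes": "Strategy",
        "no": "State",
    },
    "no": "Template Method",
}

def select_pattern(node, answer):
    """Walk the tree, asking questions until a leaf (pattern name) is reached."""
    while isinstance(node, dict):
        node = node["yes"] if answer(node["question"]) else node["no"]
    return node

print(select_pattern(TREE, lambda q: True))  # -> Strategy
```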
- Authors: Nikolaos Nikolaidis; Nikolaos Mittas; Apostolos Ampatzoglou; Elvira-Maria Arvanitou; Alexander Chatzigeorgiou
(Department of Applied Informatics, University of Macedonia, Greece)
- Location: Published in IEEE Transactions on Software Engineering
Abstract:
Quality improvement can be performed at the: (a) micro-management level: interventions applied at a fine-grained level (e.g., at a class or method level, by applying a refactoring); or (b) macro-management level: interventions applied at a large scale (e.g., at project level, by using a new framework or imposing a quality gate). By considering that the outcome of any activity can be characterized as the product of impact and scale, in this paper we aim at exploring the impact of Technical Debt (TD) Macro-Management, whose scale is by definition larger than TD Micro-Management. By considering that TD artifacts reside at the micro-level, the problem calls for a nested model solution; i.e., modeling the structure of the problem: artifacts have some inherent characteristics (e.g., size and complexity), but obey the same project management rules (e.g., quality gates, CI/CD features, etc.). In this paper, we use the Under-Bagging based Generalized Linear Mixed Models approach to unveil project management activities that are associated with the existence of HIGH_TD artifacts, through an empirical study on 100 open-source projects. The results of the study confirm that micro-management parameters are associated with the probability of a class being classified as HIGH_TD, but the results can be further improved by controlling some project-level parameters. Based on the findings of our nested analysis, we can advise practitioners on macro-technical debt management approaches (such as “control the number of commits per day”, “adopt quality control practices”, and “separate testing and development teams”) that can significantly reduce the probability of software artifacts concentrating HIGH_TD. Although some of these findings are intuitive, this is the first work that delivers empirical quantitative evidence on the relation between TD values and project- or process-level metrics.
- Authors: Nikolaos Nikolaidis, Apostolos Ampatzoglou, Alexander Chatzigeorgiou, Sofia Tsekeridou & Avraam Piperidis
- Location: PROFES 2022: Product-Focused Software Process Improvement, pp. 265–281
Abstract:
Service-Oriented Architectures (SOA) have become a standard for developing software applications, including but not limited to cloud-based ones and enterprise systems. When using SOA, software engineers organize the desired functionality into self-contained and independent services that are invoked through end-points (API calls). At the maintenance phase, the tickets (bugs, functional updates, new features, etc.) usually correspond to specific services. Therefore, for maintenance-related estimates it makes sense to use as the unit of analysis the service per se, rather than the complete project (too coarse-grained an analysis) or a specific class (too fine-grained an analysis). Currently, some of the most emergent maintenance estimates are related to Technical Debt (TD), i.e., the additional maintenance cost incurred due to code or design inefficiencies. In the literature, there is no established way to quantify TD at the service level. To this end, in this paper, we present a novel methodology to measure the TD of each service, considering the underlying code that supports the corresponding endpoint. The proposed methodology relies on the method call graph initiated by the service end-point, and traverses all methods that provide the service functionality. To evaluate the usefulness of this approach, we have conducted an industrial study, validating the methodology (and the accompanying tool) with respect to usefulness, obtained benefits, and usability.
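The core mechanism (traversing the call graph from a service end-point and aggregating the TD of all reachable methods) can be sketched as follows; the call graph and per-method TD values are hypothetical stand-ins for static-analysis output.

```python
# Sketch of service-level TD: BFS from the end-point over a (hypothetical)
# call graph, summing the TD of every reachable method.
from collections import deque

CALLS = {                       # method -> methods it invokes
    "GET /price": ["fetch_rate", "format_response"],
    "fetch_rate": ["http_get", "parse_json"],
    "format_response": [], "http_get": [], "parse_json": [],
}
TD_MINUTES = {"GET /price": 5, "fetch_rate": 30, "format_response": 2,
              "http_get": 12, "parse_json": 8}

def service_td(endpoint):
    seen, queue, total = set(), deque([endpoint]), 0
    while queue:
        method = queue.popleft()
        if method in seen:
            continue
        seen.add(method)
        total += TD_MINUTES.get(method, 0)
        queue.extend(CALLS.get(method, []))
    return total

print(service_td("GET /price"))  # 57 minutes of TD for this service
```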
- Authors: Nikolaos Nikolaidis, Dimitrios Zisis, Apostolos Ampatzoglou, Nikolaos Mittas, and Alexander Chatzigeorgiou
- Location: 6th International Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE 2022)
- Link: will come soon
Abstract: Refactorings constitute the most direct and comprehensible approach for addressing software quality issues, stemming directly from identified code smells. Nevertheless, despite their popularity in both the research and industrial communities: (a) the effect of a refactoring is not guaranteed to be successful; and (b) the plethora of available refactoring opportunities does not allow their comprehensive application. Thus, there is a need for guidance on when to apply a refactoring opportunity, and when the development team shall postpone it. The notion of interest forms one of the major pillars of the Technical Debt metaphor, expressing the additional maintenance effort that will be required because of the accumulated debt. To assess the benefits of refactorings and guide when a refactoring should take place, we first present the results of an empirical study assessing and quantifying the impact of various refactorings on Technical Debt Interest (building a real-world training set) and use machine learning approaches for guiding the application of future refactorings. To estimate interest, we rely on the FITTED framework, which for each object-oriented class assesses its distance from the best-quality peer; whereas the refactorings that are applied throughout the history of a software project are extracted with the RefactoringMiner tool. The dataset of this study involves 4,166 refactorings applied across 26,058 revisions of 10 Apache projects. The results suggest that the majority of refactorings reduce Technical Debt interest; however, considering all refactoring applications, it cannot be claimed that the mean impact differs from zero, confirming the results of previous studies highlighting mixed effects from the application of refactorings. To alleviate this problem, we have built an adequately accurate (~70%) model for the prediction of whether or not a refactoring should take place, in order to reduce Technical Debt interest.
- Authors: Elvira Maria Arvanitou, Pigi Argyriadou, Georgia Koutsou, Apostolos Ampatzoglou
- Location: 48th Euromicro Conference on Software Engineering and Advanced Applications (SEAA’ 22)
Abstract: Despite the attention that Technical Debt has attracted over the last years, the quantification of TD Interest still remains rather vague (and abstract). TD Interest quantification is hindered by various factors that introduce a lot of uncertainty, such as: identifying the parts of the system that will be maintained, quantifying the load of maintenance, as well as the size of the maintenance penalty due to the existence of TD. In this study, we aim to shed light on the current approaches for quantifying TD Interest by exploring existing literature within the TD and Maintenance communities. To achieve this goal, we performed a systematic mapping study on Scopus and explored: (a) the existing approaches for quantifying TD Interest; (b) the existing approaches for estimating Maintenance Cost; and (c) the factors that must be taken into account for their quantification. The broad search process returned more than 1,000 articles, out of which only 25 provide well-defined mathematical formulas/equations for the quantification of TD Interest or Maintenance Cost (only 6 of them explicitly for TD Interest). The results suggest that despite their similarities, the quantification of TD Interest presents additional challenges compared to Maintenance Cost Estimation, rendering (at least for the time being) the accurate quantification of TD Interest an open and distant research problem. Regarding the factors that need to be considered for such an endeavor, based on the literature: size, complexity, and business parameters are those most actively associated with TD Interest quantification.
- Authors: Zakieh Alizadehsani, Daniel Feitosa, Theodore Maikantis, Apostolos Ampatzoglou
- Location: 48th Euromicro Conference on Software Engineering and Advanced Applications (SEAA’ 22)
Abstract: Developing software based on services is one of the most emerging programming paradigms in software development. Service-based software development relies on the composition of services (i.e., pieces of code already built and deployed in the cloud) through orchestrated API calls. Black-box reuse can play a prominent role when using this programming paradigm, in the sense that identifying and reusing already existing/deployed services can save substantial development effort. According to the literature, identifying reusable assets (i.e., components, classes, or services) is more successful and efficient when the discovery process is domain-specific. To facilitate domain-specific service discovery, we propose a service classification approach that can categorize services into an application domain, given only the service description. To validate the accuracy of our classification approach, we have trained a machine-learning model on thousands of open-source services and tested it on 67 services developed within two companies employing service-based software development. The study results suggest that the classification algorithm can perform adequately on a test set that does not overlap with the training set, thus being (with some confidence) transferable to other industrial cases. Additionally, we expand the body of knowledge on software categorization by highlighting sets of domains that constitute ‘grey zones’ in service classification.
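A minimal sketch of description-based domain classification follows, assuming scikit-learn; the descriptions, domains, and model choice are illustrative, whereas the study trains on thousands of open-source services.

```python
# Sketch: classify a service into a domain from its description alone.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

descriptions = [
    "REST API returning current weather and forecasts for a location",
    "Endpoint that authorises card payments and refunds",
    "Service exposing hourly temperature and precipitation data",
    "API for creating invoices and processing transactions",
]
domains = ["weather", "payments", "weather", "payments"]

clf = make_pipeline(TfidfVectorizer(), LinearSVC()).fit(descriptions, domains)
print(clf.predict(["API that charges a credit card"]))  # -> ['payments']
```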
- Authors: Apostolos Ichtsis, Nikolaos Mittas, Apostolos Ampatzoglou, Alexander Chatzigeorgiou
- Location: 5th International Conference on Technical Debt (TechDEBT’ 22)
Abstract: Technical Debt estimation relies heavily on the use of static analysis tools looking for violations of pre-defined rules. Largely, Technical Debt principal is attributed to the presence of low-level code smells, unavoidably tying the effort for fixing the problems with mere coding inefficiencies. At the same time, despite their simple definition, the detection of most code smells is non-trivial and subjective, rendering the assessment of Technical Debt principal dubious. To this end, we have revisited the literature on code smell detection approaches backed by tools and developed an Eclipse plugin that incorporates six code smell detection approaches. The combined application of various smell detectors can increase the certainty of identifying actual code smells that matter to the development team. We also conduct a case study to investigate the agreement among the employed code smell detectors. To our surprise, the level of agreement is quite low even for relatively simple code smells, threatening the validity of existing TD analysis tools and calling for increased attention to the precise specification of code and design level issues.
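One simple way to quantify the (dis)agreement among detectors, in the spirit of the case study above, is pairwise Cohen's kappa over their per-class verdicts; kappa is a stand-in metric chosen for this sketch, and the verdicts below are invented for illustration.

```python
# Sketch: pairwise agreement between smell detectors via Cohen's kappa.
from itertools import combinations
from sklearn.metrics import cohen_kappa_score

VERDICTS = {                 # detector -> smell verdict per class (1 = smell)
    "tool_a": [1, 0, 1, 1, 0, 0, 1, 0],
    "tool_b": [1, 0, 0, 1, 0, 1, 0, 0],
    "tool_c": [0, 0, 1, 1, 0, 0, 1, 1],
}

for a, b in combinations(VERDICTS, 2):
    print(f"{a} vs {b}: kappa = {cohen_kappa_score(VERDICTS[a], VERDICTS[b]):.2f}")
```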
- Authors: Elvira Maria Arvanitou; Apostolos Ampatzoglou; Stamatia Bibi; Alexander Chatzigeorgiou; Ignatios Deligiannis
- Location: IEEE Access (Volume 10)
Abstract: DevOps is an emerging software development methodology that differs from more traditional approaches due to the closer involvement of the customer and the adoption of “continuous-*” (e.g., integration, deployment, delivery, etc.) practices. The vast research on DevOps (including numerous secondary studies) published in a short timeframe, and the diversity of the authors’ research backgrounds (e.g., from a Dev or an Ops perspective), has inevitably produced a long list of investigated topics, which use inconsistent terminology. The goal of this study is to analyze literature reviews on DevOps with respect to: (a) the research topics in DevOps; (b) the terms that are mapped to each topic; and (c) the consistency of terminology. To achieve this goal, we have performed a tertiary study, i.e., a systematic mapping study that uses as primary studies “Systematic Literature Reviews” and “Mapping Studies”. For Data Extraction, Analysis, and Synthesis (DEAS) we propose a novel approach relying on thematic analysis, statistical analysis, and meta-analysis. The results unveiled 7 core topics on DevOps research, of which DevOps features and DevOps practices are the dominant ones. Additionally, as expected, various terminology ambiguities have been identified, mostly between features and practices, as well as between challenges faced before adopting DevOps and while applying DevOps. The main contribution of this study is the disambiguation of the mapping of terms to topics. Along this process, we highlight both inconsistencies (attempting to resolve ambiguities) and topics and terms with high levels of consistency, aiding researchers and practitioners.
- Authors: Panagiotis Kotsikoris, Theodore Chaikalis, Apostolos Ampatzoglou, Alexander Chatzigeorgiou
- Location: 17th International Conference on the Evaluation of Novel Approaches to Software Engineering (ENASE ‘22)
Abstract: The last decade undeniably marked the leading role of web services and the establishment of service-oriented architectures. Indeed, it is nowadays hard to find a contemporary software application that does not use at least one third-party web service. The main driver for this paradigm shift lies in the benefits that decoupled, cloud-based services bring to software development, operation and maintenance, as well as in the seamless deployment, integration and scalability features that modern public clouds provide.
- Authors: Elvira Maria Arvanitou, Nikolaos Nikolaidis, Apostolos Ampatzoglou, Alexander Chatzigeorgiou
- Location: 17th International Conference on the Evaluation of Novel Approaches to Software Engineering (ENASE ‘22)
Abstract: Scientific software development refers to a specific branch of software engineering that targets the development of scientific applications. Such applications are usually developed by non-expert software engineers (e.g., natural scientists, biologists, etc.) and present special challenges. One such challenge (stemming from the lack of proper software engineering background) is the low structural quality of the end software (also known as Technical Debt), leading to long debugging and maintenance cycles. To contribute towards understanding the software engineering practices that are used in scientific software development, and to investigate whether their application can prevent structural quality decay (also known as Technical Debt prevention), in this study we seek insights from professional scientific software developers through a questionnaire-based empirical setup. The results of our work suggest that several practices (e.g., Reuse and Proper Testing) can prevent the introduction of Technical Debt in software development projects. On the other hand, some practices appear improper for TD prevention (e.g., Parallel / Distributed Programming), whereas others appear non-applicable to the branch of scientific software development (e.g., Refactorings or Use of IDEs). The results of this study prove useful for the training plan of scientists before joining development teams, as well as for senior scientists that act as project managers in such projects.
- Authors: Dimitrios Tsoukalas, Alexander Chatzigeorgiou, Apostolos Ampatzoglou, Nikolaos Mittas
- Location: 5th International Conference on Technical Debt (TechDEBT’ 22)
Abstract: To date, the identification and quantification of Technical Debt (TD) rely heavily on a few sophisticated tools that check for violations of certain predefined rules, usually through static analysis. Different tools result in divergent TD estimates calling into question the reliability of findings derived by a single tool. To alleviate this issue, we present a tool that employs machine learning on a dataset built upon the convergence of three widely-adopted TD Assessment tools to automatically assess the class-level TD for any arbitrary Java project. The proposed tool is able to classify software classes as high-TD or not, by synthesizing source code and repository activity information retrieved by employing four popular open source analyzers. The classification results are combined with proper visualization techniques, to enable the identification of classes that are more likely to be problematic. To demonstrate the proposed tool and evaluate its usefulness, a case study is conducted based on a real-world open-source software project. The proposed tool is expected to facilitate TD management activities and enable further experimentation through its use in an academic or industrial setting.
- Video: https://youtu.be/umgXU8u7lIA
- Running Instance: http://160.40.52.130:3000/tdclassifier
- Source Code: https://gitlab.seis.iti.gr/root/td-classifier.git
- Authors: Theodore Maikantis, Theodore Chaikalis, Apostolos Ampatzoglou, Alexander Chatzigeorgiou
- Location: PCI 2021. 25th Pan-Hellenic Conference on Informatics
Abstract: Nowadays the majority of cloud applications are developed based on the Service-Oriented Architecture (SOA) paradigm. Large-scale applications are structured as a collection of well-integrated services that are deployed in public, private or hybrid clouds. Despite the inherent benefits that service-based cloud development provides, the process is far from trivial, in the sense that it requires the software engineer to be (at least) comfortable with the use of various technologies in the long cloud development toolchain: programming in various languages, testing tools, build/CI tools, repositories, deployment mechanisms, etc. In this paper, we propose an approach and corresponding toolkit (termed SmartCLIDE, as part of the results of an EU-funded research project) for facilitating SOA-based software development for the cloud, by extending a well-known cloud IDE from Eclipse. The approach aims at shortening the toolchain for cloud development, hiding the process complexity and lowering the required level of knowledge from software engineers. The approach and tool underwent an initial validation by professional cloud software developers. The results underline the potential of such an automation approach, as well as the usability of the research prototype, opening further research opportunities and providing benefits for practitioners.
- Authors: David Berrocal-Macías, Zakieh Alizadeh-Sani, Francisco Pinto-Santos, Alfonso González-Briones, Pablo Chamoso, Juan M. Corchado
- Location: PAAMS 2021 : 19th International Conference on Practical Applications of Agents and Multi-Agent Systems
Abstract:
The great development of the internet and all its associated systems has led to the growth of multiple and diverse capabilities in the field of software development.
One such development was the emergence of code repositories that allow developers to share their projects, as well as allowing other developers to contribute to their growth and improvement. However, the use of these systems has grown to such an extent that there are multiple projects with very similar names and themes, making it difficult to quickly find a repository that fully matches the developer’s needs.
This process of searching for repositories that fit the initial needs has become a complex task. Due to this complexity, developers need tools that allow them to process a large amount of information that can be downloaded and analysed programmatically. This need can be addressed by approaches such as big data and scraping.
This paper presents the design of a data ingestion system for libraries, components and repositories in a multi-agent programming environment.
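As a minimal illustration of programmatic repository ingestion (far simpler than the multi-agent system the paper describes), the public GitHub search API can be queried directly; note that unauthenticated requests are heavily rate-limited, and the query string here is illustrative.

```python
# Sketch: fetch candidate repositories from the public GitHub search API.
import requests

resp = requests.get(
    "https://api.github.com/search/repositories",
    params={"q": "web service language:java", "sort": "stars", "per_page": 5},
    headers={"Accept": "application/vnd.github+json"},
    timeout=10,
)
resp.raise_for_status()
for repo in resp.json()["items"]:
    print(repo["full_name"], repo["stargazers_count"])
```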
- Authors: Francisco Pinto-Santos, Zakieh Alizadeh-Sani, David Alonso, Alfonso González-Briones, Pablo Chamoso, Juan M. Corchado
- Location: PAAMS 2021 : 19th International Conference on Practical Applications of Agents and Multi-Agent Systems
Abstract:
Today, the paradigm of multi-agent systems has earned a place in the field of software engineering thanks to its versatility in adapting to various domains. However, the construction of these systems is complex, which leads to additional costs in the implementation process. In recent years, several frameworks have emerged to simplify this task by providing functionalities that these systems need as a basis, or even tools to generate code related to this paradigm. These tools are, however, based on a single framework, protocol and language, which severely limits their code generation capacity.
Therefore, this paper proposes a tool for the code generation of complete multi-agent systems, focused on eliminating the restrictions of programming language, framework, communication protocol, etc., through the use of model-driven and template-driven development.
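To give a flavour of template-driven code generation, here is a deliberately tiny sketch: a template filled from a simple model. The template, model format, and generated class are invented for this example and do not reflect the paper's actual meta-models.

```python
# Sketch of template-driven code generation from a simple model.
from string import Template

AGENT_TEMPLATE = Template('''\
class ${name}Agent(${base}):
    """Auto-generated agent: ${description}"""
    def handle(self, message):
        ${handler_body}
''')

model = {
    "name": "Pricing",
    "base": "BaseAgent",
    "description": "answers price queries",
    "handler_body": "return self.lookup_price(message)",
}
print(AGENT_TEMPLATE.substitute(model))  # emits the generated agent class
```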
- Authors: Zakieh Alizadeh-Sani, Pablo Plaza Martínez, Guillermo Hernández González, Alfonso González-Briones, Pablo Chamoso, Juan M. Corchado
- Location: PAAMS 2021 : 19th International Conference on Practical Applications of Agents and Multi-Agent Systems
Abstract:
Reusing software is a promising way to reduce software development costs. Nowadays, applications compose available web services to build new software products. In this context, service composition faces the challenge of proper service selection. This paper presents a model for classifying web services. The service dataset has been collected from the well-known public service registry called ProgrammableWeb.
The results were obtained by breaking service classification into a two-step process. First, web service data pre-processed with Natural Language Processing (NLP) have been clustered using the agglomerative hierarchical clustering algorithm. Second, several supervised learning algorithms have been applied to determine service categories.
The findings show that the hybrid approach using the combination of hierarchical clustering and SVM provides acceptable results in comparison with other unsupervised/supervised combinations.
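The two-step hybrid can be sketched as follows, assuming scikit-learn: cluster TF-IDF vectors of service descriptions with agglomerative clustering, then train an SVM on the resulting groups. The four toy descriptions stand in for the ProgrammableWeb dataset.

```python
# Sketch of the hybrid: agglomerative clustering, then an SVM on the clusters.
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

docs = ["weather forecast api", "hourly temperature service",
        "payment gateway api", "credit card charging service"]
X = TfidfVectorizer().fit_transform(docs).toarray()  # dense vectors for clustering

clusters = AgglomerativeClustering(n_clusters=2).fit_predict(X)
svm = SVC().fit(X, clusters)     # the SVM learns the cluster boundaries
print(clusters, svm.predict(X))  # cluster labels and SVM reproductions
```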
- Authors: Angeliki-Agathi Tsintzira, Elvira-Maria Arvanitou, Apostolos Ampatzoglou, Alexander Chatzigeorgiou (University of Macedonia)
- Location: Sept. 2020 – 13th International Conference on the Quality of Information and Communications Technology (QUATIC 2020).
Abstract: Technical Debt Management (TDM) is a fast-growing field that in the last years has attracted the attention of both academia and industry. TDM is a complex process, in the sense that it relies on multiple and heterogeneous data sources (e.g., source code, feature requests, bugs, developers’ activity, etc.), which cannot be straightforwardly synthesized; leading the community to use mostly qualitative empirical methods. However, empirical studies that involve expert judgment are inherently biased, compared to automated or semi-automated approaches. To overcome this limitation, the broader (not TDM) software engineering community has started to employ machine learning (ML) technologies. Our goal is to investigate the opportunity of applying ML technologies for TDM, through a Systematic Literature Review (SLR) on the application of ML to software engineering problems (since ML applications on TDM are limited).
Thus, we have performed a broader-scope study, i.e., on machine learning for software engineering, and then synthesized the results so as to achieve our high-level goal (i.e., the possible application of ML in TDM). Therefore, we have conducted a literature review by browsing the research corpus published in five high-quality SE journals, with the goal of cataloging: (a) the software engineering practices in which ML is used; (b) the machine learning technologies that are used for solving them; and (c) the intersection of the two: developing a problem-solution mapping. The results are useful to both academics and industry, since the former can identify possible gaps and interesting future research directions, whereas the latter can obtain benefits by adopting ML technologies.
SmartCLIDE offers services to accelerate the creation and deployment of Cloud solutions by providing the ability for non-programmers to construct applications and new services using smart automation. One of the backend services provided by SmartCLIDE is runtime monitoring and verification (RMV), which, in conjunction with automated testing, is applied to assure the quality of the created services. In this paper we describe the objectives of RMV and provide an overview of the approach and its benefits.
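To convey the runtime-monitoring idea in miniature, the sketch below checks a simple property ("every request is eventually matched by a response") over an event stream; the RMV subsystem itself is property- and specification-driven and far more general, so this monitor and its events are purely illustrative.

```python
# Toy runtime monitor for a request/response pairing property (illustrative).
def monitor(events):
    pending = 0
    for event in events:
        if event == "request":
            pending += 1
        elif event == "response":
            if pending == 0:
                return "violation: response without a request"
            pending -= 1
    return "ok" if pending == 0 else f"violation: {pending} unanswered request(s)"

print(monitor(["request", "response", "request"]))  # violation: 1 unanswered request(s)
```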
- Authors: Sebastian Scholze
- Location: Eclipse Newsletter, Sept. 2021
Abstract: The SmartCLIDE research project aims to bridge the gap between on-demand business strategies and the lack of qualified software professionals by creating a new cloud native IDE that makes it easier to develop and deploy cloud services. The project is funded by the European Union’s Horizon 2020 research and innovation program, and involves a consortium of 11 partners from Germany, Greece, Luxembourg, Portugal, Spain, and the United Kingdom.
Abstract: The Eclipse OpenSmartCLIDE project aims to deliver a cloud-native IDE based on Eclipse Theia for cloud developers. The IDE integrates a comprehensive set of tools that support developers in all phases of the SDLC, from requirements and design up to testing and deployment. In a nutshell, Eclipse OpenSmartCLIDE offers automation through innovation in all aspects of the SDLC for cloud development. Our presentation at TheiaCon 2022 includes a brief description of the project (design and key principles), as well as hands-on short demos.
- Slide deck presented by Nikolaos Nikolaidis (University of Macedonia) during the CloudDev Community Day at EclipseCon 2022.
- On 3 March 2022, from 14:00 – 16:00 CET, HORIZON CLOUD hosted its March Community event for the European Cloud Community. The two H2020 Cloud Research and Innovation Actions SmartCLIDE and PHYSICS used the webinar to share their project outcomes with the community. After a brief presentation of the project, the recording features a demo of the “SmartCLIDE Service Creation and Composition” component, followed by a presentation of the “SmartCLIDE Deep Learning Engine” component. Enjoy it! The session has been recorded. Check it out: https://youtu.be/VgmiIp7bGEk
- Sebastian Scholze (ATB), presented at the Open Research Webinars co-organized by the Eclipse Foundation and OW2, December 15, 2020.
- Presented at the M9 Review.
- First public presentation of SmartCLIDE, given during EclipseCon 2020.
- To confront the challenges opened up by the rise of SOA, in the OpenSmartCLIDE project we have developed a framework (a methodology and a platform) to aid software engineers in the systematic and more efficient (in terms of time, quality, defects, and process) reuse of services when developing SOA-based cloud applications. In the following video, we present a use-case scenario of our platform that uses the Service Composition, Service Discovery, Service Creation, and Deployment components. Suppose we want to get the current Bitcoin price in USD: we can create a workflow that does exactly that. We take advantage of an existing service to get the Bitcoin price in EUR (Service Discovery) and create a service that converts euros to USD (Service Creation). With these two services in place, we create a workflow that connects them, passing the price in EUR from the first to the second to obtain the final value (Service Composition); a minimal code sketch of this composition follows this list. In the end, we can see the quality assessment of the created/used services, or even of the workflow. Moreover, at the service level, we can also use automatically generated unit tests or apply design patterns more easily with the help of the appropriate extensions.
- This video introduces and summarizes the article “Cloud Computing in a nutshell”.
- This video introduces and summarizes the key points of the article “AGILE Methodologies and DevOps”.
- A four-minute video presenting SmartCLIDE challenges, objectives, and targets.
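The Bitcoin workflow described above composes two services: one returning the BTC price in EUR, and one converting EUR to USD. A purely hypothetical sketch of that composition follows; the URLs and response fields are placeholders, not actual SmartCLIDE or third-party end-points.

```python
# Hypothetical sketch of the BTC-price workflow; all end-points are placeholders.
import requests

def btc_price_eur():
    # Discovered service: assumed to return e.g. {"btc_eur": 26000.0}
    return requests.get("https://example.com/api/btc-eur", timeout=10).json()["btc_eur"]

def eur_to_usd(amount):
    # Created service: assumed to return e.g. {"usd": 28000.0}
    resp = requests.get("https://example.com/api/eur-to-usd",
                        params={"amount": amount}, timeout=10)
    return resp.json()["usd"]

# Composed workflow: the output of the first service feeds the second.
print(f"BTC price in USD: {eur_to_usd(btc_price_eur()):.2f}")
```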
This white paper gathers the main articles published by the consortium during the three years of the project. Enjoy the reading!
- A two-page document presenting the latest version of the SmartCLIDE project and its associated open-source project, Eclipse OpenSmartCLIDE.
- This poster was originally presented at the TRANSFIERE event.
- A two-page introduction to the SmartCLIDE project.
- The Horizon2020 project SmartCLIDE has officially started on 1st January 2020! (2/21/2020)
- SmartCLIDE: a new cloud-native IDE (6/5/2020)
- Machine Learning and Deep Learning: A power couple (6/8/2020)
- Cloud Computing in a nutshell (6/15/2020)
- Programming By Example (7/7/2020)
- Service Discovery in a Nutshell (7/9/2020)
- AGILE methodologies and DevOps (8/21/2020)
- Use Case: Real-Time Communication Service (11/10/2020)
- Use Case: Enhance IoT-Catalogue with an integrated Cloud IDE (12/23/2020)
- Use Case: Provide a Quick Demonstration for a Customer (1/6/2021)
- SmartCLIDE Market Requirements (Part 1) (8/11/2021)
- SmartCLIDE Market Requirements (Part 2) (8/11/2021)
- SmartCLIDE Service Creation (8/12/2021)
- SmartCLIDE Innovative Approaches (9/7/2021)
- SmartCLIDE User Interface (12/9/2021)
- SmartCLIDE Deep Learning Engine (12/16/2021)
- Backend Service: Source Code Repository (12/20/2021)
- Backend service: Service Discovery, Creation and Monitoring (12/20/2021)
- Backend service: Security (12/20/2021)
- Backend service: Intercommunication (12/20/2021)
- Backend service: User Access Management (12/20/2021)
- Backend services: Deployment and CI/CD (12/20/2021)
- Early SmartCLIDE IDE Design (12/21/2021)
- Testing Cloud-Based Applications (4/11/2022)
- SmartCLIDE DLE Component (6/24/2022)
- Vulnerability prediction based on Text Mining and BERT (7/1/2022)
- Runtime Monitoring and Verification (RMV) (9/15/2022)
- Tool Support for Architectural Pattern Selection in Cloud-centric Service-oriented IDEs (9/29/2022)
- Technical Debt in Service-Oriented Software Systems (12/4/2022)
- Use Case: LoRaWAN Platform As A Service (12/8/2022)
- Use Case: Optimizing Resources (12/19/2022)
- SmartCLIDE White-paper (5/8/2023)
GitHub repositories
All the code of the project is hosted under the Eclipse OpenSmartCLIDE project, except for the Runtime Monitoring and Verification subsystem component, which is still hosted at Eclipse Research Labs.