We are looking for an ML+SE Researcher for the project “Fine-Grained Analysis of Software Ecosystems as Networks” (FASTEN).
Project duration: 3 years
Start date: 1 January 2019
This position offers the unique opportunity to progress your research career in a challenging industrial context, focusing on innovation and pragmatic solutions for software improvement, while still maintaining scientific standards of quality. The Software Improvement Group has research at its core, with a dedicated research team hosting several PhD and Postdoc researchers, several active collaborations with academic partners, and plenty of colleagues with PhD degrees. The FASTEN project asks for a rich skillset: software engineering expertise combined with machine learning and/or statistical techniques, scientific curiosity and proficiency, the ability to publish, and the drive to integrate research results into daily practice. In this position you will be creating the next-gen software risk analysis models and tools, which will have a lasting impact on software development practice.
A popular form of software reuse involves linking open source software (OSS) libraries hosted on centralized code repositories, such as Maven or PyPI. Developers only need to declare dependencies on external libraries, and automated tools make them available in the project’s workspace. Recent events have demonstrated that dependencies on networks of external libraries can introduce significant operational and compliance risk, as well as security implications that are difficult to assess: the leftpad incident caused hundreds of thousands of websites to stop working, and the Equifax data breach led to a leak of hundreds of thousands of credit card numbers. Solving these problems would boost the efficiency and production quality of software development companies by allowing them to reuse OSS code with confidence, unlocking a large untapped potential.
To address this situation, the FASTEN project introduces fine-grained, method-level tracking of dependencies on top of existing dependency management networks. Specifically, the project will introduce a service that tracks dependencies at the method call-graph level and performs sophisticated analyses of i) security vulnerability propagation, ii) licensing compliance, and iii) dependency risk profiles. To facilitate adoption, FASTEN will put those analyses in the hands of developers by integrating the analysis service into popular package managers for the Java, C, and Python programming languages.
The project consortium consists of Delft University of Technology, Athens University of Economics and Business, Università degli studi di Milano, XWiki SAS, Endocode AG, Software Improvement Group, and OW2.
As an ML+SE Researcher you will provide SIG’s contribution to the FASTEN project. This consists of both conceptual and applied research at the intersection of the machine learning and software engineering research fields, aiming for concrete innovations within the company. Your tasks will include, for example, creating new conceptual risk models for quality and security based on the FASTEN dependency networks, prototyping ML and/or statistical solutions, performing validation studies, generating exploitation ideas, and helping drive integration with the company’s existing software stack.
You will be part of the Research team at SIG’s Amsterdam office, where you will participate in team activities, get coaching from senior staff, and have day-to-day interaction opportunities with the consulting and software development branches within the company.
SIG Work Package
SIG is leading the work package titled “Security, Quality and Risk”, which will focus on exploiting the FASTEN techniques for better, more accurate analysis of both the quality and the risks of software systems: a key element in translating the technical achievements of FASTEN into better insights for software development organizations. This work package will develop a method for aggregating properties of libraries and software components. This aggregation can benefit from FASTEN results in two ways: (1) it can be more accurate and insightful by using the detailed information about the dependencies in software, and (2) in addition to conventional quality and risk measurements, such as the maintainability metrics previously developed and used by SIG, it can collect and aggregate the insights from the change impact analysis and compliance work packages, to come up with overall risk profiles of software systems and/or individual libraries.