GSoC 2018 - Code Quality

Project Link

Student and Mentors

Student Name -

Nimish Kapoor

Mentors -

Paul Bignier
Clément David
Dhruv Khattar

Abstract

The aim of this project is to improve the code quality of Scilab with the help of static analysis tools: Coverity Scan (Java and C/C++), FindBugs (Java), Cppcheck and Clang-Tidy (C/C++), together with Valgrind, a dynamic analysis tool for C/C++ used to cross-check memory-related fixes. We aim to fix the bugs, defects and typical programming errors these tools report in Java, JNI and C/C++ code, such as style violations and interface misuse, and to add comments that help avoid future conflicts (e.g. use of the proper API), thereby improving the overall quality of the Scilab code. Using multiple tools helps to double-check errors and to mark false positives in the case of C/C++ related errors.

Static Analysis Tools

Static analysis tools work by analyzing source code, bytecode (e.g. compiled Java), and binary executable code. No code is executed during static analysis; instead, the analysis reasons about the potential behavior of the code.

Dynamic versus static analysis

Dynamic testing tools all require program execution in order to generate useful results. Static analysis is relatively efficient at analyzing a codebase compared to dynamic analysis. Static analysis tools also analyze code paths that are untested by other methods and can trace execution and data paths through the code. Static analysis can be incorporated early in the development phase to analyze existing, legacy, and third-party source code and binaries before they are incorporated into our product. As new source code is added, incremental analysis can be used in conjunction with configuration management to ensure quality and security throughout.
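To illustrate a code path that tests rarely reach, here is a minimal sketch with hypothetical names (it is not taken from the Scilab sources). The lookup helper returns a null pointer for unknown names. A test suite that only uses known names never exercises that branch, whereas a static analyzer can flag an unchecked dereference by reasoning about both branches.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical table of named settings, used only for this illustration.
struct Setting
{
    const char* name;
    int value;
};

static const Setting kSettings[] = {{"width", 80}, {"height", 24}};

// Returns a pointer to the matching entry, or nullptr when the name is unknown.
const Setting* findSetting(const char* name)
{
    assert(name != nullptr); // precondition: callers pass a valid C string
    for (const Setting& s : kSettings)
    {
        if (std::strcmp(s.name, name) == 0)
        {
            return &s;
        }
    }
    return nullptr;
}

// Defective pattern (kept as a comment so the example stays safe to run):
//     return findSetting(name)->value;   // null dereference for unknown names
//
// Checked version: the rarely taken "not found" path is handled explicitly.
int settingOrDefault(const char* name, int fallback)
{
    const Setting* s = findSetting(name);
    if (s == nullptr)
    {
        return fallback;
    }
    return s->value;
}
```

Calling settingOrDefault("width", 0) takes the common path, while settingOrDefault("depth", -1) takes the branch that a typical test would miss and returns the fallback.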

Advantages of Static Analysis Tools

Speed: Manual code reviews take developers time, and automated tools are much faster. Static code checking addresses problems early and pinpoints exactly where the error is in the code, so we can fix those errors faster. Coding errors found earlier are also less costly to fix.

Depth: Testing can’t cover every possible code execution path, but a static code analyzer can examine paths that tests never exercise. It checks the code as we work on our build, giving us an in-depth analysis of where there might be potential problems in our code, based on the rules we’ve applied.

Accuracy: Manual code reviews are prone to human error. Automated tools apply the same rules to every line of code to identify potential problems, although some reports turn out to be false positives and have to be marked as such. This helps us ensure that high-quality code is in place before testing begins. After all, when we’re complying with a coding standard, quality is critical.

The bugs reported by these tools pose risks that can lead to undefined behavior, occasional crashes, and security-related vulnerabilities, and hence cannot be ignored. As part of this project, we aim to fix the errors detected by static analysis tools, including the categories discussed below.

Memory-related errors can cause failures such as segmentation faults. A segmentation fault is a specific kind of error caused by accessing memory that “does not belong to us.” It is a protection mechanism that keeps us from corrupting memory and introducing hard-to-debug memory bugs. Whenever we get a segfault, we know we are doing something wrong with memory: accessing a variable that has already been freed, writing to a read-only portion of memory, etc. Segmentation faults cause crashes and hence need to be fixed.
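As a hedged illustration of the "already freed" case, the sketch below uses a hypothetical Message type (not Scilab code). The defective pattern is left in a comment because executing it is undefined behavior. The safer version lets std::unique_ptr own the object, so it is released exactly once, after its last use.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical message holder used only for this illustration.
struct Message
{
    std::string text;
};

// Defective pattern (comment only, running it is undefined behavior):
//     Message* m = new Message{text};
//     delete m;
//     return m->text[0];   // use after free: m no longer points to a live object
//
// Safer version: the unique_ptr releases the object when `m` goes out of
// scope, which happens only after the returned character has been copied.
char firstLetter(const std::string& text)
{
    assert(!text.empty()); // precondition: the caller passes a non-empty string
    std::unique_ptr<Message> m(new Message{text});
    return m->text[0];
}
```

Because ownership is tied to the scope of `m`, there is no separate delete call that could be placed before the last access or forgotten altogether.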

Memory safety is also a concern in software development: it aims to avoid software bugs in memory access, such as buffer overflows and dangling pointers, that cause security vulnerabilities. Solving these bugs will enhance the security and efficiency of Scilab significantly. The consequences of each type of defect or vulnerability depend on the specific instance. For example, unsafe use of signed values may cause crashes, lead to unexpected behavior, or lead to an exploitable security vulnerability. Memory-related errors such as not freeing resources also waste resources and hence reduce efficiency.
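The following sketch combines two of the issues named above: a signed value used as a size, and a buffer that is not freed on an early return. The helper sumFirst is hypothetical and only meant to show the shape of such a fix, not an actual Scilab patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Defective pattern (comment only):
//     int* copy = new int[count];      // a negative count becomes a huge size
//     if (src == nullptr) return 0;    // early return leaks `copy`
//     ...
//     delete[] copy;
//
// Hypothetical helper: sums the first `count` values of `src` through a
// temporary copy. The signed count is validated before it is used as a size,
// and std::vector frees the temporary buffer on every return path.
// The caller must still guarantee that `src` holds at least `count` values.
long sumFirst(const int* src, int count)
{
    if (src == nullptr || count <= 0)
    {
        return 0;
    }
    std::vector<int> copy(src, src + count);
    assert(copy.size() == static_cast<std::size_t>(count)); // sanity check on the copy
    long total = 0;
    for (int v : copy)
    {
        total += v;
    }
    return total;
}
```

Validating the count first means the conversion to an unsigned size can no longer wrap around, and using a container instead of new[]/delete[] removes the leak without adding cleanup code to each branch.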

Error handling issues, if ignored, can lead to crashes that could have been avoided. Control flow errors can cause logical errors and hence are important to fix for the proper execution of Scilab. Proper use of the API makes code easy to understand and modify for future needs. Concurrent data access violations, incorrect expressions and integer handling issues can all lead to undefined behaviour and crashes. Uninitialized variables produce unpredictable results when they are read.
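A small sketch of how several of these categories show up together in practice is given below. The helper parseInt is hypothetical (it is not part of the Scilab API); it checks the result of std::strtol instead of ignoring it, initializes its local pointer before use, and rejects values that do not fit in an int.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: parses a decimal integer and reports failure explicitly.
// Returns true and stores the result in `out` on success. On failure it
// returns false and leaves `out` untouched.
bool parseInt(const char* text, int& out)
{
    assert(text != nullptr);      // precondition: callers pass a valid C string
    char* end = nullptr;          // initialized before use
    errno = 0;                    // strtol reports range errors through errno
    const long value = std::strtol(text, &end, 10);

    if (end == text || *end != '\0')
    {
        return false;             // no digits at all, or trailing characters
    }
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
    {
        return false;             // the value does not fit in an int
    }
    out = static_cast<int>(value);
    return true;
}
```

A caller is expected to test the returned flag before using `out`, for example `int v = 0; if (parseInt(input, v)) { ... }`, so that a malformed input becomes a handled error rather than a silent wrong value.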

Progress

Link to Coverity Scilab Dashboard
Link to merged commits
Link to open commits
Link to markings

In phase 1, C/C++ related errors were patched.
In phase 2, Java related errors were patched.
In phase 3, false positives and bugs caused by external code generators (like JFlex and GNU Bison) were marked, remaining errors were patched, and unresolved errors were worked on.
Current defect density: 0.75

Email: <nmshkpr AT SPAMFREE gmail DOT com>

...
