Analysing the quality of code with NDepend
- Processes, standards and quality
Nowadays, there are many aspects to think about while developing real-world applications. Maintainability, understandability, clarity and dependency management are just a few of them. We have to work hard to keep our code maintainable and self-documenting, to protect it from cyclic dependencies between assemblies or, simply, to ensure its good quality. Many tools can help to achieve these goals, e.g. Sonar, ReSharper, JustCode or NDepend. In this article I will show how we can analyse the quality of code with NDepend. As an exemplary project I have chosen Gallio, a test automation platform, together with MbUnit, a unit testing framework. It's an open-source project that can be downloaded here.
What is NDepend?
NDepend is a static analysis tool for .NET projects. It supports a large number of code metrics and can visualize dependencies using directed graphs and a dependency matrix. NDepend can also take code base snapshots that can later be compared. One of the key features of NDepend is that users can write their own rules using LINQ queries (CQLinq); many predefined CQLinq code rules are also available. NDepend integrates easily with Visual Studio. What is more, with the NDepend API and Power Tools everyone can write their own static analyser or tweak the existing open-source Power Tools.
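To give a flavour of the API, here is a minimal sketch of a custom analyser. It is only a sketch: it requires a licensed NDepend installation with a reference to NDepend.API.dll, and the type and method names (NDependServicesProvider, LoadProject, RunAnalysis) are taken from the NDepend API documentation and may differ between versions; the project path is made up for illustration.

```csharp
// Sketch only: assumes NDepend.API.dll is referenced and NDepend is installed.
// Names below follow the NDepend API docs and may vary between versions.
using System;
using System.Linq;
using NDepend;
using NDepend.Analysis;
using NDepend.CodeModel;
using NDepend.Path;
using NDepend.Project;

class AnalysisSketch {
    static void Main() {
        var services = new NDependServicesProvider();
        var projectManager = services.ProjectManager;

        // Hypothetical path to an existing NDepend project file.
        IProject project = projectManager.LoadProject(
            @"C:\Projects\Gallio\Gallio.ndproj".ToAbsoluteFilePath());

        // Run a fresh analysis and obtain the resulting code base snapshot.
        IAnalysisResult result = project.RunAnalysis();
        ICodeBase codeBase = result.CodeBase;

        // Any CQLinq-style query can now be expressed as plain LINQ over the model.
        foreach (var m in codeBase.Application.Methods
                                  .Where(m => m.CyclomaticComplexity > 20))
            Console.WriteLine(m.FullName + ": CC = " + m.CyclomaticComplexity);
    }
}
```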
Code analysis of Gallio test automation platform
To analyse a project in NDepend we have to create a new NDepend project and then add the solution file as well as all the assemblies that should be analysed. After the analysis completes, a summary of the key metrics can be found on the dashboard.
Basic code metrics on Dashboard
On the Dashboard (Fig. 1) we can find a summary of the basic metrics for the analysed project:
Lines of code
Gallio consists of 72121 lines of code (LOC), of which only 4609 are generated. In comparison, NUnit consists of 33460 LOC, so, at a high level of abstraction, Gallio is the bigger project.
Number of types, namespaces, methods, source files and assemblies
Gallio contains 5124 types, 89 assemblies, 409 namespaces, 22185 methods, 4261 fields and 2156 source files. Such a huge number of methods is a little alarming given only 5124 types (more than four methods per type on average), and the balance between them should be kept under control.
Third-party usage (assemblies, namespaces, types, methods and fields)
Gallio uses 47 assemblies, 135 namespaces, 1118 types, 3044 methods and 204 fields from third-party libraries. All these statistics indicate to what extent the project depends on external libraries.
Code coverage by tests
We haven’t received any results for Code coverage because there aren’t any tests for that project.
Code coverage by comments
Some assemblies in Gallio are written in a different language than C#. In that case, NDepend can’t measure the comment percentage.
Method complexity (IL)
The average method complexity for Gallio is 1.93, which is below the suggested value of 2. The maximum cyclomatic complexity of a single method is 88; such a high result shows that the method should be reviewed and refactored if possible.
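A check like this can also be automated as a CQLinq rule. The sketch below follows the shape of NDepend's predefined complexity rules; the threshold of 20 is an illustrative assumption, not NDepend's default.

```csharp
// <Name>Methods too complex (sketch)</Name>
// "warnif" turns the query into a rule that is violated when any match exists.
warnif count > 0
from m in JustMyCode.Methods
where m.CyclomaticComplexity > 20   // illustrative threshold
orderby m.CyclomaticComplexity descending
select new { m, m.CyclomaticComplexity, m.NbLinesOfCode }
```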
Statistics on the violated code rules and number of violations found in a project
When we look at the Code Rules field we can see that there are approximately 31700 violations for 92 rules and 1207 critical violations for 13 rules. These numbers are a bit alarming and therefore we should take a closer look at those issues.
The diagrams above are a graphical presentation of all the aforementioned metrics, in which we can see trends and changes of the metrics over time. That can be helpful because we will see whether the development process or refactoring actions are going in the right direction.
CQLinq queries and predefined code rules
Before we take a look at the results received for Gallio, I want to say one more thing about how violations are defined in NDepend. CQLinq is a query language that allows finding virtually any code violation a user can imagine. CQLinq is, as the name suggests, based on LINQ, and a simple query looks as shown in Fig. 2.
This query scans the whole project and lists the fields that are “potentially dead” – not used by any method or other class. The Potentially dead fields rule is labelled as critical, because its results show unused code in a project that should be deleted or changed.
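Although Fig. 2 shows the actual rule, a simplified sketch of the same idea looks roughly like this (the property names are approximated from NDepend's predefined Potentially dead Fields rule and may differ slightly from the shipped version):

```csharp
// <Name>Potentially dead Fields (sketch)</Name>
warnif count > 0
from f in JustMyCode.Fields
where f.NbMethodsUsingMe == 0     // no method reads or writes this field
   && !f.IsLiteral                // ignore constants
   && !f.IsGeneratedByCompiler    // ignore compiler-generated backing fields
select f
```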
There are a number of predefined rules in NDepend and they can be found in the “Queries and Rules Explorer” tab (Fig. 3). All violations are grouped and labelled in terms of severity.
From the predefined rules the most important ones can be found in the following groups:
- Code Quality
- Object Oriented Design
- Architecture and Layering
- Dead Code
- Purity – Immutability – Side Effects
- Name Conventions
A summary of the violations in Gallio can be found on the Dashboard (Fig. 1). As I have already mentioned, approximately 31700 violations for 92 rules and 1207 critical violations for 13 rules were found in Gallio. As a first step, we should look at the critical violations.
Critical violations in Gallio
[table id=11 /]
When we look at the violations received for each group, we can see that there is a lot of potentially dead code: 819 unused methods and 143 unused types, so we have to take a closer look at those high numbers. The numbers in the other groups aren't as high as in Dead Code. In the Architecture and Layering group we can find the Avoid namespaces mutually dependent query with 41 violations and Avoid namespaces dependency cycles with 8. We should avoid situations where two namespaces are mutually dependent, or worse, form dependency cycles. According to many sources, mutually dependent namespaces lead to so-called spaghetti code and indicate that the way classes have been grouped does not organise them in a strict higher-to-lower fashion. A good way to analyse those issues is the Dependency Graph.
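A CQLinq sketch of the mutual-dependency check could look as follows. The NamespacesUsed domain is part of CQLinq; the name-comparison trick for reporting each pair only once is my own addition, and the shipped rule is written differently.

```csharp
// <Name>Namespaces mutually dependent (sketch)</Name>
warnif count > 0
from n in Application.Namespaces
from nUsed in n.NamespacesUsed
where nUsed.NamespacesUsed.Contains(n)              // nUsed also uses n: a 2-cycle
   && string.CompareOrdinal(n.Name, nUsed.Name) < 0 // report each pair once
select new { n, nUsed }
```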
In the Dependency Graph, namespaces or assemblies are represented by rectangles with connections between them. The size of the rectangles can be set according to the lines of code in a namespace or assembly, their complexity or many other metrics. The thickness of the arrows can be set according to the number of fields, methods, types or namespaces involved. We can easily analyse dependencies between namespaces and assemblies, as well as find mutual dependencies and possible cycles.
In Fig. 4 there are two pairs of namespaces from Gallio that are mutually dependent (rectangles connected by bilateral arrows). However, for some of the mutually dependent namespaces found in Gallio, the Dependency Graph becomes illegible. In that case it's better to use the Dependency Matrix.
In the Dependency Graph, when we select only the application assemblies, we get short information about all the calculated metrics:
- lines of code
- IL instructions
- lines of comments
- number of methods, fields, types and namespaces
- relational cohesion
- afferent coupling
- efferent coupling
When we look at the results for each metric and compare them with suggested values from documentation we can find out which assembly could be problematic and where we should look further.
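The same per-assembly metrics can also be listed with a plain CQLinq query. The sketch below assumes the property names from NDepend's metrics reference; the coupling collections in particular may be named differently in your version.

```csharp
// List the metrics shown for each application assembly (sketch).
from a in Application.Assemblies
orderby a.NbLinesOfCode descending
select new {
   a,
   a.NbLinesOfCode,
   a.NbILInstructions,
   a.NbLinesOfComment,
   a.NbMethods, a.NbTypes, a.NbNamespaces,
   a.RelationalCohesion,                            // (R + 1) / N, suggested ~1.5-4.0
   AfferentCoupling = a.AssembliesUsingMe.Count(),  // who depends on me
   EfferentCoupling = a.AssembliesUsed.Count()      // whom I depend on
}
```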
In the Dependency Matrix we can observe the dependencies between namespaces or assemblies, which are represented by rows and columns. The numbers in the cells reflect the number of dependencies between them with the selected characteristic (types, methods, fields). Ideally, the blue cells of the matrix lie below the diagonal and the green ones above it. Namespaces between which a dependency cycle was found are marked with a red-bordered square in the Dependency Matrix (Fig. 5). Moreover, in this case we can observe that green and blue cells are located both above and below the diagonal within that square.
For some dependency cycles we can observe that all the cells in the red-bordered square of the Dependency Matrix are black. That happens only when there are both direct and indirect dependencies between the namespaces.
Besides the dependencies between namespaces, we can also observe here two popular object-oriented design heuristics.
SRP (Single Responsibility Principle)
According to this rule, a class shouldn't have more than one reason to change. Going down a level, if a code element uses dozens of other elements (at the same level), it has too many responsibilities. Such a code element shows up in the Dependency Matrix as a column with many blue cells and a row with many green cells. When we look at the results for Gallio (Fig. 7) we can't see that situation, so we can be fairly sure that SRP isn't broken in this project.
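A rough CQLinq query for spotting SRP suspects could count how many other types each type uses; the threshold of 50 is an arbitrary illustration, not an NDepend default.

```csharp
// <Name>Types with too many responsibilities (sketch)</Name>
warnif count > 0
from t in JustMyCode.Types
let used = t.TypesUsed.Count()
where used > 50                  // arbitrary threshold for illustration
orderby used descending
select new { t, used }
```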
High cohesion and low coupling
The second rule says that the assemblies in a project should be coherent. A component should implement a single logical function or a single logical entity, and all its parts should contribute to that implementation. Low cohesion means that a component performs a great variety of actions and isn't focused on what it should do. High cohesion, in contrast, means that a component is strongly focused on what it should do and all classes in that component have much in common. While talking about high cohesion, it's worth mentioning coupling, which refers to the relation between two components and their mutual dependency. Low coupling means that changing something in one component should not affect the other; high coupling means that a code change is difficult because it could require an entire system revamp. Therefore, software with a good design has high cohesion and low coupling.
The cohesion level of assemblies can be observed in the Dependency Matrix: assemblies have high cohesion when green and blue cells are grouped in squares around the diagonal. When we look at the results for Gallio (Fig. 7) we can't see any such grouping, which means that the assemblies in Gallio have low cohesion. The architecture and component design should be reviewed, because in the future it could be difficult to understand the purpose of some components and to maintain the code. For comparison, the results received for the NUnit project (Fig. 8) show that its assemblies have a higher level of cohesion.
Abstractness vs. Instability diagram
Each assembly has its own calculated metrics: Abstractness, Instability and Distance from main sequence. These metrics are visualized in the “Abstractness vs. Instability” diagram, which helps to detect which assemblies are potentially painful to maintain (concrete and stable) and which are potentially useless (abstract and unstable). When we look at the assemblies from Gallio (Fig. 9) we can see that the majority of them are in the “green” area; only one assembly is in the zone of uselessness and three are in the zone of pain. One thing that could be alarming is that most of the assemblies have an Instability factor (I) around 1, which means they have high efferent coupling. As a result, the majority of the assemblies depend on other assemblies, and in the future we may encounter problems with making changes, as one change will force others in the remaining assemblies.
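The metrics behind the diagram follow Robert C. Martin's well-known definitions, which NDepend implements:

```latex
I = \frac{C_e}{C_a + C_e}, \qquad
A = \frac{N_{\mathrm{abstract}}}{N_{\mathrm{types}}}, \qquad
D = \lvert A + I - 1 \rvert
```

Here \(C_e\) is the efferent coupling (types outside the assembly that its types use), \(C_a\) is the afferent coupling (outside types that use it), \(A\) is the fraction of abstract types, and \(D\) measures the distance from the ideal "main sequence" line \(A + I = 1\); assemblies far from that line fall into the zones of pain or uselessness.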
To sum up the results of the NDepend analysis, there are a couple of things in the code that deserve a closer look. Dependency cycles between namespaces were found, and a couple of namespaces are mutually dependent; some changes in the project architecture should be made to remove those dependencies. Secondly, it looks like there is a lot of unused code in Gallio and the MbUnit unit testing framework: the violation statistics show 819 unused methods, which is a high number. Consequently, the code should be reviewed carefully and all the old, unused methods, types and fields should be deleted; this will reduce the lines of code and make the code base clearer. Another thing that should be taken into account is the low cohesion of the assemblies in the project. It indicates that they may have more than a single logical function, and it could be hard to understand the purpose of some assemblies. The components should be reviewed and clear functional boundaries set, which will help to achieve higher cohesion. The last thing worth mentioning is that unit tests should be created for this project, as they help to find defects at an early stage of the development process.