Refactoring 3000 Lines of Code

Have you come across a class file (in an object-oriented programming language) that spans 3000 lines of code or more? Or even a class of 1000 lines or more? I have seen many such classes in my career and, trust me, every time I have come across one, I have had a painful time making the required changes. Honestly, the changes were made without much assurance that everything would keep working, now or in the future.

Let’s try to understand the problems with long classes (1000 lines of code or more).

  1. Low Maintainability: Such classes score very low on maintainability, an aspect of software code quality whose key components are testability and reusability.
    • Difficult to change: Because such classes tend to have many inter-dependencies, they are difficult to change, as every change takes a lot of time to test. Unless the change is very small, it can be difficult to test. Some of the well-known code smells found in such classes are long class, long method, duplication, dead code, etc.
    • Difficult to test: Any change in such a class is very difficult to test. Even a small change can prove very expensive, as it requires testing all known and unknown inter-dependencies.
    • Very low reusability: A class of 1000 lines of code or more is likely to serve more than one functionality and, in turn, violate the “Single Responsibility Principle” of the SOLID principles. Such a class thus becomes low in cohesion, as it is used for multiple things. From a reusability perspective, one would want to avoid using the class, because the functionality one relies on could be broken by a change forced by another functionality served by the same class.
    • High cost of change: A small change may require QA to run regression tests for unrelated functionality served by the same class as well. This unnecessarily increases the cost of change.
  2. Low Usability: Usability is about ease of readability and understandability.
    • Difficult to read and understand: Such classes are very difficult to read and understand owing to the inter-dependencies that exist within the class.
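
To make the reusability and cost-of-change points concrete, here is a minimal sketch in Python of a class that serves two unrelated functionalities through shared state. The class `OrderManager` and its methods are hypothetical names invented for this illustration; a real 3000-line class would have far more methods touching the same fields.

```python
class OrderManager:
    """Hypothetical low-cohesion class: pricing and report formatting
    both read the same mutable field, so neither can change alone."""

    def __init__(self):
        # Shared state used by both responsibilities.
        self.decimals = 2

    # Responsibility 1: pricing
    def total(self, prices):
        return round(sum(prices), self.decimals)

    # Responsibility 2: report formatting
    def report_line(self, name, amount):
        return f"{name}: {amount:.{self.decimals}f}"


manager = OrderManager()
before = manager.total([10.0, 5.26])

# A change requested by the reporting side (show whole numbers only) ...
manager.decimals = 0
after = manager.total([10.0, 5.26])

# ... also changes what the pricing method returns.
print(before, after)
```

Here a change made for reporting alters the pricing result, which is why even a small change to such a class tends to call for regression testing of functionality that looks unrelated.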

With the above as some of the problems associated with long classes, one wants to refactor the class to be able to manage it well. Let’s try to understand an approach to refactoring such a huge class:

  • Accept that the refactoring will happen in phases and not in one shot, owing to the inter-dependencies in the code.
  • Plan to create a traceability matrix for the refactoring exercise, which would trace some of the following attributes:
    • Business functionality: List all the business functionality served by this particular class.
    • Dependent modules & stakeholders: List the dependent modules that use this class to serve their business functionality. Plan to discuss and agree on your refactoring initiative and release timelines with the relevant stakeholders. It is always advisable to avoid refactoring code that falls on the critical path and is directly associated with an upcoming release.
    • Refactored interfaces and classes (packages): List the planned interfaces and classes (or packages) that would result from the refactoring exercise.
    • Test cases around business functionality: This could be a separate document with a link.
    • Test cases around dependent business functionality: These would serve the purpose of testing the dependent business functionality.
  • Apply the extract class refactoring strategy first, to move related methods into separate classes, and update the dependencies spread across different modules.
  • For the individual classes, apply the extract method refactoring strategy to break up large methods and remove the “long method” code smell.
  • Drill down further and refactor to remove other code smells, such as duplication, dead code, and problems with method naming and method parameters.
  • It would be a great idea to support the refactored classes with unit tests, if your practice supports it.
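
The extract class and extract method steps above can be sketched roughly as follows. This is an illustrative Python example under stated assumptions, not a prescribed implementation: `OrderService`, `PriceCalculator`, `OrderNotifier`, `RefactoredOrderService`, and the 10% discount rule are all hypothetical and exist only for this example.

```python
# Before: one hypothetical class mixing pricing and notification logic.
class OrderService:
    def place_order(self, prices, email):
        # pricing logic
        subtotal = sum(prices)
        discount = subtotal * 0.1 if subtotal > 100 else 0.0
        total = subtotal - discount
        # notification logic
        message = f"To {email}: your total is {total:.2f}"
        return total, message


# After "extract class": each responsibility lives in its own class.
class PriceCalculator:
    DISCOUNT_THRESHOLD = 100
    DISCOUNT_RATE = 0.1

    def total(self, prices):
        subtotal = sum(prices)
        return subtotal - self._discount(subtotal)

    # After "extract method": the discount rule gets a name of its own.
    def _discount(self, subtotal):
        if subtotal > self.DISCOUNT_THRESHOLD:
            return subtotal * self.DISCOUNT_RATE
        return 0.0


class OrderNotifier:
    def message(self, email, total):
        return f"To {email}: your total is {total:.2f}"


class RefactoredOrderService:
    """Keeps the original method signature so that dependent modules
    can be migrated in phases."""

    def __init__(self, calculator=None, notifier=None):
        self.calculator = calculator or PriceCalculator()
        self.notifier = notifier or OrderNotifier()

    def place_order(self, prices, email):
        total = self.calculator.total(prices)
        return total, self.notifier.message(email, total)


# Simple check that old and new code agree on a few sample inputs.
for prices in ([20.0, 30.0], [80.0, 40.0], []):
    old = OrderService().place_order(prices, "a@example.com")
    new = RefactoredOrderService().place_order(prices, "a@example.com")
    assert old == new
```

Keeping the old class around while the comparison runs is one way to work in phases: callers move to the new classes one at a time, and the check guards against behavior drifting during the move.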

 

Ajitesh Kumar

