code4thought

SOFTWARE QUALITY

The future of
software engineering:
generated by AI or humans?

29/03/2023
4 MIN READ
Author
Yiannis Kanellopoulos
CEO and Founder | code4thought
With all the buzz around ChatGPT (versions 3 and 4) and the latest announcements about GitHub Copilot X, we couldn't help but wonder what the impact of AI-powered tools will be on the software engineering profession, which until now seemed immune to existential threats.
In this post we will attempt to answer the question: "How likely is it for software engineers to lose their jobs in the future?" Or, better: "Are we close to a doomsday for software engineers?" As a spoiler, I'd say the answer is: "Not very likely, if they adapt."
For starters, let's look at what GitHub Copilot X promises. The tool claims that:
  • Given the proper prompts, a software engineer can ask it to generate code;
  • It suggests code improvements, such as proposed bug fixes, unit-test generation, and so on (a hypothetical illustration follows this list).
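To make these claims more concrete, here is a hypothetical illustration of the kind of exchange such a tool promises: a natural-language prompt, a generated function, and a generated unit test. The prompt, function and test below are invented for this post; none of it is actual Copilot X output.

# Hypothetical prompt: "Write a function that returns the n-th Fibonacci
# number, plus a unit test for it."
# The code below sketches the kind of output a Copilot-style tool might
# produce; it is not actual Copilot X output.

def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number (0-indexed)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def test_fibonacci():
    assert fibonacci(0) == 0
    assert fibonacci(1) == 1
    assert fibonacci(10) == 55


if __name__ == "__main__":
    test_fibonacci()
    print("all tests passed")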
All in all, GitHub (or Microsoft, for that matter) claims that 46% of the code produced by more than one million developers is written by Copilot, and that the tool increases developers' coding capacity by 55%. These numbers are impressive, especially considering that Copilot launched less than two years ago.
On the other hand, increasing a software developer's coding capacity by 55% doesn't necessarily mean that Copilot (or Copilot X) increases their overall productivity by the same percentage. The main reasons lie in the fundamentals of software engineering:
  • G.I.G.O., also known as Garbage In, Garbage Out: If the prompt we give Copilot is a poor one, the result will be similarly poor, and most likely at scale (see the sketch after this list).
  • 20% CapEx vs. 80% OpEx: Typically, in a software system's life cycle, about 20% of its cost goes to its initial launch and 80% to its support and maintenance over the years. So it may be easy to generate parts of a system's source code initially, but what about maintaining that code, or adapting it to users' or stakeholders' future needs?
  • Program comprehension, or the 90% of the 80%: Several studies indicate that when software engineers need to change a piece of code (either to fix a bug or to add a feature), 90% of their time is spent understanding it. That task might be easy for boilerplate code (especially for a software engineer with some years of experience), but it can be challenging for code that implements a system's business logic or core functionality. Even if an LLM can understand and explain such code (as some already do), we still cannot tell how credible or trustworthy the explanation is, especially for a software developer with no prior experience in the codebase or the business domain.
  • Coding is not (just) typing code: Continuing from the previous point, a major part of a software engineer's work consists of requirements elicitation, and of the analysis and design of a solution that satisfies those requirements; in general, it relies crucially on human communication. Our experience with ChatGPT-4 indicates that the task of design (even domain design) might be easy for an experienced software engineer to delegate, but the tool cannot replace human interaction when it comes to understanding the requirements (or the business problem at hand) and analyzing them into a form that makes the design of the solution more straightforward.
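To illustrate the G.I.G.O. point above, consider a hypothetical, deliberately vague prompt such as "write a function that calculates the average". A plausible generated answer can quietly bake in wrong behaviour for cases the prompt never mentioned:

# Hypothetical output for the vague prompt "write a function that
# calculates the average" -- it looks fine, but crashes on an empty
# list, a case the prompt never specified.
def average(numbers):
    return sum(numbers) / len(numbers)  # ZeroDivisionError on []


# A more careful prompt ("... and return None for an empty list") would
# likely have yielded something like this instead:
def average_safe(numbers):
    return sum(numbers) / len(numbers) if numbers else None

Multiply that by hundreds of generated functions and you get the "at scale" part of the problem.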
Based on all the above, what can we say about the pros and cons of tools like Copilot X and their impact on the software engineering profession?
First, we could say that tools like Copilot X will help automate the tasks that are less interesting for a developer, such as writing boilerplate code, filling in templates, or building the basic structure of a website. Regarding the latter, let's not forget that even before Copilot X there were several tools out there claiming you could build your e-shop with a click. However, in this case too, the quality of the prompt will inevitably affect the output. Imagine, for instance, a senior software engineer writing a prompt for creating a REST API in Python, compared to one written by a junior software engineer (see the sketch below).
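As a hypothetical sketch of that difference: the junior's prompt might simply be "create a REST API in python", while the senior's prompt would pin down the framework, the routes, the validation and the error handling. Below is a minimal sketch of what the more specific prompt might yield; Flask is chosen here purely for illustration, and none of this is actual Copilot X output.

# Hypothetical result of a well-specified prompt such as:
# "Create a Flask REST API with a GET /users/<id> endpoint that returns
# JSON, validates that the id is an integer, and returns 404 with an
# error body when the user does not exist."
from flask import Flask, jsonify

app = Flask(__name__)

# In-memory stand-in for a real data store, for illustration only.
USERS = {1: {"id": 1, "name": "Ada"}, 2: {"id": 2, "name": "Grace"}}


@app.route("/users/<int:user_id>", methods=["GET"])
def get_user(user_id: int):
    user = USERS.get(user_id)
    if user is None:
        # Explicit error handling -- exactly the kind of detail a vague
        # junior prompt leaves unspecified.
        return jsonify({"error": "user not found"}), 404
    return jsonify(user)


if __name__ == "__main__":
    app.run(debug=True)

The vaguer prompt would more likely yield a bare endpoint with no validation or error handling, and that gap is precisely where the quality of the prompt shows up in the output.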
Then, it is crucial for software engineers to validate how trustworthy Copilot's suggestions are: How was the tool trained, and what was the quality of its training dataset? This also applies when you train Copilot on your own organization's documentation, which, between us, we all know is usually outdated. Also, what if a software system has been designed in a way that does not address the needs of the business using it (not a rare case in software engineering)? How valid or relevant are Copilot X's suggestions then? Along the same line of reasoning, if we keep in mind that the more complex a software development effort is, the higher the probability it will fail, how certain can we be that Copilot X suggests things that mitigate that risk rather than increase it? In other words, how can we know that Copilot X has been trained on source code coming from successful software projects (or products), or how relevant that success is to the business domain we're working in?
Finally, when it comes to maintaining or changing existing source code, things can get challenging. For instance, if you're a newcomer to a software development project, how easy is it for you to change code created or generated by Copilot X? Better yet, how easy is it for you to understand, and change, the prompt initially given to the tool to generate that code?
All in all, software development tools such as Copilot X, which make use of the capabilities of generative AI, are here to help software developers: they alleviate the boring tasks (e.g. writing boilerplate code) and lend a hand in understanding code and ensuring it can be properly tested. On the other hand, if not used properly, they can delay software projects or even propagate bad practices. So, as with any tool, their successful adoption and operationalisation still depends on humans.