(C) Daily Kos
This story was originally published by Daily Kos and is unaltered.
ChatGPT seems to understand JUnit [1]
This content is not subject to review by Daily Kos staff prior to publication.
Date: 2025-04-23
So-called “artificial intelligence” will take software developer jobs away from humans, regardless of whether it can actually do the work. And from what I’ve seen, I have to say it can’t. Sure, it can spit out lines and lines of source that look like they should work, and sometimes they do work.
Maybe A.I. can help software developers get better at unit testing, which ought to be an essential part of any software project. I actually heard a guy at a blockchain meet-up say that A.I. could take care of writing tests, leaving human devs free to do the interesting stuff.
Which of course is to completely miss the point of testing. We should write tests because we want to be sure our programs work the way we expect them to. But if it’s A.I. writing the tests, how can you be sure passing or failing tests mean what you want them to mean? Does the A.I. have the same priorities you do, or any priorities at all?
Still, it might be helpful to use A.I. not as a wizard that does everything but as an assistant that alerts you to blind spots. Like, “I see you tested for scenario X, have you thought about similar scenario Y?”
First, though, we need to see if the particular A.I. we’ve chosen to use, like for example ChatGPT, can actually write unit tests using the particular framework we want to use.
For Java programming, unit testing is almost always done with the JUnit framework. There are other options, including a unit testing framework that I created. But for nearly universal integration with integrated development environments (IDEs, programs that bundle almost everything you need to work on a software project), you can’t beat JUnit. It’s available right “out of the box” in all the major Java IDEs: IntelliJ IDEA, Apache NetBeans, Eclipse and BlueJ.
I think Eclipse only comes with JUnit, but you can add other testing frameworks if you want. A lightweight IDE like BlueJ can probably only use JUnit; any other framework is theoretically possible but likely too difficult to be practical.
In this article, I’m going to try to explain a lot of basic concepts which I generally assume my regular readers on Medium already know and understand very well. The only thing I assume my regular Daily Kos readers know of relevance to this topic is how to use a Web browser. Though I will also gloss over some concepts that are not absolutely required for my main point here.
JUnit has become such an important part of Java development that some Java developers assume it’s an official part of the Java Development Kit. It’s actually not. Even so, it’s reasonable to expect ChatGPT to have some “awareness” of JUnit.
To test whether or not ChatGPT can write JUnit tests, we need a simple exercise that lets us quickly see whether ChatGPT did what we expected. So I thought of Array List from the java.util package, perhaps the most famous data structure in the Java Collections Framework.
Almost everyone who studies Java beyond the most basic level will eventually use Array List for one purpose or another. Presumably Array List has been thoroughly tested by the folks at Sun Microsystems and later Oracle Corporation, and its operation is well documented. So ChatGPT should be able to write tests for Array List.
And it did. Here I’m going to quote the Java source ChatGPT gave me in its entirety, but I won’t quote the whole chat, which you can read at the link if you want to. You can read the source here in an unbroken span, or, if you prefer, feel free to scroll down to my commentary below. I have not edited this other than to reformat it to look halfway decent on the narrower screen of a mobile device. If the indentation is a little odd, that might be my fault; don’t blame that on ChatGPT.
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

import java.util.ArrayList;

import static org.junit.jupiter.api.Assertions.*;

class ArrayListTest {

    private ArrayList<String> list;

    @BeforeEach
    void setUp() {
        list = new ArrayList<>();
    }

    @Test
    void testIsEmpty() {
        assertTrue(list.isEmpty(), "List should be empty after initialization");
    }

    @Test
    void testAddElement() {
        list.add("Element 1");
        assertFalse(list.isEmpty(), "List should not be empty after adding an element");
        assertEquals(1, list.size(), "List size should be 1 after adding one element");
    }

    @Test
    void testGetElement() {
        list.add("Element 1");
        list.add("Element 2");
        assertEquals("Element 1", list.get(0), "First element should be 'Element 1'");
        assertEquals("Element 2", list.get(1), "Second element should be 'Element 2'");
    }

    @Test
    void testRemoveElement() {
        list.add("Element 1");
        list.add("Element 2");
        list.remove("Element 1");
        assertEquals(1, list.size(), "List size should be 1 after removing one element");
        assertEquals("Element 2", list.get(0), "Remaining element should be 'Element 2'");
    }

    @Test
    void testClear() {
        list.add("Element 1");
        list.add("Element 2");
        list.clear();
        assertTrue(list.isEmpty(), "List should be empty after clear()");
        assertEquals(0, list.size(), "List size should be 0 after clear()");
    }

    @Test
    void testIndexOf() {
        list.add("Element 1");
        list.add("Element 2");
        assertEquals(0, list.indexOf("Element 1"), "Index of 'Element 1' should be 0");
        assertEquals(1, list.indexOf("Element 2"), "Index of 'Element 2' should be 1");
        assertEquals(-1, list.indexOf("Element 3"), "Index of non-existent element should be -1");
    }

    @Test
    void testContains() {
        list.add("Element 1");
        assertTrue(list.contains("Element 1"), "List should contain 'Element 1'");
        assertFalse(list.contains("Element 2"), "List should not contain 'Element 2'");
    }

    @Test
    void testSize() {
        assertEquals(0, list.size(), "Initial size should be 0");
        list.add("Element 1");
        assertEquals(1, list.size(), "Size should be 1 after adding one element");
        list.add("Element 2");
        assertEquals(2, list.size(), "Size should be 2 after adding two elements");
    }

    @Test
    void testAddAll() {
        ArrayList<String> newList = new ArrayList<>();
        newList.add("Element 1");
        newList.add("Element 2");
        list.addAll(newList);
        assertEquals(2, list.size(), "Size should be 2 after adding all elements from newList");
        assertEquals("Element 1", list.get(0), "First element should be 'Element 1'");
        assertEquals("Element 2", list.get(1), "Second element should be 'Element 2'");
    }

    @Test
    void testAddAtIndex() {
        list.add("Element 1");
        list.add("Element 3");
        list.add(1, "Element 2");
        assertEquals("Element 1", list.get(0), "First element should be 'Element 1'");
        assertEquals("Element 2", list.get(1), "Second element should be 'Element 2'");
        assertEquals("Element 3", list.get(2), "Third element should be 'Element 3'");
    }

    @Test
    void testSet() {
        list.add("Element 1");
        list.set(0, "New Element");
        assertEquals("New Element", list.get(0), "Element at index 0 should be updated to 'New Element'");
    }

    @Test
    void testOutOfBounds() {
        assertThrows(IndexOutOfBoundsException.class, () -> list.get(0),
                "Accessing an empty list should throw IndexOutOfBoundsException");
        list.add("Element 1");
        assertThrows(IndexOutOfBoundsException.class, () -> list.get(1),
                "Accessing index out of bounds should throw IndexOutOfBoundsException");
    }

}
If you get these tests into an IDE with JUnit 5 on the classpath and run them, they should all pass. Which is what I expected and which is still kind of impressive. But if we start scrutinizing these tests, we should be able to think of ways a human would have written these better.
The Before Each annotation tells the JUnit test runner to execute the procedure so annotated before each test. Here, ChatGPT wrote a test setup procedure that re-initializes a private instance of Array List to a fresh new instance. This is valid, I suppose, under the principle of Don’t Repeat Yourself (DRY).
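To make the "before each test" idea concrete without needing JUnit on the classpath, here is a minimal sketch in plain Java. The class FreshFixtureSketch and its methods are invented for illustration, and this is only a toy model of the observable behavior (set up, run one test, repeat), not how JUnit is actually implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy model of "set up before each test"; not JUnit's own code.
class FreshFixtureSketch {

    private ArrayList<String> list;

    // Plays the role of the procedure annotated with BeforeEach.
    void setUp() {
        list = new ArrayList<>();
    }

    // Each "test" returns true if it passes.
    boolean testIsEmpty() {
        return list.isEmpty();
    }

    boolean testAddElement() {
        list.add("Element 1");
        return list.size() == 1;
    }

    // Calls setUp() before each test, so no test sees elements left
    // over from an earlier one. Maps test names to pass/fail.
    static Map<String, Boolean> runAll() {
        FreshFixtureSketch fixture = new FreshFixtureSketch();
        Map<String, Boolean> results = new LinkedHashMap<>();
        fixture.setUp();
        results.put("testAddElement", fixture.testAddElement());
        fixture.setUp(); // without this, testIsEmpty would see "Element 1"
        results.put("testIsEmpty", fixture.testIsEmpty());
        return results;
    }

    public static void main(String[] args) {
        runAll().forEach((name, passed) ->
                System.out.println(name + (passed ? " passed" : " FAILED")));
    }
}
```

The sketch deliberately runs the adding test first: if the second setUp() call were removed, the emptiness check would fail because of the leftover element.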
Personally, though, I’d much prefer to initialize these fresh instances of Array List as local to each test. It’s a stylistic preference which, however, points up an omission by ChatGPT: it has no creativity for the element types, which in my Medium articles I refer to as “E types.”
And worse, it’s what JetBrains IntelliJ IDEA flags as a “raw use of [a] parameterized class.” I’m pretty sure NetBeans and Eclipse also flag that by default as a warning (associated with the color yellow). Wait, I’m sorry, the E type declaration got lost in the copy and paste and I had to put it back in. Still, I stand by what I wrote about ChatGPT’s lack of creativity for the E types.
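As a rough sketch of what varying the E types might look like, here are three checks against java.util's Array List, each with a list local to the check and a different element type. They are written as plain Java rather than JUnit tests so they run with nothing but the JDK; the class and method names are invented for illustration and do not come from the ChatGPT output or from my own test classes.

```java
import java.math.BigInteger;
import java.time.LocalDate;
import java.util.ArrayList;

// Hypothetical checks; names are made up for this illustration.
class VariedElementTypesSketch {

    // E type: BigInteger
    static void checkContainsBigInteger() {
        ArrayList<BigInteger> list = new ArrayList<>();
        BigInteger number = BigInteger.valueOf(1729);
        list.add(number);
        if (!list.contains(number)) {
            throw new AssertionError("List should contain " + number);
        }
    }

    // E type: LocalDate
    static void checkRemoveLocalDate() {
        ArrayList<LocalDate> list = new ArrayList<>();
        LocalDate first = LocalDate.of(2025, 4, 23);
        LocalDate second = first.plusDays(1);
        list.add(first);
        list.add(second);
        list.remove(first);
        if (list.size() != 1 || !list.get(0).equals(second)) {
            throw new AssertionError("Only " + second + " should remain");
        }
    }

    // E type: Integer. With this type, remove(int) means "remove at index",
    // so removing by value needs an explicit Integer object.
    static void checkRemoveIntegerByValue() {
        ArrayList<Integer> list = new ArrayList<>();
        list.add(10);
        list.add(20);
        list.remove(Integer.valueOf(10)); // remove(Object), not remove(int index)
        if (list.size() != 1 || list.get(0) != 20) {
            throw new AssertionError("Only 20 should remain");
        }
    }

    public static void main(String[] args) {
        checkContainsBigInteger();
        checkRemoveLocalDate();
        checkRemoveIntegerByValue();
        System.out.println("All checks passed");
    }
}
```

The Integer check is the kind of scenario that a String-only test class never runs into, because the overload clash between removing by index and removing by value only shows up with that element type.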
I’m going to need to back up and explain element types and parameterized classes as plainly as I can.
Lots of computer programs need to make lists. For example, a program to read movie reviews from a movie review database would probably need a class for a list of movies and a class for a list of movie reviews. A program to manage cookie recipes would probably need a class for a list of ingredients and a class for a list of recipes.
Before Java 5, we could very well have created each of those classes. But we would have found ourselves repeating a lot of functionality: add a movie to a list of movies, add a review to a list of movie reviews, add an ingredient to a list of ingredients, add a recipe to a list of recipes, remove a movie from a list of movies, remove a review from a list of reviews… you get the idea.
Another option was to write a list class for type java.lang.Object, the ultimate superclass of any Java class. But then you have the problem that the list can accept objects of any type whatsoever. For example, you could write a program that mistakenly adds a recipe to a list that is supposed to only hold ingredients.
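Here is a small sketch of that mistake, using a list of Object to stand in for the pre-generics approach. The Ingredient and Recipe classes are bare-bones placeholders invented for this example. The compiler accepts the wrong addition, and the problem only surfaces when the program tries to read the element back as an ingredient.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a list that accepts any object whatsoever.
class ObjectListSketch {

    static class Ingredient {
        final String name;
        Ingredient(String name) { this.name = name; }
    }

    static class Recipe {
        final String title;
        Recipe(String title) { this.title = title; }
    }

    // Returns true if the misplaced recipe is only detected at run time.
    static boolean mistakeSurfacesOnlyAtRuntime() {
        List<Object> ingredients = new ArrayList<>(); // meant for ingredients only
        ingredients.add(new Ingredient("flour"));
        ingredients.add(new Recipe("Snickerdoodles")); // compiles without complaint
        try {
            for (Object element : ingredients) {
                Ingredient ingredient = (Ingredient) element; // cast needed on the way out
                System.out.println(ingredient.name);
            }
            return false;
        } catch (ClassCastException cce) {
            return true; // the recipe could not be treated as an ingredient
        }
    }

    public static void main(String[] args) {
        System.out.println("Mistake caught only at run time: "
                + mistakeSurfacesOnlyAtRuntime());
    }
}
```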
Martin Odersky, the inventor of Scala, thought Java could do better. After a lot of work, he convinced the folks at Sun Microsystems to add “generics” to Java. Instead of something like
MovieList movieList = new MovieList();
MovieReviewList movieReviewList = new MovieReviewList();
IngredientList ingredientList = new IngredientList();
// etc.
(in which those are all classes we wrote ourselves), we can write something like
List<Movie> movieList = new ArrayList<>(); // Java 7 “diamond” syntax
List<MovieReview> movieReviewList = new ArrayList<>();
List<Ingredient> ingredientList = new ArrayList<>();
// etc.
(in which we still write the Movie, Movie Review, Ingredient classes, etc., but List and Array List are already written for us). Though we have to mind some caveats regarding “type erasure” (which I won’t go into in this article), there’s no need for us to reinvent the wheel on adding an element to a list, removing an element from a list, checking whether an element is in the list, etc.
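To show what the E type buys us, here is a minimal sketch in which the compiler, rather than a run-time exception, keeps a recipe out of a list of ingredients. As before, the Ingredient and Recipe classes are simplified placeholders invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a parameterized list.
class GenericListSketch {

    static class Ingredient {
        final String name;
        Ingredient(String name) { this.name = name; }
    }

    static class Recipe {
        final String title;
        Recipe(String title) { this.title = title; }
    }

    static List<Ingredient> buildIngredientList() {
        List<Ingredient> ingredients = new ArrayList<>();
        ingredients.add(new Ingredient("flour"));
        ingredients.add(new Ingredient("sugar"));
        // The next line, if uncommented, is rejected by the compiler:
        // ingredients.add(new Recipe("Snickerdoodles"));
        return ingredients;
    }

    static String firstIngredientName() {
        // No cast needed: get() is already known to return an Ingredient.
        return buildIngredientList().get(0).name;
    }

    public static void main(String[] args) {
        System.out.println(firstIngredientName());
    }
}
```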
Unless of course you want to practice that. It can be a valuable exercise, one which I’ve undertaken on more than one occasion. For example, there’s the collections.mutable package in my toy-examples repository on GitHub (src and test). The Array Backed List class is an obvious analogue of java.util’s Array List.
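For readers who have never tried that exercise, here is a bare-minimum sketch of a list backed by an array. To be clear, this is not the Array Backed List class from the repository; TinyArrayList is a made-up name, and the starting capacity of 4 and the doubling strategy are arbitrary choices for the illustration.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical practice class; NOT the ArrayBackedList from toy-examples.
class TinyArrayList<E> {

    private Object[] elements = new Object[4]; // arbitrary starting capacity
    private int size = 0;

    // Appends the element, growing the backing array when it is full.
    boolean add(E element) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, 2 * elements.length);
        }
        elements[size++] = element;
        return true;
    }

    @SuppressWarnings("unchecked")
    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index " + index + ", size " + size);
        }
        return (E) elements[index]; // safe: only add(E) ever writes to the array
    }

    int size() {
        return size;
    }

    // Linear search over the occupied part of the backing array.
    boolean contains(Object obj) {
        for (int i = 0; i < size; i++) {
            if (Objects.equals(obj, elements[i])) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        TinyArrayList<String> list = new TinyArrayList<>();
        for (int i = 0; i < 10; i++) {
            list.add("Element " + i);
        }
        System.out.println(list.size() + " elements, last is " + list.get(9));
    }
}
```

Even a class this small leaves plenty to test: growth past the starting capacity, out-of-bounds access, and null elements, for a start.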
Right off the bat in Array Backed List Test you will notice differences from what ChatGPT wrote in Array List Test. I call on Array Backed List Test to import a few more classes besides classes from JUnit.
import java.math.BigInteger;
import java.sql.CallableStatement;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Locale;
import java.util.Random;

import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.*;

import randomness.ExtendedRandom;
I have a good reason to import Array List from java.util. There’s no need to import Array Backed List because it’s in the same package as Array Backed List Test.
I do have some tests in which the E type is String from the java.lang package.
@Test
void testContains() {
    System.out.println("contains");
    ArrayBackedList<String> list = new ArrayBackedList<>();
    String msg = "List should contain this message after it's added";
    list.add(msg);
    assert list.contains(msg) : msg;
}

@Test
void testDoesNotContain() {
    ArrayBackedList<String> list = new ArrayBackedList<>();
    String msg = "Since this message wasn't added, list shouldn't have it";
    assert !list.contains(msg) : msg;
}
Notice that the message in each of those tests for the contains() function is being used for two distinct purposes: as an element for the lists and as an assertion message for the tests. ChatGPT’s not clever enough for that, but I’m sure some humans are much cleverer than I am at this.
I like using other types for some of the tests. The fact that Array Backed List can be used for multiple different types is not quite something that can be tested directly, nor is it even possible to test it with every possible E type that could theoretically be used. But my having tests that use a few different E types reassures me that Array Backed List works with types other than String.
@Test
void testHashCode() {
    System.out.println("hashCode");
    HashSet<Integer> hashes = new HashSet<>();
    int expected = ExtendedRandom.nextInt(24);
    LocalDateTime dateTime;
    ArrayBackedList<LocalDateTime> list = new ArrayBackedList<>();
    for (int i = 0; i < expected; i++) {
        hashes.add(list.hashCode());
        dateTime = LocalDateTime.now().plusSeconds(i);
        list.add(dateTime);
    }
    int actual = hashes.size();
    assertEquals(expected, actual);
}
Looking at my tests, you might feel some skepticism, which even I have felt. But I know that I have gone through the human experience of writing these tests, seeing each of them fail the first time and then modifying the class under test to pass the test.
So even if ChatGPT can write the tests and justify them, it simply cannot substitute for the human experience of writing tests, seeing them fail for the right reason and then changing the class under test to pass the tests.
I am confident that Array Backed List works because I have written tests for it and taken those tests from failing to passing. And I am confident that I can make changes to Array Backed List and know that it still works simply by running Array Backed List Test, or that JUnit will alert me if my changes break something.
Of course for almost every purpose where I need an array-backed list I will use Array List from java.util. I trust that the people who wrote that, Josh Bloch and Neal Gafter (according to the OpenJDK source), and whatever uncredited humans also worked on it, did their due diligence to make sure it all works correctly. I don’t have that same confidence for anything ChatGPT writes.
“Nobody cares how it works, as long as it works,” Councilor Hamann (Anthony Zerbe) says to Neo in a scene in The Matrix Reloaded. But if we’re to avoid a future in which machines are actively and consciously trying to kill us, we had better understand how machines work today, and not trust them to program themselves.
[END]
---
[1] Url:
https://www.dailykos.com/stories/2025/4/23/2308638/-ChatGPT-seems-to-understand-JUnit?pm_campaign=front_page&pm_source=more_community&pm_medium=web
Published and (C) by Daily Kos
Content appears here under this condition or license: Site content may be used for any purpose without permission unless otherwise specified.
via Magical.Fish Gopher News Feeds:
gopher://magical.fish/1/feeds/news/dailykos/