[Don't] set StringBuffers to null: or how to read lower than bytecode

S. Gilles

2017-01-20

A few months ago, I was working on a project written in Java.
One function I had written looked something like this:

public String getFoo() {
    StringBuffer result = new StringBuffer();

    for (Some thing : things) {
        result.append(thing.frobinate());
    }

    return result.toString();
}

In the peer review, a coworker (a reasonable and thoughtful person)
made a comment (paraphrased) along the lines of:

Set result to null before returning - it helps the garbage
collector according to [Book I can't recall the title of].

My instinctive reply (paraphrased) was “If the garbage collector
can't tell that result is out of scope from the end of the function,
setting it to null won't help. The book was probably talking about long
functions during which GC passes could be expected to run at times when
the variable was still in scope, just not used.”

That argument isn't as strong as it could be, however, since it relies
on me claiming that I understand the Java Memory Model and assuming
something about the behavior of an optimizing compiler. Besides, the
advice was published in a book (albeit one focused on Java 1.4 or 1.5),
which carries a fair amount of weight. A better argument would be
“I can show you that setting the StringBuffer to null does nothing”.

If this were C, I could have turned up the compiler optimization and
compared the object files, and perhaps we would even be using a compiler
that outputs some kind of IR in a human- and diff-friendly fashion.
But this isn't C, and the bytecode that javac outputs doesn't have the
optimizations I was looking for applied.

However, I was in luck, since we were using the HotSpot VM, which
supports the -XX:+PrintAssembly diagnostic option. There are a few hoops
to jump through, but once I downloaded and built hsdis, including its
special version of binutils, I was able to run

LD_LIBRARY_PATH=/path/to/hsdis/build/Linux-amd64 \
    java \
    -XX:+UnlockDiagnosticVMOptions \
    -XX:+PrintAssembly \
    -cp /path/to/program \
    MainClass 2>/dev/null

and see a wonderful spew of movs and jnes.

Next, I wrote A.java and B.java. A.java is

public class A {
    public static String whatever() {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < 10; ++i) {
            sb.append(String.valueOf(System.currentTimeMillis()));
        }
        String s = sb.toString();
        // sb = null;
        return s;
    }

    public static void main(String[] args) {
        String w = whatever();
        System.out.println(w.length());
    }
}

and B.java is similar, except that the ‘sb = null;’ is uncommented.
The whatever() function is intended to be close to the function I wrote,
and to be moderately resistant to inlining, as I was hoping to be able
to actually isolate the function. However, that didn't happen.

The next step was to run PrintAssembly on both and diff the results.
Two problems showed up with this:

1. The memory addresses, which show up on just about every line,
were completely different for each, and
2. After s/0x[0-9a-f]+/HEX/g, the files were wildly different.
I didn't just get a few lines of difference, I got terminalfulls
of it. Filesizes differed by hundreds of lines.
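
The address-masking step from item 2 can be sketched in Java as well.
This is only an illustration, not what I actually ran: the class name
AddressMasker, its method names, and the sample disassembly lines are
all made up for the example, and the line comparison is a naive
positional count rather than a real diff.

```java
import java.util.List;
import java.util.stream.Collectors;

class AddressMasker {
    // Replace every lowercase hexadecimal literal with the token HEX.
    // Note that this also masks small immediates and offsets such as 0x8,
    // exactly as the s/0x[0-9a-f]+/HEX/g substitution does.
    static String mask(String line) {
        return line.replaceAll("0x[0-9a-f]+", "HEX");
    }

    static List<String> maskAll(List<String> lines) {
        return lines.stream().map(AddressMasker::mask).collect(Collectors.toList());
    }

    // Naive positional comparison (an assumption for this sketch, not diff):
    // counts indices whose lines differ, plus lines present in only one listing.
    static int differingLines(List<String> a, List<String> b) {
        int common = Math.min(a.size(), b.size());
        int count = Math.abs(a.size() - b.size());
        for (int i = 0; i < common; i++) {
            if (!a.get(i).equals(b.get(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical listings that differ only in their addresses.
        List<String> a = List.of("0x00007f3a1c0012a0: mov 0x8(%rsi),%eax");
        List<String> b = List.of("0x00007f9b2400a3c0: mov 0x8(%rsi),%eax");
        System.out.println(differingLines(maskAll(a), maskAll(b))); // prints 0
    }
}
```

Masking removes the first problem entirely; it does nothing for the
second, which comes from the compiler itself producing different code
from run to run.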

It turns out that HotSpot's JIT-ing is, completely unsurprisingly,
nondeterministic, and even for a simple program like A.java that
difference ends up being rather significant. So I ran the following
script:

#!/bin/sh

best=999999
hp="${HOME}/builds/hsdis/build/Linux-amd64/"

while sleep 1
do
    for x in $(seq 1 2)
    do
        LD_LIBRARY_PATH="${hp}" java \
            -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly \
            -cp . A 2>/dev/null | perl -pi -e 's/0x[0-9a-f]+/HEX/g' \
            > a.s
    done
    for x in $(seq 1 2)
    do
        LD_LIBRARY_PATH="${hp}" java \
            -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly \
            -cp . B 2>/dev/null | perl -pi -e 's/0x[0-9a-f]+/HEX/g' \
            > b.s
    done

    test=$(diff -u a.s b.s | wc -l)

    if [ "${best}" -ge "${test}" ]
    then
        best=${test}
        echo "${best}"
    fi

    if [ "${test}" -lt "15" ]
    then
        cp a.s a.$(date +%s)
        cp b.s b.$(date +%s)
    fi
done

There are some bits I don't quite remember, like why I used perl instead
of sed, or why 15 was chosen, or why I used sleep instead of just letting
my CPU fan exercise itself for a bit. I threw away half of my results
(the ‘seq 1 2’ thing) because I was bumblingly checking if repeated
invocations of the same bytecode caused HotSpot to somehow stabilize at
any kind of canonical output. I don't think it helped.

The general idea is that there might be some universal average machine code
that HotSpot will usually be close to. If two compilations of A and B are
close to each other, they are probably close to this average, so they get
saved for later. After about 400 or 500 iterations, I had a sizable pool
of a.${date} and b.${date}s. I also recall that ${best} fell to about
3 or 4 by the end: there was one small section in which HotSpot couldn't
make up its mind which register to use, and whether to use jne or je.

When I checksummed all my files, there was a collision that diff
confirmed: a.1470941948 and b.1470941881 were identical. Therefore,
ignoring memory addresses (which I think is actually reasonable), the
JVM is capable of running A.java and B.java in exactly the same manner,
and the ‘sb = null;’ line had no effect.
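
For the curious, the checksum-then-confirm step might look something
like the following in Java. This is a sketch under assumptions: the
choice of SHA-256 is arbitrary for the example, the class name
ListingMatcher and its methods are hypothetical, and the listings are
passed in as in-memory strings keyed by file name rather than read from
disk.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ListingMatcher {
    // Hex-encoded SHA-256 of a listing (digest choice is an assumption).
    static String sha256(String content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    // Maps are file name -> masked listing text. Returns "aName == bName"
    // for every A listing that is identical to some B listing.
    static List<String> identicalPairs(Map<String, String> aListings,
                                       Map<String, String> bListings) {
        Map<String, List<String>> aByDigest = new HashMap<>();
        for (Map.Entry<String, String> e : aListings.entrySet()) {
            aByDigest.computeIfAbsent(sha256(e.getValue()), k -> new ArrayList<>())
                     .add(e.getKey());
        }
        List<String> pairs = new ArrayList<>();
        for (Map.Entry<String, String> e : bListings.entrySet()) {
            List<String> candidates =
                aByDigest.getOrDefault(sha256(e.getValue()), List.of());
            for (String aName : candidates) {
                // Matching digests only nominate a pair; compare the full
                // text to confirm, which is the role diff played for me.
                if (aListings.get(aName).equals(e.getValue())) {
                    pairs.add(aName + " == " + e.getKey());
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Hypothetical file names and contents, for illustration only.
        Map<String, String> a = Map.of("a.1", "HEX: mov", "a.2", "HEX: jne");
        Map<String, String> b = Map.of("b.1", "HEX: jne");
        System.out.println(identicalPairs(a, b)); // prints [a.2 == b.1]
    }
}
```

Grouping by digest first keeps the work roughly linear in the number of
saved files, instead of diffing every a.* against every b.*.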

(Sadly, I had overestimated how persuasive this argument would be, as my
coworker's response was to promise to find the book and show me the page.)