| JamaicaVM 3.2 — User Documentation: The Virtual Machine for Realtime and Embedded Systems | ||
|---|---|---|
| Prev | Chapter 8. Profiling using the Builder | Next |
When running in profiling mode, the VM collects data about performance, memory requirements etc. This information is used to create an optimized application but can also be interpreted manually to find find memory leaks or time consuming methods.
| Warning |
The format of the profile file will change in future versions of Jamaica Builder. |
Every line in the profiling output starts with a keyword followed by space separated values. The meaning of these values depends on the keyword.
For a better overview, the corresponding values in different lines are aligned as far as possible and words and signs that improve human reading are added. Here for every keyword the additional words and signs are omitted and the values are listed in the same order as they appear in the text file.[1]
unique ID
unique ID
total number of references in object attributes
total number of words in object attributes
relative number of references in object attributes
total number of currently allocated objects of this class
number of blocks needed for one object of this class
block size in bytes
number of bytes needed for all objects of this class
relative heap usage of objects of this class
total number of objects of this class organized in a tree structure
relative number of objects of this class organized in a tree structure
name of the class
total number of instantiated objects of this class
number of blocks needed for one object of this class
number of blocks needed for all objects of this class
number of bytes needed for all objects of this class
total number of objects of this class organized in a tree structure
relative number of objects of this class organized in a tree structure
used class loader
name of the class
total number of bytecodes spent in this method
relative number of bytecodes spent in this method
signature of the method
name of the class used via reflection
total number of processor cycles spent in this method (if available on the target)
signature of the method
number of calls from caller method to called method
bytecode position of the call within the method
signature of the caller method
signature of the called method
total number of calls to the native method
relative number of calls to the native method
signature of the called native method
number of calls to array creation within a method
bytecode position of the call within the method
signature of the method
current Java priority of the thread
total amount of CPU cycles in this thread
relative time in interpreted code
relative time in compiled code
relative time in JNI code
relative time in garbage collector code
required C stack size
required Java stack size
maximum number of concurrently used threads
We can sort the profiling output to find the application methods where most of the execution time is spent. Under Unix, the 25 methods which use the most execution time (in number of bytecode instructions) can be found with the following command:
> grep PROFILE: HelloWorld.prof | sort -rn -k2 | head -n25
PROFILE: 8158752 (11%) java/nio/ByteBufferImpl.put(B)Ljava/nio/ByteBu
PROFILE: 6459012 (9%) java/nio/CharBufferImpl.get()C
PROFILE: 6219075 (8%) gnu/java/nio/charset/ISO_8859_1$Encoder.encode
PROFILE: 5079390 (7%) java/nio/CharBuffer.wrap(Ljava/lang/CharSequen
PROFILE: 4559424 (6%) java/nio/Buffer.remaining()I
PROFILE: 4279396 (6%) java/nio/Buffer.hasRemaining()Z
PROFILE: 2985102 (4%) java/nio/charset/Charset.checkName(Ljava/lang/
PROFILE: 2400240 (3%) java/nio/Buffer.<init>(IIII)V
PROFILE: 2380593 (3%) java/lang/String.toLowerCase(Ljava/util/Locale
PROFILE: 2036194 (2%) java/lang/StringBuffer.append(Ljava/lang/Strin
PROFILE: 1520152 (2%) java/nio/Buffer.limit(I)Ljava/nio/Buffer;
PROFILE: 1343115 (1%) java/lang/String.toLC(CLjava/util/Locale;)C
PROFILE: 1342884 (1%) java/lang/Character.toLowerCase(C)C
PROFILE: 1200120 (1%) java/nio/Buffer.position(I)Ljava/nio/Buffer;
PROFILE: 1103555 (1%) java/lang/String.length()I
PROFILE: 980480 (1%) java/util/HashMap.get(Ljava/lang/Object;)Ljava
PROFILE: 960096 (1%) java/nio/charset/CharsetEncoder.encode(Ljava/n
PROFILE: 833414 (1%) java/lang/String.substring(II)Ljava/lang/Strin
PROFILE: 780079 (1%) java/nio/charset/CharsetEncoder.<init>(Ljava/n
PROFILE: 732913 (1%) java/lang/StringBuffer.needBuffer()[C
PROFILE: 672042 (0%) java/lang/StringBuffer.ensureCapacity2(I)[C
PROFILE: 640064 (0%) java/nio/CharBufferImpl.<init>([CIIIIIZ)V
PROFILE: 640064 (0%) java/nio/ByteBufferImpl.<init>([BIIIIIZ)V
PROFILE: 620062 (0%) java/nio/ByteBufferImpl.get([BII)Ljava/nio/Byt
PROFILE: 600060 (0%) java/nio/charset/CharsetEncoder.encode(Ljava/n
|
In this small example program, it is not surprise that nearly all execution time is spent in methods that are required for writing the output to the screen. The dominant function is java/nio/ByteBufferImpl.put, which is used while converting Java's unicode characters to the platform's ISO 8859-1 encoding. Also important is the time spent in StringBuffer.append. Calls to the StringBuffer methods have been generated automatically by the jamaicac compiler for string concatenation expressions using the '+'-operator.
On systems that support a CPU cycle counter, the profiling data also contains a cumulative count of the number of cycles spent in each method. This information is useful to obtain a more high-level view on where the runtime performance was spent.
The CPU cycle profiling information is contained in lines starting PROFILE_CYCLES:. A similar command line can be used to find the methods that cumulatively require most of the execution time:
> grep PROFILE_CYCLES: HelloWorld.prof | sort -rn -k2 | head -n25
PROFILE_CYCLES: 25264930944 HelloWorld.main([Ljava/lang/String;)V(inte
PROFILE_CYCLES: 22632966124 java/io/PrintStream.print(Ljava/lang/Strin
PROFILE_CYCLES: 22551401560 java/io/PrintStream.println(Ljava/lang/Str
PROFILE_CYCLES: 21846461476 java/lang/String.getBytes(Ljava/lang/Strin
PROFILE_CYCLES: 16380943752 java/io/PrintStream.writeChars(Ljava/lang/
PROFILE_CYCLES: 13202413840 java/nio/charset/CharsetEncoder.encode(Lja
PROFILE_CYCLES: 10219330580 java/nio/charset/CharsetEncoder.encode(Lja
PROFILE_CYCLES: 9813888540 gnu/java/nio/charset/ISO_8859_1$Encoder.en
PROFILE_CYCLES: 6461216840 gnu/java/nio/channels/FileChannelImpl.writ
PROFILE_CYCLES: 6130778804 java/io/PrintStream.writeChars([CII)V(inte
PROFILE_CYCLES: 4144212080 java/nio/charset/Charset.forName(Ljava/lan
PROFILE_CYCLES: 4085474468 java/nio/charset/Charset.charsetForName(Lj
PROFILE_CYCLES: 3387837524 java/nio/CharBuffer.wrap(Ljava/lang/CharSe
PROFILE_CYCLES: 3263648472 java/nio/CharBuffer.wrap(Ljava/lang/CharSe
PROFILE_CYCLES: 2907582920 gnu/java/nio/charset/Provider.charsetForNa
PROFILE_CYCLES: 2706031748 java/nio/Buffer.hasRemaining()Z(interprete
PROFILE_CYCLES: 1766095212 java/nio/ByteBufferImpl.put(B)Ljava/nio/By
PROFILE_CYCLES: 1556039636 java/lang/String.toLowerCase()Ljava/lang/S
PROFILE_CYCLES: 1446420368 java/nio/ByteBuffer.wrap([BII)Ljava/nio/By
PROFILE_CYCLES: 1435486124 java/lang/String.toLowerCase(Ljava/util/Lo
PROFILE_CYCLES: 1338653144 java/nio/CharBufferImpl.get()C(interpreted
PROFILE_CYCLES: 1322128948 java/nio/Buffer.<init>(IIII)V(interpreted)
PROFILE_CYCLES: 1228418980 java/util/HashMap.get(Ljava/lang/Object;)L
PROFILE_CYCLES: 1076201776 java/lang/StringBuffer.append(Ljava/lang/S
PROFILE_CYCLES: 1032032660 java/nio/Buffer.remaining()I(interpreted)
|
The cumulative cycle count shows more clearly that the main method is running most of the time during the profiling run, the next dominating methods are print and println of class java.io.PrintStream [2]. The cumulative cycle counts can now be used as a basis for a top-down optimization of the application execution time.
| [1] | The format of the profile file can change with every release. It is currently mainly used for human interpretation. |
| [2] | Since main calls only println, one would expect the cumulative time spent in println to exceed the time spent in print. While this is actually the case for the calls from main, there are however calls to print performed during startup of the virtual machine while executing the static initializer of class java.lang.System, so that the total cumulative time in print is higher than the time spent in println. |