性能调优之JMH必知必会5:JMH的Profiler
- JMH必知必会系列文章(持续更新)
- 一、前言
- 二、JMH的Profiler
- 1、添加JMH依赖包
- 2、StackProfiler
- 3、GcProfiler
- 4、ClassLoaderProfiler
- 5、CompilerProfiler
JMH必知必会系列文章(持续更新)
- 性能调优之JMH必知必会1:什么是JMH
- 性能调优之JMH必知必会2:JMH的基本用法
- 性能调优之JMH必知必会3:编写正确的微基准测试用例
- 性能调优之JMH必知必会4:JMH的高级用法
一、前言
在前面四篇文章中分别介绍了什么是JMH、JMH的基本法、编写正确的微基准测试用例和JMH的高级用法。现在来介绍JMH的Profiler。【单位换算:1秒(s)=1000000微秒(us)=1000000000纳秒(ns)
】
官方JMH源码(包含样例,在jmh-samples包里)下载地址:https://github.com/openjdk/jmh/tags。
官方JMH样例在线浏览地址:http://hg.openjdk.java.net/code-tools/jmh/file/tip/jmh-samples/src/main/java/org/openjdk/jmh/samples/。
本文内容参考书籍《Java高并发编程详解:深入理解并发核心库》,作者为 汪文君 ,读者有需要可以去购买正版书籍。
本文由 @大白有点菜 原创,请勿盗用,转载请说明出处!如果觉得文章还不错,请点点赞,加关注,谢谢!
二、JMH的Profiler
JMH提供了一些非常有用的Profiler可以帮助我们更加深入地了解基准测试,甚至还能帮助开发者分析所编写的代码。
Profiler 名称 | Profiler 描述 |
---|
CL | 分析执行 Benchmark 方法时的类加载情况 |
COMP | 通过 Standard MBean 进行 Benchmark 方法的 JIT 编译器分析 |
GC | 通过 Standard MBean 进行 Benchmark 方法的 GC 分析 |
HS_CL | HotSpotTM 类加载器通过特定于实现的 MBean 进行分析 |
HS_COMP | HotSpotTMJIT 通过特定于实现的 MBean 编译分析 |
HS_GC | HotSpotTM内存管理器(GC)通过特定于实现的 MBean 进行分析 |
HS_RT | 通过 Implementation-Specific MBean 进行 HotSpotTM 运行时分析 |
HS_THR | 通过 Implementation-Specific MBean 进行 HotSpotTM 线程分析 |
STACK | JVM线程栈信息分析 |
【先附上官方Profiler样例(JMHSample_35_Profilers) 代码,并使用谷歌和百度翻译其中的注解】
package cn.zhuangyt.javabase.jmh.jmh_sample;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.profile.ClassloaderProfiler;
import org.openjdk.jmh.profile.DTraceAsmProfiler;
import org.openjdk.jmh.profile.LinuxPerfProfiler;
import org.openjdk.jmh.profile.StackProfiler;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
public class JMHSample_35_Profilers {
@State(Scope.Thread)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(3)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public static class Maps {
private Map<Integer, Integer> map;
@Param({"hashmap", "treemap"})
private String type;
private int begin;
private int end;
@Setup
public void setup() {
switch (type) {
case "hashmap":
map = new HashMap<>();
break;
case "treemap":
map = new TreeMap<>();
break;
default:
throw new IllegalStateException("Unknown type: " + type);
}
begin = 1;
end = 256;
for (int i = begin; i < end; i++) {
map.put(i, i);
}
}
@Benchmark
public void test(Blackhole bh) {
for (int i = begin; i < end; i++) {
bh.consume(map.get(i));
}
}
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(JmhTestApp17_Profiler.Maps.class.getSimpleName())
.addProfiler(StackProfiler.class)
.build();
new Runner(opt).run();
}
}
@State(Scope.Thread)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(3)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public static class Classy {
public static class XLoader extends URLClassLoader {
private static final byte[] X_BYTECODE = new byte[]{
(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x34, 0x00, 0x0D, 0x0A, 0x00, 0x03, 0x00,
0x0A, 0x07, 0x00, 0x0B, 0x07, 0x00, 0x0C, 0x01, 0x00, 0x06, 0x3C, 0x69, 0x6E, 0x69, 0x74, 0x3E, 0x01, 0x00, 0x03,
0x28, 0x29, 0x56, 0x01, 0x00, 0x04, 0x43, 0x6F, 0x64, 0x65, 0x01, 0x00, 0x0F, 0x4C, 0x69, 0x6E, 0x65, 0x4E, 0x75,
0x6D, 0x62, 0x65, 0x72, 0x54, 0x61, 0x62, 0x6C, 0x65, 0x01, 0x00, 0x0A, 0x53, 0x6F, 0x75, 0x72, 0x63, 0x65, 0x46,
0x69, 0x6C, 0x65, 0x01, 0x00, 0x06, 0x58, 0x2E, 0x6A, 0x61, 0x76, 0x61, 0x0C, 0x00, 0x04, 0x00, 0x05, 0x01, 0x00,
0x01, 0x58, 0x01, 0x00, 0x10, 0x6A, 0x61, 0x76, 0x61, 0x2F, 0x6C, 0x61, 0x6E, 0x67, 0x2F, 0x4F, 0x62, 0x6A, 0x65,
0x63, 0x74, 0x00, 0x20, 0x00, 0x02, 0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x04, 0x00,
0x05, 0x00, 0x01, 0x00, 0x06, 0x00, 0x00, 0x00, 0x1D, 0x00, 0x01, 0x00, 0x01, 0x00, 0x00, 0x00, 0x05, 0x2A,
(byte) 0xB7, 0x00, 0x01, (byte) 0xB1, 0x00, 0x00, 0x00, 0x01, 0x00, 0x07, 0x00, 0x00, 0x00, 0x06, 0x00, 0x01, 0x00,
0x00, 0x00, 0x01, 0x00, 0x01, 0x00, 0x08, 0x00, 0x00, 0x00, 0x02, 0x00, 0x09,
};
public XLoader() {
super(new URL[0], ClassLoader.getSystemClassLoader());
}
@Override
protected Class<?> findClass(final String name) throws ClassNotFoundException {
return defineClass(name, X_BYTECODE, 0, X_BYTECODE.length);
}
}
@Benchmark
public Class<?> load() throws ClassNotFoundException {
return Class.forName("X", true, new JmhTestApp17_Profiler.Classy.XLoader());
}
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(JmhTestApp17_Profiler.Classy.class.getSimpleName())
.addProfiler(ClassloaderProfiler.class)
.build();
new Runner(opt).run();
}
}
@State(Scope.Benchmark)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(1)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public static class Atomic {
private AtomicLong n;
@Setup
public void setup() {
n = new AtomicLong();
}
@Benchmark
public long test() {
return n.incrementAndGet();
}
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(JmhTestApp17_Profiler.Atomic.class.getSimpleName())
.addProfiler(LinuxPerfProfiler.class)
.build();
new Runner(opt).run();
}
}
}
1、添加JMH依赖包
在Maven仓库中搜索依赖包jmh-core
和 jmh-generator-annprocess
,版本为 1.36
。需要注释 jmh-generator-annprocess 包中的“<scope>test</scope>”,不然项目运行会报错。
<dependency>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-core</artifactId>
<version>1.36</version>
</dependency>
<dependency>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-generator-annprocess</artifactId>
<version>1.36</version>
</dependency>
2、StackProfiler
StackProfiler 不仅可以输出线程堆栈的信息,还能统计程序在执行的过程中线程状态的数据,比如 RUNNING 状态、WAIT 状态所占用的百分比等。
【StackProfiler样例 - 代码】
package cn.zhuangyt.javabase.jmh;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.profile.StackProfiler;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.openjdk.jmh.runner.options.TimeValue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
@BenchmarkMode(Mode.AverageTime)
@Fork(1)
@Warmup(iterations = 5)
@Measurement(iterations = 5)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Group)
public class JmhTestApp17_StackProfiler {
private BlockingQueue<Integer> queue;
private final static int VALUE = Integer.MAX_VALUE;
@Setup
public void init()
{
this.queue = new ArrayBlockingQueue<>(10);
}
@GroupThreads(5)
@Group("blockingQueue")
@Benchmark
public void put() throws InterruptedException
{
this.queue.put(VALUE);
}
@GroupThreads(5)
@Group("blockingQueue")
@Benchmark
public int take() throws InterruptedException
{
return this.queue.take();
}
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(JmhTestApp17_StackProfiler.class.getSimpleName())
.timeout(TimeValue.seconds(10))
.addProfiler(StackProfiler.class)
.build();
new Runner(opt).run();
}
}
【StackProfiler样例 - 代码运行结果】
# JMH version: 1.36
# VM version: JDK 1.8.0_281, Java HotSpot(TM) 64-Bit Server VM, 25.281-b09
# VM invoker: D:\Develop\JDK\jdk1.8.0_281\jre\bin\java.exe
# VM options: -javaagent:C:\Program Files\JetBrains\IntelliJ IDEA 2021.1.3\lib\idea_rt.jar=1431:C:\Program Files\JetBrains\IntelliJ IDEA 2021.1.3\bin -Dfile.encoding=UTF-8
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 s per iteration, ***WARNING: The timeout might be too low!***
# Threads: 10 threads (1 group; 5x "put", 5x "take" in each group), will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: cn.zhuangyt.javabase.jmh.JmhTestApp17_StackProfiler.blockingQueue
# Run progress: 0.00% complete, ETA 00:01:40
# Fork: 1 of 1
# Warmup Iteration 1: (benchmark timed out, interrupted 1 times) 11.438 ±(99.9%) 6.489 us/op
# Warmup Iteration 2: (benchmark timed out, interrupted 1 times) 11.984 ±(99.9%) 6.463 us/op
# Warmup Iteration 3: 10.120 ±(99.9%) 6.382 us/op
# Warmup Iteration 4: (benchmark timed out, interrupted 1 times) 12.545 ±(99.9%) 7.091 us/op
# Warmup Iteration 5: 10.464 ±(99.9%) 4.765 us/op
Iteration 1: 9.519 ±(99.9%) 4.039 us/op
put: 9.834 ±(99.9%) 13.564 us/op
take: 9.205 ±(99.9%) 7.101 us/op
·stack: <delayed till summary>
Iteration 2: 10.761 ±(99.9%) 4.969 us/op
put: 10.628 ±(99.9%) 12.275 us/op
take: 10.893 ±(99.9%) 14.461 us/op
·stack: <delayed till summary>
Iteration 3: 11.941 ±(99.9%) 9.002 us/op
put: 11.009 ±(99.9%) 9.926 us/op
take: 12.873 ±(99.9%) 32.438 us/op
·stack: <delayed till summary>
Iteration 4: 13.876 ±(99.9%) 10.938 us/op
put: 13.309 ±(99.9%) 23.108 us/op
take: 14.444 ±(99.9%) 34.647 us/op
·stack: <delayed till summary>
Iteration 5: 11.473 ±(99.9%) 7.817 us/op
put: 11.103 ±(99.9%) 20.193 us/op
take: 11.843 ±(99.9%) 21.887 us/op
·stack: <delayed till summary>
Result "cn.zhuangyt.javabase.jmh.JmhTestApp17_StackProfiler.blockingQueue":
11.514 ±(99.9%) 6.182 us/op [Average]
(min, avg, max) = (9.519, 11.514, 13.876), stdev = 1.606
CI (99.9%): [5.332, 17.696] (assumes normal distribution)
Secondary result "cn.zhuangyt.javabase.jmh.JmhTestApp17_StackProfiler.blockingQueue:put":
11.176 ±(99.9%) 4.977 us/op [Average]
(min, avg, max) = (9.834, 11.176, 13.309), stdev = 1.293
CI (99.9%): [6.199, 16.154] (assumes normal distribution)
Secondary result "cn.zhuangyt.javabase.jmh.JmhTestApp17_StackProfiler.blockingQueue:take":
11.852 ±(99.9%) 7.625 us/op [Average]
(min, avg, max) = (9.205, 11.852, 14.444), stdev = 1.980
CI (99.9%): [4.226, 19.477] (assumes normal distribution)
Secondary result "cn.zhuangyt.javabase.jmh.JmhTestApp17_StackProfiler.blockingQueue:·stack":
Stack profiler:
....[Thread state distributions]....................................................................
88.7% WAITING
11.3% RUNNABLE
....[Thread state: WAITING].........................................................................
88.7% 100.0% sun.misc.Unsafe.park
....[Thread state: RUNNABLE]........................................................................
9.1% 80.2% java.net.SocketInputStream.socketRead0
1.7% 15.3% sun.misc.Unsafe.unpark
0.5% 4.3% sun.misc.Unsafe.park
0.0% 0.1% java.util.concurrent.ArrayBlockingQueue.take
0.0% 0.0% java.util.concurrent.ArrayBlockingQueue.put
0.0% 0.0% java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await
0.0% 0.0% cn.zhuangyt.javabase.jmh.jmh_generated.JmhTestApp17_StackProfiler_blockingQueue_jmhTest.blockingQueue_AverageTime
0.0% 0.0% cn.zhuangyt.javabase.jmh.jmh_generated.JmhTestApp17_StackProfiler_blockingQueue_jmhTest.take_avgt_jmhStub
# Run complete. Total time: 00:01:43
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.
Benchmark Mode Cnt Score Error Units
JmhTestApp17_StackProfiler.blockingQueue avgt 5 11.514 ± 6.182 us/op
JmhTestApp17_StackProfiler.blockingQueue:put avgt 5 11.176 ± 4.977 us/op
JmhTestApp17_StackProfiler.blockingQueue:take avgt 5 11.852 ± 7.625 us/op
JmhTestApp17_StackProfiler.blockingQueue:·stack avgt NaN ---
我们在Options中增加了StackProfiler用于分析线程的堆栈情况,还可以输出线程状态的分布情况。
得出结论:通过上面的输出结果可以看到,线程状态的分布情况为 WAITING:88.7%,RUNNABLE:11.3%,考虑到我们使用的是 BlockingQueue,因此这种分布应该还算合理。
3、GcProfiler
GcProfiler 可用于分析出在测试方法中垃圾回收器在JVM每个内存空间上所花费的时间,本节将使用自定义的类加载器进行类的加载。
【GcProfiler样例 - 代码】
package cn.zhuangyt.javabase.jmh;
import java.net.URL;
import java.net.URLClassLoader;
public class DbydcClassLoader extends URLClassLoader {
private final byte[] bytes;
public DbydcClassLoader(byte[] bytes) {
super(new URL[0], ClassLoader.getSystemClassLoader());
this.bytes = bytes;
}
@Override
protected Class<?> findClass(String name) throws ClassNotFoundException {
return defineClass(name, bytes, 0, bytes.length);
}
}
package cn.zhuangyt.javabase.jmh;
public class Dbydc {
private String name = "大白有点菜";
private int age = 18;
private byte[] data = new byte[1024 * 10];
public static void main(String[] args) {
}
}
package cn.zhuangyt.javabase.jmh;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.profile.GCProfiler;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.concurrent.TimeUnit;
@BenchmarkMode(Mode.AverageTime)
@Fork(1)
@Warmup(iterations = 5)
@Measurement(iterations = 5)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Thread)
public class JmhTestApp18_GcProfiler {
private byte[] dbydcBytes;
private DbydcClassLoader classLoader;
@Setup
public void init() throws IOException
{
this.dbydcBytes = Files.readAllBytes(
Paths.get("D:\\Dbydc.class")
);
this.classLoader = new DbydcClassLoader(dbydcBytes);
}
@Benchmark
public Object testLoadClass() throws ClassNotFoundException, IllegalAccessException, InstantiationException {
Class<?> dbydcClass = Class.forName("cn.zhuangyt.javabase.jmh.Dbydc", true, classLoader);
return dbydcClass.newInstance();
}
public static void main(String[] args) throws RunnerException {
final Options opts = new OptionsBuilder()
.include(JmhTestApp18_GcProfiler.class.getSimpleName())
.addProfiler(GCProfiler.class)
.jvmArgsAppend("-Xmx128M")
.build();
new Runner(opts).run();
}
}
有些东西需要注意一下,编译 Dbydc.java 为 Dbydc.class 其实很简单,添加一个 main 函数运行一下就可以了,Dbydc.class 生成在 target 目录下。笔者拷贝 Dbydc.class 文件到 D盘根目录下方便测试。使用 Class.forName 方法加载 Dbydc.class,需要把 Dbydc.class 的相对路径写全,即写所在的包路径,只写类名会报找不到类异常
。
【GcProfiler样例 - 代码运行结果】
# JMH version: 1.36
# VM version: JDK 1.8.0_281, Java HotSpot(TM) 64-Bit Server VM, 25.281-b09
# VM invoker: D:\Develop\JDK\jdk1.8.0_281\jre\bin\java.exe
# VM options: -javaagent:C:\Program Files\JetBrains\IntelliJ IDEA 2021.1.3\lib\idea_rt.jar=9976:C:\Program Files\JetBrains\IntelliJ IDEA 2021.1.3\bin -Dfile.encoding=UTF-8 -Xmx128M
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: cn.zhuangyt.javabase.jmh.JmhTestApp18_GcProfiler.testLoadClass
# Run progress: 0.00% complete, ETA 00:01:40
# Fork: 1 of 1
# Warmup Iteration 1: 1.332 us/op
# Warmup Iteration 2: 1.167 us/op
# Warmup Iteration 3: 1.159 us/op
# Warmup Iteration 4: 1.416 us/op
# Warmup Iteration 5: 1.164 us/op
Iteration 1: 1.169 us/op
·gc.alloc.rate: 8385.220 MB/sec
·gc.alloc.rate.norm: 10280.000 B/op
·gc.count: 2041.000 counts
·gc.time: 1168.000 ms
Iteration 2: 1.173 us/op
·gc.alloc.rate: 8359.449 MB/sec
·gc.alloc.rate.norm: 10280.000 B/op
·gc.count: 2037.000 counts
·gc.time: 1175.000 ms
Iteration 3: 1.158 us/op
·gc.alloc.rate: 8462.958 MB/sec
·gc.alloc.rate.norm: 10280.000 B/op
·gc.count: 2060.000 counts
·gc.time: 1202.000 ms
Iteration 4: 1.155 us/op
·gc.alloc.rate: 8489.859 MB/sec
·gc.alloc.rate.norm: 10280.000 B/op
·gc.count: 2068.000 counts
·gc.time: 1196.000 ms
Iteration 5: 1.233 us/op
·gc.alloc.rate: 7953.332 MB/sec
·gc.alloc.rate.norm: 10280.000 B/op
·gc.count: 1938.000 counts
·gc.time: 1107.000 ms
Result "cn.zhuangyt.javabase.jmh.JmhTestApp18_GcProfiler.testLoadClass":
1.178 ±(99.9%) 0.122 us/op [Average]
(min, avg, max) = (1.155, 1.178, 1.233), stdev = 0.032
CI (99.9%): [1.056, 1.300] (assumes normal distribution)
Secondary result "cn.zhuangyt.javabase.jmh.JmhTestApp18_GcProfiler.testLoadClass:·gc.alloc.rate":
8330.164 ±(99.9%) 837.078 MB/sec [Average]
(min, avg, max) = (7953.332, 8330.164, 8489.859), stdev = 217.387
CI (99.9%): [7493.085, 9167.242] (assumes normal distribution)
Secondary result "cn.zhuangyt.javabase.jmh.JmhTestApp18_GcProfiler.testLoadClass:·gc.alloc.rate.norm":
10280.000 ±(99.9%) 0.001 B/op [Average]
(min, avg, max) = (10280.000, 10280.000, 10280.000), stdev = 0.001
CI (99.9%): [10280.000, 10280.000] (assumes normal distribution)
Secondary result "cn.zhuangyt.javabase.jmh.JmhTestApp18_GcProfiler.testLoadClass:·gc.count":
10144.000 ±(99.9%) 0.001 counts [Sum]
(min, avg, max) = (1938.000, 2028.800, 2068.000), stdev = 52.371
CI (99.9%): [10144.000, 10144.000] (assumes normal distribution)
Secondary result "cn.zhuangyt.javabase.jmh.JmhTestApp18_GcProfiler.testLoadClass:·gc.time":
5848.000 ±(99.9%) 0.001 ms [Sum]
(min, avg, max) = (1107.000, 1169.600, 1202.000), stdev = 37.740
CI (99.9%): [5848.000, 5848.000] (assumes normal distribution)
# Run complete. Total time: 00:01:41
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.
Benchmark Mode Cnt Score Error Units
JmhTestApp18_GcProfiler.testLoadClass avgt 5 1.178 ± 0.122 us/op
JmhTestApp18_GcProfiler.testLoadClass:·gc.alloc.rate avgt 5 8330.164 ± 837.078 MB/sec
JmhTestApp18_GcProfiler.testLoadClass:·gc.alloc.rate.norm avgt 5 10280.000 ± 0.001 B/op
JmhTestApp18_GcProfiler.testLoadClass:·gc.count avgt 5 10144.000 counts
JmhTestApp18_GcProfiler.testLoadClass:·gc.time avgt 5 5848.000 ms
运行上面的基准测试方法,除了得到 testLoadClass() 方法的基准数据之外,还会得到GC相关的信息。
根据 GcProfiler 的输出信息可以看到,在基准方法执行过程之中,GC总共出现过 10144 次,总共耗时 5848 毫秒,在此期间也发生了多次的堆内存的申请,比如,每秒钟大约会有 8830.164MB 的数据被创建,若换算成对testLoadClass方法的每次调用,那么我们会发现大约有 10280.000 Byte 的内存使用。
4、ClassLoaderProfiler
ClassLoaderProfiler 可以帮助我们看到在基准方法的执行过程中有多少类被加载和卸载,但是考虑到在一个类加载器中同一个类只会被加载一次的情况,因此我们需要将Warmup设置为0,以避免在热身阶段就已经加载了基准测试方法所需的所有类。
【ClassLoaderProfiler样例 - 代码】
package cn.zhuangyt.javabase.jmh;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.profile.ClassloaderProfiler;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.concurrent.TimeUnit;
@BenchmarkMode(Mode.AverageTime)
@Fork(1)
@Warmup(iterations = 0)
@Measurement(iterations = 5)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Thread)
public class JmhTestApp19_ClassLoaderProfiler {
private byte[] dbydcBytes;
private DbydcClassLoader classLoader;
@Setup
public void init() throws IOException
{
this.dbydcBytes = Files.readAllBytes(
Paths.get("D:\\Dbydc.class")
);
this.classLoader = new DbydcClassLoader(dbydcBytes);
}
@Benchmark
public Object testLoadClass() throws ClassNotFoundException, IllegalAccessException, InstantiationException {
Class<?> dbydcClass = Class.forName("cn.zhuangyt.javabase.jmh.Dbydc", true, classLoader);
return dbydcClass.newInstance();
}
public static void main(String[] args) throws RunnerException {
final Options opts = new OptionsBuilder()
.include(JmhTestApp19_ClassLoaderProfiler.class.getSimpleName())
.addProfiler(ClassloaderProfiler.class)
.build();
new Runner(opts).run();
}
}
【ClassLoaderProfiler样例 - 代码运行结果】
# JMH version: 1.36
# VM version: JDK 1.8.0_281, Java HotSpot(TM) 64-Bit Server VM, 25.281-b09
# VM invoker: D:\Develop\JDK\jdk1.8.0_281\jre\bin\java.exe
# VM options: -javaagent:C:\Program Files\JetBrains\IntelliJ IDEA 2021.1.3\lib\idea_rt.jar=11136:C:\Program Files\JetBrains\IntelliJ IDEA 2021.1.3\bin -Dfile.encoding=UTF-8
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: <none>
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: cn.zhuangyt.javabase.jmh.JmhTestApp19_ClassLoaderProfiler.testLoadClass
# Run progress: 0.00% complete, ETA 00:00:50
# Fork: 1 of 1
Iteration 1: 1.251 us/op
·class.load: 952.182 classes/sec
·class.load.norm: ≈ 10⁻⁵ classes/op
·class.unload: ≈ 0 classes/sec
·class.unload.norm: ≈ 0 classes/op
Iteration 2: 1.243 us/op
·class.load: ≈ 0 classes/sec
·class.load.norm: ≈ 0 classes/op
·class.unload: ≈ 0 classes/sec
·class.unload.norm: ≈ 0 classes/op
Iteration 3: 1.218 us/op
·class.load: ≈ 0 classes/sec
·class.load.norm: ≈ 0 classes/op
·class.unload: ≈ 0 classes/sec
·class.unload.norm: ≈ 0 classes/op
Iteration 4: 1.179 us/op
·class.load: ≈ 0 classes/sec
·class.load.norm: ≈ 0 classes/op
·class.unload: ≈ 0 classes/sec
·class.unload.norm: ≈ 0 classes/op
Iteration 5: 1.149 us/op
·class.load: ≈ 0 classes/sec
·class.load.norm: ≈ 0 classes/op
·class.unload: ≈ 0 classes/sec
·class.unload.norm: ≈ 0 classes/op
Result "cn.zhuangyt.javabase.jmh.JmhTestApp19_ClassLoaderProfiler.testLoadClass":
1.208 ±(99.9%) 0.167 us/op [Average]
(min, avg, max) = (1.149, 1.208, 1.251), stdev = 0.043
CI (99.9%): [1.041, 1.375] (assumes normal distribution)
Secondary result "cn.zhuangyt.javabase.jmh.JmhTestApp19_ClassLoaderProfiler.testLoadClass:·class.load":
190.436 ±(99.9%) 1639.715 classes/sec [Average]
(min, avg, max) = (≈ 0, 190.436, 952.182), stdev = 425.829
CI (99.9%): [≈ 0, 1830.151] (assumes normal distribution)
Secondary result "cn.zhuangyt.javabase.jmh.JmhTestApp19_ClassLoaderProfiler.testLoadClass:·class.load.norm":
≈ 10⁻⁶ classes/op
Secondary result "cn.zhuangyt.javabase.jmh.JmhTestApp19_ClassLoaderProfiler.testLoadClass:·class.unload":
≈ 0 classes/sec
Secondary result "cn.zhuangyt.javabase.jmh.JmhTestApp19_ClassLoaderProfiler.testLoadClass:·class.unload.norm":
≈ 0 classes/op
# Run complete. Total time: 00:00:51
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.
Benchmark Mode Cnt Score Error Units
JmhTestApp19_ClassLoaderProfiler.testLoadClass avgt 5 1.208 ± 0.167 us/op
JmhTestApp19_ClassLoaderProfiler.testLoadClass:·class.load avgt 5 190.436 ± 1639.715 classes/sec
JmhTestApp19_ClassLoaderProfiler.testLoadClass:·class.load.norm avgt 5 ≈ 10⁻⁶ classes/op
JmhTestApp19_ClassLoaderProfiler.testLoadClass:·class.unload avgt 5 ≈ 0 classes/sec
JmhTestApp19_ClassLoaderProfiler.testLoadClass:·class.unload.norm avgt 5 ≈ 0 classes/op
运行上面的基准测试方法,我们将会看到在第一个批次的度量时加载了大量的类,在余下的几次度量中将不会再进行类的加载了,这也符合JVM类加载器的基本逻辑。
我们可以看到,在 testLoadClass 方法的执行过程中,每秒大约会有 190 个类的加载。
5、CompilerProfiler
CompilerProfiler 将会告诉你在代码的执行过程中JIT编译器所花费的优化时间,我们可以打开verbose模式观察更详细的输出。
【CompilerProfiler样例 - 代码】
package cn.zhuangyt.javabase.jmh;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.profile.CompilerProfiler;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.openjdk.jmh.runner.options.VerboseMode;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.concurrent.TimeUnit;
@BenchmarkMode(Mode.AverageTime)
@Fork(1)
@Warmup(iterations = 5)
@Measurement(iterations = 5)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Thread)
public class JmhTestApp20_CompilerProfiler {
private byte[] dbydcBytes;
private DbydcClassLoader classLoader;
@Setup
public void init() throws IOException
{
this.dbydcBytes = Files.readAllBytes(
Paths.get("D:\\Dbydc.class")
);
this.classLoader = new DbydcClassLoader(dbydcBytes);
}
@Benchmark
public Object testLoadClass() throws ClassNotFoundException, IllegalAccessException, InstantiationException {
Class<?> dbydcClass = Class.forName("cn.zhuangyt.javabase.jmh.Dbydc", true, classLoader);
return dbydcClass.newInstance();
}
public static void main(String[] args) throws RunnerException {
final Options opts = new OptionsBuilder()
.include(JmhTestApp20_CompilerProfiler.class.getSimpleName())
.addProfiler(CompilerProfiler.class)
.verbosity(VerboseMode.EXTRA)
.build();
new Runner(opts).run();
}
}
【CompilerProfiler样例 - 代码运行结果】
# JMH version: 1.36
# VM version: JDK 1.8.0_281, Java HotSpot(TM) 64-Bit Server VM, 25.281-b09
# VM invoker: D:\Develop\JDK\jdk1.8.0_281\jre\bin\java.exe
# VM options: -javaagent:C:\Program Files\JetBrains\IntelliJ IDEA 2021.1.3\lib\idea_rt.jar=11289:C:\Program Files\JetBrains\IntelliJ IDEA 2021.1.3\bin -Dfile.encoding=UTF-8
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: cn.zhuangyt.javabase.jmh.JmhTestApp20_CompilerProfiler.testLoadClass
# Run progress: 0.00% complete, ETA 00:01:40
Forking using command: [D:\Develop\JDK\jdk1.8.0_281\jre\bin\java.exe, -javaagent:C:\Program Files\JetBrains\IntelliJ IDEA 2021.1.3\lib\idea_rt.jar=11289:C:\Program Files\JetBrains\IntelliJ IDEA 2021.1.3\bin, -Dfile.encoding=UTF-8, -XX:CompileCommandFile=C:\Users\ZHUANG~1\AppData\Local\Temp\jmh6395784747893774195compilecommand, -cp, "D:\Develop\JDK\jdk1.8.0_281\jre\lib\charsets.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\deploy.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\ext\access-bridge-64.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\ext\cldrdata.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\ext\dnsns.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\ext\jaccess.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\ext\jfxrt.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\ext\localedata.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\ext\nashorn.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\ext\sunec.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\ext\sunjce_provider.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\ext\sunmscapi.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\ext\sunpkcs11.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\ext\zipfs.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\javaws.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\jce.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\jfr.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\jfxswt.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\jsse.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\management-agent.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\plugin.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\resources.jar;D:\Develop\JDK\jdk1.8.0_281\jre\lib\rt.jar;D:\Projects\JavaProject\spring-cloud-study\java-base\target\classes;D:\Develop\MavenRepository\org\springframework\boot\spring-boot-starter-web\2.6.3\spring-boot-starter-web-2.6.3.jar;D:\Develop\MavenRepository\org\springframework\boot\spring-boot-starter\2.6.3\spring-boot-starter-2.6.3.jar;D:\Develop\MavenRepository\org\springframework\boot\spring-boot\2.6.3\spring-boot-2.6.3.jar;D:\Develop\MavenRepository\org\springframework\boot\spring-boot-autoconfigure\2.6.3\spring-boot-autoconfigure-2.6.3.jar;D:\Develop\MavenRepository\org\springframework\boot\spring-boot-starter-logging\2.6.3\spring-boot-starter-logging-2.6.3.jar;D:\Develop\MavenRepository\ch\qos\logback\logback-classic\1.2.10\logback-classic-1.2.10.jar;D:\Develop\MavenRepository\ch\qos\logback\logback-core\1.2.10\logback-core-1.2.10.jar;D:\Develop\MavenRepository\org\apache\logging\log4j\log4j-to-slf4j\2.17.1\log4j-to-slf4j-2.17.1.jar;D:\Develop\MavenRepository\org\apache\logging\log4j\log4j-api\2.17.1\log4j-api-2.17.1.jar;D:\Develop\MavenRepository\org\slf4j\jul-to-slf4j\1.7.33\jul-to-slf4j-1.7.33.jar;D:\Develop\MavenRepository\jakarta\annotation\jakarta.annotation-api\1.3.5\jakarta.annotation-api-1.3.5.jar;D:\Develop\MavenRepository\org\yaml\snakeyaml\1.29\snakeyaml-1.29.jar;D:\Develop\MavenRepository\org\springframework\boot\spring-boot-starter-json\2.6.3\spring-boot-starter-json-2.6.3.jar;D:\Develop\MavenRepository\com\fasterxml\jackson\core\jackson-databind\2.13.1\jackson-databind-2.13.1.jar;D:\Develop\MavenRepository\com\fasterxml\jackson\core\jackson-annotations\2.13.1\jackson-annotations-2.13.1.jar;D:\Develop\MavenRepository\com\fasterxml\jackson\core\jackson-core\2.13.1\jackson-core-2.13.1.jar;D:\Develop\MavenRepository\com\fasterxml\jackson\datatype\jackson-datatype-jdk8\2.13.1\jackson-datatype-jdk8-2.13.1.jar;D:\Develop\MavenRepository\com\fasterxml\jackson\datatype\jackson-datatype-jsr310\2.13.1\jackson-datatype-jsr310-2.13.1.jar;D:\Develop\MavenRepository\com\fasterxml\jackson\module\jackson-module-parameter-names\2.13.1\jackson-module-parameter-names-2.13.1.jar;D:\Develop\MavenRepository\org\springframework\boot\spring-boot-starter-tomcat\2.6.3\spring-boot-starter-tomcat-2.6.3.jar;D:\Develop\MavenRepository\org\apache\tomcat\embed\tomcat-embed-core\9.0.56\tomcat-embed-core-9.0.56.jar;D:\Develop\MavenRepository\org\apache\tomcat\embed\tomcat-embed-el\9.0.56\tomcat-embed-el-9.0.56.jar;D:\Develop\MavenRepository\org\apache\tomcat\embed\tomcat-embed-websocket\9.0.56\tomcat-embed-websocket-9.0.56.jar;D:\Develop\MavenRepository\org\springframework\spring-web\5.3.15\spring-web-5.3.15.jar;D:\Develop\MavenRepository\org\springframework\spring-beans\5.3.15\spring-beans-5.3.15.jar;D:\Develop\MavenRepository\org\springframework\spring-webmvc\5.3.15\spring-webmvc-5.3.15.jar;D:\Develop\MavenRepository\org\springframework\spring-aop\5.3.15\spring-aop-5.3.15.jar;D:\Develop\MavenRepository\org\springframework\spring-context\5.3.15\spring-context-5.3.15.jar;D:\Develop\MavenRepository\org\springframework\spring-expression\5.3.15\spring-expression-5.3.15.jar;D:\Develop\MavenRepository\org\slf4j\slf4j-api\1.7.33\slf4j-api-1.7.33.jar;D:\Develop\MavenRepository\org\hamcrest\hamcrest\2.2\hamcrest-2.2.jar;D:\Develop\MavenRepository\org\springframework\spring-core\5.3.15\spring-core-5.3.15.jar;D:\Develop\MavenRepository\org\springframework\spring-jcl\5.3.15\spring-jcl-5.3.15.jar;D:\Develop\MavenRepository\org\openjdk\jmh\jmh-core\1.36\jmh-core-1.36.jar;D:\Develop\MavenRepository\net\sf\jopt-simple\jopt-simple\5.0.4\jopt-simple-5.0.4.jar;D:\Develop\MavenRepository\org\apache\commons\commons-math3\3.2\commons-math3-3.2.jar;D:\Develop\MavenRepository\org\openjdk\jmh\jmh-generator-annprocess\1.36\jmh-generator-annprocess-1.36.jar;D:\Develop\MavenRepository\junit\junit\4.13.2\junit-4.13.2.jar;D:\Develop\MavenRepository\org\hamcrest\hamcrest-core\2.2\hamcrest-core-2.2.jar;D:\Develop\MavenRepository\com\google\guava\guava\31.0.1-jre\guava-31.0.1-jre.jar;D:\Develop\MavenRepository\com\google\guava\failureaccess\1.0.1\failureaccess-1.0.1.jar;D:\Develop\MavenRepository\com\google\guava\listenablefuture\9999.0-empty-to-avoid-conflict-with-guava\listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar;D:\Develop\MavenRepository\com\google\code\findbugs\jsr305\3.0.2\jsr305-3.0.2.jar;D:\Develop\MavenRepository\org\checkerframework\checker-qual\3.12.0\checker-qual-3.12.0.jar;D:\Develop\MavenRepository\com\google\errorprone\error_prone_annotations\2.7.1\error_prone_annotations-2.7.1.jar;D:\Develop\MavenRepository\com\google\j2objc\j2objc-annotations\1.3\j2objc-annotations-1.3.jar;C:\Program Files\JetBrains\IntelliJ IDEA 2021.1.3\lib\idea_rt.jar", org.openjdk.jmh.runner.ForkedMain, 127.0.0.1, 11291]
# Fork: 1 of 1
# Warmup Iteration 1: 1.175 us/op
# Warmup Iteration 2: 1.165 us/op
# Warmup Iteration 3: 1.104 us/op
# Warmup Iteration 4: 1.028 us/op
# Warmup Iteration 5: 1.093 us/op
Iteration 1: 1.244 us/op
·compiler.time.profiled: ≈ 0 ms
·compiler.time.total: 317.000 ms
Iteration 2: 1.091 us/op
·compiler.time.profiled: 2.000 ms
·compiler.time.total: 320.000 ms
Iteration 3: 1.140 us/op
·compiler.time.profiled: 1.000 ms
·compiler.time.total: 321.000 ms
Iteration 4: 1.182 us/op
·compiler.time.profiled: ≈ 0 ms
·compiler.time.total: 321.000 ms
Iteration 5: 1.119 us/op
·compiler.time.profiled: ≈ 0 ms
·compiler.time.total: 321.000 ms
Result "cn.zhuangyt.javabase.jmh.JmhTestApp20_CompilerProfiler.testLoadClass":
1.155 ±(99.9%) 0.229 us/op [Average]
(min, avg, max) = (1.091, 1.155, 1.244), stdev = 0.060
CI (99.9%): [0.926, 1.385] (assumes normal distribution)
Secondary result "cn.zhuangyt.javabase.jmh.JmhTestApp20_CompilerProfiler.testLoadClass:·compiler.time.profiled":
3.000 ±(99.9%) 0.001 ms [Sum]
(min, avg, max) = (≈ 0, 0.600, 2.000), stdev = 0.894
CI (99.9%): [3.000, 3.000] (assumes normal distribution)
Secondary result "cn.zhuangyt.javabase.jmh.JmhTestApp20_CompilerProfiler.testLoadClass:·compiler.time.total":
321.000 ±(99.9%) 0.001 ms [Maximum]
(min, avg, max) = (317.000, 320.000, 321.000), stdev = 1.732
CI (99.9%): [321.000, 321.000] (assumes normal distribution)
# Run complete. Total time: 00:01:41
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.
Benchmark Mode Cnt Score Error Units
JmhTestApp20_CompilerProfiler.testLoadClass avgt 5 1.155 ± 0.229 us/op
JmhTestApp20_CompilerProfiler.testLoadClass:·compiler.time.profiled avgt 5 3.000 ms
JmhTestApp20_CompilerProfiler.testLoadClass:·compiler.time.total avgt 5 321.000 ms
我们可以看到,在整个方法的执行过程中,profiled 的优化耗时为 3 毫秒,total 的优化耗时为 321 毫秒。
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)