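@assylias' answer basically tells you how it works, and it is a very good answer. I tested String Deduplication on a production application and have some results. The web app makes heavy use of Strings, so I think the advantage is pretty clear.
To enable String Deduplication you have to add these JVM params (you need at least Java 8u20):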
-XX:+UseG1GC -XX:+UseStringDeduplication -XX:+PrintStringDeduplicationStatistics
The last one is optional, but as the name says, it prints the String Deduplication statistics. Here are mine:
[GC concurrent-string-deduplication, 2893.3K->2672.0B(2890.7K), avg 97.3%, 0.0175148 secs]
[Last Exec: 0.0175148 secs, Idle: 3.2029081 secs, Blocked: 0/0.0000000 secs]
[Inspected: 96613]
[Skipped: 0( 0.0%)]
[Hashed: 96598(100.0%)]
[Known: 2( 0.0%)]
[New: 96611(100.0%) 2893.3K]
[Deduplicated: 96536( 99.9%) 2890.7K( 99.9%)]
[Young: 0( 0.0%) 0.0B( 0.0%)]
[Old: 96536(100.0%) 2890.7K(100.0%)]
[Total Exec: 452/7.6109490 secs, Idle: 452/776.3032184 secs, Blocked: 11/0.0258406 secs]
[Inspected: 27108398]
[Skipped: 0( 0.0%)]
[Hashed: 26828486( 99.0%)]
[Known: 19025( 0.1%)]
[New: 27089373( 99.9%) 823.9M]
[Deduplicated: 26853964( 99.1%) 801.6M( 97.3%)]
[Young: 4732( 0.0%) 171.3K( 0.0%)]
[Old: 26849232(100.0%) 801.4M(100.0%)]
[Table]
[Memory Usage: 2834.7K]
[Size: 65536, Min: 1024, Max: 16777216]
[Entries: 98687, Load: 150.6%, Cached: 415, Added: 252375, Removed: 153688]
[Resize Count: 6, Shrink Threshold: 43690(66.7%), Grow Threshold: 131072(200.0%)]
[Rehash Count: 0, Rehash Threshold: 120, Hash Seed: 0x0]
[Age Threshold: 3]
[Queue]
[Dropped: 0]
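These are the results after running the app for 10 minutes. As you can see, String Deduplication was executed 452 times and "deduplicated" 801.6 MB of Strings (97.3% of the 823.9 MB of new String data), having inspected about 27,000,000 Strings. When I compared the memory consumption of Java 7 with the standard Parallel GC to Java 8u20 with the G1 GC and String Deduplication enabled, the heap dropped by approximately 50%:
Java 7 Parallel GC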
![Java 7 Parallel GC](https://i.stack.imgur.com/rhmQ9.png)
Java 8 G1 GC with String Deduplication
![Java 8 G1 GC with String Deduplication](https://i.stack.imgur.com/tDWnV.png)
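If you want to see the statistics without a production workload, a small test program that holds many equal-but-distinct Strings should be enough. The following is only a minimal sketch: the class name `DedupDemo`, the helper `makeDuplicates`, the string value, and the counts are made up for illustration, and the sleep-and-`System.gc()` loop is just an assumption about what gives the deduplication thread a chance to run. Start it with the JVM params listed above. On JDK 9 and later, as far as I know, the statistics flag was replaced by unified logging (`-Xlog`), so check your JDK's documentation for the exact option.

```java
import java.util.ArrayList;
import java.util.List;

public class DedupDemo {
    // Hypothetical helper: builds `count` distinct String objects
    // that all contain the same characters.
    static List<String> makeDuplicates(String value, int count) {
        List<String> result = new ArrayList<>(count);
        char[] chars = value.toCharArray();
        for (int i = 0; i < count; i++) {
            // new String(char[]) copies the array, so every element is a
            // separate object with its own backing storage.
            result.add(new String(chars));
        }
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        // Keep the duplicates reachable so they survive garbage collections.
        List<String> held = makeDuplicates("some frequently repeated value", 200_000);

        // Strings are only considered for deduplication during GC, so request
        // a few collections. System.gc() is only a hint to the JVM.
        for (int i = 0; i < 5; i++) {
            System.gc();
            Thread.sleep(500); // give the concurrent deduplication thread time to run
        }

        System.out.println("Still holding " + held.size() + " strings");
    }
}
```

The program itself cannot tell whether deduplication happened. The Strings stay separate objects either way, and only their backing arrays are shared, so the effect shows up in the GC statistics output and in heap usage rather than in program behavior.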