JDK core JAVA source code analysis (4)-Off-heap memory, zero copy, DirectByteBuffer, and thinking about FileChannel in NIO


I have wanted to write this series for a long time; it is also a way for me to summarize and improve my own understanding. When we first learn Java, introductory books give us a set of rules, but in practice we struggle to remember them, because we rarely use them and do not know the reasons behind them. Understanding the reasons is the only way to make the knowledge stick and to apply it.

This article takes an in-depth look at off-heap memory and DirectBuffer to understand how Java handles off-heap memory, in preparation for the next article on file IO.

Java heap and stack memory versus off-heap memory

First, a rough formula:

Java process memory ≈ -Xmx + (number of threads) * -Xss + -XX:MaxDirectMemorySize + Metaspace size
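As a worked example of the formula, here is a minimal sketch that adds up the four terms. The class name, method name, and all of the numbers are made up for illustration, and the result is only a rough upper bound: it ignores the code cache, GC data structures, and other native allocations.

```java
class MemoryBudget {
    static final long MB = 1024L * 1024L;

    /** Rough estimate following the formula: heap + threads * stack + direct memory + Metaspace. */
    static long estimateBytes(long maxHeap, int threads, long stackPerThread,
                              long maxDirect, long metaspace) {
        return maxHeap + (long) threads * stackPerThread + maxDirect + metaspace;
    }

    public static void main(String[] args) {
        // Hypothetical settings: -Xmx512m, 100 threads with -Xss1m,
        // -XX:MaxDirectMemorySize=256m, and about 128 MB of Metaspace.
        long total = estimateBytes(512 * MB, 100, 1 * MB, 256 * MB, 128 * MB);
        System.out.println(total / MB + " MB"); // prints "996 MB"
    }
}
```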
 

1. Heap and stack memory

This term covers both heap memory and stack memory: heap memory is the memory managed by the GC, and stack memory is per-thread memory.

Heap memory structure:

There is also a more detailed structure diagram (including MetaSpace and code cache):

Note that PermGen was replaced by Metaspace in Java 8. Metaspace can grow automatically at runtime and is unbounded by default (unless -XX:MaxMetaspaceSize is set).

Let's look at the following piece of code to briefly understand the relationship between heap and stack:

public static void main(String[] args) {
    Object o = new Object();
}
 

Here new Object() is allocated on the heap, while the variable o (the reference) lives on the thread stack of main.

  • Every part of the application uses heap memory, whereas stack memory is used by the thread that is running.
  • Whenever an object is created, it is stored in heap memory, and stack memory holds the reference to it. Stack memory contains only primitive-typed local variables and references to objects on the heap.
  • Objects stored on the heap are globally accessible, but a thread's stack memory cannot be accessed by other threads.
  • The JVM parameter -Xmx sets the maximum heap size, and -Xss sets the stack size of each thread.

2. Off-heap memory

2.1. Generalized off-heap memory

Apart from heap and stack memory, everything else is off-heap memory, including memory allocated by the JVM itself while running, the code cache, memory allocated in JNI, memory allocated by DirectByteBuffer, and so on.

2.2. Off-heap memory in a narrow sense-DirectByteBuffer

When Java developers say that off-heap memory has overflowed, they usually mean off-heap memory in the narrow sense: the memory allocated when a java.nio.DirectByteBuffer is created. This article mainly discusses off-heap memory in this narrow sense, because it is more closely related to the problems we usually encounter.

Why use off-heap memory? Usually because:

  • It can be shared between processes, reducing copying between virtual machines.
  • It shortens garbage collection pauses: if you keep many large, long-lived objects on the heap, they frequently trigger YGC or Full GC, so consider moving them off the heap. An oversized heap hurts the performance of Java applications. Off-heap memory is managed directly by the operating system rather than by the virtual machine, so the heap can stay small, which reduces the impact of garbage collection on the application.
  • In some scenarios it improves I/O performance, because the step of copying data from heap memory to off-heap memory is skipped.

3. JNI call and kernel mode and user mode

  • Kernel mode: the CPU can access all data in memory as well as peripheral devices such as hard disks and network cards, and it can switch itself from one program to another.
  • User mode: only limited access to memory is allowed, and access to peripheral devices is not allowed. The program can be preempted, so that CPU time can be given to other programs.
  • System call: to let upper-layer applications use these resources, the kernel provides access interfaces for them.

Java reaches native code through JNI, and it is the native methods that issue the system calls.

Take file reading as an example: Java code itself cannot read a file, because user mode has no permission to access peripheral devices. It must switch to kernel mode through a system call to perform the read.

Java currently offers stream-based traditional IO and block-based NIO (although file reading is not NIO in the strict sense). Stream-oriented means that one or more bytes are read from the stream at a time, and what you do with them is up to you. Nothing is cached on the Java side (that is, when a stream is used without any buffering, received or sent data is cached only in the operating system, and the stream acts like a pipe drawing data from the operating system's cache). Data can only be read from the stream sequentially; to skip bytes or re-read data that has already been read, you must first buffer what you read from the stream. Block-oriented processing is different: data is first read into or written from a buffer, and you can move around in the buffer as needed, which gives more flexibility during processing. The extra work is that you must check whether the buffer already contains all the data you need, and you must make sure that data arriving later does not overwrite unprocessed data in the buffer.

We only analyze the block-based NIO method here. In JAVA, this block is ByteBuffer.
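To make the block-oriented model concrete, here is a minimal sketch of filling a ByteBuffer, flipping it, and then reading from an arbitrary position. The class and method names are made up for illustration; only the ByteBuffer calls come from the JDK.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class ByteBufferBasics {
    /** Writes text into a buffer, flips it, skips `skip` bytes, and decodes the rest. */
    static String writeThenReadFrom(String text, int skip) {
        ByteBuffer buffer = ByteBuffer.allocate(64);          // position=0, limit=capacity=64
        buffer.put(text.getBytes(StandardCharsets.US_ASCII)); // position advances by the text length
        buffer.flip();                                        // limit=old position, position=0
        buffer.position(skip);                                // jump ahead, impossible with a plain stream
        byte[] rest = new byte[buffer.remaining()];
        buffer.get(rest);
        return new String(rest, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(writeThenReadFrom("hello world", 6)); // prints "world"
    }
}
```

The caller is responsible for the bookkeeping: forgetting flip() before reading, or clear()/compact() before the next fill, is the usual source of bugs with this API.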

4. Principle of Zero Copy under Linux

Most web servers have to handle a large amount of static content, and most of them read data from disk files and write them to sockets. Let's take this process as an example to look at the Linux workflow in different modes

4.1. Normal Read/Write Mode

The code abstraction involved:

// read the file contents into tmp_buf
read(file, tmp_buf, len);
// write tmp_buf to the socket
write(socket, tmp_buf, len);
 

It looks like a simple two-step operation, but the data is copied several times:

  1. When the read system call is called, the data is copied to kernel mode through DMA (Direct Memory Access)
  2. Then it is controlled by the CPU to copy the kernel mode data to the buffer in user mode
  3. After the read call is completed, the write call first copies the data in the buffer in user mode to the socket buffer in kernel mode
  4. Finally, through DMA copy, the data in the socket buffer in kernel mode is copied to the network card device for transmission.

As the process above shows, the data makes a pointless round trip from kernel mode to user mode and back, wasting two copies (kernel to user, then user back to kernel: steps 2 and 3 above). Both are CPU copies, which consume CPU resources.

4.2. sendfile mode

Only one system call is required to transfer files through sendfile. When sendfile is called:

  1. First, the data is read from disk into the kernel buffer by DMA copy.
  2. Then the data is copied from the kernel buffer to the socket buffer by CPU copy.
  3. Finally, the data in the socket buffer is copied to the network card buffer by DMA copy and sent.

Compared with the read/write approach, sendfile needs one fewer system call (and therefore fewer mode switches) and one fewer CPU copy. But the process above also shows that copying the data from the kernel buffer to the socket buffer is unnecessary.

4.3. Improved sendfile mode

Linux kernel 2.4 improved sendfile.

The improved process is as follows:

  1. DMA copy moves the disk data into the kernel buffer.
  2. The position and offset of the data to be sent in the kernel buffer are appended to the socket buffer.
  3. DMA gather copy moves the data from the kernel buffer directly to the network card, using the position and offset recorded in the socket buffer.

After this change, the data needs only two copies to travel from the disk to the network card. (Strictly speaking, this zero copy is from the kernel's point of view: no CPU copy of the data happens in kernel mode.)

Many current high-performance http servers have introduced the sendfile mechanism, such as nginx, lighttpd, etc.

5. Changes in Java Zero Copy Implementation

Zero-copy technology eliminates the steps of copying the operating system's read buffer into the program buffer and copying the program buffer into the socket buffer; the read buffer is handed directly to the socket buffer. The FileChannel.transferTo() method in Java NIO is such an implementation:

public abstract long transferTo(long position, long count, WritableByteChannel target) throws IOException;
 

The transferTo() method transfers data from a file channel to another writable channel, and its internal implementation depends on the operating system's support for zero copy. On Unix-like operating systems and the various Linux distributions, this is ultimately implemented with the sendfile() system call. The following is the definition of that function:

#include <sys/sendfile.h>
ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);

 

5.1. Low-level implementation before Linux 2.4

As mentioned before, we use the following two pictures to show more clearly the copying and kernel mode user mode switching:

Before Linux 2.4 there are only two switches between kernel mode and user mode and only three copies of the data (only one of which uses the CPU). From Linux 2.4 on, we can remove that last CPU copy as well.

5.2. Low-level implementation after Linux 2.4

On Linux systems with kernel 2.4 or above, socket buffer descriptors are used to meet this requirement. This approach not only reduces the switching between kernel mode and user mode, but also removes the CPU from the copy process entirely. From the user's point of view, the transferTo() method is still called, but what happens underneath has changed:

  1. After calling the transferTo method, the data is copied from the file to a buffer in the kernel by DMA.

  2. The data is no longer copied to the buffer associated with the socket, only a descriptor (containing the location and length of the data, etc.) is appended to the buffer associated with the socket. DMA directly transfers the data in the buffer in the kernel to the protocol engine, eliminating the only remaining data copy that requires a cpu cycle.
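From the Java side, the call looks the same whichever kernel path is taken. Below is a minimal sketch of a helper that sends a whole file to any WritableByteChannel; the class name, the helper names, and the temporary file are made up for illustration. The in-memory sink only stands in for a socket so the example is self-contained: as far as I know, for such an arbitrary target the JDK falls back to an ordinary copy, and the sendfile path applies when the target is something like a SocketChannel or another FileChannel.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class TransferToSketch {
    /** Sends the whole file to target, looping because transferTo may move fewer bytes than requested. */
    static long transferAll(Path source, WritableByteChannel target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
            long position = 0;
            long size = in.size();
            while (position < size) {
                position += in.transferTo(position, size - position, target);
            }
            return position;
        }
    }

    /** Round-trips data through a temporary file and returns what arrived at the target. */
    static byte[] demo(byte[] data) {
        try {
            Path tmp = Files.createTempFile("transfer-sketch", ".bin");
            try {
                Files.write(tmp, data);
                ByteArrayOutputStream sink = new ByteArrayOutputStream();
                try (WritableByteChannel target = Channels.newChannel(sink)) {
                    transferAll(tmp, target);
                }
                return sink.toByteArray();
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("static content".getBytes()).length + " bytes transferred");
    }
}
```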

5.3. Performance of ordinary Java byte-stream IO versus zero copy with NIO FileChannel

Go directly to the source code:

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.channels.FileChannel;

public class FileCopyTest {

    /**
     * Copies a file with ordinary buffered byte streams.
     *
     * @param fromFile source file
     * @param toFile   target file
     */
    public static void fileCopyNormal(File fromFile, File toFile) {
        // try-with-resources closes both streams, even when the copy fails
        try (InputStream inputStream = new BufferedInputStream(new FileInputStream(fromFile));
             OutputStream outputStream = new BufferedOutputStream(new FileOutputStream(toFile))) {
            // 1 kB transfer buffer
            byte[] bytes = new byte[1024];
            int i;
            // read until the end of the stream
            while ((i = inputStream.read(bytes)) != -1) {
                outputStream.write(bytes, 0, i);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Copies a file with FileChannel.transferTo.
     *
     * @param fromFile source file
     * @param toFile   target file
     */
    public static void fileCopyWithFileChannel(File fromFile, File toFile) {
        try (FileInputStream fileInputStream = new FileInputStream(fromFile);
             FileOutputStream fileOutputStream = new FileOutputStream(toFile);
             // channel of the source stream
             FileChannel fileChannelInput = fileInputStream.getChannel();
             // channel of the target stream
             FileChannel fileChannelOutput = fileOutputStream.getChannel()) {
            // transferTo may move fewer bytes than requested, so loop until done
            long position = 0;
            long size = fileChannelInput.size();
            while (position < size) {
                position += fileChannelInput.transferTo(position, size - position, fileChannelOutput);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws IOException {
        File fromFile = new File("D:/readFile.txt");
        File toFile = new File("D:/outputFile.txt");

        // warm-up run
        fileCopyNormal(fromFile, toFile);
        fileCopyWithFileChannel(fromFile, toFile);

        // timed runs
        long start = System.currentTimeMillis();
        for (int i = 0; i < 1000; i++) {
            fileCopyNormal(fromFile, toFile);
        }
        System.out.println("fileCopyNormal time: " + (System.currentTimeMillis() - start));

        start = System.currentTimeMillis();
        for (int i = 0; i < 1000; i++) {
            fileCopyWithFileChannel(fromFile, toFile);
        }
        System.out.println("fileCopyWithFileChannel time: " + (System.currentTimeMillis() - start));
    }
}
 

Test Results:

fileCopyNormal time: 14271
fileCopyWithFileChannel time: 6632
 

The FileChannel version takes less than half the time (the file is about 8 MB), and the gap should be even larger for bigger files.

6. DirectBuffer Allocation

ByteBuffer is the core buffer of Java NIO, and all NIO operations go through it. There are two kinds of ByteBuffer.

Allocate a HeapByteBuffer:

ByteBuffer buffer = ByteBuffer.allocate(capacity);  // capacity is an int
 

Allocate a DirectByteBuffer:

ByteBuffer buffer = ByteBuffer.allocateDirect(capacity);  // capacity is an int
 

The difference between the two:
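One difference is visible directly through the public API. The following minimal sketch (class and method names are made up for illustration) prints whether each kind of buffer is direct and whether it exposes a backing byte array:

```java
import java.nio.ByteBuffer;

class BufferKinds {
    static String describe(ByteBuffer buffer) {
        // hasArray() is true only when an accessible byte[] backs the buffer
        return (buffer.isDirect() ? "direct" : "heap") + ", hasArray=" + buffer.hasArray();
    }

    public static void main(String[] args) {
        System.out.println(describe(ByteBuffer.allocate(1024)));       // heap, hasArray=true
        System.out.println(describe(ByteBuffer.allocateDirect(1024))); // direct, hasArray=false
    }
}
```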

6.1. Why does HeapByteBuffer copy one more time?

6.1.1. FileChannel force api description

The force method of FileChannel: FileChannel.force() forces any data written to the channel that has not yet reached the disk to be written to the disk. For performance reasons, the operating system caches data in memory, so there is no guarantee that data written to a FileChannel reaches the disk immediately; to guarantee it, call force(). The method takes a boolean parameter that indicates whether file metadata (permission information and so on) should be written to the disk as well.
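A minimal sketch of how force() is typically used after a write. The class name, helper names, and temporary file are made up for illustration; the FileChannel calls are standard JDK API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ForceSketch {
    /** Writes text to path, then asks the OS to flush data and metadata to the storage device. */
    static void writeDurably(Path path, String text) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer); // may reach only the OS cache at this point
            }
            channel.force(true);       // true: flush file metadata as well as content
        }
    }

    /** Writes to a temporary file and reads the content back. */
    static String demo(String text) {
        try {
            Path tmp = Files.createTempFile("force-sketch", ".txt");
            try {
                writeDurably(tmp, text);
                return new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("durable"));
    }
}
```

Passing false instead of true skips the metadata flush, which can be cheaper when only the file content matters.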

6.1.2. FileChannel and SocketChannel dependent IOUtil source code analysis

Whether it is FileChannel or SocketChannel, their read and write methods all rely on the same methods of IOUtil. Let's take a look at IOUtil.java:

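static int write(FileDescriptor var0, ByteBuffer var1, long var2, NativeDispatcher var4) throws IOException {
    // a DirectBuffer can be written straight from its native address
    if (var1 instanceof DirectBuffer) {
        return writeFromNativeBuffer(var0, var1, var2, var4);
    } else {
        // not a DirectBuffer: copy into a temporary direct buffer first
        // current position
        int var5 = var1.position();
        // current limit
        int var6 = var1.limit();

        assert var5 <= var6;
        // number of remaining bytes; borrow a temporary DirectByteBuffer of that size
        int var7 = var5 <= var6 ? var6 - var5 : 0;
        ByteBuffer var8 = Util.getTemporaryDirectBuffer(var7);

        int var10;
        try {
            // copy the heap data into the direct buffer, then restore the source position
            var8.put(var1);
            var8.flip();
            var1.position(var5);
            // write from the temporary DirectBuffer
            int var9 = writeFromNativeBuffer(var0, var8, var2, var4);
            if (var9 > 0) {
                // advance the source by the number of bytes actually written
                var1.position(var5 + var9);
            }

            var10 = var9;
        } finally {
            // return the temporary DirectByteBuffer to the cache
            Util.offerFirstTemporaryDirectBuffer(var8);
        }

        return var10;
    }
}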
 

6.1.3. Why must copy to DirectByteBuffer for reading and writing (system call)

First, one point to establish: a thread executing a native method is considered to be at a safepoint, so a GC can run and rearrange objects in memory while the native call is in progress (you can refer to my other article: blog.csdn.net/zhxdick/art...). That is why NIO has to copy the data into a DirectByteBuffer.

Traditional BIO is stream-oriented. Its underlying implementation can be understood as writing a byte array: the native method that performs the IO receives the array itself as a parameter, so even if the GC changes the array's memory address, the current address can still be found through the array reference. The corresponding native method behind FileOutputStream.write is:

private native void writeBytes(byte b[], int off, int len, boolean append)
        throws IOException;
 

NIO, to improve efficiency, passes a memory address instead, which saves one level of indirection. But a DirectByteBuffer must be used so that the memory address cannot change. The corresponding method is NativeDispatcher.write:

abstract int write(FileDescriptor fd, long address, int len)
        throws IOException;
 

So why would the memory address change? The GC reclaims unused objects and also compacts memory, moving objects around to reduce fragmentation. A DirectByteBuffer's memory is not under GC control. If a HeapByteBuffer were used instead and a GC happened during the system call, the HeapByteBuffer's memory location could change; the kernel cannot perceive this change, so the system call would read or write the wrong data. Therefore, IO system calls must go through a DirectByteBuffer, which is not affected by GC.

Suppose we want to read a piece of data from the network and then send it out again. With a Non-direct ByteBuffer, the process looks like this:

network > temporary Direct ByteBuffer > application Non-direct ByteBuffer > temporary Direct ByteBuffer > network
 

With a Direct ByteBuffer, by contrast, a block of memory (native memory) is allocated directly outside the heap to hold the data, and the program reads and writes that off-heap memory directly through JNI. Because the data lives in off-heap memory, no memory is allocated for it in the JVM-managed heap, and no data is copied between heap and off-heap memory. To perform an I/O operation, you only need to pass the off-heap memory address to the JNI I/O function.

The process of using Direct ByteBuffer is as follows:

network > application Direct ByteBuffer > network
 

So, apart from the time spent constructing and destroying the temporary Direct ByteBuffer, using a Direct ByteBuffer saves at least two memory copies. Should we then use a Direct Buffer in all circumstances?

No. For most applications, the time for two memory copies is almost negligible, whereas constructing and destroying a DirectBuffer is relatively expensive. In the JVM implementation, some methods cache a number of temporary Direct ByteBuffers, which means that in those cases switching to a Direct ByteBuffer saves only the two memory copies, not any construction and destruction time. In Sun's implementation, write(ByteBuffer) and read(ByteBuffer) cache the temporary Direct ByteBuffer, while write(ByteBuffer[]) and read(ByteBuffer[]) create a new temporary Direct ByteBuffer each time.
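Since construction and destruction are the expensive part, a common pattern is to allocate one direct buffer and reuse it for many IO operations. Here is a minimal sketch under that assumption; the class name, the 8 KB size, and the temporary files are made up for illustration, and the shared buffer is deliberately not thread-safe.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ReusableDirectBufferCopy {
    // Allocated once and reused for every copy. Not thread-safe: a real application
    // would keep one buffer per thread or use a pool.
    private static final ByteBuffer BUFFER = ByteBuffer.allocateDirect(8 * 1024);

    static long copy(FileChannel in, FileChannel out) throws IOException {
        long total = 0;
        BUFFER.clear();
        while (in.read(BUFFER) != -1) {
            BUFFER.flip();                  // switch from filling to draining
            while (BUFFER.hasRemaining()) {
                total += out.write(BUFFER); // already direct, so no temporary buffer is needed
            }
            BUFFER.clear();                 // ready for the next read
        }
        return total;
    }

    /** Copies data between two temporary files and returns the content of the target. */
    static byte[] demo(byte[] data) {
        try {
            Path source = Files.createTempFile("copy-src", ".bin");
            Path target = Files.createTempFile("copy-dst", ".bin");
            try {
                Files.write(source, data);
                try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
                     FileChannel out = FileChannel.open(target, StandardOpenOption.WRITE)) {
                    copy(in, out);
                }
                return Files.readAllBytes(target);
            } finally {
                Files.deleteIfExists(source);
                Files.deleteIfExists(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(new byte[20000]).length + " bytes copied");
    }
}
```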

6.2. ByteBuffer creation

6.2.1. ByteBuffer creates HeapByteBuffer

It is allocated on the heap, and the Java virtual machine is directly responsible for collecting it. You can think of it as a wrapper class around a byte array.

class HeapByteBuffer
    extends ByteBuffer
{
    HeapByteBuffer(int cap, int lim) {           //package-private
        super(-1, 0, lim, cap, new byte[cap], 0);
       /*
        hb = new byte[cap];
        offset = 0;
        */
    }
}
    
public abstract class ByteBuffer
    extends Buffer
    implements Comparable<ByteBuffer>
{
   //These fields are declared here rather than in Heap-X-Buffer in order to
   //reduce the number of virtual method invocations needed to access these
   //values, which is especially costly when coding small buffers.
   //
    final byte[] hb;                 //Non-null only for heap buffers
    final int offset;
    boolean isReadOnly;                //Valid only for heap buffers
   //Creates a new buffer with the given mark, position, limit, capacity,
   //backing array, and array offset
   //
    ByteBuffer(int mark, int pos, int lim, int cap,  //package-private
                 byte[] hb, int offset)
    {
        super(mark, pos, lim, cap);
        this.hb = hb;
        this.offset = offset;
    }
 

6.2.2. DirectByteBuffer

This class is not as simple as HeapByteBuffer

DirectByteBuffer(int cap) {                  //package-private
    super(-1, 0, cap, cap);
    boolean pa = VM.isDirectMemoryPageAligned();
    int ps = Bits.pageSize();
    long size = Math.max(1L, (long)cap + (pa ? ps : 0));
    Bits.reserveMemory(size, cap);
    long base = 0;
    try {
        base = unsafe.allocateMemory(size);
    } catch (OutOfMemoryError x) {
        Bits.unreserveMemory(size, cap);
        throw x;
    }
    unsafe.setMemory(base, size, (byte) 0);
    if (pa && (base % ps != 0)) {
       //Round up to page boundary
        address = base + ps - (base & (ps - 1));
    } else {
        address = base;
    }
    cleaner = Cleaner.create(this, new Deallocator(base, size, cap));
    att = null;
}
 

Bits.reserveMemory(size, cap) method

static void reserveMemory(long size, int cap) {
    synchronized (Bits.class) {
        if (!memoryLimitSet && VM.isBooted()) {
            maxMemory = VM.maxDirectMemory();
            memoryLimitSet = true;
        }
       //-XX:MaxDirectMemorySize limits the total capacity rather than the
       //actual memory usage, which will differ when buffers are page
       //aligned.
        if (cap <= maxMemory - totalCapacity) {
            reservedMemory += size;
            totalCapacity += cap;
            count++;
            return;
        }
    }
    System.gc();
    try {
        Thread.sleep(100);
    } catch (InterruptedException x) {
       //Restore interrupt status
        Thread.currentThread().interrupt();
    }
    synchronized (Bits.class) {
        if (totalCapacity + cap > maxMemory)
            throw new OutOfMemoryError("Direct buffer memory");
        reservedMemory += size;
        totalCapacity += cap;
        count++;
    }
}
 

DirectByteBuffer first requests a quota from the Bits class. Bits has a global totalCapacity variable that records the total size of all DirectByteBuffers, and every request first checks whether the limit would be exceeded. The off-heap limit defaults to roughly the same as the heap limit (set by -Xmx) and can be changed with -XX:MaxDirectMemorySize.

If not specified, the default value of this parameter is the value of Xmx minus the size of one Survivor area. With the startup parameters -Xmx20M -Xmn10M -XX:SurvivorRatio=8, the 10M young generation is split Eden:S0:S1 = 8:1:1, so one Survivor area is 1M and the DirectMemory limit is 20M - 1M = 19M.

If the limit is exceeded, System.gc() is called in the hope of reclaiming some off-heap memory. System.gc() triggers a full GC, provided that you have not set -XX:+DisableExplicitGC to disable explicit GC, and even then there is no guarantee that the full GC runs immediately. The thread then sleeps for one hundred milliseconds and checks whether totalCapacity has dropped. If memory is still insufficient, an OOM error is thrown. If the quota is granted, the famous sun.misc.Unsafe is called to allocate the memory and return its base address.

For this reason, frameworks typically allocate one large DirectByteBuffer at startup and manage the memory inside it themselves.

Finally, a Cleaner is created and bound to a Deallocator, which represents the cleanup action: it reduces totalCapacity in Bits and calls Unsafe.freeMemory to release the memory.
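The idea of allocating one large block and managing it yourself can be sketched as a tiny bump allocator. This is an illustration only: the DirectSlab class is hypothetical, it never reuses freed regions, and real frameworks such as Netty use far more elaborate pooling.

```java
import java.nio.ByteBuffer;

class DirectSlab {
    private final ByteBuffer slab;
    private int next = 0; // bump pointer: offset of the first unused byte

    DirectSlab(int capacity) {
        // a single reservation against the direct memory limit
        this.slab = ByteBuffer.allocateDirect(capacity);
    }

    /** Hands out a view of `size` bytes carved from the slab; no new native allocation happens. */
    synchronized ByteBuffer allocate(int size) {
        if (size < 0 || size > slab.capacity() - next) {
            throw new IllegalArgumentException("slab exhausted or bad size: " + size);
        }
        ByteBuffer view = slab.duplicate(); // shares the memory, independent position and limit
        view.position(next);
        view.limit(next + size);
        next += size;
        return view.slice();                // position 0, capacity == size
    }

    synchronized int remaining() {
        return slab.capacity() - next;
    }

    public static void main(String[] args) {
        DirectSlab allocator = new DirectSlab(1024);
        ByteBuffer a = allocator.allocate(100);
        ByteBuffer b = allocator.allocate(200);
        System.out.println(a.capacity() + " " + b.capacity() + " " + allocator.remaining()); // prints "100 200 724"
    }
}
```

Because every view shares the slab, the quota check and System.gc() fallback in Bits.reserveMemory are paid once at startup instead of on every allocation.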

6.2.3. ByteBuffer recycling

HeapByteBuffer needs no discussion; the GC takes care of it. The focus here is DirectByteBuffer. The DirectByteBuffer object that lives on the heap is very small: it stores only a few fields such as the base address and size, plus a Cleaner. But it stands for a large block of memory allocated outside the heap, which makes it a so-called iceberg object. In the Cleaner class, first is a static variable: each Cleaner object is added to the Cleaner linked list when it is created, forming a reference chain from first, and a ReferenceQueue holds the Cleaner objects that need to be processed.

If the DirectByteBuffer object is collected in a GC, only the Cleaner object is left holding the data about the off-heap memory (start address, size, and capacity). In the next Full GC, the Cleaner object is put into the ReferenceQueue and its clean method is triggered.

A quick review of GC in the heap: when the young generation is full, a young GC occurs; objects that are still reachable at that point are not collected; after surviving several young GCs, an object is promoted to the old generation; when the old generation is full, a full GC occurs.

This reveals an awkward situation. Because the DirectByteBuffer object itself is so small, once it survives young GC it can sit comfortably in the old generation even after it has become garbage, and by itself it will hardly fill the old generation enough to trigger a full GC. If no other large objects enter the old generation to trigger a full GC, it lingers there indefinitely, holding on to a large amount of off-heap memory that is never released.

At that point, the only rescue is the System.gc() call mentioned earlier, which is triggered when a quota request exceeds the limit. But this last line of defense is not very reliable. First, it pauses the entire process; then the current thread sleeps for a full one hundred milliseconds, and if the GC has not finished within that time, an OOM error is still thrown without mercy. Worse, if someone has followed a tuning guide and set -XX:+DisableExplicitGC to disable System.gc(), this safety net is gone altogether.

Therefore, it is better to release off-heap memory yourself; Netty, for example, does this.
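The pattern of binding a cleanup action to an object can be sketched with the public java.lang.ref.Cleaner API (Java 9 and later). This is not the JDK-internal Cleaner class that DirectByteBuffer uses, and the TrackedRegion class and its counter are made up for illustration; the counter merely plays the role of totalCapacity in Bits, and no real memory is allocated.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicLong;

class TrackedRegion implements AutoCloseable {
    static final AtomicLong RESERVED = new AtomicLong(); // stands in for Bits.totalCapacity
    private static final Cleaner CLEANER = Cleaner.create();

    /** Cleanup action. It must not reference the TrackedRegion, or the region could never become unreachable. */
    private static final class Release implements Runnable {
        private final long size;

        Release(long size) {
            this.size = size;
        }

        @Override
        public void run() {
            RESERVED.addAndGet(-size); // a real deallocator would also free the native memory here
        }
    }

    private final Cleaner.Cleanable cleanable;

    TrackedRegion(long size) {
        RESERVED.addAndGet(size);
        // if the region becomes unreachable without close(), the Cleaner runs Release after a GC
        this.cleanable = CLEANER.register(this, new Release(size));
    }

    /** Explicit release; the Cleaner guarantees that the action runs at most once. */
    @Override
    public void close() {
        cleanable.clean();
    }

    public static void main(String[] args) {
        TrackedRegion region = new TrackedRegion(4096);
        System.out.println(RESERVED.get()); // prints 4096
        region.close();
        region.close();                     // second call is a no-op
        System.out.println(RESERVED.get()); // prints 0
    }
}
```

Calling close() explicitly avoids waiting for a GC to notice the small on-heap object, which is exactly the weakness of relying on the collector alone.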

7. How to view the usage of DirectBuffer:

7.1. In-process acquisition:

MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
ObjectName objectName = new ObjectName("java.nio:type=BufferPool,name=direct");
MBeanInfo info = mbs.getMBeanInfo(objectName);
for (MBeanAttributeInfo i : info.getAttributes()) {
    System.out.println(i.getName() + ":" + mbs.getAttribute(objectName, i.getName()));
}
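The same numbers are also available through a typed interface, java.lang.management.BufferPoolMXBean (Java 7 and later), which avoids looking attributes up by name. A minimal sketch, with a made-up class name and an arbitrary 1 MiB allocation:

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

class DirectPoolStats {
    /** Returns the platform's "direct" buffer pool bean, or throws if the JVM does not expose one. */
    static BufferPoolMXBean directPool() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool;
            }
        }
        throw new IllegalStateException("no direct buffer pool MXBean found");
    }

    public static void main(String[] args) {
        BufferPoolMXBean pool = directPool();
        ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20); // 1 MiB, kept reachable below
        System.out.println("count=" + pool.getCount()
                + " used=" + pool.getMemoryUsed()
                + " capacity=" + pool.getTotalCapacity());
        System.out.println("allocated " + buffer.capacity() + " bytes of direct memory");
    }
}
```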
 

7.2. Remote process

The data can be obtained over JMX. If the target process was not started with JMX enabled, add these JVM parameters:

-Dcom.sun.management.jmxremote.port=9999 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dcom.sun.management.jmxremote.ssl=false
 

Restart the process, then connect to it over JMX:

String jmxURL = "service:jmx:rmi:///jndi/rmi://10.125.6.204:9999/jmxrmi";
JMXServiceURL serviceURL = new JMXServiceURL(jmxURL);
Map<String, Object> map = new HashMap<>();
// role name and password; only checked when authentication is enabled on the target
String[] credentials = new String[] { "monitorRole", "QED" };
map.put("jmx.remote.credentials", credentials);
JMXConnector connector = JMXConnectorFactory.connect(serviceURL, map);
MBeanServerConnection mbsc = connector.getMBeanServerConnection();
ObjectName objectName = new ObjectName("java.nio:type=BufferPool,name=direct");
MBeanInfo mbInfo = mbsc.getMBeanInfo(objectName);
for (MBeanAttributeInfo i : mbInfo.getAttributes()) {
    System.out.println(i.getName() + ":" + mbsc.getAttribute(objectName, i.getName()));
}
 

Locally, the same data can also be viewed with the JConsole tool:

But be careful not to sample too frequently; otherwise all threads will repeatedly be brought to a safepoint (that is, Stop the World).

7.3. jcmd command view

This requires native memory tracking to be enabled (-XX:NativeMemoryTracking=summary or detail), but it will often bring all threads to a safepoint (that is, Stop the World), so enabling it for online applications is not recommended.

Example:

$ jcmd 71 VM.native_memory
71:

Native Memory Tracking:

Total: reserved=1631932KB, committed=367400KB
-                 Java Heap (reserved=131072KB, committed=131072KB)
                            (mmap: reserved=131072KB, committed=131072KB) 

-                     Class (reserved=1120142KB, committed=79830KB)
                            (classes #15267)
                            (  instance classes #14230, array classes #1037)
                            (malloc=1934KB #32977) 
                            (mmap: reserved=1118208KB, committed=77896KB) 
                            (  Metadata:   )
                            (    reserved=69632KB, committed=68272KB)
                            (    used=66725KB)
                            (    free=1547KB)
                            (    waste=0KB =0.00%)
                            (  Class space:)
                            (    reserved=1048576KB, committed=9624KB)
                            (    used=8939KB)
                            (    free=685KB)
                            (    waste=0KB =0.00%)

-                    Thread (reserved=24786KB, committed=5294KB)
                            (thread #56)
                            (stack: reserved=24500KB, committed=5008KB)
                            (malloc=198KB #293) 
                            (arena=88KB #110)

-                      Code (reserved=250635KB, committed=45907KB)
                            (malloc=2947KB #13459) 
                            (mmap: reserved=247688KB, committed=42960KB) 

-                        GC (reserved=48091KB, committed=48091KB)
                            (malloc=10439KB #18634) 
                            (mmap: reserved=37652KB, committed=37652KB) 

-                  Compiler (reserved=358KB, committed=358KB)
                            (malloc=249KB #1450) 
                            (arena=109KB #5)

-                  Internal (reserved=1165KB, committed=1165KB)
                            (malloc=1125KB #3363) 
                            (mmap: reserved=40KB, committed=40KB) 

-                     Other (reserved=16696KB, committed=16696KB)
                            (malloc=16696KB #35) 

-                    Symbol (reserved=15277KB, committed=15277KB)
                            (malloc=13543KB #180850) 
                            (arena=1734KB #1)

-    Native Memory Tracking (reserved=4436KB, committed=4436KB)
                            (malloc=378KB #5359) 
                            (tracking overhead=4058KB)

-        Shared class space (reserved=17144KB, committed=17144KB)
                            (mmap: reserved=17144KB, committed=17144KB) 

-               Arena Chunk (reserved=1850KB, committed=1850KB)
                            (malloc=1850KB) 

-                   Logging (reserved=4KB, committed=4KB)
                            (malloc=4KB #179) 

-                 Arguments (reserved=19KB, committed=19KB)
                            (malloc=19KB #512) 

-                    Module (reserved=258KB, committed=258KB)
                            (malloc=258KB #2356) 
 

Here, the memory used by DirectBuffer is counted under the Other category.