You have worked with microservices for so long, but do you actually understand RPC?


First, what is RPC, and why do we need it? RPC stands for remote procedure call. Suppose there are two servers, A and B. An application deployed on server A wants to call a function or method provided by an application on server B. Because the two applications do not share a memory space, the call cannot be made directly; the semantics of the call and its data have to be conveyed over the network.


RPC functional goal

The main functional goal of RPC is to make distributed applications easier to build: it provides powerful remote call capabilities without losing the semantic simplicity of local calls. To achieve this, the RPC framework needs to provide a transparent calling mechanism so that users do not have to explicitly distinguish between local and remote calls. The previous article, "Introduction to the Basics", presented an implementation structure based on stubs. Below we refine the implementation of that stub structure.

RPC call classification

There are two types of RPC calls:

1. Synchronous call: the client waits until the call has finished executing and the result is returned.
2. Asynchronous call: the client does not wait for the result after making the call, but it can still obtain the result later through a callback notification. If the client does not care about the result at all, the call becomes a one-way asynchronous call, which does not return a result.

The difference between synchronous and asynchronous calls is whether the client waits for the server to finish executing and return the result.
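The three call styles can be sketched in a few lines of Java. This is only an illustration under stated assumptions: CallStyles and its methods are hypothetical names, and a single background thread stands in for the network and the remote server.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a remote endpoint; a real framework would send the call over the network.
class CallStyles {
    // A single daemon thread plays the role of "network plus server".
    private final ExecutorService remote = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "fake-remote");
        t.setDaemon(true);
        return t;
    });

    // Asynchronous call: returns immediately, the result is delivered later through the future.
    CompletableFuture<String> hiAsync(String s) {
        return CompletableFuture.supplyAsync(() -> "hi, " + s, remote);
    }

    // Synchronous call: the caller blocks until the result is available.
    String hi(String s) {
        return hiAsync(s).join();
    }

    // One-way call: fire and forget, no result ever reaches the caller.
    void hiOneWay(String s) {
        remote.execute(() -> { /* the server would process s here */ });
    }
}
```

A caller that wants callback-style notification can attach it with hiAsync("Ann").thenAccept(result -> ...), which runs once the result arrives.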

RPC structure breakdown

"The Simple Story" gives a relatively coarse-grained conceptual structure for an RPC implementation. Here we refine it further and identify the components it should consist of, as shown in the following figure.


On the server side, remote interface methods are exported through RpcServer, and on the client side they are imported through RpcClient. The client calls a remote interface method just as it would call a local method. The RPC framework supplies a proxy implementation of the interface, so the actual call is delegated to the proxy, RpcProxy. The proxy packages the call information and hands the call to RpcInvoker for actual execution. On the client side, RpcInvoker uses the connector RpcConnector to maintain the channel RpcChannel to the server, uses RpcProtocol to encode the request, and sends the encoded request message to the server through the channel.

On the server side, the receiver RpcAcceptor receives the call request from the client and uses the same RpcProtocol to decode it. The decoded call information is passed to RpcProcessor, which controls the calling process and finally delegates the call to RpcInvoker, which actually executes it and returns the result.

RPC component responsibilities

Above we broke the RPC implementation structure down into its components. Below we explain the responsibilities of each component in detail.

1. RpcServer: responsible for exporting the remote interface.
2. RpcClient: responsible for importing the proxy implementation of the remote interface.
3. RpcProxy: the proxy implementation of the remote interface.
4. RpcInvoker: on the client, responsible for encoding the call information, sending the call request to the server, and waiting for the result to return; on the server, responsible for calling the concrete implementation of the service interface and returning the result.
5. RpcProtocol: responsible for protocol encoding and decoding.
6. RpcConnector: responsible for maintaining the connection channel between the client and the server and sending data to the server.
7. RpcAcceptor: responsible for receiving client requests and returning the results.
8. RpcProcessor: responsible for controlling the calling process on the server side, including managing the thread pool for calls, timeouts, and so on.
9. RpcChannel: the data transmission channel.

RPC implementation analysis

Having broken down the components and divided their responsibilities, we now take an implementation of this conceptual RPC framework on the Java platform as an example and analyze in detail the factors that need to be considered.

Export remote interface

Exporting a remote interface means that only exported interfaces are available for remote calls; interfaces that have not been exported cannot be called remotely. In Java, the code for exporting an interface might look like this:

DemoService demo = new ...;
RpcServer server = new ...;
server.export(DemoService.class, demo, options);

We can export the entire interface, or we can export only some methods in the interface in a more fine-grained way, such as:

// Only export the method with signature hi(String s) in DemoService
server.export(DemoService.class, demo, "hi", new Class<?>[] { String.class }, options);

Java has another special kind of call: the polymorphic call. An interface may have several implementations, so which one is invoked by a remote call? For local calls, this is resolved implicitly through the reference polymorphism provided by the JVM, but for RPC a cross-process call cannot be resolved implicitly. If the DemoService interface above has two implementations, the different implementations have to be marked explicitly when the interface is exported, for example:

DemoService demo = new ...;
DemoService demo2 = new ...;
RpcServer server = new ...;
server.export(DemoService.class, demo, options);
server.export("demo2", DemoService.class, demo2, options);

Here demo2 is the second implementation, and we export it under the mark "demo2". A remote call then has to carry this mark so that the correct implementation class is invoked. This resolves the semantics of polymorphic calls.
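On the server, the export mark can be thought of as part of a lookup key. The following is a minimal sketch, not the API of any real framework: ExportRegistry, its lookup method, and the shape of DemoService are assumptions made for illustration, and the options argument from the snippets above is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical service interface; the article does not show its definition.
interface DemoService {
    String hi(String s);
}

// Sketch of a server-side export table keyed by (interface name, mark).
class ExportRegistry {
    private static final String DEFAULT_MARK = "";
    private final Map<String, Object> exported = new ConcurrentHashMap<>();

    // Export the default implementation of an interface.
    <T> void export(Class<T> iface, T impl) {
        export(DEFAULT_MARK, iface, impl);
    }

    // Export an additional implementation under an explicit mark.
    <T> void export(String mark, Class<T> iface, T impl) {
        exported.put(key(mark, iface), impl);
    }

    // Resolve the implementation named by an incoming request; unexported targets are rejected.
    <T> T lookup(String mark, Class<T> iface) {
        Object impl = exported.get(key(mark, iface));
        if (impl == null) {
            throw new IllegalStateException("not exported: " + key(mark, iface));
        }
        return iface.cast(impl);
    }

    private static String key(String mark, Class<?> iface) {
        return iface.getName() + "#" + mark;
    }
}
```

The client would send the mark together with the interface and method names, and the server would pass it to lookup before invoking the method.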

Import remote interface and client proxy

As the counterpart to exporting a remote interface, the client code must obtain the method or procedure definitions of the remote interface before it can initiate a call. At present, most cross-language RPC frameworks use a code generator to produce stub code from an IDL definition, so the actual import is completed at compile time by the code generator. The cross-language RPC frameworks I have used, such as CORBA, Web Services, ICE, and Thrift, all work this way.

Code generation is the inevitable choice for a cross-language RPC framework, whereas RPC within a single language platform can be implemented by sharing the interface definition. In Java, the code for importing an interface might look like this:

RpcClient client = new ...;
DemoService demo = client.refer(DemoService.class);
demo.hi("how are you?");

In Java, 'import' is a keyword, so the snippet uses refer to express importing an interface. The import here is essentially also a code generation technique, except that the code is generated at runtime, which looks more concise than generating code at static compile time. Java offers at least two techniques for dynamic code generation: JDK dynamic proxies and bytecode generation. Dynamic proxies are more convenient to use, but they perform worse than direct bytecode generation, while bytecode generation is much worse in terms of code readability. Weighing the two, I personally think it is worth sacrificing some performance for readability and maintainability.
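A refer method built on the JDK dynamic proxy could look roughly like the sketch below. RpcClientSketch and its nested Transport interface are hypothetical; Transport merely stands in for the encoding and network steps discussed in the following sections, and DemoService is assumed to have a single hi method.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical service interface shared by client and server.
interface DemoService {
    String hi(String s);
}

class RpcClientSketch {
    // Hypothetical stand-in for "encode the call and send it to the server".
    interface Transport {
        Object send(String interfaceName, String methodName, Object[] args) throws Exception;
    }

    private final Transport transport;

    RpcClientSketch(Transport transport) {
        this.transport = transport;
    }

    // Builds a runtime proxy that turns each interface method call into a remote request.
    <T> T refer(Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {
                // toString, hashCode and equals are answered locally, not sent over the wire.
                switch (method.getName()) {
                    case "toString": return "RpcProxy(" + iface.getName() + ")";
                    case "hashCode": return System.identityHashCode(proxy);
                    default: return proxy == args[0];
                }
            }
            return transport.send(iface.getName(), method.getName(), args);
        };
        Object instance = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(instance);
    }
}
```

Because the proxy is created from the shared interface at runtime, the client needs no generated stub source, which is the readability advantage mentioned above.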

Protocol codec

Before initiating a call, the client proxy needs to encode the call information. This raises the question of what information has to be encoded, and in what format it should be transmitted, so that the server can complete the call. For efficiency, the less information is encoded the better (less data to transmit), and the simpler the encoding rules the better (faster execution). Let's first look at what information needs to be encoded:

- Call encoding
1. Interface method: the interface name and method name.
2. Method parameters: the parameter types and parameter values.
3. Call attributes: call attribute information, such as implicit parameters attached to the call, the call timeout, and so on.
- Return encoding
1. Return result: the return value defined by the interface method.
2. Return code: the exception return code.
3. Return exception information: the exception information of the call.

Besides the call information above, which is strictly necessary, we may also need some meta-information to make encoding and decoding easier and to allow for future extension. The encoded message therefore has two parts: the meta-information and the information needed for the call. If we design an RPC protocol message, the meta-information goes into the message header and the call information goes into the message body. A conceptual RPC protocol message format is given below:


- Message header
magic: protocol magic number, designed for decoding
header size: protocol header length, designed for extension
version: protocol version, designed for compatibility
st: message body serialization type
hb: heartbeat message flag, designed for transport-layer heartbeats on long-lived connections
ow: one-way message flag
rp: response message flag; if not set, the message is a request by default
status code: response message status code
reserved: reserved for byte alignment
message id: message id
body size: message body length
- Message body
Encoded by serialization, usually in one of the following formats:
xml: such as Web Services SOAP
json: such as JSON-RPC
binary: such as Thrift, Hessian, Kryo, etc.
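To make the header layout concrete, here is a sketch of encoding and decoding it with java.nio.ByteBuffer. The article does not specify field widths, so everything numeric here is an assumption: a 2-byte magic with the made-up value 0xCAFE, a 2-byte header size, one byte each for version, st and status, the hb/ow/rp marks packed as bits of a single flags byte, 4 reserved bytes, an 8-byte message id and a 4-byte body size, for 24 bytes in total.

```java
import java.nio.ByteBuffer;

// Sketch of the conceptual header; all field widths and the magic value are assumptions.
final class RpcHeader {
    static final short MAGIC = (short) 0xCAFE;
    static final int SIZE = 24;
    static final int FLAG_HEARTBEAT = 1, FLAG_ONE_WAY = 2, FLAG_RESPONSE = 4;

    final byte version, serializationType, flags, status;
    final long messageId;
    final int bodySize;

    RpcHeader(byte version, byte serializationType, byte flags, byte status,
              long messageId, int bodySize) {
        this.version = version;
        this.serializationType = serializationType;
        this.flags = flags;
        this.status = status;
        this.messageId = messageId;
        this.bodySize = bodySize;
    }

    boolean isHeartbeat() { return (flags & FLAG_HEARTBEAT) != 0; }
    boolean isOneWay()    { return (flags & FLAG_ONE_WAY) != 0; }
    boolean isResponse()  { return (flags & FLAG_RESPONSE) != 0; }

    // Writes the header in network (big-endian) byte order, ByteBuffer's default.
    ByteBuffer encode() {
        ByteBuffer buf = ByteBuffer.allocate(SIZE);
        buf.putShort(MAGIC).putShort((short) SIZE)
           .put(version).put(serializationType).put(flags).put(status)
           .putInt(0)                       // reserved
           .putLong(messageId).putInt(bodySize);
        buf.flip();
        return buf;
    }

    static RpcHeader decode(ByteBuffer buf) {
        int start = buf.position();
        if (buf.getShort() != MAGIC) {
            throw new IllegalArgumentException("bad magic number");
        }
        int headerSize = buf.getShort() & 0xFFFF;
        byte version = buf.get();
        byte serializationType = buf.get();
        byte flags = buf.get();
        byte status = buf.get();
        buf.getInt();                       // reserved, ignored
        long messageId = buf.getLong();
        int bodySize = buf.getInt();
        // Skip extension bytes that a newer sender may have appended to the header.
        buf.position(start + headerSize);
        return new RpcHeader(version, serializationType, flags, status, messageId, bodySize);
    }
}
```

Reading the header size from the wire, rather than assuming the constant, is what lets an older decoder step over header fields added by a later protocol version.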

Once the format is fixed, encoding and decoding are straightforward. Since the header length is fixed, what we care about more is how the message body is serialized. Three aspects of serialization matter:

1. The efficiency of serialization and deserialization: the faster the better.
2. The byte length after serialization: the smaller the better.
3. The compatibility of serialization and deserialization: if fields are added to an interface parameter object, does it remain compatible?

Sometimes you cannot have all three at once. That depends on the implementation details of the specific serialization library, so this article does not analyze it further.

Transmission service

Once the message is encoded, the encoded RPC request naturally has to be transmitted to the server, and after executing it the server returns a result message or an acknowledgment message to the client. The application scenario of RPC is essentially a reliable request-response message flow, similar to HTTP, so TCP with long-lived connections is the more efficient choice. Unlike HTTP, we define a unique id for each message at the protocol level, which makes it easier to reuse a connection.
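The role of the message id in connection reuse can be sketched as a table of pending calls: several requests are in flight on one connection, and each response is matched to its caller by id. PendingCalls and the wire callback are hypothetical names, and the callback stands in for writing a frame to the channel.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BiConsumer;

// Sketch: many in-flight requests share one connection and are matched to responses by message id.
class PendingCalls {
    private final AtomicLong nextId = new AtomicLong();
    private final Map<Long, CompletableFuture<byte[]>> pending = new ConcurrentHashMap<>();
    private final BiConsumer<Long, byte[]> wire; // hypothetical: writes (id, body) to the channel

    PendingCalls(BiConsumer<Long, byte[]> wire) {
        this.wire = wire;
    }

    CompletableFuture<byte[]> send(byte[] body) {
        long id = nextId.incrementAndGet();
        CompletableFuture<byte[]> future = new CompletableFuture<>();
        pending.put(id, future);   // register before writing so a fast response cannot be missed
        wire.accept(id, body);
        return future;
    }

    // Called by the I/O thread whenever a response frame arrives, in any order.
    void onResponse(long id, byte[] body) {
        CompletableFuture<byte[]> future = pending.remove(id);
        if (future != null) {
            future.complete(body);
        }
        // Responses with unknown ids (for example, calls that were abandoned) are dropped.
    }

    int inFlight() {
        return pending.size();
    }
}
```

Since responses are routed by id, the server is free to answer requests out of order on the same connection.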

Since long-lived connections are used, the first question is how many connections are needed between the client and the server. From the user's point of view there is no difference between a single connection and multiple connections, and for applications that transfer little data a single connection is basically sufficient. The main difference is that each connection has its own private send and receive buffers, so when large volumes of data are transferred, spreading them across the buffers of several connections gives better throughput. Therefore, if your data volume is not enough to keep the buffers of a single connection saturated, using multiple connections brings no significant improvement and only adds connection management overhead.

The connection is established and maintained by the client. If the client and the server are connected directly, the connection is generally not interrupted (apart from physical link failures, of course). If the connection passes through intermediate devices such as load balancers, those devices may drop it after it has been inactive for some time. To keep the connection alive, heartbeat data has to be sent on each connection at regular intervals. The heartbeat message is an internal message of the RPC framework library; the protocol header described earlier has a dedicated heartbeat bit that marks such messages. Heartbeats are transparent to business applications.
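The decision of when to send a heartbeat can be reduced to tracking idle time per connection. The helper below is a hypothetical sketch: the class name, the idea that any traffic resets the timer, and passing the clock in as a parameter are all assumptions made to keep the example small and deterministic.

```java
// Hypothetical helper: decides when an idle connection needs a heartbeat message.
class HeartbeatTracker {
    private final long intervalMillis;
    private long lastActivityMillis;

    HeartbeatTracker(long intervalMillis, long nowMillis) {
        this.intervalMillis = intervalMillis;
        this.lastActivityMillis = nowMillis;
    }

    // Any traffic on the connection, heartbeat or business message, resets the idle timer.
    void onActivity(long nowMillis) {
        lastActivityMillis = nowMillis;
    }

    // True once the connection has been idle for at least one full interval.
    boolean heartbeatDue(long nowMillis) {
        return nowMillis - lastActivityMillis >= intervalMillis;
    }
}
```

A framework would typically check heartbeatDue from a periodic timer and pick an interval shorter than the idle timeout of the devices between client and server.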

Execute call

All the client stub does is encode the message and transmit it to the server; the actual call takes place on the server. In the structure breakdown above, we split the server stub into two components, RpcProcessor and RpcInvoker: one is responsible for controlling the calling process, the other for the actual call. Here we again take a Java implementation of these two components as an example and analyze what they need to do.

In Java, dynamically invoking the implementation of an interface is currently done through reflection. Besides the reflection built into the JDK, some third-party libraries offer reflective calls with better performance, so RpcInvoker encapsulates the implementation details of the reflective call.
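Using only the JDK's built-in reflection, the core of a server-side invoker might look like the following sketch. ReflectiveInvoker and its parameter list are hypothetical, and DemoService is again assumed to declare a single hi method.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical service interface exported by the server.
interface DemoService {
    String hi(String s);
}

class ReflectiveInvoker {
    // Invokes the named interface method on the exported implementation.
    static Object invoke(Class<?> iface, Object target, String methodName,
                         Class<?>[] paramTypes, Object[] args) throws Throwable {
        // Resolve the method on the exported interface, using the types carried in the request.
        Method method = iface.getMethod(methodName, paramTypes);
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            // Unwrap so the caller sees the exception the implementation actually threw.
            throw e.getCause();
        }
    }
}
```

Keeping this behind one small class is what allows the JDK reflection call to be swapped for a faster library later without touching the rest of the server.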

What factors need to be considered when controlling the calling process, and what kind of call control services should RpcProcessor provide? Here are a few points to prompt some thinking:

1. Efficiency: each request should be executed as soon as possible, so we cannot create a new thread for every request; a thread pool service is needed.
2. Resource isolation: when multiple remote interfaces are exported, how do we prevent calls to a single interface from occupying all thread resources and blocking the execution of other interfaces?
3. Timeout control: when an interface executes slowly and the client has already timed out and given up waiting, it is pointless for the server thread to keep executing.
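One possible way to address all three points is a fixed thread pool per exported interface combined with a deadline on each call. The sketch below is an assumption about how such a processor could be written, not a description of any existing framework; CallProcessor and its process method are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch: one bounded pool per interface (isolation) plus a deadline per call (timeout control).
class CallProcessor {
    private final Map<String, ExecutorService> pools = new ConcurrentHashMap<>();
    private final int threadsPerInterface;

    CallProcessor(int threadsPerInterface) {
        this.threadsPerInterface = threadsPerInterface;
    }

    <T> T process(String interfaceName, Callable<T> call, long timeoutMillis) throws Exception {
        // Threads are pooled and reused; a slow interface can only exhaust its own pool.
        ExecutorService pool = pools.computeIfAbsent(interfaceName, name ->
                Executors.newFixedThreadPool(threadsPerInterface, r -> {
                    Thread t = new Thread(r, "rpc-" + name);
                    t.setDaemon(true);
                    return t;
                }));
        Future<T> future = pool.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The client has given up by now, so interrupt the worker instead of letting it run on.
            future.cancel(true);
            throw e;
        }
    }
}
```

Note that cancel(true) only interrupts the worker thread; an implementation that ignores interruption will still run to completion.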

RPC exception handling

No matter how hard RPC tries to make remote calls look like local calls, big differences remain, and some exceptions can occur that would never be encountered in a local call. Before discussing exception handling, let's compare local calls and RPC calls:

1. A local call is always executed, while a remote call is not necessarily executed: the call message may never reach the server because of network problems.
2. A local call only throws the exceptions declared by the interface, while a remote call may also throw other exceptions raised by the RPC framework at runtime.
3. The performance of local and remote calls may differ greatly, depending on how large a share the inherent overhead of RPC takes up.

These differences are why using RPC requires more careful thought. When calling a remote interface throws an exception, it may be a business exception or a runtime exception thrown by the RPC framework (a network interruption, for example). A business exception indicates that the server did execute the call, although it may not have completed normally for some reason, whereas an RPC runtime exception may mean that the server never executed the call at all. The caller's exception handling strategy naturally needs to distinguish between the two.
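One way a caller might differentiate the two kinds of exception is sketched below. The retry policy is an assumption chosen for illustration: framework failures are retried only when the request is known not to have left the client, and business exceptions are never retried. RpcRuntimeException, its requestSent field, and RetryingCaller are hypothetical names.

```java
import java.util.concurrent.Callable;

// Hypothetical exception raised by the RPC framework itself, e.g. when a connection fails.
class RpcRuntimeException extends RuntimeException {
    final boolean requestSent; // false: the server certainly never saw the call

    RpcRuntimeException(String message, boolean requestSent) {
        super(message);
        this.requestSent = requestSent;
    }
}

class RetryingCaller {
    // Retries only failures where the request never left the client, so the call cannot run twice.
    static <T> T call(Callable<T> remoteCall, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RpcRuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return remoteCall.call();
            } catch (RpcRuntimeException e) {
                if (e.requestSent) {
                    throw e; // outcome unknown on the server: leave the decision to the caller
                }
                last = e;    // safe to try again
            }
            // Any other exception is treated as a business exception and propagates at once:
            // the server executed the call, so repeating it would not help.
        }
        throw last;
    }
}
```

Whether a call whose outcome is unknown may be repeated depends on whether the remote method is idempotent, which only the business code can know.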

The inherent overhead of RPC is several orders of magnitude higher than that of a local call: a local call costs on the order of nanoseconds, while an RPC costs on the order of milliseconds. Computing tasks that are too lightweight are therefore not suited to being exported as remote interfaces and served by a separate process. Exporting a remote interface is only worthwhile when the time spent on the computation is much greater than the inherent overhead of RPC.

Summary

So far we have proposed a conceptual framework for an RPC implementation and analyzed in detail some of the implementation details that need to be considered. However elegant the concept of RPC may be, "there are still a few snakes hidden in the grass", and only a deep understanding of the nature of RPC allows it to be applied well.
