
rsslRead and rsslDecode functions latency spikes

Hi

We have measured the latency of our UPA consumer application, and have noticed some unusual behaviour around the rsslRead and rsslDecode functions. Attached is a graph displaying our findings.

Our measurements show that the first call to rsslRead takes longer than subsequent calls for each message 'burst' from the network - the blue line in the graph. We assume this is because that rsslRead call 'buffers' the burst of messages from the network (so the subsequent calls do not perform network reads). However, the rsslDecode functions (rsslDecodeMsg, rsslDecodeFieldEntry, ...) also each take longer when decoding the first message returned by rsslRead - the red line in the graph. The slower rsslRead call and the slower rsslDecode functions together mean that the first message in the input buffer takes considerably longer to decode than the remaining messages.
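
For reference, our measurement pattern is roughly the following (a simplified sketch rather than our production code: the timing helper and variable names are illustrative, while the rsslRead/rsslDecode* calls are the standard UPA C API):

```c
/* Simplified sketch of the instrumented read/decode path (illustrative only).
 * rsslRead/rsslDecode* are standard UPA C API calls; the clock_gettime
 * timing and helper names stand in for our actual measurement harness. */
#include <time.h>
#include "rtr/rsslTransport.h"
#include "rtr/rsslMessagePackage.h"

static double elapsedUsec(struct timespec start, struct timespec end)
{
    return (end.tv_sec - start.tv_sec) * 1e6 + (end.tv_nsec - start.tv_nsec) / 1e3;
}

void readAndDecodeBurst(RsslChannel *chnl)
{
    RsslRet readRet;
    RsslError error;
    RsslBuffer *msgBuf;
    struct timespec t0, t1;

    do
    {
        clock_gettime(CLOCK_MONOTONIC, &t0);
        msgBuf = rsslRead(chnl, &readRet, &error);        /* blue line: time per rsslRead call */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        /* record elapsedUsec(t0, t1) against this rsslRead call */

        if (msgBuf != NULL)
        {
            RsslDecodeIterator dIter;
            RsslMsg msg = RSSL_INIT_MSG;
            RsslFieldList fieldList = RSSL_INIT_FIELD_LIST;
            RsslFieldEntry fieldEntry = RSSL_INIT_FIELD_ENTRY;

            rsslClearDecodeIterator(&dIter);
            rsslSetDecodeIteratorRWFVersion(&dIter, chnl->majorVersion, chnl->minorVersion);
            rsslSetDecodeIteratorBuffer(&dIter, msgBuf);

            clock_gettime(CLOCK_MONOTONIC, &t0);
            rsslDecodeMsg(&dIter, &msg);                  /* red line: decode time per message */
            if (msg.msgBase.containerType == RSSL_DT_FIELD_LIST &&
                rsslDecodeFieldList(&dIter, &fieldList, NULL) == RSSL_RET_SUCCESS)
            {
                while (rsslDecodeFieldEntry(&dIter, &fieldEntry) != RSSL_RET_END_OF_CONTAINER)
                {
                    RsslDouble value;
                    rsslDecodeDouble(&dIter, &value);     /* our updates carry one double field */
                }
            }
            clock_gettime(CLOCK_MONOTONIC, &t1);
            /* record elapsedUsec(t0, t1) against decoding this message */
        }
    } while (readRet > 0);  /* readRet > 0: more messages already buffered from this network read */
}
```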

It is worth noting that these 'spikes' we are seeing in the latency of decoding occur even when we fix the size of the messages by using the dynamic view functionality to register interest in only one field. In the attached graph the messages were updates with a single double field.
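
For completeness, the single-field view registration is encoded roughly as below (a condensed sketch of the standard UPA C view-encoding pattern; the stream ID is a placeholder, and the msgKey and most error handling are omitted):

```c
/* Condensed sketch of a single-field dynamic view request (illustrative).
 * Assumes pIter has already been set to an encode buffer and RWF version;
 * the stream ID is a placeholder and the msgKey/error handling are omitted. */
#include "rtr/rsslMessagePackage.h"
#include "rtr/rsslRDM.h"

RsslRet encodeSingleFieldViewRequest(RsslEncodeIterator *pIter, RsslInt fieldId)
{
    RsslRequestMsg reqMsg = RSSL_INIT_REQUEST_MSG;
    RsslElementList elemList = RSSL_INIT_ELEMENT_LIST;
    RsslElementEntry elemEntry = RSSL_INIT_ELEMENT_ENTRY;
    RsslArray viewArray = RSSL_INIT_ARRAY;
    RsslUInt viewType = RDM_VIEW_TYPE_FIELD_ID_LIST;

    reqMsg.msgBase.streamId = 5;                          /* placeholder stream ID */
    reqMsg.msgBase.domainType = RSSL_DMT_MARKET_PRICE;
    reqMsg.msgBase.containerType = RSSL_DT_ELEMENT_LIST;
    reqMsg.flags = RSSL_RQMF_STREAMING | RSSL_RQMF_HAS_VIEW;
    /* msgKey (item name, service ID) omitted for brevity */

    if (rsslEncodeMsgInit(pIter, (RsslMsg*)&reqMsg, 0) < RSSL_RET_SUCCESS)
        return RSSL_RET_FAILURE;

    elemList.flags = RSSL_ELF_HAS_STANDARD_DATA;
    rsslEncodeElementListInit(pIter, &elemList, NULL, 0);

    /* :ViewType = field ID list */
    elemEntry.name = RSSL_ENAME_VIEW_TYPE;
    elemEntry.dataType = RSSL_DT_UINT;
    rsslEncodeElementEntry(pIter, &elemEntry, &viewType);

    /* :ViewData = array containing the single field ID we are interested in */
    elemEntry.name = RSSL_ENAME_VIEW_DATA;
    elemEntry.dataType = RSSL_DT_ARRAY;
    rsslEncodeElementEntryInit(pIter, &elemEntry, 0);
    viewArray.primitiveType = RSSL_DT_INT;
    viewArray.itemLength = 2;
    rsslEncodeArrayInit(pIter, &viewArray);
    rsslEncodeArrayEntry(pIter, NULL, &fieldId);
    rsslEncodeArrayComplete(pIter, RSSL_TRUE);
    rsslEncodeElementEntryComplete(pIter, RSSL_TRUE);

    rsslEncodeElementListComplete(pIter, RSSL_TRUE);
    return rsslEncodeMsgComplete(pIter, RSSL_TRUE);
}
```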

We would like the latency of decoding messages in our application to be as stable as possible regardless of the message's position in the input buffer - i.e. we would like the lines in the graph to be as horizontal as possible. Please can you explain why these spikes in latency exist, and how we can remove them?

Thanks,
Mike

UPA version - 8.0.0.L1

Attachment: decoding-latency.png (113.2 KiB)

Tags: elektron, elektron-sdk, rrt, eta-api, elektron-transport-api, performance, latency

1 Answer (Accepted)


@Mike Slade

The ESDK development team provided the following explanation for the behavior you observed.

The rsslRead call reads as much data off the wire as possible so that it can be processed efficiently, which explains the larger amount of time the first rsslRead call of each burst takes.
As for encoding and decoding, the extra time appears to be caused by L2/L3 cache misses in the CPU, in either the data or the instruction cache. Since none of the encoding or decoding code carries state between full message decoding calls, the development team does not believe there is anything they can do to mitigate this issue.
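
If you would like to verify the first point on your side, one rough approach (assuming your UPA build provides rsslReadEx, the variant of rsslRead that reports per-call byte counts) is to log bytesRead for each call; within a burst, only the call that actually drains the socket should report a non-zero value:

```c
/* Rough sketch (not tested against your setup): log per-call byte counts to
 * confirm that only one rsslRead call per burst touches the socket.
 * Assumes rsslReadEx/RsslReadOutArgs are available in your UPA build. */
#include <stdio.h>
#include "rtr/rsslTransport.h"

void drainChannel(RsslChannel *chnl)
{
    RsslRet readRet;
    RsslError error;
    RsslReadInArgs readInArgs;
    RsslReadOutArgs readOutArgs;
    RsslBuffer *msgBuf;

    do
    {
        rsslClearReadInArgs(&readInArgs);
        rsslClearReadOutArgs(&readOutArgs);

        msgBuf = rsslReadEx(chnl, &readInArgs, &readOutArgs, &readRet, &error);

        /* bytesRead should be non-zero only for the call that pulled the burst
         * off the wire; later calls return messages already held in the
         * channel's internal input buffer (indicated by readRet > 0). */
        printf("readRet=%d bytesRead=%u\n", readRet, readOutArgs.bytesRead);

        if (msgBuf != NULL)
        {
            /* decode msgBuf as usual */
        }
    } while (readRet > 0);
}
```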
If you have further questions, please let us know.

