
RFA not sending heartbeats to ADS.

We are running RFAJ versions 7.6.0.E9 through 8.0.0.2 on various Linux systems that connect to various ADS 2.5.0.L1 instances, also running on Linux. The ADS disconnects the session after failing to receive 3 consecutive heartbeats from the RFAJ systems. We confirmed this by watching the communication between the RFAJ systems and the ADS instances. The JVM and the relevant code have been verified to be running before and after the ADS initiates the disconnect. Has anyone seen this behavior? Does anyone have suggestions for where to look on the RFAJ system to determine where or why the heartbeats are not being sent?

Thanks

Tags: trep, rfa, rfa-api, ADS, linux, disconnection, ping

Hi @CTM

If no ping messages are being sent, something is interfering with RFA Java's ping management. This could be a) a lack of CPU time, b) the RFAJ thread being too busy, or c) the RFAJ thread having exited abnormally.

To verify a), check whether there was any resource issue on the client machines (e.g. CPU, memory), especially when the clients are running on VMs.

For b), one possibility is that the RFAJ thread is being used for time-intensive event processing. This happens when a null event queue is used (null is specified for the event queue when invoking the registerClient method), so the application's event handling runs on the RFAJ thread itself.

For c), if the application was able to function normally (without a restart) after the disconnection, the RFAJ thread was still running and c) can be ruled out.
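For case a), one rough way to see whether the JVM is being starved of CPU time (or paused, for example by GC or by the hypervisor) is to run a small watchdog that sleeps for a fixed interval and reports how late it wakes up. The sketch below is plain Java and does not use any RFA API; the class name, the 100 ms interval and the 500 ms threshold are assumptions chosen for illustration only.

```java
/** Hypothetical stall detector; not part of RFA. */
class StallDetectorSketch {

    /** How many milliseconds a wake-up overshot its intended sleep; never negative. */
    static long overshootMillis(long intendedSleepMillis, long actualElapsedMillis) {
        return Math.max(0, actualElapsedMillis - intendedSleepMillis);
    }

    /** True when the overshoot is large enough to be worth logging. */
    static boolean isStall(long overshootMillis, long thresholdMillis) {
        return overshootMillis > thresholdMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        final long intervalMillis = 100;   // assumed sampling interval
        final long thresholdMillis = 500;  // assumed "suspicious" delay

        // Bounded loop so the sketch terminates; a real watchdog would
        // run on its own daemon thread for the life of the process.
        for (int i = 0; i < 50; i++) {
            long before = System.nanoTime();
            Thread.sleep(intervalMillis);
            long elapsedMillis = (System.nanoTime() - before) / 1_000_000;

            long overshoot = overshootMillis(intervalMillis, elapsedMillis);
            if (isStall(overshoot, thresholdMillis)) {
                System.err.println("Possible stall: woke up " + overshoot + " ms late");
            }
        }
    }
}
```

If stalls reported by such a watchdog line up with the times of the ADS disconnects, that points towards a) rather than b).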


Dear @CTM,

This might occur when the application process is very busy and consumes so many resources that RFA is prevented from performing its administrative operations. This is known as the slow consumer problem. It usually occurs when the application subscribes to a large number of items or to items with a very high update rate, or when the callback method that processes item events contains time-consuming logic.
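A common way to keep a time-consuming callback from holding up the thread that delivers events is to hand each event to a queue and process it on a separate application thread. The sketch below shows that pattern with plain java.util.concurrent classes only. It is not RFA code: QueueHandoffSketch, onEvent and the 20 ms simulated processing time are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical hand-off pattern; not part of RFA. */
class QueueHandoffSketch {
    // Sentinel value that tells the worker thread to stop.
    private static final String STOP = "__STOP__";

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> processed = new ArrayList<>();
    private final Thread worker;

    QueueHandoffSketch() {
        worker = new Thread(this::drain, "app-dispatch");
        worker.start();
    }

    /** Stand-in for a callback on the delivering thread: enqueue and return at once. */
    void onEvent(String event) {
        queue.add(event);
    }

    /** Runs on the worker thread and does the slow work there. */
    private void drain() {
        try {
            while (true) {
                String event = queue.take();
                if (event.equals(STOP)) {
                    return;
                }
                Thread.sleep(20); // simulated time-consuming logic
                synchronized (processed) {
                    processed.add(event);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Stops the worker after it has drained earlier events and returns what it processed. */
    List<String> shutdownAndGetProcessed() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        synchronized (processed) {
            return new ArrayList<>(processed);
        }
    }

    public static void main(String[] args) {
        QueueHandoffSketch sketch = new QueueHandoffSketch();
        for (int i = 0; i < 5; i++) {
            sketch.onEvent("update-" + i); // returns immediately
        }
        System.out.println("processed " + sketch.shutdownAndGetProcessed());
    }
}
```

The unbounded queue here is a simplification: if the worker is consistently slower than the update rate, the backlog will keep growing, so a real application would also need to bound or conflate it.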

This issue can also occur with a very tight, impractical pingInterval value (such as pingInterval = 1 or 2 seconds).
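To illustrate why a small pingInterval leaves little headroom, the sketch below computes how long the client can go without sending a ping before the server gives up. It assumes, based on the question, that the ADS disconnects after 3 consecutive missed heartbeats and that the budget is simply the interval multiplied by that limit; the exact timeout rule used by the ADS may differ. The 30-second value is a hypothetical comparison point, not a recommended setting.

```java
/** Hypothetical arithmetic sketch; not part of RFA. */
class PingBudgetSketch {

    /** Approximate time the client can be stalled before the peer disconnects. */
    static long stallBudgetSeconds(long pingIntervalSeconds, int missedLimit) {
        return pingIntervalSeconds * missedLimit;
    }

    public static void main(String[] args) {
        int missedLimit = 3; // from the question: 3 consecutive missed heartbeats
        for (long interval : new long[] {1, 2, 30}) {
            System.out.println("pingInterval=" + interval + "s -> stall budget ~"
                    + stallBudgetSeconds(interval, missedLimit) + "s");
        }
    }
}
```

Under these assumptions, a pingInterval of 1 second tolerates only about 3 seconds of stall, which a single long GC pause or a busy callback could exceed.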


In addition, please let me know whether you would like this issue investigated. If so, I'll create a ticket for you.
