How to design a resilient RFA API application under an ERT structure?

Hi,

One of my clients is considering purchasing TR's ERT solution. The structure is shown below.

The client's question is how to design a resilient application. If the connection between the POP and the EED is lost, what kind of alert or status change will the application receive through the RFA API?

Is it correct that in this situation the physical RFA API connection is still active, because the application is still connected to the EED?

And how should the application decide to switch to the backup EED and establish a new connection? Does the application need to implement the failover logic itself, or can the RFA API provide a failover function that reconnects to another EED?

Thank you!

Best regards,

Jessie

structure.png (37.8 KiB)

Hi @jessie.lin

If the connection between POP and EED is disconnected, what kind of alert/status change will be received by APP through RFA API?

Answer: Basically, the application should get the "Waiting for service <SERVICE_NAME> UP. Item recovery in progress..." status message from the API.

The application can use the API's ServiceGroup and connectionList configurations to help it recover items. The purpose of a Service Group is to provide greater coverage and robustness to RFA clients. Consumer applications requesting an item from a service group may be served by any of the services included in it, whereas a request to a concrete service limits the offering to one service. A consumer may be connected to multiple providers and configure a service group that includes services from those providers.

Please see more detail in sections 13.10 "Service Groups" and 13.10.1 "Configuration" of the RFA Java 8.0.1 E3 Developer Guide.
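
To make the service group idea concrete, here is a minimal, self-contained Java sketch. It is only a model of the concept and does not use the RFA API: the class ServiceGroupModel, its methods, and the service names are hypothetical, and picking the first service that is up is a simplifying assumption rather than RFA's documented selection rule.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical model of a service group; not an RFA class. */
final class ServiceGroupModel {
    // Member services in configured order, mapped to their up/down state.
    private final Map<String, Boolean> serviceUp = new LinkedHashMap<>();

    void addService(String name, boolean up) {
        serviceUp.put(name, up);
    }

    void setState(String name, boolean up) {
        if (serviceUp.containsKey(name)) {
            serviceUp.put(name, up);
        }
    }

    /** Returns the first member service that is currently up, if any. */
    Optional<String> pickService() {
        for (Map.Entry<String, Boolean> e : serviceUp.entrySet()) {
            if (e.getValue()) {
                return Optional.of(e.getKey());
            }
        }
        return Optional.empty();
    }
}

final class ServiceGroupDemo {
    public static void main(String[] args) {
        ServiceGroupModel group = new ServiceGroupModel();
        group.addService("FEED_VIA_EED1", true);   // hypothetical service names
        group.addService("FEED_VIA_EED2", true);
        System.out.println(group.pickService());   // served by the first provider
        group.setState("FEED_VIA_EED1", false);    // e.g. POP-to-EED1 link lost
        System.out.println(group.pickService());   // request moves to the second provider
    }
}
```

A request addressed to a single concrete service would have no fallback in this model, which is the difference the answer describes.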


Hi Jessie,

Looking at your diagram, I assume the client application would not connect to the EED directly.

Normally, the client app would connect to a TREP system i.e. via an ADS (and ADH).

Both RFA and TREP support a variety of source mirroring / failover / hot standby / warm standby / recovery options - requiring the developer to do very little (if anything) at the Application level - with most of the recovery actions performed behind the scenes at the TREP and/or API level.

For a sensible / normal TREP configuration, the TREP system would fail over to the POP2 EED if POP1 failed, or vice versa. For example, if configured for Source Mirroring / Hot Standby, this could happen seamlessly without the application missing any updates / ticks.

RFA itself also supports things like the serverList configuration, which results in the API automatically connecting to the next server in the list if one server is not available or the connection is lost - no additional code required at the application level.


@Umer Nalla, thank you for the answer. The client's application connects directly to the EED; they have not considered TREP yet.

Regarding RFA's own failover feature: will RFA automatically connect to another EED server when the connection between the EED and the POP is lost? In this situation the physical connection between the EED and the application should still be available, but the items simply stop updating. Will the RFA API treat this as a server being unavailable or a connection being lost?

Thank you very much for your help!


Hi @jessie.lin,

My response was based on the (incorrect) assumption that the client app was using TREP, as a direct connection to an EED is generally quite rare.

For a direct connection, please refer to Wasin's answer, as it is the applicable one. As explained by Nipat, the RFA serverList recovery only applies when the connection between the app and the server (in this case the EED) is lost.

So, when using the ServiceGroup and connectionList scenario as advised by Wasin, the client should use serverList in the connection config, rather than hostName, to take advantage of connection-lost recovery too.


@Wasin Waeosri, thank you very much, especially for pointing out the relevant chapters in the developer guide. I have some questions and would appreciate your help:

What does "connection" mean here? If a service is down, does that mean the connection is down?

Or does "connection" mean the physical connection?

1. Chapter 13.10 introduces the failover of services (feed names), but if the connection between the EED and the POP is down, all services will be DOWN. Is that correct?

2. Chapter 13.11 covers Connection Recovery and Item Recovery. If Item Recovery finally ends with a CLOSED status, would that trigger Connection Recovery by RFA?

Or does the client application have to trigger the connection to another EED itself if an item receives no updates and cannot be recovered?

Thank you!


Hi @jessie.lin

Question 1: What does "connection" mean? Does it mean the physical connection?

Answer: The connection means a physical connection between the API and the Provider (TREP infrastructure, EED or OMM Provider application).

Question 2: If a service is down, does that mean the connection is down?

Answer: Basically, the connection between the API and the Provider remains alive when the service is down. However, if the EED/TREP infrastructure sets the "disconnectServiceDown" parameter to True, it will cut the connection between itself and the consumer (API) when all services are down.

Question 3: Chapter 13.10 introduces the failover of services (feed names), but if the connection between the EED and the POP is down, all services will be DOWN. Is that correct?

Answer: Yes.

@Wasin Waeosri, thank you. Could you indicate where on the EED/TREP infrastructure the "disconnectServiceDown" parameter is set to True?

This is really important because, if the connection is cut when all services are down, RFA will automatically establish a new connection to another EED. Thank you!


Question 4: Chapter 13.11 covers Connection Recovery and Item Recovery. If Item Recovery finally ends with a CLOSED status, would that trigger Connection Recovery by RFA?

Answer: According to section 13.11.2 "OMM Interfaces Support", if the stream state is OMMState.Stream.CLOSED and the data state is OMMState.Data.SUSPECT, the stream is closed and the API will not attempt to recover it.
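
As an illustration of how an application might branch on these states, here is a small, self-contained Java sketch. The enums and the ItemStatusPolicy class are hypothetical stand-ins, not the RFA OMMState classes, and treating an open stream with suspect data as "recovery in progress" is an assumption based on the status message quoted earlier in this thread.

```java
/** Hypothetical stand-ins for the stream and data states; not the RFA classes. */
enum StreamState { OPEN, CLOSED }

enum DataState { OK, SUSPECT }

enum ItemAction { USE_DATA, WAIT_FOR_API_RECOVERY, APPLICATION_MUST_ACT }

final class ItemStatusPolicy {
    /** Decides what the application should do for one item status. */
    static ItemAction decide(StreamState stream, DataState data) {
        if (stream == StreamState.CLOSED) {
            // Closed stream: the API will not recover it, so the application
            // has to decide what to do (for example, request the item again).
            return ItemAction.APPLICATION_MUST_ACT;
        }
        // Open stream: suspect data is assumed to mean the API is still recovering.
        return data == DataState.SUSPECT
                ? ItemAction.WAIT_FOR_API_RECOVERY
                : ItemAction.USE_DATA;
    }
}

final class ItemStatusPolicyDemo {
    public static void main(String[] args) {
        System.out.println(ItemStatusPolicy.decide(StreamState.OPEN, DataState.SUSPECT));
        System.out.println(ItemStatusPolicy.decide(StreamState.CLOSED, DataState.SUSPECT));
    }
}
```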


To @jessie.lin,

I'd like to add to @Umer Nalla's answer regarding the serverList parameter. RFA switches to the next server defined in serverList only when the current connection goes down, not when data becomes unavailable or a service goes down.

Here is the relevant passage from the RFA guide:

Failover only occurs when the connection to a source application breaks, and does not occur due to the rejection of any of these user interests.

If discovery mechanisms are needed, Thomson Reuters recommends that you use service groups, connectionlist(s), and/or user-defined discovery mechanisms.

When a connection to one of the redundant source applications fails, RFA attempts to connect to the next source application in the list.
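
The guide excerpt above mentions user-defined discovery mechanisms. One possible shape for such a mechanism, sketched under stated assumptions, is a watchdog that asks for a switch to another source once all required services have been down longer than a tolerance. The ServiceDownWatchdog class and its methods are hypothetical and not part of RFA; how the application learns the service states and how it actually switches sources is left out.

```java
/** Hypothetical application-level watchdog; not part of the RFA API. */
final class ServiceDownWatchdog {
    private final long toleranceMillis;
    private long allDownSince = -1;   // -1 while at least one service is up

    ServiceDownWatchdog(long toleranceMillis) {
        this.toleranceMillis = toleranceMillis;
    }

    /** Records the latest observation of the service states at time nowMillis. */
    void onServiceStates(boolean allServicesDown, long nowMillis) {
        if (!allServicesDown) {
            allDownSince = -1;            // something is up again: reset the timer
        } else if (allDownSince < 0) {
            allDownSince = nowMillis;     // first observation of "everything down"
        }
    }

    /** True once all services have been down for at least the tolerance. */
    boolean shouldSwitch(long nowMillis) {
        return allDownSince >= 0 && nowMillis - allDownSince >= toleranceMillis;
    }
}

final class ServiceDownWatchdogDemo {
    public static void main(String[] args) {
        ServiceDownWatchdog watchdog = new ServiceDownWatchdog(5_000);
        watchdog.onServiceStates(true, 1_000);
        System.out.println(watchdog.shouldSwitch(3_000));   // still within tolerance
        System.out.println(watchdog.shouldSwitch(6_000));   // tolerance exceeded
    }
}
```

Passing the clock value in as a parameter keeps the sketch deterministic and easy to test; a real application would supply its own time source.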

Flow chart of the general RFA recovery process: (image not reproduced here)

To demonstrate this behavior, I configured the serverList parameter with two servers (serverList : 172.20.33.43,192.168.27.46, as echoed in the application log at the end of this post).

With this serverList, the connection scenario after the application starts is similar to the following sequence of events:

1. RFA connects to the 172.20.33.43 machine first.

2. The connection to the first machine goes down temporarily.

3. RFA immediately attempts to connect to the second machine, 192.168.27.46.

4. The connection to the second machine also goes down temporarily.

5. RFA checks the maxRetryCount parameter (default is -1, meaning infinite) to see whether further retries are still allowed.

6. If they are, RFA waits around 15 seconds (the default of the connectRetryInterval configuration setting). Otherwise, RFA stops the connection recovery.

7. RFA then retries the first server in the list.

If the connection to the first machine goes down temporarily again, repeat from step 3.

In short, RFA keeps switching from the current machine to the next one and starts over, as follows:

172.20.33.43 -> 192.168.27.46 -> 172.20.33.43 -> 192.168.27.46 -> 172.20.33.43 -> 192.168.27.46 and so on..
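
The rotation described in the steps above can be modelled with a few lines of Java. This is a simulation only and does not call RFA: the ServerListRotation class is hypothetical, and counting one retry per full pass through the list is an assumption about how maxRetryCount is applied, not documented RFA behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of the serverList rotation; not RFA code. */
final class ServerListRotation {
    private final List<String> servers;
    private final int maxRetryCount;   // -1 means retry forever, as in the post
    private int index = 0;             // position of the server currently in use
    private int retries = 0;           // full passes through the list so far (assumption)

    ServerListRotation(List<String> servers, int maxRetryCount) {
        this.servers = new ArrayList<>(servers);
        this.maxRetryCount = maxRetryCount;
    }

    String current() {
        return servers.get(index);
    }

    /**
     * Called when the connection to the current server drops.
     * Returns the next server to try, or null when retries are exhausted.
     */
    String onConnectionDown() {
        index = (index + 1) % servers.size();
        if (index == 0) {   // wrapped around: every server has been tried
            if (maxRetryCount >= 0 && retries >= maxRetryCount) {
                return null;   // recovery stops
            }
            retries++;
            // A real client would wait for the retry interval here
            // (about 15 seconds in the post) before reconnecting.
        }
        return servers.get(index);
    }
}

final class ServerListRotationDemo {
    public static void main(String[] args) {
        ServerListRotation rotation = new ServerListRotation(
                Arrays.asList("172.20.33.43", "192.168.27.46"), -1);
        StringBuilder path = new StringBuilder(rotation.current());
        for (int i = 0; i < 3; i++) {
            path.append(" -> ").append(rotation.onConnectionDown());
        }
        System.out.println(path);
    }
}
```

In this model the wait happens only when the list wraps back to the first server, which matches the timestamps in the log below: the switch from the first to the second server is immediate, while the return to the first server comes about 15 seconds later.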

Finally, here is the application log from the test:

*****************************************************************************
*          Begin RFA Java StarterConsumer Program                           *
*****************************************************************************
*myNamespace.Connections.myConnection.traceMsgDomains : NORMAL
*myNamespace.Connections.myConnection.mountTrace : true
*myNamespace.Connections.myConnection.serverList : 172.20.33.43,192.168.27.46
*myNamespace.Connections.myConnection.logFileName : console
*myNamespace.Connections.myConnection.portNumber : 14002
*myNamespace.Connections.myConnection.connectionType : RSSL
*myNamespace.Connections.myConnection.ipcTraceFlags : 0
*myNamespace.Sessions.mySession.connectionList : myConnection


RFA Version:  8.0.0.L2.all.rrg
field dictionary read from RDMFieldDictionary file
enum dictionary read from enumtype.def file
LoginClient: Sending login request...
LoginClient.processEvent: Received Login Response... 
LoginClient: Received Login Response - MsgType.REFRESH_RESP
Jan 06, 2017 2:51:05 PM com.reuters.ipc.TraceLogger traceData
FINER: 
Thread: myNamespace::mySession Session EventQueueGroup
Connection 0
RSSL Transport attempt to connected to 172.20.33.43:14002


Jan 06, 2017 2:51:05 PM com.reuters.ipc.TraceLogger traceData
FINER: 
Thread: myNamespace::mySession Session EventQueueGroup
Connection 0
RSSL Transport connected to 172.20.33.43:14002


LoginClient: Receive an OMM_CONNECTION_EVENT
Name: myNamespace::myConnection
Status: { state: UP, code: NONE, text: ""}
Host: 172.20.33.43
Port: 14002
ComponentVersion: [ads3.0.2.L1.linux.tis.rrg 64-bit]
LoginClient.processEvent: Received Login Response... 
LoginClient: Received Login OK Response
Consumer Login successful...


Jan 06, 2017 3:37:01 PM com.reuters.ipc.TraceLogger traceData
FINER: 
Thread: myNamespace::mySession Session EventQueueGroup
Connection 0
RSSL Connection failed for 172.20.33.43: An existing connection was forcibly closed by the remote host


LoginClient: Receive an OMM_CONNECTION_EVENT
Name: myNamespace::myConnection
Status: { state: DOWN, code: NONE, text: "An existing connection was forcibly closed by the remote host"}
Host: 172.20.33.43
Port: 14002
ComponentVersion: [ads3.0.2.L1.linux.tis.rrg 64-bit]
LoginClient.processEvent: Received Login Response... 
LoginClient: Received Login Response - MsgType.STATUS_RESP
Jan 06, 2017 3:37:01 PM com.reuters.ipc.TraceLogger traceData
FINER: 
Thread: myNamespace::mySession Session EventQueueGroup
Connection 0
RSSL Transport attempt to connected to 192.168.27.46:14002


Jan 06, 2017 3:37:02 PM com.reuters.ipc.TraceLogger traceData
FINER: 
Thread: myNamespace::mySession Session EventQueueGroup
Connection 0
RSSL Transport connected to 192.168.27.46:14002


LoginClient: Receive an OMM_CONNECTION_EVENT
Name: myNamespace::myConnection
Status: { state: UP, code: NONE, text: ""}
Host: 192.168.27.46
Port: 14002
ComponentVersion: []


LoginClient.processEvent: Received Login Response... 
LoginClient: Received Login OK Response
Consumer Login successful...
Jan 06, 2017 3:37:06 PM com.reuters.ipc.TraceLogger traceData
FINER: 
Thread: myNamespace::mySession Session EventQueueGroup
Connection 0
RSSL Connection failed for 192.168.27.46: An existing connection was forcibly closed by the remote host


LoginClient: Receive an OMM_CONNECTION_EVENT
Name: myNamespace::myConnection
Status: { state: DOWN, code: NONE, text: "An existing connection was forcibly closed by the remote host"}
Host: 192.168.27.46
Port: 14002
ComponentVersion: []
LoginClient.processEvent: Received Login Response... 
LoginClient: Received Login Response - MsgType.STATUS_RESP
Jan 06, 2017 3:37:21 PM com.reuters.ipc.TraceLogger traceData
FINER: 
Thread: myNamespace::mySession Session EventQueueGroup
Connection 0
RSSL Transport attempt to connected to 172.20.33.43:14002


Jan 06, 2017 3:37:22 PM com.reuters.ipc.TraceLogger traceData
FINER: 
Thread: myNamespace::mySession Session EventQueueGroup
Connection 0
RSSL Transport connected to 172.20.33.43:14002


LoginClient: Receive an OMM_CONNECTION_EVENT
Name: myNamespace::myConnection
Status: { state: UP, code: NONE, text: ""}
Host: 172.20.33.43
Port: 14002
ComponentVersion: [ads3.0.2.L1.linux.tis.rrg 64-bit]


LoginClient.processEvent: Received Login Response... 
LoginClient: Received Login OK Response
Consumer Login successful...
Jan 06, 2017 3:37:25 PM com.reuters.ipc.TraceLogger traceData
FINER: 
Thread: myNamespace::mySession Session EventQueueGroup
Connection 0
RSSL Connection failed for 172.20.33.43: An existing connection was forcibly closed by the remote host


LoginClient: Receive an OMM_CONNECTION_EVENT
Name: myNamespace::myConnection
Status: { state: DOWN, code: NONE, text: "An existing connection was forcibly closed by the remote host"}
Host: 172.20.33.43
Port: 14002
ComponentVersion: [ads3.0.2.L1.linux.tis.rrg 64-bit]
Jan 06, 2017 3:37:25 PM com.reuters.ipc.TraceLogger traceData
FINER: 
Thread: myNamespace::mySession Session EventQueueGroup
Connection 0
RSSL Transport attempt to connected to 192.168.27.46:14002


LoginClient.processEvent: Received Login Response... 
LoginClient: Received Login Response - MsgType.STATUS_RESP
Jan 06, 2017 3:37:25 PM com.reuters.ipc.TraceLogger traceData
FINER: 
Thread: myNamespace::mySession Session EventQueueGroup
Connection 0
RSSL Transport connected to 192.168.27.46:14002


LoginClient: Receive an OMM_CONNECTION_EVENT
Name: myNamespace::myConnection
Status: { state: UP, code: NONE, text: ""}
Host: 192.168.27.46
Port: 14002
ComponentVersion: []
LoginClient.processEvent: Received Login Response... 
LoginClient: Received Login OK Response
Consumer Login successful...


Jan 06, 2017 3:37:29 PM com.reuters.ipc.TraceLogger traceData
FINER: 
Thread: myNamespace::mySession Session EventQueueGroup
Connection 0
RSSL Connection failed for 192.168.27.46: An existing connection was forcibly closed by the remote host


LoginClient: Receive an OMM_CONNECTION_EVENT
Name: myNamespace::myConnection
Status: { state: DOWN, code: NONE, text: "An existing connection was forcibly closed by the remote host"}
Host: 192.168.27.46
Port: 14002
ComponentVersion: []
LoginClient.processEvent: Received Login Response... 
LoginClient: Received Login Response - MsgType.STATUS_RESP
Jan 06, 2017 3:37:44 PM com.reuters.ipc.TraceLogger traceData
FINER: 
Thread: myNamespace::mySession Session EventQueueGroup
Connection 0
RSSL Transport attempt to connected to 172.20.33.43:14002


Jan 06, 2017 3:37:44 PM com.reuters.ipc.TraceLogger traceData
FINER: 
Thread: myNamespace::mySession Session EventQueueGroup
Connection 0
RSSL Transport connected to 172.20.33.43:14002


LoginClient: Receive an OMM_CONNECTION_EVENT
Name: myNamespace::myConnection
Status: { state: UP, code: NONE, text: ""}
Host: 172.20.33.43
Port: 14002
ComponentVersion: [ads3.0.2.L1.linux.tis.rrg 64-bit]


LoginClient.processEvent: Received Login Response... 
LoginClient: Received Login OK Response
Consumer Login successful...


@Wasin Waeosri, regarding your answer to:

Question 2: If a service is down, does that mean the connection is down?

Answer: Basically, the connection between the API and the Provider remains alive when the service is down. However, if the EED/TREP infrastructure sets the "disconnectServiceDown" parameter to True, it will cut the connection between itself and the consumer (API) when all services are down.

Follow-up question: could you indicate where on the EED/TREP infrastructure the "disconnectServiceDown" parameter is set to True?

This is really important because, if the connection is cut when all services are down, RFA will automatically establish a new connection to another EED. Thank you!
