question


In RDP JSON payload, what is "payloadVersion" and where is it documented?

We're using the «News Service on Refinitiv Data Platform - User and Design Guide, v2.0» manual to implement our news-handling service (which obtains news stories via alerts pushed to an AWS SQS queue), and there is a problem with versioning of the format of the responses.

By default, if we subscribe to news alerts without specifying the "payloadVersion" field in the top-level object of the JSON document that forms the subscription request's payload, the service pushes news alerts whose bodies contain the field "payloadVersion" with the value "1.0" in their top-level JSON object.
We have figured out we can affect this version by setting the field "payloadVersion" in the subscription requests — for instance, if we specify "payloadVersion": "2.0" when subscribing, we'll receive news alerts containing "payloadVersion" set to "2.0" in their payload, and the payload indeed apparently differs from that of the version 1.0.
Unfortunately, the "payloadVersion" field is not documented in the manual we're using for implementation.
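
For illustration, here is a minimal Python sketch of building a subscription body so that the version is always stated explicitly. The field names "transport", "transportType" and "filter" follow the subscription syntax quoted by the product team in this thread; the helper name `build_subscription_body` and the empty filter are hypothetical placeholders, not part of any documented API.

```python
import json
from typing import Any, Dict, Optional


def build_subscription_body(
    filter_spec: Dict[str, Any],
    payload_version: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the JSON body of a news alert subscription request.

    Assumption: the body has "transport" and "filter" members as quoted in
    this thread; the contents of "filter" are supplied by the caller.
    """
    body: Dict[str, Any] = {
        "transport": {"transportType": "AWS-SQS"},
        "filter": filter_spec,
    }
    # Only add the field when a version is requested, so that the
    # service-side default can still be observed by omitting it.
    if payload_version is not None:
        body["payloadVersion"] = payload_version
    return body


# Hypothetical usage: an empty filter stands in for a real one.
print(json.dumps(build_subscription_body({}, payload_version="2.0"), indent=2))
```

Keeping the version as an explicit argument makes it easy to compare what the service returns with and without the field.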

So, the questions are:

- What are the semantics of the "payloadVersion" field?

- Where are its different versions documented? How do they affect the schema of the JSON payloads of the news alerts?


- Is there a standard way to get notified when a new payload version becomes available for use?

- Is there any process of deprecation of payload versions and accompanying notifications in advance? (In other words, if payload version "1.0" ever becomes deprecated, we'd like to get notified about such an upcoming change proactively).


Accepted

Hello @kostix and all,

Please find below the information provided by the RDP News product team in response to your question (some of it is planned to be made available via API Playground as part of the Reference Guide):

payloadVersion can be specified in the subscription request when subscribing:

{
  "transport": {
    "transportType": "AWS-SQS"
  },
  "filter": { … },
  "payloadVersion": "1.0" or "2.0"
}
  • The default value is 2.0, so if the user doesn't specify the option, it will be 2.0.
The alert pushed to the user's queue then contains the payloadVersion and the payload. For 2.0 it looks like this:
{
  "data": {
    "attributes": [
      {
        "domain": {
          "type": "string",
          "value": "headline"
        }
      }
    ],
    "envelopeVersion": "1.0",
    "ecpMessageID": "urn:newsml:reuters.com:20171030:nTOPLAT:57705",
    "sourceSeqNo": 1127927,
    "distributionSeqNo": 1,
    "sourceTimestamp": "2021-04-09T04:40:33.000Z",
    "distributionTimestamp": "2021-04-09T04:40:45.883Z",
    "payloadVersion": "2.0",
    "subscriptionID": "551bc6ba-1991-42ab-9c17-39c4ed009e9f",
    "payload": {
      "newsItem": [ ]
    }
  }
}
  • For 1.0 it looks like this:
{
  "data": {
    "attributes": [
      {
        "domain": {
          "type": "string",
          "value": "headline"
        }
      }
    ],
    "envelopeVersion": "1.0",
    "ecpMessageID": "urn:newsml:reuters.com:20171030:nTOPLAT:57705",
    "sourceSeqNo": 1127927,
    "distributionSeqNo": 1,
    "sourceTimestamp": "2021-04-09T04:40:33.000Z",
    "distributionTimestamp": "2021-04-09T04:40:45.883Z",
    "payloadVersion": "1.0",
    "subscriptionID": "551bc6ba-1991-42ab-9c17-39c4ed009e9f",
    "payload": {
      "newsMessage": {
        "header": { … },
        "itemSet": {
          "newsItem": [ ]
        }
      }
    }
  }
}
  • The key difference between 1.0 and 2.0 is in the "payload": in 1.0 it includes the newsMessage, which contains the header, whereas 2.0 outputs just the newsItem without the header. We cut over from 1.0 to 2.0 to align with request/response, and because the RSF header should be invisible to the end user.
  • That is why a default of 2.0 is expected: we no longer want streaming alerts to diverge from R/R.
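
As an illustration of the difference described above, a consumer could locate the news items by branching on the version. This is only a sketch based on the two sample alerts quoted in this answer: the function name `extract_news_items` and the "_guid" values in the usage lines are hypothetical, and accepting a bare object in place of an array is a defensive assumption rather than documented behaviour.

```python
from typing import Any, Dict, List


def extract_news_items(alert: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the news items of an alert as a list, for payloadVersion 1.0 or 2.0."""
    data = alert["data"]
    version = data.get("payloadVersion")
    payload = data["payload"]

    if version == "1.0":
        # 1.0 wraps the items in newsMessage -> itemSet (alongside the header).
        items = payload["newsMessage"]["itemSet"]["newsItem"]
    elif version == "2.0":
        # 2.0 exposes newsItem directly under payload, without the header.
        items = payload["newsItem"]
    else:
        raise ValueError(f"unsupported payloadVersion: {version!r}")

    # Defensive assumption: tolerate a single object as well as an array.
    return [items] if isinstance(items, dict) else list(items)


# Hypothetical, heavily trimmed alerts for demonstration only.
alert_v1 = {"data": {"payloadVersion": "1.0", "payload": {
    "newsMessage": {"header": {}, "itemSet": {"newsItem": [{"_guid": "a"}]}}}}}
alert_v2 = {"data": {"payloadVersion": "2.0", "payload": {"newsItem": [{"_guid": "b"}]}}}

print(extract_news_items(alert_v1))
print(extract_news_items(alert_v2))
```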



Hello @kostix,

As neither the RDP News User and Design Guide nor the Alerts Reference Guide on API Playground describes submitting payloadVersion as part of the request, I believe that it is not intended to be submitted.

Further, the deployed versions of the service on API Playground are beta1 and v1. I would therefore suppose that the default version returned, 1.0, is the latest released, and that 2.0 is the next one, not yet made available.

In the RDP News User and Design Guide I find a description of "Payload Version", but no means provided for the request to drive its selection. As I understand it, the version is intended to be selected by the service as the latest released.

I would not recommend submitting the payloadVersion parameter as part of the request.

However, I will try to verify with the product team whether any additional information can be made available.


The problem with not submitting the "payloadVersion" field is that we'd like the schema of the responses RDP services send to us to be stable. That is, no matter how many revisions of the schema are deployed, we'd like the schema of the documents we receive to be fixed.

Of course, since the semantics of this field are not documented, the behaviour could be anything: say, the system processing our subscription request could treat the version we claim with "payloadVersion" as "no less than" and eventually start to send us documents formatted according to some later version of the schema.

It is fine to require us to eventually upgrade to a later version, but this should be a slow multi-stage process, with each schema version going through several deprecation levels and with warnings communicated to the users.
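
Until the semantics are documented, one way to detect such a silent change on the consumer side is to compare the version in every received alert with the one requested at subscription time. A minimal sketch follows; the names `EXPECTED_PAYLOAD_VERSION`, `UnexpectedPayloadVersion` and `check_payload_version` are hypothetical, and the location of "payloadVersion" under "data" is taken from the sample alerts in this thread.

```python
from typing import Any, Dict

# Assumption: this is the version that was pinned in the subscription request.
EXPECTED_PAYLOAD_VERSION = "1.0"


class UnexpectedPayloadVersion(Exception):
    """Raised when an alert carries a payloadVersion other than the pinned one."""


def check_payload_version(
    alert: Dict[str, Any],
    expected: str = EXPECTED_PAYLOAD_VERSION,
) -> Dict[str, Any]:
    """Return the alert unchanged if its payloadVersion matches, else raise."""
    actual = alert.get("data", {}).get("payloadVersion")
    if actual != expected:
        raise UnexpectedPayloadVersion(
            f"expected payloadVersion {expected!r}, got {actual!r}"
        )
    return alert
```

Failing loudly here means a schema change shows up as an explicit error instead of as a parsing failure somewhere deeper in the pipeline.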

So yes, it would be awesome if you could figure these things out for us.


@zoya.farberov, thanks for the explanation!

Can I refine my question a bit more?

Should we expect that the newsItem element (with either payloadVersion value) may contain more than a single element? If yes, what are the semantics of this, given that a single news alert pushed to an AWS SQS queue (or returned in a single R&R API exchange) ostensibly contains a single news story (for instance, see [1])?

In other words, can a single news story contain multiple "news items"? If yes, what would it mean to have multiple news items per single news story? What would the term "item" mean then?

Or should I ask a separate question for this?


And while we're at it: in our testing environment we observe that subscribing to news story alerts via RDP without specifying payloadVersion sends us alerts formatted using payloadVersion 1.0, not 2.0. That is, the defaults appear to be different, at least in the testing environment.

1. https://community.developers.refinitiv.com/answers/75990/view.html


Hello @kostix,

Per the RDP News User and Design Guide:

"4.4.2. Item Set
In EDP, Item Set is one of the following:

• News Item (newsItem): A simple item of type Text.

• Package Item (packageItem): A set of News Items item references to other items and/or resources. ..."

That is, either a single newsItem or multiple packaged newsItems.

It is indeed better to post a separate topic as a new question next time: this gives the question better visibility and searchability, and including links to the previous related questions ties the relevant discussions and content together.

From my testing, as of now, payloadVersion is 1.0


My English may fail me (I'm not a native speaker), but I cannot parse the wording of your citation unambiguously, and unfortunately it has confused me even more.

Here's why: in JSON payloads containing news stories, an "itemSet" is a JSON object in which "newsItem" and "packageItem" are fields. I parse the statement «Item Set is one of the following: …» as meaning that an "itemSet" object can contain either a single "newsItem" field or a single "packageItem" field, but not both in a single object, and also that a single "itemSet" object cannot contain multiple "newsItem" fields or multiple "packageItem" fields (per the JSON spec, objects with multiple same-named fields are possible, though the semantics of handling them are unspecified and left to the implementation that parses such objects).

So this is the first question: is my interpretation correct (an object in the "itemSet" field may contain either a single "newsItem" field or a single "packageItem" field, but not both)?

Hello @khomutov,

My understanding of this section of the guide is the same as yours: either a single news item, or a single package item that can contain multiple news item references.

The second question is about the contents of the "newsItem" field. It is defined to be a JSON array, and arrays can contain any number of elements (including zero), and my question is about the contents of that "newsItem" array: since the JSON schema allows any number of elements there, it's technically possible to have there any number of objects (or other values). If we suppose there can only sensibly appear JSON objects describing individual news items, the question still remains: how many such objects can we expect to find there in the array which is the value of the "newsItem" field?

In an attempt to better explain my question, here's an example from the EDP guide you have referred to (from "Appendix 1"):


"itemSet": {
  "newsItem": [
    {
      "_conformance": ...,
      "_guid": ...,
      ...
    }
  ]
}

The question is basically about the fact that the value of a "newsItem" field is an array, and I fail to find any mention of whether we should expect more than a single object in there when parsing.


Hello @kostix,

I think this basic definition stems from IPTC NewsML-G2 Guidelines.

It is, in a general case, possible to have "the child elements of <itemSet> be any number of <newsItem>, <packageItem>, <conceptItem>, and/or <knowledgeItem> components, in any combination, in any order.".

At present, per RDP News, it is either one newsItem or one packageItem. But as the NewsML spec allows for multiples, the spec is being observed in the structure definition.
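
A parser that wants to rely on this "either one or the other" reading can check it explicitly rather than assume it. The following Python sketch does that for an "itemSet" object; the function name `item_set_kind` is hypothetical, and the rule it enforces is the current RDP News behaviour as described in this thread, not a documented guarantee.

```python
from typing import Any, Dict

# The two member names RDP News is said to use inside an itemSet.
ITEM_KINDS = ("newsItem", "packageItem")


def item_set_kind(item_set: Dict[str, Any]) -> str:
    """Return which of newsItem / packageItem the itemSet carries.

    Raises ValueError if neither or both are present, since that would
    contradict the behaviour described for RDP News at present.
    """
    present = [kind for kind in ITEM_KINDS if kind in item_set]
    if len(present) != 1:
        raise ValueError(f"expected exactly one of {ITEM_KINDS}, found {present}")
    return present[0]
```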

I am going to convert these comments to answers, to allow for the visibility of this information.


That's fine!
Ideally, I would also like my second question to be answered, if possible: the one regarding the value of "newsItem" being a JSON array.

Hello @kostix,

The format per the NewsML spec (please see the link to the spec documentation above) is an array.

The RDP service populates it per the RDP News User Guide (see the link above for the specifics).

Does this help?

No, it unfortunately does not. The RDP News User Guide says:

In RDP, Item Set is one of the following:
• News Item (newsItem): A simple item of type Text.
• Package Item (packageItem): A set of references to other News Items and/or resources.
The following table describes the data fields under a newsItem or packageItem

(Emphasis mine.)

...and it's clear that this claim is either false or misformulated (I believe it's the latter), as we both know a newsItem is an array, not just "a simple item of type Text". The Guide then goes on to present a bunch of examples without giving any strict definition of the data format. Hence my question.


Hello @kostix,

Please find further clarification from the product team:

"

a/ RSF is based on the external G2 standard; G2 supports a newsMessage with a contentSet; the contentSet supports 1..N items.

b/ RSF follows the above, but for product reasons only supports a contentSet with 1..N [newsItem or packageItem ]

c/ Currently our text items only have a single newsItem per newsMessage, so that (correctly) looks like a newsItem array with a single entry.

d/ The item array is to allow for other potential product groupings which may be needed in the future - e.g. Reuters has a ‘package’ which has a packageItem + 1..N newsItems.

---

payloadVersion 1.0 (to be deprecated): both newsItem[] and packageItem[] are arrays, even if there is only 1 element in the array

payloadVersion 2.0 : only newsItem[0] returned as an Object. For packageItem inside the contentSet

We are to return the “only” one object to payload 2.0 to align with Request/Response.

{
  "data": {
    "attributes": [
      {
        "domain": {
          "type": "string",
          "value": "headline"
        }
      }
    ],
    "envelopeVersion": "1.0",
    "ecpMessageID": "urn:newsml:reuters.com:20171030:nTOPLAT:57705",
    "sourceSeqNo": 1127927,
    "distributionSeqNo": 1,
    "sourceTimestamp": "2021-04-09T04:40:33.000Z",
    "distributionTimestamp": "2021-04-09T04:40:45.883Z",
    "payloadVersion": "2.0",
    "subscriptionID": "551bc6ba-1991-42ab-9c17-39c4ed009e9f",
    "payload": {
      "newsItem": {}
    }
  }
}


"

I hope this information is of help
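
Given that "newsItem" is an array in 1.0 and a single object in 2.0, and that text alerts currently carry exactly one news item, client code could normalise both shapes and verify the count. A minimal sketch under those assumptions (the function name `sole_news_item` is hypothetical):

```python
from typing import Any, Dict, List, Union


def sole_news_item(
    news_item: Union[Dict[str, Any], List[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Return the single news item, whether given as an object or as an array.

    Raises ValueError when the array does not hold exactly one entry, which
    would indicate a grouping this code was not written to handle.
    """
    items = [news_item] if isinstance(news_item, dict) else list(news_item)
    if len(items) != 1:
        raise ValueError(f"expected exactly one news item, got {len(items)}")
    return items[0]
```

Raising on anything other than one entry keeps the "single newsItem" assumption visible in the code, so a future multi-item grouping would be noticed rather than silently truncated.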


@zoya.farberov, thank you! I appreciate your assistance.

After discussing this new information with my team, I would like to ask for two further clarifications, if possible.

What is RSF? Is it «Reuters Strategic Format (RSF), which is a Reuters-specific representation of NewsML-G2.» — to quote this guide (EDP)?

The citation from the product team first states that the item array of newsItem may be used in the future, for instance to implement packageItem (which, as I interpret the citation, is currently not implemented). It then goes on to state that the payloadVersion 1.0 format, which supports such a data schema, is to be deprecated, and that with payloadVersion 2.0 it's only possible to return a single newsItem in a single document. What would be the way to present that «a packageItem + 1..N newsItems» combination in a document formatted according to the payloadVersion 2.0 specification? Or is it something to be decided in the future and published in a revision of the RDP documentation?
In the latter case, how are we (I mean my employer) supposed to get notified about new revisions of the RDP data formats / API specs?
