Using the New Entity Objects

As of November 2020, the entity object present in the story object is being updated to improve the experience of News API users, with data backfilled to August 27th, 2020. The new object provides greater enrichment, enables enhanced search, and leverages a knowledge base that will be updated frequently going forward.

If you were using entities in your workflow before this date (either as search parameters or in the story objects returned to you), you will need to update your workflow to use the new objects. This page walks through what you need to do to move from the old entity object to the new one.

Old vs New Entity Objects

The main changes from the old entity object are:

  • Wikipedia and Wikidata links have replaced DBpedia links
  • Entity types have been refined and updated
  • Sentiment is now predicted for every entity
  • Stock tickers have been added to the entity object (where applicable)

Old entity object:

{
    'indices': [
        [0, 4]
        ],
    'links': {'dbpedia': 'http://dbpedia.org/resource/Apple_Inc.'},
    'text': 'Apple',
    'types': [
        'Organisation', 
        'Company', 
        'Agent'
        ]
}

New entity object:

{
    "body": [
        {
            "id": "Q2283",
            "links": {
                "wikipedia": "https://en.wikipedia.org/wiki/Microsoft",
                "wikidata": "https://www.wikidata.org/wiki/Q2283"
            },
            "types": [
                "Organization",
                "Business",
                "Software_company"
            ],
            "sentiment": {
                "polarity": "neutral",
                "confidence": 0.68
            },
            "surface_forms": [
                {
                    "text": "Microsoft",
                    "indices": [[3, 11], [110, 118]]
                },
                {
                    "text": "microsoft",
                    "indices": [[72, 79]]
                }
            ],
            "stock_ticker": "MSFT"
        }
    ]
}
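
As a rough illustration of how the new structure can be read, the Python sketch below walks the example object above and summarises each entity. The helper name summarize_entities is hypothetical and not part of any SDK; the field names are taken from the example, and stock_ticker is read defensively because it is only present where applicable.

```python
# Sample copied from the new entity object shown above.
story_entities = {
    "body": [
        {
            "id": "Q2283",
            "links": {
                "wikipedia": "https://en.wikipedia.org/wiki/Microsoft",
                "wikidata": "https://www.wikidata.org/wiki/Q2283",
            },
            "types": ["Organization", "Business", "Software_company"],
            "sentiment": {"polarity": "neutral", "confidence": 0.68},
            "surface_forms": [
                {"text": "Microsoft", "indices": [[3, 11], [110, 118]]},
                {"text": "microsoft", "indices": [[72, 79]]},
            ],
            "stock_ticker": "MSFT",
        }
    ]
}


def summarize_entities(entities, section="body"):
    """Hypothetical helper: return one summary dict per entity in a section."""
    summaries = []
    for entity in entities.get(section, []):
        surface_forms = entity.get("surface_forms", [])
        # Each surface form carries its own list of [start, end] index pairs.
        mentions = sum(len(form.get("indices", [])) for form in surface_forms)
        summaries.append(
            {
                "id": entity.get("id"),
                "wikipedia": entity.get("links", {}).get("wikipedia"),
                "stock_ticker": entity.get("stock_ticker"),  # None if absent
                "polarity": entity.get("sentiment", {}).get("polarity"),
                "mentions": mentions,
            }
        )
    return summaries


summaries = summarize_entities(story_entities)
print(summaries)
```

Note that, unlike the old object, mentions are grouped per surface form, so the total number of mentions is the sum of the index pairs across all surface forms.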

New Parameters to Leverage New Entity Data

Because the entity object contains new fields, there are new parameters for filtering stories by them. The same parameters are listed below in several client syntaxes.

"entities_title_id": ["Q2283"] ## returns stories that mention Microsoft in the title, where that entity has been tagged with its ID
"entities_body_id": ["Q2283"] ## returns stories that mention Microsoft in the body, where that entity has been tagged with its ID
"entities_title_stock_ticker": ["MSFT"] ## returns stories that mention Microsoft in the title, where that entity has been tagged with its stock ticker
"entities_body_stock_ticker": ["MSFT"] ## returns stories that mention Microsoft in the body, where that entity has been tagged with its stock ticker
"entities_title_links_wikipedia": ["https://en.wikipedia.org/wiki/Microsoft"] ## returns stories that mention Microsoft in the title, where that entity has been tagged with its appropriate Wikipedia link
"entities_body_links_wikipedia": ["https://en.wikipedia.org/wiki/Microsoft"] ## returns stories that mention Microsoft in the body, where that entity has been tagged with its appropriate Wikipedia link
"entities_title_links_wikidata": ["https://www.wikidata.org/wiki/Q2283"] ## returns stories that mention Microsoft in the title, where that entity has been tagged with its appropriate Wikidata link
"entities_body_links_wikidata": ["https://www.wikidata.org/wiki/Q2283"] ## returns stories that mention Microsoft in the body, where that entity has been tagged with its appropriate Wikidata link
"entities_title_surface_forms_text": ["Microsoft"] ## returns stories that mention Microsoft in the title, where that entity has been tagged with the surface form 'Microsoft'
"entities_body_surface_forms_text": ["Microsoft"] ## returns stories that mention Microsoft in the body, where that entity has been tagged with the surface form 'Microsoft'
entities_title_id: ["Q2283"] ## returns stories that mention Microsoft in the title, where that entity has been tagged with its ID
entities_body_id: ["Q2283"] ## returns stories that mention Microsoft in the body, where that entity has been tagged with its ID
entities_title_stock_ticker: ["MSFT"] ## returns stories that mention Microsoft in the title, where that entity has been tagged with its stock ticker
entities_body_stock_ticker: ["MSFT"] ## returns stories that mention Microsoft in the body, where that entity has been tagged with its stock ticker
entities_title_links_wikipedia: ["https://en.wikipedia.org/wiki/Microsoft"] ## returns stories that mention Microsoft in the title, where that entity has been tagged with its appropriate Wikipedia link
entities_body_links_wikipedia: ["https://en.wikipedia.org/wiki/Microsoft"] ## returns stories that mention Microsoft in the body, where that entity has been tagged with its appropriate Wikipedia link
entities_title_links_wikidata: ["https://www.wikidata.org/wiki/Q2283"] ## returns stories that mention Microsoft in the title, where that entity has been tagged with its appropriate Wikidata link
entities_body_links_wikidata: ["https://www.wikidata.org/wiki/Q2283"] ## returns stories that mention Microsoft in the body, where that entity has been tagged with its appropriate Wikidata link
entities_title_surface_forms_text: ["Microsoft"] ## returns stories that mention Microsoft in the title, where that entity has been tagged with the surface form 'Microsoft'
entities_body_surface_forms_text: ["Microsoft"] ## returns stories that mention Microsoft in the body, where that entity has been tagged with the surface form 'Microsoft'
EntitiesTitleId: optional.NewInterface([]string{"Q2283"}) // returns stories that mention Microsoft in the title, where that entity has been tagged with its ID
EntitiesBodyId: optional.NewInterface([]string{"Q2283"}) // returns stories that mention Microsoft in the body, where that entity has been tagged with its ID
EntitiesTitleStockTicker: optional.NewInterface([]string{"MSFT"}) // returns stories that mention Microsoft in the title, where that entity has been tagged with its stock ticker
EntitiesBodyStockTicker: optional.NewInterface([]string{"MSFT"}) // returns stories that mention Microsoft in the body, where that entity has been tagged with its stock ticker
EntitiesTitleLinksWikipedia: optional.NewInterface([]string{"https://en.wikipedia.org/wiki/Microsoft"}) // returns stories that mention Microsoft in the title, where that entity has been tagged with its appropriate Wikipedia link
EntitiesBodyLinksWikipedia: optional.NewInterface([]string{"https://en.wikipedia.org/wiki/Microsoft"}) // returns stories that mention Microsoft in the body, where that entity has been tagged with its appropriate Wikipedia link
EntitiesTitleLinksWikidata: optional.NewInterface([]string{"https://www.wikidata.org/wiki/Q2283"}) // returns stories that mention Microsoft in the title, where that entity has been tagged with its appropriate Wikidata link
EntitiesBodyLinksWikidata: optional.NewInterface([]string{"https://www.wikidata.org/wiki/Q2283"}) // returns stories that mention Microsoft in the body, where that entity has been tagged with its appropriate Wikidata link
EntitiesTitleSurfaceFormsText: optional.NewInterface([]string{"Microsoft"}) // returns stories that mention Microsoft in the title, where that entity has been tagged with the surface form 'Microsoft'
EntitiesBodySurfaceFormsText: optional.NewInterface([]string{"Microsoft"}) // returns stories that mention Microsoft in the body, where that entity has been tagged with the surface form 'Microsoft'
:entities_title_id => ['Q2283'] # returns stories that mention Microsoft in the title, where that entity has been tagged with its ID
:entities_body_id => ['Q2283'] # returns stories that mention Microsoft in the body, where that entity has been tagged with its ID
:entities_title_stock_ticker => ['MSFT'] # returns stories that mention Microsoft in the title, where that entity has been tagged with its stock ticker
:entities_body_stock_ticker => ['MSFT'] # returns stories that mention Microsoft in the body, where that entity has been tagged with its stock ticker
:entities_title_links_wikipedia => ['https://en.wikipedia.org/wiki/Microsoft'] # returns stories that mention Microsoft in the title, where that entity has been tagged with its appropriate Wikipedia link
:entities_body_links_wikipedia => ['https://en.wikipedia.org/wiki/Microsoft'] # returns stories that mention Microsoft in the body, where that entity has been tagged with its appropriate Wikipedia link
:entities_title_links_wikidata => ['https://www.wikidata.org/wiki/Q2283'] # returns stories that mention Microsoft in the title, where that entity has been tagged with its appropriate Wikidata link
:entities_body_links_wikidata => ['https://www.wikidata.org/wiki/Q2283'] # returns stories that mention Microsoft in the body, where that entity has been tagged with its appropriate Wikidata link
:entities_title_surface_forms_text => ['Microsoft'] # returns stories that mention Microsoft in the title, where that entity has been tagged with the surface form 'Microsoft'
:entities_body_surface_forms_text => ['Microsoft'] # returns stories that mention Microsoft in the body, where that entity has been tagged with the surface form 'Microsoft'
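
Since the parameter names above follow a regular entities_<section>_<field> pattern, they can be generated rather than typed by hand. The Python sketch below is one way to do that. The helper entity_filter is hypothetical, and how the resulting dictionary is passed to a client depends on the SDK in use, which is not shown here.

```python
# Field suffixes and sections copied from the parameter list above.
ENTITY_FIELDS = {
    "id",
    "stock_ticker",
    "links_wikipedia",
    "links_wikidata",
    "surface_forms_text",
}
SECTIONS = {"title", "body"}


def entity_filter(section, field, values):
    """Hypothetical helper: build one entities_<section>_<field> parameter."""
    if section not in SECTIONS:
        raise ValueError(f"section must be one of {sorted(SECTIONS)}")
    if field not in ENTITY_FIELDS:
        raise ValueError(f"field must be one of {sorted(ENTITY_FIELDS)}")
    # Copy the values so later changes to the caller's list do not leak in.
    return {f"entities_{section}_{field}": list(values)}


params = entity_filter("body", "stock_ticker", ["MSFT"])
print(params)
```

Validating the section and field up front means a misspelled name such as "surface_form_text" fails immediately instead of producing a parameter the search does not recognise.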

Updating your DBpedia searches to use Wikipedia instead

The new entity objects contain Wikipedia links instead of DBpedia links. To update your workflow to search by Wikipedia links instead of DBpedia ones, you will need to update both the parameter name and the links you are searching.

Old parameters & values:

"entities.title.links.dbpedia[]": ["http://dbpedia.org/resource/Donald_Trump"]
"entities.body.links.dbpedia[]": ["http://dbpedia.org/resource/Donald_Trump"]

New parameters & values:

"entities.title.links.wikipedia[]": ["https://en.wikipedia.org/wiki/Donald_Trump"]
"entities.body.links.wikipedia[]": ["https://en.wikipedia.org/wiki/Donald_Trump"]

Note that although most DBpedia URLs map accurately to Wikipedia ones by simply replacing http://dbpedia.org/resource/ with https://en.wikipedia.org/wiki/, some do not. We recommend re-running the entity searches you currently make with DBpedia links using the equivalent Wikipedia links. If you notice any discrepancy in the results returned for an entity, check Wikipedia for that entity's correct URL.
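
The prefix replacement described above can be sketched in a few lines of Python. The function name dbpedia_to_wikipedia is hypothetical; it only rewrites the prefix, so its output should still be checked against Wikipedia for the entities where the simple mapping does not hold.

```python
DBPEDIA_PREFIX = "http://dbpedia.org/resource/"
WIKIPEDIA_PREFIX = "https://en.wikipedia.org/wiki/"


def dbpedia_to_wikipedia(url):
    """Hypothetical helper: swap the DBpedia prefix for the Wikipedia one."""
    if not url.startswith(DBPEDIA_PREFIX):
        raise ValueError(f"not a DBpedia resource URL: {url}")
    # Keep the resource name unchanged and attach it to the new prefix.
    return WIKIPEDIA_PREFIX + url[len(DBPEDIA_PREFIX):]


old_links = ["http://dbpedia.org/resource/Donald_Trump"]
new_links = [dbpedia_to_wikipedia(link) for link in old_links]
print(new_links)
```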

Updating the entity types being searched

The new entity model applies more refined type data to the entities it recognises. Although conceptually similar, the types in the new entity object are slightly different. This is because Wikidata is now leveraged under the hood instead of DBpedia.

To get a list of the most-mentioned entity types that are relevant to your use case, call the Trends endpoint and supply "entities.body.types" to the "field" parameter. For example, to see the most-mentioned entity types in stories about "Trump", run this query:

"https://api.aylien.com/news/trends?field=entities.body.type&title=Trump&published_at.start=NOW-5DAYS"

You can also test out the following common types on the new entity object, or see the full list here.

Organization Location Business Human
Country Currency Product Profession
Technology Corporation Bank Software
Financial_institution Stock_exchange U.S._state
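
The Trends query shown earlier in this section can also be assembled programmatically. The Python sketch below rebuilds that exact URL with the standard library. The helper build_trends_url is hypothetical, the field value is copied from the example query, and authentication and the actual HTTP request are deliberately left out.

```python
from urllib.parse import urlencode

# Endpoint taken from the example query in this section.
TRENDS_URL = "https://api.aylien.com/news/trends"


def build_trends_url(field, filters):
    """Hypothetical helper: build a Trends URL for a field plus story filters."""
    # "field" goes first so the output matches the example query's ordering.
    params = {"field": field, **filters}
    return f"{TRENDS_URL}?{urlencode(params)}"


url = build_trends_url(
    "entities.body.type",
    {"title": "Trump", "published_at.start": "NOW-5DAYS"},
)
print(url)
```

The filters are passed as a dictionary rather than keyword arguments because names such as published_at.start contain a dot and are not valid Python identifiers.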

Entity-level Sentiment

Sentiment is now predicted at the entity level for every entity extracted from the story's title and body. Each entity object contains a sentiment object with polarity and confidence fields:

"sentiment": {"polarity": "positive", "confidence": 0.78}

New Parameters to Leverage Entity-level Sentiment

There are two new parameters for filtering stories by entity-level sentiment. Each must be used with a polarity value: positive, neutral, or negative.

  • entities.body.sentiment, which returns stories that mention entities with the searched polarity in the story body

  • entities.title.sentiment, which returns stories that mention entities with the searched polarity in the story title

"entities.title.sentiment": "positive" ## returns stories that mention positively-covered entities in the story title

Testing out the Enhanced Search Functionality

With this new data added to the entity object, we have added new search functionality so that users can properly leverage it in their searches. Specifically, we now allow users to search for content mentioning entities that meet multiple criteria. Take a look here at how you can make these queries.