Overview

Welcome to Quantexa News API — the most powerful way to aggregate, filter, and integrate global news data into your apps and models.

This technical documentation will guide you through every step of the process, whether you are a total beginner with Quantexa News API, or an experienced user.

The Getting Started section explains the three ways of using Quantexa News API (Search UI, code, and Postman) and introduces our key concepts, from authentication to status codes. You can also make your first API call from this section by copying and pasting our test query.

The Common Workflows section demonstrates the best way to perform important queries in Quantexa News API: sorting results, paginating, working with dates, and using entities and categories. These examples will help you become productive with Quantexa News API quickly.

The Parameters section describes each parameter in detail, including how it is used and which API endpoints support it.

If you have not yet signed up for a 14-day free trial, sign up here. If you have, let’s get started so that you can experience the power of Quantexa News API straight away.

Three methods of accessing Quantexa News API

1. Search UI

The fastest and easiest way of building and testing Quantexa News API queries is using Search UI. Build powerful queries in seconds, and instantly view article results and analytics. When you are happy with your query, simply copy it into your production environment.


Step 1: Getting access to News API

  • Before starting, make sure you have created a News API account and have access to your App ID and Key. Click here to create a free account.


Step 2: Build your search

  • Go to app.aylien.com

  • Enter your username and password. If you don’t have your credentials, contact the Quantexa support team.

  • Enter your query criteria. See the sample below:

Build your search


Step 3: Developer mode

  • Access the developer tab. See image below:

Developer mode

  • You can copy the code from the developer tab and use your programming language of choice.
  • At the moment, the developer tab supports languages such as cURL, Python, Java, and JavaScript.
  • For more information on how to use Search UI, watch our Getting Started video for developers.

2. Postman

If you are looking for a quick and straightforward way of testing Quantexa News API, and familiarizing yourself with its various endpoints and parameters, you may want to consider using Postman.

Postman is a suite of tools for testing, building and documenting APIs.

Once you've downloaded and installed Postman, follow the steps below to get up and running with Quantexa News API.


Step 1: Getting access to Quantexa News API

  • Before starting, make sure you have created a Quantexa News API account and have access to your App ID and Key. Click here to create a free account.


Step 2: Import API collection to Postman

  • Open Postman, and click on the Import button on the top left corner. See below:

Import API collection


Step 3: Import the collection

  • In the Import dialogue, click on the Import From Link button, paste the following URL into the input box, and click Import.

https://docs.aylien.com/yaml_files/api.yaml


Step 4: Generate the collection

  • You will need to generate the Postman collection from the input YAML file that you have just imported. To do this, name your collection and click Generate Collection.

Generate collection

  • Once the schema is imported, you will be able to see a new collection on the left-hand side.


Step 5: Adding API credentials

  • The next step is to add your App ID and Key. Click on the List Stories endpoint in your collection, then click on Headers. This will reveal two fields (X-AYLIEN-NewsAPI-Application-ID and X-AYLIEN-NewsAPI-Application-Key) where you can add your credentials.

Adding API credentials


Step 6: Set up variables

  • After you've added your credentials, you'll need to replace the placeholder {{baseUrl}} text with https://api.aylien.com/news.

  • Now, you can configure your request and set its parameters by clicking on Params. Doing so reveals all the available parameters for any endpoint, allowing you to view or edit any of the parameters.

  • Once you're finished, click on Send to submit the request and retrieve the results in JSON format in the box below.
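
For reference, the request that Postman sends after these steps can be sketched in Python. This is a minimal illustration using only the standard library. The base URL and the two header names are the ones used in the steps above, but the `/stories` path for the List Stories endpoint and the function name `build_postman_style_request` are assumptions made for the example. The function only assembles the URL and headers; it does not send anything.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.aylien.com/news"  # value substituted for {{baseUrl}} in Step 6


def build_postman_style_request(app_id, app_key, params, path="/stories"):
    """Return the (url, headers) pair equivalent to the configured Postman request.

    `path` is an assumption for the List Stories endpoint; adjust it to match
    the request shown in your imported collection.
    """
    headers = {
        "X-AYLIEN-NewsAPI-Application-ID": app_id,
        "X-AYLIEN-NewsAPI-Application-Key": app_key,
    }
    # Query parameters correspond to the entries edited under Params in Postman.
    url = f"{BASE_URL}{path}?{urlencode(params)}"
    return url, headers


url, headers = build_postman_style_request(
    "YOUR APP ID", "YOUR APP KEY", {"language": "en", "per_page": 3}
)
print(url)
```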

3. Code

Using the Quantexa News API in any programming language offers a powerful tool for developers and data analysts seeking to access and analyze news data. With its easy integration, customizable queries, and support for real-time updates, it simplifies the process of fetching and processing news articles.

Additionally, features like sentiment analysis, entity recognition, categories and content summarization make it versatile for various applications, including market research, trend analysis, and content curation. Its scalability and robust documentation further enhance its appeal, making it a valuable asset for those looking to harness the wealth of information available in news articles efficiently.


Python Notebooks

Check out this walkthrough with some cool data visualizations here.

Key concepts

Authentication


Bearer Token

To use Quantexa News API, you must authenticate using a bearer (access) token. You can obtain this token by sending a request to the authentication endpoint, along with your account username, password, and Application ID.


Obtaining the initial access and refresh token with your credentials

Note that access tokens are only valid for one hour. After that you will need to request a new token. This should be done using the refresh token returned alongside the access token (see below).

With the suggested function below, you will be able to retrieve the access bearer token.

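import requests

def get_token(username, password):
    # Exchange the account credentials for an access token and a refresh token.
    response = requests.post('https://api.aylien.com/v1/oauth/token', auth=(username, password), data={'grant_type': 'password'})

    # Parse the JSON body once and read both tokens from it.
    payload = response.json()
    token, refresh_token = payload['access_token'], payload['refresh_token']

    return token, refresh_token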


Usage

The code block below demonstrates how to use the function above.

token, refresh_token = get_token(username, password)


Obtaining further access tokens using the refresh token

You can obtain a new access token by sending a refresh token. Refresh tokens allow for continued authentication without the need to use your username and password. After the initial authentication (documented above), using refresh tokens is faster and more secure than passing your username and password each time the access token expires.

Refresh tokens are valid for 24 hours. However, every request for a new access token also returns a new refresh token, so as long as you always use the latest refresh token within 24 hours, there should be no need to use your username and password again.

With the suggested function below, you can get a new access token by passing a refresh token. The response will also include a new refresh token, valid for 24 hours.

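import requests

def get_token_by_refresh(refresh_token):
    # Exchange a refresh token for a new access token and a new refresh token.
    response = requests.post('https://api.aylien.com/v1/oauth/token', data={'grant_type': 'refresh_token', 'refresh_token': refresh_token})

    # Parse the JSON body once and read both tokens from it.
    payload = response.json()
    token, refresh_token = payload['access_token'], payload['refresh_token']

    return token, refresh_token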


Usage

The code block below demonstrates how to use the function above.

token, refresh_token = get_token_by_refresh(refresh_token)
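
The two lifetimes described above (one hour for access tokens, 24 hours for refresh tokens) suggest a small helper that decides when to refresh and when to log in again. The following is only a sketch under those stated lifetimes: the class name `TokenManager`, the `login` and `refresh` callables, and the 60-second safety margin are all assumptions for illustration, not part of the API.

```python
import time


class TokenManager:
    """Keep a usable access token, renewing it shortly before it expires.

    `login` and `refresh` are hypothetical callables supplied by the caller;
    both must return an (access_token, refresh_token) pair.
    """

    ACCESS_TTL = 60 * 60        # access tokens are valid for one hour
    REFRESH_TTL = 24 * 60 * 60  # refresh tokens are valid for 24 hours

    def __init__(self, login, refresh, clock=time.time, margin=60):
        self._login = login
        self._refresh = refresh
        self._clock = clock
        self._margin = margin  # renew slightly early (assumed value)
        self._access = None
        self._refresh_token = None
        self._issued_at = None

    def get(self):
        now = self._clock()
        if self._access is None:
            # No token yet: authenticate with username and password.
            self._store(self._login(), now)
        else:
            age = now - self._issued_at
            if age >= self.REFRESH_TTL - self._margin:
                # The refresh token itself has (nearly) expired: log in again.
                self._store(self._login(), now)
            elif age >= self.ACCESS_TTL - self._margin:
                # Only the access token has (nearly) expired: use the refresh token.
                self._store(self._refresh(self._refresh_token), now)
        return self._access

    def _store(self, tokens, now):
        # Both tokens are issued together, so one timestamp covers both.
        self._access, self._refresh_token = tokens
        self._issued_at = now
```

For example, `TokenManager(lambda: get_token(username, password), get_token_by_refresh)` would wire the helper to the two functions above, and each call to `get()` would then return a token that is still valid.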


Getting authentication headers

With the suggested function below, you will be able to retrieve the authentication header that can be passed in your API calls.

import requests

username = "YOUR EMAIL"
password = "YOUR PASSWORD"
AppID = "YOUR APP ID"

def get_auth_header(username, password, appid):
    # Generate the authorization header for making requests to the Aylien API.

    token = requests.post('https://api.aylien.com/v1/oauth/token', auth=(username, password), data={'grant_type': 'password'})

    token = token.json()['access_token']

    headers = {'Authorization': f'Bearer {token}', 'AppId': appid}

    return headers


Usage

The code block below demonstrates how to use the function above.

headers = get_auth_header(username, password, AppID)
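
Note that `get_auth_header` performs a password login on every call. When a token is already available (for example, one returned by `get_token` or `get_token_by_refresh`), the same headers could be assembled directly. The helper name `build_auth_header` below is hypothetical; the header names are the ones used in the function above.

```python
def build_auth_header(token, appid):
    """Build the request headers from a bearer token that was already obtained."""
    return {'Authorization': f'Bearer {token}', 'AppId': appid}


# 'EXAMPLE_TOKEN' is a placeholder, not a real token.
headers = build_auth_header('EXAMPLE_TOKEN', 'YOUR APP ID')
print(headers)
```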

API Endpoints

Quantexa News API includes endpoints that provide both retrieval and analysis features that allow you to search, collect, and analyze news content at scale. Each of our endpoints returns a different type of data. Click on each endpoint's documentation on the list to the left or search this page to find documentation on a specific parameter.

The OpenAPI 3.0 specification for our endpoints is available on GitHub under the AYLIEN/apis repository.


Stories

The Stories endpoint enables you to search for news articles based on a specific set of parameters that you set in your query, such as keywords, entities, or sources. The News API gathers articles in near real-time and stores and indexes them along with metadata and enrichments which you can search over.


HTTP Request URL

GET https://api.aylien.com/v6/news/stories


Retrieving all stories

With the suggested function below, you will be able to retrieve all the stories that match your criteria.

import time

import requests

username = 'YOUR EMAIL'
password = 'YOUR PASSWORD'
AppID = 'YOUR APP ID'


def get_auth_header(username, password, appid):
    # Generate the authorization header for making requests to the Aylien API.

    token = requests.post('https://api.aylien.com/v1/oauth/token', auth=(username, password), data={'grant_type': 'password'})

    token = token.json()['access_token']

    headers = {'Authorization': f'Bearer {token}', 'AppId': appid}

    return headers


def get_stories(params, headers):
    # Fetch every story matching the query from the Stories endpoint, following the pagination cursor.

    fetched_stories = []
    stories = None

    while stories is None or len(stories) > 0:
        try:
            response = requests.get('https://api.aylien.com/v6/news/stories', params=params, headers=headers)

            # On success, collect the stories on this page.
            if response.status_code == 200:
                response_json = response.json()
                stories = response_json['stories']
                fetched_stories += stories

                if stories:
                    print(
                        'Fetched %d stories. Total story count so far: %d'
                        % (len(stories), len(fetched_stories))
                    )

                # Without a cursor there is no further page to request, so stop
                # instead of fetching the same page again.
                if 'next_page_cursor' in response_json:
                    params['cursor'] = response_json['next_page_cursor']
                else:
                    print('No next_page_cursor')
                    break

            # Rate limit reached: wait 10 seconds, then retry the same request.
            elif response.status_code == 429:
                time.sleep(10)
                continue

            # Server error: wait 260 seconds, then retry. This loop retries indefinitely.
            elif 500 <= response.status_code <= 599:
                time.sleep(260)
                continue

            # Any other status code: print the response body for investigation and stop.
            else:
                print(response.text)
                break

        except Exception as e:
            # Any exception (for example, a connection error) is printed and stops the loop.
            print(e)
            break

    return fetched_stories

The code block below demonstrates how to use the function above.

headers = get_auth_header(username, password, AppID)

params = {
    'published_at': '[NOW-14DAYS TO NOW]',
    'language': '(en)',
    'categories': '{{taxonomy:aylien AND id:(ay.fin.stockups OR ay.fin.stkclose) AND score:>=0.65}}',
    'entities': '{{surface_forms:("Apple" OR "Tesla") AND overall_prominence:>=0.65}}',
    'per_page': 100,
}

stories = get_stories(params, headers)
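
The `categories` and `entities` values above are query strings wrapped in double braces. As a rough sketch, they could be assembled from Python lists rather than typed by hand. The helper names `entities_query` and `categories_query` are hypothetical, and the sketch only reproduces the syntax shown in the example above; it does not escape quotation marks inside names.

```python
def entities_query(surface_forms, min_prominence):
    """Build an `entities` filter in the style of the example above."""
    quoted = ' OR '.join(f'"{name}"' for name in surface_forms)
    body = f'surface_forms:({quoted}) AND overall_prominence:>={min_prominence}'
    return '{{' + body + '}}'


def categories_query(taxonomy, category_ids, min_score):
    """Build a `categories` filter in the style of the example above."""
    ids = ' OR '.join(category_ids)
    body = f'taxonomy:{taxonomy} AND id:({ids}) AND score:>={min_score}'
    return '{{' + body + '}}'


params = {
    'published_at': '[NOW-14DAYS TO NOW]',
    'language': '(en)',
    'categories': categories_query('aylien', ['ay.fin.stockups', 'ay.fin.stkclose'], 0.65),
    'entities': entities_query(['Apple', 'Tesla'], 0.65),
    'per_page': 100,
}
print(params['entities'])
```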


Retrieving top stories

The suggested function below retrieves stories that match your criteria until a threshold (n_top_stories) is reached, so it returns only the top results for your query. This is useful for limiting how many stories a single query retrieves.

import time
from pprint import pprint

import requests

username = 'YOUR EMAIL'
password = 'YOUR PASSWORD'
AppID = 'YOUR APP ID'


def get_auth_header(username, password, appid):
    # Generate the authorization header for making requests to the Aylien API.

    token = requests.post('https://api.aylien.com/v1/oauth/token', auth=(username, password), data={'grant_type': 'password'})

    token = token.json()['access_token']

    headers = {'Authorization': f'Bearer {token}', 'AppId': appid}

    return headers


def get_top_stories(params, headers, n_top_stories=False):
    # Fetch stories page by page until n_top_stories have been collected.
    # If n_top_stories is False, fetch all matching stories.

    fetched_stories = []
    stories = None

    # Never request a page larger than the number of stories wanted.
    if n_top_stories and params.get('per_page', 0) > n_top_stories:
        params['per_page'] = n_top_stories

    while stories is None or (
        len(stories) > 0
        and (not n_top_stories or len(fetched_stories) < n_top_stories)
    ):

        try:
            response = requests.get('https://api.aylien.com/v6/news/stories', params=params, headers=headers)

            # On success, collect the stories on this page.
            if response.status_code == 200:
                response_json = response.json()
                stories = response_json['stories']
                fetched_stories += stories

                if stories:
                    print(
                        'Fetched %d stories. Total story count so far: %d'
                        % (len(stories), len(fetched_stories))
                    )

                # Without a cursor there is no further page to request, so stop
                # instead of fetching the same page again.
                if 'next_page_cursor' in response_json:
                    params['cursor'] = response_json['next_page_cursor']
                else:
                    print('No next_page_cursor')
                    break

            # Rate limit reached: wait 10 seconds, then retry the same request.
            elif response.status_code == 429:
                time.sleep(10)
                continue

            # Server error: wait 260 seconds, then retry. This loop retries indefinitely.
            elif 500 <= response.status_code <= 599:
                time.sleep(260)
                continue

            # Any other status code: print the response body for investigation and stop.
            else:
                pprint(response.text)
                break

        except Exception as e:
            # Any exception (for example, a connection error) is printed and stops the loop.
            print(e)
            break

    # The last page can overshoot the threshold, so trim to the requested count.
    return fetched_stories[:n_top_stories] if n_top_stories else fetched_stories

The code block below demonstrates how to use the function above.

headers = get_auth_header(username, password, AppID)

params = {
    'published_at': '[NOW-14DAYS TO NOW]',
    'language': '(en)',
    'categories': '{{taxonomy:aylien AND id:(ay.fin.stockups OR ay.fin.stkclose) AND score:>=0.65}}',
    'entities': '{{surface_forms:("Apple" OR "Tesla") AND overall_prominence:>=0.65}}',
    'per_page': 100,
}

stories = get_top_stories(params, headers, 300)
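
The cursor logic used by both functions above can also be expressed as a generator that is independent of the HTTP call, which makes it easy to try out without network access. This is only a sketch: `iter_stories` and `fetch_page` are hypothetical names, and `fetch_page` is assumed to return the decoded JSON body of one Stories response containing the `stories` and `next_page_cursor` keys used above.

```python
def iter_stories(fetch_page, params, limit=None):
    """Yield stories page by page, following `next_page_cursor`.

    `fetch_page` is a hypothetical callable that takes the params dict and
    returns the decoded JSON body of one Stories response.
    """
    params = dict(params)  # copy so the caller's dict is not modified
    yielded = 0
    while True:
        page = fetch_page(params)
        stories = page.get('stories', [])
        for story in stories:
            if limit is not None and yielded >= limit:
                return
            yield story
            yielded += 1
        cursor = page.get('next_page_cursor')
        # Stop on an empty page, a missing cursor, or once the limit is reached.
        if not stories or not cursor or (limit is not None and yielded >= limit):
            return
        params['cursor'] = cursor


# In-memory stand-in for the API, keyed by cursor, used only for illustration.
FAKE_PAGES = {
    None: {'stories': ['s1', 's2'], 'next_page_cursor': 'c1'},
    'c1': {'stories': ['s3', 's4'], 'next_page_cursor': 'c2'},
    'c2': {'stories': [], 'next_page_cursor': 'c3'},
}


def fake_fetch_page(params):
    return FAKE_PAGES[params.get('cursor')]


top_three = list(iter_stories(fake_fetch_page, {'per_page': 2}, limit=3))
print(top_three)
```

In real use, `fetch_page` could wrap the `requests.get` call and the status-code handling shown in the functions above.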


Time series

Time series allows you to track changes over time in quantitative values contained in stories, such as mentions of a topic or entity, sentiment about a topic, or the volume of stories published by a source. Using the Date Math Syntax, you can set your query to return information from any time period, from the current point in time back to when Quantexa News API started collecting stories. You can also specify the granularity of the returned information, for example hourly, daily, or weekly data points.


HTTP Request URL

GET https://api.aylien.com/v6/news/time_series


Retrieving time series

With the suggested function below, you will be able to retrieve all the time series aggregations that match your criteria.

import time
from pprint import pprint

import requests

username = 'YOUR EMAIL'
password = 'YOUR PASSWORD'
AppID = 'YOUR APP ID'


def get_auth_header(username, password, appid):
    # Generate the authorization header for making requests to the Aylien API.

    token = requests.post('https://api.aylien.com/v1/oauth/token', auth=(username, password), data={'grant_type': 'password'})

    token = token.json()['access_token']

    headers = {'Authorization': f'Bearer {token}', 'AppId': appid}

    return headers


def get_timeseries(params, headers):
    # Make a GET request to the Aylien News API to retrieve a time series of news articles based on the given parameters and headers.

    while True:
        try:
            response = requests.get('https://api.aylien.com/v6/news/time_series', params=params, headers=headers)

            # On success, return the parsed JSON response.
            if response.status_code == 200:
                timeseries = response.json()
                return timeseries

            # Rate limit reached: wait 10 seconds, then retry the same request.
            elif response.status_code == 429:
                time.sleep(10)
                continue

            # Server error: wait 260 seconds, then retry. This loop retries indefinitely.
            elif 500 <= response.status_code <= 599:
                time.sleep(260)
                continue

            # Any other status code: print the response body for investigation and stop.
            else:
                pprint(response.text)
                break

        except Exception as e:
            # Any exception (for example, a connection error) is printed and stops the loop.
            print(e)
            break

The code block below demonstrates how to use the function above.

headers = get_auth_header(username, password, AppID)

params = {
    'published_at': '[NOW-14DAYS TO NOW]',
    'language': '(en)',
    'categories': '{{taxonomy:aylien AND id:(ay.fin.stockups OR ay.fin.stkclose) AND score:>=0.65}}',
    'period': '+14DAYS',
}

timeseries = get_timeseries(params, headers)
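
The `published_at` and `period` values in the example follow a regular pattern, so they could be generated from numbers instead of being written by hand. The sketch below is an illustration only: the helper names are hypothetical, and only the `DAYS` unit that appears in the example above is used. Check the Date Math Syntax documentation before passing any other unit.

```python
def published_at_range(amount, unit='DAYS'):
    """Return a look-back range such as '[NOW-14DAYS TO NOW]'."""
    return f'[NOW-{amount}{unit} TO NOW]'


def period(amount, unit='DAYS'):
    """Return a bucket size such as '+14DAYS'."""
    return f'+{amount}{unit}'


def time_series_params(days_back, bucket_days, language='(en)'):
    """Assemble time series parameters in the style of the example above."""
    return {
        'published_at': published_at_range(days_back),
        'language': language,
        'period': period(bucket_days),
    }


params = time_series_params(days_back=14, bucket_days=14)
print(params)
```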


Clusters

The Clusters endpoint is used to retrieve cluster objects. A cluster object includes a unique cluster ID, some metadata about the cluster, and a story that best represents the cluster. Stories associated with a cluster can be retrieved by calling the Stories endpoint with the cluster-ID supplied as the cluster parameter.


HTTP Request URL

GET https://api.aylien.com/v6/news/clusters


Retrieving clusters

The suggested function exemplifies how to retrieve clusters from this endpoint.

import time
from pprint import pprint

import requests

username = 'YOUR EMAIL'
password = 'YOUR PASSWORD'
AppID = 'YOUR APP ID'


def get_auth_header(username, password, appid):
    # Generate the authorization header for making requests to the Aylien API.

    token = requests.post('https://api.aylien.com/v1/oauth/token', auth=(username, password), data={'grant_type': 'password'})

    token = token.json()['access_token']

    headers = {'Authorization': f'Bearer {token}', 'AppId': appid}

    return headers


def get_clusters(params, headers):
    #  Make a GET request to the Aylien News API to retrieve news clusters based on the provided parameters and headers.

    while True:
        try:
            response = requests.get('https://api.aylien.com/v6/news/clusters', params=params, headers=headers)

            # On success, return the list of clusters from the JSON response.
            if response.status_code == 200:
                fetched_clusters = response.json()
                return fetched_clusters['clusters']

            # Rate limit reached: wait 10 seconds, then retry the same request.
            elif response.status_code == 429:
                time.sleep(10)
                continue

            # Server error: wait 260 seconds, then retry. This loop retries indefinitely.
            elif 500 <= response.status_code <= 599:
                time.sleep(260)
                continue

            # Any other status code: print the response body for investigation and stop.
            else:
                pprint(response.text)
                break

        except Exception as e:
            # Any exception (for example, a connection error) is printed and stops the loop.
            print(e)
            break

The code block below demonstrates how to use the function above.

headers = get_auth_header(username, password, AppID)

params = {
    'story_count': '[20 TO 100]',
    'earliest_story.start': '2023-09-01T00:00:00Z',
    'earliest_story.end': '2023-09-01T23:59:59Z',
    'latest_story.start': '2023-09-14T00:00:00Z',
    'latest_story.end': '2023-09-14T23:59:59Z',
    'location.country': 'US',
}

clusters = get_clusters(params, headers)
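
The timestamp parameters in the example each cover one full UTC day. A small helper could derive them from `datetime.date` values, which avoids typing the timestamps by hand. This is a sketch; `day_window` and `cluster_params` are hypothetical names, and the parameter names are taken from the example above.

```python
from datetime import date


def day_window(day):
    """Return the first and last second of `day` as UTC timestamps."""
    return f'{day.isoformat()}T00:00:00Z', f'{day.isoformat()}T23:59:59Z'


def cluster_params(earliest_day, latest_day, min_stories, max_stories, country):
    """Assemble cluster parameters in the style of the example above."""
    earliest_start, earliest_end = day_window(earliest_day)
    latest_start, latest_end = day_window(latest_day)
    return {
        'story_count': f'[{min_stories} TO {max_stories}]',
        'earliest_story.start': earliest_start,
        'earliest_story.end': earliest_end,
        'latest_story.start': latest_start,
        'latest_story.end': latest_end,
        'location.country': country,
    }


params = cluster_params(date(2023, 9, 1), date(2023, 9, 14), 20, 100, 'US')
print(params)
```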


Trends

Trends allow you to identify the most frequent values for categorical attributes contained in stories, e.g., the most frequent entities, concepts or keywords. You set parameters such as a time period, a subject category, or an entity, and the endpoint returns the entities or keywords most frequently mentioned in relation to your query.


HTTP Request URL

GET https://api.aylien.com/v6/news/trends


Retrieving trends

With the suggested function below, you will be able to retrieve the trends that match your criteria.

import time
from pprint import pprint

import requests

username = 'YOUR EMAIL'
password = 'YOUR PASSWORD'
AppID = 'YOUR APP ID'


def get_auth_header(username, password, appid):
    # Generate the authorization header for making requests to the Aylien API.

    token = requests.post('https://api.aylien.com/v1/oauth/token', auth=(username, password), data={'grant_type': 'password'})

    token = token.json()['access_token']

    headers = {'Authorization': f'Bearer {token}', 'AppId': appid}

    return headers


def get_trends(params, headers):
    # Make a GET request to the Aylien News API to retrieve the current news trends based on the provided parameters and headers.

    while True:
        try:
            response = requests.get('https://api.aylien.com/v6/news/trends', params=params, headers=headers)

            # On success, return the parsed JSON response.
            if response.status_code == 200:
                trends = response.json()
                return trends

            # Rate limit reached: wait 10 seconds, then retry the same request.
            elif response.status_code == 429:
                time.sleep(10)
                continue

            # Server error: wait 260 seconds, then retry. This loop retries indefinitely.
            elif 500 <= response.status_code <= 599:
                time.sleep(260)
                continue

            # Any other status code: print the response body for investigation and stop.
            else:
                pprint(response.text)
                break

        except Exception as e:
            # Any exception (for example, a connection error) is printed and stops the loop.
            print(e)
            break

The code block below demonstrates how to use the function above.

headers = get_auth_header(username, password, AppID)

params = {
    'published_at': '[NOW-14DAYS TO NOW]',
    'language': '(en)',
    'categories': '{{taxonomy:aylien AND id:(ay.fin.stockups OR ay.fin.stkclose) AND score:>=0.65}}',
    'field': 'entities.body.surface_forms.text',
}

trends = get_trends(params, headers)
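
Once a response has been retrieved, the most frequent values can be ranked locally. The sketch below assumes a response shaped like `{'trends': [{'value': ..., 'count': ...}, ...]}`. That shape is an assumption made for illustration and is not confirmed by this section, so inspect a real response and adjust the key names before relying on it. The sample data is invented.

```python
def top_trend_values(trends_response, n=5):
    """Return the n most frequent (value, count) pairs.

    Assumes each item under 'trends' has 'value' and 'count' keys; this
    response shape is an assumption, not something this page documents.
    """
    items = trends_response.get('trends', [])
    ranked = sorted(items, key=lambda item: item.get('count', 0), reverse=True)
    return [(item.get('value'), item.get('count', 0)) for item in ranked[:n]]


# Invented sample data in the assumed shape, for illustration only.
sample = {
    'trends': [
        {'value': 'Tesla', 'count': 12},
        {'value': 'Apple', 'count': 30},
        {'value': 'Nasdaq', 'count': 7},
    ]
}
print(top_trend_values(sample, n=2))
```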


Autocompletes

Autocomplete is a helper endpoint that enables you to search for entities and sources in the Quantexa News API database. It takes a search term and returns suggestions, which can then be used as search parameters in your News API query. For example, if you want to search the news for the company “Apple”, a query to the Autocomplete endpoint returns a list of matching entities, including the entity {'id': 'Q312', 'text': 'Apple Inc.'}. You can then use this entity ID in a News API query to filter for news specifically about the company, since the entity is part of the knowledge base.


Autocomplete types

The Autocomplete endpoint allows you to search for different fields from the database:

  • entity-types - Search entity types to filter by for disambiguation.

  • entity-names - Search the entity knowledge base for organizations, people, locations and much more.

  • sources - Search news publishers from sources inventory.


HTTP Request URL

GET https://api.aylien.com/v2/autocomplete/suggestions/entity-types

GET https://api.aylien.com/v2/autocomplete/suggestions/entity-names

GET https://api.aylien.com/v2/autocomplete/suggestions/sources


Retrieving entity types

With the suggested function below, you will be able to retrieve the entity types that match the search term.

import time
from pprint import pprint

import requests

username = 'YOUR EMAIL'
password = 'YOUR PASSWORD'
AppID = 'YOUR APP ID'


def get_auth_header(username, password, appid):
    # Generate the authorization header for making requests to the Aylien API.

    token = requests.post('https://api.aylien.com/v1/oauth/token', auth=(username, password), data={'grant_type': 'password'})

    token = token.json()['access_token']

    headers = {'Authorization': f'Bearer {token}', 'AppId': appid}

    return headers


def get_autocompletes_entity_types(params, headers):
    # Make a GET request to the Aylien API to retrieve autocomplete suggestions for entity types.

    while True:
        try:
            response = requests.get('https://api.aylien.com/v2/autocomplete/suggestions/entity-types', params=params, headers=headers)

            # On success, return the parsed JSON response.
            if response.status_code == 200:
                types = response.json()
                return types

            # Rate limit reached: wait 10 seconds, then retry the same request.
            elif response.status_code == 429:
                time.sleep(10)
                continue

            # Server error: wait 260 seconds, then retry. This loop retries indefinitely.
            elif 500 <= response.status_code <= 599:
                time.sleep(260)
                continue

            # Any other status code: print the response body for investigation and stop.
            else:
                pprint(response.text)
                break

        except Exception as e:
            # Any exception (for example, a connection error) is printed and stops the loop.
            print(e)
            break

The code block below demonstrates how to use the function above.

headers = get_auth_header(username, password, AppID)

params = {'term': 'organization', 
          'limit': 3}

entity_types = get_autocompletes_entity_types(params, headers)


Retrieving entities

With the suggested function below, you will be able to retrieve entities available on the knowledge base that approximately match the term passed.

import time
from pprint import pprint

import requests

username = 'YOUR EMAIL'
password = 'YOUR PASSWORD'
AppID = 'YOUR APP ID'


def get_auth_header(username, password, appid):

    token = requests.post('https://api.aylien.com/v1/oauth/token', auth=(username, password), data={'grant_type': 'password'})

    token = token.json()['access_token']

    headers = {'Authorization': f'Bearer {token}', 'AppId': appid}

    return headers


def get_autocompletes_entity_names(params, headers):

    while True:
        try:
            response = requests.get('https://api.aylien.com/v2/autocomplete/suggestions/entity-names', params=params, headers=headers)

            # On success, return the parsed JSON response.
            if response.status_code == 200:
                entities = response.json()
                return entities

            # Rate limit reached: wait 10 seconds, then retry the same request.
            elif response.status_code == 429:
                time.sleep(10)
                continue

            # Server error: wait 260 seconds, then retry. This loop retries indefinitely.
            elif 500 <= response.status_code <= 599:
                time.sleep(260)
                continue

            # Any other status code: print the response body for investigation and stop.
            else:
                pprint(response.text)
                break

        except Exception as e:
            # Any exception (for example, a connection error) is printed and stops the loop.
            print(e)
            break

The code block below demonstrates how to use the function above.

headers = get_auth_header(username, password, AppID)

params = {'term': 'Apple', 
          'type_id': 'Q43229', 
          'limit': 3}

entity_names = get_autocompletes_entity_names(params, headers)
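
As described in the Autocompletes introduction, a suggestion's ID can be used to filter stories for that entity. The sketch below picks an ID from a list of suggestions and turns it into an `entities` filter. Two things are assumptions here: that suggestions are available as a list of dicts with the `id` and `text` keys shown in the introduction, and that the `entities` filter accepts an `id:` field. Verify both against a real response and the Parameters section. The second suggestion in the sample is invented.

```python
def pick_entity_id(suggestions, text):
    """Return the ID of the first suggestion whose text matches, ignoring case."""
    for suggestion in suggestions:
        if suggestion.get('text', '').lower() == text.lower():
            return suggestion.get('id')
    return None


def entity_id_filter(entity_id):
    # Assumption: the entities filter accepts an `id:` field inside double braces.
    return '{{' + f'id:{entity_id}' + '}}'


# The first item is the example from the introduction; the second is invented.
suggestions = [
    {'id': 'Q312', 'text': 'Apple Inc.'},
    {'id': 'Q0000', 'text': 'Example Corp'},
]

entity_id = pick_entity_id(suggestions, 'apple inc.')
print(entity_id_filter(entity_id))
```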


Retrieving sources

With the suggested function below, you will be able to retrieve sources available on the knowledge base that approximately match the term passed.

import time
from pprint import pprint

import requests

username = 'YOUR EMAIL'
password = 'YOUR PASSWORD'
AppID = 'YOUR APP ID'


def get_auth_header(username, password, appid):

    token = requests.post('https://api.aylien.com/v1/oauth/token', auth=(username, password), data={'grant_type': 'password'})

    token = token.json()['access_token']

    headers = {'Authorization': f'Bearer {token}', 'AppId': appid}

    return headers


def get_autocompletes_sources(params, headers):

    while True:
        try:
            response = requests.get('https://api.aylien.com/v2/autocomplete/suggestions/sources', params=params, headers=headers)

            # On success, return the parsed JSON response.
            if response.status_code == 200:
                sources = response.json()
                return sources

            # Rate limit reached: wait 10 seconds, then retry the same request.
            elif response.status_code == 429:
                time.sleep(10)
                continue

            # Server error: wait 260 seconds, then retry. This loop retries indefinitely.
            elif 500 <= response.status_code <= 599:
                time.sleep(260)
                continue

            # Any other status code: print the response body for investigation and stop.
            else:
                pprint(response.text)
                break

        except Exception as e:
            # Any exception (for example, a connection error) is printed and stops the loop.
            print(e)
            break

The code block below demonstrates how to use the function above.

headers = get_auth_header(username, password, AppID)

params = {'domain_term': 'bbc.com', 
          'limit': 3}

sources = get_autocompletes_sources(params, headers)
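
The helper functions in this section retry 429 and 5xx responses for as long as they keep occurring. If an upper bound is preferred, the retry logic could be factored out along the following lines. This is a sketch, not the documented behaviour: `request_with_retries` and `send` are hypothetical names, and the exponential backoff schedule (10, 20, 40 seconds, and so on) is an assumption rather than a recommendation from the API.

```python
import time


def request_with_retries(send, max_attempts=5, base_delay=10, sleep=time.sleep):
    """Call `send()` until it returns a non-retryable status or attempts run out.

    `send` is a hypothetical zero-argument callable returning an object with a
    `status_code` attribute, such as a `requests.Response`.
    """
    response = None
    for attempt in range(max_attempts):
        response = send()
        retryable = response.status_code == 429 or 500 <= response.status_code <= 599
        if not retryable:
            return response
        if attempt < max_attempts - 1:
            # Wait longer after each failed attempt: 10s, 20s, 40s, ...
            sleep(base_delay * 2 ** attempt)
    # All attempts were rate limited or failed; return the last response.
    return response
```

A call such as `request_with_retries(lambda: requests.get(url, params=params, headers=headers))` would then replace the `while True` loop, with the caller checking `status_code` on the returned response.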


Histograms

Histograms allow you to retrieve a histogram snapshot of any quantitative attribute of articles that match your query. The histogram shows the distribution of articles over a range of values for your specified parameter. For example, to understand how long a particular journalist's articles typically are, you can specify the journalist using author.id[] or author.name, then set the field value to words_count to retrieve a histogram that shows how many of this author's articles fall into each length range (in words). You can also specify the histogram's start and end points, as well as the bin width, using interval.start, interval.end and interval.width respectively.


HTTP Request URL

GET https://api.aylien.com/v6/news/histograms


Retrieving histograms

With the suggested function below, you will be able to retrieve histograms for the stories that match your criteria.

import requests
import time
from pprint import pprint

username = 'YOUR EMAIL'
password = 'YOUR PASSWORD'
AppID = 'YOUR APP ID'


def get_auth_header(username, password, appid):
    # Generate the authorization header for making requests to the Aylien API.

    token = requests.post('https://api.aylien.com/v1/oauth/token', auth=(username, password), data={'grant_type': 'password'})

    token = token.json()['access_token']

    headers = {'Authorization': f'Bearer {token}', 'AppId': appid}

    return headers


def get_histograms(params, headers):

    while True:
        try:
            response = requests.get('https://api.aylien.com/v6/news/histograms', params=params, headers=headers)

            # On success, return the parsed JSON response.
            if response.status_code == 200:
                histograms = response.json()
                return histograms

            # If the per-minute rate limit has been reached, sleep and retry until the limit resets.
            elif response.status_code == 429:
                time.sleep(10)
                continue

            # On a network or server error, sleep for a few minutes and try again.
            elif 500 <= response.status_code <= 599:
                time.sleep(260)
                continue

            # For any other status code, print the error for further investigation and stop.
            else:
                pprint(response.text)
                break

        except Exception as e:
            # If any exception is raised, print it and stop.
            print(e)
            break

The code block below demonstrates how to use the function above.

headers = get_auth_header(username, password, AppID)

params = {
    'published_at': '[NOW-14DAYS TO NOW]',
    'language': '(en)',
    'categories': '{{taxonomy:aylien AND id:(ay.fin.stockups OR ay.fin.stkclose) AND score:>=0.65}}',
    'entities': '{{surface_forms:("Apple" OR "Tesla") AND overall_prominence:>=0.65}}',
    'interval.start': 1200,
    'interval.end': 5000,
    'field': 'source.rankings.alexa.rank',
}

histograms = get_histograms(params, headers)
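
The loop in get_histograms keeps retrying for as long as the API answers with 429 or a 5XX code. If you would rather stop after a fixed number of attempts, a bounded variant could look like the sketch below. The helper name call_with_retries and its arguments are hypothetical. The wait times mirror the values used in the function above.

```python
import time


def call_with_retries(fetch, max_attempts=5, rate_limit_wait=10,
                      server_error_wait=260, sleep=time.sleep):
    """Call fetch() until it succeeds, fails permanently, or attempts run out.

    fetch is any zero-argument callable returning an object with a
    status_code attribute and a json() method, such as a requests.Response.
    """
    for attempt in range(1, max_attempts + 1):
        response = fetch()
        code = response.status_code
        if code == 200:
            return response.json()
        if code == 429:
            wait = rate_limit_wait        # rate limit reached
        elif 500 <= code <= 599:
            wait = server_error_wait      # network or server error
        else:
            # Any other status is treated as a permanent failure.
            raise RuntimeError(f"Request failed with status {code}")
        if attempt < max_attempts:
            sleep(wait)
    raise RuntimeError(f"Giving up after {max_attempts} attempts")
```

With the requests library this could be called as call_with_retries(lambda: requests.get(url, params=params, headers=headers)), where url, params, and headers are defined as in the example above.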

AQL

AQL is a Lucene-based syntax that enables users to perform powerful searches. Queries in this syntax are passed in the aql parameter.

Status codes & Responses


2XX - Success

Status codes from 200 to 299 mean your call was successful.

200 'OK'
Success. Your API call was processed successfully.


4XX - Request status

The 4XX codes indicate that something was wrong with the call itself, typically missing or invalid credentials, or bad parameters sent with the API call.

400 'Bad Request'
The server cannot or will not process the request due to an apparent client error.
401 'Unauthorized'
Authentication credentials were missing, or the token may have expired. For further clarification, contact your account manager.
403 'Forbidden'
Your plan does not have access to the feature you requested. More details about the missing feature are provided in the response message. For further clarification, contact your account manager.
404 'Not Found'
The requested URI is invalid, or the requested resource, such as a story, does not exist. For endpoint URLs, see the API endpoints page.
405 'Method Not Allowed'
The request method is not supported for the requested resource. For the methods each endpoint allows, see the API endpoints page.
414 'Request URI Too Long'
The URL provided was too long for the server to process. The maximum URL length allowed for a GET request is 8 KB, which is approximately 8,000 characters depending on the character set.
422 'Unprocessable Entity'
The request was invalid or cannot otherwise be served. For allowed values, see the parameters and common workflows pages.
429 'Too Many Requests'
Usage limits have been exceeded. For more details on how limits work, see the pagination page. For further clarification, contact your account manager.


5XX - Network or Server status

5XX codes relate to requests that appear valid but could not be fulfilled.

500 'Internal Server Error'
An unexpected error has occurred, and our engineers have been informed. If the issue persists, contact the technical support team.
503 'Service Unavailable'
The server is currently unavailable, or the request has exceeded the number of attempts allowed for the same URL. For more details on how to paginate through results, see the pagination page.
504 'Gateway Timeout'
The server could not get a response in time, probably because the client is blocked or behind a firewall.
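
The request functions in this documentation treat these codes in three ways: 200 returns the result, 429 and 5XX are retried after a pause, and anything else stops the script. A minimal sketch of that decision as a standalone helper follows. The function name classify_status and its three labels are hypothetical and are not part of the API.

```python
def classify_status(status_code):
    """Return 'success', 'retry', or 'fail' for an HTTP status code."""
    if 200 <= status_code <= 299:
        return 'success'
    # 429 (rate limit) and 5XX (network or server) are worth retrying after a pause.
    if status_code == 429 or 500 <= status_code <= 599:
        return 'retry'
    # Remaining codes (for example 400, 401, 403, 404, 422) need a change to the request.
    return 'fail'


for code in (200, 401, 422, 429, 503):
    print(code, classify_status(code))
```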

Rate limits

Rate limits are applied per application or, more precisely, per application ID. Two limits apply to each application ID.


Hits rate limit

Each endpoint is subject to a rate limit of 60 hits per minute or 3 hits per second.


Volumes rate limit

This is the allowance of how many stories the application can pull from the /stories endpoint. Only stories retrieved from the /stories endpoint count towards this limit.

This limit applies to each application ID and depends on the plan acquired. Most commonly, a plan either sets a limit per monthly period or is unlimited, in which case no limit is applied.


Handling the exceeded rate limit

Quantexa News API will return an HTTP 429 "Too Many Requests" response code when an application exceeds the rate limit for a given API endpoint.

{
   "errors":[
      {
         "code":"KB429",
         "detail":"You've exceeded your hits per minute rate limit. On the News API plan you can't make more than 60 hits per minute. Current usage stats: 62/60 hits per minute. Please follow the rate limit guidelines at the following link: https://docs.aylien.com/newsapi/v6/#rate-limits.",
         "id":"error429_too_many_request",
         "links":{
            "about":"https://docs.aylien.com/newsapi/v6/#rate-limits"
         },
         "status":429,
         "title":"Too Many Requests"
      }
   ]
}
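
When logging a rate limit error, it can help to reduce the error body to one line per error. The following is a minimal sketch that assumes the body follows the errors array layout shown above. The helper name summarize_errors is hypothetical, and the detail text in the sample is shortened.

```python
import json


def summarize_errors(body_text):
    """Return one 'status title: detail' line per entry in the errors array."""
    body = json.loads(body_text)
    return [
        f"{err.get('status')} {err.get('title')}: {err.get('detail')}"
        for err in body.get('errors', [])
    ]


# Sample body modeled on the 429 response above (detail shortened).
sample_body = json.dumps({
    "errors": [{
        "code": "KB429",
        "detail": "You've exceeded your hits per minute rate limit.",
        "id": "error429_too_many_request",
        "status": 429,
        "title": "Too Many Requests",
    }]
})

for line in summarize_errors(sample_body):
    print(line)
```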


Tracking the hit and volume rates

The response headers let you check your quota allowance, the period it covers, the number of hits remaining on your quota, and the date and time your quota will be reset.

import requests
from pprint import pprint

username = 'YOUR EMAIL'
password = 'YOUR PASSWORD'
AppId = 'YOUR APP ID'

token = requests.post("https://api.aylien.com/v1/oauth/token", auth=(username, password), data={"grant_type": "password"}).json()["access_token"]

headers = {"Authorization": f"Bearer {token}",
           "AppId": AppId}

url = 'https://api.aylien.com/v6/news/time_series'

# Submit a date range through a flat array parameter
params = {
    'published_at[]': '[NOW-3DAY TO NOW-2DAY]',
    'language': 'en'
}

response = requests.get(url, params=params, headers=headers)

# The rate limit information is returned in the response headers
pprint(response.headers)


Response

{'content-encoding': 'gzip',
 'content-length': '143',
 'content-type': 'application/json',
 'date': 'Tue, 05 Sep 2023 16:40:32 GMT',
 'server': 'istio-envoy',
 'vary': 'Accept-Encoding',
 'x-envoy-upstream-service-time': '303',
 'x-powered-by': 'Gateway/v0.0.7-292-g40bef9e',
 'x-ratelimit-hit-limit': '60',
 'x-ratelimit-hit-period': 'minute',
 'x-ratelimit-hit-remaining': '59',
 'x-ratelimit-hit-reset': '2023-09-05 16:34:00 +0000',
 'x-ratelimit-limit': '60',
 'x-ratelimit-remaining': '59',
 'x-ratelimit-reset': '2023-09-05 16:34:00 +0000',
 'x-ratelimit-volume-limit': '100000',
 'x-ratelimit-volume-period': 'month',
 'x-ratelimit-volume-remaining': '98158',
 'x-ratelimit-volume-reset': '2023-10-01 00:00:00 +0000',
 'x-request-id': '2ae977ff-f8a9-4677-b4b2-5c7591d009af',
 'x-request-time': '52 ms',
 'x-version': '0.0.2-1402-g7e71e28c'}

  • 'x-ratelimit-hit-limit': The number of allowed requests in the period.

  • 'x-ratelimit-hit-period': The period over which the allowed requests can be used.

  • 'x-ratelimit-hit-remaining': The number of remaining requests in the current period.

  • 'x-ratelimit-hit-reset': The time at which the rate limit resets, given as a UTC timestamp (2023-09-05 16:34:00 +0000 in the sample response above).
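
A minimal sketch of reading these headers in code follows. It assumes the reset header uses the 'YYYY-MM-DD HH:MM:SS +0000' format seen in the sample response. The helper name parse_hit_limit is hypothetical. With the requests library, response.headers can be passed in directly because its header lookup is case-insensitive.

```python
from datetime import datetime, timezone


def parse_hit_limit(headers):
    """Extract the hit rate limit fields from a mapping of response headers."""
    return {
        'limit': int(headers['x-ratelimit-hit-limit']),
        'period': headers['x-ratelimit-hit-period'],
        'remaining': int(headers['x-ratelimit-hit-remaining']),
        # Assumes the timestamp format shown in the sample response.
        'reset': datetime.strptime(headers['x-ratelimit-hit-reset'],
                                   '%Y-%m-%d %H:%M:%S %z'),
    }


# Values copied from the sample response above.
sample_headers = {
    'x-ratelimit-hit-limit': '60',
    'x-ratelimit-hit-period': 'minute',
    'x-ratelimit-hit-remaining': '59',
    'x-ratelimit-hit-reset': '2023-09-05 16:34:00 +0000',
}

info = parse_hit_limit(sample_headers)
print(info['remaining'], 'of', info['limit'], 'hits left this', info['period'])
```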


Rate limit reset

Our API request limit is based on a per-clock-minute window rather than a 60-second rolling window.

For example, from 12:11 pm to 12:12 pm, you can send 60 requests. At 12:12 pm the limit resets. If 20 requests are sent to the API at 30 seconds past 12:11 pm, the remaining rate limit will be 40 hits for the rest of that minute (i.e. for the next 30 seconds).
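
Because the window follows the clock minute, the time left before a reset can be estimated from the current time alone. A minimal sketch follows, with a hypothetical helper name and a timestamp chosen to match the example above. It assumes your clock is in step with the server's.

```python
from datetime import datetime


def seconds_until_next_minute(now):
    """Seconds left before the current clock minute ends."""
    return 60 - (now.second + now.microsecond / 1_000_000)


# 30 seconds past 12:11, as in the example above.
print(seconds_until_next_minute(datetime(2023, 9, 5, 12, 11, 30)))  # 30.0
```

Sleeping for this long after a 429 response should carry the application into the next window, provided the clocks agree.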


What happens if you get rate limited a lot?

If you're continually getting rate limited, we may block your application or IP. For repeat offenders, we may block your other applications as well, and disallow you from creating new applications, so be sure to get in touch with our support team if you're getting rate limited a lot.

URL encoding

Quantexa News API can be accessed in many ways: from programming languages such as Python or JavaScript, from the command line with cURL, from any other language that supports HTTP calls, or from development platforms such as Postman. Whichever language or platform you use, all HTTP API calls must be URL encoded.

The following examples show how queries can be properly encoded. These examples are useful in case debugging is needed.


Requests library

Below are the parameters in a dictionary that can be passed as the params argument of the requests.get function.

params = {
    "published_at": "[NOW-14DAYS TO NOW]",
    "language": "(en)",
    "categories": "{{taxonomy:aylien AND id:(ay.fin.stockups OR ay.fin.stkclose) AND score:>=0.65}}",
    "entities": '{{surface_forms:("Apple" OR "Tesla") AND overall_prominence:>=0.65}}',
    "per_page": 100,
}


URL encoded

https://api.aylien.com/v6/news/stories?published_at=%5BNOW-14DAYS+TO+NOW%5D&language=%28en%29&categories=%7B%7Btaxonomy%3Aaylien+AND+id%3A%28ay.fin.stockups+OR+ay.fin.stkclose%29+AND+score%3A%3E%3D0.65%7D%7D&entities=%7B%7Bsurface_forms%3A%28%22Apple%22+OR+%22Tesla%22%29+AND+overall_prominence%3A%3E%3D0.65%7D%7D&per_page=100
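
The requests library performs this encoding for you when a dictionary is passed as params. When debugging, the encoded form can be reproduced with Python's standard library, as in the sketch below. It uses a subset of the parameters above to keep the output short. urlencode and parse_qs come from urllib.parse and are not part of the News API.

```python
from urllib.parse import urlencode, parse_qs

base_url = 'https://api.aylien.com/v6/news/stories'

params = {
    'language': '(en)',
    'entities': '{{surface_forms:("Apple" OR "Tesla") AND overall_prominence:>=0.65}}',
    'per_page': 100,
}

# Encode reserved characters and turn spaces into '+', as in the URL above.
query_string = urlencode(params)
encoded_url = f'{base_url}?{query_string}'
print(encoded_url)

# Decoding the query string recovers the original values, which helps when
# checking what a logged URL actually asked for.
decoded = parse_qs(query_string)
print(decoded['entities'][0])
```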


Postman

Postman sample


URL encoded

https://api.aylien.com/v6/news/stories?published_at=%5BNOW-14DAYS+TO+NOW%5D&language=%28en%29&categories=%7B%7Btaxonomy%3Aaylien+AND+id%3A%28ay.fin.stockups+OR+ay.fin.stkclose%29+AND+score%3A%3E%3D0.65%7D%7D&entities=%7B%7Bsurface_forms%3A%28%22Apple%22+OR+%22Tesla%22%29+AND+overall_prominence%3A%3E%3D0.65%7D%7D&per_page=100

Language support

The News API sources content from across the globe in 16 languages.

All content is available in its native form and as machine-translated English text.

Select any of the supported languages using the language parameter.

English (en)   German (de)       French (fr)                   Italian (it)
Spanish (es)   Portuguese (pt)   Russian (ru)                  Dutch (nl)
Arabic (ar)    Turkish (tr)      Traditional Chinese (zh-tw)   Simplified Chinese (zh-cn)
Swedish (sv)   Danish (da)       Finnish (fi)                  Farsi (fa)

All content benefits from all of the analysis features supported by Quantexa News API. Most NLP enrichments are performed on native English or translated English text. These enrichments include category and industry enrichment, entity extraction, and sentiment analysis. The table below lists the language used for analysis for each feature.

Feature                                 Original Text Analyzed   Translated Text Analyzed
Clusters                                en                       All Other Languages
Topic Categories                        en                       All Other Languages
Entities                                en                       All Other Languages
Sentiment (body)                        en, es, de               All Other Languages
Extracted Keywords                      en, es, de, it, fr, pt   All Other Languages
Hashtags                                en, es, de, it, fr, pt   All Other Languages
Word, Character, and Paragraph Counts   en, es, de, it, fr, pt   All Other Languages
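
If your application needs to know which text an enrichment was computed from, the table can be encoded as a lookup. This is a minimal sketch. The dictionary keys and the function name analyzed_text are hypothetical labels for the table rows and are not API field names.

```python
# Languages whose original text is analyzed, per feature (copied from the table).
ORIGINAL_TEXT_LANGUAGES = {
    'clusters': {'en'},
    'topic_categories': {'en'},
    'entities': {'en'},
    'sentiment_body': {'en', 'es', 'de'},
    'extracted_keywords': {'en', 'es', 'de', 'it', 'fr', 'pt'},
    'hashtags': {'en', 'es', 'de', 'it', 'fr', 'pt'},
    'counts': {'en', 'es', 'de', 'it', 'fr', 'pt'},
}


def analyzed_text(feature, language):
    """Return 'original' or 'translated' for a feature and a language code."""
    if language in ORIGINAL_TEXT_LANGUAGES[feature]:
        return 'original'
    return 'translated'


print(analyzed_text('sentiment_body', 'de'))  # original
print(analyzed_text('entities', 'de'))        # translated
```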

Licensed content

Customers with access to our licensed content will be able to retrieve stories from our licensed sources. The license_type is a field in the stories endpoint response that indicates whether an article has come from a licensed source or if it is a web-sourced article.


Licensed content flag in the API response

This field is never blank. It always returns one of two values: 0 or 1.

license_type 0 - Indicates the story is from a group of web-based, openly accessible sources

[ ...
 {'license_type': 0}
 ...
]

license_type 1 - Indicates the story is from the group of licensed sources

[ ...
 {'license_type': 1}
 ...
]
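
A minimal sketch of separating the two groups on the client side follows. It assumes each story is a dictionary carrying the license_type field shown above. The helper name split_by_license and the id values in the sample are hypothetical.

```python
def split_by_license(stories):
    """Split story dictionaries into (web_sourced, licensed) lists."""
    web_sourced = [s for s in stories if s.get('license_type') == 0]
    licensed = [s for s in stories if s.get('license_type') == 1]
    return web_sourced, licensed


# Hypothetical sample; real stories carry many more fields.
sample_stories = [
    {'id': 1, 'license_type': 0},
    {'id': 2, 'license_type': 1},
    {'id': 3, 'license_type': 0},
]

web_sourced, licensed = split_by_license(sample_stories)
print(len(web_sourced), len(licensed))  # 2 1
```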

If you wish to upgrade your account to include licensed content, please speak to your account manager, or contact sales.