A Text Analysis API To Take For A Spin

AYLIEN Text API is a package consisting of eight different Natural Language Processing, Information Retrieval and Machine Learning APIs that will help developers extract meaning and insight from documents.

There are currently 8 endpoints available:

  • Article Extraction: Extracts the main body of an article, including embedded media such as images & videos, from a URL and removes all the surrounding clutter.
  • Article Summarization: Summarizes an article into a few key sentences.
  • Classification: Classifies a piece of text according to the IPTC NewsCodes standard into more than 500 categories.
  • Entity Extraction: Extracts named entities (people, organizations, products and locations) and values (URLs, emails, telephone numbers, currency amounts and percentages) mentioned in a body of text.
  • Concept Extraction: Extracts named entities mentioned in a document, disambiguates and cross-links them to DBPedia and Linked Data entities, along with their semantic types (including DBPedia and schema.org types).
  • Language Detection: Detects the main language a document is written in and returns it in ISO 639-1 format, from among 76 different languages.
  • Sentiment Analysis: Detects sentiment of a document in terms of polarity (positive or negative) and subjectivity (subjective or objective).
  • Hashtag Suggestion: Automatically suggests hashtags for better discoverability of content on Social Media.

via Text Analysis API Documentation | AYLIEN.
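
For anyone who wants to try an endpoint from a script, here is a minimal sketch of how a call might be put together in Python. Everything specific in it is an assumption rather than something taken from the list above: the base URL, the endpoint paths, the two authentication header names and the `sentences_number` parameter are my best recollection of the service and should be checked against the AYLIEN documentation before use. The helper name `build_request` is made up for illustration. The code only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# ASSUMPTION: base URL and endpoint paths; verify against the AYLIEN docs.
BASE_URL = "https://api.aylien.com/api/v1"
ENDPOINTS = {
    "Article Extraction": "extract",
    "Article Summarization": "summarize",
    "Classification": "classify",
    "Entity Extraction": "entities",
    "Concept Extraction": "concepts",
    "Language Detection": "language",
    "Sentiment Analysis": "sentiment",
    "Hashtag Suggestion": "hashtags",
}


def build_request(endpoint, params, app_id, app_key):
    """Build (but do not send) a POST request for one endpoint.

    `build_request` is a hypothetical helper; the header names below
    are assumed, not confirmed by the post.
    """
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {endpoint}")
    # Form-encode the parameters as the request body.
    body = urlencode(params).encode("utf-8")
    return Request(
        f"{BASE_URL}/{ENDPOINTS[endpoint]}",
        data=body,
        headers={
            "Accept": "application/json",
            "X-AYLIEN-TextAPI-Application-ID": app_id,    # assumed header name
            "X-AYLIEN-TextAPI-Application-Key": app_key,  # assumed header name
        },
        method="POST",
    )


# Example: ask for a three-sentence summary of a web page.
# `sentences_number` is an assumed parameter name.
req = build_request(
    "Article Summarization",
    {"url": "http://example.com/story", "sentences_number": 3},
    app_id="YOUR_APP_ID",
    app_key="YOUR_APP_KEY",
)
```

To actually send it, pass `req` to `urllib.request.urlopen` and parse the response body with `json.loads`. Keeping the building and sending steps separate makes it easy to inspect what would go over the wire before spending any of a free tier's quota.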

This might be interesting here when used in conjunction with something like the Free Law Reporter, though my initial testing produced uneven results. The API did good work with a copyright case, spotting key phrases and generating a good summary. It didn’t handle Brown v. Board of Education as well, missing key concepts and generating a useless summary. It seems to work better at extracting short, newsy articles from cluttered web pages than at analyzing lengthy texts.