Application APIs

Application APIs let documents, content, and metadata be stored in MARCIE for further API processing. Data can be loaded into the MARCIE store through the document decomposition and ingestion APIs or through the manual document/content submission APIs. Once the store is populated, application APIs run semantic/syntactic analysis, semantic search, and content-related queries at scale. See APPS in the navigation panel for more details.

Content APIs

Content is submitted to endpoints with associated control attributes, and results are returned synchronously. Primary examples of these APIs include content comparison, enrichment, transformation, and analysis (spellcheck, grammar, sentiment, readability). Content is not stored or persisted in this scenario. See DOCUMENT and CONTENT in the navigation panel for more details.

Overview
MARCIE Support

info@messagepoint.com

Languages
Servers
Mock server

https://marcie.redocly.app/_mock/openapi/

https://w1waoh1clk.execute-api.us-east-1.amazonaws.com/{basePath}/

Text Generative AI

Text-focused APIs that use generative large language models (LLMs)

Operations

Text Similarity

Provides multiple text-based similarity algorithms to measure the similarity of input text pairs. The provided algorithms are tuned to measure similarity both in the representation (syntax) and the meaning (semantics) of the text content.

Operations

Request

Performs a similarity analysis, either syntactic or semantic, on the given sentence pair.

Security
api_key
Body: application/json (required)
text1: string, non-empty, required

The text content with UTF-8 text representation

lang1: string, required

The two letter language code

Enum: "en", "fr", "es"
text2: string, non-empty, required

The text content with UTF-8 text representation

lang2: string, required

The two letter language code

Enum: "en", "fr", "es"
algo: string

Similarity Algorithms

Syntactic Similarity

The syntactic similarity algorithms focus exclusively on the representational features of text. The most dominant of these features is the set of tokens (characters and words) used. Different syntactic similarity algorithms exploit these features differently to measure the similarity of an input text pair. Similarity is measured on a scale of 0 to 1, where 1 represents the best possible match and 0 indicates no match. In addition to the base algorithms, character- and/or word-based shingles are used to add context and increase similarity accuracy. The following syntactic similarity algorithms are supported:

  1. syn.cosine-with-shingles: This represents the combination of applying character-based shingles to the classic cosine similarity algorithm.
  2. syn.sorensen_dice-shingles: This represents the combination of applying character-based shingles to the classic Sørensen–Dice coefficient algorithm.
  3. syn.jw-shingles: This represents the combination of applying character-based shingles to the classic Jaro–Winkler algorithm that is similar in nature to edit distance based measures.
  4. syn.cosine-word: This represents the combination of applying word-based shingles to the classic cosine similarity algorithm. Compared to syn.cosine-with-shingles, this algorithm produces fewer false positives for larger pieces of text.
  5. syn.simple: A Semantax proprietary algorithm that is optimized for comparison speed and accuracy. It is based on the cosine similarity algorithm and it combines both character and word based shingles.
  6. syn.weighted-word: A Semantax proprietary algorithm that is optimized for comparison speed and accuracy. It is based on the classic Jaro–Winkler algorithm. Character- and word-based shingles are combined with weights to increase the impact of term frequency.
  7. syn.sentence: A Semantax proprietary algorithm derived from the classic cosine similarity algorithm. The main feature of this algorithm is the inclusion of NLP (natural language processing) primitives for higher accuracy of similarity comparisons. NLP processing includes lemmatization/stemming, term normalization etc. This algorithm is best suited for a single sentence, or a couple of short sentences as input.
  8. syn.paragraph: A Semantax proprietary algorithm that extends syn.sentence to compare a pair of input paragraphs (a set of sentences). In addition to the syn.sentence features, this algorithm also includes a weighted Jaccard Similarity score of the overlapping sentences across the input pair.
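
To make the shingle idea above concrete, here is a minimal Python sketch of character shingles combined with the classic cosine, Sørensen–Dice, and Jaccard measures. It is an illustration only and not MARCIE's implementation: the function names, the shingle size of 3, and the lower-casing and whitespace normalization are all assumptions, and scores returned by the API will generally differ.

```python
from collections import Counter
from math import sqrt


def char_shingles(text: str, k: int = 3) -> Counter:
    """Count overlapping k-character shingles (k=3 is an assumed default)."""
    normalized = " ".join(text.lower().split())  # assumed normalization
    if len(normalized) < k:
        return Counter([normalized]) if normalized else Counter()
    return Counter(normalized[i:i + k] for i in range(len(normalized) - k + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two shingle-count vectors, in [0, 1]."""
    dot = sum(count * b[shingle] for shingle, count in a.items())
    norm = sqrt(sum(c * c for c in a.values())) * sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def sorensen_dice(a: Counter, b: Counter) -> float:
    """Sørensen–Dice coefficient over the sets of distinct shingles."""
    set_a, set_b = set(a), set(b)
    total = len(set_a) + len(set_b)
    return 2 * len(set_a & set_b) / total if total else 0.0


def jaccard(a: Counter, b: Counter) -> float:
    """Unweighted Jaccard similarity over the sets of distinct shingles."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


if __name__ == "__main__":
    left = char_shingles("how are you")
    right = char_shingles("how old are you")
    print("cosine:", round(cosine(left, right), 3))
    print("dice:", round(sorensen_dice(left, right), 3))
    print("jaccard:", round(jaccard(left, right), 3))
```

All three measures return 1 for identical inputs and 0 for inputs with no shared shingles, which matches the 0-to-1 scale described above. Word-based shingles follow the same pattern with tokens in place of characters.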

Semantic Similarity

The semantic similarity algorithms compare the input text pair based on the main concepts present in the text, regardless of the words used to represent those concepts. Roughly speaking, this is like comparing the meaning of two sentences independent of their wording. Our semantic similarity algorithms are built on modern deep learning based word embeddings trained on an enterprise corpus of sample documents. The models are trained on single sentences and/or short paragraphs as input, and therefore work best for content in that size range.

All of our semantic similarity algorithms support multilingual and cross-lingual scenarios, where each text in the input pair can be expressed in any of the supported languages (for example en-en, en-fr, en-es, fr-fr, fr-es, etc.). The following semantic similarity algorithms are supported:

  1. sem.ssm: The default semantic similarity algorithm, offering the best combination of speed and accuracy with an emphasis on English-to-English common-language input pairs.
  2. sem.ssm14: This semantic similarity model is trained on data from the government, insurance, and banking industry verticals. The model is optimized for speed but provides a good level of overall accuracy.
  3. sem.ssm20: Similar to sem.ssm14, but built on a much larger input corpus.
  4. sem.ssm28: Builds on the same approach as the previous two models but also includes basic support for higher semantic relationships such as negation.
  5. sem.ssm30: Similar to sem.ssm28, with a better similarity score distribution.
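
The source does not describe how the sem.* models turn embeddings into a score. A common approach, assumed here purely for illustration, is cosine similarity between sentence embedding vectors. The sketch below uses made-up 3-dimensional vectors in place of real model output; the names and values are hypothetical.

```python
from math import sqrt
from typing import Sequence


def embedding_cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("embedding vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Hypothetical embeddings: two sentences with the same meaning in
# different languages, and one unrelated sentence.
SENTENCE_EN = [0.90, 0.10, 0.30]
SENTENCE_FR = [0.85, 0.15, 0.35]
UNRELATED = [-0.20, 0.90, -0.10]

if __name__ == "__main__":
    print("en vs fr:", round(embedding_cosine(SENTENCE_EN, SENTENCE_FR), 3))
    print("en vs unrelated:", round(embedding_cosine(SENTENCE_EN, UNRELATED), 3))
```

Because the comparison happens in embedding space rather than on tokens, two texts with no words in common (for example an en-fr pair) can still score highly when their vectors point in similar directions.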
Enum: "syn.weighted-word", "syn.simple", "syn.cosine-with-shingles", "syn.sorensen_dice-shingles", "syn.cosine-word", "syn.jw-shingles", "syn.paragraph", "syn.sentence", "sem.ssm", "sem.ssm14"
curl -i -X POST \
  https://marcie.redocly.app/_mock/openapi/text/similarity \
  -H 'Content-Type: application/json' \
  -H 'x-api-key: YOUR_API_KEY_HERE' \
  -d '{
    "text1": "how are you",
    "lang1": "en",
    "text2": "how old are you",
    "lang2": "en",
    "algo": "syn.cosine-word"
  }'
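
The same request can be assembled in Python with only the standard library. This is a sketch that mirrors the curl call above: the URL, header names, and body fields come from that example, while the function name and its validation checks are assumptions added for illustration.

```python
import json
from urllib.request import Request

MOCK_URL = "https://marcie.redocly.app/_mock/openapi/text/similarity"
SUPPORTED_LANGS = {"en", "fr", "es"}  # from the lang1/lang2 enum


def build_similarity_request(
    text1: str,
    text2: str,
    lang1: str = "en",
    lang2: str = "en",
    algo: str = "syn.cosine-word",
    api_key: str = "YOUR_API_KEY_HERE",
    url: str = MOCK_URL,
) -> Request:
    """Build (but do not send) a POST request for the similarity endpoint."""
    if not text1 or not text2:
        raise ValueError("text1 and text2 must be non-empty")
    if lang1 not in SUPPORTED_LANGS or lang2 not in SUPPORTED_LANGS:
        raise ValueError("lang1 and lang2 must be one of: en, fr, es")
    payload = {
        "text1": text1,
        "lang1": lang1,
        "text2": text2,
        "lang2": lang2,
        "algo": algo,
    }
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


if __name__ == "__main__":
    request = build_similarity_request("how are you", "how old are you")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

To send the request, pass the returned object to `urllib.request.urlopen` and read the JSON body from the response.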

Responses

200 response

Headers
Access-Control-Allow-Origin: string
Body: application/json
status: object

response status

result: object

response body

One of:

response body

Response
application/json
{ "status": { "success": true, "code": 200 }, "result": { "text1": "string", "text2": "string", "score": 1, "prediction": { … } } }
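
A response of this shape can be checked and unpacked as sketched below. The field names `status.success`, `status.code`, and `result.score` come from the sample above; the function name and error handling are assumptions, and the `prediction` object is left out because its contents are not shown in the sample.

```python
import json


def extract_score(response_text: str) -> float:
    """Return result.score from a similarity response, or raise on failure."""
    body = json.loads(response_text)
    status = body.get("status", {})
    if not status.get("success"):
        raise RuntimeError(f"similarity call failed with code {status.get('code')}")
    return float(body["result"]["score"])


# Sample payload modeled on the documented 200 response (prediction omitted).
SAMPLE_RESPONSE = (
    '{"status": {"success": true, "code": 200}, '
    '"result": {"text1": "string", "text2": "string", "score": 1}}'
)

if __name__ == "__main__":
    print(extract_score(SAMPLE_RESPONSE))
```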

Text Summary

Generates a summary for the given text.

Operations

Natural Language Processing

MARCIE NLP operations on provided content

Operations

Enrichments/Classification

Text enrichment APIs offer various enrichment functions that take raw text as input and return a specific enrichment/feature for that text. An enrichment function is idempotent, and its output is determined by the input text and the underlying predictive (deep learning based) linguistic model. Examples include text-based sentiment and readability calculation. Most of the underlying methods can be called with either the "GET" or the "POST" HTTP method. For smaller text, the GET method offers better performance and allows for network optimizations such as caching.
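
One way a client might act on the GET-versus-POST guidance above is sketched here. Everything specific in it is an assumption: the endpoint is passed in by the caller because enrichment paths are not listed in this section, the `text` parameter name and the 512-character cutoff are hypothetical, and the `x-api-key` header is carried over from the similarity example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_enrichment_request(
    endpoint: str,
    text: str,
    api_key: str,
    max_get_chars: int = 512,  # hypothetical cutoff for "smaller text"
) -> Request:
    """Use GET for short text (cache-friendly) and POST for longer text."""
    headers = {"x-api-key": api_key}
    if len(text) <= max_get_chars:
        # Hypothetical: the text travels as a URL-encoded query parameter.
        query = urlencode({"text": text})
        return Request(f"{endpoint}?{query}", headers=headers, method="GET")
    headers["Content-Type"] = "application/json"
    body = json.dumps({"text": text}).encode("utf-8")
    return Request(endpoint, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    short = build_enrichment_request("https://example.invalid/enrich", "good service", "KEY")
    print(short.get_method(), short.full_url)
```

Because an enrichment function is idempotent, identical GET requests can be served from a cache without changing the result.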

Operations

Text Transformers

MARCIE text transformer operations on provided content

Operations

Spelling & Grammar

MARCIE spelling and grammar operations on provided content

Operations

Content Moderation

MARCIE API Content Moderation

Operations

Translation

MARCIE translation operations

Operations

Application

Root resource for all application APIs

Operations

Content

Root resource for all content APIs

Operations

Document

Root resource for all document APIs

Operations

PDF Document

PDF document parsing & processing APIs

Operations

Word Document

Microsoft Word document parsing.

Operations

XHTML Email

XHTML Email template parsing.

Operations

Self Service

MARCIE API Self Service

Operations