Text Moderation API

Pricing: Pay per usage
Developed by Sentinel Moderation · Maintained by Community

Uses advanced AI models to analyze and classify user-generated content in real time. It detects harmful or inappropriate content, providing category-level flags and confidence scores to help you enforce community guidelines and keep your platform safe.


🛡️ AI Text Moderation Actor

This Apify Actor uses Sentinel Moderation's AI-powered API to classify and flag potentially harmful or inappropriate text content. It detects a wide range of categories including harassment, hate speech, sexual content, illicit activity, self-harm, and violence.

Use this Actor to protect your platform and enforce community guidelines by automating content moderation at scale.


📥 Input Schema

The Actor accepts a simple JSON input:

```json
{
  "apiKey": "your-sentinelmoderation-api-key",
  "content": "Text to analyze goes here..."
}
```
  • apiKey (string, required): Your API key from SentinelModeration.com.
  • content (string, required): The text you want to classify for moderation.
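Before starting a run, the two required fields can be checked locally. A minimal Python sketch — the `build_run_input` helper is illustrative and not part of the Actor itself:

```python
import json

def build_run_input(api_key: str, content: str) -> dict:
    """Build the Actor's run input, enforcing the two required string fields."""
    if not isinstance(api_key, str) or not api_key.strip():
        raise ValueError("apiKey must be a non-empty string")
    if not isinstance(content, str) or not content.strip():
        raise ValueError("content must be a non-empty string")
    return {"apiKey": api_key, "content": content}

run_input = build_run_input(
    "your-sentinelmoderation-api-key",
    "Text to analyze goes here...",
)
print(json.dumps(run_input))
```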

📤 Output

The Actor returns an array containing one moderation result object with the following structure:

```json
[
  {
    "flagged": false,
    "categories": {
      "harassment": false,
      "harassment/threatening": false,
      "sexual": false,
      "hate": false,
      "hate/threatening": false,
      "illicit": false,
      "illicit/violent": false,
      "self-harm/intent": false,
      "self-harm/instructions": false,
      "self-harm": false,
      "sexual/minors": false,
      "violence": false,
      "violence/graphic": false
    },
    "category_scores": {
      "harassment": 0.000048,
      "harassment/threatening": 0.0000066,
      "sexual": 0.000039,
      "hate": 0.0000142,
      "hate/threatening": 0.0000008,
      "illicit": 0.000022,
      "illicit/violent": 0.000019,
      "self-harm/intent": 0.0000011,
      "self-harm/instructions": 0.0000010,
      "self-harm": 0.0000020,
      "sexual/minors": 0.000010,
      "violence": 0.000016,
      "violence/graphic": 0.0000056
    },
    "error": "NOTE: THIS IS A SAMPLE RESPONSE, AN API KEY FROM SENTINELMODERATION.COM IS REQUIRED TO GET REAL RESULTS FOR THIS ACTOR."
  }
]
```
  • flagged: true if any category crosses the internal moderation threshold.
  • categories: A breakdown of category flags (true/false).
  • category_scores: Raw probability scores for each category (0.0 - 1.0).
  • error: A message shown when a valid API key is not provided.
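The raw `category_scores` make it easy to apply your own, stricter cutoffs on top of the built-in `categories` flags. A small sketch — the 0.5 threshold is an arbitrary example, not the API's internal cutoff:

```python
def categories_over(result: dict, threshold: float = 0.5) -> list[str]:
    """Return the category names whose raw score exceeds the threshold."""
    return sorted(
        name
        for name, score in result.get("category_scores", {}).items()
        if score > threshold
    )

# Hypothetical result object in the shape shown above.
sample = {
    "flagged": True,
    "category_scores": {"harassment": 0.91, "hate": 0.64, "violence": 0.12},
}
print(categories_over(sample))  # ['harassment', 'hate']
```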

🧠 Categories Detected

This Actor checks for content under the following moderation categories:

  • Harassment
  • Threatening language
  • Sexual content (general & involving minors)
  • Hate speech (general & threatening)
  • Illicit activity (including violent)
  • Self-harm (intent, instructions, general)
  • Violence (including graphic imagery)

🔐 Getting an API Key

To use this Actor with real moderation results, you need an API key from Sentinel Moderation:

  1. Go to sentinelmoderation.com
  2. Sign up and generate your API key
  3. Use the key in the apiKey field of the input
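With a key in hand, the Actor can be started from Python using the official apify-client package (`pip install apify-client`). A hedged sketch — the Actor ID below is a placeholder; copy the real one from the Actor's page in Apify Console:

```python
ACTOR_ID = "username/text-moderation-api"  # placeholder — use the real Actor ID

run_input = {
    "apiKey": "your-sentinelmoderation-api-key",
    "content": "Text to analyze goes here...",
}

def moderate(apify_token: str) -> list[dict]:
    """Run the Actor, wait for it to finish, and return its dataset items."""
    from apify_client import ApifyClient  # lazy import keeps the sketch importable
    client = ApifyClient(apify_token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```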

✅ Example Use Cases

  • Moderating user comments or posts
  • Screening support messages for abuse
  • Filtering harmful prompts in AI chat systems
  • Pre-checking user-generated bios or profile content
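For cases like comment or prompt moderation, each result can drive a simple allow/review/block decision. A sketch under assumed thresholds — the 0.3 "review" cutoff is illustrative, not prescribed by the API:

```python
def decide(result: dict, review_threshold: float = 0.3) -> str:
    """Map one moderation result to an action: 'block', 'review', or 'allow'."""
    if result.get("flagged"):
        return "block"
    scores = result.get("category_scores", {})
    if any(score > review_threshold for score in scores.values()):
        return "review"
    return "allow"

print(decide({"flagged": True, "category_scores": {}}))               # block
print(decide({"flagged": False, "category_scores": {"hate": 0.4}}))   # review
print(decide({"flagged": False, "category_scores": {"hate": 0.01}}))  # allow
```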

Pricing

Pricing model: Pay per usage

This Actor is free to use; you pay only for the Apify platform resources consumed by its runs.