Metadata Tagger

Your precision indexer. The Metadata Tagger Agent reads any article, transcript, or content block and extracts clean, structured metadata. It ensures every piece of content is consistently tagged with the right people, places, topics, and entities — so your archives stay organized, searchable, and context-aware.


What It Does

This Agent analyzes source content and produces structured metadata fields: Topics, Categories, Persons, Organizations, Locations, Entities, and (if enabled) Sentiment. It applies strict tagging rules for accuracy and consistency: it never invents tags, never emits vague ones, and never breaks the output format.


Why It Matters

Metadata is the backbone of search, SEO, personalization, and editorial workflows. Without clean tagging, content gets lost in archives or misfiled in the CMS. This Agent makes metadata tagging scalable and trustworthy across thousands of articles.


When to Use It

Perfect for:

  • Bulk-indexing content for internal CMS, archives, or knowledge bases
  • Powering content discovery through topics, categories, and entities
  • Identifying people, places, and organizations mentioned in articles
  • Building automated topic clusters or personalization feeds
  • Preparing metadata for newsletters or syndication partners

When Not to Use It

Avoid using this Agent:

  • For stylistic or creative text generation (it produces only metadata)
  • For summaries or headlines (use Summarizer or Headline Generator instead)
  • For subjective editorial judgments (it applies strict definitions, not opinions)

Inputs and Configuration

Basic Configuration

Field | Required? | Description
Source Content |  | Any article, Data Stream result, or pasted content. Used as the tagging base.
Tag Types | Optional | Configuration of which metadata types to return. If missing or malformed, safe defaults are applied.

Advanced Configuration

Field | Required? | Description
Language Model |  | Default = Claude 3.5
Brand Guidelines | Not used | This Agent ignores brand tone and style fields; outputs are strictly metadata.

Best Reasons & Examples for Each Input Type

  • Free Text (pasted content)
    • Best for: Ad-hoc tagging of raw notes or internal copy.
    • Output: Metadata fields (topics, categories, people, etc.) neatly extracted.
  • Data Stream (curated content feed) 
    • Best for: Bulk processing large volumes of content for clustering or indexing.
    • Output: Consistent tags across all streamed articles.
  • Source URL
    • Best for: Direct tagging of external content or syndicated pieces.
    • Output: Structured tags aligned to content themes and entities.
  • Workspace Article
    • Best for: Tagging your own archive for newsletters or personalization.

Brand, Tone, and Rules

  • This Agent does not apply brand voice or style.
  • Output is always structured metadata, not prose.
  • No editorial spin, no stylistic adjustments.

Metadata Rules

  • Topics: Hierarchical subject matter (e.g., Society, Politics, Democracy)
  • Categories: Editorial groupings (e.g., Health, Faith, Lifestyle)
  • Persons: Named individuals (e.g., Joe Biden, Taylor Swift)
  • Organizations: Groups, agencies, or companies (e.g., FBI, Meta)
  • Locations: Cities, states, countries, or regions (e.g., Texas, Israel)
  • Entities: Abstract concepts, issues, or key ideas (e.g., AI Regulation, Gun Control)
  • Sentiment: Positive, Neutral, or Negative — only included if enabled.

Composition Rules

  • Always output in the fixed order: Topics, Categories, Persons, Organizations, Locations, Entities, Sentiment (if enabled).
  • Each section must appear even if empty.
  • Never rename headers or alter structure.
  • Lists should be comma-separated.
  • No invented or vague tags — only what is clearly present in the source.
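
The composition rules above can be sketched as a small formatter. This is an illustrative sketch, not the Agent's actual implementation: the function name `format_metadata`, the dict-based input, and the choice to render an empty section as a bare header are all assumptions.

```python
from typing import Dict, List, Optional

# Fixed header order taken from the composition rules.
FIELD_ORDER = ["Topics", "Categories", "Persons",
               "Organizations", "Locations", "Entities"]


def format_metadata(tags: Dict[str, List[str]],
                    sentiment: Optional[str] = None) -> str:
    """Render tags in the fixed order; empty sections keep their header."""
    lines = []
    for field in FIELD_ORDER:
        values = tags.get(field, [])
        # Comma-separated list; an empty list leaves just "Field:" (assumed rendering).
        lines.append(f"{field}: {', '.join(values)}".rstrip())
    if sentiment is not None:
        # Sentiment is appended last and only when it is enabled.
        lines.append(f"Sentiment: {sentiment}")
    return "\n".join(lines)


print(format_metadata({"Topics": ["Politics", "Environment"],
                       "Locations": ["Washington"]}))
```

Keeping the order in a single constant means every run emits the same headers in the same sequence, regardless of the order in which tags were collected.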

Operational Details

Pre-Configured Defaults

  • Model: ChatGPT-4.1
  • Default tagging includes: Topics, Categories, Persons, Organizations, Locations, Entities
  • Sentiment tagging is disabled by default
  • Works with any article, transcript, or Data Stream

Safety & Fallbacks

  • If tag configuration is missing or malformed → falls back to default (all tags on, sentiment off)
  • If no clear tags found → returns empty lists
  • Never breaks or returns error messages — output is always valid
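
To make the fallback behavior concrete, here is a minimal sketch of how a Tag Types configuration could be resolved against safe defaults. The config shape (a mapping of field names to yes/no strings), the accepted flag words, and the name `resolve_tag_types` are hypothetical; the source only states that missing or malformed config falls back to all tags on with sentiment off.

```python
from typing import Any, Dict

# Defaults described above: every tag type on, Sentiment off.
DEFAULT_CONFIG = {
    "Topics": True, "Categories": True, "Persons": True,
    "Organizations": True, "Locations": True, "Entities": True,
    "Sentiment": False,
}
_TRUE = {"yes", "true", "on"}    # assumed accepted spellings
_FALSE = {"no", "false", "off"}  # assumed accepted spellings


def resolve_tag_types(raw: Any) -> Dict[str, bool]:
    """Return a complete config, falling back to defaults where input is unusable."""
    resolved = dict(DEFAULT_CONFIG)
    if not isinstance(raw, dict):
        return resolved  # missing or malformed config: use defaults
    lookup = {name.lower(): name for name in DEFAULT_CONFIG}
    for key, value in raw.items():
        name = lookup.get(str(key).strip().lower())
        flag = str(value).strip().lower()
        if name is None:
            continue  # unknown field: ignore it
        if flag in _TRUE:
            resolved[name] = True
        elif flag in _FALSE:
            resolved[name] = False
        # any other value is treated as malformed and keeps the default
    return resolved


print(resolve_tag_types({"sentiment": "YES", "persons": "No"}))
```

This sketch falls back field by field; whether the Agent instead discards the whole configuration when one value is malformed is not specified in this document.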

Output Format

Always follows this structure:

Topics: [comma-separated list or empty list]  
Categories: [comma-separated list or empty list]  
Persons: [comma-separated list or empty list]  
Organizations: [comma-separated list or empty list]  
Locations: [comma-separated list or empty list]  
Entities: [comma-separated list or empty list]  
Sentiment: [Positive / Neutral / Negative] ← Only if enabled
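
Because the structure is fixed, downstream systems can read it back with very little code. The following is a rough sketch of such a parser, assuming one `Header: value, value` pair per line; `parse_metadata` is a hypothetical helper, not part of the Agent.

```python
from typing import Dict, List, Union

LIST_FIELDS = ["Topics", "Categories", "Persons",
               "Organizations", "Locations", "Entities"]


def parse_metadata(text: str) -> Dict[str, Union[List[str], str]]:
    """Parse the fixed output structure into a dict of lists (plus Sentiment if present)."""
    record: Dict[str, Union[List[str], str]] = {f: [] for f in LIST_FIELDS}
    for line in text.splitlines():
        header, sep, rest = line.partition(":")
        header = header.strip()
        if not sep:
            continue  # not a "Header: values" line
        if header in LIST_FIELDS:
            # Split on commas and drop empty fragments and stray whitespace.
            record[header] = [t.strip() for t in rest.split(",") if t.strip()]
        elif header == "Sentiment":
            record["Sentiment"] = rest.strip()
    return record


sample = "Topics: Politics, Environment  \nPersons:\nLocations: Washington, United States"
print(parse_metadata(sample)["Locations"])
```

Note that this simple split assumes no individual tag contains a comma.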

Smart Behaviors & Validation

  • Case-insensitive parsing of config values (yes/YES/Yes all valid)
  • Ensures consistency in naming (no duplicates, no misspellings)
  • Applies strict definitions for each metadata type
  • Guarantees valid, structured output every run
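
As an illustration of the "no duplicates" behavior, the sketch below collapses tags that differ only in case or spacing. It is an assumption about how such a check might look, with a hypothetical `dedupe_tags` helper; it does not attempt spelling correction.

```python
from typing import Iterable, List


def dedupe_tags(tags: Iterable[str]) -> List[str]:
    """Drop blank tags and case-insensitive duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for tag in tags:
        cleaned = " ".join(tag.split())  # trim and collapse inner whitespace
        key = cleaned.casefold()         # case-insensitive comparison key
        if cleaned and key not in seen:
            seen.add(key)
            result.append(cleaned)
    return result


print(dedupe_tags(["Meta", "meta ", "FBI", "", "AI  Regulation"]))
```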

Guidance and Integration

Tips & Tricks

  • Enable Sentiment when analyzing opinion-heavy or social content.
  • Pair with Summarizer Agent to generate summaries and metadata together.
  • Use with Article to Email Agent to prepare tagged newsletter-ready content.
  • Apply consistent editorial Categories to power site-level filters and navigation.
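
To show why consistent Categories matter for filters and navigation, here is a small sketch that groups article IDs by category. The article IDs, the input shape, and the `build_category_index` name are made up for illustration and say nothing about how your CMS stores tags.

```python
from collections import defaultdict
from typing import Dict, List


def build_category_index(articles: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each category to the article IDs tagged with it."""
    index = defaultdict(list)
    for article_id, categories in articles.items():
        for category in categories:
            index[category].append(article_id)
    return dict(index)


# Hypothetical article IDs and their Categories tags.
tagged = {
    "a1": ["Health", "Lifestyle"],
    "a2": ["Health"],
    "a3": ["Faith"],
}
print(build_category_index(tagged))
```

If the same category were tagged as both "Health" and "health", it would split into two filter entries, which is the problem consistent tagging avoids.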

Example Use Case

Input:
Article: “President signs new climate regulation bill in Washington.”
Config: Include all tags except Sentiment

Output:

Topics: Politics, Environment  
Categories: Government, Climate  
Persons: [President’s name]  
Organizations: Congress  
Locations: Washington, United States  
Entities: Climate Regulation, Environmental Policy  

Troubleshooting

  • Empty lists returned? → Check if the content actually contains named entities.
  • Wrong tags showing? → Refine input or add an approved tag list for stricter control.
  • Missing Sentiment? → Confirm sentiment tagging is enabled in config.
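
One way to picture the approved tag list mentioned above is as a post-processing filter over the returned tags. This is only a sketch under that assumption; how the Agent itself accepts an approved list is not described here, and `filter_to_approved` is a hypothetical helper.

```python
from typing import Iterable, List


def filter_to_approved(tags: Iterable[str], approved: Iterable[str]) -> List[str]:
    """Keep only tags on the approved list, normalized to the approved spelling."""
    canonical = {name.casefold(): name for name in approved}
    kept: List[str] = []
    for tag in tags:
        match = canonical.get(tag.strip().casefold())
        if match is not None and match not in kept:
            kept.append(match)  # use the approved casing, skip repeats
    return kept


print(filter_to_approved(["health", "Faith", "Gossip"],
                         ["Health", "Faith", "Lifestyle"]))
```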

FAQs

Q: Does it invent tags?
No — it only extracts what is clearly present.

Q: Can I disable certain fields?
Yes — adjust Tag Types in configuration (case-insensitive).

Q: Can it handle multiple languages?
Yes — but extracted tags will match the source language.

Q: Does it always return valid output?
Yes — even with malformed or blank config, safe defaults apply.