Prepare your content

You can improve the quality of scraped content by using data-epi-type attributes to control what gets extracted, and meta tags to provide additional metadata.

Annotate your content with `data-epi-type` attributes

Use data-epi-type attributes to specify which elements the scraper should extract. This helps exclude navigation, footers, and other non-article content from topic analysis.

data-epi-type="title" – Marks the primary title
data-epi-type="content" – Marks the main body content

If you use multiple data-epi-type="content" blocks, they are concatenated during extraction.

Example

<html>
  <head>
    <meta property="og:title" content="Getting Started with Content Recommendations" />
    <meta property="og:description" content="Learn how to implement personalized content recommendations." />
  </head>
  <body>
    <header>Site navigation - excluded from extraction</header>
    <h1 data-epi-type="title">Getting Started with Content Recommendations</h1>
    <div data-epi-type="content">
      <p>This is the main article content that will be analyzed.</p>
      <p>All paragraphs within this div are included for topic extraction.</p>
    </div>
    <footer>Footer content - excluded from extraction</footer>
  </body>
</html>

📘
Note
Content Recommendations collects meta tags separately from data-epi-type attributes. Both extraction methods work independently.
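
To check your `data-epi-type` annotations before publishing, you can preview locally which text they select. The following is a minimal sketch that uses only the Python standard library. It is not the scraper's implementation: the names `EpiTypePreview` and `preview_extraction` are hypothetical, and joining multiple content blocks with a single space is an assumption, since the separator used during extraction is not documented here.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not tracked as open elements.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class EpiTypePreview(HTMLParser):
    """Collects text found inside elements annotated with data-epi-type."""

    def __init__(self):
        super().__init__()
        self.parts = {"title": [], "content": []}
        self._open = []  # (tag, data-epi-type or None) for each open element

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self._open.append((tag, dict(attrs).get("data-epi-type")))

    def handle_endtag(self, tag):
        # Close the innermost open element with this tag name, if there is one.
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i][0] == tag:
                del self._open[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # The innermost ancestor annotated as "title" or "content" receives the text.
        for _, epi_type in reversed(self._open):
            if epi_type in self.parts:
                self.parts[epi_type].append(text)
                break


def preview_extraction(html):
    """Return the title and the concatenated content text (assumed separator: space)."""
    parser = EpiTypePreview()
    parser.feed(html)
    parser.close()
    return {key: " ".join(values) for key, values in parser.parts.items()}


sample = """
<body>
  <header>Site navigation</header>
  <h1 data-epi-type="title">Getting Started</h1>
  <div data-epi-type="content"><p>First block.</p></div>
  <aside>Related links</aside>
  <div data-epi-type="content"><p>Second block.</p></div>
  <footer>Footer</footer>
</body>
"""
result = preview_extraction(sample)
print(result)
```

In this sample, the header, aside, and footer text is dropped because none of those elements sits inside an annotated element, and the two content blocks end up in a single string.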

Provide metadata with meta tags

The scraper collects meta tags from your pages for content categorization and rendering in recommendation widgets.

Open Graph tags

You can use the following Open Graph tags:

og:title – Page title
og:description – Page description
og:image – Primary image URL
og:image:secure_url – Secure image URL variant
og:image:alt – Image alt text
og:locale – Content locale
og:region – Geographic region
og:type – Content type
og:url – Canonical URL

📘
Note
The tag og:image:alt is stored as og:image_alt.

Article tags

The scraper collects all tags starting with article:. Common tags include:

article:published_time – Publication date (must be in ISO 8601 format and include a timezone)
article:category – Content category
article:tag – Content tags (you can have multiple)
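
Because `article:published_time` must include a timezone, it can help to validate the value before you publish. The sketch below is one possible check and not part of the product; the helper name `has_timezone` is hypothetical. It relies on `datetime.fromisoformat`, which accepts only a subset of ISO 8601 on Python versions before 3.11, so treat a `False` result as a prompt to inspect the value rather than as a definitive verdict.

```python
from datetime import datetime


def has_timezone(value):
    """Return True if value parses as an ISO 8601 datetime that carries a UTC offset."""
    # fromisoformat() accepts a trailing "Z" only from Python 3.11 on, so normalize it.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError:
        return False
    return parsed.utcoffset() is not None


print(has_timezone("2025-02-03T15:19:00.000-05:00"))  # True
print(has_timezone("2025-02-03T15:19:00"))            # False: no timezone
```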

Custom tags

The scraper collects all tags starting with idio:. Use these for custom metadata:

<meta property="idio:author_id" content="12345" />
<meta property="idio:content_tier" content="premium" />

Nested fields

idio:* and article:* tags support nested fields. For example, idio:content:category is stored as metadata.idio.content.category.

Multiple meta tags with the same property are stored as an array, as in the following example:

<meta property="idio:content:category" content="Technology">
<meta property="idio:content:category" content="Software Development">

These tags are stored as follows:

{"idio": {"content": {"category": ["Technology", "Software Development"]}}}

You cannot use a field as both a direct value and a container for nested fields. For example, if you set idio:content:tier to "premium", you cannot also use idio:content:tier:category because tier is already assigned a value.
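
The nesting rules above can be modeled in a few lines to test a set of tags for conflicts before you deploy them. This is a sketch of the described behavior under stated assumptions, not the scraper's code: the function name `build_metadata` is hypothetical, and raising `ValueError` on a conflict is this sketch's choice, because the text does not say how the scraper itself handles a conflicting tag.

```python
def build_metadata(tags):
    """Fold (property, content) pairs into a nested dict, following the rules above."""
    metadata = {}
    for prop, content in tags:
        *parents, leaf = prop.split(":")
        node = metadata
        for key in parents:
            node = node.setdefault(key, {})
            if not isinstance(node, dict):
                # The field already holds a direct value, so it cannot contain nested fields.
                raise ValueError(f"{prop}: '{key}' already holds a value")
        if leaf not in node:
            node[leaf] = content
        elif isinstance(node[leaf], dict):
            # The field already contains nested fields, so it cannot hold a direct value.
            raise ValueError(f"{prop}: '{leaf}' already contains nested fields")
        elif isinstance(node[leaf], list):
            node[leaf].append(content)
        else:
            # A second tag with the same property turns the value into an array.
            node[leaf] = [node[leaf], content]
    return metadata


tags = [
    ("idio:content:category", "Technology"),
    ("idio:content:category", "Software Development"),
]
print(build_metadata(tags))
# {'idio': {'content': {'category': ['Technology', 'Software Development']}}}
```

Passing `idio:content:tier` followed by `idio:content:tier:category` to this function raises an error, which mirrors the restriction described above.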

📘
Note
Article and custom tags work with both property and name attributes.

Example

<meta property="og:title" content="How to Optimize Content Recommendations" />
<meta property="og:description" content="Best practices for content recommendations." />
<meta property="og:image" content="https://www.example.com/images/guide.jpg" />

<meta property="article:published_time" content="2025-02-03T15:19:00.000-05:00" />
<meta property="article:category" content="Technology" />
<meta property="article:tag" content="Content Recommendations" />
<meta property="article:tag" content="Personalization" />

<meta property="idio:author_id" content="12345" />
<meta property="idio:content_tier" content="premium" />
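
To see which meta tags on a page fall into the groups described in this section, you could list them with a short script. The sketch below is illustrative only, and the names `MetaTagLister` and `list_collected_meta` are hypothetical. It matches Open Graph tags by their `property` attribute only, which is an assumption: the text states that article and custom tags work with both `property` and `name`, but does not say so for Open Graph tags.

```python
from html.parser import HTMLParser

# Open Graph tags listed in this section.
OG_TAGS = {"og:title", "og:description", "og:image", "og:image:secure_url",
           "og:image:alt", "og:locale", "og:region", "og:type", "og:url"}
# Prefixes for article and custom tags.
PREFIXES = ("article:", "idio:")


class MetaTagLister(HTMLParser):
    """Records (key, content) for meta tags that match the documented groups."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        content = attrs.get("content")
        if content is None:
            return
        prop, name = attrs.get("property"), attrs.get("name")
        if prop in OG_TAGS:  # assumption: Open Graph tags are matched on property only
            self.found.append((prop, content))
            return
        for key in (prop, name):
            if key and key.startswith(PREFIXES):
                self.found.append((key, content))
                break


def list_collected_meta(html):
    parser = MetaTagLister()
    parser.feed(html)
    parser.close()
    return parser.found


sample_head = """
<meta property="og:title" content="How to Optimize Content Recommendations" />
<meta name="viewport" content="width=device-width" />
<meta name="article:tag" content="Personalization" />
<meta property="idio:author_id" content="12345" />
"""
collected = list_collected_meta(sample_head)
for key, content in collected:
    print(key, "=", content)
```

The `viewport` tag is skipped because it matches none of the documented groups.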
