Overview

The Extract LD+JSON Schema block extracts structured data markup from web pages. JSON-LD (JavaScript Object Notation for Linked Data) is the format Google recommends for structured data, and this block shows you which schema markup a competitor page, or any other webpage, uses. Use it to audit SEO structured data, analyze competitor schema implementations, or gather insights for your own schema strategy.
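
For orientation, JSON-LD lives in `<script type="application/ld+json">` tags in a page's HTML. The sketch below shows one way such tags could be collected with Python's standard library. It is an illustration only and not the block's actual implementation; `JsonLdParser`, `extract_jsonld`, and `SAMPLE_HTML` are hypothetical names.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def extract_jsonld(html):
    """Return a flat list of JSON-LD objects found in an HTML string."""
    parser = JsonLdParser()
    parser.feed(html)
    schemas = []
    for raw in parser.blocks:
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
        # A script tag may hold a single object or a list of objects.
        schemas.extend(parsed if isinstance(parsed, list) else [parsed])
    return schemas


# Hypothetical sample page used only for this illustration.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "headline": "How to Improve SEO Rankings"}
</script>
</head><body></body></html>
"""

print(extract_jsonld(SAMPLE_HTML)[0]["@type"])  # prints "Article"
```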

Configuration

URL

Enter the webpage URL to extract JSON-LD schemas from. This field supports placeholders for dynamic values. Examples:
  • Static URL: https://example.com/blog/article
  • From previous step: {{step_1.output.url}}
  • From loop: {{current.link}}

Output Format

Choose how the extracted schema data is returned.
| Format | Description | Credits |
|--------|-------------|---------|
| Structured | Raw JSON-LD schema data exactly as found on the page | 1 credit |
| Summary | AI-processed checklist showing which schema types are present | 5 credits |

Structured Output

Returns the raw JSON-LD schema objects extracted from the page.

Output Structure

[
  {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "How to Improve SEO Rankings",
    "author": {
      "@type": "Person",
      "name": "John Smith"
    },
    "datePublished": "2024-01-15",
    "publisher": {
      "@type": "Organization",
      "name": "Example Blog"
    }
  },
  {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    "itemListElement": [...]
  }
]

Accessing Structured Data

Get all schemas:
{{step_n.output}}
Get first schema type:
{{step_n.output[0].@type}}
Loop through schemas:
{% for schema in step_n.output %}
  Type: {{ schema.@type }}
{% endfor %}
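
If you post-process the Structured output outside the workflow, note that schema types can also be nested, such as the `Person` inside `author` in the example above. The following is a minimal Python sketch, assuming the output has been loaded as a list of dictionaries; `collect_types` and `sample_output` are hypothetical names, not part of the block.

```python
def collect_types(node, found=None):
    """Walk a JSON-LD structure and return every @type value, in first-seen order."""
    if found is None:
        found = []
    if isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            declared = [declared]  # @type may be a single string or a list
        for name in declared or []:
            if isinstance(name, str) and name not in found:
                found.append(name)
        for value in node.values():
            collect_types(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_types(item, found)
    return found


# Sample data modeled on the output structure shown above.
sample_output = [
    {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": "How to Improve SEO Rankings",
        "author": {"@type": "Person", "name": "John Smith"},
        "datePublished": "2024-01-15",
        "publisher": {"@type": "Organization", "name": "Example Blog"},
    },
    # itemListElement is left empty here; its contents are elided in the example.
    {"@context": "https://schema.org", "@type": "BreadcrumbList", "itemListElement": []},
]

print(collect_types(sample_output))
# prints ['Article', 'Person', 'Organization', 'BreadcrumbList']
```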

Summary Output

Returns a markdown table summarizing which schema types are present on the page.

Output Structure

| Schema Type | Status |
|-------------|--------|
| Article | Present |
| Author | Present |
| Organization | Present |
| BreadcrumbList | Present |
This format is useful for quick audits and comparisons without parsing raw JSON.
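
If a later step needs the summary as data rather than text, the table can be turned into a dictionary. The sketch below assumes the Summary output keeps the two-column shape shown above; because the summary is AI-processed, the real output may vary, so treat this as a starting point. `parse_summary_table` is a hypothetical helper.

```python
def parse_summary_table(markdown):
    """Convert a two-column markdown table into a {schema type: status} dict."""
    rows = {}
    for line in markdown.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 2:
            continue  # ignore anything that is not a two-column row
        name, status = cells
        # Skip the header row and the |-----|-----| separator row.
        if name == "Schema Type" or set(name) <= {"-", ":"}:
            continue
        rows[name] = status
    return rows


summary = """
| Schema Type | Status |
|-------------|--------|
| Article | Present |
| Author | Present |
| Organization | Present |
| BreadcrumbList | Present |
"""

print(parse_summary_table(summary)["Article"])  # prints "Present"
```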

Supported Schema Types

The block extracts any JSON-LD schema types embedded in web pages, including:
| Category | Schema Types |
|----------|--------------|
| Content | Article, NewsArticle, BlogPosting, WebPage, FAQPage |
| People & Orgs | Person, Author, Organization, LocalBusiness |
| Products | Product, Offer, AggregateRating, Review |
| Media | VideoObject, ImageObject, AudioObject |
| Events | Event, EventSeries |
| Navigation | BreadcrumbList, SiteNavigationElement |
| Other | Recipe, HowTo, Course, JobPosting, and more |

Best Practices

  • Use Structured format when you need to analyze specific schema properties
  • Use Summary format for quick audits comparing multiple pages
  • Combine with the Loop block to audit schema across multiple URLs
  • Extract competitor schemas to inform your own structured data strategy
  • Validate extracted schemas against Google’s requirements
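
As a rough illustration of the last point, a property check over Structured output might look like the sketch below. The property lists in `EXPECTED_PROPERTIES` are illustrative assumptions, not Google's official requirements; replace them with the properties from the current Google documentation for each type. `missing_properties` is a hypothetical helper.

```python
# ASSUMPTION: illustrative property lists only, not Google's official requirements.
EXPECTED_PROPERTIES = {
    "Article": ["headline", "author", "datePublished", "image"],
    "Product": ["name", "offers"],
}


def missing_properties(schema, expected=EXPECTED_PROPERTIES):
    """Return the expected properties that are absent or empty in one schema object."""
    schema_type = schema.get("@type")
    wanted = expected.get(schema_type, []) if isinstance(schema_type, str) else []
    return [prop for prop in wanted if not schema.get(prop)]


article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "How to Improve SEO Rankings",
    "author": {"@type": "Person", "name": "John Smith"},
    "datePublished": "2024-01-15",
    "publisher": {"@type": "Organization", "name": "Example Blog"},
}

print(missing_properties(article))  # prints ['image']
```

Types that are not listed in the mapping return an empty list, so unknown schemas pass through without raising errors.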

Common Use Cases

| Use Case | Configuration Tips |
|----------|--------------------|
| Competitor schema audit | Extract schemas from competitor URLs, compare coverage |
| SEO gap analysis | Check which schema types competitors use that you don’t |
| Content template creation | Analyze top-ranking pages to inform your schema strategy |
| Bulk schema validation | Loop through sitemap URLs, extract and compare schemas |
| Rich snippet research | Identify which schemas drive rich results in your niche |
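
The gap analysis use case reduces to a set difference once schema types have been collected per page. Here is a minimal sketch, assuming you already have a list of your own types and a mapping of competitor URL to types; `schema_gap` and the sample URLs are hypothetical.

```python
def schema_gap(own_types, competitor_pages):
    """Map each schema type you lack to the competitor URLs that use it."""
    own = set(own_types)
    gap = {}
    for url, types in competitor_pages.items():
        for name in set(types) - own:
            gap.setdefault(name, []).append(url)
    # Sort keys and URLs so the result is stable between runs.
    return {name: sorted(urls) for name, urls in sorted(gap.items())}


own = ["Article", "BreadcrumbList"]
competitors = {
    "https://example.com/a": ["Article", "FAQPage"],
    "https://example.com/b": ["FAQPage", "VideoObject"],
}

print(schema_gap(own, competitors))
# prints {'FAQPage': ['https://example.com/a', 'https://example.com/b'], 'VideoObject': ['https://example.com/b']}
```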

Example Workflow: Competitor Schema Audit

Analyze competitor structured data:
  1. Google Search Block:
    • Query: {{input.keyword}}
    • Max Results: 10
  2. Loop Block: Process each result
  3. Extract LD+JSON Schema Block:
    • URL: {{current.link}}
    • Output Format: Summary
  4. LLM Block:
    Analyze these schema summaries from top-ranking pages:
    
    1. Which schema types appear most frequently?
    2. What schemas should we implement?
    3. Any unique schemas used by top competitors?
    
    {{step_3.output}}
    
  5. Google Sheets Block: Store analysis
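
The first question in the LLM prompt above (which schema types appear most frequently) can also be answered deterministically if you have the schema types for each page. This is a sketch under that assumption; `type_frequency` and the sample data are hypothetical, and each type is counted once per page.

```python
from collections import Counter


def type_frequency(pages):
    """Count how many pages use each schema type, most common first."""
    counts = Counter()
    for types in pages:
        counts.update(set(types))  # count a type once per page
    # Break ties alphabetically so the ordering is stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


pages = [
    ["Article", "BreadcrumbList"],
    ["Article", "FAQPage"],
    ["Article", "BreadcrumbList", "Article"],
]

print(type_frequency(pages))
# prints [('Article', 3), ('BreadcrumbList', 2), ('FAQPage', 1)]
```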

Example Workflow: Schema Coverage Report

Generate a schema audit for your own site:
  1. Get URLs from Sitemap Block:
    • Sitemap URL: https://yoursite.com/sitemap.xml
  2. Loop Block: Process each URL
  3. Extract LD+JSON Schema Block:
    • URL: {{current.url}}
    • Output Format: Structured
  4. LLM Block:
    Review this page's JSON-LD schema:
    - Is Article/BlogPosting schema present?
    - Are required properties included?
    - Any missing recommended properties?
    
    {{step_3.output}}
    
  5. Google Sheets Block: Append audit results
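
For a report that does not depend on an LLM step, the Structured output for each URL can be reduced to one row per page. The sketch below assumes a mapping of URL to the list of schema objects returned for it; `coverage_rows`, the column names, and the sample URLs are hypothetical. Only top-level `@type` values are inspected.

```python
def coverage_rows(pages, required_types=("Article", "BlogPosting")):
    """Build spreadsheet-style rows: URL, top-level schema types, and a yes/no flag."""
    rows = [["URL", "Schema Types", "Has Article Schema"]]
    for url, schemas in pages.items():
        types = [
            schema["@type"]
            for schema in schemas
            if isinstance(schema, dict) and isinstance(schema.get("@type"), str)
        ]
        has_required = any(name in required_types for name in types)
        rows.append([url, ", ".join(types) or "(none)", "yes" if has_required else "no"])
    return rows


site = {
    "https://yoursite.com/blog/post": [{"@type": "BlogPosting"}, {"@type": "BreadcrumbList"}],
    "https://yoursite.com/about": [],
}

for row in coverage_rows(site):
    print(row)
```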

Troubleshooting

| Issue | Cause | Solution |
|-------|-------|----------|
| No schema found | Page has no JSON-LD markup | Verify the page uses JSON-LD (not Microdata or RDFa) |
| Empty output | Schema in different format | Block only extracts JSON-LD, not other schema formats |
| Timeout | Large or slow page | Try a different URL or retry later |
| Access blocked | Site blocking automated requests | Some sites restrict scraping |
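
To check which structured data format a page uses, you can inspect its HTML source for the markers of each format. The sketch below is a rough heuristic based on regular expressions, not a full parser, and it can produce false positives. `detect_markup_formats` is a hypothetical helper.

```python
import re

# Heuristic markers: JSON-LD uses a script tag, Microdata uses itemscope,
# and RDFa uses typeof/vocab attributes.
MARKERS = {
    "json-ld": r'<script[^>]+type=["\']application/ld\+json["\']',
    "microdata": r"<[^>]+\sitemscope[\s=>]",
    "rdfa": r'<[^>]+\s(?:typeof|vocab)=["\']',
}


def detect_markup_formats(html):
    """Return the names of the structured data formats that appear in the HTML."""
    return [
        name
        for name, pattern in MARKERS.items()
        if re.search(pattern, html, re.IGNORECASE)
    ]


microdata_page = '<div itemscope itemtype="https://schema.org/Article"></div>'
print(detect_markup_formats(microdata_page))  # prints ['microdata']
```

A result without `json-ld` suggests the page publishes its schema in a format this block does not extract.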
