Generate Simplified DOM


Last updated 3 months ago

Type: generate_simplified_dom

The DOM of a web page contains a lot of data that can be discarded if you are only interested in the page's elements or want to pass the page to an LLM. The generate_simplified_dom action processes the HTML in the following way:

  • Removes all links in the head

  • Removes all script nodes and links to scripts

  • Removes all style nodes

  • Removes style attributes from all elements

  • Removes all links to stylesheets

  • Removes all noscript elements outside of the body

  • Strips the query string from every href that has one

  • Keeps important meta tags and removes all others

  • Removes all alternate links

  • Removes all SVG paths

  • Removes empty text nodes and excessive spacing
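
To make a few of these rules concrete, here is a minimal Python sketch that approximates four of them locally: dropping script and style nodes, dropping style attributes, stripping query strings from hrefs, and collapsing whitespace. It is not Gaffa's implementation. The names `Simplifier`, `simplify`, and `strip_query` are hypothetical, and the real action may handle edge cases differently (for example, whether URL fragments are kept).

```python
import re
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit, urlunsplit

# Assumption: only these two element types are dropped in this sketch.
DROPPED_TAGS = {"script", "style"}


def strip_query(href: str) -> str:
    """Return href without its query string (path and fragment are kept)."""
    return urlunsplit(urlsplit(href)._replace(query=""))


class Simplifier(HTMLParser):
    """Hypothetical helper that re-emits HTML with a few simplifications."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    def _render(self, tag, attrs):
        kept = []
        for name, value in attrs:
            if name == "style":
                continue  # drop inline styles
            if name == "href" and value:
                value = strip_query(value)
            kept.append(name if value is None else f'{name}="{escape(value)}"')
        self.out.append("<" + " ".join([tag] + kept) + ">")

    def handle_starttag(self, tag, attrs):
        if tag in DROPPED_TAGS:
            self.skip_depth += 1
        elif not self.skip_depth:
            self._render(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> have no matching end tag.
        if tag not in DROPPED_TAGS and not self.skip_depth:
            self._render(tag, attrs)

    def handle_endtag(self, tag):
        if tag in DROPPED_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth:
            return
        text = re.sub(r"\s+", " ", data)  # collapse excessive spacing
        if text.strip():  # skip empty text nodes
            self.out.append(escape(text, quote=False))


def simplify(html: str) -> str:
    parser = Simplifier()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = ('<div style="color:red"><script>var x=1;</script>'
          '<a href="/p?utm=1#top">Hi   there</a></div>')
print(simplify(sample))
```

The sketch uses only the Python standard library so it can be run as-is. The remaining rules in the list (meta tag filtering, alternate links, SVG paths, and so on) are left out to keep it short.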

Parameters

See universal parameters.

Usage

The following JSON captures the DOM of the page and simplifies it.

"actions": [
    {
        "type": "generate_simplified_dom"
    }
]
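
If you build requests programmatically, the same fragment can be produced in Python and serialized to JSON. This is a minimal sketch: the helper name `simplified_dom_actions` is hypothetical, and where the "actions" key sits within the full request body is defined by the API Reference, not by this example.

```python
import json


def simplified_dom_actions() -> list[dict]:
    """Return the actions list shown above (hypothetical helper name)."""
    return [{"type": "generate_simplified_dom"}]


# Only the "actions" fragment is built here; the rest of the request body
# (target URL, authentication, other settings) is intentionally left out.
fragment = {"actions": simplified_dom_actions()}
print(json.dumps(fragment, indent=4))
```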

We are actively working to improve this action and make it more configurable. Let us know if there's something you think we can improve.

Example Output

GaffaSimplifiedDOMSample.txt (6KB)