Configuration & Settings

These settings control how WebSync discovers pages, extracts content, and sends sources to NotebookLM.

Filters are regular expressions that decide which URLs are allowed or blocked during a crawl.

Use them to:

  • Limit crawling to specific paths (for example, only /docs/).
  • Exclude noise like /tag/, /search/, or /category/.

Example patterns:

  • Include only docs: https?://example\.com/docs/.*
  • Exclude tags and search: https?://example\.com/(tag|search)/.*
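
As a rough illustration of how include and exclude patterns can combine, the sketch below filters a list of URLs. The `isAllowed` helper, the rule that an exclude match always wins, and the `^` anchors are assumptions made for this example, not WebSync's documented behavior. The dots in the host name are escaped so that they match only a literal dot.

```typescript
// Hypothetical helper for illustration; not part of WebSync's API.
function isAllowed(url: string, include: RegExp[], exclude: RegExp[]): boolean {
  // Assumption: a URL matching any exclude pattern is blocked.
  if (exclude.some((re) => re.test(url))) return false;
  // Assumption: with no include patterns, everything not excluded is allowed.
  if (include.length === 0) return true;
  return include.some((re) => re.test(url));
}

const include: RegExp[] = [/^https?:\/\/example\.com\/docs\/.*/];
const exclude: RegExp[] = [/^https?:\/\/example\.com\/(tag|search)\/.*/];

const urls: string[] = [
  "https://example.com/docs/intro",
  "https://example.com/tag/news",
  "https://example.com/blog/post",
];

// Only the /docs/ URL passes both filters.
const allowed: string[] = urls.filter((u) => isAllowed(u, include, exclude));
console.log(allowed);
```

Testing a handful of real URLs against a pattern in this way before starting a large crawl is a cheap way to catch an overly broad or overly narrow filter.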

The maximum link depth (levels of recursion) from the start page.

Examples:

  • 0 = only the current page.
  • 1 = the current page plus direct links.

The maximum number of pages WebSync will discover and fetch during a crawl. This controls crawl size and runtime.
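
To make the interaction between the depth limit and the page limit concrete, here is a minimal breadth-first sketch over an in-memory link graph. The `links` data, the `crawl` function, and its parameter names are hypothetical and only illustrate the idea; WebSync's actual crawl order and internals are not described here.

```typescript
// Hypothetical link graph standing in for real pages.
const links: Record<string, string[]> = {
  "/": ["/a", "/b"],
  "/a": ["/a1", "/a2"],
  "/b": ["/"],
  "/a1": [],
  "/a2": [],
};

// Breadth-first crawl that stops at maxDepth levels or maxPages pages,
// whichever limit is reached first.
function crawl(start: string, maxDepth: number, maxPages: number): string[] {
  const visited: string[] = [];
  const seen = new Set<string>([start]);
  let frontier: string[] = [start];

  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const page of frontier) {
      if (visited.length >= maxPages) return visited;
      visited.push(page);
      // Queue links that have not been seen yet for the next level.
      for (const target of links[page] ?? []) {
        if (!seen.has(target)) {
          seen.add(target);
          next.push(target);
        }
      }
    }
    frontier = next;
  }
  return visited;
}

console.log(crawl("/", 0, 100)); // start page only
console.log(crawl("/", 1, 100)); // start page plus direct links
console.log(crawl("/", 2, 4));   // depth allows five pages, but the cap stops at four
```

In this sketch a depth of 0 returns only the start page and a depth of 1 adds its direct links, matching the examples above. The page cap applies independently of depth.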

The maximum number of sources sent to NotebookLM. This can be lower than the number of pages crawled, for example when you import only a subset or when short pages are merged.

Choose how WebSync extracts content from HTML:

  • Raw — send the HTML as-is.
  • parse5 — extract text nodes from HTML using the parse5 library.
  • node-html-markdown — convert HTML to Markdown.
  • defuddle (default) — extract structured content using defuddle.

Choose how sources are sent to NotebookLM:

  • Link — send the URL only (NotebookLM attempts extraction).
  • Content — send extracted content (best for pages requiring JS rendering).
  • Auto — WebSync chooses the best method per page.

  • Merging: very short pages are concatenated into a single source to reduce noise.
  • Deduplication: not supported.
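
The merging behavior can be pictured with the sketch below, which also shows why the number of sources can end up lower than the number of pages crawled. The `Page` and `Source` shapes, the `minChars` threshold, and the `buildSources` function are all assumptions for illustration; the actual threshold and merge order WebSync uses are not stated here.

```typescript
interface Page {
  url: string;
  text: string;
}

interface Source {
  urls: string[];
  text: string;
}

// Hypothetical merge step: pages shorter than minChars are concatenated
// into one combined source, then the result is capped at maxSources.
function buildSources(pages: Page[], minChars: number, maxSources: number): Source[] {
  const sources: Source[] = [];
  const shortPages: Page[] = [];

  for (const page of pages) {
    if (page.text.length < minChars) {
      shortPages.push(page);
    } else {
      sources.push({ urls: [page.url], text: page.text });
    }
  }

  if (shortPages.length > 0) {
    sources.push({
      urls: shortPages.map((p) => p.url),
      text: shortPages.map((p) => p.text).join("\n\n"),
    });
  }

  // Assumption: anything beyond the source limit is simply dropped.
  return sources.slice(0, maxSources);
}

const pages: Page[] = [
  { url: "/guide", text: "A long guide page with plenty of text." },
  { url: "/a", text: "Short A" },
  { url: "/b", text: "Short B" },
  { url: "/ref", text: "A long reference page with plenty of text." },
];

const sources: Source[] = buildSources(pages, 20, 300);
console.log(sources.length); // four pages become three sources
```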

You can use these variables in the template:

  • {title}
  • {url}
  • {timestamp:format?}
  • {date:format?}

Use || for fallbacks, for example: { title || url }. The optional format after the colon is a date-fns format string, for example: {date:MM/dd/yyyy}.
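
The fallback syntax can be read as "use the first variable that has a value". The sketch below shows one way such a template could be resolved. `renderTemplate` is a hypothetical function, not WebSync's implementation, and it deliberately ignores the date and timestamp format options to stay self-contained.

```typescript
type TemplateVars = Record<string, string | undefined>;

// Hypothetical renderer: replaces each {...} group with the first
// non-empty variable among the ||-separated candidates.
function renderTemplate(template: string, vars: TemplateVars): string {
  return template.replace(/\{([^{}]+)\}/g, (_match: string, expr: string): string => {
    for (const candidate of expr.split("||")) {
      const value = vars[candidate.trim()];
      if (value !== undefined && value !== "") return value;
    }
    // Assumption: unresolved variables render as an empty string.
    return "";
  });
}

const withTitle: string = renderTemplate("{title} ({url})", {
  title: "Intro",
  url: "https://example.com/docs/intro",
});

const untitled: string = renderTemplate("{ title || url }", {
  title: "",
  url: "https://example.com/docs/intro",
});

console.log(withTitle); // Intro (https://example.com/docs/intro)
console.log(untitled);  // https://example.com/docs/intro
```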

When auto-start import is enabled, WebSync skips the audit step and sends all sources to NotebookLM as soon as the crawl finishes. This is the default behavior.

  • Max depth: 3
  • Max pages to crawl: 1000
  • Max sources to import: 300 (adjust based on your NotebookLM plan)
  • Parsing method: defuddle
  • Posting method: auto
  • Auto-start import after crawl: on
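
For readers who think in terms of a settings object, the values listed above could be modeled roughly as follows. The type names, field names, and the `withOverrides` helper are hypothetical and chosen for this example; WebSync may store its settings differently.

```typescript
type ParsingMethod = "raw" | "parse5" | "node-html-markdown" | "defuddle";
type PostingMethod = "link" | "content" | "auto";

// Hypothetical shape for the settings described on this page.
interface CrawlSettings {
  maxDepth: number;
  maxPages: number;
  maxSources: number;
  parsingMethod: ParsingMethod;
  postingMethod: PostingMethod;
  autoStartImport: boolean;
}

// Values taken from the list above.
const baseSettings: CrawlSettings = {
  maxDepth: 3,
  maxPages: 1000,
  maxSources: 300,
  parsingMethod: "defuddle",
  postingMethod: "auto",
  autoStartImport: true,
};

// Returns a copy of the base settings with selected fields replaced.
function withOverrides(overrides: Partial<CrawlSettings>): CrawlSettings {
  return { ...baseSettings, ...overrides };
}

// Example: a small, shallow crawl that waits for review before importing.
const cautious: CrawlSettings = withOverrides({
  maxDepth: 1,
  maxPages: 50,
  autoStartImport: false,
});
console.log(cautious);
```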