Configuration & Settings

These settings control how WebSync discovers pages, extracts content, and sends sources to NotebookLM.

Include and exclude filters (regex)

Filters are regular expressions that decide which URLs are allowed or blocked during a crawl.

Use them to:

Example patterns:
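
As an illustration of how include and exclude filters could combine, here is a minimal Python sketch. The function name `is_allowed`, the sample patterns, and the precedence rules (an exclude match always wins, and an empty include list allows everything) are assumptions for this example, not documented WebSync behavior. Python's `re` module stands in for whatever regex engine WebSync uses.

```python
import re

def is_allowed(url, include=(), exclude=()):
    """Return True if the URL passes the include/exclude regex filters."""
    # Assumption: a match on any exclude pattern blocks the URL.
    if any(re.search(pattern, url) for pattern in exclude):
        return False
    # Assumption: with no include patterns, every non-excluded URL is allowed.
    if not include:
        return True
    return any(re.search(pattern, url) for pattern in include)

# Hypothetical patterns: stay inside /docs/, skip PDFs and the changelog.
include = [r"^https://example\.com/docs/"]
exclude = [r"\.pdf$", r"/changelog/"]

print(is_allowed("https://example.com/docs/intro", include, exclude))      # True
print(is_allowed("https://example.com/docs/guide.pdf", include, exclude))  # False
print(is_allowed("https://example.com/blog/post", include, exclude))       # False
```

Escaping the dots (`\.`) matters: an unescaped `.` matches any character, so `example.com` would also match `exampleXcom`.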

The maximum link depth (levels of recursion) that WebSync follows from the start page.

Examples:

The maximum number of pages WebSync will discover and fetch during a crawl. This controls crawl size and runtime.
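
The depth and page limits described above can be pictured with a breadth-first crawl over a small in-memory link graph. This is a sketch only: the `crawl` function, the `links` dictionary, and the convention that the start page sits at depth 0 are assumptions made for the example, and WebSync's actual traversal order is not specified here.

```python
from collections import deque

def crawl(start, links, max_depth, max_pages):
    """Breadth-first walk that stops at max_depth levels or max_pages pages."""
    seen = {start}
    fetched = []
    queue = deque([(start, 0)])  # Assumption: the start page is depth 0.
    while queue and len(fetched) < max_pages:
        url, depth = queue.popleft()
        fetched.append(url)
        if depth == max_depth:
            continue  # Do not follow links beyond the depth limit.
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return fetched

# Hypothetical site structure: page -> pages it links to.
links = {"/": ["/a", "/b"], "/a": ["/a/1", "/a/2"], "/b": ["/b/1"]}

print(crawl("/", links, max_depth=1, max_pages=10))  # ['/', '/a', '/b']
print(crawl("/", links, max_depth=2, max_pages=4))   # ['/', '/a', '/b', '/a/1']
```

In the first call the depth limit ends the crawl; in the second the page limit does, even though deeper pages are still reachable.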

The maximum number of sources sent to NotebookLM. This can be lower than the number of pages crawled, for example when you import only a subset of pages or when short pages are merged.

Choose how WebSync extracts content from HTML:

Choose how sources are sent to NotebookLM:

Merging: very short pages are concatenated into a single source to reduce noise.
Deduplication: not supported.
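
To make the merging behavior concrete, here is a rough sketch of concatenating short pages into one source. The `min_chars` threshold, the `merge_short_pages` name, and the placement of the merged source at the end of the list are all assumptions; the actual cutoff WebSync uses for "very short" is not stated here.

```python
def merge_short_pages(pages, min_chars=200):
    """Turn (url, text) pages into source texts, merging the short ones."""
    sources = []
    short_texts = []
    for _url, text in pages:
        if len(text) < min_chars:  # Hypothetical threshold for "very short".
            short_texts.append(text)
        else:
            sources.append(text)
    if short_texts:
        # All short pages become a single combined source.
        sources.append("\n\n".join(short_texts))
    return sources

pages = [
    ("https://example.com/guide", "x" * 500),
    ("https://example.com/faq-1", "short a"),
    ("https://example.com/faq-2", "short b"),
]
sources = merge_short_pages(pages)
print(len(pages), "pages ->", len(sources), "sources")  # 3 pages -> 2 sources
```

This also shows why the number of sources can end up lower than the number of pages crawled.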

You can use these variables in the template:

Use || for fallbacks, for example: { title || url }. Optional formats use date-fns format strings, for example: MM/dd/yyyy.
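
The `||` fallback can be read as "use the first variable that has a value". The sketch below is a hypothetical Python renderer, not WebSync's own implementation: the `render` function is an assumption, and date formatting is left out because the format syntax is not shown here.

```python
import re

def render(template, variables):
    """Replace each { a || b } placeholder with the first non-empty variable."""
    def substitute(match):
        for name in match.group(1).split("||"):
            value = variables.get(name.strip())
            if value:  # Missing or empty values fall through to the next name.
                return str(value)
        return ""  # Assumption: no usable value renders as an empty string.
    return re.sub(r"\{([^{}]*)\}", substitute, template)

page = {"title": "", "url": "https://example.com/a"}
print(render("{ title || url }", page))  # https://example.com/a

page_with_title = {"title": "Intro", "url": "https://example.com/a"}
print(render("Docs: { title || url }", page_with_title))  # Docs: Intro
```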

When enabled, WebSync skips the audit step and immediately sends all sources to NotebookLM. This is the default behavior.