Website Backup

mhamas/website-backup

Creates a backup of any website by crawling it, so that you don't lose any content by accident. Ideal, for example, for your personal or company blog.

Start URLs

startURLs (array, optional)

List of URL entry points. Each entry is an object of the form {'url': 'http://www.example.com'}.
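
As a minimal sketch of the entry shape described above, the snippet below wraps plain URL strings into objects with a 'url' key. The helper name to_start_urls and the second example URL are assumptions made for illustration; they are not part of the actor.

```python
import json


def to_start_urls(urls):
    """Wrap plain URL strings in the {'url': ...} object shape used by startURLs."""
    return [{"url": u} for u in urls]


# Hypothetical entry points, used only to show the resulting structure.
start_urls = to_start_urls([
    "http://www.example.com",
    "http://www.example.com/blog",
])

# Print the fragment of the input object that holds the start URLs.
print(json.dumps({"startURLs": start_urls}, indent=2))
```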

Link selector

linkSelector (string, optional)

CSS selector matching elements with 'href' attributes that should be enqueued. To enqueue urls from

Max pages per run

maxRequestsPerCrawl (integer, optional)

The maximum number of pages that the scraper will load. The scraper will stop when this limit is reached. It's always a good idea to set this limit in order to prevent excess platform usage for misconfigured scrapers. Note that the actual number of pages loaded might be slightly higher than this value.

If set to 0, there is no limit.

Default value of this property is 10

Max crawling depth

maxCrawlingDepth (integer, optional)

Defines how many links away from the start URLs the scraper will descend. 0 means unlimited.

Default value of this property is 0

Max concurrency

maxConcurrency (integer, optional)

Defines how many pages can be processed by the scraper in parallel. The scraper automatically increases and decreases concurrency based on available system resources. Use this option to set a hard limit.

Default value of this property is 50

Custom key-value store

customKeyValueStore (string, optional)

Use a custom named key-value store for saving results. If a key-value store with this name doesn't exist yet, it is created. The snapshots of the pages are saved in this key-value store.

Default value of this property is ""

Custom dataset

customDataset (string, optional)

Use a custom named dataset for saving metadata. If a dataset with this name doesn't exist yet, it is created. The metadata about the snapshots of the pages is saved in this dataset.

Default value of this property is ""

Timeout (in seconds) for backing up a single URL

timeoutForSingleUrlInSeconds (integer, optional)

Timeout in seconds for backing up a single URL. Try increasing this timeout if you see the error "Error: handlePageFunction timed out after X seconds."

Default value of this property is 120

navigationTimeoutInSeconds (integer, optional)

Timeout in seconds within which the navigation needs to finish. Try increasing this value if you see the error "Navigation timeout of XXX ms exceeded".

Default value of this property is 120

URL search parameters to ignore

searchParamsToIgnore (array, optional)

Names of URL search parameters (such as 'source', 'sourceid', etc.) that should be ignored in the URLs when crawling.

Default value of this property is []
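
To illustrate what ignoring search parameters can mean in practice, here is a small sketch that drops the named parameters from a URL so that two links differing only in those parameters compare as equal. This is an assumption about the general technique, not the actor's actual implementation; the function name strip_ignored_params and the example URL are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_ignored_params(url, ignored):
    """Return the URL with every search parameter named in `ignored` removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in ignored
    ]
    # Rebuild the URL with only the parameters that were kept.
    return urlunsplit(parts._replace(query=urlencode(kept)))


ignored = {"source", "sourceid"}
print(strip_ignored_params("https://blog.apify.com/post?id=7&source=newsletter", ignored))
```

With a normalization step like this, the same page reached through differently tagged links would be treated as a single URL.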

Only consider pages under the same domain as one of the provided URLs.

sameOrigin (boolean, optional)

Only back up URLs with the same origin as one of the start URLs. For example, when turned on for a single start URL https://blog.apify.com, only links with the prefix https://blog.apify.com are backed up recursively.

Default value of this property is true
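
The same-origin rule can be sketched as a comparison of scheme, host, and port between a candidate link and each start URL. The code below is an illustrative assumption, not the actor's own check; the names origin and has_same_origin are hypothetical, and filling in default ports for http and https is a choice made for this sketch.

```python
from urllib.parse import urlsplit

# Default ports, so that an explicit :443 on an https URL still matches.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that identifies a URL's origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def has_same_origin(url, start_urls):
    """True if `url` shares its origin with at least one of the start URLs."""
    allowed = {origin(start) for start in start_urls}
    return origin(url) in allowed


start_urls = ["https://blog.apify.com"]
for link in ["https://blog.apify.com/some-post", "https://apify.com/store"]:
    print(link, has_same_origin(link, start_urls))
```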

Proxy configuration

proxyConfiguration (object, optional)

Choose to use no proxy, Apify Proxy, or provide custom proxy URLs.

Default value of this property is {}
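
Putting the options together, the following sketch assembles a complete input object from the default values listed on this page and lets individual fields be overridden. The helper build_input and the chosen override values are assumptions for illustration; only the field names and defaults come from the descriptions above. No defaults are listed for startURLs and linkSelector, so the sketch leaves them unset unless provided.

```python
import copy
import json

# Defaults as documented on this page.
DEFAULTS = {
    "maxRequestsPerCrawl": 10,
    "maxCrawlingDepth": 0,
    "maxConcurrency": 50,
    "customKeyValueStore": "",
    "customDataset": "",
    "timeoutForSingleUrlInSeconds": 120,
    "navigationTimeoutInSeconds": 120,
    "searchParamsToIgnore": [],
    "sameOrigin": True,
    "proxyConfiguration": {},
}

# Fields without a documented default.
KNOWN_KEYS = set(DEFAULTS) | {"startURLs", "linkSelector"}


def build_input(**overrides):
    """Merge overrides onto the documented defaults, rejecting unknown fields."""
    unknown = set(overrides) - KNOWN_KEYS
    if unknown:
        raise KeyError(f"Unknown input fields: {sorted(unknown)}")
    merged = copy.deepcopy(DEFAULTS)  # avoid sharing the mutable defaults
    merged.update(overrides)
    return merged


actor_input = build_input(
    startURLs=[{"url": "https://blog.apify.com"}],
    maxRequestsPerCrawl=200,
    searchParamsToIgnore=["source", "sourceid"],
)
print(json.dumps(actor_input, indent=2))
```

Rejecting unknown keys is a design choice of this sketch: it surfaces misspelled field names early instead of passing them along silently.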

Developer
Maintained by Community

Actor Metrics

  • 5 monthly users

  • 4 stars

  • >99% runs succeeded

  • Created in Jul 2020

  • Modified 4 years ago