Keywords Extractor
Use our free website keyword extractor to crawl any website and extract keyword counts on each page.
Start URLs
startUrls
array, required
A static list of URLs to scrape. To be able to add new URLs on the fly, enable the Use request queue option.
For details, see Start URLs in README.
Use Browser
useBrowser
boolean, optional
If on, the actor uses a regular browser for scraping.
Default value of this property is false
Case sensitive
caseSensitive
boolean, optional
If on, keywords are matched only when their letter case matches exactly.
Default value of this property is false
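The effect of the Case sensitive option can be illustrated with a minimal sketch. The function count_keywords and its whole-word matching rule are assumptions made for illustration; the actor's own matching logic is not described here and may differ.

```python
import re


def count_keywords(text, keywords, case_sensitive=False):
    """Count whole-word occurrences of each keyword in text (hypothetical helper)."""
    # With case_sensitive=False, "Apple", "apple" and "APPLE" all count as matches.
    flags = 0 if case_sensitive else re.IGNORECASE
    counts = {}
    for keyword in keywords:
        # re.escape keeps characters such as "." or "+" in a keyword literal.
        pattern = r"\b" + re.escape(keyword) + r"\b"
        counts[keyword] = len(re.findall(pattern, text, flags))
    return counts


page_text = "Apple pie and apple juice. APPLE!"
print(count_keywords(page_text, ["apple"]))                       # {'apple': 3}
print(count_keywords(page_text, ["apple"], case_sensitive=True))  # {'apple': 1}
```

With the option off (the default), all three spellings are counted; with it on, only the exact lowercase form matches.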
Scan scripts
scanScripts
boolean, optional
If on, it will also count keywords appearing inside scripts.
Default value of this property is false
Link selector
linkSelector
string, optional
A CSS selector saying which links on the page (<a> elements with href attribute) shall be followed and added to the request queue. This setting only applies if Use request queue is enabled. To filter the links added to the queue, use the Pseudo-URLs setting.
If Link selector is empty, the page links are ignored.
For details, see Link selector in README.
Pseudo-URLs
pseudoUrls
array, optional
Specifies what kind of URLs found by Link selector should be added to the request queue. A pseudo-URL is a URL with regular expressions enclosed in [] brackets, e.g. http://www.example.com/[.*]. This setting only applies if the Use request queue option is enabled.
If Pseudo-URLs are omitted, the actor enqueues all links matched by the Link selector.
For details, see Pseudo-URLs in README.
Default value of this property is []
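To make the pseudo-URL format concrete, here is a rough sketch of how such a pattern could be turned into a regular expression: text outside the brackets is matched literally, and text inside [] is used as a regex. The helper pseudo_url_to_regex is hypothetical and assumes brackets are never nested; the actor's real matching rules (for example, case handling) may differ.

```python
import re


def pseudo_url_to_regex(pseudo_url):
    """Compile a pseudo-URL into a regex (illustrative, no nested brackets)."""
    parts = []
    position = 0
    # Find each [...] section that contains no further brackets.
    for match in re.finditer(r"\[([^\[\]]*)\]", pseudo_url):
        # Literal text before the bracket is escaped so "." means a real dot.
        parts.append(re.escape(pseudo_url[position:match.start()]))
        # The bracket content is inserted as a regex fragment.
        parts.append("(?:" + match.group(1) + ")")
        position = match.end()
    parts.append(re.escape(pseudo_url[position:]))
    return re.compile("^" + "".join(parts) + "$")


pattern = pseudo_url_to_regex("http://www.example.com/[.*]")
print(bool(pattern.match("http://www.example.com/pages/about")))  # True
print(bool(pattern.match("http://www.example.org/pages/about")))  # False
```

Links found by the Link selector would then be enqueued only if they match at least one such pattern.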
Max depth
maxDepth
integer, optional
How many links deep to crawl from the Start URLs. Start URLs have depth 0.
Default value of this property is 5
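The depth rule can be sketched as a breadth-first crawl over an in-memory link graph. The crawl function, the get_links callback, and the sample site are all hypothetical and stand in for real page loading; this is an illustration of the depth limit, not the actor's implementation.

```python
from collections import deque


def crawl(start_urls, get_links, max_depth=5):
    """Breadth-first crawl; returns {url: depth} for every visited page."""
    visited = {}
    # Start URLs are enqueued at depth 0.
    queue = deque((url, 0) for url in start_urls)
    while queue:
        url, depth = queue.popleft()
        if url in visited:
            continue
        visited[url] = depth
        # Links are only followed from pages shallower than max_depth.
        if depth < max_depth:
            for link in get_links(url):
                if link not in visited:
                    queue.append((link, depth + 1))
    return visited


# A tiny fake site: each key lists the links found on that page.
site = {"/": ["/a", "/b"], "/a": ["/a/1"], "/a/1": ["/a/1/x"], "/b": []}
print(crawl(["/"], lambda url: site.get(url, []), max_depth=2))
```

With max_depth=2, the page /a/1 (depth 2) is visited but its link to /a/1/x (depth 3) is not followed.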
Proxy configuration
proxyConfiguration
object, optional
Specifies proxy servers that will be used by the scraper in order to hide its origin.
For details, see Proxy configuration in README.
Default value of this property is {}
Max pages per run
maxPagesPerCrawl
integer, optional
The maximum number of pages that the scraper will load. The scraper will stop when this limit is reached. It's always a good idea to set this limit in order to prevent excess platform usage for misconfigured scrapers. Note that the actual number of pages loaded might be slightly higher than this value.
If set to 0, there is no limit.
Default value of this property is 100
Max concurrency
maxConcurrency
integer, optional
Specifies the maximum number of pages that can be processed by the scraper in parallel. The scraper automatically increases and decreases concurrency based on available system resources. This option enables you to set an upper limit, for example to reduce the load on a target website.
Default value of this property is 50
Retire Instance After Request Count
retireInstanceAfterRequestCount
integer, optional
How many requests a browser instance handles before it is retired and replaced. Use a higher value to reduce resource consumption, or a lower value to rotate (test) more proxies.
Default value of this property is 50
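As a summary, the fields above can be assembled into a single input object. The sketch below uses the property names and default values listed on this page; the helper build_input, the {"url": ...} shape of each Start URL entry, and the a[href] selector in the usage example are assumptions for illustration and should be checked against the README.

```python
import copy
import json

# Defaults as stated in the field descriptions above.
DEFAULTS = {
    "useBrowser": False,
    "caseSensitive": False,
    "scanScripts": False,
    "pseudoUrls": [],
    "maxDepth": 5,
    "proxyConfiguration": {},
    "maxPagesPerCrawl": 100,
    "maxConcurrency": 50,
    "retireInstanceAfterRequestCount": 50,
}

# linkSelector has no documented default, so it is only accepted as an override.
ALLOWED_OVERRIDES = set(DEFAULTS) | {"linkSelector"}


def build_input(start_urls, **overrides):
    """Build an input dict; startUrls is the only required field."""
    if not start_urls:
        raise ValueError("startUrls is required and must not be empty")
    unknown = set(overrides) - ALLOWED_OVERRIDES
    if unknown:
        raise ValueError(f"Unknown input fields: {sorted(unknown)}")
    # Assumed entry shape: one {"url": ...} object per start URL.
    run_input = {"startUrls": [{"url": url} for url in start_urls]}
    run_input.update(copy.deepcopy(DEFAULTS))
    run_input.update(overrides)
    return run_input


example = build_input(
    ["http://www.example.com/"],
    linkSelector="a[href]",  # illustrative selector, not a documented default
    pseudoUrls=["http://www.example.com/[.*]"],
    maxDepth=2,
)
print(json.dumps(example, indent=2))
```

Rejecting unknown field names early is a design choice in this sketch; it catches typos such as maxdepth before a run is started.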
Actor Metrics
18 monthly users
9 stars
>99% runs succeeded
Created in Mar 2020
Modified 4 years ago