Dataset Validity Checker
Automatically checks whether the default datasets created by runs of an actor differ too much from previously encountered ones. This lets it warn you about web scraping problems caused by, e.g., a change in a website's layout, or about other significant changes in the resulting data.
Actor Id
actId
string, optional
Id of the actor whose datasets the validity checker is supposed to process.
Task Id
taskId
string, optional
Id of the task whose datasets the validity checker is supposed to process. Supersedes the actId.
User Token
token
string, optional
Token of the user who owns the examined actor/task. If left empty, the token of the user starting the Dataset Validity Checker is used.
Warning Email
warningEmail
string, optional
The email address to which warnings about invalid datasets should be sent.
Clear History
clearHistory
boolean, optional
Set to true if you want the validity checker to discard all previously gathered information about datasets and start anew. Use this option if you change the actor in a way that significantly changes its results, or if the website changes significantly in a way that doesn't actually break your actor (e.g. the number of different items available for purchase at an e-shop changes drastically).
Default value of this property is false
Previous Datasets Considered
previousDatasetsTakenIntoAccount
integer, optional
The number of previous datasets that will be considered when determining whether a dataset is valid. If left empty, the value will be 100.
Minimal Datasets
minimalDatasetCount
integer, optional
The minimal number of datasets that must already have been processed before further datasets can be validated. Must not exceed the value of 'Previous Datasets Considered'. If left empty, the value will be 10.
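The relationship between these two settings can be checked before a run is started. The following is a minimal sketch, assuming the input is assembled as a plain Python dictionary; the helper name `resolve_history_window` is hypothetical and not part of the actor.

```python
from typing import Tuple

DEFAULT_PREVIOUS_DATASETS = 100  # documented default for previousDatasetsTakenIntoAccount
DEFAULT_MINIMAL_DATASETS = 10    # documented default for minimalDatasetCount


def resolve_history_window(actor_input: dict) -> Tuple[int, int]:
    """Hypothetical helper: apply the documented defaults and check the constraint."""
    previous = actor_input.get("previousDatasetsTakenIntoAccount", DEFAULT_PREVIOUS_DATASETS)
    minimal = actor_input.get("minimalDatasetCount", DEFAULT_MINIMAL_DATASETS)
    if minimal > previous:
        raise ValueError(
            "minimalDatasetCount must not exceed previousDatasetsTakenIntoAccount"
        )
    return previous, minimal
```

Calling `resolve_history_window({})` falls back to both documented defaults, while an input such as `{"previousDatasetsTakenIntoAccount": 5}` is rejected because the default minimum of 10 would exceed it.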
Number Handling Policy
numberHandlingPolicy
Enum, optional
Governs which attributes the Dataset Validity Checker considers to be numbers. If it is 'Strict', only values stored as the number type are treated as numbers. If it is 'Loose', strings that contain numbers in non-scientific notation are also handled as numbers. The 'Strict' policy is generally better, but if you don't convert numbers to the proper type, 'Loose' should give you better results.
Value options:
"loose": string
"strict": string
Default value of this property is "loose"
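The difference between the two policies can be illustrated with a short sketch. It is not the actor's implementation: the function name `is_number` and the regular expression used for "non-scientific notation" are assumptions made for illustration only.

```python
import re

# Assumed pattern for a number in non-scientific notation: optional sign,
# digits, optional decimal part. The actor's real rule may differ.
_PLAIN_NUMBER = re.compile(r"[+-]?\d+(\.\d+)?")


def is_number(value, policy="loose"):
    """Hypothetical check mirroring the documented 'strict' and 'loose' policies."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool):
        return False
    # Values stored as a numeric type count under both policies.
    if isinstance(value, (int, float)):
        return True
    # Only the loose policy also accepts plain numeric strings.
    if policy == "loose" and isinstance(value, str):
        return _PLAIN_NUMBER.fullmatch(value.strip()) is not None
    return False
```

Under this sketch, a price scraped as the string `"42.5"` is treated as a number only with the loose policy, and a string in scientific notation such as `"1e5"` is not treated as a number under either policy.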
Starting At
startingAt
string, optional
Controls the earliest run whose dataset will be processed by this run of the Dataset Validity Checker. This value is overridden if runs from a later time have already been processed. Must be an ISO 8601 compliant date/time in UTC.
Until
until
string, optional
Controls the latest run whose dataset will be processed by this run of the Dataset Validity Checker. Must be an ISO 8601 compliant date/time in UTC.
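Values for `startingAt` and `until` can be produced with the Python standard library. This is a sketch under the assumption that a `Z`-suffixed timestamp such as `2019-08-31T12:00:00Z` is accepted; the page only states that the value must be ISO 8601 compliant and in UTC. The helper name `to_iso_utc` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_iso_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 string in UTC."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Example: restrict processing to the seven days before a chosen end time.
until = datetime(2019, 8, 31, 12, 0, 0, tzinfo=timezone.utc)
window = {
    "startingAt": to_iso_utc(until - timedelta(days=7)),
    "until": to_iso_utc(until),
}
```

Rejecting naive datetimes avoids silently interpreting a local time as UTC.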
Average Multiplying Coefficient
averageMultiplyingCoefficient
string, optional
Controls how much a dataset may differ from previously seen datasets and still be considered valid, expressed as a multiple of the average difference. The default value is 5.
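The description suggests a simple threshold rule, which can be sketched as follows. How the checker actually measures the "difference" between datasets is not stated here, so this example takes the differences as already computed numbers; the function name `is_within_tolerance` and the exact comparison are assumptions.

```python
from statistics import mean
from typing import Sequence


def is_within_tolerance(new_difference: float,
                        past_differences: Sequence[float],
                        coefficient: float = 5.0) -> bool:
    """Assumed rule: valid if the new difference does not exceed
    coefficient times the average of the past differences."""
    if not past_differences:
        raise ValueError("at least one past difference is required")
    return new_difference <= coefficient * mean(past_differences)
```

With an average past difference of 1.0 and the default coefficient of 5, a new difference of 4.0 would pass and a new difference of 6.0 would be flagged. Raising the coefficient makes the check more tolerant.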
Actor Metrics
1 monthly user
3 stars
>99% runs succeeded
Created in Aug 2019
Modified 2 years ago