Nordstrom Product Scraper
Pricing: $30.00/month + usage
Nordstrom Product Spider scrapes detailed product information from Nordstrom.com, including name, description, price, colors, and sizes, and returns it as JSON. It is well suited to e-commerce analysis, competitor research, and cataloging. The actor accepts multiple URLs, uses proxies, and delivers reliable, structured data ready for integration.
Apify Template for Scrapy Spiders
This repository serves as a template for deploying Scrapy spiders to Apify. It is automatically updated by a GitHub Actions workflow in the central repository (`getdataforme/central_repo`) when changes are pushed to spider files in `src/spiders/` or `src/custom/`. Below is an overview of the automated tasks performed to keep this repository in sync.
Automated Tasks
The following tasks are executed by the GitHub Actions workflow when a spider file (e.g., `src/spiders/example/example_parser_spider.py`) is modified in the central repository:
- Repository Creation:
  - Creates a new Apify repository (e.g., `example_apify`) from this template (`apify_template`) using the GitHub API, if it doesn't already exist.
  - Grants push permissions to the `scraping` team in the `getdataforme` organization.
- Spider File Sync:
  - Copies the modified spider file (e.g., `example_parser_spider.py`) from the central repository to `src/spiders/` in this repository.
  - Copies the associated `requirements.txt` (if present) from the spider's directory (e.g., `src/spiders/example/`) to the root of this repository.
- Input Schema Generation:
  - Runs `generate_input_schema.py` to create `.actor/input_schema.json`.
  - Parses the spider's `__init__` method (e.g., `def __init__(self, location: str, item_limit: int = 100, county: str = "Japan", *args, **kwargs)`) to generate a JSON schema (see the sketch after this task list).
  - Supports types: `string`, `integer`, `boolean`, `number` (for Python `str`, `int`, `bool`, `float`).
  - Uses `prefill` for strings and `default` for non-strings, with appropriate `editor` values (`textfield`, `number`, `checkbox`).
  - Marks parameters without defaults (e.g., `location`) as `required`.
- Main Script Update:
  - Runs `update_main.py` to update `src/main.py`.
  - Updates the `actor_input` section to fetch input values matching the spider's `__init__` parameters (e.g., `location`, `item_limit`, `county`).
  - Updates the `process.crawl` call to pass these parameters to the spider (e.g., `process.crawl(Spider, location=location, item_limit=item_limit, county=county)`).
  - Preserves existing settings, comments, and proxy configurations.
- Actor Configuration Update:
  - Updates `.actor/actor.json` to set the `name` field based on the repository name, removing the `_apify` suffix (e.g., `example_apify` → `example`).
  - Uses `jq` to modify the JSON file while preserving other fields (e.g., `title`, `description`, `input`).
- Commit and Push:
  - Commits changes to `src/spiders/$spider_file`, `requirements.txt`, `.actor/input_schema.json`, `src/main.py`, and `.actor/actor.json`.
  - Pushes the changes to the `main` branch of this repository.
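The actual schema-generation logic lives in `generate_input_schema.py` in the central repository. Purely as an illustration of the mapping described in the Input Schema Generation step, a sketch that derives a comparable schema from a spider class with Python's `inspect` module might look like this (the function name and structure are illustrative, not the script's real implementation):

```python
import inspect


# Maps Python annotations to the schema types and editors described above.
TYPE_MAP = {
    str: ("string", "textfield"),
    int: ("integer", "number"),
    float: ("number", "number"),
    bool: ("boolean", "checkbox"),
}


def build_input_schema(spider_cls) -> dict:
    """Derive an Apify-style input schema from a spider's __init__ signature."""
    properties, required = {}, []
    for name, param in inspect.signature(spider_cls.__init__).parameters.items():
        if name == "self" or param.kind in (
            inspect.Parameter.VAR_POSITIONAL,
            inspect.Parameter.VAR_KEYWORD,
        ):
            continue  # skip self, *args, **kwargs
        json_type, editor = TYPE_MAP.get(param.annotation, ("string", "textfield"))
        prop = {"title": name, "type": json_type, "editor": editor}
        if param.default is inspect.Parameter.empty:
            required.append(name)            # no default -> required
        elif json_type == "string":
            prop["prefill"] = param.default  # strings use prefill
        else:
            prop["default"] = param.default  # non-strings use default
        properties[name] = prop
    return {
        "title": "Input schema",
        "type": "object",
        "schemaVersion": 1,
        "properties": properties,
        "required": required,
    }
```

With the example signature above, `location` ends up in `required`, `item_limit` gets `"default": 100`, and `county` gets `"prefill": "Japan"`.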
Repository Structure
- `src/spiders/`: Contains the Scrapy spider file (e.g., `example_parser_spider.py`).
- `src/main.py`: Main script to run the spider with Apify Actor integration (see the sketch after this list).
- `.actor/input_schema.json`: JSON schema defining the spider's input parameters.
- `.actor/actor.json`: Actor configuration with the repository name and metadata.
- `requirements.txt`: Python dependencies for the spider.
- `Dockerfile`: Docker configuration for running the Apify Actor.
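The workflow rewrites two regions of `src/main.py`: the block that reads `actor_input` and the `process.crawl(...)` call. The real file also carries settings, comments, and proxy configuration that the workflow preserves, so the minimal sketch below only shows the shape of those two regions for the example spider. The spider import path, class name, and the reactor/event-loop setup are assumptions loosely modeled on a common Apify-plus-Scrapy pattern, not the template's exact contents:

```python
import asyncio

import nest_asyncio
from apify import Actor
from scrapy.crawler import CrawlerProcess

# Hypothetical import path and class name for the synced spider.
from src.spiders.example_parser_spider import ExampleParserSpider


async def main() -> None:
    async with Actor:
        # actor_input section: fetch values matching the spider's __init__ parameters.
        actor_input = await Actor.get_input() or {}
        location = actor_input.get("location")
        item_limit = actor_input.get("item_limit", 100)
        county = actor_input.get("county", "Japan")

        # Assumption: run Scrapy on Twisted's asyncio reactor so it can share
        # the Actor's event loop (the template's own settings are preserved as-is).
        process = CrawlerProcess(
            settings={
                "TWISTED_REACTOR": "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
            },
            install_root_handler=False,
        )

        # process.crawl call: pass the input parameters straight to the spider.
        process.crawl(
            ExampleParserSpider,
            location=location,
            item_limit=item_limit,
            county=county,
        )
        process.start()


if __name__ == "__main__":
    nest_asyncio.apply()  # allow Scrapy's blocking start() inside the running loop
    asyncio.run(main())
```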
Prerequisites
- The central repository (`getdataforme/central_repo`) must contain:
  - `generate_input_schema.py` and `update_main.py` in the root.
  - Spider files in `src/spiders/` or `src/custom/` with a valid `__init__` method.
- The GitHub Actions workflow requires a `GITHUB_TOKEN` with repository creation and write permissions (a sketch of these API calls follows this list).
- `jq` and `python3` are installed in the workflow environment.
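The repository-creation step in the Automated Tasks section relies on this token. The workflow itself may use shell or the GitHub CLI; as a rough Python sketch of the same two API calls (the helper function, visibility flag, and error handling are illustrative assumptions):

```python
import os

import requests

API = "https://api.github.com"
ORG = "getdataforme"
HEADERS = {
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
    "Accept": "application/vnd.github+json",
}


def create_apify_repo(spider_name: str) -> None:
    """Create <spider_name>_apify from apify_template if it does not exist yet."""
    repo = f"{spider_name}_apify"

    # Skip creation if the repository already exists.
    if requests.get(f"{API}/repos/{ORG}/{repo}", headers=HEADERS).status_code == 200:
        return

    # Generate a new repository from the template (visibility is an assumption).
    requests.post(
        f"{API}/repos/{ORG}/apify_template/generate",
        headers=HEADERS,
        json={"owner": ORG, "name": repo, "private": True},
    ).raise_for_status()

    # Grant the scraping team push permission on the new repository.
    requests.put(
        f"{API}/orgs/{ORG}/teams/scraping/repos/{ORG}/{repo}",
        headers=HEADERS,
        json={"permission": "push"},
    ).raise_for_status()
```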
Testing
To verify the automation:
- Push a change to a spider file in `src/spiders/` or `src/custom/` in the central repository.
- Check the generated Apify repository (e.g., `getdataforme/example_apify`) for:
  - Updated `src/spiders/$spider_file`.
  - Correct `input_schema.json` with parameters matching the spider's `__init__`.
  - Updated `src/main.py` with correct `actor_input` and `process.crawl` lines.
  - Updated `.actor/actor.json` with the correct `name` field.
Notes
Warning: This Apify actor repository is automatically generated and updated by the GitHub Actions workflow in `getdataforme/central_repo`. Do not edit this repository directly. To modify the spider, update the corresponding file in `src/spiders/` or `src/custom/` in the central repository, and the workflow will sync changes to this repository, including:

- Copying the spider file to `src/spiders/`.
- Generating `.actor/input_schema.json` based on the spider's `__init__` parameters.
- Updating `src/main.py` with correct input handling and spider execution.
- Setting the `name` field in `.actor/actor.json` (e.g., `example` for `example_apify`).

Verification: After the workflow completes, verify the actor by checking:

- `src/spiders/$spider_file` matches the central repository.
- `.actor/input_schema.json` includes all `__init__` parameters with correct types and defaults.
- `src/main.py` has updated `actor_input` and `process.crawl` lines.
- `.actor/actor.json` has the correct `name`.
- Optionally, deploy the actor to Apify and test with sample inputs to ensure functionality.
- The workflow supports multiple spider types (`scrapy`, `hrequest`, `playwright`) based on the file path (`src/spiders/`, `src/custom/*/hrequest/`, `src/custom/*/playwright/`).
- Commits with `[apify]` in the message update only Apify repositories; `[internal]` updates only internal repositories; otherwise, both are updated.
- Ensure the spider's `__init__` uses only supported types (`str`, `int`, `bool`, `float`) to avoid schema generation errors (see the example spider below).
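For reference, a spider constructor that stays within the supported types, and therefore maps cleanly onto an input schema, could look like the hypothetical example below; `location` has no default, so it would be marked as `required`:

```python
import scrapy


class ExampleParserSpider(scrapy.Spider):
    """Hypothetical spider whose constructor maps cleanly onto an input schema."""

    name = "example_parser"

    def __init__(self, location: str, item_limit: int = 100,
                 county: str = "Japan", *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.location = location      # required string (no default)
        self.item_limit = item_limit  # integer, exposed with "default": 100
        self.county = county          # string, exposed with "prefill": "Japan"

    def start_requests(self):
        # Placeholder request; a real spider builds its URLs from the parameters above.
        yield scrapy.Request("https://example.com", callback=self.parse)

    def parse(self, response):
        yield {"location": self.location, "item_limit": self.item_limit}
```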
For issues, check the GitHub Actions logs in the central repository or contact the `scraping` team.