JobsDB-HK Scraper avatar
JobsDB-HK Scraper

Under maintenance

Pricing

$11.00/month + usage

Go to Store
JobsDB-HK Scraper

JobsDB-HK Scraper

Under maintenance

Developed by

Mandeep

Maintained by Community

🧰 What This Scraper Does This scraper extracts job listings from hk.jobsdb, pulling key information such as: - `company_name` - `job_title` - `salary` - `work_type` - `job_description` - `job_responsibilities`

5.0 (1)

Pricing

$11.00/month + usage

1

Monthly users

1

Last modified

6 days ago

JobsDB HK Scraper Actor

JobsDB HK Scraper Actor is a private Apify Actor that scrapes job listings from HK JobsDB. The Actor leverages a headless browser via Playwright and an integrated Deepseek API call to accurately extract and structure job details—all without generating any new data.

Overview

This Actor scrapes job listings based on user-specified keywords and returns structured data that includes the following six fields (in this order):

  1. Job Title
  2. Company Name
  3. Work Type
  4. Salary
  5. Job Description (presented as bullet points)
  6. Job Responsibilities (presented as bullet points)

Deepseek is used to process each job listing so that the output data precisely reflects what is present on the web page. Any missing field defaults to “Not mentioned.”

Key Features

  • Headless Browsing with Playwright: Navigates the JobsDB pages and collects job listings.
  • Deepseek API Integration: Processes raw job listing text to extract the six required fields.
  • Responsive Dataset Output: The resulting output is automatically transformed into a neatly organized, responsive table (configured via a dataset schema in the Actor settings) with the following order:
    1. Job Title
    2. Company Name
    3. Work Type
    4. Salary
    5. Job Description
    6. Job Responsibilities
  • Robust Processing: Includes input validation, error handling, and fallback behavior in case Deepseek fails.

How It Works

  1. Input:

    • The Actor accepts an input JSON containing a comma-separated list of keywords (e.g., "security, researcher") and a maximum number of job entries.
  2. Scraping Process:

    • A search URL is dynamically built based on the keywords.
    • Playwright navigates through search result pages and collects each job’s basic details.
    • Each job’s detailed page is loaded and its text content is extracted.
  3. Data Extraction with Deepseek:

    • The scraped text is sent to Deepseek with a prompt instructing it to extract only the following fields (in order):
      1. Job Title
      2. Company Name
      3. Work Type
      4. Salary
      5. Job Description (bullet points)
      6. Job Responsibilities (bullet points)
    • Deepseek returns a JSON response which is parsed and saved as a single record with the six fields combined into a user-friendly format (i.e. multiline bullet lists for descriptions/responsibilities).
  4. Output:

    • The final structured data is saved to a dataset, which is rendered in a responsive table as defined in our dataset schema.
    • The dataset columns appear in this order:
      1. Job Title
      2. Company Name
      3. Work Type
      4. Salary
      5. Job Description
      6. Job Responsibilities
  5. Deepseek API Check:

    • At startup, the Actor pings the Deepseek API (using a simple “Hello, who are you?” message) and logs the raw response. This confirms that the API key is valid and Deepseek is responsive before starting the scraping process.

Usage

  • Deploying:
    The Actor is deployed to the Apify Platform using the private repository configuration. Once deployed, simply provide the input via the Apify input schema and start the Actor.

  • Input Parameters:

    • keywords: A comma-separated list of search keywords.
    • maxEntries: The maximum number of job listings to process.
  • Output:
    The dataset will display the scraped job data in the internal Apify Output UI in the six-field format defined above.

Additional Notes

  • All fields are strictly extracted from the source data using Deepseek; no new content is generated.
  • The code is designed for internal use and tailored to meet our specific data extraction needs from HK JobsDB.

Happy scraping!

Pricing

Pricing model

Rental 

To use this Actor, you have to pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for the Apify platform usage.

Free trial

2 hours

Price

$11.00