Doctolib

Pricing

$9.00/month + usage

Developed by

Anchor

Maintained by Community

Scraping Doctolib is now super easy and cheap! Extract phones, names, contact details, opening times, images and addresses of medics, doctors, hospitals... Best part: you can even customize what info to extract from Doctolib!

1.0 (1)

5

Total users: 166

Monthly users: 11

Runs succeeded: 58%

Issues response: 22 hours

Last modified: 2 months ago

Output only URL?

Closed

alexalexalexalex opened this issue
2 months ago

I tried to scrape Doctolib based on the tutorial that you have provided, and unfortunately, no data is collected. If this is a UX issue, please update the tutorial. If this is a scraper issue, please fix that. Let me know, thanks.

Anchor (anchor)

2 months ago

Hello, thanks for trying the Doctolib scraper. I am sorry that it did not work out of the box for you... let me help you with that. Can you provide your logs and Actor run input so that I can reproduce the problem on my side? Thanks

alexalexalexalex

2 months ago

Sorry, I don't understand where to get the logs, so I am just copy-pasting what I have:

alexalexalexalex

2 months ago

{
  "hideSearchPages": true,
  "pageFunction": "async function pageFunction(context) {\n\n let data = {}\n let userData = context.request.userData\n data.url = context.request.url\n data.label = userData.label\n \n if(userData && userData.label === 'doctor'){ \n data.nom = await context.page.locator('#main-content h1').innerText({timeout:6000})\n data.tarif = await context.innerTextwrapper(context,'#payment_means')\n data.horaire_contact = await context.innerTextwrapper(context,'#openings_and_contact')\n data.description = await context.innerTextwrapper(context,'.dl-profile-bio')\n data.specialite = await context.innerTextwrapper(context,'.dl-profile-header-speciality')\n data.expertise = await context.innerTextwrapper(context,'#skills')\n try{\n data.phones = await context.getPhones(data.horaire_contact)\n }catch(e){\n context.log.info('Phones not found',e); \n }\n try{\n data.image = await context.page.locator('.dl-profile img').first().getAttribute('src',{timeout:2000})\n if(data.image.startsWith('/')){ data.image = 'https:' + data.image}\n }catch(e){\n context.log.info('Image not found',e); \n } \n \n }else{\n context.log.info('we are not on a doctor page: so a search or pagination page.');\n userData.label = 'doctor';\n const elements = context.page.locator('.search-result-card a[href]');\n const links = await elements.evaluateAll(elems => elems.map(elem => elem.getAttribute('href')));\n let extenstion = 'fr'\n if(context.request.url.includes('doctolib.de')){ extenstion = 'de' }\n if(context.request.url.includes('doctolib.it')){ extenstion = 'it' }\n links.forEach(async link => {\n if(link.startsWith('/')){ link = `https://www.doctolib.${extenstion}${link}` }\n await context.enqueueRequest(link, userData , true);\n })\n\n }\n context.log.info(`ending this page now`);\n delete data.label\n return data;\n}\n",
  "startUrls": [
    { "url": "https://www.doctolib.de/privatklinik/stuttgart/lipoklinik?pid=practice-624868&phs=true&page=1&insurance_sector=public&highlight%5Bspeciality_ids%5D%5B%5D=1297", "method": "GET" },
    { "url": "https://www.doctolib.de/hautarzt/stuttgart/sandra-teufel?pid=practice-228783&phs=true&page=1&index=1&insurance_sector=public", "method": "GET" },
    { "url": "https://www.doctolib.de/plastische-und-asthetische-chirurgie/stuttgart/jens-neumann?pid=practice-561047&phs=true&page=1&index=2&insurance_sector=public", "method": "GET" },
    { "url": "https://www.doctolib.de/plastische-und-asthetische-chirurgie/stuttgart/mirela-anghel-bota?pid=practice-578733&phs=true&page=1&index=3&insurance_sector=public", "method": "GET" }
  ]
}

alexalexalexalex

2 months ago

Ah found it!

alexalexalexalex

2 months ago

It seems that I cannot upload the log file here.

alexalexalexalex

2 months ago

2025-06-04T09:12:08.857Z ACTOR: Pulling Docker image of build aPSuTUe6Z2JFtF9vM from registry.
2025-06-04T09:12:22.108Z ACTOR: Creating Docker container.
2025-06-04T09:12:22.306Z ACTOR: Starting Docker container.
2025-06-04T09:12:22.511Z Starting X virtual framebuffer using: Xvfb :99 -ac -screen 0 1920x1080x24+32 -nolisten tcp
2025-06-04T09:12:22.512Z Executing main command
2025-06-04T09:12:23.655Z INFO System info {"apifyVersion":"2.3.2","apifyClientVersion":"2.9.3","osType":"Linux","nodeVersion":"v16.20.2"}
2025-06-04T09:12:24.117Z INFO FingerprintInjector: Successfully initialized.
2025-06-04T09:12:24.119Z INFO Starting the crawl.
2025-06-04T09:12:24.171Z INFO PlaywrightCrawler:AutoscaledPool: state {"currentConcurrency":0,"desiredConcurrency":2,"systemStatus":{"isSystemIdle":true,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":null},"eventLoopInfo":{"isOverloaded":false,"limitRatio":0.6,"actualRatio":null},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":null},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":null}}}
2025-06-04T09:12:31.601Z INFO Page opened, url:https://www.doctolib.de/privatklinik/stuttgart/lipoklinik?pid=practice-624868&phs=true&page=1&insurance_sector=public&highlight%5Bspeciality_ids%5D%5B%5D=1297 - label:undefined
2025-06-04T09:12:31.632Z INFO Page opened, url:https://www.doctolib.de/hautarzt/stuttgart/sandra-teufel?pid=practice-228783&phs=true&page=1&index=1&insurance_sector=public - label:undefined
2025-06-04T09:12:31.664Z INFO we are not on a doctor page: so a search or pagination page.
2025-06-04T09:12:31.684Z INFO we are not on a doctor page: so a search or pagination page.
2025-06-04T09:12:31.711Z INFO ending this page now
2025-06-04T09:12:31.781Z INFO ending this page now
2025-06-04T09:12:39.377Z INFO Page opened, url:https://www.doctolib.de/plastische-und-asthetische-chirurgie/stuttgart/mirela-anghel-bota?pid=practice-578733&phs=true&page=1&index=3&insurance_sector=public - label:undefined
2025-06-04T09:12:39.678Z INFO we are not on a doctor page: so a search or pagination page.
2025-06-04T09:12:40.085Z INFO Page opened, url:https://www.doctolib.de/plastische-und-asthetische-chirurgie/stuttgart/jens-neumann?pid=practice-561047&phs=true&page=1&index=2&insurance_sector=public - label:undefined
2025-06-04T09:12:40.166Z INFO ending this page now
2025-06-04T09:12:40.682Z INFO we are not on a doctor page: so a search or pagination page.
2025-06-04T09:12:40.783Z INFO ending this page now
2025-06-04T09:12:40.890Z INFO PlaywrightCrawler: All the requests from request list and/or request queue have been processed, the crawler will shut down.
2025-06-04T09:12:41.625Z INFO PlaywrightCrawler: Final request statistics: {"requestsFinished":4,"requestsFailed":0,"retryHistogram":[4],"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":8099,"requestsFinishedPerMinute":14,"requestsFailedPerMinute":0,"requestTotalDurationMillis":32397,"requestsTotal":4,"crawlerRuntimeMillis":17512}
2025-06-04T09:12:41.625Z INFO Crawl finished.

alexalexalexalex

2 months ago

What I do want is a list with all the doctors and all the website domains scraped from the links that I have provided above. Unfortunately, this only works in about 10-15% of the entries. The majority of them don't provide any website link of the individual doctors.

Anchor (anchor)

2 months ago

Let me try to help you. I opened your first link: https://www.doctolib.de/privatklinik/stuttgart/lipoklinik?pid=practice-624868&phs=true&page=1&insurance_sector=public&highlight%5Bspeciality_ids%5D%5B%5D=1297

This page does look like a valid doctor page. Do you want to extract the website "https://lipoklinik.com/"? Do I understand correctly?

Anchor (anchor)

2 months ago

I added a new column to the output in version 0.7 this morning; maybe it will fit your need. Also, in your input, you are supposed to provide search URLs, not doctor URLs. If you really want to provide doctor URLs, you will need to add "label":"doctor" to the userData of each input URL.
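
The log above shows every start URL arriving with label:undefined, so the pageFunction takes its search-page branch. As a rough sketch of the suggested fix, the snippet below adds a userData label to each start URL. It assumes the Actor accepts a userData object on each startUrls entry, as the reply describes; the exact input schema is not shown in this thread, and labelAsDoctor is a hypothetical helper, not part of the Actor.

```javascript
// Hypothetical helper (not part of the Actor): returns a copy of the
// start URLs with userData.label set to 'doctor', the value the
// pageFunction checks before extracting doctor fields.
function labelAsDoctor(startUrls) {
  return startUrls.map((entry) => ({
    ...entry,
    // Keep any userData already present and add the label.
    userData: { ...(entry.userData || {}), label: 'doctor' },
  }));
}

// One of the doctor URLs from the input posted in this thread.
const startUrls = [
  {
    url: 'https://www.doctolib.de/hautarzt/stuttgart/sandra-teufel?pid=practice-228783&phs=true&page=1&index=1&insurance_sector=public',
    method: 'GET',
  },
];

// Print the labelled entries as JSON, ready to paste into the Actor input.
console.log(JSON.stringify(labelAsDoctor(startUrls), null, 2));
```

Each printed entry carries "userData": { "label": "doctor" } next to its url and method. Whether the Actor then returns the website column depends on the 0.7 build mentioned above.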

Anchor (anchor)

2 months ago

Closing this issue, feel free to reopen if necessary

Anchor (anchor)

2 months ago

If you are happy with this Actor, may I ask you to rate it 5 ⭐? I would really appreciate it :)