Video Transcript Scraper: Youtube, X (twitter), Tiktok, etc.
Scrapes transcripts from any online video or audio content on any platform (YouTube, X, ...) in any available language. It delivers output in both JSON and LLM-ready formats, making it ideal for analytics and AI-based applications. Perfect for research and for building intelligent conversational agents.
I tested on 2 random videos and both failed:
This one, https://youtu.be/ULSqoPsXqhA?si=2w23aQaOelRswxVb, got the following error:
Traceback (most recent call last):
  File "d:\documents\dev\telegram-bots\youtube-analyzer\test\apify", line 15, in <module>
    print(response.json())
  File "C:\Users\ssipa\AppData\Local\Programs\Python\Python311\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f4fa' in position 186: character maps to <undefined>
This one, https://www.youtube.com/watch?v=PCt243ogcd8, got the following error:
Traceback (most recent call last):
  File "d:\documents\dev\telegram-bots\youtube-analyzer\test\apify", line 16, in <module>
    print(json.dumps(response.json(), indent=4, ensure_ascii=False))
    ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ssipa\AppData\Local\Programs\Python\Python311\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\uff1a' in position 530: character maps to <undefined>
Sorry for these errors. I will fix them within a couple of hours.
Dear Sam, I pushed a bug fix yesterday at 4 pm CET to address these issues. I tried running the Actor with the URLs you provided and I can't reproduce the errors you are seeing. Did you perhaps specify a language? If so, would you mind sending me the whole input, please?
Thanks for your reply. Here is the simple Python script I used to test, with the associated error output in the screenshot. I just tested it again today and I get the same error.
Thank you for sharing your code Sam!
I tested your code and it works just fine for me. So here is what I suspect: the UnicodeEncodeError happens because Python tries to print Unicode characters to a console whose encoding doesn't support them. Are you using the Windows console by any chance? Could you try it in a Linux or WSL console?
If that doesn't work, perhaps you can set the encoding of your stdout to UTF-8 by adding these two lines:
    import sys
    sys.stdout.reconfigure(encoding='utf-8')
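To illustrate the suggestion above without needing a Windows machine, here is a minimal sketch. It assumes a made-up payload in place of response.json() and uses an in-memory io.TextIOWrapper with cp1252 as a stand-in for the legacy console; sys.stdout is the same kind of object, so its reconfigure() method behaves the same way.

```python
import io
import json

# Hypothetical payload standing in for response.json(); the emoji is the
# character named in the first traceback (U+1F4FA).
payload = {"title": "Live \U0001f4fa stream", "lang": "en"}
text = json.dumps(payload, indent=4, ensure_ascii=False)

# Stand-in for a legacy Windows console: a text stream that encodes as cp1252.
raw = io.BytesIO()
console = io.TextIOWrapper(raw, encoding="cp1252")

# cp1252 has no mapping for the emoji, so the write fails like print() did.
try:
    console.write(text)
    failed = False
except UnicodeEncodeError:
    failed = True

# Switch the same stream to UTF-8, as sys.stdout.reconfigure(encoding='utf-8')
# would do for the real console stream, then write again.
console.reconfigure(encoding="utf-8")
console.write(text)
console.flush()

# The bytes that reached the underlying buffer are valid UTF-8.
written = raw.getvalue().decode("utf-8")
```

In a real script, only the two lines quoted in the reply are needed, placed before the first print() call.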
If that doesn't work either, you can always encode the output as UTF-8, like this:
    print(json.dumps(response.json(), indent=4, ensure_ascii=False).encode('utf-8'))
This prints the result as UTF-8 encoded bytes. To get the text back, decode it with .decode('utf-8').
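A small sketch of what the encoding workaround produces, again with a made-up payload in place of response.json(). The helper name dumps_utf8 is hypothetical. The last lines show a related option that is not from the thread: leaving ensure_ascii at its default of True, which makes json.dumps escape every non-ASCII character.

```python
import json


def dumps_utf8(data) -> bytes:
    """Serialize data to pretty-printed JSON and encode it as UTF-8 bytes."""
    return json.dumps(data, indent=4, ensure_ascii=False).encode("utf-8")


# Hypothetical payload; U+FF1A is the character named in the second traceback.
payload = {"title": "Episode 1\uff1a intro"}

encoded = dumps_utf8(payload)

# print(encoded) displays the repr of the bytes object, which contains only
# ASCII characters (non-ASCII bytes appear as \x.. escapes).
shown = repr(encoded)

# Decoding restores the original text, for example before saving it to a
# file opened with encoding="utf-8".
roundtrip = encoded.decode("utf-8")

# Alternative: with the default ensure_ascii=True, json.dumps emits \uXXXX
# escapes, so the resulting string is plain ASCII and prints on any console.
escaped = json.dumps(payload, indent=4)
```

The bytes output is safe to print but hard to read, so it is mainly useful for confirming that the data itself is intact.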
Finally, your result is probably already a Python dict object at the response.json() step. Can you check that my_dict = response.json() actually works and contains the data?
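One way to run that check without tripping over the console encoding is sketched below. The response body and its field names are invented for illustration and are not the Actor's documented output; json.loads stands in for response.json(), and the file name is arbitrary.

```python
import json
import os
import tempfile

# Hypothetical response body; the field names are made up for illustration.
body = '{"title": "Demo\\uff1a part 1", "transcript": "hello \\ud83d\\udcfa"}'

# Stand-in for my_dict = response.json(): parse the JSON text into a dict.
my_dict = json.loads(body)

# ascii() escapes every non-ASCII character, so this print cannot raise
# UnicodeEncodeError, whatever encoding the console uses.
print(type(my_dict).__name__, ascii(sorted(my_dict)))

# Writing to a file with an explicit encoding bypasses the console entirely.
path = os.path.join(tempfile.gettempdir(), "transcript_check.json")
with open(path, "w", encoding="utf-8") as fh:
    json.dump(my_dict, fh, indent=4, ensure_ascii=False)

# Read the file back to confirm nothing was lost.
with open(path, encoding="utf-8") as fh:
    reloaded = json.load(fh)
```

If the dict looks right here, the data is fine and only the print step was failing.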
Thank you for your patience. I hope this helps!
Thank you for your help and patience. You are right! I tested on macOS and it works fine there too. I think your hypothesis is correct: the issue is with the terminal I was using on Windows. Your response was definitely helpful. Sorry for the false alarm.
Actor Metrics
235 monthly users
54 stars
>99% runs succeeded
21 hours response time
Created in Oct 2024
Modified 4 days ago