Audio and Video Transcript (OpenAI Whisper)

Pricing

$4.99/month + usage

Developed by

Vít Tuhý

Maintained by Community

Monthly users: 7
Runs succeeded: >99%
Response time: 1.3 hours
Last modified: a month ago

Audio and Video Transcript

This Apify Actor transcribes audio or video files from publicly accessible URLs using OpenAI's Whisper API. To use it, you need to provide your own OpenAI API key. It supports multiple languages and exposes tunable parameters for control over the transcription process. For each provided URL, the Actor downloads the audio or video file, transcribes it via OpenAI, and stores the resulting transcript in the Apify Key-Value Store.
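The download, transcribe, and store flow described above can be sketched as a simple loop. This is only an illustration, not the Actor's actual code: the function `transcribe_all`, its three callables, and the `transcript-{index}` key naming are all hypothetical, and the demo uses in-memory stand-ins instead of real network calls.

```python
from typing import Callable, Dict, Iterable


def transcribe_all(
    urls: Iterable[str],
    download: Callable[[str], bytes],
    transcribe: Callable[[bytes], str],
    store: Callable[[str, str], None],
) -> Dict[str, str]:
    """Run download -> transcribe -> store for each URL, one at a time.

    Returns a mapping of URL -> storage key. Errors are not handled here.
    """
    keys: Dict[str, str] = {}
    for index, url in enumerate(urls):
        media = download(url)            # fetch the audio/video bytes
        transcript = transcribe(media)   # e.g. a call to a transcription API
        key = f"transcript-{index}"      # hypothetical key naming, not the Actor's
        store(key, transcript)           # e.g. write to a key-value store
        keys[url] = key
    return keys


# Demo with in-memory stand-ins; no network access is needed.
saved: Dict[str, str] = {}
result = transcribe_all(
    ["https://example.com/sample-audio.mp3"],
    download=lambda url: b"fake-bytes",
    transcribe=lambda media: f"{len(media)} bytes transcribed",
    store=saved.__setitem__,
)
```

Passing the three steps in as callables keeps the sketch testable without an OpenAI key or an Apify account.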


🚀 Features

  • Automatic language detection or manual language specification from an extensive list.
  • Capability to process multiple audio or video URLs simultaneously.
  • Versatile output formats including plain text, JSON, SRT, VTT, and verbose JSON.
  • Optional inclusion of timestamps for individual words (when using verbose JSON format).
  • Fine-tuning through parameters such as temperature, compression ratio thresholds, and speech detection thresholds.
  • Secure handling of your OpenAI API key, hidden from logs for added safety.
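Because word timestamps only apply to the verbose JSON format, it can help to check the options before starting a run. The sketch below is a client-side pre-check under that assumption; `check_options` is a hypothetical helper, and the Actor may validate its input differently.

```python
ALLOWED_FORMATS = {"text", "srt", "vtt", "json", "verbose_json"}


def check_options(response_format: str, word_timestamps: bool) -> None:
    """Raise ValueError for option combinations the feature list rules out."""
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if word_timestamps and response_format != "verbose_json":
        raise ValueError("word_timestamps requires response_format='verbose_json'")


# Valid combinations pass silently.
check_options("srt", word_timestamps=False)
check_options("verbose_json", word_timestamps=True)
```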

🔧 Input Configuration

Configure your actor with the following parameters:

| Parameter | Description |
| --- | --- |
| url | Array of publicly accessible audio/video file URLs |
| language | Language selection or set to Auto-detect |
| temperature | Floating-point temperature to control variability in transcription |
| response_format | Desired transcript format (text, srt, vtt, json, verbose_json) |
| word_timestamps | Include timestamps per word (only valid when using verbose_json format) |
| prompt | Additional textual context to enhance transcription accuracy |
| temperature_increment_on_fallback | Increment in temperature if the initial transcription attempt fails |
| compression_ratio_threshold | Maximum allowable compression ratio for transcript acceptance |
| logprob_threshold | Minimum log probability required for transcript segments |
| no_speech_threshold | Probability threshold to detect segments with no speech |
| openai_api_key | Your personal OpenAI API key (kept secure and hidden) |
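The three threshold parameters and the fallback increment are easier to reason about with a concrete decision rule. The sketch below mirrors the logic used by the open-source Whisper reference implementation, where a segment is retried at a higher temperature if it looks too repetitive or too uncertain, unless it is probably silence. Whether this Actor applies the parameters in exactly this way is an assumption. The function names are hypothetical, the defaults (2.4, -1.0, 0.6) are those of open-source Whisper rather than this Actor, and treating a non-positive increment as "no fallback" is also an assumption.

```python
import zlib
from typing import List, Optional


def compression_ratio(text: str) -> float:
    """Raw size divided by zlib-compressed size; repetitive text scores high."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw))


def needs_fallback(
    text: str,
    avg_logprob: float,
    no_speech_prob: float,
    compression_ratio_threshold: Optional[float] = 2.4,
    logprob_threshold: Optional[float] = -1.0,
    no_speech_threshold: Optional[float] = 0.6,
) -> bool:
    """Decide whether to retry decoding at a higher temperature."""
    retry = False
    if (compression_ratio_threshold is not None
            and compression_ratio(text) > compression_ratio_threshold):
        retry = True   # output is too repetitive
    if logprob_threshold is not None and avg_logprob < logprob_threshold:
        retry = True   # model is not confident enough
    if no_speech_threshold is not None and no_speech_prob > no_speech_threshold:
        retry = False  # probably silence; retrying will not help
    return retry


def temperature_schedule(start: float, increment: float) -> List[float]:
    """Temperatures to try in order, stepping by `increment` up to 1.0."""
    if increment <= 0:  # assumption: a non-positive increment disables fallback
        return [start]
    temps = []
    t = start
    while t <= 1.0 + 1e-6:
        temps.append(round(t, 6))
        t += increment
    return temps
```

For example, a transcript that repeats the same syllable hundreds of times compresses very well, so its ratio exceeds the threshold and a retry is triggered.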

📥 Example Input

{
  "url": [
    { "url": "https://example.com/sample-audio.mp3" }
  ],
  "language": "Auto-detect",
  "temperature": "0.0",
  "response_format": "text",
  "word_timestamps": false,
  "prompt": "",
  "temperature_increment_on_fallback": 0,
  "compression_ratio_threshold": 2,
  "logprob_threshold": -1,
  "no_speech_threshold": 1,
  "openai_api_key": "YOUR_OPENAI_API_KEY"
}
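The same input can be built programmatically and sent to the Apify API to start a run. The sketch below uses only the Python standard library. `build_input` and `build_run_request` are hypothetical helpers, `YOUR_ACTOR_ID` and `YOUR_APIFY_TOKEN` are placeholders you would need to replace, and the default values simply copy the example above. The request is prepared but not sent.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_input(urls: List[str], openai_api_key: str, **overrides: Any) -> Dict[str, Any]:
    """Build an input object shaped like the example, with optional overrides."""
    actor_input: Dict[str, Any] = {
        "url": [{"url": u} for u in urls],
        "language": "Auto-detect",
        "temperature": "0.0",
        "response_format": "text",
        "word_timestamps": False,
        "prompt": "",
        "temperature_increment_on_fallback": 0,
        "compression_ratio_threshold": 2,
        "logprob_threshold": -1,
        "no_speech_threshold": 1,
        "openai_api_key": openai_api_key,
    }
    actor_input.update(overrides)
    return actor_input


def build_run_request(
    actor_id: str, apify_token: str, actor_input: Dict[str, Any]
) -> urllib.request.Request:
    """Prepare (but do not send) a POST request that starts an Actor run."""
    return urllib.request.Request(
        url=f"https://api.apify.com/v2/acts/{actor_id}/runs",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {apify_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = build_input(
    ["https://example.com/sample-audio.mp3"],
    openai_api_key="YOUR_OPENAI_API_KEY",
    response_format="srt",
)
request = build_run_request("YOUR_ACTOR_ID", "YOUR_APIFY_TOKEN", payload)
```

Calling `urllib.request.urlopen(request)` would send the request; the official Apify client libraries offer the same operation with less boilerplate.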

📤 Output

Transcription results are stored in the Apify Key-Value Store. Each transcript is saved as a separate record under its own identifiable key for convenient access.
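Once a run has finished, a transcript can be read back from the Key-Value Store through the Apify API. The sketch below is a minimal example with the standard library. The store ID and record key are assumptions you would look up for your own run (the exact key names used by this Actor are not documented here), and `record_url` and `fetch_transcript` are hypothetical helper names.

```python
import urllib.parse
import urllib.request


def record_url(store_id: str, key: str) -> str:
    """URL of a single record in an Apify key-value store."""
    return (
        "https://api.apify.com/v2/key-value-stores/"
        f"{urllib.parse.quote(store_id, safe='')}"
        f"/records/{urllib.parse.quote(key, safe='')}"
    )


def fetch_transcript(store_id: str, key: str, apify_token: str) -> str:
    """Download one record and decode it as UTF-8 text (makes a network call)."""
    request = urllib.request.Request(
        record_url(store_id, key),
        headers={"Authorization": f"Bearer {apify_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Keys are percent-encoded so that names containing spaces or slashes still produce a valid URL.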

Pricing

Pricing model: Rental

To use this Actor, you pay a monthly rental fee to the developer. The rent is deducted from your prepaid usage every month after the free trial period. You also pay for Apify platform usage.

Free trial: 30 minutes

Price: $4.99