Transcribing Videos

If you do not have a script for the video, you can create one in several ways:

  • With a digital video, you can use a foot pedal control such as vPedal to transcribe the video. The pedal costs less than $100.
  • Train speech recognition software such as ViaVoice (Win) or MacSpeech (Mac) for a few hours. Then listen to the videotape and repeat the dialogue into the microphone. At first you will get a lot of errors, but you can teach the software to correct them, and over time you can expect about 95% accuracy.
  • Use a professional transcription service. This is the most expensive way to generate the script.

There is a simple way to transcribe spoken dialogue directly into your computer. All you need to do is spend an hour or two training speech recognition software to recognize your voice. You can then transcribe a video with Naturally Speaking (English and Spanish) or ViaVoice (English).

With Naturally Speaking or ViaVoice, you listen to the video and repeat the words being said into the microphone. If you want to use this transcript for post captioning, you can import the document into the CPC Captioning software, and the text will automatically be broken up into formatted captions.
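To give a rough idea of what "broken up into formatted captions" can mean, here is a minimal sketch in Python. It is not the CPC software's actual method, which is not described here. The function name split_into_captions, the 32-character line width, and the two-lines-per-caption limit are all assumptions chosen for illustration.

```python
import textwrap


def split_into_captions(transcript, width=32, lines_per_caption=2):
    """Wrap a transcript into short lines and group them into captions.

    width and lines_per_caption are illustrative defaults, not values
    taken from any particular captioning product.
    """
    # Collapse runs of whitespace so line breaks in the transcript are ignored.
    normalized = " ".join(transcript.split())
    # Break the text at word boundaries so no line exceeds the width.
    lines = textwrap.wrap(normalized, width=width)
    # Group consecutive lines into captions of at most lines_per_caption lines.
    return [
        "\n".join(lines[i:i + lines_per_caption])
        for i in range(0, len(lines), lines_per_caption)
    ]
```

Calling split_into_captions on a dictated paragraph returns a list of strings, one per caption, each holding up to two lines. A real captioning tool would also assign timing to each caption, which this sketch leaves out.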

Languages Supported
Naturally Speaking is available in multiple languages, such as English, Spanish, French, and German. ViaVoice is available only in English.

Accuracy
After training the speech recognition software for an hour or two, you can get an accuracy of about 85%. If you fix the errors and let the software know about them, it will learn not to repeat those mistakes. Eventually you can get an accuracy of 95% or so.
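If you want to check the accuracy of your own dictation, one common approach is to compare the recognized text against a hand-corrected version, word by word. The sketch below is an illustration only: the function names are made up for this example, and it assumes accuracy is measured as one minus the word-level edit distance divided by the number of words in the corrected text. The speech recognition products may calculate their accuracy figures differently.

```python
def word_errors(reference, hypothesis):
    """Return (edit distance in words, number of reference words)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1], len(ref)


def word_accuracy(reference, hypothesis):
    """Fraction of reference words recognized correctly, between 0 and 1."""
    errors, total = word_errors(reference, hypothesis)
    if total == 0:
        return 0.0
    # Clamp at zero in case the hypothesis has many extra words.
    return max(0.0, 1.0 - errors / total)
```

For example, a five-word sentence with one misrecognized word scores 0.8, or 80%. Running this on a few paragraphs before and after correcting the software gives you a simple way to see whether training is paying off.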

Naturally Speaking analyzes a complete sentence or phrase, waiting until you pause, and gives better accuracy than ViaVoice. Because Naturally Speaking analyzes the text before showing it on the screen, the captions will be delayed by a few seconds if you use it for live captioning. For transcription, the delay does not matter, so Naturally Speaking is the better choice from an accuracy point of view.