QuickPress: “Success” of Automatic Captioning on YouTube

It is, of course, new technology, so we’ll see what happens in the future, but check this picture from Codeman38:

Description: A screen capture of a YouTube vid showing a person rock climbing. The auto caption reads: “for doing what I mean as well as democrats”. Codeman38 has captioned it: Google auto-caption fail. What was actually said: “we’re doing rock climbing as well as table tennis.”

12 thoughts on “QuickPress: “Success” of Automatic Captioning on YouTube”

  1. Ahahaha. I was wondering how the auto-captioning was going to work out. Seems it still needs tweaking.

    My personal preference is for transcripts… not only is my hearing slightly wonky (and getting wonkier) after my military experience, but I live in the boonies and have a slow and relatively unreliable satellite internet connection. Sites that do transcripts are an absolute *godsend*.

  2. Yes, I prefer transcripts too. However, that mangled auto-caption is remarkably like how I hear when I’m tired or it’s noisy: words that sound like but could not actually *be* what was said!

  3. @Tlönista: Heh, that’s how my auditory processing often is as well. There’ve been times when I rolled a snippet of something I heard around in my head to try to figure out which set of similar-sounding words actually made the most sense in context. And don’t even get me started on trying to decipher song lyrics…

  4. I really doubt that will ever work because of the huge number of accents, dialects, and languages in the world. Speech recognition is just not that advanced yet.

  5. @Amanda: Exactly! Which is why I said what I did in the Fantasy Assistive Devices thread– a universal speech-to-text device is still decades off at best, as much as I would absolutely love to have one.

  6. They do try, yes. I think they’ll keep working on the kinks, but I don’t think a proper speech-to-text is going to come out that doesn’t involve actual audio typists.

    Note to Google: I am an audio typist.

  7. Oh, speaking of which, this YouTube video (captions, but no audio or transcript) of closed captioning recorded off of a Cleveland news broadcast is absolutely hilarious. “My cats got weeded down again,” indeed.


    I get the feeling that this was probably done via voice recognition rather than the usual TV captioning method of using an audio typist…

  8. @Personal Failure: yes! I don’t quite understand how it happens. Because even when I know the word “tall” (for example) is a pretty common word and will come up more in conversation, my brain insists on hearing “tog” or “Gaul” or “tarn”. Extremely vexing for everyone involved!

    It’s possibly an auditory processing thing; I think some other commenters on here have similar issues.

  9. I should remember to do that profile on groups lobbying to get closed captioning done properly and have a regulatory body and stuff.

  10. I have significant auditory processing issues too. (Interesting – the Wikipedia page on auditory processing disorder mentions ear infections. I had SO MANY EAR INFECTIONS when I was little.)

    I desperately want speech-to-text for my cell phone. I was trying to talk to a friend in crisis earlier and could only catch about half of what she was saying. ARGH!

    And it’s been getting worse. Very often I just hear speech as a jumbled collection of sounds that I can’t make any sense of. No wonder I prefer the internet!
