Cracking the Code on Transcribing Song
At Scene Savers, we embrace AI the same way we approach every preservation challenge: with rigor, care, and a constant focus on quality, data integrity, and long-term usability.
Recently, a large-scale client came to us with a familiar but persistent problem. Their project required transcription and closed captioning for broadcast materials, many of which feature musical segments. For years, they had been unable to transcribe the singing in either their archival or their current broadcasts.
Singing presents a fundamentally different challenge from speech: pitch, rhythm, melody, and performance style can all confound traditional transcription tools. For collections with significant musical programming, that often leaves valuable content uncaptioned, unsearchable, and effectively hidden.
As AI evolves, we are continually evaluating new models and testing real-world archival scenarios to find the right solutions for our clients. After a focused R&D period, we developed a custom workflow using state-of-the-art AI that can successfully capture both spoken word and sung content. The result has been transformative. Content that had never before existed in text form is now accessible, discoverable, and usable at scale.
Now hundreds of thousands of hours into the project, the workflow has proven both accurate and reliable. For us, this is AI at its best: not replacing preservation standards but strengthening them, unlocking more value from digitized archives while keeping care for the original content at the center of the work.
