Use VTT transcript to display captions in media player #1469
See also #1143.
@andrewjbtw created a test object in Stage for testing. It didn't seem easy to get video streaming to work in my dev environment, maybe because of CORS?
Two notes about the test object:
87e087e to 8d468f3
I pushed this to stage where there is a test video object, and it seems to work!
end

def vtt?
  mimetype == 'text/plain' && title.end_with?('.vtt')
Shouldn't this be `text/vtt`? https://developer.mozilla.org/en-US/docs/Web/API/WebVTT_API#webvtt_files
I think that's right @jcoyne. I think it might be coded in the example XML that way? Perhaps it should allow for both?
I think the example XML is just bad data. We should not build in support for bad data.
It looks like there are 45 `text/vtt` files at the moment. I guess I'd need to run a DSA report to see if there are many/any `.vtt` files that need the correct media type?
But yes, agreed, we should set things up to encourage the correct description of VTT files. Thanks for spotting this.
Now I'm wondering if it should ignore the extension altogether and only key off the media type?
That's an important point @andrewjbtw -- it would suck to have to go in and manually fix anything with a transcript because our system codes things incorrectly. Does techmd do the mediatype detection for us at the moment? Do we need a ticket in there, and perhaps a data remediation ticket as well?
Improvements to mime-type detection should go in https://github.com/sul-dlss/assembly-objectfile/blob/229520e3c590dc3e8c83157c5d62c00cd3e53eb7/lib/assembly/object_file.rb#L103.
It does have a problem with this type, as it prefers the value returned from `file` before the value from the extension:
file = Assembly::ObjectFile.new('foo.vtt')
file.send :exif_mimetype
=> nil
file.send :file_mimetype
=> "text/plain"
file.send :extension_mimetype
=> "text/vtt"
There are a couple of different places where mime-types are determined. I think `sdr-api` does it as well. Maybe counterintuitively, the mime-type in the structural metadata is generated and stored separately from the mime-type in the technical metadata, which has been the case going back to the Fedora era. The access systems don't use the technical metadata.
FWIW, I just deposited a VTT file using the SDR API in stage, and it appears both sdr-api and the techmd service applied the correct MIME type: https://argo-stage.stanford.edu/view/druid:pc587kh4617
`assembly-objectfile` may be the one spot needing attention here, as @jcoyne shared above, and it has a related issue: sul-dlss/assembly-objectfile#19
I was wrong about the above: @edsu already patched `assembly-objectfile` such that it correctly picks the VTT mime type, and that's in the latest version of the gem.
If a video or audio file has an associated VTT file, add it as a `<track>` element. This should allow the player to display the transcript while the video or audio plays.
The logic for determining if a file is a VTT file is now based on the media type, not the filename or its extension.
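A minimal sketch of the two points above, assuming a hypothetical `MediaFile` struct with `title`, `mimetype`, and `url` fields (these names are illustrative and are not taken from the sul-embed code). VTT files are identified by media type alone, and each one becomes a `<track>` element. The `label` and `srclang` defaults are also assumptions, since how language would be determined is still open.

```ruby
require 'cgi'

# Hypothetical stand-in for a file listed in an object's structural metadata.
MediaFile = Struct.new(:title, :mimetype, :url) do
  # Key only off the media type; the filename and its extension are ignored.
  def vtt?
    mimetype == 'text/vtt'
  end
end

# Build one <track> element per VTT file.
# label and srclang are assumed defaults, not values from the PR.
def caption_tracks(files, label: 'English', srclang: 'en')
  files.select(&:vtt?).map do |file|
    format('<track kind="captions" src="%<src>s" label="%<label>s" srclang="%<lang>s">',
           src: CGI.escapeHTML(file.url),
           label: CGI.escapeHTML(label),
           lang: CGI.escapeHTML(srclang))
  end
end

files = [
  MediaFile.new('video.mp4', 'video/mp4', 'https://example.org/video.mp4'),
  MediaFile.new('video.vtt', 'text/vtt', 'https://example.org/video.vtt'),
  MediaFile.new('notes.vtt', 'text/plain', 'https://example.org/notes.vtt')
]

puts caption_tracks(files)
```

Note that `notes.vtt` produces no track here: it is coded as `text/plain`, so it would need remediation before captions appear.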
@jcoyne Does merging this PR mean this will go out to prod and items with VTT will start showing captions next week? I ask because I'm not sure we're ready for that in terms of the styling/UI.
Take a similar approach to identifying `application/json` files in #50 to allow `.vtt` files to always return the `text/vtt` media type. Fixes #119 Refs sul-dlss/sul-embed#1469
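As a rough illustration of the idea in that commit message, here is a sketch in which certain extensions always win over content sniffing. The `EXTENSION_OVERRIDES` table and `preferred_mimetype` method are hypothetical names and do not reflect the actual `assembly-objectfile` implementation; the sniffed type is passed in as an argument so the example stays self-contained.

```ruby
# Hypothetical override table: extensions whose media type should come from
# the extension rather than from content-based detection.
EXTENSION_OVERRIDES = {
  '.json' => 'application/json',
  '.vtt' => 'text/vtt'
}.freeze

# sniffed_type stands in for whatever a content-based detector (such as the
# `file` command) reported for this file.
def preferred_mimetype(filename, sniffed_type)
  EXTENSION_OVERRIDES.fetch(File.extname(filename).downcase, sniffed_type)
end

puts preferred_mimetype('foo.vtt', 'text/plain') # text/vtt
puts preferred_mimetype('foo.txt', 'text/plain') # text/plain
```

Files with any other extension fall through to the sniffed value, so the override only affects the types listed in the table.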
@andrewjbtw my apologies, this is my fault for taking the PR out of draft, which signaled to @jcoyne that it was in fact ready. One thing that came up when we were discussing this in the #dlss-av-captions meeting on April 21st was the need for users to be able to turn off the transcript from playing. It does appear that VideoJS should support adding this if the existing control isn't obvious enough: https://videojs.com/guides/text-tracks/#working-with-text-tracks There was also some concern about the length of each line of text. I believe this is a function of the lines in the VTT file itself, and isn't behavior we can easily change programmatically. Fortunately there are very few (44) object files coded with a media type of
Looking at some of those items, I think this will cause VTT captions to appear over burned-in captions. For example, the videos with VTT in https://argo.stanford.edu/view/druid:bb761mb4522 are the two videos with burned-in English and German captions. I think with VTT captioning turned on, this will overlay the VTT captions onto the video. It's a small number for sure, but most of them (36) appear to be items where care was taken to generate burned-in captions specifically for a high-profile project. The other 9 appear to be Zoom captions. There's a larger number where the VTT was identified as "text/plain", but those won't show captions until remediated.
@andrewjbtw @jcoyne would it be helpful for me to add a feature flag to turn off VTT transcripts in settings for now?
@edsu If Andrew doesn't want this out yet, then yes, that would be a good idea. @andrewjbtw the user should be able to control the display of VTT captions, so they wouldn't have a problem unless they turned the CC on.
Let me see if I can get some quick feedback. I do appreciate being able to have captions, just wasn't expecting this this week. The part of the UI that doesn't look great to me is the settings menu, not really the captions themselves.
@edsu After making another sample for stage (of one of the VT objects) and getting feedback from PSM, we should not turn this on yet. I'm happy to file a separate issue for keeping it off.
Expected behavior
A user should be able to choose if they want captions to display by clicking a button or otherwise selecting the functionality.
Other considerations
What if there is no caption file? Should an option to play CC still display? Is it possible for a viewer to only show an option to play CC if there is a web.vtt file present? Or is CC hard-coded into the viewer and we need to notify a user that if there is no web.vtt file, captions are not yet available for playback? If this is the case, can we build in better functionality for requesting that the media be captioned? What about multi-lingual caption support?
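One way to picture the "only show CC if a caption file is present" question, combined with the feature flag discussed earlier in the thread, is the sketch below. The method name, the `enabled:` keyword, and the plain list of media types are all assumptions made for illustration; none of them come from the sul-embed code or its settings.

```ruby
# Hypothetical check: show the caption control only when the feature is
# switched on and at least one of the object's files is a VTT file.
# mimetypes is an array of media type strings for the object's files.
def show_caption_control?(mimetypes, enabled: true)
  enabled && mimetypes.include?('text/vtt')
end

puts show_caption_control?(['video/mp4', 'text/vtt'])                 # true
puts show_caption_control?(['video/mp4'])                             # false
puts show_caption_control?(['video/mp4', 'text/vtt'], enabled: false) # false
```

With a check like this, an object without a VTT file would never present a CC option, which sidesteps the need to tell the user that captions are unavailable.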
If an object's video or audio file has an associated VTT file, add a `<track>` element which should allow the player to display the transcript during playing of the video or audio.
You can see some SDR objects already have VTT files added. Their PURL data looks something like:
Ideally the VTT files would have a `text/vtt` mimetype? Also, it might be possible for there to be multiple language transcriptions for a media file, which is not currently handled in this PR. Handling multiple languages should be doable if there were a convention or mechanism for determining the language.
Also, I think testing/research with `<audio>` should be done. I did a bit of quick looking around and it seems that maybe captions don't get displayed unless you say it's video instead? That might be stale information though.
This was quick exploratory work in response to questions from @pleonard212 and @dinahhandel.