Re-run technical metadata for all media items #1441

peetucket · 2024-12-16T18:22:39Z

After sul-dlss/technical-metadata-service#572 is merged, we should re-run all media items in batches through technical metadata generation so that volume levels are available and we can correctly send items to speechToTextWF. Otherwise, the work in #1439 will prevent these items from working correctly.

~~Blocked by sul-dlss/technical-metadata-service#572~~

peetucket · 2025-01-07T22:09:43Z

From the technical-metadata-service README, it sounds like we need a list of all media druids in a file called druids.txt on the server, in the root of the Rails app, and then run something like this:

RAILS_ENV=production bundler exec rake techmd:generate_for_moab_list['true']

1. Produce the list of druids by filtering Argo for media items and then running a report: https://argo.stanford.edu/report?f%5Bcontent_type_ssim%5D%5B%5D=media&f%5BobjectType_ssim%5D%5B%5D=item
2. Download the report as CSV. Open it in Excel, remove all columns except the druid column, remove the header row, and save it as a txt file.
3. Suggest working in 10k batches by splitting the list into multiple txt files.
4. Place the txt files on the server. The one currently being run should be named druids.txt and sit in the root of the techmd app: `scp ~/Downloads/techmd/druids_1.txt dor-techmd-prod-a.stanford.edu:/opt/app/techmd/dor_techmd/current/druids.txt`
5. Start the rake task above, which will queue all the jobs. Suggest running it in a screen session to avoid getting disconnected mid-queuing: `screen -S queue`
6. Monitor the queues at https://dor-techmd-prod-a.stanford.edu/queues/busy
7. Monitor Honeybadger (HB) at https://app.honeybadger.io/projects/68956/faults?q=-is%3Aresolved+-is%3Aignored&sort=last_seen_desc
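
The Excel steps for trimming and splitting the report could also be scripted. This is a minimal sketch, not something from the README: it assumes the downloaded CSV has a header row with a column named `Druid` (matched case-insensitively; adjust `column` if the report uses a different header), and the function name `split_druids` is made up for illustration. It writes the druid values as they appear in the report, so check whether the rake task expects the `druid:` prefix before using the output.

```python
import csv
from pathlib import Path


def split_druids(report_csv, out_dir, column="druid", batch_size=10_000):
    """Write the druid column of a report CSV into numbered batch files.

    Returns the list of files written (druids_1.txt, druids_2.txt, ...).
    The column name is an assumption about the report's header row.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # utf-8-sig tolerates a byte-order mark at the start of the file.
    with open(report_csv, newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle)
        headers = {name.strip().lower(): name for name in (reader.fieldnames or [])}
        if column.lower() not in headers:
            raise ValueError(f"no {column!r} column in {report_csv}")
        key = headers[column.lower()]
        # DictReader consumes the header row; skip blank or missing values.
        druids = [row[key].strip() for row in reader if row[key] and row[key].strip()]

    paths = []
    for start in range(0, len(druids), batch_size):
        batch = druids[start:start + batch_size]
        path = out_dir / f"druids_{start // batch_size + 1}.txt"
        path.write_text("\n".join(batch) + "\n", encoding="utf-8")
        paths.append(path)
    return paths
```

Calling `split_druids("report.csv", "techmd")` (both paths hypothetical) would produce `techmd/druids_1.txt`, `techmd/druids_2.txt`, and so on, each of which can then be copied to the server as `druids.txt` one batch at a time.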

peetucket mentioned this issue Dec 16, 2024

[HOLD] detect audio file tracks #1439

Open

jmartin-sul added the prod-blocker label Dec 20, 2024

peetucket self-assigned this Jan 8, 2025

peetucket mentioned this issue Jan 15, 2025

more updates to README on how to re-generate technical metadata sul-dlss/technical-metadata-service#581

Merged
