Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds tracing id to troubleshoot indexing. #5249

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

justinlittman
Copy link
Contributor

refs #5231

Why was this change made? 🤔

To allow better debugging of indexing.

How was this change tested? 🤨

Stage, unit

⚡ ⚠ If this change has cross service impact, including data writes to shared file systems, run integration tests and/or test in [stage|qa] environment, in addition to specs. ⚡

@mjgiarlo mjgiarlo self-assigned this Jan 15, 2025
Comment on lines 39 to 44
def self.trace_id_for(druid:, source:)
SecureRandom.uuid.tap do |trace_id|
Rails.logger.info("Reindexing #{druid} from #{source} with trace_id=#{trace_id}")
end
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the long-story-short explanation of all these changes that we want a way to relate log entries and honeybadger exceptions to solr documents? Those are the three spots where these IDs wind up, if I'm reading this right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct. We have Solr docs that are out of date; we don't know how they got written. Is there a race condition? Is stale data being indexed? Are Solr commits being dropped? Something else?

Comment on lines 6 to 12
def self.reindex(cocina_object:, trace_id:)
return unless Settings.solr.enabled

solr_doc = Indexing::Builders::DocumentBuilder.for(
model: cocina_object
model: cocina_object,
trace_id:
).to_solr
Copy link
Member

@mjgiarlo mjgiarlo Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we need to add this arg to Indexer.reindex and Indexer.reindex_later since both of these methods already have:

  1. A Cocina object, which should always have the druid in question (?), and
  2. Kernel#caller_locations, which should get us the path back to the caller

If we could make ☝🏻 work, that would keep trace_id from leaking all over the place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is likely that we're going to want to push trace_id out to other systems and include it in the messages / calls to DSA that request indexing. For this, we'll want trace_id leaking all over the place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, then would ☝🏻 work for DSA-internal use? If so, then we could cut down the changes in this PR and incorporate the remote trace stuff when we work on that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mjgiarlo I've reduced the footprint using your suggested technique. I still left the method signature hooks in place so that an existing trace_id can be passed in.

@justinlittman justinlittman force-pushed the index_tracing branch 3 times, most recently from 9e1f4a3 to 76d4030 Compare January 15, 2025 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants