[DocDB] Add additional logging about SST files #20880

Open
1 task done
mdbridge opened this issue Jan 31, 2024 · 12 comments
Labels
area/docdb YugabyteDB core features good first issue This is a good issue to start contributing! kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue

Comments

@mdbridge
Contributor

mdbridge commented Jan 31, 2024

Jira Link: DB-9866

Description

During a recent investigation, we wished we had more information about SST files.

This task is to add some relevant additional logging information.

First, we would like to print out the RocksDB block cache prefix in a log message when an SST file is created. Our most immediate need is to look for collisions between different SST files, so it's probably fine to print this as a series of bytes in hex.

Second, UniversalCompactionPicker::SortedRun::DumpSizeInfo already prints some information about selected SST files. Please add the metadata size and data size to that logging, rather than just the total file size.
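
For the first request, printing a prefix "as a series of bytes in hex" could look roughly like the sketch below. `BytesToHex` is a hypothetical helper written for illustration, not an existing YugabyteDB or RocksDB function, and how the block cache prefix bytes are obtained from the table builder is left out because the issue does not specify it.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a YugabyteDB or RocksDB API): renders a raw byte
// buffer, such as a block cache key prefix, as lowercase hex, two digits
// per byte, so that prefixes from different SST files can be compared in logs.
std::string BytesToHex(const char* data, size_t size) {
  static const char kHexDigits[] = "0123456789abcdef";
  std::string out;
  out.reserve(size * 2);
  for (size_t i = 0; i < size; ++i) {
    // Go through uint8_t so bytes >= 0x80 are not sign-extended.
    const auto byte = static_cast<uint8_t>(data[i]);
    out.push_back(kHexDigits[byte >> 4]);
    out.push_back(kHexDigits[byte & 0x0f]);
  }
  return out;
}
```

A fixed-width hex string keeps the log line greppable, and identical prefixes from two different files show up as identical strings.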

Issue Type

kind/enhancement

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@mdbridge mdbridge added area/docdb YugabyteDB core features status/awaiting-triage Issue awaiting triage labels Jan 31, 2024
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue good first issue This is a good issue to start contributing! and removed status/awaiting-triage Issue awaiting triage labels Jan 31, 2024
@Yogeshwar235

Hi,
I want to start contributing to YugaByte by picking up this task.

@Rhythm-26

Hi @Yogeshwar235, are you working on this?

@kajalchaudhary07

@mdbridge can I work on this?

@Rhythm-26

Hi @kajalchaudhary07,
I have already started working on this, based on an offline discussion on Slack.

@Rhythm-26

@mdbridge is it correct to assume data size = total file size - metadata size?

@mdbridge
Contributor Author

hmm. there are logging messages like:

 554209:I1216 15:21:33.238420  7518 event_logger.cc:66] T 2fb5e7e3b3b34ff486c024be1c6b7767 P 5cb8751e07a349de8773702367f70236 [R]: EVENT_LOG_v1 {"time_micros": 1734362486402833, "cf_name": "default", "job": 186, "event": "table_file_creation", "file_number": 1570, "file_size": 1259304, "table_properties": {"data_size": 1191646, "data_index_size": 1814, "filter_size": 65482, "filter_index_size": 22, "raw_key_size": 228996, "raw_average_key_size": 52, "raw_value_size": 2296003, "raw_average_value_size": 521, "num_data_blocks": 78, "num_entries": 4399, "num_filter_blocks": 1, "num_data_index_blocks": 1, "filter_policy_name": "DocKeyV3Filter", "kDeletedKeys": "0"}}

so there may be a more specific meaning. Let me check
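
As a quick check of the "data size = total file size - metadata size" question against the single log line quoted above, the arithmetic can be written out as below. The constants are copied from that log line; treating everything other than data blocks as "metadata" is an assumption made only for this illustration, and one sample does not establish a general rule.

```cpp
#include <cstdint>

// Values copied from the "table_file_creation" event quoted above.
constexpr uint64_t kFileSize = 1259304;
constexpr uint64_t kDataSize = 1191646;
constexpr uint64_t kDataIndexSize = 1814;
constexpr uint64_t kFilterSize = 65482;
constexpr uint64_t kFilterIndexSize = 22;

// Bytes in the file that are not data blocks: 1259304 - 1191646 = 67658.
constexpr uint64_t NonDataBytes() { return kFileSize - kDataSize; }

// Bytes not covered by any of the four size properties in the log line:
// 67658 - (1814 + 65482 + 22) = 340.
constexpr uint64_t UnaccountedBytes() {
  return NonDataBytes() - (kDataIndexSize + kFilterSize + kFilterIndexSize);
}

static_assert(NonDataBytes() == 67658, "file_size - data_size");
static_assert(UnaccountedBytes() == 340, "remainder after index and filter");
```

In this sample the four reported sizes do not add up to file_size exactly: 340 bytes are left over. What those bytes are is not stated in the log line.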

@mdbridge
Contributor Author

that comes from

~/code/yugabyte-db/src/yb/rocksdb/db/event_helpers.cc:38:
void EventHelpers::LogAndNotifyTableFileCreation(
   EventLogger* event_logger,
   const std::vector<std::shared_ptr<EventListener>>& listeners,
   const FileDescriptor& fd, const TableFileCreationInfo& info) {
 assert(event_logger);
 JSONWriter jwriter;
 AppendCurrentTime(&jwriter);
 jwriter << "cf_name" << info.cf_name
         << "job" << info.job_id
         << "event" << "table_file_creation"
         << "file_number" << fd.GetNumber()
         << "file_size" << fd.GetTotalFileSize();

 // table_properties
 {
   jwriter << "table_properties";
   jwriter.StartObject();

   // basic properties:
   jwriter << "data_size" << info.table_properties.data_size

@mdbridge
Contributor Author

that in turn comes from

~/code/yugabyte-db/src/yb/rocksdb/table_properties.h:47:
// TableProperties contains a bunch of read-only properties of its associated
// table.
struct TableProperties {
 public:
  // the total size of all data blocks.
  uint64_t data_size = 0;

@mdbridge
Contributor Author

ok, it sadly doesn't look like it would be easy to get the data size or metadata size to UniversalCompactionPicker::SortedRun::DumpSizeInfo. Presumably this is because RocksDB doesn't actually know this information and it would require a read from disk to get it.

Given this, I guess it's okay to drop the second part of this task.

@mdbridge
Contributor Author

mdbridge commented Jan 10, 2025

sorry about the delay replying; I've been on vacation and on-call.

@Rhythm-26

Oh okay. For the second part, though, I am able to print the metadata size and have raised a PR for it: #25465
Could you please take a look and check whether that's correct?

@rthallamko3
Contributor

rthallamko3 commented Jan 16, 2025

@kajalchaudhary07, @Yogeshwar235, if you want to contribute, feel free to pick up any of the issues marked as good first issue. cc me (@rthallamko3) on that and I can identify who can help you through the code reviews etc.


6 participants