
Duplicate Values in private_map File for Different Keys When Masking in sosreport #3806

Open
chaudhariyash10 opened this issue Oct 14, 2024 · 7 comments


@chaudhariyash10

When two sosreport collections run concurrently with masking enabled, duplicate entries are created in both the /etc/sos/clean/default_mapping file and the private_map file. This leads to different keys being mapped to the same masked value, causing confusion when trying to reverse-map the masked data from the logs.

Specifically, during concurrent sosreport runs, the first report might assign "host0": "gmail.com" while the second assigns "host0": "outlook.com". This results in duplicated entries in the private_map and default_mapping files, making it difficult to accurately identify the original data corresponding to the masked values.

Steps to Reproduce:

  1. Delete the /etc/sos/clean/default_mapping file.
  2. Start two sosreport collections concurrently with masking enabled.
  3. Inspect the /etc/sos/clean/default_mapping file and the private_map file in the generated tarballs.
  4. Observe that the two sosreport runs have different keys assigned to the same masked value.

Expected Behavior:
Each sosreport instance should assign a unique masked value to each key, without duplication, ensuring proper reverse mapping from logs.

Impact:
This issue causes confusion and potential data inaccuracies when mapping masked values back to the original sensitive information, which is problematic for log analysis and troubleshooting.
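The race is easy to model: each cleaner keeps its own in-memory copy of the mapping and hands out masked names from a counter, so two runs that start from the same on-disk state assign the same masked name to different keys. A minimal sketch (the `obfuscate` helper and dict structure are illustrative, not the actual sos cleaner code):

```python
def obfuscate(mapping, item):
    """Assign the next free masked name, mirroring the host0, host1, ... scheme."""
    if item not in mapping:
        mapping[item] = f"host{len(mapping)}"
    return mapping[item]

# Both runs load the same (here: empty) default_mapping into memory.
run_a = {}  # first sosreport instance's in-memory copy
run_b = {}  # second, concurrent instance's copy

obfuscate(run_a, "gmail.com")    # run A masks gmail.com as "host0"
obfuscate(run_b, "outlook.com")  # run B, unaware of A, also picks "host0"

assert run_a["gmail.com"] == run_b["outlook.com"] == "host0"
# Whichever run writes /etc/sos/clean/default_mapping last wins, and
# "host0" is now ambiguous across the two tarballs.
```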

[screenshot attachment]

@pmoravec
Contributor

That is a direct consequence of independent executions of sos cleaners. Something that would be difficult to prevent, especially when the cleaners operate on different input data/reports. The easiest solution would be to take a lock on the default mapping file to prevent concurrent cleaner executions. Another approach - not so easy - that would fix this issue as a side effect is https://github.com/orgs/sosreport/discussions/3476#discussioncomment-8501581 .

Any use case to run sos cleaner concurrently? Can't you streamline the executions to prevent the issue?
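The lock-on-the-mapping-file idea can be sketched with an advisory `flock` around a read-modify-write cycle. This is a hypothetical illustration assuming a flat JSON mapping; the function name and file handling are not current sos behavior:

```python
import fcntl
import json
import os

def update_mapping(path, new_pairs):
    """Merge new_pairs into the JSON mapping at path under an exclusive lock."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)  # create if missing
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until a concurrent cleaner releases it
        try:
            try:
                mapping = json.load(f)
            except ValueError:
                mapping = {}  # empty or brand-new file
            mapping.update(new_pairs)
            f.seek(0)
            f.truncate()
            json.dump(mapping, f)
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

The trade-off is exactly the one raised in the following comments: with this scheme a second cleaner simply waits in `LOCK_EX` for as long as the first one runs.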

@chaudhariyash10
Author

Thank you for pointing out that concurrent executions of sos cleaners cause this issue. Regarding streamlining the process: we have different ways to trigger sosreport, one user-driven and the other systemd-driven. When a systemd-driven sosreport is in progress, we wouldn't want to make the user wait for it to finish before they can initiate another report. Therefore, streamlining executions in these scenarios wouldn't be feasible, as we need to accommodate both types of triggers without introducing delays.

@pmoravec
Contributor

So even waiting for a lock is not a solution, as that would delay execution.

The only solution I see as possible, then, is https://github.com/orgs/sosreport/discussions/3476#discussioncomment-8501581.

@TurboTurtle or @arif-ali , any idea?

@TurboTurtle
Member

I don't think that rejecting a locking mechanism for not "delaying an execution" is a reasonable ask. That's the entire point of a locking mechanism, and is exactly what would happen today if multiple sos report collections were kicked off by a systemd service (by default, anyway) - systemd would block concurrent execution of the same .service while the first is still running.

@pmoravec
Contributor

Assuming the locks, consider a scenario like:

  • systemd kicks off sos report --clean
  • a user needs to collect sos report --clean but is (b)locked by the systemd-triggered cleaner for an arbitrarily long time

Do you see that as a reasonable cost for preventing the duplicate values in the mapping? If the cleaner were faster than it is now, or if its execution time were deterministic, then OK, but here..? I don't know..

Optionally, the user can kill the systemd-triggered cleaner (at the cost of not completing that activity; one can "resume" it by re-running the cleaner on the semi-cleaned directory, but the further steps of the systemd-streamed activity would need to be executed manually as well). If we had a lock test like "if there is a cleaner PID written in a lock file AND that process is running, then wait", this could work..
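The PID-in-a-lock-file test described above can be written with `os.kill(pid, 0)`, which probes for process existence without sending a signal. The lock file path and polling interval here are illustrative assumptions:

```python
import os
import time

def pid_alive(pid):
    """True if a process with this PID exists (signal 0 probes without killing)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True

def wait_for_cleaner(lock_file, poll=1.0):
    """Wait while a live cleaner holds the lock; clean up stale locks."""
    while True:
        try:
            with open(lock_file) as f:
                pid = int(f.read().strip())
        except (FileNotFoundError, ValueError):
            return  # no lock, or unreadable content: proceed
        if not pid_alive(pid):
            os.unlink(lock_file)  # stale lock left by a killed cleaner
            return
        time.sleep(poll)  # a live cleaner is running: wait and re-check
```

This covers the "user killed the systemd-triggered cleaner" case: the leftover lock file points at a dead PID, so the next cleaner removes it and proceeds instead of waiting forever.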

@NikhilKakade-1
Contributor

Can we use file locking and incrementally add each key-value pair as they are generated to effectively resolve the issue (instead of writing the entire set at the end of the sosreport process)?

@pmoravec
Contributor

pmoravec commented Jan 8, 2025

The way of fixing this depends very much on how we decide to fix #3097:

Can we use file locking and incrementally add each key-value pair as they are generated to effectively resolve the issue (instead of writing the entire set at the end of the sosreport process)?

This does not scale well. Performance is hard to guess, but assume there will be hundreds to thousands of records to be written to the mapping file. Then:

  • the file is first read at cleaner startup
  • on detecting the first new key, we must check whether the file has been updated in the meantime (a simple stat of mtime is enough here); if it has, we have to re-read it completely
  • writing a new key-value pair now means re-writing the whole JSON file from scratch; we can't append to the tail of the file because our JSON structure is not flat

So, having N new key-value records to write into a mapping file with M records already present, we (over)write roughly N·M + N²/2 records into the mapping file, iteratively. Even in the "linear" use case when no cleaner runs concurrently.

We can go that way, but this will be the cost.
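The write amplification can be checked with a quick sum: with M records already on disk, flushing the i-th of N new pairs means serializing a file that now holds M + i records. A small illustration (the record counts are arbitrary examples):

```python
def records_written(m, n):
    """Total records serialized when n new pairs are flushed one at a time
    into a mapping file that already holds m records."""
    return sum(m + i for i in range(1, n + 1))  # = n*m + n*(n+1)//2

# e.g. 1000 existing records and 500 new ones from a single cleaner run:
print(records_written(1000, 500))  # 625250 records serialized for 500 new pairs
```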

For the "values are hashes of keys from some secret seed" approach, we don't need such a lock on every write. The only locks are:

  • at the beginning, check if there is an existing seed and, if not, generate one (as an atomic operation)
  • at the end of cleaning, read the current mapping file, merge it with the newly generated pairs, and write it back (as an atomic operation)

Anything in between (the initial read of the mapping file, generating new key-value pairs from the same seed) can be done concurrently, and it will be deterministic (for the same given seed).
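The seed-based scheme can be sketched with an HMAC over the key, keyed by the shared seed. The name format and truncation length are assumptions for illustration; the point is that any cleaner holding the same seed derives the same masked value with no coordination:

```python
import hashlib
import hmac

def masked_value(seed: bytes, key: str, length: int = 8) -> str:
    """Deterministically derive a masked name for key from a secret seed."""
    digest = hmac.new(seed, key.encode(), hashlib.sha256).hexdigest()
    return f"host-{digest[:length]}"

seed = b"shared-secret-seed"          # hypothetical per-deployment seed
a = masked_value(seed, "gmail.com")   # cleaner run 1
b = masked_value(seed, "gmail.com")   # concurrent cleaner run 2

assert a == b                                    # same key, same value: no race
assert masked_value(seed, "outlook.com") != a    # distinct keys stay distinct
```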

So this request gives another argument for the "seed option" (https://github.com/orgs/sosreport/discussions/3476#discussioncomment-8501581). Which has its own gotchas: less user-friendly obfuscated names, the potential risk of guessing the seed via a dictionary attack (this is a serious concern), and running collector with cleaner among systems with multiple different seeds.
