GTC-2822 Check for dataset ownership on asset requests #515

danscales · 2024-05-10T15:17:03Z

GTC-2822 Check for dataset ownership on asset requests

Make use of get_owner in the asset operations.

The checks for the pure asset operations were slightly trickier than those for dataset/version operations, because the dataset name is not available in the URL path; you need to do a DB operation to get it. So, I put the actual get_owner() call in the main function rather than in a Depends() argument. I didn't try to add a new Depends() function that magically does the DB operation, since some of the functions already do the DB operation for other reasons.
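
A minimal sketch of the flow described above, written without FastAPI so it runs on its own. `User`, `AssetRow`, `PermissionDenied`, the in-memory `ASSETS` and `DATASET_OWNERS` tables, the `"MANAGER"` role name, and `update_asset` are hypothetical stand-ins; only `get_owner` and the admin-or-owner rule come from this PR. The point is that the handler fetches the asset row once and reuses it for the ownership check:

```python
import asyncio
from dataclasses import dataclass


class PermissionDenied(Exception):
    """Hypothetical stand-in for the HTTPException raised by the real code."""


@dataclass
class User:
    id: str
    role: str


@dataclass
class AssetRow:
    asset_id: str
    dataset: str


# Hypothetical in-memory "database" tables.
ASSETS = {"asset-1": AssetRow(asset_id="asset-1", dataset="my_dataset")}
DATASET_OWNERS = {"my_dataset": "manager-1"}


async def get_asset(asset_id: str) -> AssetRow:
    # Stand-in for the DB lookup; the dataset name is only known after this call.
    return ASSETS[asset_id]


async def get_owner(dataset: str, user: User) -> User:
    # Admins may write to any dataset; everyone else must be the owner.
    if user.role == "ADMIN":
        return user
    if DATASET_OWNERS[dataset] != user.id:
        raise PermissionDenied(f"Unauthorized write access to dataset {dataset}")
    return user


async def update_asset(asset_id: str, user: User) -> str:
    # One DB access: the row serves both the ownership check and the update.
    asset_row = await get_asset(asset_id)
    _ = await get_owner(asset_row.dataset, user)
    return f"updated {asset_row.asset_id} in {asset_row.dataset}"
```

A chained Depends() that looked up the row separately would repeat `get_asset` for handlers that already need the row for their own work.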

Updated the tests for update_metadata and update_field_metadata to cover success and failure cases for different users.

Added a missing check for the ADMIN role in get_owner() and updated its comment.

Updated the detail of the various permission exceptions to be more informative (GTC-2795). Let me know what you think. We could put a bit more information in these messages if we did these errors in the main functions, rather than in the Depends functions, but that is probably not worth it.

In test_dataset.py, several tests were being skipped unintentionally because they were missing the '@pytest.mark.asyncio' annotation, so I added it where needed.
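
To illustrate why the missing marker matters: calling an `async def` function only creates a coroutine object, and its body does not run until an event loop drives it. The sketch below uses only `asyncio`; the test name and the `ran` list are made up for illustration, and it does not reproduce pytest's exact skip behavior.

```python
import asyncio

ran = []


async def test_something():
    # Body of a hypothetical async test.
    ran.append("executed")


# Calling the function does not execute the body; it only builds a coroutine.
coro = test_something()
ran_before = list(ran)
coro.close()  # discard it cleanly so Python does not warn about an un-awaited coroutine

# An event loop (which the asyncio marker arranges for under pytest) actually runs it.
asyncio.run(test_something())
ran_after = list(ran)
```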

dmannarino · 2024-05-10T15:48:10Z

app/routes/datasets/dataset.py


+    if user.role == "ADMIN":
+        return user


Thanks for catching this, I thought I had it in the code but seem to have lost it. Justin caught it too.

dmannarino · 2024-05-10T15:55:09Z

tests_v2/unit/app/routes/datasets/test_dataset.py

-    # Create a dataset
-    app.dependency_overrides[get_manager] = get_admin_mocked
+    # Create a dataset with a manager, then make sure get_owner succeeds with an admin.
+    app.dependency_overrides[get_manager] = get_manager_mocked


Thanks for adding the explanations!

jterry64 · 2024-05-10T16:39:42Z

app/routes/assets/asset.py

+        raise HTTPException(status_code=404, detail=str(e))
+
+    # This is the actual check that the user is either the dataset owner or an admin
+    _ = await get_owner(asset_row.dataset, user)


Is there a way we can get the asset row/dataset ID through a chained depends function, or do you think that's not preferable?

As I discussed, I didn't want to do that, since some of the functions already fetch the asset row for their own reasons, so we would be doing the same DB access twice.

jterry64 · 2024-05-10T16:40:02Z

app/routes/datasets/dataset.py


+    if user.role == "ADMIN":
+        return user


I'll make mine match this

jterry64 · 2024-05-10T16:41:15Z

app/routes/datasets/dataset.py

    dataset_row: ORMDataset = await datasets.get_dataset(dataset)
    owner: str = dataset_row.owner_id
    if owner != user.id:
-        raise HTTPException(status_code=401, detail="Unauthorized")
+        raise HTTPException(status_code=401, detail=f"Unauthorized write access to dataset {dataset} (or its versions/assets) by a user who is not an admin or owner of the dataset")


Should we actually include the email of the owner? Otherwise they won't know who to contact. I added a function in another PR to look up a user by ID, so I can add that part here.

I'll leave GTC-2795 open (better access denied error messages) and fix this in a later change, just so I can get the current change in today.

Thanks for the comment/suggestion!
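
One possible shape for the more informative message discussed here, as a sketch only. `build_denied_detail` and its `owner_email` parameter are hypothetical; the user lookup mentioned above lives in another PR and is not shown, so the email is simply passed in.

```python
from typing import Optional


def build_denied_detail(dataset: str, owner_email: Optional[str] = None) -> str:
    """Build the detail string for a denied write (hypothetical helper)."""
    detail = (
        f"Unauthorized write access to dataset {dataset} (or its versions/assets) "
        "by a user who is not an admin or owner of the dataset"
    )
    if owner_email:
        # Tell the caller whom to contact, when the owner's email is known.
        detail += f". Contact the dataset owner at {owner_email}"
    return detail
```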

jterry64 · 2024-05-10T16:49:35Z

tests_v2/unit/app/routes/datasets/test_asset_metadata.py

+    # generic_vector_source_version creates with owner of manager_mocked, so
+    # update by manager_mocked should succeed
+    appmain.dependency_overrides[get_user] = get_manager_mocked
+


Not necessarily in this PR, but I'm wondering whether we can turn this into a reusable helper function, since we keep copying this code block.
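
One way such a helper might look, as a sketch. `override_dependency` is a hypothetical name; it assumes only that the app object exposes a `dependency_overrides` dict, as the test code above does. `FakeApp` and the two placeholder functions exist just to make the example runnable without FastAPI.

```python
from contextlib import contextmanager


@contextmanager
def override_dependency(app, dependency, replacement):
    """Temporarily swap one dependency, restoring the previous state on exit."""
    missing = object()
    previous = app.dependency_overrides.get(dependency, missing)
    app.dependency_overrides[dependency] = replacement
    try:
        yield
    finally:
        if previous is missing:
            app.dependency_overrides.pop(dependency, None)
        else:
            app.dependency_overrides[dependency] = previous


class FakeApp:
    """Minimal stand-in for the application object (illustration only)."""

    def __init__(self):
        self.dependency_overrides = {}


def get_user():
    return "real user"  # placeholder for the real dependency


def get_manager_mocked():
    return "mocked manager"  # placeholder for the mocked dependency


appmain = FakeApp()
with override_dependency(appmain, get_user, get_manager_mocked):
    # Requests made here would see the mocked manager as the current user.
    active = appmain.dependency_overrides[get_user]
```

Restoring the previous value on exit keeps one test's override from leaking into the next.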

codecov-commenter · 2024-05-10T18:25:42Z

Codecov Report

Attention: Patch coverage is 37.50%, with 15 lines in your changes missing coverage. Please review.

Project coverage is 81.71%. Comparing base (302db98) to head (7c81f3f).

| Files | Patch % | Lines |
|---|---|---|
| app/routes/assets/asset.py | 33.33% | 12 Missing ⚠️ |
| app/authentication/token.py | 0.00% | 3 Missing ⚠️ |

Additional details and impacted files

@@                   Coverage Diff                    @@
##           feature/data_manager     #515      +/-   ##
========================================================
- Coverage                 81.86%   81.71%   -0.16%     
========================================================
  Files                       125      125              
  Lines                      5565     5583      +18     
========================================================
+ Hits                       4556     4562       +6     
- Misses                     1009     1021      +12

| Flag | Coverage Δ |
|---|---|
| unittests | `81.71% <37.50%> (-0.16%)` ⬇️ |

danscales requested review from solomon-negusse, jterry64 and dmannarino May 10, 2024 15:17

dmannarino reviewed May 10, 2024

dmannarino approved these changes May 10, 2024

jterry64 requested changes May 10, 2024

jterry64 approved these changes May 10, 2024

Merge remote-tracking branch 'origin/feature/data_manager' into gtc-2822

7c81f3f

danscales merged commit 6b0285b into feature/data_manager May 10, 2024
2 checks passed

danscales deleted the gtc-2822 branch May 10, 2024 18:30
