DataBricks Unity Catalog and Cobrix #665
Some more error message context:
Do you read the copybook and the data file via the RDD API? If so, this is the likely cause, as Databricks does not support the RDD API in Unity Catalog shared access mode: https://learn.microsoft.com/en-us/azure/databricks/compute/access-mode-limitations#spark-api-limitations-for-unity-catalog-shared-access-mode
@schwalldorf, thanks for your interest in the project. Very glad you like it!
What is the Databricks-supported alternative for reading data files concurrently from Spark?
Hi Ruslan, thanks a lot for your reply.
Sure. Let's keep this issue open. This is something we might look at at some point. In the meantime somebody might suggest a workaround.
Hi there,
So far no progress on this, since I don't have access to a Databricks instance at the moment. This might change during the year, though, and I will keep in mind to fix it.
Any luck with an update on this?
Not from our side, since we are not yet using Databricks volumes on Unity Catalog. Has this issue been raised with Databricks support as well? If yes, please add a link to the issue. A possible workaround is to use:
Let me know if it works.
Sure, will check and update. Thank you.
@schwalldorf, @saikumare-a, @meghanavemisetty, if you have a stack trace that shows the lines of Cobrix Scala code where the error is happening, it would help a bit. This can at least confirm which API is used for file access at that location. Also, you can try:
Hi guys,
thanks a lot for Cobrix. It's really great!
We're moving from Spark (Hadoop) on-premises to Databricks in the Azure cloud,
and have encountered a strange problem when using Unity Catalog.
Both the copybook and the data are stored in a managed Volume in Unity Catalog. (The copybooks are simple, with no nested fields.) If we do something as simple as
in a Python notebook or script, everything works fine if the code runs on a compute cluster created by the same person who executes the code. If the code is run by Person A on a cluster created by Person B, an "Insufficient Permissions" exception is raised.
See
Person A has full read permissions on any item in the catalog.
The problem only arises when using Cobrix. If we just load a CSV or Parquet file from a Volume, no such problem occurs.
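For readers following along, the two kinds of reads being compared might look roughly like the sketch below. This is an illustration only, not the exact code from the report: the helper names `read_cobol` and `read_parquet` are hypothetical, and the paths passed in are whatever Volume paths apply in your workspace. The `cobol` format name and the `copybook` option are the ones Cobrix documents for its Spark data source.

```python
def read_cobol(spark, copybook_path, data_path):
    """Read a mainframe data file through the Cobrix data source.

    `spark` is an existing SparkSession (for example the one a Databricks
    notebook provides). Both paths are assumptions supplied by the caller.
    """
    return (
        spark.read.format("cobol")          # Cobrix data source short name
        .option("copybook", copybook_path)  # location of the copybook file
        .load(data_path)                    # location of the data file(s)
    )


def read_parquet(spark, data_path):
    """Read a Parquet file with Spark's built-in reader, for comparison."""
    return spark.read.parquet(data_path)
```

Calling both helpers with paths on the same Volume, as the same user on the same cluster, is one way to confirm that the permission error is specific to the Cobrix read path.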
Any idea what goes on here or what we could do?
Any help is much appreciated. Thanks a lot.