Bump Pydantic to version 2 #160

ThomasLaPiana · 2023-09-07T15:50:17Z

Closes #159

Description Of Changes

Pydantic 2 brings major changes, including large performance improvements. It is worth upgrading both to stay up to date and to pick up the downstream performance gains.

Pydantic 1 -> 2 Migration Guide: link

I fully expect we may need to tweak more things here after testing against Fides directly, but we'll need to run more checks there to verify. Here is the corresponding Fides PR

As part of this migration, I adopted a very different general pattern for validation than we used before. Due to the major changes, I found it easier to use model_validator (formerly known as root_validator) and composed multiple validators together into a single, full-model validator instead of always using discrete validators by field.
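To make the pattern described above concrete, here is a minimal sketch, assuming Pydantic 2.x is installed. The `Category` model and the two check functions are hypothetical stand-ins, not the actual fideslang code; the point is only that plain functions can be composed inside one `model_validator`.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, model_validator


def no_self_reference(fides_key: str, parent_key: Optional[str]) -> None:
    """Pure check (hypothetical): a key must not name itself as its parent."""
    if parent_key == fides_key:
        raise ValueError("FidesKey can not self-reference!")


def parent_key_matches(fides_key: str, parent_key: Optional[str]) -> None:
    """Pure check (hypothetical): 'a.b.c' must declare 'a.b' as its parent."""
    head, _, _ = fides_key.rpartition(".")
    expected = head or None  # a top-level key such as 'a' has no parent
    if parent_key != expected:
        raise ValueError(f"expected parent_key {expected!r}, got {parent_key!r}")


class Category(BaseModel):
    """Illustrative model, not a real fideslang resource."""

    fides_key: str
    parent_key: Optional[str] = None

    @model_validator(mode="after")
    def validate_whole_model(self) -> "Category":
        # All cross-field checks live in one place and run in a known order.
        no_self_reference(self.fides_key, self.parent_key)
        parent_key_matches(self.fides_key, self.parent_key)
        return self


ok = Category(fides_key="user.contact", parent_key="user")

try:
    Category(fides_key="user", parent_key="user")
except ValidationError as exc:
    first_error = exc.errors()[0]["msg"]
```

Because the checks are ordinary functions, they can be unit tested without building a model at all.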

Code Changes

make updates required for Pydantic v2
update the requirement bounds in requirements.txt
remove Python 3.8 from supported versions

Steps to Confirm

CI checks pass

Pre-Merge Checklist

requirements.txt

ThomasLaPiana · 2023-11-13T16:02:29Z

@NevilleS a major question we need to answer is if we also want to strip out everything that is deprecated as part of 3.0

It makes sense to me from a versioning standpoint, but I also want to get more informed opinions from

@SteveDMurphy @pattisdr & @adamsachs

ThomasLaPiana · 2023-11-14T07:58:33Z

.github/workflows/pr_checks.yml

@@ -93,9 +93,9 @@ jobs:
  Pytest-Matrix:
    strategy:
      matrix:
-        python_version: ["3.8", "3.9", "3.10", "3.11"]


Python 3.8 wasn't playing nicely, so I axed it. It is almost EOL and we can only make the matrix so large

Understood, how do we generally communicate things like this to customers?

Other than the changelog, pip install will fail automatically if they are on an older Python version.

Afaik most people are using Docker containers, so this would be most disruptive to customers using the CLI.

ThomasLaPiana · 2023-11-14T07:59:03Z

.github/workflows/pr_checks.yml

-        pydantic_version: ["1.8.2", "1.9.2", "1.10.9"]
-        pyyaml_version: ["5.4.1", "6.0"]
+        python_version: ["3.9", "3.10", "3.11"]
+        pydantic_version: ["2.2.1", "2.3.0", "2.4.2", "2.5.0"]


These are the most recent ones, but Pydantic moves fast so we'll need to keep an eye on this

ThomasLaPiana · 2023-11-14T08:01:52Z

requirements.txt

@@ -1,3 +1,3 @@
-pydantic>=1.8.1,<1.11.0
+pydantic>=2.2.1,<=2.6.0


Note: we are not cross-supporting Pydantic 1.x!

src/fideslang/default_taxonomy/utils.py

ThomasLaPiana · 2023-11-14T08:03:18Z

src/fideslang/validation.py

-    """
-    A FidesKey type that creates a custom constrained string.
-    """
+FidesKey = Annotated[str, PlainValidator(validate_fides_key)]


Custom types are now handled very differently: they are declared with Annotated plus a validator instead of as classes that inherit from a Pydantic type.

Thanks for all these helpful review comments

I copied you! haha yours are always so helpful
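For anyone following the FidesKey change in this thread, here is a minimal sketch of the Annotated pattern, assuming Pydantic 2.x. The regex value and the `Resource` model are assumptions made for illustration; only the `FidesKey = Annotated[...]` line comes from the diff.

```python
import re
from typing import Annotated

from pydantic import BaseModel, PlainValidator, ValidationError

# Assumed pattern for illustration; the real FIDES_KEY_PATTERN may differ.
FIDES_KEY_PATTERN = r"^[a-zA-Z0-9_.<>-]+$"


def validate_fides_key(value: str) -> str:
    """Plain function: return the value unchanged or raise ValueError."""
    if not re.match(FIDES_KEY_PATTERN, value):
        raise ValueError(f"'{value}' is not a valid FidesKey")
    return value


# The custom type is just `str` with a validator attached via Annotated.
FidesKey = Annotated[str, PlainValidator(validate_fides_key)]


class Resource(BaseModel):
    """Hypothetical model used only to exercise the type."""

    fides_key: FidesKey


resource = Resource(fides_key="user.contact.email")

try:
    Resource(fides_key="not a key!")
except ValidationError as exc:
    error_count = exc.error_count()
```

Note that `PlainValidator` replaces Pydantic's own `str` validation, so the function is the only check that runs for this type.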

ThomasLaPiana · 2023-11-14T08:04:01Z

src/fideslang/validation.py

-        raise FidesValidationError("FidesKey can not self-reference!")
-    return value
-
-
 def deprecated_version_later_than_added(


I updated almost all of these functions to be pure functions, not set up as specific validators. They are then more transparently composed on the model itself

ThomasLaPiana · 2023-11-14T08:04:40Z

tests/conftest.py

@@ -16,14 +16,14 @@ def resources_dict():
    """
    resources_dict: Dict[str, Any] = {
        "data_category": models.DataCategory(
-            organization_fides_key=1,
+            organization_fides_key="1",


Pydantic is more picky now! It will not do these subtle conversions (int -> str in this case) but instead will throw an error

ah ok, feels like something that will cause potential issues with existing yaml files etc.

potentially for sure...I'm going to test this branch out in Fides and see what kind of damage it does 😬

it did, so I had to update everything in Fides....will need to think about how to handle this if we don't want to break things...

although coming from the db they should already be correct
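A small sketch of the behavior discussed in this thread, assuming Pydantic 2.x: an `int` passed to a `str` field is rejected rather than converted. The `LenientStr` type is a hypothetical workaround for callers that still send integers, not something this PR adds.

```python
from typing import Annotated, Any

from pydantic import BaseModel, BeforeValidator, ValidationError


class Strict(BaseModel):
    """Plain `str` field: Pydantic 2 will not convert an int to a str."""

    organization_fides_key: str


def int_to_str(value: Any) -> Any:
    """Hypothetical shim: stringify ints (but not bools) before validation."""
    if isinstance(value, int) and not isinstance(value, bool):
        return str(value)
    return value


LenientStr = Annotated[str, BeforeValidator(int_to_str)]


class Lenient(BaseModel):
    """Same field, but with the shim applied first."""

    organization_fides_key: LenientStr


try:
    Strict(organization_fides_key=1)
    strict_rejected = False
except ValidationError:
    strict_rejected = True

lenient = Lenient(organization_fides_key=1)
```

Whether a shim like this is worth it depends on how many existing YAML files carry unquoted numeric keys.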

ThomasLaPiana · 2023-11-14T08:11:50Z

tests/fideslang/test_models.py

            system_type="SYSTEM",
            tags=["some", "tags"],
        )

-    def test_system_no_egress(self) -> None:


I'm confused here, and I think this was a bug? Not sure why these were expected to fail before but aren't now. Especially given that they seem fine given how the model was written? This is really old code though so I'm not sure if just deleting it is right, but I'm not sure what it is solving either.

I would appreciate any possible insight here!

I think we need to put this back and fix the model-level validation.

It looks like privacy_declarations_reference_data_flows is not getting called now. In this first test, the privacy declaration seems to reference a system, but there are no system egresses?

ThomasLaPiana · 2023-11-14T08:27:25Z

src/fideslang/models.py

-        "Config for the Evaluation"
-        extra = "ignore"
-        orm_mode = True
+    model_config = ConfigDict(extra="ignore", from_attributes=True)


another change, this time with how models are configured
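For reference, a minimal sketch of the new configuration style, assuming Pydantic 2.x. `CookieRow` is a hypothetical stand-in for an ORM row; `from_attributes=True` is the v2 name for what `orm_mode = True` did.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict


class CookieRow:
    """Hypothetical ORM-like object that exposes data as attributes."""

    def __init__(self, name: str, path: Optional[str] = None) -> None:
        self.name = name
        self.path = path


class Cookie(BaseModel):
    # Replaces the nested `class Config` used in Pydantic 1.
    model_config = ConfigDict(extra="ignore", from_attributes=True)

    name: str
    path: Optional[str] = None


# Reads attributes from the object; this needs from_attributes=True.
from_row = Cookie.model_validate(CookieRow("session", "/"))

# Unknown keyword arguments are dropped because of extra="ignore".
from_kwargs = Cookie(name="session", unexpected="dropped")
```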

ThomasLaPiana · 2023-11-14T08:28:08Z

src/fideslang/models.py

        description="An array of data categories describing a system in a privacy declaration.",
    )
    data_use: FidesKey = Field(
        description="The Data Use describing a system in a privacy declaration.",
    )
    data_subjects: List[FidesKey] = Field(
-        default_factory=list,
+        default=[],


I nitted myself here! As taught by @pattisdr it is safe to use [] as a default value in Pydantic
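A quick way to convince yourself of this, assuming Pydantic 2.x: each instance receives its own copy of the default list, so mutating one instance does not leak into the next. The model below is a trimmed, hypothetical version of the one in the diff.

```python
from typing import List

from pydantic import BaseModel, Field


class PrivacyDeclaration(BaseModel):
    """Trimmed illustrative model with a mutable default."""

    data_subjects: List[str] = Field(default=[])


first = PrivacyDeclaration()
first.data_subjects.append("customer")

# A fresh instance still starts from an empty list.
second = PrivacyDeclaration()
```

This differs from plain Python function defaults and from dataclasses, where a shared mutable default is a well-known trap.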

pattisdr · 2023-11-14T22:49:14Z

Starting review..a lot to catch up on here!

ThomasLaPiana · 2023-11-15T05:19:48Z

Starting review..a lot to catch up on here!

I apologize! I just wasn't sure how to break it up further

I added comments everywhere I thought made sense but of course feel free to dig in and ask other questions and I'll reply as best as I can!

pattisdr · 2023-11-15T14:19:34Z

Oh no this is exciting! Lots of great improvements here, I more meant big picture understanding what Pydantic 2 brings as well.

pattisdr · 2023-11-15T14:24:13Z

There's a whole new lingo to learn!

pattisdr

Huge effort here, Thomas. The main thing I'd check is that the validators that work on a list of items still work, since in some places they're not getting called. Only small things were noted here, though, and they should be quick to turn around.

Have you just pinned this latest fideslang commit in Fides and experimented with the level of effort required here? I wonder if trying to integrate will require more changes in Fideslang, so it might be nice to verify before we do the big 3.0 release.

pattisdr · 2023-11-15T15:26:51Z

src/fideslang/models.py

    check_valid_country_code
 )
-matching_parent_key_validator = validator("parent_key", allow_reuse=True, always=True)(


Nice, easier to follow too on the model!

pattisdr · 2023-11-15T15:46:44Z

src/fideslang/models.py

+    name: Optional[str] = Field(
+        default=None, description="Human-Readable name for this resource."
+    )
+    description: Optional[str] = Field(
+        default=None, description="A detailed description of what this resource is."
+    )


What was behind pulling out the usage of name_field and description_field here, when you're still using them on other models? Just curious

this is kind of a messy/bad abstraction. Names can sometimes be optional, and sometimes are not (they're treated as keys) so in my opinion this was a premature optimization/abstraction on my part :)

pattisdr · 2023-11-15T15:57:59Z

src/fideslang/models.py

-        """Config for the cookies"""
-
-        orm_mode = True
+    path: Optional[str] = None


Ah got it thanks!

pattisdr · 2023-11-15T16:21:15Z

src/fideslang/models.py

+        default=None,
+        description="Deprecated. "
+        + (
+            DataProtectionImpactAssessment.__doc__ or ""


It looks like the DataProtectionImpactAssessment progress/link fields might need default=None too (although this is supposed to be deprecated anyway)

pattisdr · 2023-11-15T16:41:07Z

src/fideslang/models.py

-    @validator("legitimate_interest", always=True)
+    @field_validator("legitimate_interest")
    @classmethod
-    def set_legitimate_interest(cls, value: bool, values: Dict) -> bool:
+    def set_legitimate_interest(cls, value: bool, info: ValidationInfo) -> bool:
        """Sets if a legitimate interest is used."""
+        values = info.data
+
        if values["legal_basis"] == "Legitimate Interests":
            value = True
        return value


This doesn't seem to run if legitimate_interest is not an argument when instantiating a DataUse. Looks like you can add validate_default=True to the legitimate_interest Field above to get the same always=True behavior.

(It doesn't matter much in practice since this field is deprecated, which is an argument for removing these deprecated fields soon.)
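A minimal sketch of the difference, assuming Pydantic 2.x. Both classes are trimmed, hypothetical versions of DataUse; the only difference between them is `validate_default=True` on the field.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationInfo, field_validator


class DataUseWithoutFlag(BaseModel):
    legal_basis: Optional[str] = None
    legitimate_interest: bool = False

    @field_validator("legitimate_interest")
    @classmethod
    def set_legitimate_interest(cls, value: bool, info: ValidationInfo) -> bool:
        # Only runs when the caller passes legitimate_interest explicitly.
        if info.data.get("legal_basis") == "Legitimate Interests":
            return True
        return value


class DataUseWithFlag(BaseModel):
    legal_basis: Optional[str] = None
    # validate_default=True restores the old `always=True` behavior.
    legitimate_interest: bool = Field(default=False, validate_default=True)

    @field_validator("legitimate_interest")
    @classmethod
    def set_legitimate_interest(cls, value: bool, info: ValidationInfo) -> bool:
        if info.data.get("legal_basis") == "Legitimate Interests":
            return True
        return value


without_flag = DataUseWithoutFlag(legal_basis="Legitimate Interests")
with_flag = DataUseWithFlag(legal_basis="Legitimate Interests")
```

The validator relies on `legal_basis` being declared before `legitimate_interest`, since `info.data` only holds fields that have already been validated.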

pattisdr · 2023-11-15T19:00:38Z

src/fideslang/models.py

@@ -1084,48 +1096,60 @@ class System(FidesModel):
    """

    registry_id: Optional[int] = Field(


It looks like in Fideslang the type is an integer, but the tests were updated to registry_id="1"

The ctl_systems table has it as a string though, so should this be updated to a string? Come to think of it I think I've run into 500 errors before around this - the discrepancy between fideslang and the database here.

Table "public.ctl_systems"
   Column    |       Type        | Collation | Nullable | Default
-------------+-------------------+-----------+----------+---------
 registry_id | character varying |           |          |

uh oh...hmm ok I'll need to do more testing over in fides for this

pattisdr · 2023-11-15T19:10:41Z

src/fideslang/validation.py

-    def __get_validators__(cls) -> Generator:
-        yield cls.validate
+    regex: Pattern[str] = re.compile(FIDES_KEY_PATTERN)
+    if not regex.match(value):


This throws confusing Type errors if the fides key is not a string, should we add more validation to require FidesKeys to be a string upfront?

In which situations? We might need to pass in as str specifically
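One possible shape for the upfront check raised in this thread, as a sketch using only the standard library. The pattern value and the `FidesValidationError` class here are stand-ins defined for illustration; the real ones live in fideslang and may differ.

```python
import re
from typing import Any, Pattern

# Assumed pattern for illustration; the real FIDES_KEY_PATTERN may differ.
FIDES_KEY_PATTERN = r"^[a-zA-Z0-9_.<>-]+$"


class FidesValidationError(ValueError):
    """Stand-in for the project's own validation error type."""


def validate_fides_key(value: Any) -> str:
    """Reject non-strings with a clear message before touching the regex."""
    if not isinstance(value, str):
        # Without this guard, regex.match(1) raises a bare TypeError.
        raise FidesValidationError(
            f"FidesKey must be a string, got {type(value).__name__}"
        )
    regex: Pattern[str] = re.compile(FIDES_KEY_PATTERN)
    if not regex.match(value):
        raise FidesValidationError(f"'{value}' is not a valid FidesKey")
    return value


try:
    validate_fides_key(1)
except FidesValidationError as exc:
    message = str(exc)
```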

ThomasLaPiana · 2023-11-20T06:04:49Z

There's a whole new lingo to learn!

no kidding! Soooo many major changes here, hopefully the performance improvements and new ergonomics are worth it!

NevilleS · 2024-01-08T22:46:20Z

Migrated this one to Ethyca's downstream fork: ethyca#9. Once we finish this update for the Python bindings it'll be useful to merge upstream 👍

Bump Pydantic to version 2

6891ade

ThomasLaPiana self-assigned this Sep 7, 2023

ThomasLaPiana and others added 4 commits September 8, 2023 10:45

Add Pydantic 2 to Action Matrix

c491758

checkin

153bf2d

feat: remove data qualifiers

7693bcb

feat: more qualifier removals

3e486b1

ThomasLaPiana changed the base branch from main to ThomasLaPiana-remove-data-qualifiers November 10, 2023 08:30

ThomasLaPiana and others added 6 commits November 10, 2023 16:32

Merge branch 'ThomasLaPiana-remove-data-qualifiers' into ThomasLaPiana-bump-pydantic-2

7371ff7

fix: mypy and pylint

20a0785

fix more failing tests

d55e284

fix version and fides_key validation test failures

5f2c86c

fix missing defaults

a718e17

clean up more test failures

c329f44

TheAndrewJackson reviewed Nov 13, 2023

ThomasLaPiana added 7 commits November 14, 2023 00:04

fix more errors

bfb8d3d

fix parent key validation

25e5b80

get everything passing (by removing two tests)

73edf8b

feat: update CI checks for new pydantic versions

126e4be

update requirements file, remove python 3.8 and add 3.12 to supported versions

cc4d782

remove python 3.12 from the matrix

ba08cf1

fix static checks

12ef6a8

ThomasLaPiana commented Nov 14, 2023

docs: changelog

63d2795

ThomasLaPiana requested review from TheAndrewJackson, adamsachs and pattisdr November 14, 2023 08:38

ThomasLaPiana marked this pull request as ready for review November 14, 2023 08:38

pattisdr reviewed Nov 15, 2023

Base automatically changed from ThomasLaPiana-remove-data-qualifiers to fideslang-3 November 27, 2023 05:01

ThomasLaPiana and others added 6 commits November 27, 2023 21:49

re-add tests and fix privacy declaration checks

40d4faa

Merge branch 'fideslang-3' into ThomasLaPiana-bump-pydantic-2

d82bf8c

fix flexible default test

537537c

fix static checks

191f6df

fix the validators on our custom fideskey types to be json schema valid

9ec615a

turned off strict checking for the registry id

43abfb7

ThomasLaPiana requested a review from pattisdr November 28, 2023 13:58

ThomasLaPiana added 4 commits November 30, 2023 12:11

add more None defaults to optional types

e7de840

remove all uses of URL since they might cause issues with database storage in Fides

a66a4ce

remove a validator on System that caused issues when loading from Orm in Fides

557aebf

remove deprecation tests and update model (root) validators

b87a229

NevilleS mentioned this pull request Jan 8, 2024

Bump Pydantic to version 2 ethyca/fideslang#9

Closed

9 tasks

NevilleS closed this Jan 8, 2024

NevilleS deleted the ThomasLaPiana-bump-pydantic-2 branch January 8, 2024 22:47

pattisdr mentioned this pull request Jun 17, 2024

Fideslang Pydantic V2 Upgrade ethyca/fideslang#11

Merged

21 tasks
