Instruct fury to discard classInfo in the serialised data during deserialisation #1980

Open
Nikhil-n17 opened this issue Dec 16, 2024 · 5 comments

Comments

@Nikhil-n17

Feature Request

This scenario is specific to Fury data migration.

We have serialised data in the DB that was written without registering some classes, and we now want to register those classes. During the migration we can therefore encounter the following two new scenarios:

  • Data serialised by fury instance with class registration --> deserialised by fury instance without class registration -- THIS DOES NOT WORK

  • Data serialised by fury instance without class registration --> deserialised by fury instance with class registration

Just wondering, is there a way to instruct Fury to disregard the classInfo in the serialised data during the deserialisation process?

Is your feature request related to a problem? Please describe

As the serialised data contains classInfo, during deserialisation Fury seems to expect either a class with the same package structure or a corresponding class registered with the same classId. This makes it difficult to refactor the data objects during the migration.

Describe the solution you'd like

A feature or flag that instructs the Fury deserialiser instance not to consider the classInfo in the serialised data. Any other solution that removes the dependency on class registration would also be very helpful.

Describe alternatives you've considered

No response

Additional context

Code snippet to reproduce the issue locally.

import io.fury.Fury;
import io.fury.ThreadLocalFury;
import io.fury.ThreadSafeFury;
import io.fury.config.CompatibleMode;
import io.fury.config.Language;

public class FuryProblem {

    public static void main(String... args) {
        Wrapper wrapper = new Wrapper();
        ComposedObject composedObject = new ComposedObject();
        composedObject.setEnabled(true);
        wrapper.setComposedObject(composedObject);
        byte[] srcBytes = furyWithRegistration.serializeJavaObject(wrapper);
        Wrapper resultWrapper = furyWithoutRegistration.deserializeJavaObject(srcBytes, Wrapper.class);
        assert composedObject.getEnabled() == resultWrapper.getComposedObject().getEnabled();
    }

    public static ThreadSafeFury furyWithRegistration =
            new ThreadLocalFury(
                    classLoader -> {
                        Fury f =
                                Fury.builder()
                                        .withLanguage(Language.JAVA)
                                        .withClassLoader(classLoader)
                                        .registerGuavaTypes(false)
                                        .withCompatibleMode(CompatibleMode.COMPATIBLE)
                                        .requireClassRegistration(false)
                                        .build();

                        f.register(Wrapper.class);
                        f.register(ComposedObject.class);
                        return f;
                    });

    public static ThreadSafeFury furyWithoutRegistration =
            new ThreadLocalFury(
                    classLoader -> Fury.builder()
                            .withLanguage(Language.JAVA)
                            .withClassLoader(classLoader)
                            .registerGuavaTypes(false)
                            .withCompatibleMode(CompatibleMode.COMPATIBLE)
                            .requireClassRegistration(false)
                            .build());

    static class Wrapper {
        ComposedObject composedObject;

        public ComposedObject getComposedObject() {
            return composedObject;
        }

        public void setComposedObject(ComposedObject composedObject) {
            this.composedObject = composedObject;
        }
    }
    static class ComposedObject {
        Boolean enabled;

        public void setEnabled(Boolean enabled) {
            this.enabled = enabled;
        }

        public Boolean getEnabled() {
            return enabled;
        }
    }

}
@chaokunyang
Collaborator

Hi @Nikhil-n17, why do you want to register some classes? Is it for security reasons, or do you want to skip writing class names to get smaller serialized data?

It's hard to ignore the class id in the serialized data. You must either use registration all the time or not use it at all. When you do register, the classes must be registered in the same order and with the same class ids on both sides.
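To illustrate why registration order matters, here is a minimal sketch of a registry that hands out ids sequentially. `SequentialRegistry` is a hypothetical stand-in written for this example, not Fury's actual class resolver; it only assumes that ids are assigned in registration order, as the comment above implies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a serializer's class registry; not Fury's real implementation.
final class SequentialRegistry {
    private final List<String> namesById = new ArrayList<>();

    // Assigns the next free id in registration order and returns it.
    int register(String className) {
        namesById.add(className);
        return namesById.size() - 1;
    }

    // Resolves an id found in serialized data back to a class name.
    String resolve(int id) {
        return namesById.get(id);
    }

    public static void main(String[] args) {
        SequentialRegistry writer = new SequentialRegistry();
        int wrapperId = writer.register("Wrapper");
        writer.register("ComposedObject");

        SequentialRegistry reader = new SequentialRegistry();
        reader.register("ComposedObject"); // registered in the opposite order
        reader.register("Wrapper");

        // The writer stored wrapperId for Wrapper, but the reader maps that id to another class.
        System.out.println(writer.resolve(wrapperId) + " vs " + reader.resolve(wrapperId));
    }
}
```

With mismatched order the same id resolves to different classes on the two sides, which is why a reader cannot simply ignore the id stored in the data.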

One solution I can think of is to use multiple Fury instances with different configurations. Write a different magic number for each Fury instance, and use the corresponding Fury instance for deserialization.

Here is an example:

// Writer side: each Fury instance tags its payload with its own magic number.
Fury fury1 = Fury.builder().withName("fury1").build();
MemoryBuffer buffer = xxx;
buffer.writeInt32(magicNumber1);
fury1.serialize(buffer, new SomeObject());

Fury fury2 = Fury.builder().withName("fury2").build();
MemoryBuffer buffer2 = xxx;
buffer2.writeInt32(magicNumber2);
fury2.serialize(buffer2, new SomeObject());

// Reader side: read the magic number first, then dispatch to the matching instance.
int magicNumber = buffer.readInt32();
if (magicNumber == magicNumber1) {
  fury1.deserialize(buffer);
} else if (magicNumber == magicNumber2) {
  fury2.deserialize(buffer);
}
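The magic-number idea can also be sketched without any Fury dependency. In the sketch below, `Codec` is a hypothetical stand-in for one configured Fury instance, and `MagicDispatch`, `named`, and the magic values used by callers are assumptions made for illustration; only `java.nio.ByteBuffer` is a real API here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

final class MagicDispatch {
    // Hypothetical stand-in for one configured serializer instance (e.g. one Fury instance).
    interface Codec {
        byte[] encode(String value);
        String decode(byte[] payload);
    }

    // Toy codec that prefixes decoded output with its name so the chosen instance is visible.
    static Codec named(String name) {
        return new Codec() {
            public byte[] encode(String value) {
                return value.getBytes(StandardCharsets.UTF_8);
            }
            public String decode(byte[] payload) {
                return name + ":" + new String(payload, StandardCharsets.UTF_8);
            }
        };
    }

    private final Map<Integer, Codec> codecsByMagic = new HashMap<>();

    void add(int magic, Codec codec) {
        codecsByMagic.put(magic, codec);
    }

    // Looks up the codec for a magic number and fails loudly if none is registered.
    private Codec lookup(int magic) {
        Codec codec = codecsByMagic.get(magic);
        if (codec == null) {
            throw new IllegalArgumentException("Unknown magic number: " + magic);
        }
        return codec;
    }

    // Writes the 4-byte magic number followed by the codec's payload.
    byte[] write(int magic, String value) {
        byte[] payload = lookup(magic).encode(value);
        return ByteBuffer.allocate(4 + payload.length).putInt(magic).put(payload).array();
    }

    // Reads the magic number, then hands the remaining bytes to the matching codec.
    String read(byte[] data) {
        ByteBuffer buffer = ByteBuffer.wrap(data);
        Codec codec = lookup(buffer.getInt());
        byte[] payload = new byte[buffer.remaining()];
        buffer.get(payload);
        return codec.decode(payload);
    }
}
```

In a real setup each `Codec` would wrap a Fury instance with its own registrations, and the magic numbers would need to be fixed constants shared by every reader and writer.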

@Nikhil-n17
Author

Nikhil-n17 commented Dec 17, 2024

Previously, in the legacy app, we had a single Fury instance to serialise and deserialise. For now, the database only contains data serialised by this instance. The existing app is being migrated to a new system (System B), in which the DTOs are refactored (renamed, and reused where possible) and a new Fury instance is created. We are facing issues when deserialising the data in the DB (written by the legacy app) in the new system, and vice versa.

Example
System A - Legacy app - Database already contains data written by this app.

public class DTOStructSystemB {
    String stringField;
    int intField;
    CustomDTOR1 customDTOR1;
    CustomDTOR2 customDTOR2;
}

public class CustomDTOR1 {
    String name;
    String age;
}

public class CustomDTOR2 {
    String name;
    String age;
} 

/* Both CustomDTOR2 and CustomDTOR1 were registered in fury instance */

System B - NEW System

  • The new system won't use the DTOs or the Fury instance from the old app.
public class DTOStructSystemB {
    String stringField;
    int intField;
    CustomDTOinNewSystem customDTOinNewSystem1;
    CustomDTOinNewSystem customDTOinNewSystem2;
}

public class CustomDTOinNewSystem {
    String name;
    String age;
}

We are looking for a way to create a Fury instance in System B such that both classes are correctly deserialised to CustomDTOinNewSystem. Currently we are forced to keep another DTO of the same structure in the new system just to be able to deserialise the data written by the legacy app.

@Nikhil-n17
Author

Could you please help us find a way out of this problem?

@chaokunyang
Collaborator

chaokunyang commented Jan 3, 2025

@Nikhil-n17 You need to create two Fury instances with different configurations and registrations, one for the old application and one for the new. Here are the steps:

step1

Upgrade your legacy application to parse a special header that indicates the data comes from the newer Fury. If the header matches, use another Fury instance that has the same registered classes as the new application.

step2

Upgrade your new application so that, when it serializes data with the new Fury, it writes the special header first. The legacy application can then use that header to decide whether to dispatch the data to the old or the new Fury.
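The two steps can be sketched as a small routing helper. Everything named here (`HeaderRouter`, `NEW_FORMAT_HEADER`, `Route`) is hypothetical and chosen for illustration. The sketch assumes that data already in the database carries no header and that legacy payloads never happen to start with the chosen marker bytes, which would need to be verified against real data.

```java
import java.util.Arrays;

final class HeaderRouter {
    // Hypothetical marker; any byte sequence that legacy payloads cannot start with would do.
    static final byte[] NEW_FORMAT_HEADER = {(byte) 0xF0, (byte) 0x9F, 'N', '1'};

    enum Route { LEGACY, NEW }

    // Step 2: the new application prepends the header before storing the payload.
    static byte[] wrapNew(byte[] payload) {
        byte[] out = Arrays.copyOf(NEW_FORMAT_HEADER, NEW_FORMAT_HEADER.length + payload.length);
        System.arraycopy(payload, 0, out, NEW_FORMAT_HEADER.length, payload.length);
        return out;
    }

    // Step 1: a reader checks for the header; data without it is treated as legacy data.
    static Route route(byte[] data) {
        if (data.length < NEW_FORMAT_HEADER.length) {
            return Route.LEGACY;
        }
        for (int i = 0; i < NEW_FORMAT_HEADER.length; i++) {
            if (data[i] != NEW_FORMAT_HEADER[i]) {
                return Route.LEGACY;
            }
        }
        return Route.NEW;
    }

    // Strips the header so the remaining bytes can be passed to the new serializer instance.
    static byte[] unwrapNew(byte[] data) {
        return Arrays.copyOfRange(data, NEW_FORMAT_HEADER.length, data.length);
    }
}
```

A reader would call `route` first, pass the untouched bytes to the old Fury instance for `LEGACY`, and pass `unwrapNew(data)` to the new Fury instance for `NEW`.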

@chaokunyang
Collaborator

I think this already resolves your issue. Is there anything I missed?
