Performance concern in StartUpDocumentMigrationRunner.cs #52

Open
iamzhaoxu opened this issue Oct 21, 2021 · 1 comment

@iamzhaoxu

Hi,

I have a concern about the code below from StartUpDocumentMigrationRunner.cs.
As you can see, the documents that require migration are loaded through a cursor, which is good. However,
after each ReplaceOneModel is built, it is added to the list variable "bulk", so every WriteModel stays in memory until the single BulkWrite call at the end.

If the number of documents is small, this is fine. But if we need to migrate millions of documents, won't this become a memory problem, since millions of write models will be held in memory at once?

        public void RunAll()
        {
            var locations = _collectionLocator.GetLocatesOrEmpty();

            foreach (var locate in locations)
            {
                var information = locate.Value;
                var type = locate.Key;
                var databaseName = GetDatabaseOrDefault(information);
                var collectionVersion = _documentVersionService.GetCollectionVersion(type);

                var collection = _client.GetDatabase(databaseName)
                    .GetCollection<BsonDocument>(information.Collection);

                var bulk = new List<WriteModel<BsonDocument>>();

                var query = CreateQueryForRelevantDocuments(type);

                using (var cursor = collection.FindSync(query))
                {
                    while (cursor.MoveNext())
                    {
                        var batch = cursor.Current;
                        foreach (var document in batch)
                        {
                            _migrationRunner.Run(type, document, collectionVersion);

                            var update = new ReplaceOneModel<BsonDocument>(
                                new BsonDocument {{"_id", document["_id"]}},
                                document
                            );

                            bulk.Add(update);
                        }
                    }
                }

                if (bulk.Count > 0) collection.BulkWrite(bulk);
            }
        }
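
For illustration, one way to bound memory would be to flush the write models in fixed-size batches instead of collecting them all first. The following is only a minimal sketch under assumptions: `BatchedWriter`, `FlushInBatches`, and the batch size are hypothetical names and values chosen here, not part of the library, and the helper is deliberately generic so it can run without a database.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the library): bounds memory by handing
// items to a callback in groups of at most `batchSize`.
public static class BatchedWriter
{
    // Streams `source` and calls `flush` whenever the buffer is full, plus
    // once more for any remainder. Returns the number of flush calls made.
    public static int FlushInBatches<T>(
        IEnumerable<T> source,
        int batchSize,
        Action<IReadOnlyList<T>> flush)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (flush == null) throw new ArgumentNullException(nameof(flush));
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var buffer = new List<T>(batchSize);
        var flushes = 0;

        foreach (var item in source)
        {
            buffer.Add(item);
            if (buffer.Count == batchSize)
            {
                // The same list instance is reused after this call, so the
                // callback must not keep a reference to it.
                flush(buffer);
                buffer.Clear();
                flushes++;
            }
        }

        // Write whatever is left over after the last full batch.
        if (buffer.Count > 0)
        {
            flush(buffer);
            flushes++;
        }

        return flushes;
    }
}
```

In RunAll, the source would presumably be the stream of ReplaceOneModel instances produced inside the cursor loop, and the callback would call collection.BulkWrite(batch), so at most `batchSize` write models are alive at any time. Whether interleaving writes with the open cursor is safe for this query has not been verified here.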
@rpallares

rpallares commented Oct 26, 2023

Do you have any update on this topic?
Is there a preferred design that avoids holding all documents in memory?

Is it possible to inject our own StartUpDocumentMigrationRunner?
