Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Summary docs #10435

Merged
merged 17 commits into from
Jun 1, 2022
Merged
Show file tree
Hide file tree
Changes from 14 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 4 additions & 6 deletions api-report/runtime-definitions.api.md
Original file line number Diff line number Diff line change
Expand Up @@ -286,14 +286,14 @@ export interface ISignalEnvelope {
};
}

// @public (undocumented)
// @public
export interface ISummarizeInternalResult extends ISummarizeResult {
// (undocumented)
id: string;
pathPartsForChildren?: string[];
}

// @public (undocumented)
// @public
export interface ISummarizeResult {
// (undocumented)
stats: ISummaryStats;
Expand Down Expand Up @@ -354,7 +354,7 @@ export interface ISummarizerNodeWithGC extends ISummarizerNode {
updateUsedRoutes(usedRoutes: string[], gcTimestamp?: number): void;
}

// @public (undocumented)
// @public
export interface ISummaryStats {
// (undocumented)
blobNodeCount: number;
Expand All @@ -368,11 +368,9 @@ export interface ISummaryStats {
unreferencedBlobSize: number;
}

// @public (undocumented)
// @public
export interface ISummaryTreeWithStats {
// (undocumented)
stats: ISummaryStats;
// (undocumented)
summary: ISummaryTree;
}

Expand Down
64 changes: 58 additions & 6 deletions common/lib/protocol-definitions/src/summary.ts
Original file line number Diff line number Diff line change
Expand Up @@ -21,41 +21,93 @@ export interface ISummaryCommitter {
date: string;
}

/**
Copy link
Contributor

@agarwal-navin agarwal-navin May 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Orthogonal to this PR - Interesting to note that ISummaryAuthor and ISummaryCommiter above are never used. We may want to look into removing them. #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracking by #10456

* Type tag used to distinguish different types of nodes in a {@link ISummaryTree}.
*/
// eslint-disable-next-line @typescript-eslint/no-namespace
export namespace SummaryType {
export type Tree = 1;
export type Blob = 2;
export type Handle = 3;
export type Attachment = 4;

export const Tree: Tree = 1 as const;
export const Blob: Blob = 2 as const;
export const Handle: Handle = 3 as const;
export const Attachment: Attachment = 4 as const;
/**
* Represents a sub-tree in the summary.
*/
export const Tree: Tree = 1 as const;

/**
* Represents a blob of data that is added to the summary.
* Such as the user data that is added to the DDS or metadata added by runtime
* such as data store / channel attributes.
*/
export const Blob: Blob = 2 as const;

/**
andre4i marked this conversation as resolved.
Show resolved Hide resolved
* Path to a summary tree object from the last successful summary.
*/
export const Handle: Handle = 3 as const;

/**
* Unique identifier to larger blobs uploaded outside of the summary.
* Ex. DDS has large images or video that will be uploaded by the BlobManager and
* receive an Id that can be used in the summary.
*/
export const Attachment: Attachment = 4 as const;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you happen to know the answer, it would be good to document if there was an explicit choice not to declare SummaryType as an enum?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anthony-murphy Do you know why ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe because of the reason mentioned in this comment here - #9548 (comment)

export type SummaryType = SummaryType.Attachment | SummaryType.Blob | SummaryType.Handle | SummaryType.Tree;

export type SummaryTypeNoHandle = SummaryType.Tree | SummaryType.Blob | SummaryType.Attachment;

/**
* Path to a summary tree object from the last successful summary indicating the summary object hasn't
* changed since it was uploaded.
* To illustrate, if a DataStore did not change since last summary, the framework runtime will use a handle for the
* entire DataStore instead of re-sending the entire subtree. The same concept applies for a DDS.
* An example of handle would be: '/<DataStoreId>/<DDSId>'.
*/
export interface ISummaryHandle {
type: SummaryType.Handle;

// No handles, all other SummaryType are Ok
/**
* Type of Summary Handle (SummaryType.Handle is not supported).
Copy link
Contributor

@CraigMacomber CraigMacomber May 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really the type of the handle, or is it the type of the data to which the handle points? #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it is a type for which the handle references to.

*/
handleType: SummaryTypeNoHandle;

// Stored handle reference
/**
* Unique path that identifies the corresponding sub-tree in a previous summary.
*/
handle: string;
Copy link
Contributor

@CraigMacomber CraigMacomber May 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do I get one of these?

Copy link
Contributor Author

@NicholasCouri NicholasCouri May 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a developer, I don't think you need to get it - if I got it right, it is automatically set (id) when we create the fluid file from a summary: Ex. createNewFluidFileFromSummary.

@agarwal-navin to confirm.

Copy link
Contributor

@agarwal-navin agarwal-navin May 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. Currently, every object that wants to use this to send a path to previous summary instead of full summary has to know what the path is. For example, the summarizer node for data store context and channel contexts tracks this path and adds it.
This path is basically its path in the container's object tree. For instance, for DDS it is /<dataStoreId>/<ddsId>.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the way it should it be is similar to how summary tree is built. Each node just adds its summary tree when asked for it. It's the responsibility of the caller to add this tree against the correct path. This way by the time the summary tree reaches the root, all paths are fully built w.r.t. root.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's sync up ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a reference- the only place I found that adds the handle to the tree is when we call encodeSummary (builder.addHandle).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the main usage of summary handle in our code -

.

The encodeSummary bit is part of this feature called differential summary which we don't use anymore.

Copy link
Contributor Author

@NicholasCouri NicholasCouri May 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Navin I had missed it - at the SummarizerNode level, if we are tracking it, it is stored under _latestSummary.fullPath.path

}

/**
Copy link
Contributor

@CraigMacomber CraigMacomber May 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to mention here that to reference an already uploaded blob use ISummaryAttachment. Someone looking for how to do that is likley to find themselves here, and not realize they need to look into ISummaryAttachments. #Resolved

* String or Binary data to be uploaded to the server as part of the container's Summary.
* Note: Already uploaded blobs would be referenced by a ISummaryAttachment.
* Additional information can be found here: https://github.com/microsoft/FluidFramework/issues/6568
* Ex. "content": "{ \"pkg\":\"[\\\"OfficeRootComponent\\\",\\\"LastEditedComponent\\\"]\",
* \"summaryFormatVersion\":2,\"isRootDataStore\":false }"
*/
export interface ISummaryBlob {
type: SummaryType.Blob;
content: string | Uint8Array;
}

/**
* Unique identifier for blobs uploaded outside of the summary. Attachment Blobs are uploaded and
* downloaded separately and do not take part of the snapshot payload.
* The id gets returned from the backend after the attachment has been uploaded.
* Additional information can be found here: https://github.com/microsoft/FluidFramework/issues/6374
Copy link
Contributor

@CraigMacomber CraigMacomber May 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This documentation on ISummaryAttachment is super useful to me even in its current state! #Resolved

* Ex. "id": "bQAQKARDdMdTgqICmBa_ZB86YXwGP"
*/
export interface ISummaryAttachment {
type: SummaryType.Attachment;
id: string;
}
Comment on lines 104 to 105
Copy link
Contributor

@CraigMacomber CraigMacomber May 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What format is this in OR how do I get one from a IFluidHandle? Is it the absolute path? A json seralized handle? Something else? #Resolved

Copy link
Contributor Author

@NicholasCouri NicholasCouri May 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an example, it is an id returned from the backend.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not used so I am not really sure. I believe the idea is that id here should be the id that storage gives for this attachment. @vladsud do you know the usage pattern for this?

Copy link
Contributor Author

@NicholasCouri NicholasCouri May 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an example that makes it easier to visualize. @navin, are you sure this is not used ? After all, we do write it :)

Ex. ".blobs": {
"type": 1,
"tree": {
"0": {
"id": "bQAQKARDdMdTgqICmBa_ZB86YXwGP",
"type": 4
}
}
},


/**
* Tree Node data structure with children that are nodes of SummaryObject type:
* Blob, Handle, Attachment or another Tree.
*/
export interface ISummaryTree {
type: SummaryType.Tree;

Expand Down
24 changes: 21 additions & 3 deletions packages/runtime/datastore-definitions/src/channel.ts
Original file line number Diff line number Diff line change
Expand Up @@ -20,15 +20,33 @@ export interface IChannel extends IFluidLoadable {
readonly attributes: IChannelAttributes;

/**
* Generates summary of the channel synchronously.
* @returns A tree representing the summary of the channel.
* Generates summary of the channel synchronously. It is called when an `attach message`
* for a local channel is generated. In other words, when the channel is being attached
* to make it visible to other clients.
* Note: Since Attach Summary is generated for local channels when making them visible to
* remote clients, they don't have any previous summaries to compare against. For this reason,
* The attach summary cannot contain summary handles (paths to sub-trees or blobs).
* It can, however, contain ISummaryAttachment (handles to blobs uploaded async via the blob manager).
* @param fullTree - flag indicating whether the attempt should generate a full
Copy link
Contributor

@agarwal-navin agarwal-navin May 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be worth adding that this is called when an attach message for a local channel is generated. In other words, when the channel is being attached to make it visible to other clients. #Resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How important is this "attempt"?

Is it a bug or just a perf issue if the implementation returns a summary tree contining a handle for an unchanged subtree? What about a handle for something other than an unchaed subtree (es: a new blob)?

Its important for the documenration on this method to make it possible to tell if an imlementation of it is incorrect, and without answering the above, there are cases where I wouldn't know.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attach summary cannot contain summary handles, i.e., path to sub-trees (or blobs) in the previous summary. It can however contain ISummaryAttachment, i.e., handles to blobs that were uploaded async via the blob manager.
Since attach summary is generated for local channels when making them visible to remote clients, they don't have any previous summaries to compare against. So, they cannot really send summary handles.

I agree with you that its worth calling it out here.

* summary tree without any handles for unchanged subtrees.
* @param trackState - optimization for tracking state of objects across summaries. If the state
* of an object did not change since last successful summary, an ISummaryHandle can be used
* instead of re-summarizing it. If this is false, the expectation is that you should never
* send an ISummaryHandle since you are not expected to track state.
* Note: The goal is to remove the trackState and automatically decided whether the
* handles will be used or not: https://github.com/microsoft/FluidFramework/issues/10455
* @returns A summary capturing the current state of the channel.
*/
getAttachSummary(fullTree?: boolean, trackState?: boolean): ISummaryTreeWithStats;

/**
* Generates summary of the channel asynchronously.
* This should not be called where the channel can be modified while summarization is in progress.
* @returns A tree representing the summary of the channel.
* @param fullTree - flag indicating whether the attempt should generate a full
Copy link
Contributor

@agarwal-navin agarwal-navin May 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be worth adding that this is called when generating a summary of the entire container. #Resolved

* summary tree without any handles for unchanged subtrees. It is only set to true when generating
* a summary from the entire container.
* @param trackState - This tells whether we should track state from this summary.
* @returns A summary capturing the current state of the channel.
*/
summarize(fullTree?: boolean, trackState?: boolean): Promise<ISummaryTreeWithStats>;

Expand Down
28 changes: 28 additions & 0 deletions packages/runtime/runtime-definitions/src/summary.ts
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,9 @@ import {
IGarbageCollectionSummaryDetails,
} from "./garbageCollection";

/**
* Contains the aggregation data from a Tree/Subtree.
*/
export interface ISummaryStats {
treeNodeCount: number;
blobNodeCount: number;
Expand All @@ -24,16 +27,41 @@ export interface ISummaryStats {
unreferencedBlobSize: number;
}

/**
* Represents the summary tree for a node along with the statistics for that tree.
* For example, for a given data store, it contains the data for data store along with a subtree for
* each of its DDS.
* Any component that implements IChannelContext, IFluidDataStoreChannel or extends SharedObject
* will be taking part of the summarization process.
*/
export interface ISummaryTreeWithStats {
/** Represents an agreggation of node counts and blob sizes associated to the current summary information */
stats: ISummaryStats;
/**
* A recursive data structure that will be converted to a snapshot tree and uploaded
* to the backend.
*/
summary: ISummaryTree;
}

/**
* Represents a summary at a current sequence number.
*/
export interface ISummarizeResult {
stats: ISummaryStats;
summary: SummaryTree;
}

/**
* Contains the same data as ISummaryResult but in order to avoid naming colisions,
* the data store summaries are wrapped around an array of labels identified by pathPartsForChildren.
* Ex: id:""
pathPartsForChildren: ["path1"]
stats: ...
summary:
...
"path1":
*/
export interface ISummarizeInternalResult extends ISummarizeResult {
id: string;
/** Additional path parts between this node's ID and its children's IDs. */
Expand Down