Implement list schemas by falling back to other APIs if GET /schemas unavailable #864

Draft
wants to merge 5 commits into base: main

Conversation

rohitsanj
Contributor

@rohitsanj rohitsanj commented Jan 8, 2025

Summary of Changes

Resolves #860

Not all SRs are built equal: the WarpStream SR API does not support the List Schemas endpoint (for good reason).

We need to handle cases where the List Schemas API is not available and instead construct the list using 1 + 2*N API calls: one call to GET /subjects, then for each of the N subjects, list its versions to determine the latest version and get the schema by that version.
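The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SchemaRegistryClient`, `listSchemasWithFallback`, and the `ResponseSchema` fields are hypothetical stand-ins for the generated client, and treating any failure of the list call as "endpoint unavailable" is an assumption.

```typescript
// Hypothetical minimal shapes; the real generated client types may differ.
interface ResponseSchema {
  subject: string;
  version: number;
  id: number;
  schema: string;
}

interface SchemaRegistryClient {
  listSchemas(): Promise<ResponseSchema[]>; // GET /schemas
  listSubjects(): Promise<string[]>; // GET /subjects
  listVersions(subject: string): Promise<number[]>; // GET /subjects/{subject}/versions
  getSchemaByVersion(subject: string, version: string): Promise<ResponseSchema>;
}

async function listSchemasWithFallback(
  client: SchemaRegistryClient,
): Promise<ResponseSchema[]> {
  try {
    return await client.listSchemas();
  } catch {
    // Assumption: any failure here means the endpoint is unavailable.
    // A real implementation would likely inspect the HTTP status first.
  }

  const subjects = await client.listSubjects(); // 1 call
  const schemas: ResponseSchema[] = [];
  for (const subject of subjects) {
    const versions = await client.listVersions(subject); // N calls
    if (versions.length === 0) {
      continue; // Math.max() over an empty list would yield -Infinity
    }
    const latest = Math.max(...versions);
    // N more calls, for 1 + 2*N in total
    schemas.push(await client.getSchemaByVersion(subject, latest.toString()));
  }
  return schemas;
}
```

The loop is deliberately sequential here to keep the call count easy to follow.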

Any additional details or context that should be provided?

Pull request checklist

Please check if your PR fulfills the following (if applicable):

Tests
  • Added new
  • Updated existing
  • Deleted existing
Other
  • All new disposables (event listeners, views, channels, etc.) collected for eventual cleanup?
  • Does anything in this PR need to be mentioned in the user-facing CHANGELOG or README?
  • Have you validated this change locally by packaging and installing the extension .vsix file?
    gulp clicktest

@rohitsanj
Contributor Author

Tested against WarpStream playground and it works great.

Here are the steps to test:

  1. Set sidecar version to v0.138.0 (vscode main is not on this version yet and we don't want to bump it yet either)
  2. Start WarpStream by running: docker run -p 9092:9092 -p 9094:9094 public.ecr.aws/warpstream-labs/warpstream_agent:latest playground
  3. Create a direct connection using localhost:9092 as bootstrap server and http://localhost:9094 as SR URI.
  4. Test creating schemas, topics, consume topics. Go crazy.
Screenshot 2025-01-08 at 12 28 37 PM

@rohitsanj rohitsanj marked this pull request as ready for review January 8, 2025 20:43
@rohitsanj rohitsanj requested a review from a team as a code owner January 8, 2025 20:43
Comment on lines 288 to 297
// Get latest schema for each subject
for (const subject of subjects) {
  const versions: number[] = await subjectsClient.listVersions({ subject });
  const latestVersion: number = Math.max(...versions);
  const schema: ResponseSchema = await subjectsClient.getSchemaByVersion({
    subject,
    version: latestVersion.toString(),
  });
  schemas.push(schema);
}
Contributor

We probably want to queue up these as promises and await them all at once, something like:

Suggested change
// Get latest schema for each subject
const getSchemaPromises: Promise<ResponseSchema>[] = [];
for (const subject of subjects) {
  const versions: number[] = await subjectsClient.listVersions({ subject });
  const latestVersion: number = Math.max(...versions);
  getSchemaPromises.push(
    subjectsClient.getSchemaByVersion({
      subject,
      version: latestVersion.toString(),
    })
  );
}
const schemas: ResponseSchema[] = await Promise.all(getSchemaPromises);
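
Note that in the suggestion above the `listVersions` calls are still awaited one at a time inside the loop; only the `getSchemaByVersion` calls overlap. A possible variant, sketched here under assumptions, moves both calls for a subject into one async function so that all subjects proceed concurrently. The `SubjectsClient` interface and the `fetchLatestSchema` / `fetchLatestSchemas` helpers are hypothetical names, with request shapes modelled on the calls shown in this thread.

```typescript
// Hypothetical minimal shapes modelled on the calls in the snippet above.
interface ResponseSchema {
  subject: string;
  version: number;
  schema: string;
}

interface SubjectsClient {
  listVersions(req: { subject: string }): Promise<number[]>;
  getSchemaByVersion(req: { subject: string; version: string }): Promise<ResponseSchema>;
}

// Both round trips for one subject, chained inside a single promise.
async function fetchLatestSchema(
  client: SubjectsClient,
  subject: string,
): Promise<ResponseSchema> {
  const versions = await client.listVersions({ subject });
  if (versions.length === 0) {
    // Math.max() with no arguments is -Infinity, so fail loudly instead.
    throw new Error(`No versions found for subject ${subject}`);
  }
  const latestVersion = Math.max(...versions);
  return client.getSchemaByVersion({ subject, version: latestVersion.toString() });
}

// Starts the work for every subject at once; results keep the input order.
async function fetchLatestSchemas(
  client: SubjectsClient,
  subjects: string[],
): Promise<ResponseSchema[]> {
  return Promise.all(subjects.map((subject) => fetchLatestSchema(client, subject)));
}
```

This version is unbounded: with many subjects it issues every request at once.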

Contributor Author

Ah yeah that's much better, thanks.

Contributor Author

Done, take a look again pls?

Contributor

@jlrobins jlrobins Jan 8, 2025

@shouples @rohitsanj if there are hundreds or thousands of schemas, is it really a good idea to fetch them all concurrently with no bound? I doubt it, and would prefer to batch in chunks of up to ~10-20 or so at a time.
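
One way to bound the concurrency as suggested is a small chunking helper. This is a sketch only: `mapInChunks` is a hypothetical name, and the chunk size is left to the caller.

```typescript
// Runs fn over items in consecutive chunks: requests within a chunk run
// concurrently, and the next chunk starts only after the current one settles.
async function mapInChunks<T, R>(
  items: readonly T[],
  chunkSize: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const results: R[] = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    const chunk = items.slice(i, i + chunkSize);
    // At most chunkSize promises are pending at any time.
    const chunkResults = await Promise.all(chunk.map((item) => fn(item)));
    results.push(...chunkResults);
  }
  return results;
}
```

A call such as `mapInChunks(subjects, 10, fetchOne)`, where `fetchOne` is whatever per-subject fetch function is in use, would keep at most 10 requests in flight. A slow request holds up its whole chunk; a sliding-window pool would avoid that at the cost of more code.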

Contributor Author

@rohitsanj rohitsanj Jan 8, 2025

Hmm, I could add that in.

But I think in general we need to guard against trying to fetch 1000s of schemas by adding some sort of pagination or "lazy" fetching (even with the listSchemas call which would be an expensive query on server side).

Contributor

Alas, the vscode view controller / treeview constructs expect to be able to get all children of a node at once (see TreeDataProvider.getChildren()), either directly or as the result of a thenable (a single promise). As far as I can see, we cannot provide any sort of generator / async generator based result.
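
To illustrate the constraint, here is a sketch that uses a local stand-in interface rather than the real `vscode` module, so it runs without the extension host. `ChildrenProvider`, `SchemaNode`, and `SchemaTree` are hypothetical names; the stand-in only mirrors the "array or single thenable" return shape of `getChildren()` and omits the rest of the real interface.

```typescript
// Local stand-in for the relevant part of the tree data provider contract:
// children come back as a whole array, directly or via one thenable.
interface ChildrenProvider<T> {
  getChildren(element?: T): T[] | PromiseLike<T[]>;
}

type SchemaNode = { kind: "registry" } | { kind: "subject"; name: string };

class SchemaTree implements ChildrenProvider<SchemaNode> {
  private readonly loadSubjects: () => Promise<string[]>;

  constructor(loadSubjects: () => Promise<string[]>) {
    this.loadSubjects = loadSubjects;
  }

  // The full child list must be ready when this promise resolves; there is
  // no way to hand back items incrementally.
  async getChildren(element?: SchemaNode): Promise<SchemaNode[]> {
    if (element === undefined) {
      return [{ kind: "registry" }];
    }
    if (element.kind === "registry") {
      const subjects = await this.loadSubjects();
      return subjects.map((name) => ({ kind: "subject" as const, name }));
    }
    return []; // subject nodes are leaves in this sketch
  }
}
```

Any batching therefore has to happen inside the loader, before the single promise resolves.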

Contributor Author

Added the chunking

@rohitsanj rohitsanj requested a review from shouples January 8, 2025 21:04
@rohitsanj
Contributor Author

Tested again with the Promise.all and still works well.

@rohitsanj rohitsanj requested a review from jlrobins January 8, 2025 22:03
Contributor

@jlrobins jlrobins left a comment

The fallback code now fetches only the latest version for each subject group, whereas a successful client.getSchemas() gives us all bound versions for each subject.
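
For discussion, one way the fallback could match that behaviour is to fetch every listed version per subject instead of only the latest. This is a sketch under assumptions, not what the PR does: `SubjectsClient` and `fetchAllVersions` are hypothetical names with request shapes modelled on the calls reviewed earlier.

```typescript
// Hypothetical minimal shapes modelled on the calls in the reviewed snippet.
interface ResponseSchema {
  subject: string;
  version: number;
  schema: string;
}

interface SubjectsClient {
  listVersions(req: { subject: string }): Promise<number[]>;
  getSchemaByVersion(req: { subject: string; version: string }): Promise<ResponseSchema>;
}

// Fetches every listed version of one subject, not just the highest one.
async function fetchAllVersions(
  client: SubjectsClient,
  subject: string,
): Promise<ResponseSchema[]> {
  const versions = await client.listVersions({ subject });
  return Promise.all(
    versions.map((version) =>
      client.getSchemaByVersion({ subject, version: version.toString() }),
    ),
  );
}
```

The cost rises from 1 + 2*N calls to 1 + N + V, where V is the total number of versions across all N subjects, so any batching concern applies here even more.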

Will need to discuss with @shouples tomorrow.

@jlrobins jlrobins marked this pull request as draft January 9, 2025 18:22
@jlrobins jlrobins added the on hold Blocked until additional functionality is available or more discussions have happened. label Jan 10, 2025
Development

Successfully merging this pull request may close these issues.

Expect Schema Registry List Schemas API to be absent and fetch schemas using other available APIs
3 participants