FedX: left bind join (i.e. OPTIONAL) may produce incorrect results for a single left bindingset as input #5229

Open
aschwarte10 opened this issue Jan 15, 2025 · 0 comments · May be fixed by #5230
Assignees
Labels
🐞 bug issue is a bug 📦 fedx fedx: optimized federated query support
Milestone


aschwarte10 commented Jan 15, 2025

Current Behavior

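A left bind join (used to evaluate OPTIONAL) can produce an incorrect (empty) result when the left argument yields a single binding set. The situation can occur in advanced source selection cases, e.g. when the right argument is marked as ExclusiveStatement while the endpoint does not actually provide data for it.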

Expected Behavior

The federation engine should evaluate OPTIONAL clauses correctly. For the concrete case, a non-empty result is expected: a single binding set in which ?person and ?otherPerson are bound and ?age is unbound.
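To make the expected OPTIONAL semantics concrete, here is a minimal plain-Java sketch of a left join over binding sets modelled as maps. It is not FedX code: the class `LeftJoinSketch`, the method `leftJoin`, and the map-based binding representation are assumptions made purely for illustration. The point it shows is that a left binding with no match on the optional side must still appear in the result.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of OPTIONAL (left join) semantics; hypothetical names, not FedX code.
final class LeftJoinSketch {

	// Joins every left binding with the matches the optional part produces for it.
	static List<Map<String, String>> leftJoin(
			List<Map<String, String>> left,
			Function<Map<String, String>, List<Map<String, String>>> optionalPart) {
		List<Map<String, String>> result = new ArrayList<>();
		for (Map<String, String> leftBinding : left) {
			List<Map<String, String>> matches = optionalPart.apply(leftBinding);
			if (matches.isEmpty()) {
				// no match on the optional side: the left binding is kept unchanged
				result.add(new LinkedHashMap<>(leftBinding));
				continue;
			}
			for (Map<String, String> match : matches) {
				// a match extends the left binding with the optional variables
				Map<String, String> merged = new LinkedHashMap<>(leftBinding);
				merged.putAll(match);
				result.add(merged);
			}
		}
		return result;
	}

	public static void main(String[] args) {
		Map<String, String> row = new LinkedHashMap<>();
		row.put("person", "http://ex.com/p1");
		row.put("otherPerson", "http://other.com/p1");
		// the optional part finds no foaf:age value for any input
		List<Map<String, String>> result = leftJoin(List.of(row), b -> List.of());
		System.out.println(result);
	}
}
```

With the data from this issue, the optional part yields nothing, so the single left binding set passes through with `age` unbound, which is what the unit test below asserts against the federation.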

Steps To Reproduce

Unit test reproducing the issue:

	@ParameterizedTest
	@ValueSource(booleans = { true, false })
	public void test_leftBindJoin_emptyLeftArgumentAsExclusiveGroup(boolean bindLeftJoinOptimizationEnabled)
			throws Exception {

		var endpoints = prepareTest(
				Arrays.asList("/tests/basic/data_emptyStore.ttl", "/tests/basic/data_emptyStore.ttl"));

		Repository repo1 = getRepository(1);
		Repository repo2 = getRepository(2);

		Repository fedxRepo = fedxRule.getRepository();

		fedxRule.setConfig(config -> {
			config.withBoundJoinBlockSize(10);
			config.withEnableOptionalAsBindJoin(bindLeftJoinOptimizationEnabled);
		});

		// add a person
		try (RepositoryConnection conn = repo1.getConnection()) {
			var p = Values.iri("http://ex.com/p1");
			var otherP = Values.iri("http://other.com/p1");
			conn.add(p, OWL.SAMEAS, otherP);
		}

		// add name for person 1
		try (RepositoryConnection conn = repo2.getConnection()) {
			var otherP = Values.iri("http://other.com/p1");
			conn.add(otherP, FOAF.NAME, Values.literal("Person 1"));
		}

		// mark in the source selection cache that repo2 supposedly has foaf:age statements
		// (e.g. a stale cache entry), although it does not contain any
		Endpoint repo2Endpoint = endpoints.get(1);
		federationContext().getSourceSelectionCache()
				.updateInformation(new SubQuery(null, FOAF.AGE, null), repo2Endpoint, true);

		fedxRule.enableDebug();

		try {
			// run query which joins results from multiple repos
			// the age does not exist for any person
			try (RepositoryConnection conn = fedxRepo.getConnection()) {
				String query = "PREFIX foaf: <http://xmlns.com/foaf/0.1/> "
						+ "PREFIX owl: <http://www.w3.org/2002/07/owl#> "
						+ "SELECT * WHERE { "
						+ " ?person owl:sameAs ?otherPerson . "
						// foaf:age does not exist in any repo, however the pattern is marked as ExclusiveStatement
						+ " OPTIONAL { ?otherPerson foaf:age ?age . } "
						+ "}";

				TupleQuery tupleQuery = conn.prepareTupleQuery(query);
				try (TupleQueryResult tqr = tupleQuery.evaluate()) {
					var bindings = Iterations.asList(tqr);

					Assertions.assertEquals(1, bindings.size());

					for (int i = 1; i <= 1; i++) {
						var p = Values.iri("http://ex.com/p" + i);
						var otherP = Values.iri("http://other.com/p" + i);

						// find the bindingset for the person in the unordered result
						BindingSet bs = bindings.stream()
								.filter(b -> b.getValue("person").equals(p))
								.findFirst()
								.orElseThrow();

						Assertions.assertEquals(otherP, bs.getValue("otherPerson"));
						Assertions.assertFalse(bs.hasBinding("age"));
					}
				}
			}

		} finally {
			fedxRepo.shutDown();
		}
	}

Version

5.1.0

Are you interested in contributing a solution yourself?

Yes

Anything else?

No response

@aschwarte10 aschwarte10 added the 🐞 bug issue is a bug label Jan 15, 2025
@aschwarte10 aschwarte10 added this to the 5.1.1 milestone Jan 15, 2025
@aschwarte10 aschwarte10 self-assigned this Jan 15, 2025
@aschwarte10 aschwarte10 added the 📦 fedx fedx: optimized federated query support label Jan 15, 2025
aschwarte10 added a commit that referenced this issue Jan 15, 2025
This change fixes a situation that can incorrectly cause empty results.
It happens when the input of the left argument is a single binding set
in combination with special source selection situations (e.g. the right
argument is marked as ExclusiveStatement while the endpoint does not
provide data).

To avoid the issue, the regular left join logic is now also used for a
single binding set input, which can handle the situation properly.

The issue is covered with a unit test.
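
The commit message above describes handling a single binding set with the same left join logic as larger inputs. As a rough illustration of that idea only, the following plain-Java sketch processes left bindings block by block and passes unmatched ones through. It is not the FedX implementation: `LeftBindJoinSketch`, `IndexedMatch`, `leftBindJoin`, and the `remote` function are hypothetical names, and the index-based mapping of results back to left bindings is an assumption chosen to keep the example small.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative model only; all names are hypothetical and not part of the FedX API.
final class LeftBindJoinSketch {

	// A right-hand match tagged with the position (within its block) of the left binding it belongs to.
	static final class IndexedMatch {
		final int leftIndex;
		final Map<String, String> bindings;

		IndexedMatch(int leftIndex, Map<String, String> bindings) {
			this.leftIndex = leftIndex;
			this.bindings = bindings;
		}
	}

	static List<Map<String, String>> leftBindJoin(
			List<Map<String, String>> left,
			int blockSize,
			Function<List<Map<String, String>>, List<IndexedMatch>> remote) {
		if (blockSize <= 0) {
			throw new IllegalArgumentException("blockSize must be positive");
		}
		List<Map<String, String>> result = new ArrayList<>();
		for (int start = 0; start < left.size(); start += blockSize) {
			// a block may contain a single binding set; it takes the same path as any other block
			List<Map<String, String>> block = left.subList(start, Math.min(start + blockSize, left.size()));
			boolean[] matched = new boolean[block.size()];
			for (IndexedMatch m : remote.apply(block)) {
				matched[m.leftIndex] = true;
				Map<String, String> merged = new LinkedHashMap<>(block.get(m.leftIndex));
				merged.putAll(m.bindings);
				result.add(merged);
			}
			// left bindings without any match are passed through unchanged (OPTIONAL semantics)
			for (int i = 0; i < block.size(); i++) {
				if (!matched[i]) {
					result.add(new LinkedHashMap<>(block.get(i)));
				}
			}
		}
		return result;
	}
}
```

In this sketch there is no separate shortcut for an input of size one, so an endpoint that returns no data for the optional pattern still leaves the single left binding set in the result.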
@aschwarte10 aschwarte10 linked a pull request Jan 15, 2025 that will close this issue
5 tasks