Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add decay function support for MultiFunctionScoreQuery #641

Merged
merged 4 commits into from
Apr 3, 2024

Conversation

swethakann
Copy link
Contributor

No description provided.

GeoUtils.arcDistance(
origin.getLatitude(), origin.getLongitude(), geoPoint.getLat(), geoPoint.getLon());
double score = decayFunc.computeScore(distance, scale);
return Math.max(0.0, score - offset);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not quite sure whether innerQueryScore and getWeight need to be combined with this score here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think decay functions need the query score, but the result would need to be multiplied by the function weight.

@swethakann swethakann requested a review from aprudhomme March 28, 2024 00:22
@@ -316,8 +316,22 @@ message MultiFunctionScoreQuery {
// Produce score with score script definition
Script script = 3;
}
DecayFunction decayFunction = 4;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this is not part of the Function block?

}

message DecayFunction {
string fieldName = 1;
string decayType = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is only a known set of decay types, it might be better as an enum.

GeoUtils.arcDistance(
origin.getLatitude(), origin.getLongitude(), geoPoint.getLat(), geoPoint.getLon());
double score = decayFunc.computeScore(distance, scale);
return Math.max(0.0, score - offset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think decay functions need the query score, but the result would need to be multiplied by the function weight.

Comment on lines 85 to 86
double score = decayFunc.computeScore(distance, scale);
return Math.max(0.0, score - offset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Query filterQuery, float weight, MultiFunctionScoreQuery.DecayFunction decayFunction) {
super(filterQuery, weight, decayFunction);
this.decayFunction = decayFunction;
this.fieldName = decayFunction.getFieldName();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you also pass in the IndexState you get get the FieldDef for validation. The FieldDef can also be used to get the doc values accessor.

@swethakann swethakann force-pushed the RP-10216_support_decay_function branch 2 times, most recently from 25516c6 to 719731c Compare April 1, 2024 20:13
@swethakann swethakann force-pushed the RP-10216_support_decay_function branch from 719731c to d59c35b Compare April 1, 2024 20:20
@swethakann swethakann marked this pull request as ready for review April 1, 2024 20:33
@swethakann swethakann requested a review from aprudhomme April 1, 2024 20:34
Comment on lines 334 to 337
// Distance from origin + offset at which computed score will be equal to decay
string scale = 5;
// Compute decay function for docs with a distance greater than offset, will be 0.0 if none is set
string offset = 6;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the field comments also define the valid values for scale and offset, including what distance specifiers are allowed

// Linear decay function
DECAY_TYPE_LINEAR = 1;
// Gaussian decay function
DECAY_TYPE_GUASS = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if exponential is spelled out, gaussian probably can be too

case DECAY_TYPE_LINEAR:
return new LinearDecayFunction();
default:
return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default can probably throw an illegal argument exception

@@ -77,6 +77,12 @@ public static FilterFunction build(
? QueryNodeMapper.getInstance().getQuery(filterFunctionGrpc.getFilter(), indexState)
: null;
float weight = filterFunctionGrpc.getWeight() != 0.0f ? filterFunctionGrpc.getWeight() : 1.0f;
if (filterFunctionGrpc.hasDecayFunction()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be integrated with the switch statement below

Comment on lines 82 to 84
if (decayFunction.hasGeoPoint()) {
return new GeoPointDecayFilterFunction(filterQuery, weight, decayFunction, indexState);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If geo point is the only mode supported, an exception should probably be throw if it is not set

segmentDocLookup.setDocId(docId);
LoadedDocValues<GeoPoint> latLonValues =
(LoadedDocValues<GeoPoint>) segmentDocLookup.get(fieldName);
this.latLng = latLonValues.toFieldValue(0).getLatLngValue();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to get the lat/lon values directly from the GeoPoint without using toFieldValue

Comment on lines 119 to 120
latLng.getLatitude(),
latLng.getLongitude());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will need to recompute the position for the given doc id

DecayType decayType = 2;
// Origin point to calculate the distance
oneof Origin {
int32 numeric = 3;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If numeric is not supported yet, does it need to be here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove it. Thought we can add support for it too but noticed that the client does not need this immediately

Comment on lines 20 to 26
public interface DecayFunction {
double computeScore(double distance, double offset, double scale);

double computeScale(double scale, double decay);

Explanation explainComputeScore(double distance, double offset, double scale);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to add doc strings to this interface

segmentDocLookup.setDocId(docId);
LoadedDocValues<GeoPoint> latLonValues =
(LoadedDocValues<GeoPoint>) segmentDocLookup.get(fieldName);
this.latLng = latLonValues.toFieldValue(0).getLatLngValue();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the behavior if the field does not have any values present?

@swethakann swethakann force-pushed the RP-10216_support_decay_function branch from 96ce3d4 to 33fd26b Compare April 2, 2024 23:31
@swethakann swethakann force-pushed the RP-10216_support_decay_function branch from 33fd26b to 92b8e79 Compare April 2, 2024 23:32
@swethakann swethakann requested a review from aprudhomme April 2, 2024 23:34
public class GuassianDecayFunction implements DecayFunction {
@Override
public double computeScore(double distance, double offset, double scale) {
return Math.exp((-1.0 * Math.pow(Math.max(0.0, distance - offset), 2.0)) / 2.0 * scale);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the denominator needs to be (2.0 * scale)

}

public void validateLatLonField(FieldDef fieldDef) {
if (!LAT_LON.equals(fieldDef.getType())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better check would be using the instanceof operator, since the object is getting cast

Comment on lines 124 to 126
public boolean validateDocValuesPresent() {
return segmentDocLookup.containsKey(fieldName) && !segmentDocLookup.get(fieldName).isEmpty();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't need to do containsKey, since you already checked that doc values are enabled. You will also want to avoid multiple calls to segmentDocLookup.get(), since it will reload data from the index. Something like this should work

LoadedDocValues<GeoPoint> geoPointLoadedDocValues =
            (LoadedDocValues<GeoPoint>) segmentDocLookup.get(fieldName);
if (geoPointLoadedDocValues.isEmpty()) {
  return 0.0;
} else {
  ...
}

@swethakann swethakann requested a review from aprudhomme April 3, 2024 20:41
@swethakann swethakann merged commit 12d836e into master Apr 3, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants