-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add decay function support for MultiFunctionScoreQuery #641
Conversation
GeoUtils.arcDistance( | ||
origin.getLatitude(), origin.getLongitude(), geoPoint.getLat(), geoPoint.getLon()); | ||
double score = decayFunc.computeScore(distance, scale); | ||
return Math.max(0.0, score - offset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not quite sure whether innerQueryScore and getWeight need to be combined with this score here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think decay functions need the query score, but the result would need to be multiplied by the function weight.
@@ -316,8 +316,22 @@ message MultiFunctionScoreQuery { | |||
// Produce score with score script definition | |||
Script script = 3; | |||
} | |||
DecayFunction decayFunction = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason this is not part of the Function
block?
} | ||
|
||
message DecayFunction { | ||
string fieldName = 1; | ||
string decayType = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is only a known set of decay types, it might be better as an enum.
GeoUtils.arcDistance( | ||
origin.getLatitude(), origin.getLongitude(), geoPoint.getLat(), geoPoint.getLon()); | ||
double score = decayFunc.computeScore(distance, scale); | ||
return Math.max(0.0, score - offset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think decay functions need the query score, but the result would need to be multiplied by the function weight.
double score = decayFunc.computeScore(distance, scale); | ||
return Math.max(0.0, score - offset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this math is right. These are the function definitions https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#_supported_decay_functions
Query filterQuery, float weight, MultiFunctionScoreQuery.DecayFunction decayFunction) { | ||
super(filterQuery, weight, decayFunction); | ||
this.decayFunction = decayFunction; | ||
this.fieldName = decayFunction.getFieldName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you also pass in the IndexState
you get get the FieldDef
for validation. The FieldDef
can also be used to get the doc values accessor.
25516c6
to
719731c
Compare
719731c
to
d59c35b
Compare
// Distance from origin + offset at which computed score will be equal to decay | ||
string scale = 5; | ||
// Compute decay function for docs with a distance greater than offset, will be 0.0 if none is set | ||
string offset = 6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the field comments also define the valid values for scale and offset, including what distance specifiers are allowed
// Linear decay function | ||
DECAY_TYPE_LINEAR = 1; | ||
// Gaussian decay function | ||
DECAY_TYPE_GUASS = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: if exponential is spelled out, gaussian probably can be too
case DECAY_TYPE_LINEAR: | ||
return new LinearDecayFunction(); | ||
default: | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default can probably throw an illegal argument exception
@@ -77,6 +77,12 @@ public static FilterFunction build( | |||
? QueryNodeMapper.getInstance().getQuery(filterFunctionGrpc.getFilter(), indexState) | |||
: null; | |||
float weight = filterFunctionGrpc.getWeight() != 0.0f ? filterFunctionGrpc.getWeight() : 1.0f; | |||
if (filterFunctionGrpc.hasDecayFunction()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be integrated with the switch statement below
if (decayFunction.hasGeoPoint()) { | ||
return new GeoPointDecayFilterFunction(filterQuery, weight, decayFunction, indexState); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If geo point is the only mode supported, an exception should probably be throw if it is not set
segmentDocLookup.setDocId(docId); | ||
LoadedDocValues<GeoPoint> latLonValues = | ||
(LoadedDocValues<GeoPoint>) segmentDocLookup.get(fieldName); | ||
this.latLng = latLonValues.toFieldValue(0).getLatLngValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to get the lat/lon values directly from the GeoPoint
without using toFieldValue
latLng.getLatitude(), | ||
latLng.getLongitude()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You will need to recompute the position for the given doc id
DecayType decayType = 2; | ||
// Origin point to calculate the distance | ||
oneof Origin { | ||
int32 numeric = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If numeric is not supported yet, does it need to be here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove it. Thought we can add support for it too but noticed that the client does not need this immediately
public interface DecayFunction { | ||
double computeScore(double distance, double offset, double scale); | ||
|
||
double computeScale(double scale, double decay); | ||
|
||
Explanation explainComputeScore(double distance, double offset, double scale); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to add doc strings to this interface
segmentDocLookup.setDocId(docId); | ||
LoadedDocValues<GeoPoint> latLonValues = | ||
(LoadedDocValues<GeoPoint>) segmentDocLookup.get(fieldName); | ||
this.latLng = latLonValues.toFieldValue(0).getLatLngValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the behavior if the field does not have any values present?
96ce3d4
to
33fd26b
Compare
33fd26b
to
92b8e79
Compare
public class GuassianDecayFunction implements DecayFunction { | ||
@Override | ||
public double computeScore(double distance, double offset, double scale) { | ||
return Math.exp((-1.0 * Math.pow(Math.max(0.0, distance - offset), 2.0)) / 2.0 * scale); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the denominator needs to be (2.0 * scale)
} | ||
|
||
public void validateLatLonField(FieldDef fieldDef) { | ||
if (!LAT_LON.equals(fieldDef.getType())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better check would be using the instanceof
operator, since the object is getting cast
public boolean validateDocValuesPresent() { | ||
return segmentDocLookup.containsKey(fieldName) && !segmentDocLookup.get(fieldName).isEmpty(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't need to do containsKey
, since you already checked that doc values are enabled. You will also want to avoid multiple calls to segmentDocLookup.get()
, since it will reload data from the index. Something like this should work
LoadedDocValues<GeoPoint> geoPointLoadedDocValues =
(LoadedDocValues<GeoPoint>) segmentDocLookup.get(fieldName);
if (geoPointLoadedDocValues.isEmpty()) {
return 0.0;
} else {
...
}
No description provided.