Class LevenshteinDetailedDistance
- java.lang.Object
-
- org.apache.commons.text.similarity.LevenshteinDetailedDistance
-
- All Implemented Interfaces:
java.util.function.BiFunction<java.lang.CharSequence,java.lang.CharSequence,LevenshteinResults>, EditDistance<LevenshteinResults>, ObjectSimilarityScore<java.lang.CharSequence,LevenshteinResults>, SimilarityScore<LevenshteinResults>
public class LevenshteinDetailedDistance extends java.lang.Object implements EditDistance<LevenshteinResults>
An algorithm for measuring the difference between two character sequences. The difference is the number of changes needed to turn one sequence into the other, where each change is a single-character modification (deletion, insertion or substitution).
- Since:
- 1.0
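To make the definition above concrete, here is a minimal, self-contained sketch of a classic two-row dynamic-programming edit distance. It is an illustration only, not this library's implementation: the class name EditDistanceSketch and the method distance are hypothetical, and the sketch returns a plain int rather than a LevenshteinResults.

```java
final class EditDistanceSketch {

    /**
     * Counts the single-character insertions, deletions and substitutions
     * needed to turn {@code left} into {@code right}. Hypothetical helper.
     */
    static int distance(CharSequence left, CharSequence right) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("CharSequences must not be null");
        }
        int n = left.length();
        int m = right.length();
        // previous[j]: distance between the first i-1 chars of left and the first j chars of right.
        int[] previous = new int[m + 1];
        int[] current = new int[m + 1];
        for (int j = 0; j <= m; j++) {
            previous[j] = j; // an empty prefix needs j insertions to become j chars
        }
        for (int i = 1; i <= n; i++) {
            current[0] = i; // i deletions reduce i chars to an empty prefix
            for (int j = 1; j <= m; j++) {
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                int substitute = previous[j - 1] + cost;
                int delete = previous[j] + 1;
                int insert = current[j - 1] + 1;
                current[j] = Math.min(substitute, Math.min(delete, insert));
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[m];
    }

    public static void main(String[] args) {
        System.out.println(distance("frog", "fog"));       // prints 1
        System.out.println(distance("elephant", "hippo")); // prints 7
    }
}
```

Keeping only two rows holds memory proportional to the length of the second input, which matters for long sequences.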
-
-
Constructor Summary
Constructors
- LevenshteinDetailedDistance(): Deprecated. Use getDefaultInstance().
- LevenshteinDetailedDistance(java.lang.Integer threshold): Constructs a new instance for a threshold.
-
Method Summary
Methods (modifier and type, method, description)
- LevenshteinResults apply(java.lang.CharSequence left, java.lang.CharSequence right): Computes the Levenshtein distance between two Strings.
- <E> LevenshteinResults apply(SimilarityInput<E> left, SimilarityInput<E> right): Computes the Levenshtein distance between two Strings.
- static LevenshteinDetailedDistance getDefaultInstance(): Gets the default instance.
- java.lang.Integer getThreshold(): Gets the distance threshold.
-
-
-
Constructor Detail
-
LevenshteinDetailedDistance
@Deprecated public LevenshteinDetailedDistance()
Deprecated. Use getDefaultInstance().
Constructs a new instance that uses a version of the algorithm that does not use a threshold parameter.
- See Also:
- getDefaultInstance()
-
LevenshteinDetailedDistance
public LevenshteinDetailedDistance(java.lang.Integer threshold)
Constructs a new instance for a threshold.
If the threshold is not null, distance calculations will be limited to a maximum length.
If the threshold is null, the unlimited version of the algorithm will be used.
- Parameters:
threshold - If this is null, distance calculations will not be limited. Must not be negative.
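The effect of a threshold can be sketched as follows. This is a hedged illustration, not the library's code: it assumes that a result of -1 signals a distance above the threshold (which is how the "or -1" return note for apply reads), and the names ThresholdDistanceSketch and limitedDistance are hypothetical. The sketch stops early once no cell in the current row can lead to a distance within the limit.

```java
final class ThresholdDistanceSketch {

    /**
     * Returns the edit distance between the inputs, or -1 if it exceeds
     * {@code threshold}. Hypothetical helper; -1 as the sentinel is an assumption.
     */
    static int limitedDistance(CharSequence left, CharSequence right, int threshold) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("CharSequences must not be null");
        }
        if (threshold < 0) {
            throw new IllegalArgumentException("Threshold must not be negative");
        }
        int n = left.length();
        int m = right.length();
        // The distance is never smaller than the difference in lengths.
        if (Math.abs(n - m) > threshold) {
            return -1;
        }
        int[] previous = new int[m + 1];
        int[] current = new int[m + 1];
        for (int j = 0; j <= m; j++) {
            previous[j] = j;
        }
        for (int i = 1; i <= n; i++) {
            current[0] = i;
            int rowMin = current[0];
            for (int j = 1; j <= m; j++) {
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                current[j] = Math.min(previous[j - 1] + cost,
                        Math.min(previous[j] + 1, current[j - 1] + 1));
                rowMin = Math.min(rowMin, current[j]);
            }
            // Every edit path crosses this row, so the final distance is at least rowMin.
            if (rowMin > threshold) {
                return -1;
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[m] <= threshold ? previous[m] : -1;
    }

    public static void main(String[] args) {
        System.out.println(limitedDistance("elephant", "hippo", 7)); // prints 7
        System.out.println(limitedDistance("elephant", "hippo", 6)); // prints -1
    }
}
```

A threshold is useful when only near matches matter, because clearly distant pairs can be rejected without filling in the whole table.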
-
-
Method Detail
-
getDefaultInstance
public static LevenshteinDetailedDistance getDefaultInstance()
Gets the default instance.
- Returns:
- The default instance.
-
apply
public LevenshteinResults apply(java.lang.CharSequence left, java.lang.CharSequence right)
Computes the Levenshtein distance between two Strings.
A higher score indicates a greater distance.
The previous implementation of the Levenshtein distance algorithm was from http://www.merriampark.com/ld.htm
Chas Emerick has written an implementation in Java that avoids an OutOfMemoryError which can occur when the previous Java implementation is used with very large strings.
This implementation of the Levenshtein distance algorithm is from http://www.merriampark.com/ldjava.htm
  distance.apply(null, *)             = Throws IllegalArgumentException
  distance.apply(*, null)             = Throws IllegalArgumentException
  distance.apply("","")               = 0
  distance.apply("","a")              = 1
  distance.apply("aaapppp", "")       = 7
  distance.apply("frog", "fog")       = 1
  distance.apply("fly", "ant")        = 3
  distance.apply("elephant", "hippo") = 7
  distance.apply("hippo", "elephant") = 7
  distance.apply("hippo", "zzzzzzzz") = 8
  distance.apply("hello", "hallo")    = 1
- Specified by:
- apply in interface java.util.function.BiFunction<java.lang.CharSequence,java.lang.CharSequence,LevenshteinResults>
- Specified by:
- apply in interface ObjectSimilarityScore<java.lang.CharSequence,LevenshteinResults>
- Specified by:
- apply in interface SimilarityScore<LevenshteinResults>
- Parameters:
left - the first input, must not be null.
right - the second input, must not be null.
- Returns:
- result distance, or -1.
- Throws:
java.lang.IllegalArgumentException - if either String input is null.
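This page does not describe the contents of LevenshteinResults. The sketch below assumes that a detailed result carries the total distance together with separate counts of insertions, deletions and substitutions, and shows one way such counts could be derived: fill the full distance table, then walk back from the last cell. EditCounts, DetailedDistanceSketch and detailed are hypothetical names, and the tie-breaking order in the walk (substitution, then deletion, then insertion) is a choice made for this sketch, not a statement about the library.

```java
final class DetailedDistanceSketch {

    /** Hypothetical result holder: total distance plus a per-operation breakdown. */
    static final class EditCounts {
        final int distance;
        final int insertions;
        final int deletions;
        final int substitutions;

        EditCounts(int distance, int insertions, int deletions, int substitutions) {
            this.distance = distance;
            this.insertions = insertions;
            this.deletions = deletions;
            this.substitutions = substitutions;
        }
    }

    /** Computes the edits that turn {@code left} into {@code right}. */
    static EditCounts detailed(CharSequence left, CharSequence right) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("CharSequences must not be null");
        }
        int n = left.length();
        int m = right.length();
        // d[i][j]: distance between the first i chars of left and the first j chars of right.
        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) {
            d[i][0] = i;
        }
        for (int j = 0; j <= m; j++) {
            d[0][j] = j;
        }
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(d[i - 1][j - 1] + cost,
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1));
            }
        }
        // Walk back from (n, m) to (0, 0), classifying each step.
        int i = n;
        int j = m;
        int insertions = 0;
        int deletions = 0;
        int substitutions = 0;
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0 && left.charAt(i - 1) == right.charAt(j - 1)
                    && d[i][j] == d[i - 1][j - 1]) {
                i--; // characters match, no edit
                j--;
            } else if (i > 0 && j > 0 && d[i][j] == d[i - 1][j - 1] + 1) {
                substitutions++;
                i--;
                j--;
            } else if (i > 0 && d[i][j] == d[i - 1][j] + 1) {
                deletions++; // a character of left is dropped
                i--;
            } else {
                insertions++; // a character of right is added
                j--;
            }
        }
        return new EditCounts(d[n][m], insertions, deletions, substitutions);
    }

    public static void main(String[] args) {
        EditCounts r = detailed("frog", "fog");
        // prints "1 0 1 0": one edit in total, and it is a deletion
        System.out.println(r.distance + " " + r.insertions + " " + r.deletions + " " + r.substitutions);
    }
}
```

Unlike a two-row distance computation, the walk back needs the whole table, so this sketch uses memory proportional to the product of the input lengths.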
-
apply
public <E> LevenshteinResults apply(SimilarityInput<E> left, SimilarityInput<E> right)
Computes the Levenshtein distance between two Strings.
A higher score indicates a greater distance.
The previous implementation of the Levenshtein distance algorithm was from http://www.merriampark.com/ld.htm
Chas Emerick has written an implementation in Java that avoids an OutOfMemoryError which can occur when the previous Java implementation is used with very large strings.
This implementation of the Levenshtein distance algorithm is from http://www.merriampark.com/ldjava.htm
  distance.apply(null, *)             = Throws IllegalArgumentException
  distance.apply(*, null)             = Throws IllegalArgumentException
  distance.apply("","")               = 0
  distance.apply("","a")              = 1
  distance.apply("aaapppp", "")       = 7
  distance.apply("frog", "fog")       = 1
  distance.apply("fly", "ant")        = 3
  distance.apply("elephant", "hippo") = 7
  distance.apply("hippo", "elephant") = 7
  distance.apply("hippo", "zzzzzzzz") = 8
  distance.apply("hello", "hallo")    = 1
- Type Parameters:
E - The type of similarity score unit.
- Parameters:
left - the first input, must not be null.
right - the second input, must not be null.
- Returns:
- result distance, or -1.
- Throws:
java.lang.IllegalArgumentException - if either String input is null.
- Since:
- 1.13.0
-
getThreshold
public java.lang.Integer getThreshold()
Gets the distance threshold.
- Returns:
- The distance threshold.
-
-