Class LevenshteinDetailedDistance

    • Constructor Detail

      • LevenshteinDetailedDistance

        public LevenshteinDetailedDistance(java.lang.Integer threshold)
        Constructs a new instance for a threshold.

        If the threshold is not null, distance calculations will be limited to that maximum distance.

        If the threshold is null, the unlimited version of the algorithm will be used.

        Parameters:
        threshold - the maximum distance to compute. If this is null, distance calculations will not be limited. Must not be negative.
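
        The threshold rules above can be illustrated with a small standalone sketch. It does not use this library: the class name ThresholdSketch and its int-returning apply method are hypothetical, and the early-exit strategy is an assumption chosen for illustration, not a description of how this class is implemented. It only mirrors the documented contract: a null threshold means no limit, a negative threshold is rejected, and -1 signals that the limit was exceeded.

```java
/** Hypothetical illustration of a threshold-bounded Levenshtein distance. */
final class ThresholdSketch {

    private final Integer threshold; // null means "no limit"

    ThresholdSketch(Integer threshold) {
        if (threshold != null && threshold < 0) {
            throw new IllegalArgumentException("Threshold must not be negative");
        }
        this.threshold = threshold;
    }

    /** Returns the distance, or -1 if it exceeds the threshold. */
    int apply(CharSequence left, CharSequence right) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("Inputs must not be null");
        }
        final int limit = threshold == null ? Integer.MAX_VALUE : threshold;
        final int n = left.length();
        final int m = right.length();
        // The length difference is a lower bound on the distance.
        if (Math.abs(n - m) > limit) {
            return -1;
        }
        int[] prev = new int[m + 1];
        int[] curr = new int[m + 1];
        for (int j = 0; j <= m; j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= n; i++) {
            curr[0] = i;
            int rowMin = curr[0];
            for (int j = 1; j <= m; j++) {
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
                rowMin = Math.min(rowMin, curr[j]);
            }
            // Row minima never decrease, so the final distance cannot drop below rowMin.
            if (rowMin > limit) {
                return -1;
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[m] <= limit ? prev[m] : -1;
    }

    public static void main(String[] args) {
        System.out.println(new ThresholdSketch(null).apply("elephant", "hippo")); // unlimited
        System.out.println(new ThresholdSketch(3).apply("elephant", "hippo"));    // limit exceeded
    }
}
```

        With a null threshold the sketch always computes the full distance. With a threshold of 3, the "elephant"/"hippo" pair (distance 7 in the examples on this page) is reported as -1.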
    • Method Detail

      • apply

        public LevenshteinResults apply(java.lang.CharSequence left,
                                        java.lang.CharSequence right)
        Computes the Levenshtein distance between two Strings.

        A higher score indicates a greater distance.

        The previous implementation of the Levenshtein distance algorithm was from http://www.merriampark.com/ld.htm

        Chas Emerick has written a Java implementation that avoids the OutOfMemoryError that can occur when the previous implementation is used with very large strings.
        This implementation of the Levenshtein distance algorithm is from http://www.merriampark.com/ldjava.htm

         distance.apply(null, *)             = Throws IllegalArgumentException
         distance.apply(*, null)             = Throws IllegalArgumentException
         distance.apply("","")               = 0
         distance.apply("","a")              = 1
         distance.apply("aaapppp", "")       = 7
         distance.apply("frog", "fog")       = 1
         distance.apply("fly", "ant")        = 3
         distance.apply("elephant", "hippo") = 7
         distance.apply("hippo", "elephant") = 7
         distance.apply("hippo", "zzzzzzzz") = 8
         distance.apply("hello", "hallo")    = 1
         
        Specified by:
        apply in interface java.util.function.BiFunction<java.lang.CharSequence,java.lang.CharSequence,LevenshteinResults>
        Specified by:
        apply in interface ObjectSimilarityScore<java.lang.CharSequence,LevenshteinResults>
        Specified by:
        apply in interface SimilarityScore<LevenshteinResults>
        Parameters:
        left - the first input, must not be null.
        right - the second input, must not be null.
        Returns:
        the result of the distance calculation, with a distance of -1 if the threshold is exceeded.
        Throws:
        java.lang.IllegalArgumentException - if either input is null.
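
        The example values listed for this method can be reproduced with a plain two-row dynamic-programming sketch. The class LevenshteinSketch and its distance method are hypothetical names used only for illustration. The sketch returns a bare int rather than a LevenshteinResults, applies no threshold, and is not the implementation used by this class.

```java
/** Hypothetical illustration of the unlimited Levenshtein distance. */
final class LevenshteinSketch {

    /** Computes the edit distance using two rolling rows of the DP table. */
    static int distance(CharSequence left, CharSequence right) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("Inputs must not be null");
        }
        final int n = left.length();
        final int m = right.length();
        int[] prev = new int[m + 1]; // distances for the previous row
        int[] curr = new int[m + 1]; // distances for the current row
        for (int j = 0; j <= m; j++) {
            prev[j] = j; // turning "" into the first j characters takes j insertions
        }
        for (int i = 1; i <= n; i++) {
            curr[0] = i; // turning the first i characters into "" takes i deletions
            for (int j = 1; j <= m; j++) {
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[m];
    }

    public static void main(String[] args) {
        System.out.println(distance("frog", "fog"));    // prints 1
        System.out.println(distance("hello", "hallo")); // prints 1
        System.out.println(distance("aaapppp", ""));    // prints 7
    }
}
```

        Keeping only two rows bounds the memory use by the length of one input instead of the product of both lengths.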
      • apply

        public <E> LevenshteinResults apply(SimilarityInput<E> left,
                                            SimilarityInput<E> right)
        Computes the Levenshtein distance between two inputs.

        A higher score indicates a greater distance.

        The previous implementation of the Levenshtein distance algorithm was from http://www.merriampark.com/ld.htm

        Chas Emerick has written a Java implementation that avoids the OutOfMemoryError that can occur when the previous implementation is used with very large strings.
        This implementation of the Levenshtein distance algorithm is from http://www.merriampark.com/ldjava.htm

         distance.apply(null, *)             = Throws IllegalArgumentException
         distance.apply(*, null)             = Throws IllegalArgumentException
         distance.apply("","")               = 0
         distance.apply("","a")              = 1
         distance.apply("aaapppp", "")       = 7
         distance.apply("frog", "fog")       = 1
         distance.apply("fly", "ant")        = 3
         distance.apply("elephant", "hippo") = 7
         distance.apply("hippo", "elephant") = 7
         distance.apply("hippo", "zzzzzzzz") = 8
         distance.apply("hello", "hallo")    = 1
         
        Type Parameters:
        E - The type of similarity score unit.
        Parameters:
        left - the first input, must not be null.
        right - the second input, must not be null.
        Returns:
        the result of the distance calculation, with a distance of -1 if the threshold is exceeded.
        Throws:
        java.lang.IllegalArgumentException - if either input is null.
        Since:
        1.13.0
      • getThreshold

        public java.lang.Integer getThreshold()
        Gets the distance threshold.
        Returns:
        The distance threshold.