Quantcast
Channel: Rash thoughts about .NET, C#, F# and Dynamics NAV. » Levenshtein distance
Viewing all articles
Browse latest Browse all 3

Damerau-Levenshtein-Distance in F# – part II – O(m+n) space

$
0
0

Last time I showed a naïve implementation of the Damerau-Levenshtein-Distance in F# that needs O(m*n) space. This is really bad if we want to compute the edit distance of large sequences (e.g. DNA sequences). If we look at the algorithm we can easily see that only the last two lines of the (n*m)-matrix are used. This observation leads to a improvement where we compute the distance with only 3 additional arrays of size min(n,m).

/// Calcs the damerau levenshtein distance.    
let calcDL (a:'a array) (b: 'a array) =       
  let n = a.Length + 1
  let m = b.Length + 1
  let lastLine = ref (Array.init m (fun i -> i))
  let lastLastLine = ref (Array.create m 0)
  let actLine = ref (Array.create m 0)
    
  for i in [1..a.Length] do
    (!actLine).[0] <- i      
    for j in [1..b.Length] do          
      let cost = 
        if a.[i-1] = b.[j-1] then 0 else 1
      let deletion = (!lastLine).[j] + 1
      let insertion = (!actLine).[j-1] + 1
      let substitution = (!lastLine).[j-1] + cost
      (!actLine).[j] <- 
        deletion 
        |> min insertion 
        |> min substitution

      if i > 1 && j > 1 then
        if a.[i-1] = b.[j-2] && a.[i-2] = b.[j-1] then
          let transposition = (!lastLastLine).[j-2] + cost  
          (!actLine).[j] <- min (!actLine).[j] transposition
    
    // swap lines
    let temp = !lastLastLine
    lastLastLine := !lastLine
    lastLine := !actLine
    actLine := temp
            
  (!lastLine).[b.Length]

 
let damerauLevenshtein(a:'a array) (b:'a array) =
  if a.Length > b.Length then
    calcDL a b
  else
    calcDL b a

This version of the algorithm needs only O(n+m) space but is not really "functional" style. I will show a more "F#-stylish" version in part III.

Original article: Damerau-Levenshtein-Distance in F# – part II – O(m+n) space

©2012 Rash thoughts about .NET, C#, F# and Dynamics NAV.. All Rights Reserved.


Viewing all articles
Browse latest Browse all 3

Trending Articles