Here's a Perl 6 solution. I use a grammar that knows how to grab four interesting characters despite interstitial stuff. More complex requirements would need a different grammar, but that's not so hard.
Each time there's a match, the NString::Actions object gets a chance to inspect the match. It does the same high-water mark thing I was doing before. This looks like a bunch more work, and it is for this trivial example. For more complex examples, though, it's not going to be that much worse, whereas my Perl 5 version would need a lot of extra tooling to figure out what to keep or not keep.
use Text::Levenshtein;
my $string = 'The quixotic purple and jasmine butterfly flew over the quick zany dog';
grammar NString {
    regex n-chars      { [<.ignore-chars>* \w] ** 4 }
    regex ignore-chars { \s }
}
class NString::Actions {
    # See
    my subset IntInf where Int:D | Inf;
    has $.target;
    has Str $.closest is rw = '';
    has IntInf $.closest-distance is rw = Inf;
    method n-chars ($/) {
        my $string = $/.subst: /\s+/, '', :g;
        my $distance = distance( $string, self.target );
        # say "Matched <$/>. Distance for $string is $distance";
        if $distance < self.closest-distance {
            self.closest = $string;
            self.closest-distance = $distance;
        }
    }
}
my $action = NString::Actions.new: target => 'Perl';
loop {
    state $from = 0;
    my $match = NString.subparse(
        $string,
        :rule('n-chars'),
        :actions($action),
        :c($from)
    );
    last unless ?$match;
    $from++;
}
say "Shortest is { $action.closest } with { $action.closest-distance }";
(I did a straight port from Perl 5, which I'll leave here.)
I tried the same thing in Perl 6, but I'm sure this is more verbose than it needs to be. I was wondering if there's a clever way to grab groups of N chars to compare. Maybe I'll have some improvement later.
use Text::Levenshtein;
put edit( "four", "foar" );
put edit( "four", "noise fo or blur" );
sub edit ( Str:D $start, Str:D $target --> Int:D ) {
    my $target-modified = $target.subst: rx/\s+/, '', :g;
    my $last-position-to-check = [-] map { .chars }, $target-modified, $start;
    my $closest = Any;
    my $closest-distance = $start.chars + 1;
    for 0..$last-position-to-check -> $starting-pos {
        my $substr = $target-modified.substr: $starting-pos, $start.chars;
        my $this-distance = distance( $start, $substr );
        put "So far: $substr -> $this-distance";
        if $this-distance < $closest-distance {
            $closest = $substr;
            $closest-distance = $this-distance;
        }
        # Compare with ==; a single = would assign 0 and never exit early
        last if $this-distance == 0;
    }
    return $closest-distance // -1;
}
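On grabbing groups of N chars: one possibility is `rotor` with a negative gap, which produces overlapping windows. This is only a sketch, not a tested replacement for the loop above. The `windows` name is my own invention for illustration, and it assumes all whitespace should be thrown away first, as the `edit` sub does.

```raku
# Sketch: every run of $n consecutive characters, whitespace removed first.
# `windows` is a hypothetical helper name, not part of the code above.
sub windows ( Str:D $text, Int:D $n ) {
    my @chars = $text.subst( /\s+/, '', :g ).comb;
    # rotor( $n => -($n - 1) ) takes $n items, then steps back $n - 1,
    # so each window starts one character after the previous one.
    # Incomplete windows at the end are dropped by default.
    @chars.rotor( $n => -( $n - 1 ) ).map: *.join;
}

.put for windows( 'noise fo or blur', 4 );
```

Each string this returns could then be handed to `distance` in place of the manual `substr` bookkeeping.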