NAME
Array::OverlapFinder - Find/remove overlapping items among ordered
sequences
VERSION
This document describes version 0.005 of Array::OverlapFinder (from Perl
distribution Array-OverlapFinder), released on 2020-01-02.
SYNOPSIS
use Array::OverlapFinder qw(
find_overlap
combine_overlap
);
# each sequence is an array of strings (compared with the 'eq' operator; if you
# have an array of records/structures, you can encode each record as a string,
# for example as JSON or with Data::Dmp)
my @seq1 = qw(1 2 3 4 5 6);
my @seq2 = qw(4 5 6 7 8 9);
my @seq3 = qw(8 9 10 11);
my @overlap_items = find_overlap(\@seq1, \@seq2); # => (4,5,6)
my @all_overlap_items = find_overlap(\@seq1, \@seq2, \@seq3); # => ([4,5,6], [8,9])
my ($overlap_items_12, $index2_at_seq1, $overlap_items_13, $index3_at_seq1b) =
find_overlap({detail=>1}, \@seq1, \@seq2, \@seq3); # => ([4,5,6], 3, [8,9], 7)
my @combined_seq = combine_overlap(\@seq1, \@seq2, \@seq3); # => (1,2,3,4,5,6,7,8,9,10,11)
my ($combined_seq, $overlap_items_12, $index2_at_seq1, $overlap_items_13, $index3_at_seq1b) =
combine_overlap({detail=>1}, \@seq1, \@seq2, \@seq3);
# => ([1,2,3,4,5,6,7,8,9,10,11], [4,5,6], 3, [8,9], 7)
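The comment in the synopsis about encoding records can be illustrated with a minimal sketch. It assumes the core module JSON::PP is acceptable as the encoder; the record fields (id, title) are made up for illustration and are not part of this module.

```perl
use strict;
use warnings;
use JSON::PP ();

# canonical() sorts hash keys, so records with equal content always
# encode to the same string and can be compared with 'eq'.
my $json = JSON::PP->new->canonical;

# Hypothetical records; the field names are illustrative only.
my @records = (
    { id => 1, title => 'first'  },
    { id => 2, title => 'second' },
);

# Encode each record to a string before passing the sequence on.
my @seq = map { $json->encode($_) } @records;

# Decode the strings back into structures afterwards.
my @decoded = map { $json->decode($_) } @seq;
```

Without canonical encoding, two equal hashes may serialize with different key order and would then fail to compare as equal.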
DESCRIPTION
Given two ordered sequences of items that may or may not overlap, where
the first sequence contains "earlier" items and the second contains
possibly "later" items, the functions in this module can find the
overlapping items or remove them, combining the two sequences into one:
# condition A, no overlaps
sequence1: 1 2 3 4 5 6
sequence2: 8 9 10
overlap :
combined : 1 2 3 4 5 6 8 9 10
# condition B, overlaps
sequence1: 1 2 3 4 5 6
sequence2: 4 5 6 7 8 9
overlap : 4 5 6
combined : 1 2 3 4 5 6 7 8 9
# condition C, overlaps
sequence1: 1 2 3 4 5 6
sequence2: 4 5
overlap : 4 5
combined : 1 2 3 4 5 6
# condition D, overlaps
sequence1: 1 2 3 4 5 6
sequence2: 4 5 6
overlap : 4 5 6
combined : 1 2 3 4 5 6
# condition E, overlaps (identical)
sequence1: 1 2 3 4 5 6
sequence2: 1 2 3 4 5 6
overlap : 1 2 3 4 5 6
combined : 1 2 3 4 5 6
# condition F, overlaps
sequence1: 1 2 3 4 5 6
sequence2: 1 2 3 4 5 6 7 8
overlap : 1 2 3 4 5 6
combined : 1 2 3 4 5 6 7 8
# condition G1, an overlap that is interrupted in the middle of the second sequence is treated as non-overlapping
sequence1: 1 2 3 4 5 6
sequence2: 2 3 4 x x 5 6
overlap :
combined : 1 2 3 4 5 6 2 3 4 x x 5 6
# condition G2, multiple overlaps are treated as non-overlapping
sequence1: 1 2 3 4 5 6
sequence2: 2 3 4 x x 5 6 y y
overlap :
combined : 1 2 3 4 5 6 2 3 4 x x 5 6 y y
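One rule that reproduces conditions A to G2 is: find the first position in sequence1 from which sequence2 matches item by item until either sequence runs out. The sketch below implements that rule. It is an assumption inferred from the examples above, not the module's actual implementation, and the names sketch_find_overlap and sketch_combine_overlap are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Array::OverlapFinder). Returns the
# overlapping items as an arrayref plus the index in @$seq1 where @$seq2
# starts, or ([], undef) when there is no overlap.
sub sketch_find_overlap {
    my ($seq1, $seq2) = @_;
    return ([], undef) unless @$seq1 && @$seq2;

    START: for my $start (0 .. $#$seq1) {
        # Compare until either sequence runs out of items.
        my $len = @$seq1 - $start;
        $len = @$seq2 if @$seq2 < $len;
        for my $j (0 .. $len - 1) {
            # Any mismatch rules out this starting position.
            next START if $seq1->[$start + $j] ne $seq2->[$j];
        }
        return ([ @{$seq2}[0 .. $len - 1] ], $start);
    }
    return ([], undef);
}

# Hypothetical helper: seq1 followed by the part of seq2 after the overlap.
sub sketch_combine_overlap {
    my ($seq1, $seq2) = @_;
    my ($overlap) = sketch_find_overlap($seq1, $seq2);
    my $skip = @$overlap;
    return (@$seq1, @{$seq2}[$skip .. $#$seq2]);
}
```

Under this rule, condition G1 yields no overlap because the match starting at item 2 breaks at the first x before sequence1 is exhausted.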
The functions also accept more than two sequences in which to find or
remove overlapping items.
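The detail output in the synopsis (index 7 for the third sequence) suggests that each later sequence is compared against the combined result of the earlier ones. The following sketch works under that assumption. The helper names are hypothetical and the pairwise rule is inferred from the conditions above rather than taken from the module's source.

```perl
use strict;
use warnings;

# Hypothetical pairwise helper: index in @$seq1 at which @$seq2 starts to
# overlap (matching until either sequence ends), or undef if none.
sub sketch_overlap_index {
    my ($seq1, $seq2) = @_;
    return unless @$seq2;
    START: for my $start (0 .. $#$seq1) {
        my $len = @$seq1 - $start;
        $len = @$seq2 if @$seq2 < $len;
        for my $j (0 .. $len - 1) {
            next START if $seq1->[$start + $j] ne $seq2->[$j];
        }
        return $start;
    }
    return;
}

# Hypothetical fold: combine the first sequence with each later one in turn.
# Returns the combined arrayref, then one [overlap_items, index] pair per
# later sequence.
sub sketch_combine_all {
    my @seqs     = @_;
    my @combined = @{ shift @seqs };
    my @details;
    for my $seq (@seqs) {
        my $start = sketch_overlap_index(\@combined, $seq);
        my $n = 0;    # number of overlapping items
        if (defined $start) {
            $n = @combined - $start;
            $n = @$seq if @$seq < $n;
        }
        push @details, [ [ @{$seq}[0 .. $n - 1] ], $start ];
        push @combined, @{$seq}[$n .. $#$seq];
    }
    return (\@combined, @details);
}

my ($combined, @details) = sketch_combine_all([1 .. 6], [4 .. 9], [8 .. 11]);
print "@$combined\n";    # prints "1 2 3 4 5 6 7 8 9 10 11"
```

With the synopsis data, the second detail pair is ([8, 9], 7): the index is counted within the already combined sequence 1..9, not within the original first sequence.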
Use cases: forming a non-overlapping sequence of items from repeated
downloads of an RSS feed or a "recent" page.
FUNCTIONS
No functions are exported by default, but all are exportable.
find_overlap
Usage:
find_overlap([ \%opts , ] \@seq1, \@seq2, ...)
combine_overlap
Usage:
combine_overlap([ \%opts , ] \@seq1, \@seq2, ...)
HOMEPAGE
Please visit the project's homepage at
<https://metacpan.org/release/Array-OverlapFinder>.
SOURCE
Source repository is at
<https://github.com/perlancar/perl-Array-OverlapFinder>.
BUGS
Please report any bugs or feature requests on the bugtracker website
<https://github.com/perlancar/perl-Array-OverlapFinder/issues>.
When submitting a bug or request, please include a test-file or a patch
to an existing test-file that illustrates the bug or desired feature.
SEE ALSO
nauniq from App::nauniq can also sometimes be used, if you know the
items in the sequence are unique.
Algorithm::Diff
Text::OverlapFinder has a similar name, but the two modules are not
closely related.
AUTHOR
perlancar <[email protected]>
COPYRIGHT AND LICENSE
This software is copyright (c) 2021, 2020 by
[email protected].
This is free software; you can redistribute it and/or modify it under
the same terms as the Perl 5 programming language system itself.