Subj : Dupeloops
To : mark lewis
From : Rob Swindell
Date : Tue Jun 19 2018 02:26 pm
Re: Dupeloops
By: mark lewis to Rob Swindell on Tue Jun 19 2018 03:20 pm
>
> On 2018 Jun 19 11:31:08, you wrote to me:
>
> >> my point is specifically that messages with a ^aRESCANNED control line
> >> should not be passed on to other links... ever... that will stop them
> >> from triggering what looks like a regurge or "dupe dump"... they will
> >> be different than the original message because of the ^aRESCANNED
> >> control line so they will not be caught by most dupe detection
> >> techniques... that's the real problem...
>
> RS> Is that true?
>
> in numerous cases, yes... but, if i want a rescan of an area that had
> damaged data files and i'm trying to recover the last year's messages, why
> should the rescanned messages be sent on to any other system? mine is the
> only one that wants or needs them... why should other linked systems have
> to do the additional work? if we just don't send ^aRESCANNED messages on to
> other systems, no other systems would be bothered...
I don't dispute that rescanned messages shouldn't be forwarded to downlinks, and
I just committed a change to SBBSecho to that effect.
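
To illustrate the idea (this is not the actual SBBSecho change, which is not shown in this message), here is a minimal Python sketch of a tosser skipping downlinks for any message carrying a ^aRESCANNED control line. The function names `is_rescanned` and `links_to_forward` are hypothetical, and the sketch assumes kludge lines begin with ASCII 1 and that lines are CR-terminated.

```python
SOH = "\x01"  # FTN kludge/control lines start with ASCII 1, often written ^a


def is_rescanned(message_text: str) -> bool:
    """Return True if any control line in the message is a RESCANNED kludge."""
    for line in message_text.split("\r"):
        # Tolerate a stray LF left over from CRLF line endings.
        if line.lstrip("\n").startswith(SOH + "RESCANNED"):
            return True
    return False


def links_to_forward(message_text: str, downlinks: list) -> list:
    """Hypothetical helper: a rescanned message is forwarded to no downlinks."""
    return [] if is_rescanned(message_text) else list(downlinks)
```

Only a control line counts here; the word RESCANNED appearing in ordinary body text does not stop forwarding.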
> RS> Synchronet/SBBSecho uses 2 methods of dupe message detection:
>
> RS> 1. Message-ID (in the case of FTN, that's everything between
> RS>    "\1MSGID: " and the CR) - the Message-ID doesn't change when
> RS>    messages are re-scanned
> RS> 2. Message body text (not including kludge/control lines,
> RS>    paths/seen-bys, and tear/tag/origin lines)
>
> RS> Rescanned messages would (should) be caught as dupes just fine.
>
> that looks ok but not everyone goes that route with their dupe detection
> code...
>
> i've seen the second one cause systems to only see, for example, the first
> monthly posting of something and they never see it again in any of the
> following months... then it is purged out of their message base and they
> don't have it any more and don't receive it either... maybe it is echo
> rules... maybe it is a monthly PSA...
And if it's a duplicate, it's a duplicate. That's why auto-posters should (?) put
timestamps or other unique data in their message body if they really want to
avoid being ignored as dupes. But including metadata (control lines) in the
dupe detection seems like a bad approach. If a message takes a different path,
it'll have different metadata, but it's still a dupe (and often that's how
dupes arrive, via a different path than the original).
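
The two checks described above (Message-ID, plus body text with the metadata stripped) can be sketched roughly as follows. This is an illustrative Python sketch, not SBBSecho's code: the names `message_id`, `body_digest` and `DupeChecker` are hypothetical, the SHA-1 digest and in-memory sets are assumptions made for the example, and the line prefixes used to spot tear/tag/origin and SEEN-BY lines are the conventional ones.

```python
import hashlib

SOH = "\x01"  # kludge/control lines (including ^aPATH) start with ASCII 1


def message_id(text):
    """Return everything between SOH + 'MSGID: ' and the CR, or None."""
    marker = SOH + "MSGID: "
    for line in text.split("\r"):
        if line.startswith(marker):
            return line[len(marker):]
    return None


def body_digest(text):
    """Hash the body only: skip kludges, seen-bys, tear/tag/origin lines."""
    kept = []
    for line in text.split("\r"):
        if line.startswith(SOH):
            continue  # control lines such as MSGID, PATH, RESCANNED
        if line == "---" or line.startswith(("--- ", "... ", " * Origin:", "SEEN-BY:")):
            continue  # tear line, tag line, origin line, seen-bys
        kept.append(line)
    return hashlib.sha1("\r".join(kept).encode("utf-8")).hexdigest()


class DupeChecker:
    """Remembers Message-IDs and body digests of messages already seen."""

    def __init__(self):
        self.seen_ids = set()
        self.seen_bodies = set()

    def is_dupe(self, text):
        mid = message_id(text)
        digest = body_digest(text)
        dupe = (mid is not None and mid in self.seen_ids) or digest in self.seen_bodies
        if mid is not None:
            self.seen_ids.add(mid)
        self.seen_bodies.add(digest)
        return dupe
```

Because neither check looks at control lines or routing data, a rescanned copy or a copy arriving via a different path still matches the original. The flip side is that a monthly posting with an unchanged body matches too, unless the poster puts a timestamp or other unique text in the body.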
digital man
This Is Spinal Tap quote #43:
I feel my role in the band is ... kind of like lukewarm water.
Norco, CA WX: 80.4°F, 47.0% humidity, 12 mph ENE wind, 0.00 inches rain/24hrs