|
| finding duplicates in a found set |
 |
Tue, 11 Mar 2008 11:47:59 -070 |
i'm having trouble using the duplicate finder symbol "!".
i seem to be getting false results when i try to find duplicates
inside a found set.
say i have two fields: 'company' and 'address' with respective values:
company1, 1 main st
company2, 2 main st
company3, 3 main st
company4, 1 main st
my intent is to find companies with the same addresses and set the
value of a 3rd field 'mark' = 1 if the address is the same.
if i perform a find in the address field for "!", i get
company1, 1 main st
company4, 1 main st
i then set the 'mark' field for company4 = 1
now the part i don't understand.
show all records
perform find in 'mark' for "1", check omit records
i'm left with:
company1, 1 main st
company2, 2 main st
company3, 3 main st
perform find in 'address' for "!", constrain found set
i'm left with
company1, 1 main st
why??? shouldn't there be nothing found that matches that criteria?
|
| Re: finding duplicates in a found set |
 |
Wed, 12 Mar 2008 19:07:38 -070 |
> > company1, 1 main st
> > company2, 2 main st
> > company3, 3 main st
>
> > perform find in 'address' for "!", constrain found set
> > im left with
> > company1, 1 main st
>
> > why??? shouldn't there be nothing found that matches that criteria?
>
> No, you get exactly what you are asking for. '1 main st' is a duplicate
> address and it is in the existing found set.
>
> --
> Hans Rijnbout
> Utrecht, Netherlands
not sure i understand...
while '1 main st' is a duplicate in the entire database, it is not a
duplicate in that particular found set.
the current found set is:
company1, 1 main st
company2, 2 main st
company3, 3 main st
these are unique values for these three records.
yet when i perform a find for "!" and select 'constrain found set'
(fmp 7), it returns the '1 main st' record.
|
| Re: finding duplicates in a found set |
 |
Thu, 13 Mar 2008 00:34:28 +010 |
<geoff.w.miller@gmail.com> wrote:
> im having trouble using of the duplicate finder symbol "!"
> i seem to be getting false results when i try to find duplicates
> inside a found set.
>
> say i have two fields: 'company' and 'address' with respective values:
> company1, 1 main st
> company2, 2 main st
> company3, 3 main st
> company4, 1 main st
>
> my intent is to find companies with the same addresses and set the
> value of a 3rd field 'mark' = 1 if the address is the same.
>
> if i perform a find in the address field for "!", i get
> company1, 1 main st
> company4, 1 main st
>
> i then set the 'mark' field for company4 = 1
>
> now the part i dont understand.
>
> show all records
> perform find in 'mark' for "1", check omit records
> i'm left with:
> company1, 1 main st
> company2, 2 main st
> company3, 3 main st
>
> perform find in 'address' for "!", constrain found set
> im left with
> company1, 1 main st
>
> why??? shouldn't there be nothing found that matches that criteria?
No, you get exactly what you are asking for. '1 main st' is a duplicate
address and it is in the existing found set.
--
Hans Rijnbout
|
| Re: finding duplicates in a found set |
 |
Thu, 13 Mar 2008 10:29:22 -070 |
On Mar 12, 10:18 pm, Helpful Harry <helpful_ha...@nom.de.plume.com>
wrote:
> I've never used the option, but I think you have to read the wording
> carefully - "Constrain FOUND SET". Apparently you're constraining the
> new Find to display results from the current Found Set, but still
> actually performing the Find over the entire database ... which is not
> the same as performing the Find on the current Found Set only.
That would be a perverse reading of how it works.
The entire point of constraining a found set is to LIMIT the records
under consideration in the new find to those in the found set.
That said, you are right: "find duplicates" is not doing a comparative
scan of the entire database. It just looks at the index for that field
value, and the record is a match if the index has more than one record
pointer for a given value. The fact that some or all of those
additional record pointers point to records that aren't also in the
found set evidently isn't considered.
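For illustration, here is a minimal Python sketch of the behaviour
described above. It is only a model of this explanation, not
FileMaker's actual implementation; RECORDS, build_index and
constrain_to_duplicates are made-up names for the example.

```python
from collections import defaultdict

# Hypothetical records mirroring the thread's example: (company, address).
RECORDS = [
    ("company1", "1 main st"),
    ("company2", "2 main st"),
    ("company3", "3 main st"),
    ("company4", "1 main st"),
]

def build_index(records):
    """Map each address to the positions of ALL records holding it."""
    index = defaultdict(list)
    for pos, (_company, address) in enumerate(records):
        index[address].append(pos)
    return index

def constrain_to_duplicates(records, found_set, index):
    """Keep found-set records whose address has more than one pointer
    in the whole-table index, ignoring whether the other pointers
    refer to records inside the found set."""
    return [pos for pos in found_set if len(index[records[pos][1]]) > 1]

index = build_index(RECORDS)
found_set = [0, 1, 2]  # company4 has been omitted
result = constrain_to_duplicates(RECORDS, found_set, index)
print([RECORDS[pos][0] for pos in result])  # ['company1']
```

Under this model company1 survives the constrain because the index
still holds two pointers for '1 main st', even though company4 is not
in the found set.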
I think 'find duplicates' is the only find criterion that is even
affected. No other find criterion matches based on the existence,
status or contents of other records, never mind the found set
(excluding searching through relationships/portals). There is no
'find maximum' criterion, for example. (Though it would be nice if
there was. And again, if I searched for the 'maximum' in a constrain
to found set, I would expect it to return the record that was the
maximum in the found set, not the record that was the maximum in the
entire database if I'm lucky enough that it's in the found set.)
> Maybe it's a bug or maybe that's how it's meant to work - perhaps the
> Help File has a more detailed explanation of what it's supposed to be
> doing.
The help file is ambiguous at best, but it would be relatively unusual
to want to find the records in the found set that have duplicates in
the database that may or may not be in the found set. It would be far
more natural and consistent for constrain found set to only consider
the records in the found set.
I'd report it as a bug.
To fix it they'd just have to treat the field as unindexed and create
a temporary index on the fly based on the found set whenever find
duplicates and constrain found set are both selected.
That would be fairly trivial to implement, but I doubt they'll do it
as it's been operating the current way for several versions, and
someone somewhere out there is probably relying on it to work like
this by now.
> Probably the best work around is to mark all the duplicates earlier in
> the process. For example, after finding the original duplicates, mark
> the first record as "1" and the other(s) as "2". Then your second Find
> is simply a matter of omitting those records that are marked with any
> number (or including only those unmarked).
Agreed, some work around or other will be needed.
-cheers,
|
| Re: finding duplicates in a found set |
 |
Thu, 13 Mar 2008 18:18:27 +130 |
In article
<4416584d-d02f-4cb6-b08d-79f5ddf61c65@q78g2000hsh.googlegroups.com>,
"G.Miller" <geoff.w.miller@gmail.com> wrote:
> > > company1, 1 main st
> > > company2, 2 main st
> > > company3, 3 main st
> >
> > > perform find in 'address' for "!", constrain found set
> > > im left with
> > > company1, 1 main st
> >
> > > why??? shouldn't there be nothing found that matches that
> > > criteria?
> >
> > No, you get exactly what you are asking for. '1 main st' is a
> > duplicate address and it is in the existing found set.
> >
> > --
> > Hans Rijnbout
> > Utrecht, Netherlands
>
> not sure i understand...
> while '1 main st' is a duplicate in the entire database, it is not a
> duplicate in that particular found set.
>
> the current found set is:
> company1, 1 main st
> company2, 2 main st
> company3, 3 main st
> these are unique values for these three records.
> yet when i perform a find for "!" and select 'constrain found
> set' (fmp 7)
> it returns the '1 main st' record.
>
> geoff
I've never used the option, but I think you have to read the wording
carefully - "Constrain FOUND SET". Apparently you're constraining the
new Find to display results from the current Found Set, but still
actually performing the Find over the entire database ... which is not
the same as performing the Find on the current Found Set only.
"Company1" has a duplicate in the database AND it appears in the
current Found Set, so that record is in the resulting new Found Set.
You can test it by simply omitting the "Company1" record manually
before performing the constrained Find. You should get the empty Found
Set you're expecting, since "Company1" is then no longer in the
current Found Set.
Maybe it's a bug or maybe that's how it's meant to work - perhaps the
Help File has a more detailed explanation of what it's supposed to be
doing.
Probably the best work around is to mark all the duplicates earlier in
the process. For example, after finding the original duplicates, mark
the first record as "1" and the other(s) as "2". Then your second Find
is simply a matter of omitting those records that are marked with any
number (or including only those unmarked).
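To make the marking scheme concrete, here is a small Python sketch of
it. This only models the logic outside FileMaker; RECORDS and
mark_duplicates are hypothetical names, and in the real file the marks
would live in the 'mark' field.

```python
from collections import Counter

# Hypothetical records mirroring the thread's example: (company, address).
RECORDS = [
    ("company1", "1 main st"),
    ("company2", "2 main st"),
    ("company3", "3 main st"),
    ("company4", "1 main st"),
]

def mark_duplicates(records):
    """Return one mark per record: 1 for the first record of a
    duplicated address, 2 for later ones, None for unique addresses."""
    counts = Counter(address for _company, address in records)
    seen = set()
    marks = []
    for _company, address in records:
        if counts[address] < 2:
            marks.append(None)   # unique address, left unmarked
        elif address in seen:
            marks.append(2)      # a later duplicate
        else:
            seen.add(address)
            marks.append(1)      # first record of the duplicate group
    return marks

marks = mark_duplicates(RECORDS)
unmarked = [RECORDS[i][0] for i, m in enumerate(marks) if m is None]
print(marks)     # [1, None, None, 2]
print(unmarked)  # ['company2', 'company3']
```

The second step then never needs the "!" find at all, only a find on
the mark values.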
Helpful Harry