Extracting Sql: entries

Showing posts with label entries. Show all posts

Sunday, February 19, 2012

Friday, February 17, 2012

I am working on an inherited database and trying to add indexes on existing
tables. A few of the tables have duplicate entries that cause the
application of the indices to fail. I know that on some of the tables I can
simply copy the table, delete the records apply the index import from old
table and only unique records will make it into the table. ( I am correct
here aren't I?) However...
On a couple of tables I need to actually "see" the duplicate records so that
I can choose which records to keep and do a manual deletion of the duplicate
records.
There are no PK's on the tables in question and the indices I need to put in
place are, for example, on 6 of 15 columns. I need to find the dupes and
view the information in non-indexed columns to make my decision on which to
keep and which to delete.
Does this make any sense? Can anyone help me out with how I would go about
identifying the duplicate records?
Thanks in advance.
danieldaniel wrote:
> I am working on an inherited database and trying to add indexes on
> existing tables. A few of the tables have duplicate entries that
> cause the application of the indices to fail. I know that on some of
> the tables I can simply copy the table, delete the records apply the
> index import from old table and only unique records will make it into
> the table. ( I am correct here aren't I?) However...
> On a couple of tables I need to actually "see" the duplicate records
> so that I can choose which records to keep and do a manual deletion
> of the duplicate records.
> There are no PK's on the tables in question and the indices I need to
> put in place are, for example, on 6 of 15 columns. I need to find the
> dupes and view the information in non-indexed columns to make my
> decision on which to keep and which to delete.
> Does this make any sense? Can anyone help me out with how I would go
> about identifying the duplicate records?
> Thanks in advance.
> daniel
You can group by the columns to be indexed and use a having clause to
return the dupes. Not sure of your table size, so the query may be CPU
and tiem consuming:
For example:
Select a, b, c, d
From MyTable
Group By a, b, c, d
Having COUNT(*) > 1
To get rid of the dupes, you could simply create another table and add a
clustered index with the ignore dup keys flag and insert the rows. But
you wouldn't have much control over which dupes were removed.
David Gugick
Imceda Software
www.imceda.com|||"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
> daniel wrote:
> > I am working on an inherited database and trying to add indexes on
> > existing tables. A few of the tables have duplicate entries that
> > cause the application of the indices to fail. I know that on some of
> > the tables I can simply copy the table, delete the records apply the
> > index import from old table and only unique records will make it into
> > the table. ( I am correct here aren't I?) However...
> >
> > On a couple of tables I need to actually "see" the duplicate records
> > so that I can choose which records to keep and do a manual deletion
> > of the duplicate records.
> >
> > There are no PK's on the tables in question and the indices I need to
> > put in place are, for example, on 6 of 15 columns. I need to find the
> > dupes and view the information in non-indexed columns to make my
> > decision on which to keep and which to delete.
> >
> > Does this make any sense? Can anyone help me out with how I would go
> > about identifying the duplicate records?
> >
> > Thanks in advance.
> >
> > daniel
> You can group by the columns to be indexed and use a having clause to
> return the dupes. Not sure of your table size, so the query may be CPU
> and tiem consuming:
> For example:
> Select a, b, c, d
> From MyTable
> Group By a, b, c, d
> Having COUNT(*) > 1
> To get rid of the dupes, you could simply create another table and add a
> clustered index with the ignore dup keys flag and insert the rows. But
> you wouldn't have much control over which dupes were removed.
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
Thanks David but I guess maybe I've misunderstood the error message from
SQL.(?) I queried one of the tables on which I receive the indexing error,
with the query you provided, and it returned 0 records. Perhaps providing
the error message will allow someone more knowledgeable than I to tell me
I'm interpreting it wrong. The message is as follows:
"Server: Msg 1505, Level 16, State 1
CREATE UNIQUE INDEX terminated because a duplicate key was found for index
id 9. Most significant primary key is '115040'"
Does this not mean there are duplicate row entries based on the attempted
index columns? If so why do I get a return of 0 records with the query
provided below?
Any ideas anyone?|||daniel wrote:
> Thanks David but I guess maybe I've misunderstood the error message
> from SQL.(?) I queried one of the tables on which I receive the
> indexing error, with the query you provided, and it returned 0
> records. Perhaps providing the error message will allow someone more
> knowledgeable than I to tell me I'm interpreting it wrong. The
> message is as follows:
> "Server: Msg 1505, Level 16, State 1
> CREATE UNIQUE INDEX terminated because a duplicate key was found for
> index id 9. Most significant primary key is '115040'"
> Does this not mean there are duplicate row entries based on the
> attempted index columns? If so why do I get a return of 0 records
> with the query provided below?
> Any ideas anyone?
You'll need to provide the query, the table ddl, and the index statement
in order to tell what's going on.
David Gugick
Imceda Software
www.imceda.com|||You might find this interesting.
http://www.15seconds.com/issue/011009.htm
Ben Miller
"daniel" <dhagwood@.hotmail.com> wrote in message
news:OVAIxhgCFHA.560@.TK2MSFTNGP15.phx.gbl...
>I am working on an inherited database and trying to add indexes on existing
> tables. A few of the tables have duplicate entries that cause the
> application of the indices to fail. I know that on some of the tables I
> can
> simply copy the table, delete the records apply the index import from old
> table and only unique records will make it into the table. ( I am correct
> here aren't I?) However...
> On a couple of tables I need to actually "see" the duplicate records so
> that
> I can choose which records to keep and do a manual deletion of the
> duplicate
> records.
> There are no PK's on the tables in question and the indices I need to put
> in
> place are, for example, on 6 of 15 columns. I need to find the dupes and
> view the information in non-indexed columns to make my decision on which
> to
> keep and which to delete.
> Does this make any sense? Can anyone help me out with how I would go about
> identifying the duplicate records?
> Thanks in advance.
> daniel
>|||daniel wrote:
> "David Gugick" <davidg-nospam@.imceda.com> wrote in message
> news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
>> daniel wrote:
>> I am working on an inherited database and trying to add indexes on
>> existing tables. A few of the tables have duplicate entries that
>> cause the application of the indices to fail. I know that on some of
>> the tables I can simply copy the table, delete the records apply the
>> index import from old table and only unique records will make it
>> into the table. ( I am correct here aren't I?) However...
>> On a couple of tables I need to actually "see" the duplicate records
>> so that I can choose which records to keep and do a manual deletion
>> of the duplicate records.
>> There are no PK's on the tables in question and the indices I need
>> to put in place are, for example, on 6 of 15 columns. I need to
>> find the dupes and view the information in non-indexed columns to
>> make my decision on which to keep and which to delete.
>> Does this make any sense? Can anyone help me out with how I would go
>> about identifying the duplicate records?
>> Thanks in advance.
>> daniel
>> You can group by the columns to be indexed and use a having clause to
>> return the dupes. Not sure of your table size, so the query may be
>> CPU and tiem consuming:
>> For example:
>> Select a, b, c, d
>> From MyTable
>> Group By a, b, c, d
>> Having COUNT(*) > 1
>> To get rid of the dupes, you could simply create another table and
>> add a clustered index with the ignore dup keys flag and insert the
>> rows. But you wouldn't have much control over which dupes were
>> removed.
>>
>> --
>> David Gugick
>> Imceda Software
>> www.imceda.com
> Thanks David but I guess maybe I've misunderstood the error message
> from SQL.(?) I queried one of the tables on which I receive the
> indexing error, with the query you provided, and it returned 0
> records. Perhaps providing the error message will allow someone more
> knowledgeable than I to tell me I'm interpreting it wrong. The
> message is as follows:
> "Server: Msg 1505, Level 16, State 1
> CREATE UNIQUE INDEX terminated because a duplicate key was found for
> index id 9. Most significant primary key is '115040'"
> Does this not mean there are duplicate row entries based on the
> attempted index columns? If so why do I get a return of 0 records
> with the query provided below?
> Any ideas anyone?
This works for me:
create table #a (col1 int, col2 int, col3 int)
insert into #a values (1, 2, 3)
insert into #a values (1, 2, 3)
insert into #a values (2, 2, 3)
insert into #a values (3, 2, 1)
insert into #a values (3, 2, 1)
insert into #a values (3, 2, 1)
Select count(*) as "# Dupes", col1, col2, col3
From #a
Group By col1, col2, col3
Having count(*) > 1
# Dupes col1 col2 col3
-- -- -- --
2 1 2 3
3 3 2 1
David Gugick
Imceda Software
www.imceda.com|||"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:ea8zyRlCFHA.2676@.TK2MSFTNGP12.phx.gbl...
> daniel wrote:
> > "David Gugick" <davidg-nospam@.imceda.com> wrote in message
> > news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
> >> daniel wrote:
> >> I am working on an inherited database and trying to add indexes on
> >> existing tables. A few of the tables have duplicate entries that
> >> cause the application of the indices to fail. I know that on some of
> >> the tables I can simply copy the table, delete the records apply the
> >> index import from old table and only unique records will make it
> >> into the table. ( I am correct here aren't I?) However...
> >>
> >> On a couple of tables I need to actually "see" the duplicate records
> >> so that I can choose which records to keep and do a manual deletion
> >> of the duplicate records.
> >>
> >> There are no PK's on the tables in question and the indices I need
> >> to put in place are, for example, on 6 of 15 columns. I need to
> >> find the dupes and view the information in non-indexed columns to
> >> make my decision on which to keep and which to delete.
> >>
> >> Does this make any sense? Can anyone help me out with how I would go
> >> about identifying the duplicate records?
> >>
> >> Thanks in advance.
> >>
> >> daniel
> >>
> >> You can group by the columns to be indexed and use a having clause to
> >> return the dupes. Not sure of your table size, so the query may be
> >> CPU and tiem consuming:
> >>
> >> For example:
> >>
> >> Select a, b, c, d
> >> From MyTable
> >> Group By a, b, c, d
> >> Having COUNT(*) > 1
> >>
> >> To get rid of the dupes, you could simply create another table and
> >> add a clustered index with the ignore dup keys flag and insert the
> >> rows. But you wouldn't have much control over which dupes were
> >> removed.
> >>
> >>
> >> --
> >> David Gugick
> >> Imceda Software
> >> www.imceda.com
> >>
> >
> > Thanks David but I guess maybe I've misunderstood the error message
> > from SQL.(?) I queried one of the tables on which I receive the
> > indexing error, with the query you provided, and it returned 0
> > records. Perhaps providing the error message will allow someone more
> > knowledgeable than I to tell me I'm interpreting it wrong. The
> > message is as follows:
> >
> > "Server: Msg 1505, Level 16, State 1
> > CREATE UNIQUE INDEX terminated because a duplicate key was found for
> > index id 9. Most significant primary key is '115040'"
> >
> > Does this not mean there are duplicate row entries based on the
> > attempted index columns? If so why do I get a return of 0 records
> > with the query provided below?
> >
> > Any ideas anyone?
> This works for me:
> create table #a (col1 int, col2 int, col3 int)
>
> insert into #a values (1, 2, 3)
> insert into #a values (1, 2, 3)
> insert into #a values (2, 2, 3)
> insert into #a values (3, 2, 1)
> insert into #a values (3, 2, 1)
> insert into #a values (3, 2, 1)
>
> Select count(*) as "# Dupes", col1, col2, col3
> From #a
> Group By col1, col2, col3
> Having count(*) > 1
> # Dupes col1 col2 col3
> -- -- -- --
> 2 1 2 3
> 3 3 2 1
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
Thanks again David. I finally got the query to work. Knowing that there were
duplicates I did some intensive eye straining work to figure out the
problem. Appears as though some of the "duplicate" records also contained
unexpected data, empty string vs. NULL value and this was thowing off the
query. I updated the fields as necessary and everything worked out fine.
Thanks again for your assistance.
daniel

Sunday, February 19, 2012

Extracting duplicate records

Friday, February 17, 2012

Extracting duplicate records

Extracting duplicate records

Extracting Sql

Blog Archive

About Me