Hi-
I'm a developer on a team with a 200GB db. We inherited it, it definitely
has design problems. About 3 weeks ago we started seeing a lot of blocking
on our largest table which has affected our systems. We made no changes at
the time the blocking started occurring. The db has been growing about 20
GB/month, and we are re-indexing every weekend. We have significant hardware
to support the database, and our DBAs do not believe it to be
hardware-related.
Is it possible that the SQL Server engine has problems when the db reaches
such a large size? Has anyone else experienced a similar situation or could
offer up ideas?
Thx,
JanHi
How are your indexes laid out compared to the database design? Is there a
clustered index on a very selectable column? Maybe posting the DDL of the
table would help us define a solution.
On large tables, a wrong clustered index can cuase havoc with blocking.
I am running multiple 5Tb databases, on 8Gb RAM servers, with no issues.
Regards
Mike
"Jan" wrote:
> Hi-
> I'm a developer on a team with a 200GB db. We inherited it, it definitely
> has design problems. About 3 weeks ago we started seeing a lot of blocking
> on our largest table which has affected our systems. We made no changes at
> the time the blocking started occurring. The db has been growing about 20
> GB/month, and we are re-indexing every weekend. We have significant hardware
> to support the database, and our DBAs do not believe it to be
> hardware-related.
> Is it possible that the SQL Server engine has problems when the db reaches
> such a large size? Has anyone else experienced a similar situation or could
> offer up ideas?
> Thx,
> Jan|||Thanks for the reply. Wow- 5TB, very impressive!
We do have clustered indexes. Though not exactly the script (security
reasons), this might give you an idea:
CREATE TABLE [dbo].[PROBTBL] (
[RNO] [int] IDENTITY (1, 1) NOT NULL ,
[KNO] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[ANOTHKNO] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ATYPEKIND] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[LNITEMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ACUSTID] [int] NULL ,
[VID] [int] NULL ,
[VNUM] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BR1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BR2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CTR] [varchar] (40) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ADDRESS1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ADDRESS2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CTY] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ST] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ZC] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CNTY] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[LTYPE] [char] (4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[OP] [datetime] NULL ,
[CLS] [datetime] NULL ,
[CLSCODE] [int] NULL ,
[PRN] [decimal](11, 2) NULL ,
[MNAMOUNT] [decimal](11, 2) NULL ,
[LTCHG] [decimal](11, 2) NULL ,
[LTPCT] [decimal](5, 2) NULL ,
[CANCL] [datetime] NULL ,
[TERM] [datetime] NULL ,
[STS] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[S1] [int] NULL ,
[S2] [int] NULL ,
[INTPRD] [decimal](7, 2) NULL ,
[SRVPRD] [decimal](7, 2) NULL ,
[ORIG] [decimal](11, 2) NULL ,
[LDESC] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MLADDRESS1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MLADDRESS2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MLCTY] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MLST] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MLZC] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PPHONE] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BPHONE] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[IRTE] [decimal](6, 3) NULL ,
[DEF] [datetime] NULL ,
[RFL] [datetime] NULL ,
[INVID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[INVNMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[IVPRCT] [decimal](7, 3) NULL ,
[INRID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[INSNMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[INSPCTT] [decimal](7, 3) NULL ,
[INSCRTT] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PRTY] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PRCSSR] [int] NULL ,
[NXTCD] [int] NULL ,
[NXTDT] [datetime] NULL ,
[LASTACC] [datetime] NULL ,
[ONHLD] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ONHLD__4301EA8F] DEFAULT (0),
[IVFEE] [decimal](9, 2) NULL ,
[ISFEE] [decimal](9, 2) NULL ,
[ADDFEEREQ] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ADDFEE__43F60EC8]
DEFAULT (0),
[ADFE] [decimal](9, 2) NULL ,
[IVCD] [char] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ISCD] [char] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[HLDFRBK] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__HOLDFO__44EA3301] DEFAULT
(0),
[CNTSTD] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__CONTES__45DE573A] DEFAULT
(0),
[VNDPRD] [int] NULL ,
[SVNXTCDE] [int] NULL ,
[SVNXTDT] [datetime] NULL ,
[DJSTBL] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ADJUST__46D27B73] DEFAULT
(0),
[XTCHNG] [datetime] NULL ,
[SCRWNL] [datetime] NULL ,
[BKNTFY] [datetime] NULL ,
[PRPTND] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[RGNT] [datetime] NULL ,
[MTRT] [datetime] NULL ,
[SRVCLS] [datetime] NULL ,
[NTS] [int] NULL ,
[NVSTD] [int] NULL ,
[PTYP] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PSN] [int] NULL ,
[MRSN] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[LstMdfd] [datetime] NULL ,
[LstMdfdDTTM] [datetime] NULL ,
[Lngnd] [varchar] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
CONSTRAINT [PK_PROBTBL] PRIMARY KEY CLUSTERED
(
[KNO]
) WITH FILLFACTOR = 40 ON [PRIMARY] ,
CONSTRAINT [FK_Lngnd] FOREIGN KEY
(
[Lngnd]
) REFERENCES [LngndDefinition] (
[Lngnd]
)
) ON [PRIMARY] TEXTIMAGE_ON [TEXT3]
GO
CREATE INDEX [IXC_PROBTBL] ON [dbo].[PROBTBL]([KNO], [VID]) WITH
FILLFACTOR = 90 ON [INDEXES2]
GO
CREATE INDEX [RNO_IDX] ON [dbo].[PROBTBL]([RNO]) WITH FILLFACTOR = 90 ON
[INDEXES2]
GO
CREATE INDEX [Clt_IDX] ON [dbo].[PROBTBL]([ACUSTID]) WITH FILLFACTOR = 90
ON [INDEXES]
GO
CREATE INDEX [LNITEMNO_IDX] ON [dbo].[PROBTBL]([LNITEMNO]) WITH
FILLFACTOR = 90 ON [INDEXES]
GO
CREATE INDEX [[tIDX_PROBTBL_COMPOSITE] ON [dbo].[PROBTBL]([ACUSTID],
[VNUM], [KNO], [ATYPEKIND], [LNITEMNO]) WITH FILLFACTOR = 90 ON [INDEXES2]
GO
CREATE INDEX [Vdors_IDX] ON [dbo].[PROBTBL]([VID]) WITH FILLFACTOR = 90
ON [INDEXES2]
GO
CREATE INDEX [SRVCLS_IDX] ON [dbo].[PROBTBL]([SRVCLS]) WITH FILLFACTOR =90 ON [INDEXES]
GO
CREATE INDEX [VNUM_IDX] ON [dbo].[PROBTBL]([VNUM]) WITH FILLFACTOR = 90
ON [INDEXES]
GO
CREATE INDEX [ADDRESS1_IDX] ON [dbo].[PROBTBL]([ADDRESS1]) WITH
FILLFACTOR = 90 ON [INDEXES2]
GO
CREATE INDEX [NVSTD_IDX] ON [dbo].[PROBTBL]([NVSTD]) WITH FILLFACTOR = 90
ON [INDEXES]
GO
CREATE INDEX [tIDX_PROBTBL_KNO_ACUSTID] ON [dbo].[PROBTBL]([KNO],
[ACUSTID]) WITH FILLFACTOR = 90 ON [INDEXES]
GO
CREATE INDEX [INVNMNO_IDX] ON [dbo].[PROBTBL]([INVNMNO]) WITH FILLFACTOR
= 90 ON [INDEXES2]
GO
CREATE INDEX [CTR_IDX] ON [dbo].[PROBTBL]([CTR]) WITH FILLFACTOR = 90 ON
[INDEXES2]
GO
CREATE INDEX [LastModified_IDX] ON [dbo].[PROBTBL]([LastModified]) WITH
FILLFACTOR = 90 ON [INDEXES]
GO
CREATE INDEX [IDX_TSBodyAtt] ON [dbo].[PROBTBL]([VID], [NVSTD],
[ATYPEKIND]) WITH FILLFACTOR = 90 ON [INDEXES]
GO
CREATE INDEX [tIDX_PROBTBL_ST_CLS] ON [dbo].[PROBTBL]([ST], [CLS]) WITH
FILLFACTOR = 90 ON [INDEXES2]
GO
"Mike Epprecht (SQL MVP)" wrote:
> Hi
> How are your indexes laid out compared to the database design? Is there a
> clustered index on a very selectable column? Maybe posting the DDL of the
> table would help us define a solution.|||You should really find the source of the block. For example, is it a query
an update or an insert? Is the block caused by (say) small "X" locks, or an
"S" lock at the table level. One bad query plan can cause an index scan
which in turn can cause an "S" lock on the table; disaster!
There is some information on obtaining the "block chain" that you could find
using a Google search. I found a script which identifies the blocking
connection, all blocked connections, and query buffer contents for the
blocker (no always helpfull, but sometimes) and I have found this informatin
indespensible.
Without knowing the real cause and nature of the blocking, you might miss
the best solution.|||Jan wrote:
> Thanks for the reply. Wow- 5TB, very impressive!
> We do have clustered indexes. Though not exactly the script (security
> reasons), this might give you an idea:
> CREATE TABLE [dbo].[PROBTBL] (
> [RNO] [int] IDENTITY (1, 1) NOT NULL ,
> [KNO] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
> [ANOTHKNO] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ATYPEKIND] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [LNITEMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ACUSTID] [int] NULL ,
> [VID] [int] NULL ,
> [VNUM] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [BR1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [BR2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [CTR] [varchar] (40) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ADDRESS1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ADDRESS2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [CTY] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ST] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ZC] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [CNTY] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [LTYPE] [char] (4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [OP] [datetime] NULL ,
> [CLS] [datetime] NULL ,
> [CLSCODE] [int] NULL ,
> [PRN] [decimal](11, 2) NULL ,
> [MNAMOUNT] [decimal](11, 2) NULL ,
> [LTCHG] [decimal](11, 2) NULL ,
> [LTPCT] [decimal](5, 2) NULL ,
> [CANCL] [datetime] NULL ,
> [TERM] [datetime] NULL ,
> [STS] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [S1] [int] NULL ,
> [S2] [int] NULL ,
> [INTPRD] [decimal](7, 2) NULL ,
> [SRVPRD] [decimal](7, 2) NULL ,
> [ORIG] [decimal](11, 2) NULL ,
> [LDESC] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [MLADDRESS1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
> , [MLADDRESS2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS
> NULL , [MLCTY] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS
> NULL , [MLST] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [MLZC] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [PPHONE] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [BPHONE] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [IRTE] [decimal](6, 3) NULL ,
> [DEF] [datetime] NULL ,
> [RFL] [datetime] NULL ,
> [INVID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [INVNMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [IVPRCT] [decimal](7, 3) NULL ,
> [INRID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [INSNMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [INSPCTT] [decimal](7, 3) NULL ,
> [INSCRTT] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [PRTY] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [PRCSSR] [int] NULL ,
> [NXTCD] [int] NULL ,
> [NXTDT] [datetime] NULL ,
> [LASTACC] [datetime] NULL ,
> [ONHLD] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ONHLD__4301EA8F]
> DEFAULT (0), [IVFEE] [decimal](9, 2) NULL ,
> [ISFEE] [decimal](9, 2) NULL ,
> [ADDFEEREQ] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ADDFEE__43F60EC8]
> DEFAULT (0),
> [ADFE] [decimal](9, 2) NULL ,
> [IVCD] [char] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ISCD] [char] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [HLDFRBK] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__HOLDFO__44EA3301]
> DEFAULT (0),
> [CNTSTD] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__CONTES__45DE573A]
> DEFAULT (0),
> [VNDPRD] [int] NULL ,
> [SVNXTCDE] [int] NULL ,
> [SVNXTDT] [datetime] NULL ,
> [DJSTBL] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ADJUST__46D27B73]
> DEFAULT (0),
> [XTCHNG] [datetime] NULL ,
> [SCRWNL] [datetime] NULL ,
> [BKNTFY] [datetime] NULL ,
> [PRPTND] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [RGNT] [datetime] NULL ,
> [MTRT] [datetime] NULL ,
> [SRVCLS] [datetime] NULL ,
> [NTS] [int] NULL ,
> [NVSTD] [int] NULL ,
> [PTYP] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [PSN] [int] NULL ,
> [MRSN] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [LstMdfd] [datetime] NULL ,
> [LstMdfdDTTM] [datetime] NULL ,
> [Lngnd] [varchar] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> CONSTRAINT [PK_PROBTBL] PRIMARY KEY CLUSTERED
> (
> [KNO]
> ) WITH FILLFACTOR = 40 ON [PRIMARY] ,
> CONSTRAINT [FK_Lngnd] FOREIGN KEY
> (
> [Lngnd]
> ) REFERENCES [LngndDefinition] (
> [Lngnd]
> )
> ) ON [PRIMARY] TEXTIMAGE_ON [TEXT3]
> GO
> CREATE INDEX [IXC_PROBTBL] ON [dbo].[PROBTBL]([KNO], [VID]) WITH
> FILLFACTOR = 90 ON [INDEXES2]
> GO
> CREATE INDEX [RNO_IDX] ON [dbo].[PROBTBL]([RNO]) WITH FILLFACTOR => 90 ON [INDEXES2]
> GO
> CREATE INDEX [Clt_IDX] ON [dbo].[PROBTBL]([ACUSTID]) WITH
> FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [LNITEMNO_IDX] ON [dbo].[PROBTBL]([LNITEMNO]) WITH
> FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [[tIDX_PROBTBL_COMPOSITE] ON [dbo].[PROBTBL]([ACUSTID],
> [VNUM], [KNO], [ATYPEKIND], [LNITEMNO]) WITH FILLFACTOR = 90 ON
> [INDEXES2] GO
> CREATE INDEX [Vdors_IDX] ON [dbo].[PROBTBL]([VID]) WITH FILLFACTOR
> = 90
> ON [INDEXES2]
> GO
> CREATE INDEX [SRVCLS_IDX] ON [dbo].[PROBTBL]([SRVCLS]) WITH
> FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [VNUM_IDX] ON [dbo].[PROBTBL]([VNUM]) WITH FILLFACTOR
> = 90
> ON [INDEXES]
> GO
> CREATE INDEX [ADDRESS1_IDX] ON [dbo].[PROBTBL]([ADDRESS1]) WITH
> FILLFACTOR = 90 ON [INDEXES2]
> GO
> CREATE INDEX [NVSTD_IDX] ON [dbo].[PROBTBL]([NVSTD]) WITH
> FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [tIDX_PROBTBL_KNO_ACUSTID] ON [dbo].[PROBTBL]([KNO],
> [ACUSTID]) WITH FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [INVNMNO_IDX] ON [dbo].[PROBTBL]([INVNMNO]) WITH
> FILLFACTOR = 90 ON [INDEXES2]
> GO
> CREATE INDEX [CTR_IDX] ON [dbo].[PROBTBL]([CTR]) WITH FILLFACTOR => 90 ON [INDEXES2]
> GO
> CREATE INDEX [LastModified_IDX] ON [dbo].[PROBTBL]([LastModified])
> WITH FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [IDX_TSBodyAtt] ON [dbo].[PROBTBL]([VID], [NVSTD],
> [ATYPEKIND]) WITH FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [tIDX_PROBTBL_ST_CLS] ON [dbo].[PROBTBL]([ST], [CLS])
> WITH FILLFACTOR = 90 ON [INDEXES2]
> GO
>
>
> "Mike Epprecht (SQL MVP)" wrote:
>> Hi
>> How are your indexes laid out compared to the database design? Is
>> there a clustered index on a very selectable column? Maybe posting
>> the DDL of the table would help us define a solution.
I would recommend some performance tuning on the server. There are a
number of products out there that may be able to point you to the
problem. You could also try using Profiler to see what SQL is taking a
lot of CPU or has long durations.
That looks like a large table with very wide rows. If you are reindexing
weekly, you may consider changing your fill factor so you don't leave
10% free space. That is, if you really don't require that much free
space for the inserts each week.
Plus your PK has a fill factor of 40!!!. That's means you are severly
degrading read performance. Probably more than 2X. You are in effect
increasing the size of the table by about 225%, causing all reads to
lock more than twice as many pages, causing updates to do the same, and
making your disks work so much harder to get at the data.
If you are reindexing every week, then you may want to leave the fill
factor at the default settings, or at least fill the pages 95% full.
Only highly transactional tables with clustered indexes that force page
breaks should really be using a lower fill factor.
I suspect that's a large portion of the problem.
David Gugick
Imceda Software
www.imceda.com|||To add to what David wrote.
You have a lot of indexes, the index maintenance overhead is big in such a
case. This can have horrible effects on blocking. Do you really need all
those indexes?
The databases we hit with such big volumes are fully normalized, with no
table exceeding 20 columns.
Your maximum row width would hit 600 bytes, without the TEXT datatype, so
having a fill factor of 40% on your clustered index really wastes space in
the cache and results in more I/O than necessary.
Also check that Auto update statistics is on for the DB (and especially on
big DB's like this, Torn Page Detection On)
Regards
--
Mike Epprecht, Microsoft SQL Server MVP
Zurich, Switzerland
IM: mike@.epprecht.net
MVP Program: http://www.microsoft.com/mvp
Blog: http://www.msmvps.com/epprecht/
"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:OzZllPJ3EHA.2624@.TK2MSFTNGP11.phx.gbl...
> Jan wrote:
> > Thanks for the reply. Wow- 5TB, very impressive!
> >
> > We do have clustered indexes. Though not exactly the script (security
> > reasons), this might give you an idea:
> >
> > CREATE TABLE [dbo].[PROBTBL] (
> > [RNO] [int] IDENTITY (1, 1) NOT NULL ,
> > [KNO] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
> > [ANOTHKNO] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [ATYPEKIND] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [LNITEMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [ACUSTID] [int] NULL ,
> > [VID] [int] NULL ,
> > [VNUM] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [BR1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [BR2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [CTR] [varchar] (40) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [ADDRESS1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [ADDRESS2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [CTY] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [ST] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [ZC] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [CNTY] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [LTYPE] [char] (4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [OP] [datetime] NULL ,
> > [CLS] [datetime] NULL ,
> > [CLSCODE] [int] NULL ,
> > [PRN] [decimal](11, 2) NULL ,
> > [MNAMOUNT] [decimal](11, 2) NULL ,
> > [LTCHG] [decimal](11, 2) NULL ,
> > [LTPCT] [decimal](5, 2) NULL ,
> > [CANCL] [datetime] NULL ,
> > [TERM] [datetime] NULL ,
> > [STS] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [S1] [int] NULL ,
> > [S2] [int] NULL ,
> > [INTPRD] [decimal](7, 2) NULL ,
> > [SRVPRD] [decimal](7, 2) NULL ,
> > [ORIG] [decimal](11, 2) NULL ,
> > [LDESC] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [MLADDRESS1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
> > , [MLADDRESS2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS
> > NULL , [MLCTY] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS
> > NULL , [MLST] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [MLZC] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [PPHONE] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [BPHONE] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [IRTE] [decimal](6, 3) NULL ,
> > [DEF] [datetime] NULL ,
> > [RFL] [datetime] NULL ,
> > [INVID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [INVNMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [IVPRCT] [decimal](7, 3) NULL ,
> > [INRID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [INSNMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [INSPCTT] [decimal](7, 3) NULL ,
> > [INSCRTT] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [PRTY] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [PRCSSR] [int] NULL ,
> > [NXTCD] [int] NULL ,
> > [NXTDT] [datetime] NULL ,
> > [LASTACC] [datetime] NULL ,
> > [ONHLD] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ONHLD__4301EA8F]
> > DEFAULT (0), [IVFEE] [decimal](9, 2) NULL ,
> > [ISFEE] [decimal](9, 2) NULL ,
> > [ADDFEEREQ] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ADDFEE__43F60EC8]
> > DEFAULT (0),
> > [ADFE] [decimal](9, 2) NULL ,
> > [IVCD] [char] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [ISCD] [char] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [HLDFRBK] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__HOLDFO__44EA3301]
> > DEFAULT (0),
> > [CNTSTD] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__CONTES__45DE573A]
> > DEFAULT (0),
> > [VNDPRD] [int] NULL ,
> > [SVNXTCDE] [int] NULL ,
> > [SVNXTDT] [datetime] NULL ,
> > [DJSTBL] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ADJUST__46D27B73]
> > DEFAULT (0),
> > [XTCHNG] [datetime] NULL ,
> > [SCRWNL] [datetime] NULL ,
> > [BKNTFY] [datetime] NULL ,
> > [PRPTND] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [RGNT] [datetime] NULL ,
> > [MTRT] [datetime] NULL ,
> > [SRVCLS] [datetime] NULL ,
> > [NTS] [int] NULL ,
> > [NVSTD] [int] NULL ,
> > [PTYP] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [PSN] [int] NULL ,
> > [MRSN] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > [LstMdfd] [datetime] NULL ,
> > [LstMdfdDTTM] [datetime] NULL ,
> > [Lngnd] [varchar] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> > CONSTRAINT [PK_PROBTBL] PRIMARY KEY CLUSTERED
> > (
> > [KNO]
> > ) WITH FILLFACTOR = 40 ON [PRIMARY] ,
> > CONSTRAINT [FK_Lngnd] FOREIGN KEY
> > (
> > [Lngnd]
> > ) REFERENCES [LngndDefinition] (
> > [Lngnd]
> > )
> > ) ON [PRIMARY] TEXTIMAGE_ON [TEXT3]
> > GO
> >
> > CREATE INDEX [IXC_PROBTBL] ON [dbo].[PROBTBL]([KNO], [VID]) WITH
> > FILLFACTOR = 90 ON [INDEXES2]
> > GO
> >
> > CREATE INDEX [RNO_IDX] ON [dbo].[PROBTBL]([RNO]) WITH FILLFACTOR => > 90 ON [INDEXES2]
> > GO
> >
> > CREATE INDEX [Clt_IDX] ON [dbo].[PROBTBL]([ACUSTID]) WITH
> > FILLFACTOR = 90 ON [INDEXES]
> > GO
> >
> > CREATE INDEX [LNITEMNO_IDX] ON [dbo].[PROBTBL]([LNITEMNO]) WITH
> > FILLFACTOR = 90 ON [INDEXES]
> > GO
> >
> > CREATE INDEX [[tIDX_PROBTBL_COMPOSITE] ON [dbo].[PROBTBL]([ACUSTID],
> > [VNUM], [KNO], [ATYPEKIND], [LNITEMNO]) WITH FILLFACTOR = 90 ON
> > [INDEXES2] GO
> >
> > CREATE INDEX [Vdors_IDX] ON [dbo].[PROBTBL]([VID]) WITH FILLFACTOR
> > = 90
> > ON [INDEXES2]
> > GO
> >
> > CREATE INDEX [SRVCLS_IDX] ON [dbo].[PROBTBL]([SRVCLS]) WITH
> > FILLFACTOR = 90 ON [INDEXES]
> > GO
> >
> > CREATE INDEX [VNUM_IDX] ON [dbo].[PROBTBL]([VNUM]) WITH FILLFACTOR
> > = 90
> > ON [INDEXES]
> > GO
> >
> > CREATE INDEX [ADDRESS1_IDX] ON [dbo].[PROBTBL]([ADDRESS1]) WITH
> > FILLFACTOR = 90 ON [INDEXES2]
> > GO
> >
> > CREATE INDEX [NVSTD_IDX] ON [dbo].[PROBTBL]([NVSTD]) WITH
> > FILLFACTOR = 90 ON [INDEXES]
> > GO
> >
> > CREATE INDEX [tIDX_PROBTBL_KNO_ACUSTID] ON [dbo].[PROBTBL]([KNO],
> > [ACUSTID]) WITH FILLFACTOR = 90 ON [INDEXES]
> > GO
> >
> > CREATE INDEX [INVNMNO_IDX] ON [dbo].[PROBTBL]([INVNMNO]) WITH
> > FILLFACTOR = 90 ON [INDEXES2]
> > GO
> >
> > CREATE INDEX [CTR_IDX] ON [dbo].[PROBTBL]([CTR]) WITH FILLFACTOR => > 90 ON [INDEXES2]
> > GO
> >
> > CREATE INDEX [LastModified_IDX] ON [dbo].[PROBTBL]([LastModified])
> > WITH FILLFACTOR = 90 ON [INDEXES]
> > GO
> >
> > CREATE INDEX [IDX_TSBodyAtt] ON [dbo].[PROBTBL]([VID], [NVSTD],
> > [ATYPEKIND]) WITH FILLFACTOR = 90 ON [INDEXES]
> > GO
> >
> > CREATE INDEX [tIDX_PROBTBL_ST_CLS] ON [dbo].[PROBTBL]([ST], [CLS])
> > WITH FILLFACTOR = 90 ON [INDEXES2]
> > GO
> >
> >
> >
> >
> >
> > "Mike Epprecht (SQL MVP)" wrote:
> >
> >> Hi
> >>
> >> How are your indexes laid out compared to the database design? Is
> >> there a clustered index on a very selectable column? Maybe posting
> >> the DDL of the table would help us define a solution.
> I would recommend some performance tuning on the server. There are a
> number of products out there that may be able to point you to the
> problem. You could also try using Profiler to see what SQL is taking a
> lot of CPU or has long durations.
> That looks like a large table with very wide rows. If you are reindexing
> weekly, you may consider changing your fill factor so you don't leave
> 10% free space. That is, if you really don't require that much free
> space for the inserts each week.
> Plus your PK has a fill factor of 40!!!. That's means you are severly
> degrading read performance. Probably more than 2X. You are in effect
> increasing the size of the table by about 225%, causing all reads to
> lock more than twice as many pages, causing updates to do the same, and
> making your disks work so much harder to get at the data.
> If you are reindexing every week, then you may want to leave the fill
> factor at the default settings, or at least fill the pages 95% full.
> Only highly transactional tables with clustered indexes that force page
> breaks should really be using a lower fill factor.
> I suspect that's a large portion of the problem.
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>|||You have many DateTime fields. You can cut the size of these in half by
converting to SmallDateTime. See BOL.
Paul.
"Jan" <Jan@.discussions.microsoft.com> wrote in message
news:5C690F4A-5E37-4927-9FCC-827D7C412469@.microsoft.com...
> Hi-
> I'm a developer on a team with a 200GB db. We inherited it, it definitely
> has design problems. About 3 weeks ago we started seeing a lot of
> blocking
> on our largest table which has affected our systems. We made no changes
> at
> the time the blocking started occurring. The db has been growing about 20
> GB/month, and we are re-indexing every weekend. We have significant
> hardware
> to support the database, and our DBAs do not believe it to be
> hardware-related.
> Is it possible that the SQL Server engine has problems when the db reaches
> such a large size? Has anyone else experienced a similar situation or
> could
> offer up ideas?
> Thx,
> Jan
Showing posts with label inherited. Show all posts
Showing posts with label inherited. Show all posts
Sunday, February 26, 2012
Extremely large db and blocking
Hi-
I'm a developer on a team with a 200GB db. We inherited it, it definitely
has design problems. About 3 weeks ago we started seeing a lot of blocking
on our largest table which has affected our systems. We made no changes at
the time the blocking started occurring. The db has been growing about 20
GB/month, and we are re-indexing every weekend. We have significant hardware
to support the database, and our DBAs do not believe it to be
hardware-related.
Is it possible that the SQL Server engine has problems when the db reaches
such a large size? Has anyone else experienced a similar situation or could
offer up ideas?
Thx,
Jan
Hi
How are your indexes laid out compared to the database design? Is there a
clustered index on a very selectable column? Maybe posting the DDL of the
table would help us define a solution.
On large tables, a wrong clustered index can cuase havoc with blocking.
I am running multiple 5Tb databases, on 8Gb RAM servers, with no issues.
Regards
Mike
"Jan" wrote:
> Hi-
> I'm a developer on a team with a 200GB db. We inherited it, it definitely
> has design problems. About 3 weeks ago we started seeing a lot of blocking
> on our largest table which has affected our systems. We made no changes at
> the time the blocking started occurring. The db has been growing about 20
> GB/month, and we are re-indexing every weekend. We have significant hardware
> to support the database, and our DBAs do not believe it to be
> hardware-related.
> Is it possible that the SQL Server engine has problems when the db reaches
> such a large size? Has anyone else experienced a similar situation or could
> offer up ideas?
> Thx,
> Jan
|||Thanks for the reply. Wow- 5TB, very impressive!
We do have clustered indexes. Though not exactly the script (security
reasons), this might give you an idea:
CREATE TABLE [dbo].[PROBTBL] (
[RNO] [int] IDENTITY (1, 1) NOT NULL ,
[KNO] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[ANOTHKNO] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ATYPEKIND] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[LNITEMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ACUSTID] [int] NULL ,
[VID] [int] NULL ,
[VNUM] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BR1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BR2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CTR] [varchar] (40) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ADDRESS1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ADDRESS2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CTY] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ST] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ZC] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CNTY] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[LTYPE] [char] (4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[OP] [datetime] NULL ,
[CLS] [datetime] NULL ,
[CLSCODE] [int] NULL ,
[PRN] [decimal](11, 2) NULL ,
[MNAMOUNT] [decimal](11, 2) NULL ,
[LTCHG] [decimal](11, 2) NULL ,
[LTPCT] [decimal](5, 2) NULL ,
[CANCL] [datetime] NULL ,
[TERM] [datetime] NULL ,
[STS] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[S1] [int] NULL ,
[S2] [int] NULL ,
[INTPRD] [decimal](7, 2) NULL ,
[SRVPRD] [decimal](7, 2) NULL ,
[ORIG] [decimal](11, 2) NULL ,
[LDESC] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MLADDRESS1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MLADDRESS2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MLCTY] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MLST] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MLZC] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PPHONE] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BPHONE] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[IRTE] [decimal](6, 3) NULL ,
[DEF] [datetime] NULL ,
[RFL] [datetime] NULL ,
[INVID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[INVNMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[IVPRCT] [decimal](7, 3) NULL ,
[INRID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[INSNMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[INSPCTT] [decimal](7, 3) NULL ,
[INSCRTT] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PRTY] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PRCSSR] [int] NULL ,
[NXTCD] [int] NULL ,
[NXTDT] [datetime] NULL ,
[LASTACC] [datetime] NULL ,
[ONHLD] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ONHLD__4301EA8F] DEFAULT (0),
[IVFEE] [decimal](9, 2) NULL ,
[ISFEE] [decimal](9, 2) NULL ,
[ADDFEEREQ] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ADDFEE__43F60EC8]
DEFAULT (0),
[ADFE] [decimal](9, 2) NULL ,
[IVCD] [char] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ISCD] [char] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[HLDFRBK] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__HOLDFO__44EA3301] DEFAULT
(0),
[CNTSTD] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__CONTES__45DE573A] DEFAULT
(0),
[VNDPRD] [int] NULL ,
[SVNXTCDE] [int] NULL ,
[SVNXTDT] [datetime] NULL ,
[DJSTBL] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ADJUST__46D27B73] DEFAULT
(0),
[XTCHNG] [datetime] NULL ,
[SCRWNL] [datetime] NULL ,
[BKNTFY] [datetime] NULL ,
[PRPTND] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[RGNT] [datetime] NULL ,
[MTRT] [datetime] NULL ,
[SRVCLS] [datetime] NULL ,
[NTS] [int] NULL ,
[NVSTD] [int] NULL ,
[PTYP] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PSN] [int] NULL ,
[MRSN] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[LstMdfd] [datetime] NULL ,
[LstMdfdDTTM] [datetime] NULL ,
[Lngnd] [varchar] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
CONSTRAINT [PK_PROBTBL] PRIMARY KEY CLUSTERED
(
[KNO]
) WITH FILLFACTOR = 40 ON [PRIMARY] ,
CONSTRAINT [FK_Lngnd] FOREIGN KEY
(
[Lngnd]
) REFERENCES [LngndDefinition] (
[Lngnd]
)
) ON [PRIMARY] TEXTIMAGE_ON [TEXT3]
GO
CREATE INDEX [IXC_PROBTBL] ON [dbo].[PROBTBL]([KNO], [VID]) WITH
FILLFACTOR = 90 ON [INDEXES2]
GO
CREATE INDEX [RNO_IDX] ON [dbo].[PROBTBL]([RNO]) WITH FILLFACTOR = 90 ON
[INDEXES2]
GO
CREATE INDEX [Clt_IDX] ON [dbo].[PROBTBL]([ACUSTID]) WITH FILLFACTOR = 90
ON [INDEXES]
GO
CREATE INDEX [LNITEMNO_IDX] ON [dbo].[PROBTBL]([LNITEMNO]) WITH
FILLFACTOR = 90 ON [INDEXES]
GO
CREATE INDEX [[tIDX_PROBTBL_COMPOSITE] ON [dbo].[PROBTBL]([ACUSTID],
[VNUM], [KNO], [ATYPEKIND], [LNITEMNO]) WITH FILLFACTOR = 90 ON [INDEXES2]
GO
CREATE INDEX [Vdors_IDX] ON [dbo].[PROBTBL]([VID]) WITH FILLFACTOR = 90
ON [INDEXES2]
GO
CREATE INDEX [SRVCLS_IDX] ON [dbo].[PROBTBL]([SRVCLS]) WITH FILLFACTOR =
90 ON [INDEXES]
GO
CREATE INDEX [VNUM_IDX] ON [dbo].[PROBTBL]([VNUM]) WITH FILLFACTOR = 90
ON [INDEXES]
GO
CREATE INDEX [ADDRESS1_IDX] ON [dbo].[PROBTBL]([ADDRESS1]) WITH
FILLFACTOR = 90 ON [INDEXES2]
GO
CREATE INDEX [NVSTD_IDX] ON [dbo].[PROBTBL]([NVSTD]) WITH FILLFACTOR = 90
ON [INDEXES]
GO
CREATE INDEX [tIDX_PROBTBL_KNO_ACUSTID] ON [dbo].[PROBTBL]([KNO],
[ACUSTID]) WITH FILLFACTOR = 90 ON [INDEXES]
GO
CREATE INDEX [INVNMNO_IDX] ON [dbo].[PROBTBL]([INVNMNO]) WITH FILLFACTOR
= 90 ON [INDEXES2]
GO
CREATE INDEX [CTR_IDX] ON [dbo].[PROBTBL]([CTR]) WITH FILLFACTOR = 90 ON
[INDEXES2]
GO
CREATE INDEX [LastModified_IDX] ON [dbo].[PROBTBL]([LastModified]) WITH
FILLFACTOR = 90 ON [INDEXES]
GO
CREATE INDEX [IDX_TSBodyAtt] ON [dbo].[PROBTBL]([VID], [NVSTD],
[ATYPEKIND]) WITH FILLFACTOR = 90 ON [INDEXES]
GO
CREATE INDEX [tIDX_PROBTBL_ST_CLS] ON [dbo].[PROBTBL]([ST], [CLS]) WITH
FILLFACTOR = 90 ON [INDEXES2]
GO
"Mike Epprecht (SQL MVP)" wrote:
> Hi
> How are your indexes laid out compared to the database design? Is there a
> clustered index on a very selectable column? Maybe posting the DDL of the
> table would help us define a solution.
|||You should really find the source of the block. For example, is it a query
an update or an insert? Is the block caused by (say) small "X" locks, or an
"S" lock at the table level. One bad query plan can cause an index scan
which in turn can cause an "S" lock on the table; disaster!
There is some information on obtaining the "block chain" that you could find
using a Google search. I found a script which identifies the blocking
connection, all blocked connections, and query buffer contents for the
blocker (no always helpfull, but sometimes) and I have found this informatin
indespensible.
Without knowing the real cause and nature of the blocking, you might miss
the best solution.
|||Jan wrote:[vbcol=seagreen]
> Thanks for the reply. Wow- 5TB, very impressive!
> We do have clustered indexes. Though not exactly the script (security
> reasons), this might give you an idea:
> CREATE TABLE [dbo].[PROBTBL] (
> [RNO] [int] IDENTITY (1, 1) NOT NULL ,
> [KNO] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
> [ANOTHKNO] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ATYPEKIND] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [LNITEMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ACUSTID] [int] NULL ,
> [VID] [int] NULL ,
> [VNUM] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [BR1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [BR2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [CTR] [varchar] (40) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ADDRESS1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ADDRESS2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [CTY] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ST] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ZC] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [CNTY] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [LTYPE] [char] (4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [OP] [datetime] NULL ,
> [CLS] [datetime] NULL ,
> [CLSCODE] [int] NULL ,
> [PRN] [decimal](11, 2) NULL ,
> [MNAMOUNT] [decimal](11, 2) NULL ,
> [LTCHG] [decimal](11, 2) NULL ,
> [LTPCT] [decimal](5, 2) NULL ,
> [CANCL] [datetime] NULL ,
> [TERM] [datetime] NULL ,
> [STS] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [S1] [int] NULL ,
> [S2] [int] NULL ,
> [INTPRD] [decimal](7, 2) NULL ,
> [SRVPRD] [decimal](7, 2) NULL ,
> [ORIG] [decimal](11, 2) NULL ,
> [LDESC] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [MLADDRESS1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
> , [MLADDRESS2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS
> NULL , [MLCTY] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS
> NULL , [MLST] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [MLZC] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [PPHONE] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [BPHONE] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [IRTE] [decimal](6, 3) NULL ,
> [DEF] [datetime] NULL ,
> [RFL] [datetime] NULL ,
> [INVID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [INVNMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [IVPRCT] [decimal](7, 3) NULL ,
> [INRID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [INSNMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [INSPCTT] [decimal](7, 3) NULL ,
> [INSCRTT] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [PRTY] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [PRCSSR] [int] NULL ,
> [NXTCD] [int] NULL ,
> [NXTDT] [datetime] NULL ,
> [LASTACC] [datetime] NULL ,
> [ONHLD] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ONHLD__4301EA8F]
> DEFAULT (0), [IVFEE] [decimal](9, 2) NULL ,
> [ISFEE] [decimal](9, 2) NULL ,
> [ADDFEEREQ] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ADDFEE__43F60EC8]
> DEFAULT (0),
> [ADFE] [decimal](9, 2) NULL ,
> [IVCD] [char] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ISCD] [char] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [HLDFRBK] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__HOLDFO__44EA3301]
> DEFAULT (0),
> [CNTSTD] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__CONTES__45DE573A]
> DEFAULT (0),
> [VNDPRD] [int] NULL ,
> [SVNXTCDE] [int] NULL ,
> [SVNXTDT] [datetime] NULL ,
> [DJSTBL] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ADJUST__46D27B73]
> DEFAULT (0),
> [XTCHNG] [datetime] NULL ,
> [SCRWNL] [datetime] NULL ,
> [BKNTFY] [datetime] NULL ,
> [PRPTND] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [RGNT] [datetime] NULL ,
> [MTRT] [datetime] NULL ,
> [SRVCLS] [datetime] NULL ,
> [NTS] [int] NULL ,
> [NVSTD] [int] NULL ,
> [PTYP] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [PSN] [int] NULL ,
> [MRSN] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [LstMdfd] [datetime] NULL ,
> [LstMdfdDTTM] [datetime] NULL ,
> [Lngnd] [varchar] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> CONSTRAINT [PK_PROBTBL] PRIMARY KEY CLUSTERED
> (
> [KNO]
> ) WITH FILLFACTOR = 40 ON [PRIMARY] ,
> CONSTRAINT [FK_Lngnd] FOREIGN KEY
> (
> [Lngnd]
> ) REFERENCES [LngndDefinition] (
> [Lngnd]
> )
> ) ON [PRIMARY] TEXTIMAGE_ON [TEXT3]
> GO
> CREATE INDEX [IXC_PROBTBL] ON [dbo].[PROBTBL]([KNO], [VID]) WITH
> FILLFACTOR = 90 ON [INDEXES2]
> GO
> CREATE INDEX [RNO_IDX] ON [dbo].[PROBTBL]([RNO]) WITH FILLFACTOR =
> 90 ON [INDEXES2]
> GO
> CREATE INDEX [Clt_IDX] ON [dbo].[PROBTBL]([ACUSTID]) WITH
> FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [LNITEMNO_IDX] ON [dbo].[PROBTBL]([LNITEMNO]) WITH
> FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [[tIDX_PROBTBL_COMPOSITE] ON [dbo].[PROBTBL]([ACUSTID],
> [VNUM], [KNO], [ATYPEKIND], [LNITEMNO]) WITH FILLFACTOR = 90 ON
> [INDEXES2] GO
> CREATE INDEX [Vdors_IDX] ON [dbo].[PROBTBL]([VID]) WITH FILLFACTOR
> = 90
> ON [INDEXES2]
> GO
> CREATE INDEX [SRVCLS_IDX] ON [dbo].[PROBTBL]([SRVCLS]) WITH
> FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [VNUM_IDX] ON [dbo].[PROBTBL]([VNUM]) WITH FILLFACTOR
> = 90
> ON [INDEXES]
> GO
> CREATE INDEX [ADDRESS1_IDX] ON [dbo].[PROBTBL]([ADDRESS1]) WITH
> FILLFACTOR = 90 ON [INDEXES2]
> GO
> CREATE INDEX [NVSTD_IDX] ON [dbo].[PROBTBL]([NVSTD]) WITH
> FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [tIDX_PROBTBL_KNO_ACUSTID] ON [dbo].[PROBTBL]([KNO],
> [ACUSTID]) WITH FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [INVNMNO_IDX] ON [dbo].[PROBTBL]([INVNMNO]) WITH
> FILLFACTOR = 90 ON [INDEXES2]
> GO
> CREATE INDEX [CTR_IDX] ON [dbo].[PROBTBL]([CTR]) WITH FILLFACTOR =
> 90 ON [INDEXES2]
> GO
> CREATE INDEX [LastModified_IDX] ON [dbo].[PROBTBL]([LastModified])
> WITH FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [IDX_TSBodyAtt] ON [dbo].[PROBTBL]([VID], [NVSTD],
> [ATYPEKIND]) WITH FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [tIDX_PROBTBL_ST_CLS] ON [dbo].[PROBTBL]([ST], [CLS])
> WITH FILLFACTOR = 90 ON [INDEXES2]
> GO
>
>
> "Mike Epprecht (SQL MVP)" wrote:
I would recommend some performance tuning on the server. There are a
number of products out there that may be able to point you to the
problem. You could also try using Profiler to see what SQL is taking a
lot of CPU or has long durations.
That looks like a large table with very wide rows. If you are reindexing
weekly, you may consider changing your fill factor so you don't leave
10% free space. That is, if you really don't require that much free
space for the inserts each week.
Plus your PK has a fill factor of 40!!!. That's means you are severly
degrading read performance. Probably more than 2X. You are in effect
increasing the size of the table by about 225%, causing all reads to
lock more than twice as many pages, causing updates to do the same, and
making your disks work so much harder to get at the data.
If you are reindexing every week, then you may want to leave the fill
factor at the default settings, or at least fill the pages 95% full.
Only highly transactional tables with clustered indexes that force page
breaks should really be using a lower fill factor.
I suspect that's a large portion of the problem.
David Gugick
Imceda Software
www.imceda.com
|||To add to what David wrote.
You have a lot of indexes, the index maintenance overhead is big in such a
case. This can have horrible effects on blocking. Do you really need all
those indexes?
The databases we hit with such big volumes are fully normalized, with no
table exceeding 20 columns.
Your maximum row width would hit 600 bytes, without the TEXT datatype, so
having a fill factor of 40% on your clustered index really wastes space in
the cache and results in more I/O than necessary.
Also check that Auto update statistics is on for the DB (and especially on
big DB's like this, Torn Page Detection On)
Regards
Mike Epprecht, Microsoft SQL Server MVP
Zurich, Switzerland
IM: mike@.epprecht.net
MVP Program: http://www.microsoft.com/mvp
Blog: http://www.msmvps.com/epprecht/
"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:OzZllPJ3EHA.2624@.TK2MSFTNGP11.phx.gbl...
> Jan wrote:
> I would recommend some performance tuning on the server. There are a
> number of products out there that may be able to point you to the
> problem. You could also try using Profiler to see what SQL is taking a
> lot of CPU or has long durations.
> That looks like a large table with very wide rows. If you are reindexing
> weekly, you may consider changing your fill factor so you don't leave
> 10% free space. That is, if you really don't require that much free
> space for the inserts each week.
> Plus your PK has a fill factor of 40!!!. That's means you are severly
> degrading read performance. Probably more than 2X. You are in effect
> increasing the size of the table by about 225%, causing all reads to
> lock more than twice as many pages, causing updates to do the same, and
> making your disks work so much harder to get at the data.
> If you are reindexing every week, then you may want to leave the fill
> factor at the default settings, or at least fill the pages 95% full.
> Only highly transactional tables with clustered indexes that force page
> breaks should really be using a lower fill factor.
> I suspect that's a large portion of the problem.
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
|||You have many DateTime fields. You can cut the size of these in half by
converting to SmallDateTime. See BOL.
Paul.
"Jan" <Jan@.discussions.microsoft.com> wrote in message
news:5C690F4A-5E37-4927-9FCC-827D7C412469@.microsoft.com...
> Hi-
> I'm a developer on a team with a 200GB db. We inherited it, it definitely
> has design problems. About 3 weeks ago we started seeing a lot of
> blocking
> on our largest table which has affected our systems. We made no changes
> at
> the time the blocking started occurring. The db has been growing about 20
> GB/month, and we are re-indexing every weekend. We have significant
> hardware
> to support the database, and our DBAs do not believe it to be
> hardware-related.
> Is it possible that the SQL Server engine has problems when the db reaches
> such a large size? Has anyone else experienced a similar situation or
> could
> offer up ideas?
> Thx,
> Jan
I'm a developer on a team with a 200GB db. We inherited it, it definitely
has design problems. About 3 weeks ago we started seeing a lot of blocking
on our largest table which has affected our systems. We made no changes at
the time the blocking started occurring. The db has been growing about 20
GB/month, and we are re-indexing every weekend. We have significant hardware
to support the database, and our DBAs do not believe it to be
hardware-related.
Is it possible that the SQL Server engine has problems when the db reaches
such a large size? Has anyone else experienced a similar situation or could
offer up ideas?
Thx,
Jan
Hi
How are your indexes laid out compared to the database design? Is there a
clustered index on a very selectable column? Maybe posting the DDL of the
table would help us define a solution.
On large tables, a wrong clustered index can cuase havoc with blocking.
I am running multiple 5Tb databases, on 8Gb RAM servers, with no issues.
Regards
Mike
"Jan" wrote:
> Hi-
> I'm a developer on a team with a 200GB db. We inherited it, it definitely
> has design problems. About 3 weeks ago we started seeing a lot of blocking
> on our largest table which has affected our systems. We made no changes at
> the time the blocking started occurring. The db has been growing about 20
> GB/month, and we are re-indexing every weekend. We have significant hardware
> to support the database, and our DBAs do not believe it to be
> hardware-related.
> Is it possible that the SQL Server engine has problems when the db reaches
> such a large size? Has anyone else experienced a similar situation or could
> offer up ideas?
> Thx,
> Jan
|||Thanks for the reply. Wow- 5TB, very impressive!
We do have clustered indexes. Though not exactly the script (security
reasons), this might give you an idea:
CREATE TABLE [dbo].[PROBTBL] (
[RNO] [int] IDENTITY (1, 1) NOT NULL ,
[KNO] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[ANOTHKNO] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ATYPEKIND] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[LNITEMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ACUSTID] [int] NULL ,
[VID] [int] NULL ,
[VNUM] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BR1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BR2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CTR] [varchar] (40) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ADDRESS1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ADDRESS2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CTY] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ST] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ZC] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CNTY] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[LTYPE] [char] (4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[OP] [datetime] NULL ,
[CLS] [datetime] NULL ,
[CLSCODE] [int] NULL ,
[PRN] [decimal](11, 2) NULL ,
[MNAMOUNT] [decimal](11, 2) NULL ,
[LTCHG] [decimal](11, 2) NULL ,
[LTPCT] [decimal](5, 2) NULL ,
[CANCL] [datetime] NULL ,
[TERM] [datetime] NULL ,
[STS] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[S1] [int] NULL ,
[S2] [int] NULL ,
[INTPRD] [decimal](7, 2) NULL ,
[SRVPRD] [decimal](7, 2) NULL ,
[ORIG] [decimal](11, 2) NULL ,
[LDESC] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MLADDRESS1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MLADDRESS2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MLCTY] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MLST] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MLZC] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PPHONE] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BPHONE] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[IRTE] [decimal](6, 3) NULL ,
[DEF] [datetime] NULL ,
[RFL] [datetime] NULL ,
[INVID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[INVNMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[IVPRCT] [decimal](7, 3) NULL ,
[INRID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[INSNMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[INSPCTT] [decimal](7, 3) NULL ,
[INSCRTT] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PRTY] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PRCSSR] [int] NULL ,
[NXTCD] [int] NULL ,
[NXTDT] [datetime] NULL ,
[LASTACC] [datetime] NULL ,
[ONHLD] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ONHLD__4301EA8F] DEFAULT (0),
[IVFEE] [decimal](9, 2) NULL ,
[ISFEE] [decimal](9, 2) NULL ,
[ADDFEEREQ] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ADDFEE__43F60EC8]
DEFAULT (0),
[ADFE] [decimal](9, 2) NULL ,
[IVCD] [char] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ISCD] [char] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[HLDFRBK] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__HOLDFO__44EA3301] DEFAULT
(0),
[CNTSTD] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__CONTES__45DE573A] DEFAULT
(0),
[VNDPRD] [int] NULL ,
[SVNXTCDE] [int] NULL ,
[SVNXTDT] [datetime] NULL ,
[DJSTBL] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ADJUST__46D27B73] DEFAULT
(0),
[XTCHNG] [datetime] NULL ,
[SCRWNL] [datetime] NULL ,
[BKNTFY] [datetime] NULL ,
[PRPTND] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[RGNT] [datetime] NULL ,
[MTRT] [datetime] NULL ,
[SRVCLS] [datetime] NULL ,
[NTS] [int] NULL ,
[NVSTD] [int] NULL ,
[PTYP] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PSN] [int] NULL ,
[MRSN] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[LstMdfd] [datetime] NULL ,
[LstMdfdDTTM] [datetime] NULL ,
[Lngnd] [varchar] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
CONSTRAINT [PK_PROBTBL] PRIMARY KEY CLUSTERED
(
[KNO]
) WITH FILLFACTOR = 40 ON [PRIMARY] ,
CONSTRAINT [FK_Lngnd] FOREIGN KEY
(
[Lngnd]
) REFERENCES [LngndDefinition] (
[Lngnd]
)
) ON [PRIMARY] TEXTIMAGE_ON [TEXT3]
GO
CREATE INDEX [IXC_PROBTBL] ON [dbo].[PROBTBL]([KNO], [VID]) WITH
FILLFACTOR = 90 ON [INDEXES2]
GO
CREATE INDEX [RNO_IDX] ON [dbo].[PROBTBL]([RNO]) WITH FILLFACTOR = 90 ON
[INDEXES2]
GO
CREATE INDEX [Clt_IDX] ON [dbo].[PROBTBL]([ACUSTID]) WITH FILLFACTOR = 90
ON [INDEXES]
GO
CREATE INDEX [LNITEMNO_IDX] ON [dbo].[PROBTBL]([LNITEMNO]) WITH
FILLFACTOR = 90 ON [INDEXES]
GO
CREATE INDEX [[tIDX_PROBTBL_COMPOSITE] ON [dbo].[PROBTBL]([ACUSTID],
[VNUM], [KNO], [ATYPEKIND], [LNITEMNO]) WITH FILLFACTOR = 90 ON [INDEXES2]
GO
CREATE INDEX [Vdors_IDX] ON [dbo].[PROBTBL]([VID]) WITH FILLFACTOR = 90
ON [INDEXES2]
GO
CREATE INDEX [SRVCLS_IDX] ON [dbo].[PROBTBL]([SRVCLS]) WITH FILLFACTOR =
90 ON [INDEXES]
GO
CREATE INDEX [VNUM_IDX] ON [dbo].[PROBTBL]([VNUM]) WITH FILLFACTOR = 90
ON [INDEXES]
GO
CREATE INDEX [ADDRESS1_IDX] ON [dbo].[PROBTBL]([ADDRESS1]) WITH
FILLFACTOR = 90 ON [INDEXES2]
GO
CREATE INDEX [NVSTD_IDX] ON [dbo].[PROBTBL]([NVSTD]) WITH FILLFACTOR = 90
ON [INDEXES]
GO
CREATE INDEX [tIDX_PROBTBL_KNO_ACUSTID] ON [dbo].[PROBTBL]([KNO],
[ACUSTID]) WITH FILLFACTOR = 90 ON [INDEXES]
GO
CREATE INDEX [INVNMNO_IDX] ON [dbo].[PROBTBL]([INVNMNO]) WITH FILLFACTOR
= 90 ON [INDEXES2]
GO
CREATE INDEX [CTR_IDX] ON [dbo].[PROBTBL]([CTR]) WITH FILLFACTOR = 90 ON
[INDEXES2]
GO
CREATE INDEX [LastModified_IDX] ON [dbo].[PROBTBL]([LastModified]) WITH
FILLFACTOR = 90 ON [INDEXES]
GO
CREATE INDEX [IDX_TSBodyAtt] ON [dbo].[PROBTBL]([VID], [NVSTD],
[ATYPEKIND]) WITH FILLFACTOR = 90 ON [INDEXES]
GO
CREATE INDEX [tIDX_PROBTBL_ST_CLS] ON [dbo].[PROBTBL]([ST], [CLS]) WITH
FILLFACTOR = 90 ON [INDEXES2]
GO
"Mike Epprecht (SQL MVP)" wrote:
> Hi
> How are your indexes laid out compared to the database design? Is there a
> clustered index on a very selectable column? Maybe posting the DDL of the
> table would help us define a solution.
|||You should really find the source of the block. For example, is it a query
an update or an insert? Is the block caused by (say) small "X" locks, or an
"S" lock at the table level. One bad query plan can cause an index scan
which in turn can cause an "S" lock on the table; disaster!
There is some information on obtaining the "block chain" that you could find
using a Google search. I found a script which identifies the blocking
connection, all blocked connections, and query buffer contents for the
blocker (no always helpfull, but sometimes) and I have found this informatin
indespensible.
Without knowing the real cause and nature of the blocking, you might miss
the best solution.
|||Jan wrote:[vbcol=seagreen]
> Thanks for the reply. Wow- 5TB, very impressive!
> We do have clustered indexes. Though not exactly the script (security
> reasons), this might give you an idea:
> CREATE TABLE [dbo].[PROBTBL] (
> [RNO] [int] IDENTITY (1, 1) NOT NULL ,
> [KNO] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
> [ANOTHKNO] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ATYPEKIND] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [LNITEMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ACUSTID] [int] NULL ,
> [VID] [int] NULL ,
> [VNUM] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [BR1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [BR2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [CTR] [varchar] (40) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ADDRESS1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ADDRESS2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [CTY] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ST] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ZC] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [CNTY] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [LTYPE] [char] (4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [OP] [datetime] NULL ,
> [CLS] [datetime] NULL ,
> [CLSCODE] [int] NULL ,
> [PRN] [decimal](11, 2) NULL ,
> [MNAMOUNT] [decimal](11, 2) NULL ,
> [LTCHG] [decimal](11, 2) NULL ,
> [LTPCT] [decimal](5, 2) NULL ,
> [CANCL] [datetime] NULL ,
> [TERM] [datetime] NULL ,
> [STS] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [S1] [int] NULL ,
> [S2] [int] NULL ,
> [INTPRD] [decimal](7, 2) NULL ,
> [SRVPRD] [decimal](7, 2) NULL ,
> [ORIG] [decimal](11, 2) NULL ,
> [LDESC] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [MLADDRESS1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
> , [MLADDRESS2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS
> NULL , [MLCTY] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS
> NULL , [MLST] [char] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [MLZC] [char] (9) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [PPHONE] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [BPHONE] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [IRTE] [decimal](6, 3) NULL ,
> [DEF] [datetime] NULL ,
> [RFL] [datetime] NULL ,
> [INVID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [INVNMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [IVPRCT] [decimal](7, 3) NULL ,
> [INRID] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [INSNMNO] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [INSPCTT] [decimal](7, 3) NULL ,
> [INSCRTT] [varchar] (16) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [PRTY] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [PRCSSR] [int] NULL ,
> [NXTCD] [int] NULL ,
> [NXTDT] [datetime] NULL ,
> [LASTACC] [datetime] NULL ,
> [ONHLD] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ONHLD__4301EA8F]
> DEFAULT (0), [IVFEE] [decimal](9, 2) NULL ,
> [ISFEE] [decimal](9, 2) NULL ,
> [ADDFEEREQ] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ADDFEE__43F60EC8]
> DEFAULT (0),
> [ADFE] [decimal](9, 2) NULL ,
> [IVCD] [char] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [ISCD] [char] (6) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [HLDFRBK] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__HOLDFO__44EA3301]
> DEFAULT (0),
> [CNTSTD] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__CONTES__45DE573A]
> DEFAULT (0),
> [VNDPRD] [int] NULL ,
> [SVNXTCDE] [int] NULL ,
> [SVNXTDT] [datetime] NULL ,
> [DJSTBL] [bit] NOT NULL CONSTRAINT [DF__PROBTBL__ADJUST__46D27B73]
> DEFAULT (0),
> [XTCHNG] [datetime] NULL ,
> [SCRWNL] [datetime] NULL ,
> [BKNTFY] [datetime] NULL ,
> [PRPTND] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [RGNT] [datetime] NULL ,
> [MTRT] [datetime] NULL ,
> [SRVCLS] [datetime] NULL ,
> [NTS] [int] NULL ,
> [NVSTD] [int] NULL ,
> [PTYP] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [PSN] [int] NULL ,
> [MRSN] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [LstMdfd] [datetime] NULL ,
> [LstMdfdDTTM] [datetime] NULL ,
> [Lngnd] [varchar] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> CONSTRAINT [PK_PROBTBL] PRIMARY KEY CLUSTERED
> (
> [KNO]
> ) WITH FILLFACTOR = 40 ON [PRIMARY] ,
> CONSTRAINT [FK_Lngnd] FOREIGN KEY
> (
> [Lngnd]
> ) REFERENCES [LngndDefinition] (
> [Lngnd]
> )
> ) ON [PRIMARY] TEXTIMAGE_ON [TEXT3]
> GO
> CREATE INDEX [IXC_PROBTBL] ON [dbo].[PROBTBL]([KNO], [VID]) WITH
> FILLFACTOR = 90 ON [INDEXES2]
> GO
> CREATE INDEX [RNO_IDX] ON [dbo].[PROBTBL]([RNO]) WITH FILLFACTOR =
> 90 ON [INDEXES2]
> GO
> CREATE INDEX [Clt_IDX] ON [dbo].[PROBTBL]([ACUSTID]) WITH
> FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [LNITEMNO_IDX] ON [dbo].[PROBTBL]([LNITEMNO]) WITH
> FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [[tIDX_PROBTBL_COMPOSITE] ON [dbo].[PROBTBL]([ACUSTID],
> [VNUM], [KNO], [ATYPEKIND], [LNITEMNO]) WITH FILLFACTOR = 90 ON
> [INDEXES2] GO
> CREATE INDEX [Vdors_IDX] ON [dbo].[PROBTBL]([VID]) WITH FILLFACTOR
> = 90
> ON [INDEXES2]
> GO
> CREATE INDEX [SRVCLS_IDX] ON [dbo].[PROBTBL]([SRVCLS]) WITH
> FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [VNUM_IDX] ON [dbo].[PROBTBL]([VNUM]) WITH FILLFACTOR
> = 90
> ON [INDEXES]
> GO
> CREATE INDEX [ADDRESS1_IDX] ON [dbo].[PROBTBL]([ADDRESS1]) WITH
> FILLFACTOR = 90 ON [INDEXES2]
> GO
> CREATE INDEX [NVSTD_IDX] ON [dbo].[PROBTBL]([NVSTD]) WITH
> FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [tIDX_PROBTBL_KNO_ACUSTID] ON [dbo].[PROBTBL]([KNO],
> [ACUSTID]) WITH FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [INVNMNO_IDX] ON [dbo].[PROBTBL]([INVNMNO]) WITH
> FILLFACTOR = 90 ON [INDEXES2]
> GO
> CREATE INDEX [CTR_IDX] ON [dbo].[PROBTBL]([CTR]) WITH FILLFACTOR =
> 90 ON [INDEXES2]
> GO
> CREATE INDEX [LastModified_IDX] ON [dbo].[PROBTBL]([LastModified])
> WITH FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [IDX_TSBodyAtt] ON [dbo].[PROBTBL]([VID], [NVSTD],
> [ATYPEKIND]) WITH FILLFACTOR = 90 ON [INDEXES]
> GO
> CREATE INDEX [tIDX_PROBTBL_ST_CLS] ON [dbo].[PROBTBL]([ST], [CLS])
> WITH FILLFACTOR = 90 ON [INDEXES2]
> GO
>
>
> "Mike Epprecht (SQL MVP)" wrote:
I would recommend some performance tuning on the server. There are a
number of products out there that may be able to point you to the
problem. You could also try using Profiler to see what SQL is taking a
lot of CPU or has long durations.
That looks like a large table with very wide rows. If you are reindexing
weekly, you may consider changing your fill factor so you don't leave
10% free space. That is, if you really don't require that much free
space for the inserts each week.
Plus your PK has a fill factor of 40!!!. That's means you are severly
degrading read performance. Probably more than 2X. You are in effect
increasing the size of the table by about 225%, causing all reads to
lock more than twice as many pages, causing updates to do the same, and
making your disks work so much harder to get at the data.
If you are reindexing every week, then you may want to leave the fill
factor at the default settings, or at least fill the pages 95% full.
Only highly transactional tables with clustered indexes that force page
breaks should really be using a lower fill factor.
I suspect that's a large portion of the problem.
David Gugick
Imceda Software
www.imceda.com
|||To add to what David wrote.
You have a lot of indexes, the index maintenance overhead is big in such a
case. This can have horrible effects on blocking. Do you really need all
those indexes?
The databases we hit with such big volumes are fully normalized, with no
table exceeding 20 columns.
Your maximum row width would hit 600 bytes, without the TEXT datatype, so
having a fill factor of 40% on your clustered index really wastes space in
the cache and results in more I/O than necessary.
Also check that Auto update statistics is on for the DB (and especially on
big DB's like this, Torn Page Detection On)
Regards
Mike Epprecht, Microsoft SQL Server MVP
Zurich, Switzerland
IM: mike@.epprecht.net
MVP Program: http://www.microsoft.com/mvp
Blog: http://www.msmvps.com/epprecht/
"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:OzZllPJ3EHA.2624@.TK2MSFTNGP11.phx.gbl...
> Jan wrote:
> I would recommend some performance tuning on the server. There are a
> number of products out there that may be able to point you to the
> problem. You could also try using Profiler to see what SQL is taking a
> lot of CPU or has long durations.
> That looks like a large table with very wide rows. If you are reindexing
> weekly, you may consider changing your fill factor so you don't leave
> 10% free space. That is, if you really don't require that much free
> space for the inserts each week.
> Plus your PK has a fill factor of 40!!!. That's means you are severly
> degrading read performance. Probably more than 2X. You are in effect
> increasing the size of the table by about 225%, causing all reads to
> lock more than twice as many pages, causing updates to do the same, and
> making your disks work so much harder to get at the data.
> If you are reindexing every week, then you may want to leave the fill
> factor at the default settings, or at least fill the pages 95% full.
> Only highly transactional tables with clustered indexes that force page
> breaks should really be using a lower fill factor.
> I suspect that's a large portion of the problem.
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
|||You have many DateTime fields. You can cut the size of these in half by
converting to SmallDateTime. See BOL.
Paul.
"Jan" <Jan@.discussions.microsoft.com> wrote in message
news:5C690F4A-5E37-4927-9FCC-827D7C412469@.microsoft.com...
> Hi-
> I'm a developer on a team with a 200GB db. We inherited it, it definitely
> has design problems. About 3 weeks ago we started seeing a lot of
> blocking
> on our largest table which has affected our systems. We made no changes
> at
> the time the blocking started occurring. The db has been growing about 20
> GB/month, and we are re-indexing every weekend. We have significant
> hardware
> to support the database, and our DBAs do not believe it to be
> hardware-related.
> Is it possible that the SQL Server engine has problems when the db reaches
> such a large size? Has anyone else experienced a similar situation or
> could
> offer up ideas?
> Thx,
> Jan
Friday, February 24, 2012
Extracting XML data to columns during query
I'm working with an inherited SQL 2000 database (now moved to 2005)
that stores strings of XML in a text column. I'd like to avoid
changing the schema right now. Each row's XML data looks something
like:
<root><element id='5' name='bob' message='hello' /></root>
The exact attributes in the inner element are unknown, but what I
would like to be able to do is return them as columns in a query such
that I would get:
id name message
-- -- --
5 bob hello
Any thoughts?I was just now experimenting with something similar. Most of this is right
out of the MSDN help. This is reading from an external XML file.
-- create tables for later population using OPENXML.
create table Customers (CustomerID varchar(20) primary key,
ContactName varchar(20),
CompanyName varchar(20))
go
create table Orders( CustomerID varchar(20), OrderDate datetime)
go
declare @.xmlDocument xml
select @.xmlDocument = cast (x as xml)
from OpenRowset (bulk 'P:\SQL scripts\Examples\XML\SourceOfXml.xml',
single_blob) R (x)
select @.xmlDocument
-- Contents of the source XML file.
--
-- <?xml version="1.0" encoding="windows-1252" ?>
-- <ROOT>
-- <Customers CustomerID="XYZAA" ContactName="Joe" CompanyName="Company1">
-- <Orders CustomerID="XYZAA" OrderDate="2000-08-25T00:00:00"/>
-- <Orders CustomerID="XYZAA" OrderDate="2000-10-03T00:00:00"/>
-- </Customers>
-- <Customers CustomerID="XYZBB" ContactName="Steve"
CompanyName="Company2">No Orders yet!
-- </Customers>
-- </ROOT>
declare @.docHandle int
exec sp_xml_preparedocument @.docHandle output, @.xmlDocument
-- Use OpenXML to provide rowset consisting of customer data.
insert Customers
select *
from OpenXML (@.docHandle, N'/ROOT/Customers')
with Customers
-- Use OpenXML to provide rowset consisting of order data.
insert Orders
select *
from OpenXML (@.docHandle, N'//Orders')
with Orders
-- Using OpenXML in a SELECT statement.
select *
from OpenXML (@.docHandle, N'/ROOT/Customers/Orders')
with (CustomerID nchar (5) '../@.CustomerID', OrderDate datetime)
-- Remove the internal representation of the XML document.
exec sp_xml_removedocument @.docHandle
/*
drop table Customers
drop table Orders
*/
"Brian Vallelunga" wrote:
> I'm working with an inherited SQL 2000 database (now moved to 2005)
> that stores strings of XML in a text column. I'd like to avoid
> changing the schema right now. Each row's XML data looks something
> like:
> <root><element id='5' name='bob' message='hello' /></root>
> The exact attributes in the inner element are unknown, but what I
> would like to be able to do is return them as columns in a query such
> that I would get:
> id name message
> -- -- --
> 5 bob hello
>
> Any thoughts?
>|||Hello Brian,
> I'm working with an inherited SQL 2000 database (now moved to 2005)
> that stores strings of XML in a text column. I'd like to avoid
> changing the schema right now. Each row's XML data looks something
> like:
> The exact attributes in the inner element are unknown, but what I
> would like to be able to do is return them as columns in a query such
> that I would get:
Theres lots of ways of doing this is the number of attributes are know, but
when they aren't, your going to have a hard time shaping them to a meaningfu
l
table. So whate exactly do you mean by "attributes are unknown?"
Thanks!
Kent Tegels
DevelopMentor
http://staff.develop.com/ktegels/|||On Apr 3, 11:17 pm, Kent Tegels <kteg...@.develop.com> wrote:
> Hello Brian,
>
> Theres lots of ways of doing this is the number of attributes are know, bu
t
> when they aren't, your going to have a hard time shaping them to a meaning
ful
> table. So whate exactly do you mean by "attributes are unknown?"
> Thanks!
> Kent Tegels
> DevelopMentorhttp://staff.develop.com/ktegels/
Well, the xml data holds responses to online forms. Each form may have
different fields and each form field results in a single attribute key/
value pair. If a form has a first name and last name, the attributes
for these two would show up in the XML.
Obviously this makes it a bit more difficult than if the values were
known. However, for this, we can assume that any given set of data
pulled will be for one particular form, and will thus have the same
attributes in the XML data. I may just have to do this on the client
end, but thought I'd see if a SQL method was available.
that stores strings of XML in a text column. I'd like to avoid
changing the schema right now. Each row's XML data looks something
like:
<root><element id='5' name='bob' message='hello' /></root>
The exact attributes in the inner element are unknown, but what I
would like to be able to do is return them as columns in a query such
that I would get:
id name message
-- -- --
5 bob hello
Any thoughts?I was just now experimenting with something similar. Most of this is right
out of the MSDN help. This is reading from an external XML file.
-- create tables for later population using OPENXML.
create table Customers (CustomerID varchar(20) primary key,
ContactName varchar(20),
CompanyName varchar(20))
go
create table Orders( CustomerID varchar(20), OrderDate datetime)
go
declare @.xmlDocument xml
select @.xmlDocument = cast (x as xml)
from OpenRowset (bulk 'P:\SQL scripts\Examples\XML\SourceOfXml.xml',
single_blob) R (x)
select @.xmlDocument
-- Contents of the source XML file.
--
-- <?xml version="1.0" encoding="windows-1252" ?>
-- <ROOT>
-- <Customers CustomerID="XYZAA" ContactName="Joe" CompanyName="Company1">
-- <Orders CustomerID="XYZAA" OrderDate="2000-08-25T00:00:00"/>
-- <Orders CustomerID="XYZAA" OrderDate="2000-10-03T00:00:00"/>
-- </Customers>
-- <Customers CustomerID="XYZBB" ContactName="Steve"
CompanyName="Company2">No Orders yet!
-- </Customers>
-- </ROOT>
declare @.docHandle int
exec sp_xml_preparedocument @.docHandle output, @.xmlDocument
-- Use OpenXML to provide rowset consisting of customer data.
insert Customers
select *
from OpenXML (@.docHandle, N'/ROOT/Customers')
with Customers
-- Use OpenXML to provide rowset consisting of order data.
insert Orders
select *
from OpenXML (@.docHandle, N'//Orders')
with Orders
-- Using OpenXML in a SELECT statement.
select *
from OpenXML (@.docHandle, N'/ROOT/Customers/Orders')
with (CustomerID nchar (5) '../@.CustomerID', OrderDate datetime)
-- Remove the internal representation of the XML document.
exec sp_xml_removedocument @.docHandle
/*
drop table Customers
drop table Orders
*/
"Brian Vallelunga" wrote:
> I'm working with an inherited SQL 2000 database (now moved to 2005)
> that stores strings of XML in a text column. I'd like to avoid
> changing the schema right now. Each row's XML data looks something
> like:
> <root><element id='5' name='bob' message='hello' /></root>
> The exact attributes in the inner element are unknown, but what I
> would like to be able to do is return them as columns in a query such
> that I would get:
> id name message
> -- -- --
> 5 bob hello
>
> Any thoughts?
>|||Hello Brian,
> I'm working with an inherited SQL 2000 database (now moved to 2005)
> that stores strings of XML in a text column. I'd like to avoid
> changing the schema right now. Each row's XML data looks something
> like:
> The exact attributes in the inner element are unknown, but what I
> would like to be able to do is return them as columns in a query such
> that I would get:
Theres lots of ways of doing this is the number of attributes are know, but
when they aren't, your going to have a hard time shaping them to a meaningfu
l
table. So whate exactly do you mean by "attributes are unknown?"
Thanks!
Kent Tegels
DevelopMentor
http://staff.develop.com/ktegels/|||On Apr 3, 11:17 pm, Kent Tegels <kteg...@.develop.com> wrote:
> Hello Brian,
>
> Theres lots of ways of doing this is the number of attributes are know, bu
t
> when they aren't, your going to have a hard time shaping them to a meaning
ful
> table. So whate exactly do you mean by "attributes are unknown?"
> Thanks!
> Kent Tegels
> DevelopMentorhttp://staff.develop.com/ktegels/
Well, the xml data holds responses to online forms. Each form may have
different fields and each form field results in a single attribute key/
value pair. If a form has a first name and last name, the attributes
for these two would show up in the XML.
Obviously this makes it a bit more difficult than if the values were
known. However, for this, we can assume that any given set of data
pulled will be for one particular form, and will thus have the same
attributes in the XML data. I may just have to do this on the client
end, but thought I'd see if a SQL method was available.
Extracting XML data to columns during query
I'm working with an inherited SQL 2000 database (now moved to 2005)
that stores strings of XML in a text column. I'd like to avoid
changing the schema right now. Each row's XML data looks something
like:
<root><element id='5' name='bob' message='hello' /></root>
The exact attributes in the inner element are unknown, but what I
would like to be able to do is return them as columns in a query such
that I would get:
id name message
-- -- --
5 bob hello
Any thoughts?
I was just now experimenting with something similar. Most of this is right
out of the MSDN help. This is reading from an external XML file.
-- create tables for later population using OPENXML.
create table Customers (CustomerID varchar(20) primary key,
ContactName varchar(20),
CompanyName varchar(20))
go
create table Orders( CustomerID varchar(20), OrderDate datetime)
go
declare @.xmlDocument xml
select @.xmlDocument = cast (x as xml)
from OpenRowset (bulk 'P:\SQL scripts\Examples\XML\SourceOfXml.xml',
single_blob) R (x)
select @.xmlDocument
-- Contents of the source XML file.
-- <?xml version="1.0" encoding="windows-1252" ?>
-- <ROOT>
-- <Customers CustomerID="XYZAA" ContactName="Joe" CompanyName="Company1">
-- <Orders CustomerID="XYZAA" OrderDate="2000-08-25T00:00:00"/>
-- <Orders CustomerID="XYZAA" OrderDate="2000-10-03T00:00:00"/>
-- </Customers>
-- <Customers CustomerID="XYZBB" ContactName="Steve"
CompanyName="Company2">No Orders yet!
-- </Customers>
-- </ROOT>
declare @.docHandle int
exec sp_xml_preparedocument @.docHandle output, @.xmlDocument
-- Use OpenXML to provide rowset consisting of customer data.
insert Customers
select *
from OpenXML (@.docHandle, N'/ROOT/Customers')
with Customers
-- Use OpenXML to provide rowset consisting of order data.
insert Orders
select *
from OpenXML (@.docHandle, N'//Orders')
with Orders
-- Using OpenXML in a SELECT statement.
select *
from OpenXML (@.docHandle, N'/ROOT/Customers/Orders')
with (CustomerID nchar (5) '../@.CustomerID', OrderDate datetime)
-- Remove the internal representation of the XML document.
exec sp_xml_removedocument @.docHandle
/*
drop table Customers
drop table Orders
*/
"Brian Vallelunga" wrote:
> I'm working with an inherited SQL 2000 database (now moved to 2005)
> that stores strings of XML in a text column. I'd like to avoid
> changing the schema right now. Each row's XML data looks something
> like:
> <root><element id='5' name='bob' message='hello' /></root>
> The exact attributes in the inner element are unknown, but what I
> would like to be able to do is return them as columns in a query such
> that I would get:
> id name message
> -- -- --
> 5 bob hello
>
> Any thoughts?
>
|||Hello Brian,
> I'm working with an inherited SQL 2000 database (now moved to 2005)
> that stores strings of XML in a text column. I'd like to avoid
> changing the schema right now. Each row's XML data looks something
> like:
> The exact attributes in the inner element are unknown, but what I
> would like to be able to do is return them as columns in a query such
> that I would get:
Theres lots of ways of doing this is the number of attributes are know, but
when they aren't, your going to have a hard time shaping them to a meaningful
table. So whate exactly do you mean by "attributes are unknown?"
Thanks!
Kent Tegels
DevelopMentor
http://staff.develop.com/ktegels/
|||On Apr 3, 11:17 pm, Kent Tegels <kteg...@.develop.com> wrote:
> Hello Brian,
>
> Theres lots of ways of doing this is the number of attributes are know, but
> when they aren't, your going to have a hard time shaping them to a meaningful
> table. So whate exactly do you mean by "attributes are unknown?"
> Thanks!
> Kent Tegels
> DevelopMentorhttp://staff.develop.com/ktegels/
Well, the xml data holds responses to online forms. Each form may have
different fields and each form field results in a single attribute key/
value pair. If a form has a first name and last name, the attributes
for these two would show up in the XML.
Obviously this makes it a bit more difficult than if the values were
known. However, for this, we can assume that any given set of data
pulled will be for one particular form, and will thus have the same
attributes in the XML data. I may just have to do this on the client
end, but thought I'd see if a SQL method was available.
that stores strings of XML in a text column. I'd like to avoid
changing the schema right now. Each row's XML data looks something
like:
<root><element id='5' name='bob' message='hello' /></root>
The exact attributes in the inner element are unknown, but what I
would like to be able to do is return them as columns in a query such
that I would get:
id name message
-- -- --
5 bob hello
Any thoughts?
I was just now experimenting with something similar. Most of this is right
out of the MSDN help. This is reading from an external XML file.
-- create tables for later population using OPENXML.
create table Customers (CustomerID varchar(20) primary key,
ContactName varchar(20),
CompanyName varchar(20))
go
create table Orders( CustomerID varchar(20), OrderDate datetime)
go
declare @.xmlDocument xml
select @.xmlDocument = cast (x as xml)
from OpenRowset (bulk 'P:\SQL scripts\Examples\XML\SourceOfXml.xml',
single_blob) R (x)
select @.xmlDocument
-- Contents of the source XML file.
-- <?xml version="1.0" encoding="windows-1252" ?>
-- <ROOT>
-- <Customers CustomerID="XYZAA" ContactName="Joe" CompanyName="Company1">
-- <Orders CustomerID="XYZAA" OrderDate="2000-08-25T00:00:00"/>
-- <Orders CustomerID="XYZAA" OrderDate="2000-10-03T00:00:00"/>
-- </Customers>
-- <Customers CustomerID="XYZBB" ContactName="Steve"
CompanyName="Company2">No Orders yet!
-- </Customers>
-- </ROOT>
declare @.docHandle int
exec sp_xml_preparedocument @.docHandle output, @.xmlDocument
-- Use OpenXML to provide rowset consisting of customer data.
insert Customers
select *
from OpenXML (@.docHandle, N'/ROOT/Customers')
with Customers
-- Use OpenXML to provide rowset consisting of order data.
insert Orders
select *
from OpenXML (@.docHandle, N'//Orders')
with Orders
-- Using OpenXML in a SELECT statement.
select *
from OpenXML (@.docHandle, N'/ROOT/Customers/Orders')
with (CustomerID nchar (5) '../@.CustomerID', OrderDate datetime)
-- Remove the internal representation of the XML document.
exec sp_xml_removedocument @.docHandle
/*
drop table Customers
drop table Orders
*/
"Brian Vallelunga" wrote:
> I'm working with an inherited SQL 2000 database (now moved to 2005)
> that stores strings of XML in a text column. I'd like to avoid
> changing the schema right now. Each row's XML data looks something
> like:
> <root><element id='5' name='bob' message='hello' /></root>
> The exact attributes in the inner element are unknown, but what I
> would like to be able to do is return them as columns in a query such
> that I would get:
> id name message
> -- -- --
> 5 bob hello
>
> Any thoughts?
>
|||Hello Brian,
> I'm working with an inherited SQL 2000 database (now moved to 2005)
> that stores strings of XML in a text column. I'd like to avoid
> changing the schema right now. Each row's XML data looks something
> like:
> The exact attributes in the inner element are unknown, but what I
> would like to be able to do is return them as columns in a query such
> that I would get:
Theres lots of ways of doing this is the number of attributes are know, but
when they aren't, your going to have a hard time shaping them to a meaningful
table. So whate exactly do you mean by "attributes are unknown?"
Thanks!
Kent Tegels
DevelopMentor
http://staff.develop.com/ktegels/
|||On Apr 3, 11:17 pm, Kent Tegels <kteg...@.develop.com> wrote:
> Hello Brian,
>
> Theres lots of ways of doing this is the number of attributes are know, but
> when they aren't, your going to have a hard time shaping them to a meaningful
> table. So whate exactly do you mean by "attributes are unknown?"
> Thanks!
> Kent Tegels
> DevelopMentorhttp://staff.develop.com/ktegels/
Well, the xml data holds responses to online forms. Each form may have
different fields and each form field results in a single attribute key/
value pair. If a form has a first name and last name, the attributes
for these two would show up in the XML.
Obviously this makes it a bit more difficult than if the values were
known. However, for this, we can assume that any given set of data
pulled will be for one particular form, and will thus have the same
attributes in the XML data. I may just have to do this on the client
end, but thought I'd see if a SQL method was available.
Sunday, February 19, 2012
Extracting duplicate records
I am working on an inherited database and trying to add indexes on existing
tables. A few of the tables have duplicate entries that cause the
application of the indices to fail. I know that on some of the tables I can
simply copy the table, delete the records apply the index import from old
table and only unique records will make it into the table. ( I am correct
here aren't I?) However...
On a couple of tables I need to actually "see" the duplicate records so that
I can choose which records to keep and do a manual deletion of the duplicate
records.
There are no PK's on the tables in question and the indices I need to put in
place are, for example, on 6 of 15 columns. I need to find the dupes and
view the information in non-indexed columns to make my decision on which to
keep and which to delete.
Does this make any sense? Can anyone help me out with how I would go about
identifying the duplicate records?
Thanks in advance.
daniel
daniel wrote:
> I am working on an inherited database and trying to add indexes on
> existing tables. A few of the tables have duplicate entries that
> cause the application of the indices to fail. I know that on some of
> the tables I can simply copy the table, delete the records apply the
> index import from old table and only unique records will make it into
> the table. ( I am correct here aren't I?) However...
> On a couple of tables I need to actually "see" the duplicate records
> so that I can choose which records to keep and do a manual deletion
> of the duplicate records.
> There are no PK's on the tables in question and the indices I need to
> put in place are, for example, on 6 of 15 columns. I need to find the
> dupes and view the information in non-indexed columns to make my
> decision on which to keep and which to delete.
> Does this make any sense? Can anyone help me out with how I would go
> about identifying the duplicate records?
> Thanks in advance.
> daniel
You can group by the columns to be indexed and use a having clause to
return the dupes. Not sure of your table size, so the query may be CPU
and tiem consuming:
For example:
Select a, b, c, d
From MyTable
Group By a, b, c, d
Having COUNT(*) > 1
To get rid of the dupes, you could simply create another table and add a
clustered index with the ignore dup keys flag and insert the rows. But
you wouldn't have much control over which dupes were removed.
David Gugick
Imceda Software
www.imceda.com
|||"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
> daniel wrote:
> You can group by the columns to be indexed and use a having clause to
> return the dupes. Not sure of your table size, so the query may be CPU
> and tiem consuming:
> For example:
> Select a, b, c, d
> From MyTable
> Group By a, b, c, d
> Having COUNT(*) > 1
> To get rid of the dupes, you could simply create another table and add a
> clustered index with the ignore dup keys flag and insert the rows. But
> you wouldn't have much control over which dupes were removed.
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
Thanks David but I guess maybe I've misunderstood the error message from
SQL.(?) I queried one of the tables on which I receive the indexing error,
with the query you provided, and it returned 0 records. Perhaps providing
the error message will allow someone more knowledgeable than I to tell me
I'm interpreting it wrong. The message is as follows:
"Server: Msg 1505, Level 16, State 1
CREATE UNIQUE INDEX terminated because a duplicate key was found for index
id 9. Most significant primary key is '115040'"
Does this not mean there are duplicate row entries based on the attempted
index columns? If so why do I get a return of 0 records with the query
provided below?
Any ideas anyone?
|||daniel wrote:
> Thanks David but I guess maybe I've misunderstood the error message
> from SQL.(?) I queried one of the tables on which I receive the
> indexing error, with the query you provided, and it returned 0
> records. Perhaps providing the error message will allow someone more
> knowledgeable than I to tell me I'm interpreting it wrong. The
> message is as follows:
> "Server: Msg 1505, Level 16, State 1
> CREATE UNIQUE INDEX terminated because a duplicate key was found for
> index id 9. Most significant primary key is '115040'"
> Does this not mean there are duplicate row entries based on the
> attempted index columns? If so why do I get a return of 0 records
> with the query provided below?
> Any ideas anyone?
You'll need to provide the query, the table ddl, and the index statement
in order to tell what's going on.
David Gugick
Imceda Software
www.imceda.com
|||You might find this interesting.
http://www.15seconds.com/issue/011009.htm
Ben Miller
"daniel" <dhagwood@.hotmail.com> wrote in message
news:OVAIxhgCFHA.560@.TK2MSFTNGP15.phx.gbl...
>I am working on an inherited database and trying to add indexes on existing
> tables. A few of the tables have duplicate entries that cause the
> application of the indices to fail. I know that on some of the tables I
> can
> simply copy the table, delete the records apply the index import from old
> table and only unique records will make it into the table. ( I am correct
> here aren't I?) However...
> On a couple of tables I need to actually "see" the duplicate records so
> that
> I can choose which records to keep and do a manual deletion of the
> duplicate
> records.
> There are no PK's on the tables in question and the indices I need to put
> in
> place are, for example, on 6 of 15 columns. I need to find the dupes and
> view the information in non-indexed columns to make my decision on which
> to
> keep and which to delete.
> Does this make any sense? Can anyone help me out with how I would go about
> identifying the duplicate records?
> Thanks in advance.
> daniel
>
|||daniel wrote:
> "David Gugick" <davidg-nospam@.imceda.com> wrote in message
> news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
> Thanks David but I guess maybe I've misunderstood the error message
> from SQL.(?) I queried one of the tables on which I receive the
> indexing error, with the query you provided, and it returned 0
> records. Perhaps providing the error message will allow someone more
> knowledgeable than I to tell me I'm interpreting it wrong. The
> message is as follows:
> "Server: Msg 1505, Level 16, State 1
> CREATE UNIQUE INDEX terminated because a duplicate key was found for
> index id 9. Most significant primary key is '115040'"
> Does this not mean there are duplicate row entries based on the
> attempted index columns? If so why do I get a return of 0 records
> with the query provided below?
> Any ideas anyone?
This works for me:
create table #a (col1 int, col2 int, col3 int)
insert into #a values (1, 2, 3)
insert into #a values (1, 2, 3)
insert into #a values (2, 2, 3)
insert into #a values (3, 2, 1)
insert into #a values (3, 2, 1)
insert into #a values (3, 2, 1)
Select count(*) as "# Dupes", col1, col2, col3
From #a
Group By col1, col2, col3
Having count(*) > 1
# Dupes col1 col2 col3
-- -- -- --
2 1 2 3
3 3 2 1
David Gugick
Imceda Software
www.imceda.com
|||"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:ea8zyRlCFHA.2676@.TK2MSFTNGP12.phx.gbl...
> daniel wrote:
> This works for me:
> create table #a (col1 int, col2 int, col3 int)
>
> insert into #a values (1, 2, 3)
> insert into #a values (1, 2, 3)
> insert into #a values (2, 2, 3)
> insert into #a values (3, 2, 1)
> insert into #a values (3, 2, 1)
> insert into #a values (3, 2, 1)
>
> Select count(*) as "# Dupes", col1, col2, col3
> From #a
> Group By col1, col2, col3
> Having count(*) > 1
> # Dupes col1 col2 col3
> -- -- -- --
> 2 1 2 3
> 3 3 2 1
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
Thanks again David. I finally got the query to work. Knowing that there were
duplicates I did some intensive eye straining work to figure out the
problem. Appears as though some of the "duplicate" records also contained
unexpected data, empty string vs. NULL value and this was thowing off the
query. I updated the fields as necessary and everything worked out fine.
Thanks again for your assistance.
daniel
tables. A few of the tables have duplicate entries that cause the
application of the indices to fail. I know that on some of the tables I can
simply copy the table, delete the records apply the index import from old
table and only unique records will make it into the table. ( I am correct
here aren't I?) However...
On a couple of tables I need to actually "see" the duplicate records so that
I can choose which records to keep and do a manual deletion of the duplicate
records.
There are no PK's on the tables in question and the indices I need to put in
place are, for example, on 6 of 15 columns. I need to find the dupes and
view the information in non-indexed columns to make my decision on which to
keep and which to delete.
Does this make any sense? Can anyone help me out with how I would go about
identifying the duplicate records?
Thanks in advance.
daniel
daniel wrote:
> I am working on an inherited database and trying to add indexes on
> existing tables. A few of the tables have duplicate entries that
> cause the application of the indices to fail. I know that on some of
> the tables I can simply copy the table, delete the records apply the
> index import from old table and only unique records will make it into
> the table. ( I am correct here aren't I?) However...
> On a couple of tables I need to actually "see" the duplicate records
> so that I can choose which records to keep and do a manual deletion
> of the duplicate records.
> There are no PK's on the tables in question and the indices I need to
> put in place are, for example, on 6 of 15 columns. I need to find the
> dupes and view the information in non-indexed columns to make my
> decision on which to keep and which to delete.
> Does this make any sense? Can anyone help me out with how I would go
> about identifying the duplicate records?
> Thanks in advance.
> daniel
You can group by the columns to be indexed and use a having clause to
return the dupes. Not sure of your table size, so the query may be CPU
and tiem consuming:
For example:
Select a, b, c, d
From MyTable
Group By a, b, c, d
Having COUNT(*) > 1
To get rid of the dupes, you could simply create another table and add a
clustered index with the ignore dup keys flag and insert the rows. But
you wouldn't have much control over which dupes were removed.
David Gugick
Imceda Software
www.imceda.com
|||"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
> daniel wrote:
> You can group by the columns to be indexed and use a having clause to
> return the dupes. Not sure of your table size, so the query may be CPU
> and tiem consuming:
> For example:
> Select a, b, c, d
> From MyTable
> Group By a, b, c, d
> Having COUNT(*) > 1
> To get rid of the dupes, you could simply create another table and add a
> clustered index with the ignore dup keys flag and insert the rows. But
> you wouldn't have much control over which dupes were removed.
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
Thanks David but I guess maybe I've misunderstood the error message from
SQL.(?) I queried one of the tables on which I receive the indexing error,
with the query you provided, and it returned 0 records. Perhaps providing
the error message will allow someone more knowledgeable than I to tell me
I'm interpreting it wrong. The message is as follows:
"Server: Msg 1505, Level 16, State 1
CREATE UNIQUE INDEX terminated because a duplicate key was found for index
id 9. Most significant primary key is '115040'"
Does this not mean there are duplicate row entries based on the attempted
index columns? If so why do I get a return of 0 records with the query
provided below?
Any ideas anyone?
|||daniel wrote:
> Thanks David but I guess maybe I've misunderstood the error message
> from SQL.(?) I queried one of the tables on which I receive the
> indexing error, with the query you provided, and it returned 0
> records. Perhaps providing the error message will allow someone more
> knowledgeable than I to tell me I'm interpreting it wrong. The
> message is as follows:
> "Server: Msg 1505, Level 16, State 1
> CREATE UNIQUE INDEX terminated because a duplicate key was found for
> index id 9. Most significant primary key is '115040'"
> Does this not mean there are duplicate row entries based on the
> attempted index columns? If so why do I get a return of 0 records
> with the query provided below?
> Any ideas anyone?
You'll need to provide the query, the table ddl, and the index statement
in order to tell what's going on.
David Gugick
Imceda Software
www.imceda.com
|||You might find this interesting.
http://www.15seconds.com/issue/011009.htm
Ben Miller
"daniel" <dhagwood@.hotmail.com> wrote in message
news:OVAIxhgCFHA.560@.TK2MSFTNGP15.phx.gbl...
>I am working on an inherited database and trying to add indexes on existing
> tables. A few of the tables have duplicate entries that cause the
> application of the indices to fail. I know that on some of the tables I
> can
> simply copy the table, delete the records apply the index import from old
> table and only unique records will make it into the table. ( I am correct
> here aren't I?) However...
> On a couple of tables I need to actually "see" the duplicate records so
> that
> I can choose which records to keep and do a manual deletion of the
> duplicate
> records.
> There are no PK's on the tables in question and the indices I need to put
> in
> place are, for example, on 6 of 15 columns. I need to find the dupes and
> view the information in non-indexed columns to make my decision on which
> to
> keep and which to delete.
> Does this make any sense? Can anyone help me out with how I would go about
> identifying the duplicate records?
> Thanks in advance.
> daniel
>
|||daniel wrote:
> "David Gugick" <davidg-nospam@.imceda.com> wrote in message
> news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
> Thanks David but I guess maybe I've misunderstood the error message
> from SQL.(?) I queried one of the tables on which I receive the
> indexing error, with the query you provided, and it returned 0
> records. Perhaps providing the error message will allow someone more
> knowledgeable than I to tell me I'm interpreting it wrong. The
> message is as follows:
> "Server: Msg 1505, Level 16, State 1
> CREATE UNIQUE INDEX terminated because a duplicate key was found for
> index id 9. Most significant primary key is '115040'"
> Does this not mean there are duplicate row entries based on the
> attempted index columns? If so why do I get a return of 0 records
> with the query provided below?
> Any ideas anyone?
This works for me:
create table #a (col1 int, col2 int, col3 int)
insert into #a values (1, 2, 3)
insert into #a values (1, 2, 3)
insert into #a values (2, 2, 3)
insert into #a values (3, 2, 1)
insert into #a values (3, 2, 1)
insert into #a values (3, 2, 1)
Select count(*) as "# Dupes", col1, col2, col3
From #a
Group By col1, col2, col3
Having count(*) > 1
# Dupes col1 col2 col3
-- -- -- --
2 1 2 3
3 3 2 1
David Gugick
Imceda Software
www.imceda.com
|||"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:ea8zyRlCFHA.2676@.TK2MSFTNGP12.phx.gbl...
> daniel wrote:
> This works for me:
> create table #a (col1 int, col2 int, col3 int)
>
> insert into #a values (1, 2, 3)
> insert into #a values (1, 2, 3)
> insert into #a values (2, 2, 3)
> insert into #a values (3, 2, 1)
> insert into #a values (3, 2, 1)
> insert into #a values (3, 2, 1)
>
> Select count(*) as "# Dupes", col1, col2, col3
> From #a
> Group By col1, col2, col3
> Having count(*) > 1
> # Dupes col1 col2 col3
> -- -- -- --
> 2 1 2 3
> 3 3 2 1
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
Thanks again David. I finally got the query to work. Knowing that there were
duplicates I did some intensive eye straining work to figure out the
problem. Appears as though some of the "duplicate" records also contained
unexpected data, empty string vs. NULL value and this was thowing off the
query. I updated the fields as necessary and everything worked out fine.
Thanks again for your assistance.
daniel
Friday, February 17, 2012
Extracting duplicate records
I am working on an inherited database and trying to add indexes on existing
tables. A few of the tables have duplicate entries that cause the
application of the indices to fail. I know that on some of the tables I can
simply copy the table, delete the records apply the index import from old
table and only unique records will make it into the table. ( I am correct
here aren't I?) However...
On a couple of tables I need to actually "see" the duplicate records so that
I can choose which records to keep and do a manual deletion of the duplicate
records.
There are no PK's on the tables in question and the indices I need to put in
place are, for example, on 6 of 15 columns. I need to find the dupes and
view the information in non-indexed columns to make my decision on which to
keep and which to delete.
Does this make any sense? Can anyone help me out with how I would go about
identifying the duplicate records?
Thanks in advance.
danieldaniel wrote:
> I am working on an inherited database and trying to add indexes on
> existing tables. A few of the tables have duplicate entries that
> cause the application of the indices to fail. I know that on some of
> the tables I can simply copy the table, delete the records apply the
> index import from old table and only unique records will make it into
> the table. ( I am correct here aren't I?) However...
> On a couple of tables I need to actually "see" the duplicate records
> so that I can choose which records to keep and do a manual deletion
> of the duplicate records.
> There are no PK's on the tables in question and the indices I need to
> put in place are, for example, on 6 of 15 columns. I need to find the
> dupes and view the information in non-indexed columns to make my
> decision on which to keep and which to delete.
> Does this make any sense? Can anyone help me out with how I would go
> about identifying the duplicate records?
> Thanks in advance.
> daniel
You can group by the columns to be indexed and use a having clause to
return the dupes. Not sure of your table size, so the query may be CPU
and tiem consuming:
For example:
Select a, b, c, d
From MyTable
Group By a, b, c, d
Having COUNT(*) > 1
To get rid of the dupes, you could simply create another table and add a
clustered index with the ignore dup keys flag and insert the rows. But
you wouldn't have much control over which dupes were removed.
David Gugick
Imceda Software
www.imceda.com|||"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
> daniel wrote:
> You can group by the columns to be indexed and use a having clause to
> return the dupes. Not sure of your table size, so the query may be CPU
> and tiem consuming:
> For example:
> Select a, b, c, d
> From MyTable
> Group By a, b, c, d
> Having COUNT(*) > 1
> To get rid of the dupes, you could simply create another table and add a
> clustered index with the ignore dup keys flag and insert the rows. But
> you wouldn't have much control over which dupes were removed.
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
Thanks David but I guess maybe I've misunderstood the error message from
SQL.(?) I queried one of the tables on which I receive the indexing error,
with the query you provided, and it returned 0 records. Perhaps providing
the error message will allow someone more knowledgeable than I to tell me
I'm interpreting it wrong. The message is as follows:
"Server: Msg 1505, Level 16, State 1
CREATE UNIQUE INDEX terminated because a duplicate key was found for index
id 9. Most significant primary key is '115040'"
Does this not mean there are duplicate row entries based on the attempted
index columns? If so why do I get a return of 0 records with the query
provided below?
Any ideas anyone?|||daniel wrote:
> Thanks David but I guess maybe I've misunderstood the error message
> from SQL.(?) I queried one of the tables on which I receive the
> indexing error, with the query you provided, and it returned 0
> records. Perhaps providing the error message will allow someone more
> knowledgeable than I to tell me I'm interpreting it wrong. The
> message is as follows:
> "Server: Msg 1505, Level 16, State 1
> CREATE UNIQUE INDEX terminated because a duplicate key was found for
> index id 9. Most significant primary key is '115040'"
> Does this not mean there are duplicate row entries based on the
> attempted index columns? If so why do I get a return of 0 records
> with the query provided below?
> Any ideas anyone?
You'll need to provide the query, the table ddl, and the index statement
in order to tell what's going on.
David Gugick
Imceda Software
www.imceda.com|||You might find this interesting.
http://www.15seconds.com/issue/011009.htm
Ben Miller
"daniel" <dhagwood@.hotmail.com> wrote in message
news:OVAIxhgCFHA.560@.TK2MSFTNGP15.phx.gbl...
>I am working on an inherited database and trying to add indexes on existing
> tables. A few of the tables have duplicate entries that cause the
> application of the indices to fail. I know that on some of the tables I
> can
> simply copy the table, delete the records apply the index import from old
> table and only unique records will make it into the table. ( I am correct
> here aren't I?) However...
> On a couple of tables I need to actually "see" the duplicate records so
> that
> I can choose which records to keep and do a manual deletion of the
> duplicate
> records.
> There are no PK's on the tables in question and the indices I need to put
> in
> place are, for example, on 6 of 15 columns. I need to find the dupes and
> view the information in non-indexed columns to make my decision on which
> to
> keep and which to delete.
> Does this make any sense? Can anyone help me out with how I would go about
> identifying the duplicate records?
> Thanks in advance.
> daniel
>|||daniel wrote:
> "David Gugick" <davidg-nospam@.imceda.com> wrote in message
> news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
> Thanks David but I guess maybe I've misunderstood the error message
> from SQL.(?) I queried one of the tables on which I receive the
> indexing error, with the query you provided, and it returned 0
> records. Perhaps providing the error message will allow someone more
> knowledgeable than I to tell me I'm interpreting it wrong. The
> message is as follows:
> "Server: Msg 1505, Level 16, State 1
> CREATE UNIQUE INDEX terminated because a duplicate key was found for
> index id 9. Most significant primary key is '115040'"
> Does this not mean there are duplicate row entries based on the
> attempted index columns? If so why do I get a return of 0 records
> with the query provided below?
> Any ideas anyone?
This works for me:
create table #a (col1 int, col2 int, col3 int)
insert into #a values (1, 2, 3)
insert into #a values (1, 2, 3)
insert into #a values (2, 2, 3)
insert into #a values (3, 2, 1)
insert into #a values (3, 2, 1)
insert into #a values (3, 2, 1)
Select count(*) as "# Dupes", col1, col2, col3
From #a
Group By col1, col2, col3
Having count(*) > 1
# Dupes col1 col2 col3
-- -- -- --
2 1 2 3
3 3 2 1
David Gugick
Imceda Software
www.imceda.com|||"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:ea8zyRlCFHA.2676@.TK2MSFTNGP12.phx.gbl...
> daniel wrote:
> This works for me:
> create table #a (col1 int, col2 int, col3 int)
>
> insert into #a values (1, 2, 3)
> insert into #a values (1, 2, 3)
> insert into #a values (2, 2, 3)
> insert into #a values (3, 2, 1)
> insert into #a values (3, 2, 1)
> insert into #a values (3, 2, 1)
>
> Select count(*) as "# Dupes", col1, col2, col3
> From #a
> Group By col1, col2, col3
> Having count(*) > 1
> # Dupes col1 col2 col3
> -- -- -- --
> 2 1 2 3
> 3 3 2 1
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
Thanks again David. I finally got the query to work. Knowing that there were
duplicates I did some intensive eye straining work to figure out the
problem. Appears as though some of the "duplicate" records also contained
unexpected data, empty string vs. NULL value and this was thowing off the
query. I updated the fields as necessary and everything worked out fine.
Thanks again for your assistance.
daniel
tables. A few of the tables have duplicate entries that cause the
application of the indices to fail. I know that on some of the tables I can
simply copy the table, delete the records apply the index import from old
table and only unique records will make it into the table. ( I am correct
here aren't I?) However...
On a couple of tables I need to actually "see" the duplicate records so that
I can choose which records to keep and do a manual deletion of the duplicate
records.
There are no PK's on the tables in question and the indices I need to put in
place are, for example, on 6 of 15 columns. I need to find the dupes and
view the information in non-indexed columns to make my decision on which to
keep and which to delete.
Does this make any sense? Can anyone help me out with how I would go about
identifying the duplicate records?
Thanks in advance.
danieldaniel wrote:
> I am working on an inherited database and trying to add indexes on
> existing tables. A few of the tables have duplicate entries that
> cause the application of the indices to fail. I know that on some of
> the tables I can simply copy the table, delete the records apply the
> index import from old table and only unique records will make it into
> the table. ( I am correct here aren't I?) However...
> On a couple of tables I need to actually "see" the duplicate records
> so that I can choose which records to keep and do a manual deletion
> of the duplicate records.
> There are no PK's on the tables in question and the indices I need to
> put in place are, for example, on 6 of 15 columns. I need to find the
> dupes and view the information in non-indexed columns to make my
> decision on which to keep and which to delete.
> Does this make any sense? Can anyone help me out with how I would go
> about identifying the duplicate records?
> Thanks in advance.
> daniel
You can group by the columns to be indexed and use a having clause to
return the dupes. Not sure of your table size, so the query may be CPU
and tiem consuming:
For example:
Select a, b, c, d
From MyTable
Group By a, b, c, d
Having COUNT(*) > 1
To get rid of the dupes, you could simply create another table and add a
clustered index with the ignore dup keys flag and insert the rows. But
you wouldn't have much control over which dupes were removed.
David Gugick
Imceda Software
www.imceda.com|||"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
> daniel wrote:
> You can group by the columns to be indexed and use a having clause to
> return the dupes. Not sure of your table size, so the query may be CPU
> and tiem consuming:
> For example:
> Select a, b, c, d
> From MyTable
> Group By a, b, c, d
> Having COUNT(*) > 1
> To get rid of the dupes, you could simply create another table and add a
> clustered index with the ignore dup keys flag and insert the rows. But
> you wouldn't have much control over which dupes were removed.
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
Thanks David but I guess maybe I've misunderstood the error message from
SQL.(?) I queried one of the tables on which I receive the indexing error,
with the query you provided, and it returned 0 records. Perhaps providing
the error message will allow someone more knowledgeable than I to tell me
I'm interpreting it wrong. The message is as follows:
"Server: Msg 1505, Level 16, State 1
CREATE UNIQUE INDEX terminated because a duplicate key was found for index
id 9. Most significant primary key is '115040'"
Does this not mean there are duplicate row entries based on the attempted
index columns? If so why do I get a return of 0 records with the query
provided below?
Any ideas anyone?|||daniel wrote:
> Thanks David but I guess maybe I've misunderstood the error message
> from SQL.(?) I queried one of the tables on which I receive the
> indexing error, with the query you provided, and it returned 0
> records. Perhaps providing the error message will allow someone more
> knowledgeable than I to tell me I'm interpreting it wrong. The
> message is as follows:
> "Server: Msg 1505, Level 16, State 1
> CREATE UNIQUE INDEX terminated because a duplicate key was found for
> index id 9. Most significant primary key is '115040'"
> Does this not mean there are duplicate row entries based on the
> attempted index columns? If so why do I get a return of 0 records
> with the query provided below?
> Any ideas anyone?
You'll need to provide the query, the table ddl, and the index statement
in order to tell what's going on.
David Gugick
Imceda Software
www.imceda.com|||You might find this interesting.
http://www.15seconds.com/issue/011009.htm
Ben Miller
"daniel" <dhagwood@.hotmail.com> wrote in message
news:OVAIxhgCFHA.560@.TK2MSFTNGP15.phx.gbl...
>I am working on an inherited database and trying to add indexes on existing
> tables. A few of the tables have duplicate entries that cause the
> application of the indices to fail. I know that on some of the tables I
> can
> simply copy the table, delete the records apply the index import from old
> table and only unique records will make it into the table. ( I am correct
> here aren't I?) However...
> On a couple of tables I need to actually "see" the duplicate records so
> that
> I can choose which records to keep and do a manual deletion of the
> duplicate
> records.
> There are no PK's on the tables in question and the indices I need to put
> in
> place are, for example, on 6 of 15 columns. I need to find the dupes and
> view the information in non-indexed columns to make my decision on which
> to
> keep and which to delete.
> Does this make any sense? Can anyone help me out with how I would go about
> identifying the duplicate records?
> Thanks in advance.
> daniel
>|||daniel wrote:
> "David Gugick" <davidg-nospam@.imceda.com> wrote in message
> news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
> Thanks David but I guess maybe I've misunderstood the error message
> from SQL.(?) I queried one of the tables on which I receive the
> indexing error, with the query you provided, and it returned 0
> records. Perhaps providing the error message will allow someone more
> knowledgeable than I to tell me I'm interpreting it wrong. The
> message is as follows:
> "Server: Msg 1505, Level 16, State 1
> CREATE UNIQUE INDEX terminated because a duplicate key was found for
> index id 9. Most significant primary key is '115040'"
> Does this not mean there are duplicate row entries based on the
> attempted index columns? If so why do I get a return of 0 records
> with the query provided below?
> Any ideas anyone?
This works for me:
create table #a (col1 int, col2 int, col3 int)
insert into #a values (1, 2, 3)
insert into #a values (1, 2, 3)
insert into #a values (2, 2, 3)
insert into #a values (3, 2, 1)
insert into #a values (3, 2, 1)
insert into #a values (3, 2, 1)
Select count(*) as "# Dupes", col1, col2, col3
From #a
Group By col1, col2, col3
Having count(*) > 1
# Dupes col1 col2 col3
-- -- -- --
2 1 2 3
3 3 2 1
David Gugick
Imceda Software
www.imceda.com|||"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:ea8zyRlCFHA.2676@.TK2MSFTNGP12.phx.gbl...
> daniel wrote:
> This works for me:
> create table #a (col1 int, col2 int, col3 int)
>
> insert into #a values (1, 2, 3)
> insert into #a values (1, 2, 3)
> insert into #a values (2, 2, 3)
> insert into #a values (3, 2, 1)
> insert into #a values (3, 2, 1)
> insert into #a values (3, 2, 1)
>
> Select count(*) as "# Dupes", col1, col2, col3
> From #a
> Group By col1, col2, col3
> Having count(*) > 1
> # Dupes col1 col2 col3
> -- -- -- --
> 2 1 2 3
> 3 3 2 1
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
Thanks again David. I finally got the query to work. Knowing that there were
duplicates I did some intensive eye straining work to figure out the
problem. Appears as though some of the "duplicate" records also contained
unexpected data, empty string vs. NULL value and this was thowing off the
query. I updated the fields as necessary and everything worked out fine.
Thanks again for your assistance.
daniel
Extracting duplicate records
I am working on an inherited database and trying to add indexes on existing
tables. A few of the tables have duplicate entries that cause the
application of the indices to fail. I know that on some of the tables I can
simply copy the table, delete the records apply the index import from old
table and only unique records will make it into the table. ( I am correct
here aren't I?) However...
On a couple of tables I need to actually "see" the duplicate records so that
I can choose which records to keep and do a manual deletion of the duplicate
records.
There are no PK's on the tables in question and the indices I need to put in
place are, for example, on 6 of 15 columns. I need to find the dupes and
view the information in non-indexed columns to make my decision on which to
keep and which to delete.
Does this make any sense? Can anyone help me out with how I would go about
identifying the duplicate records?
Thanks in advance.
danieldaniel wrote:
> I am working on an inherited database and trying to add indexes on
> existing tables. A few of the tables have duplicate entries that
> cause the application of the indices to fail. I know that on some of
> the tables I can simply copy the table, delete the records apply the
> index import from old table and only unique records will make it into
> the table. ( I am correct here aren't I?) However...
> On a couple of tables I need to actually "see" the duplicate records
> so that I can choose which records to keep and do a manual deletion
> of the duplicate records.
> There are no PK's on the tables in question and the indices I need to
> put in place are, for example, on 6 of 15 columns. I need to find the
> dupes and view the information in non-indexed columns to make my
> decision on which to keep and which to delete.
> Does this make any sense? Can anyone help me out with how I would go
> about identifying the duplicate records?
> Thanks in advance.
> daniel
You can group by the columns to be indexed and use a having clause to
return the dupes. Not sure of your table size, so the query may be CPU
and tiem consuming:
For example:
Select a, b, c, d
From MyTable
Group By a, b, c, d
Having COUNT(*) > 1
To get rid of the dupes, you could simply create another table and add a
clustered index with the ignore dup keys flag and insert the rows. But
you wouldn't have much control over which dupes were removed.
David Gugick
Imceda Software
www.imceda.com|||"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
> daniel wrote:
> > I am working on an inherited database and trying to add indexes on
> > existing tables. A few of the tables have duplicate entries that
> > cause the application of the indices to fail. I know that on some of
> > the tables I can simply copy the table, delete the records apply the
> > index import from old table and only unique records will make it into
> > the table. ( I am correct here aren't I?) However...
> >
> > On a couple of tables I need to actually "see" the duplicate records
> > so that I can choose which records to keep and do a manual deletion
> > of the duplicate records.
> >
> > There are no PK's on the tables in question and the indices I need to
> > put in place are, for example, on 6 of 15 columns. I need to find the
> > dupes and view the information in non-indexed columns to make my
> > decision on which to keep and which to delete.
> >
> > Does this make any sense? Can anyone help me out with how I would go
> > about identifying the duplicate records?
> >
> > Thanks in advance.
> >
> > daniel
> You can group by the columns to be indexed and use a having clause to
> return the dupes. Not sure of your table size, so the query may be CPU
> and tiem consuming:
> For example:
> Select a, b, c, d
> From MyTable
> Group By a, b, c, d
> Having COUNT(*) > 1
> To get rid of the dupes, you could simply create another table and add a
> clustered index with the ignore dup keys flag and insert the rows. But
> you wouldn't have much control over which dupes were removed.
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
Thanks David but I guess maybe I've misunderstood the error message from
SQL.(?) I queried one of the tables on which I receive the indexing error,
with the query you provided, and it returned 0 records. Perhaps providing
the error message will allow someone more knowledgeable than I to tell me
I'm interpreting it wrong. The message is as follows:
"Server: Msg 1505, Level 16, State 1
CREATE UNIQUE INDEX terminated because a duplicate key was found for index
id 9. Most significant primary key is '115040'"
Does this not mean there are duplicate row entries based on the attempted
index columns? If so why do I get a return of 0 records with the query
provided below?
Any ideas anyone?|||daniel wrote:
> Thanks David but I guess maybe I've misunderstood the error message
> from SQL.(?) I queried one of the tables on which I receive the
> indexing error, with the query you provided, and it returned 0
> records. Perhaps providing the error message will allow someone more
> knowledgeable than I to tell me I'm interpreting it wrong. The
> message is as follows:
> "Server: Msg 1505, Level 16, State 1
> CREATE UNIQUE INDEX terminated because a duplicate key was found for
> index id 9. Most significant primary key is '115040'"
> Does this not mean there are duplicate row entries based on the
> attempted index columns? If so why do I get a return of 0 records
> with the query provided below?
> Any ideas anyone?
You'll need to provide the query, the table ddl, and the index statement
in order to tell what's going on.
David Gugick
Imceda Software
www.imceda.com|||You might find this interesting.
http://www.15seconds.com/issue/011009.htm
Ben Miller
"daniel" <dhagwood@.hotmail.com> wrote in message
news:OVAIxhgCFHA.560@.TK2MSFTNGP15.phx.gbl...
>I am working on an inherited database and trying to add indexes on existing
> tables. A few of the tables have duplicate entries that cause the
> application of the indices to fail. I know that on some of the tables I
> can
> simply copy the table, delete the records apply the index import from old
> table and only unique records will make it into the table. ( I am correct
> here aren't I?) However...
> On a couple of tables I need to actually "see" the duplicate records so
> that
> I can choose which records to keep and do a manual deletion of the
> duplicate
> records.
> There are no PK's on the tables in question and the indices I need to put
> in
> place are, for example, on 6 of 15 columns. I need to find the dupes and
> view the information in non-indexed columns to make my decision on which
> to
> keep and which to delete.
> Does this make any sense? Can anyone help me out with how I would go about
> identifying the duplicate records?
> Thanks in advance.
> daniel
>|||daniel wrote:
> "David Gugick" <davidg-nospam@.imceda.com> wrote in message
> news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
>> daniel wrote:
>> I am working on an inherited database and trying to add indexes on
>> existing tables. A few of the tables have duplicate entries that
>> cause the application of the indices to fail. I know that on some of
>> the tables I can simply copy the table, delete the records apply the
>> index import from old table and only unique records will make it
>> into the table. ( I am correct here aren't I?) However...
>> On a couple of tables I need to actually "see" the duplicate records
>> so that I can choose which records to keep and do a manual deletion
>> of the duplicate records.
>> There are no PK's on the tables in question and the indices I need
>> to put in place are, for example, on 6 of 15 columns. I need to
>> find the dupes and view the information in non-indexed columns to
>> make my decision on which to keep and which to delete.
>> Does this make any sense? Can anyone help me out with how I would go
>> about identifying the duplicate records?
>> Thanks in advance.
>> daniel
>> You can group by the columns to be indexed and use a having clause to
>> return the dupes. Not sure of your table size, so the query may be
>> CPU and tiem consuming:
>> For example:
>> Select a, b, c, d
>> From MyTable
>> Group By a, b, c, d
>> Having COUNT(*) > 1
>> To get rid of the dupes, you could simply create another table and
>> add a clustered index with the ignore dup keys flag and insert the
>> rows. But you wouldn't have much control over which dupes were
>> removed.
>>
>> --
>> David Gugick
>> Imceda Software
>> www.imceda.com
> Thanks David but I guess maybe I've misunderstood the error message
> from SQL.(?) I queried one of the tables on which I receive the
> indexing error, with the query you provided, and it returned 0
> records. Perhaps providing the error message will allow someone more
> knowledgeable than I to tell me I'm interpreting it wrong. The
> message is as follows:
> "Server: Msg 1505, Level 16, State 1
> CREATE UNIQUE INDEX terminated because a duplicate key was found for
> index id 9. Most significant primary key is '115040'"
> Does this not mean there are duplicate row entries based on the
> attempted index columns? If so why do I get a return of 0 records
> with the query provided below?
> Any ideas anyone?
This works for me:
create table #a (col1 int, col2 int, col3 int)
insert into #a values (1, 2, 3)
insert into #a values (1, 2, 3)
insert into #a values (2, 2, 3)
insert into #a values (3, 2, 1)
insert into #a values (3, 2, 1)
insert into #a values (3, 2, 1)
Select count(*) as "# Dupes", col1, col2, col3
From #a
Group By col1, col2, col3
Having count(*) > 1
# Dupes col1 col2 col3
-- -- -- --
2 1 2 3
3 3 2 1
David Gugick
Imceda Software
www.imceda.com|||"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:ea8zyRlCFHA.2676@.TK2MSFTNGP12.phx.gbl...
> daniel wrote:
> > "David Gugick" <davidg-nospam@.imceda.com> wrote in message
> > news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
> >> daniel wrote:
> >> I am working on an inherited database and trying to add indexes on
> >> existing tables. A few of the tables have duplicate entries that
> >> cause the application of the indices to fail. I know that on some of
> >> the tables I can simply copy the table, delete the records apply the
> >> index import from old table and only unique records will make it
> >> into the table. ( I am correct here aren't I?) However...
> >>
> >> On a couple of tables I need to actually "see" the duplicate records
> >> so that I can choose which records to keep and do a manual deletion
> >> of the duplicate records.
> >>
> >> There are no PK's on the tables in question and the indices I need
> >> to put in place are, for example, on 6 of 15 columns. I need to
> >> find the dupes and view the information in non-indexed columns to
> >> make my decision on which to keep and which to delete.
> >>
> >> Does this make any sense? Can anyone help me out with how I would go
> >> about identifying the duplicate records?
> >>
> >> Thanks in advance.
> >>
> >> daniel
> >>
> >> You can group by the columns to be indexed and use a having clause to
> >> return the dupes. Not sure of your table size, so the query may be
> >> CPU and tiem consuming:
> >>
> >> For example:
> >>
> >> Select a, b, c, d
> >> From MyTable
> >> Group By a, b, c, d
> >> Having COUNT(*) > 1
> >>
> >> To get rid of the dupes, you could simply create another table and
> >> add a clustered index with the ignore dup keys flag and insert the
> >> rows. But you wouldn't have much control over which dupes were
> >> removed.
> >>
> >>
> >> --
> >> David Gugick
> >> Imceda Software
> >> www.imceda.com
> >>
> >
> > Thanks David but I guess maybe I've misunderstood the error message
> > from SQL.(?) I queried one of the tables on which I receive the
> > indexing error, with the query you provided, and it returned 0
> > records. Perhaps providing the error message will allow someone more
> > knowledgeable than I to tell me I'm interpreting it wrong. The
> > message is as follows:
> >
> > "Server: Msg 1505, Level 16, State 1
> > CREATE UNIQUE INDEX terminated because a duplicate key was found for
> > index id 9. Most significant primary key is '115040'"
> >
> > Does this not mean there are duplicate row entries based on the
> > attempted index columns? If so why do I get a return of 0 records
> > with the query provided below?
> >
> > Any ideas anyone?
> This works for me:
> create table #a (col1 int, col2 int, col3 int)
>
> insert into #a values (1, 2, 3)
> insert into #a values (1, 2, 3)
> insert into #a values (2, 2, 3)
> insert into #a values (3, 2, 1)
> insert into #a values (3, 2, 1)
> insert into #a values (3, 2, 1)
>
> Select count(*) as "# Dupes", col1, col2, col3
> From #a
> Group By col1, col2, col3
> Having count(*) > 1
> # Dupes col1 col2 col3
> -- -- -- --
> 2 1 2 3
> 3 3 2 1
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
Thanks again David. I finally got the query to work. Knowing that there were
duplicates I did some intensive eye straining work to figure out the
problem. Appears as though some of the "duplicate" records also contained
unexpected data, empty string vs. NULL value and this was thowing off the
query. I updated the fields as necessary and everything worked out fine.
Thanks again for your assistance.
daniel
tables. A few of the tables have duplicate entries that cause the
application of the indices to fail. I know that on some of the tables I can
simply copy the table, delete the records apply the index import from old
table and only unique records will make it into the table. ( I am correct
here aren't I?) However...
On a couple of tables I need to actually "see" the duplicate records so that
I can choose which records to keep and do a manual deletion of the duplicate
records.
There are no PK's on the tables in question and the indices I need to put in
place are, for example, on 6 of 15 columns. I need to find the dupes and
view the information in non-indexed columns to make my decision on which to
keep and which to delete.
Does this make any sense? Can anyone help me out with how I would go about
identifying the duplicate records?
Thanks in advance.
danieldaniel wrote:
> I am working on an inherited database and trying to add indexes on
> existing tables. A few of the tables have duplicate entries that
> cause the application of the indices to fail. I know that on some of
> the tables I can simply copy the table, delete the records apply the
> index import from old table and only unique records will make it into
> the table. ( I am correct here aren't I?) However...
> On a couple of tables I need to actually "see" the duplicate records
> so that I can choose which records to keep and do a manual deletion
> of the duplicate records.
> There are no PK's on the tables in question and the indices I need to
> put in place are, for example, on 6 of 15 columns. I need to find the
> dupes and view the information in non-indexed columns to make my
> decision on which to keep and which to delete.
> Does this make any sense? Can anyone help me out with how I would go
> about identifying the duplicate records?
> Thanks in advance.
> daniel
You can group by the columns to be indexed and use a having clause to
return the dupes. Not sure of your table size, so the query may be CPU
and tiem consuming:
For example:
Select a, b, c, d
From MyTable
Group By a, b, c, d
Having COUNT(*) > 1
To get rid of the dupes, you could simply create another table and add a
clustered index with the ignore dup keys flag and insert the rows. But
you wouldn't have much control over which dupes were removed.
David Gugick
Imceda Software
www.imceda.com|||"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
> daniel wrote:
> > I am working on an inherited database and trying to add indexes on
> > existing tables. A few of the tables have duplicate entries that
> > cause the application of the indices to fail. I know that on some of
> > the tables I can simply copy the table, delete the records apply the
> > index import from old table and only unique records will make it into
> > the table. ( I am correct here aren't I?) However...
> >
> > On a couple of tables I need to actually "see" the duplicate records
> > so that I can choose which records to keep and do a manual deletion
> > of the duplicate records.
> >
> > There are no PK's on the tables in question and the indices I need to
> > put in place are, for example, on 6 of 15 columns. I need to find the
> > dupes and view the information in non-indexed columns to make my
> > decision on which to keep and which to delete.
> >
> > Does this make any sense? Can anyone help me out with how I would go
> > about identifying the duplicate records?
> >
> > Thanks in advance.
> >
> > daniel
> You can group by the columns to be indexed and use a having clause to
> return the dupes. Not sure of your table size, so the query may be CPU
> and tiem consuming:
> For example:
> Select a, b, c, d
> From MyTable
> Group By a, b, c, d
> Having COUNT(*) > 1
> To get rid of the dupes, you could simply create another table and add a
> clustered index with the ignore dup keys flag and insert the rows. But
> you wouldn't have much control over which dupes were removed.
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
Thanks David but I guess maybe I've misunderstood the error message from
SQL.(?) I queried one of the tables on which I receive the indexing error,
with the query you provided, and it returned 0 records. Perhaps providing
the error message will allow someone more knowledgeable than I to tell me
I'm interpreting it wrong. The message is as follows:
"Server: Msg 1505, Level 16, State 1
CREATE UNIQUE INDEX terminated because a duplicate key was found for index
id 9. Most significant primary key is '115040'"
Does this not mean there are duplicate row entries based on the attempted
index columns? If so why do I get a return of 0 records with the query
provided below?
Any ideas anyone?|||daniel wrote:
> Thanks David but I guess maybe I've misunderstood the error message
> from SQL.(?) I queried one of the tables on which I receive the
> indexing error, with the query you provided, and it returned 0
> records. Perhaps providing the error message will allow someone more
> knowledgeable than I to tell me I'm interpreting it wrong. The
> message is as follows:
> "Server: Msg 1505, Level 16, State 1
> CREATE UNIQUE INDEX terminated because a duplicate key was found for
> index id 9. Most significant primary key is '115040'"
> Does this not mean there are duplicate row entries based on the
> attempted index columns? If so why do I get a return of 0 records
> with the query provided below?
> Any ideas anyone?
You'll need to provide the query, the table ddl, and the index statement
in order to tell what's going on.
David Gugick
Imceda Software
www.imceda.com|||You might find this interesting.
http://www.15seconds.com/issue/011009.htm
Ben Miller
"daniel" <dhagwood@.hotmail.com> wrote in message
news:OVAIxhgCFHA.560@.TK2MSFTNGP15.phx.gbl...
>I am working on an inherited database and trying to add indexes on existing
> tables. A few of the tables have duplicate entries that cause the
> application of the indices to fail. I know that on some of the tables I
> can
> simply copy the table, delete the records apply the index import from old
> table and only unique records will make it into the table. ( I am correct
> here aren't I?) However...
> On a couple of tables I need to actually "see" the duplicate records so
> that
> I can choose which records to keep and do a manual deletion of the
> duplicate
> records.
> There are no PK's on the tables in question and the indices I need to put
> in
> place are, for example, on 6 of 15 columns. I need to find the dupes and
> view the information in non-indexed columns to make my decision on which
> to
> keep and which to delete.
> Does this make any sense? Can anyone help me out with how I would go about
> identifying the duplicate records?
> Thanks in advance.
> daniel
>|||daniel wrote:
> "David Gugick" <davidg-nospam@.imceda.com> wrote in message
> news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
>> daniel wrote:
>> I am working on an inherited database and trying to add indexes on
>> existing tables. A few of the tables have duplicate entries that
>> cause the application of the indices to fail. I know that on some of
>> the tables I can simply copy the table, delete the records apply the
>> index import from old table and only unique records will make it
>> into the table. ( I am correct here aren't I?) However...
>> On a couple of tables I need to actually "see" the duplicate records
>> so that I can choose which records to keep and do a manual deletion
>> of the duplicate records.
>> There are no PK's on the tables in question and the indices I need
>> to put in place are, for example, on 6 of 15 columns. I need to
>> find the dupes and view the information in non-indexed columns to
>> make my decision on which to keep and which to delete.
>> Does this make any sense? Can anyone help me out with how I would go
>> about identifying the duplicate records?
>> Thanks in advance.
>> daniel
>> You can group by the columns to be indexed and use a having clause to
>> return the dupes. Not sure of your table size, so the query may be
>> CPU and tiem consuming:
>> For example:
>> Select a, b, c, d
>> From MyTable
>> Group By a, b, c, d
>> Having COUNT(*) > 1
>> To get rid of the dupes, you could simply create another table and
>> add a clustered index with the ignore dup keys flag and insert the
>> rows. But you wouldn't have much control over which dupes were
>> removed.
>>
>> --
>> David Gugick
>> Imceda Software
>> www.imceda.com
> Thanks David but I guess maybe I've misunderstood the error message
> from SQL.(?) I queried one of the tables on which I receive the
> indexing error, with the query you provided, and it returned 0
> records. Perhaps providing the error message will allow someone more
> knowledgeable than I to tell me I'm interpreting it wrong. The
> message is as follows:
> "Server: Msg 1505, Level 16, State 1
> CREATE UNIQUE INDEX terminated because a duplicate key was found for
> index id 9. Most significant primary key is '115040'"
> Does this not mean there are duplicate row entries based on the
> attempted index columns? If so why do I get a return of 0 records
> with the query provided below?
> Any ideas anyone?
This works for me:
create table #a (col1 int, col2 int, col3 int)
insert into #a values (1, 2, 3)
insert into #a values (1, 2, 3)
insert into #a values (2, 2, 3)
insert into #a values (3, 2, 1)
insert into #a values (3, 2, 1)
insert into #a values (3, 2, 1)
Select count(*) as "# Dupes", col1, col2, col3
From #a
Group By col1, col2, col3
Having count(*) > 1
# Dupes col1 col2 col3
-- -- -- --
2 1 2 3
3 3 2 1
David Gugick
Imceda Software
www.imceda.com|||"David Gugick" <davidg-nospam@.imceda.com> wrote in message
news:ea8zyRlCFHA.2676@.TK2MSFTNGP12.phx.gbl...
> daniel wrote:
> > "David Gugick" <davidg-nospam@.imceda.com> wrote in message
> > news:OrwxbzgCFHA.444@.TK2MSFTNGP15.phx.gbl...
> >> daniel wrote:
> >> I am working on an inherited database and trying to add indexes on
> >> existing tables. A few of the tables have duplicate entries that
> >> cause the application of the indices to fail. I know that on some of
> >> the tables I can simply copy the table, delete the records apply the
> >> index import from old table and only unique records will make it
> >> into the table. ( I am correct here aren't I?) However...
> >>
> >> On a couple of tables I need to actually "see" the duplicate records
> >> so that I can choose which records to keep and do a manual deletion
> >> of the duplicate records.
> >>
> >> There are no PK's on the tables in question and the indices I need
> >> to put in place are, for example, on 6 of 15 columns. I need to
> >> find the dupes and view the information in non-indexed columns to
> >> make my decision on which to keep and which to delete.
> >>
> >> Does this make any sense? Can anyone help me out with how I would go
> >> about identifying the duplicate records?
> >>
> >> Thanks in advance.
> >>
> >> daniel
> >>
> >> You can group by the columns to be indexed and use a having clause to
> >> return the dupes. Not sure of your table size, so the query may be
> >> CPU and tiem consuming:
> >>
> >> For example:
> >>
> >> Select a, b, c, d
> >> From MyTable
> >> Group By a, b, c, d
> >> Having COUNT(*) > 1
> >>
> >> To get rid of the dupes, you could simply create another table and
> >> add a clustered index with the ignore dup keys flag and insert the
> >> rows. But you wouldn't have much control over which dupes were
> >> removed.
> >>
> >>
> >> --
> >> David Gugick
> >> Imceda Software
> >> www.imceda.com
> >>
> >
> > Thanks David but I guess maybe I've misunderstood the error message
> > from SQL.(?) I queried one of the tables on which I receive the
> > indexing error, with the query you provided, and it returned 0
> > records. Perhaps providing the error message will allow someone more
> > knowledgeable than I to tell me I'm interpreting it wrong. The
> > message is as follows:
> >
> > "Server: Msg 1505, Level 16, State 1
> > CREATE UNIQUE INDEX terminated because a duplicate key was found for
> > index id 9. Most significant primary key is '115040'"
> >
> > Does this not mean there are duplicate row entries based on the
> > attempted index columns? If so why do I get a return of 0 records
> > with the query provided below?
> >
> > Any ideas anyone?
> This works for me:
> create table #a (col1 int, col2 int, col3 int)
>
> insert into #a values (1, 2, 3)
> insert into #a values (1, 2, 3)
> insert into #a values (2, 2, 3)
> insert into #a values (3, 2, 1)
> insert into #a values (3, 2, 1)
> insert into #a values (3, 2, 1)
>
> Select count(*) as "# Dupes", col1, col2, col3
> From #a
> Group By col1, col2, col3
> Having count(*) > 1
> # Dupes col1 col2 col3
> -- -- -- --
> 2 1 2 3
> 3 3 2 1
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>
Thanks again David. I finally got the query to work. Knowing that there were
duplicates I did some intensive eye straining work to figure out the
problem. Appears as though some of the "duplicate" records also contained
unexpected data, empty string vs. NULL value and this was thowing off the
query. I updated the fields as necessary and everything worked out fine.
Thanks again for your assistance.
daniel
Extracting Data from Badly Designed Table
I've inherited a crappy data structure on a project, see table below
...
CREATE TABLE [REPL_COMPONENT]
(
[ReplacementID] [int] IDENTITY (1, 1) NOT NULL ,
[RepairID] [int] NULL ,
[Repl1] [int] NULL ,
[ReplCost1] [int] NULL ,
[ReplType1] [int] NULL ,
[Repl2] [int] NULL ,
[ReplCost2] [int] NULL ,
[ReplType2] [int] NULL ,
[Repl3] [int] NULL ,
[ReplCost3] [int] NULL ,
[ReplType3] [int] NULL ,
[Repl4] [int] NULL ,
[ReplCost4] [int] NULL ,
[ReplType4] [int] NULL
)
I'd like to write a query to extract this data in normal form. So, for
each row in this table, my query would return 4 rows.
Given, this one row of dummy data:
INSERT INTO [REPL_COMPONENT]([RepairID], [Repl1], [ReplCost1],
[ReplType1], [Repl2], [ReplCost2], [ReplType2], [Repl3], [ReplCost3],
[ReplType3], [Repl4], [ReplCost4], [ReplType4])
VALUES(53, 11, 11, 11, 22, 22, 22, 33, 33, 33, 44, 44, 44)
The new query would return 4 rows
1 53 11 11 11
1 53 22 22 22
1 53 33 33 33
1 53 44 44 44
Appreciate your help!Simple:
select ReplacementID, RepairID, Repl1, ReplCost1, ReplType1 from
REPL_COMPONENT
union all
select ReplacementID, RepairID, Repl2, ReplCost2, ReplType2 from
REPL_COMPONENT
union all
select ReplacementID, RepairID, Repl3, ReplCost3, ReplType3 from
REPL_COMPONENT
union all
select ReplacementID, RepairID, Repl4, ReplCost4, ReplType4 from
REPL_COMPONENT
*mike hodgson*
http://sqlnerd.blogspot.com
George wrote:
>I've inherited a crappy data structure on a project, see table below
>...
>CREATE TABLE [REPL_COMPONENT]
> (
>[ReplacementID] [int] IDENTITY (1, 1) NOT NULL ,
>[RepairID] [int] NULL ,
>[Repl1] [int] NULL ,
>[ReplCost1] [int] NULL ,
>[ReplType1] [int] NULL ,
>[Repl2] [int] NULL ,
>[ReplCost2] [int] NULL ,
>[ReplType2] [int] NULL ,
>[Repl3] [int] NULL ,
>[ReplCost3] [int] NULL ,
>[ReplType3] [int] NULL ,
>[Repl4] [int] NULL ,
>[ReplCost4] [int] NULL ,
>[ReplType4] [int] NULL
> )
>I'd like to write a query to extract this data in normal form. So, for
>each row in this table, my query would return 4 rows.
>Given, this one row of dummy data:
>INSERT INTO [REPL_COMPONENT]([RepairID], [Repl1], [ReplCost1],
>[ReplType1], [Repl2], [ReplCost2], [ReplType2], [Repl3], [ReplCost3],
>[ReplType3], [Repl4], [ReplCost4], [ReplType4])
>VALUES(53, 11, 11, 11, 22, 22, 22, 33, 33, 33, 44, 44, 44)
>The new query would return 4 rows
>1 53 11 11 11
>1 53 22 22 22
>1 53 33 33 33
>1 53 44 44 44
>Appreciate your help!
>
>|||George,
rather than try to work round a 'crappy' data structure, why not simply
re-invent it and normalize the table at an early stage. Log-tern benifits
usually outweigh the short-term effort.
Tony
"George" wrote:
> I've inherited a crappy data structure on a project, see table below
> ...
> CREATE TABLE [REPL_COMPONENT]
> (
> [ReplacementID] [int] IDENTITY (1, 1) NOT NULL ,
> [RepairID] [int] NULL ,
> [Repl1] [int] NULL ,
> [ReplCost1] [int] NULL ,
> [ReplType1] [int] NULL ,
> [Repl2] [int] NULL ,
> [ReplCost2] [int] NULL ,
> [ReplType2] [int] NULL ,
> [Repl3] [int] NULL ,
> [ReplCost3] [int] NULL ,
> [ReplType3] [int] NULL ,
> [Repl4] [int] NULL ,
> [ReplCost4] [int] NULL ,
> [ReplType4] [int] NULL
> )
> I'd like to write a query to extract this data in normal form. So, for
> each row in this table, my query would return 4 rows.
> Given, this one row of dummy data:
> INSERT INTO [REPL_COMPONENT]([RepairID], [Repl1], [ReplCost1],
> [ReplType1], [Repl2], [ReplCost2], [ReplType2], [Repl3], [ReplCost3],
> [ReplType3], [Repl4], [ReplCost4], [ReplType4])
> VALUES(53, 11, 11, 11, 22, 22, 22, 33, 33, 33, 44, 44, 44)
> The new query would return 4 rows
> 1 53 11 11 11
> 1 53 22 22 22
> 1 53 33 33 33
> 1 53 44 44 44
> Appreciate your help!
>
...
CREATE TABLE [REPL_COMPONENT]
(
[ReplacementID] [int] IDENTITY (1, 1) NOT NULL ,
[RepairID] [int] NULL ,
[Repl1] [int] NULL ,
[ReplCost1] [int] NULL ,
[ReplType1] [int] NULL ,
[Repl2] [int] NULL ,
[ReplCost2] [int] NULL ,
[ReplType2] [int] NULL ,
[Repl3] [int] NULL ,
[ReplCost3] [int] NULL ,
[ReplType3] [int] NULL ,
[Repl4] [int] NULL ,
[ReplCost4] [int] NULL ,
[ReplType4] [int] NULL
)
I'd like to write a query to extract this data in normal form. So, for
each row in this table, my query would return 4 rows.
Given, this one row of dummy data:
INSERT INTO [REPL_COMPONENT]([RepairID], [Repl1], [ReplCost1],
[ReplType1], [Repl2], [ReplCost2], [ReplType2], [Repl3], [ReplCost3],
[ReplType3], [Repl4], [ReplCost4], [ReplType4])
VALUES(53, 11, 11, 11, 22, 22, 22, 33, 33, 33, 44, 44, 44)
The new query would return 4 rows
1 53 11 11 11
1 53 22 22 22
1 53 33 33 33
1 53 44 44 44
Appreciate your help!Simple:
select ReplacementID, RepairID, Repl1, ReplCost1, ReplType1 from
REPL_COMPONENT
union all
select ReplacementID, RepairID, Repl2, ReplCost2, ReplType2 from
REPL_COMPONENT
union all
select ReplacementID, RepairID, Repl3, ReplCost3, ReplType3 from
REPL_COMPONENT
union all
select ReplacementID, RepairID, Repl4, ReplCost4, ReplType4 from
REPL_COMPONENT
*mike hodgson*
http://sqlnerd.blogspot.com
George wrote:
>I've inherited a crappy data structure on a project, see table below
>...
>CREATE TABLE [REPL_COMPONENT]
> (
>[ReplacementID] [int] IDENTITY (1, 1) NOT NULL ,
>[RepairID] [int] NULL ,
>[Repl1] [int] NULL ,
>[ReplCost1] [int] NULL ,
>[ReplType1] [int] NULL ,
>[Repl2] [int] NULL ,
>[ReplCost2] [int] NULL ,
>[ReplType2] [int] NULL ,
>[Repl3] [int] NULL ,
>[ReplCost3] [int] NULL ,
>[ReplType3] [int] NULL ,
>[Repl4] [int] NULL ,
>[ReplCost4] [int] NULL ,
>[ReplType4] [int] NULL
> )
>I'd like to write a query to extract this data in normal form. So, for
>each row in this table, my query would return 4 rows.
>Given, this one row of dummy data:
>INSERT INTO [REPL_COMPONENT]([RepairID], [Repl1], [ReplCost1],
>[ReplType1], [Repl2], [ReplCost2], [ReplType2], [Repl3], [ReplCost3],
>[ReplType3], [Repl4], [ReplCost4], [ReplType4])
>VALUES(53, 11, 11, 11, 22, 22, 22, 33, 33, 33, 44, 44, 44)
>The new query would return 4 rows
>1 53 11 11 11
>1 53 22 22 22
>1 53 33 33 33
>1 53 44 44 44
>Appreciate your help!
>
>|||George,
rather than try to work round a 'crappy' data structure, why not simply
re-invent it and normalize the table at an early stage. Log-tern benifits
usually outweigh the short-term effort.
Tony
"George" wrote:
> I've inherited a crappy data structure on a project, see table below
> ...
> CREATE TABLE [REPL_COMPONENT]
> (
> [ReplacementID] [int] IDENTITY (1, 1) NOT NULL ,
> [RepairID] [int] NULL ,
> [Repl1] [int] NULL ,
> [ReplCost1] [int] NULL ,
> [ReplType1] [int] NULL ,
> [Repl2] [int] NULL ,
> [ReplCost2] [int] NULL ,
> [ReplType2] [int] NULL ,
> [Repl3] [int] NULL ,
> [ReplCost3] [int] NULL ,
> [ReplType3] [int] NULL ,
> [Repl4] [int] NULL ,
> [ReplCost4] [int] NULL ,
> [ReplType4] [int] NULL
> )
> I'd like to write a query to extract this data in normal form. So, for
> each row in this table, my query would return 4 rows.
> Given, this one row of dummy data:
> INSERT INTO [REPL_COMPONENT]([RepairID], [Repl1], [ReplCost1],
> [ReplType1], [Repl2], [ReplCost2], [ReplType2], [Repl3], [ReplCost3],
> [ReplType3], [Repl4], [ReplCost4], [ReplType4])
> VALUES(53, 11, 11, 11, 22, 22, 22, 33, 33, 33, 44, 44, 44)
> The new query would return 4 rows
> 1 53 11 11 11
> 1 53 22 22 22
> 1 53 33 33 33
> 1 53 44 44 44
> Appreciate your help!
>
Subscribe to:
Posts (Atom)