
Wednesday, March 7, 2012

Fact Tables and Clustered Indexes

A design question:
Say I have a medium-sized fact table w/ a handful of dimension key
columns (including a date) and a handful of measures.
I necessarily need a primary key on the composite dimension key
columns, and I don't know ahead of time which of my dimension key
column(s) will be the best candidate for a clustered index. I do plan
on putting non-clustered indexes on all my dimension key columns, and
the related dimension tables' key columns.
For the sake of argument, let's say we're not partitioning the fact
table.
Assume that new facts occur in time, the fact table grows with time,
and (nearly) all changes to the fact table occur as INSERTs.
Now, all things being equal, is there a benefit of adding a clustered
index to the fact table? Two options:
- Add an IDENTITY column, make it the primary key, and add the
clustered index to it.
- Add the clustered index on the date column, since it has a natural
order.
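A rough sketch of the two options in DDL; the table, column, and index names here are invented for illustration and are not from the schema being discussed:
-- Option 1: IDENTITY surrogate key as the clustered primary key
CREATE TABLE FactSales (
    FactSalesID int IDENTITY(1,1) NOT NULL,
    DateKey datetime NOT NULL,
    ProductKey int NOT NULL,
    StoreKey int NOT NULL,
    SalesAmount money NOT NULL,
    CONSTRAINT PK_FactSales PRIMARY KEY CLUSTERED (FactSalesID)
)
-- Option 2: keep the composite primary key nonclustered and cluster on the date
CREATE TABLE FactSales2 (
    DateKey datetime NOT NULL,
    ProductKey int NOT NULL,
    StoreKey int NOT NULL,
    SalesAmount money NOT NULL,
    CONSTRAINT PK_FactSales2 PRIMARY KEY NONCLUSTERED (DateKey, ProductKey, StoreKey)
)
CREATE CLUSTERED INDEX IX_FactSales2_DateKey ON FactSales2 (DateKey)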
Basically, I'm after two answers in this scenario:
- Is there a benefit to having a clustered index on a table when the
application doesn't 'really' call for one?
- If so, is it better to add an IDENTITY column (adding size to the
table) or to pick a naturally ordered dimension key? A random key?
The fact's composite key?
Thanks much.
Steven D. Clark
stevec@.clarkdev.com
"Steven Clark" <stevec@.clarkdev.com> wrote in message
news:d7740507.0407100710.2b1644b5@.posting.google.c om...
> Basically, I'm after two answers in this scenario:
> - Is there a benefit to having a clustered index on a table when the
> application doesn't 'really' call for one?
In my opinion, there is never a reason NOT to have one; it's a freebie,
basically, unlike nonclustered indexes. It doesn't take up any extra disc
space, and you might as well order the data on the disc somehow, rather than
letting the server take care of it... So I always make sure that every table
has one.

> - If so, is it better to add an IDENTITY column (adding size to the
> table) or to pick an naturally ordered dimension key? A random key?
> The fact's composite key?
Some tips for clustered indexes:
- They assist with range queries and grouping, so try to use them for
columns that will be used for those kinds of operations (>, <, BETWEEN, etc.),
or make the index composite in the same order that you'll be grouping. If you do
a composite, order the columns by selectivity, least selective first; this
will create a wider tree, which will result in somewhat quicker grouping.
- Clustering on a random key is a very bad idea, because it will cause a
lot of page splits, leading to fragmentation. This will slow your data
loads. It will also give you no query benefits at all. So you'll actually
lose on this option.
- Clustering on an IDENTITY key or a DATETIME column that's
automatically set to the date the row is inserted will actually speed up
inserts as it will create a hotspot at the end of the table. So you'll
never have page splits when inserting new data. This can definitely help
speed your data load! Clustering on an IDENTITY will usually not help too
much with queries as, in my experience, most grouping and range operations
don't consider surrogates. Depending on your app, clustering on a DATETIME
as I described can help a lot, as a lot of queries will request data between
two dates, greater than a date, etc.
- Finally, clustering on the composite of your dimensions may be helpful
if you're grouping on them or requesting ranges. However, in the latter
case, remember that a composite index will only be used for searching if the
first column is part of the search criteria, so try to choose one of your
dimensions that will always be searched on (if that exists in your
warehouse).
I hope that answered your questions? Post back if you need some
clarification or further assistance.
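A small illustration of the date-range point, assuming a fact table FactSales with a LoadDate datetime column and no existing clustered index (names invented):
CREATE CLUSTERED INDEX IX_FactSales_LoadDate ON FactSales (LoadDate)
-- Date-range queries like this can then seek directly on the clustered key:
SELECT ProductKey, SUM(SalesAmount) AS TotalSales
FROM FactSales
WHERE LoadDate >= '20040101' AND LoadDate < '20040201'
GROUP BY ProductKey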
|||> It doesn't take up any extra disc space,
That's not true. The clustered index uses the datapages as the leaves, so
that is disk space you use already anyway, but the nodes of the index still
take up extra space on disk. That doesn't mean btw that it is not a good
idea to have a clustered index on every table. In almost all cases the
performance improvement that a clustered index provides more than offsets
the extra disk space used.
Jacco Schalkwijk
SQL Server MVP
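One way to see that extra space is sp_spaceused, which reports the data pages (the leaf level of the clustered index) and the index pages separately; the table name here is just an example:
EXEC sp_spaceused 'FactSales'
-- 'data' is the size of the data pages (the leaf level of the clustered index);
-- 'index_size' includes the clustered index's non-leaf nodes plus all nonclustered indexes.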
"Adam Machanic" <amachanic@.hotmail._removetoemail_.com> wrote in message
news:%23rDxElvZEHA.2388@.TK2MSFTNGP11.phx.gbl...
> "Steven Clark" <stevec@.clarkdev.com> wrote in message
> news:d7740507.0407100710.2b1644b5@.posting.google.c om...
> In my opinion, there is never a reason NOT to have one; it's a
freebie,
> basically, unlike clustered indexes. It doesn't take up any extra disc
> space, and you might as well order the data on the disc somehow, rather
than
> letting the server take care of it... So I always make sure that every
table
> has one.
>
> Some tips for clustered indexes:
> - They assist with range queries and grouping. So try to use them for
> columns that will be used for those kinds of operations (>, <, BETWEEN,
etc,
> or make it composite in the same order that you'll be grouping. If you do
> composite, order the columns by selectivity, least selective first. This
> will create a wider tree, which will result in somewhat quicker grouping.)
> - Clustering on a random key is a very bad idea, because it will cause
a
> lot of page splits, leading to fragmentation. This will slow your data
> loads. It will also give you no query benefits at all. So you'll
actually
> lose on this option.
> - Clustering on an IDENTITY key or a DATETIME column that's
> automatically set to the date the row is inserted will actually speed up
> inserts as it will create a hotspot at the end of the table. So you'll
> never have page splits when inserting new data. This can definitely help
> speed your data load! Clustering on an IDENTITY will usually not help too
> much with queries as, in my experience, most grouping and range operations
> don't consider surrogates. Depending on your app, clustering on a
DATETIME
> as I described can help a lot, as a lot of queries will request data
between
> two dates, greater than a date, etc.
> - Finally, clustering on the composite of your dimensions may be
helpful
> if you're grouping on them or requesting ranges. However, in the latter
> case, remember that a composite index will only be used for searching if
the
> first column is part of the search criteria, so try to choose one of your
> dimensions that will always be searched on (if that exists in your
> warehouse).
> I hope that answered your questions? Post back if you need some
> clarification or further assistance.
>
>
|||"Jacco Schalkwijk" <jacco.please.reply@.to.newsgroups.mvps.org.invalid > wrote
in message news:%23cIz3GDaEHA.2544@.TK2MSFTNGP10.phx.gbl...
> That's not true. The clustered index uses the datapages as the leaves, so
> that is disk space you use already anyway, but the nodes of the index still
> take up extra space on disk. That doesn't mean btw that it is not a good
> idea to have a clustered index on every table. In almost all cases the
> performance improvement that a clustered index provides more than offsets
> the extra disk space used.
Thanks for the clarification on that... I also didn't think about fill
factor, which could also create the impression of more disc space being
used.


Fact Table Design for Hours Worked

Hi,
I am trying to design a cube with Date, Time and Employee dimension
which can answer following questions:
- how many employees were working at 10am on 15 June 2007?
- how many man hours were paid between 10am-11am on 15 June 2007? If
somebody has punched in at 10:30 then he should be counted as .5 man
hours.
One solution I can think of is every minute of every hour for every
employee entered as a row in the fact table. In this case if an employee
works for 8 hours that will create 8*60 = 480 records in the fact table. So
this may not be a good solution.
Ideally it would be better if we can enter one record in the fact table
with start time and stop time. But how can we put in a corrective
transaction if we need to cancel or change the original transactions?
Any help will be appreciated.
Thanks
Mahesh|||You can use 2 time dimensions:
Start Time and End Time
to answer:
> - how many employees were working at 10am on 15 June 2007? I
you select the day and all the start times up to 10am and all the end times
after 10am and you have the result.
for the second one, you have to create some formula.
Maybe you can create a new measure group which contains only the start time
/ end time and, for each combination, the duration in minutes of the
selected period (i.e. you select 10h30 and 11h45, the measure returns
75 minutes) (call this measure NoMinutesInPeriod).
Now, in the cube, you have to create a formula which applies a ratio if the
start time or the end time falls within the selected period of time:
measure.NoMinutesInPeriod / (datediff('m', starttime, endtime))
The datediff formula will be more complicated than a simple datediff,
but maybe this idea is a good starting point for you.
or anybody else has a better solution... please share your ideas :-)
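The overlap/ratio idea can be prototyped in plain T-SQL against the relational source before building the cube formula; the table and column names below are assumptions, not from Mahesh's schema:
DECLARE @WindowStart datetime, @WindowEnd datetime
SET @WindowStart = '20070615 10:00'
SET @WindowEnd   = '20070615 11:00'
-- Man hours per employee within the window: clip each punch interval to the
-- window, then convert the overlapping minutes to hours.
SELECT EmployeeID,
       DATEDIFF(minute,
                CASE WHEN PunchIn  > @WindowStart THEN PunchIn  ELSE @WindowStart END,
                CASE WHEN PunchOut < @WindowEnd   THEN PunchOut ELSE @WindowEnd   END) / 60.0 AS ManHours
FROM TimeFact
WHERE PunchIn < @WindowEnd
  AND PunchOut > @WindowStart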
"Mahesh" <shrestha.mahesh@.gmail.com> wrote in message
news:1182525993.318614.225100@.a26g2000pre.googlegroups.com...
> Hi,
> I am trying to design a cube with Date, Time and Employee dimension
> which can answer following questions:
> - how many employees were working at 10am on 15 June 2007? I
> - how many man hours were pais between 10am -11am on 15 June 2007. If
> somebody has punched in at 10:30 then he should be counted and .5 man
> hours.
> One solution i can think of is every minute of everyhour for every
> employee entered as a row in fact table. In this case if a employees
> works for 8 hours that will create 8*60= 480 records in fact table. So
> this may not be a good solution.
> Ideally it would be better if we can enter one record in fact table
> with start time and stop time. But, how can we put corrective
> transaction if we need to cancel or change the original transactions.
> Any help will be appreciated.
> Thanks
> Mahesh
|||Thanks Jeje. Your solution seems to be the best solution. The only
problem with this is the difficulty of adding corrective transactions.
For example, say hours worked are entered as 8am - 5pm on June 5 for an
employee. If on June 25 this is changed to 10am - 5pm, how can we
enter a new corrective transaction on June 25 without changing the
original transaction? We do not want to change the original
transaction, so that we can use incremental processing on the cubes.
Any response will be very much appreciated.
Mahesh
|||If you want to "cancel" a transaction already loaded in a cube,
you have to add another "transaction" with "negative values"
for example:
initially you load the cube using this:
June 25 / 8am - 5pm / +1 (+1 work)
you want to "cancel" this transaction, and replace by a new one:
June 25 / 8am - 5pm / -1 (-1 work)
June 25 / 10am - 5pm / +1 (+1 work)
so the total in the cube is:
June 25 / 8am - 5pm / 0 (+1 + -1 = 0)
June 25 / 10am - 5pm / +1
but using this approach you can't use the "non empty" because the cell is
not empty, the cell contains a total of 0.
Also remember to do a full process on a regular basis, because incremental
loading reduces the cube performance.
good luck!
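A minimal sketch of those offsetting rows in the relational fact table that feeds the cube (table and column names are made up):
-- original load
INSERT INTO FactHoursWorked (EmployeeKey, WorkDate, StartTime, EndTime, WorkCount)
VALUES (42, '20070625', '20070625 08:00', '20070625 17:00', 1)
-- later correction: offset the original row, then add the corrected one
INSERT INTO FactHoursWorked (EmployeeKey, WorkDate, StartTime, EndTime, WorkCount)
VALUES (42, '20070625', '20070625 08:00', '20070625 17:00', -1)
INSERT INTO FactHoursWorked (EmployeeKey, WorkDate, StartTime, EndTime, WorkCount)
VALUES (42, '20070625', '20070625 10:00', '20070625 17:00', 1)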
"Mahesh" <shrestha.mahesh@.gmail.com> wrote in message
news:1182783852.227881.60220@.g37g2000prf.googlegroups.com...
> Thanks Jeje. Your solution seems to be the best solution. The only
> problem with this is the difficulty to add corrective transactions.
> For example, if hours worked is entered as 8am - 5pm on June 5 for an
> employee. On June 25 if it is changed to be 10am - 5pm, how can we
> enter a new corrective transaction on June 25 without changing the
> original transaction. We do not want to change the original
> transaction to use Incrementa Processing on cubes.
> Any response will be very much appreciated.
> Mahesh
>
> On Jun 22, 11:46 am, "Jeje" <willg...@.hotmail.com> wrote:
|||Thank you very much for your kind response, Jeje.
Mahesh

Sunday, February 26, 2012

Extremely Slow Table

Hi,

I have a table defined as
CREATE TABLE [SH_Data] (
[ID] [int] IDENTITY (1, 1) NOT NULL ,
[Date] [datetime] NULL ,
[Time] [datetime] NULL ,
[TroubleshootId] [int] NOT NULL ,
[ReasonID] [int] NULL ,
[reason_desc] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CS_AS
NULL ,
[maj_reason_id] [int] NULL ,
[maj_reason_desc] [nvarchar] (255) COLLATE
SQL_Latin1_General_CP1_CS_AS NULL ,
[ActionID] [int] NULL ,
[action_desc] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CS_AS
NULL ,
[WinningCaseTitle] [nvarchar] (255) COLLATE
SQL_Latin1_General_CP1_CS_AS NULL ,
[Duration] [int] NULL ,
[dm_version] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CS_AS
NULL ,
[ConnectMethod] [nvarchar] (255) COLLATE
SQL_Latin1_General_CP1_CS_AS NULL ,
[dm_motive] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CS_AS
NULL ,
[HnWhichWlan] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CS_AS
NULL ,
[RouterUsedToConnect] [nvarchar] (255) COLLATE
SQL_Latin1_General_CP1_CS_AS NULL ,
[OS] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CS_AS NULL ,
[WinXpSp2Installed] [nvarchar] (255) COLLATE
SQL_Latin1_General_CP1_CS_AS NULL ,
[Connection] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CS_AS
NULL ,
[Login] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CS_AS NULL
,
[EnteredBy] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CS_AS
NULL ,
[Acct_Num] [int] NULL ,
[Site] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CS_AS NULL ,
CONSTRAINT [PK_SH_Data] PRIMARY KEY CLUSTERED
(
[TroubleshootId]
) ON [PRIMARY]
) ON [PRIMARY]
GO

Which contains 5.6 Million rows and has non clustered indexes on Date,
ReasonID, maj_Reason, Connection. Compared to other tables on the same
server this one is extremely slow. A simple query such as :

SELECT
SD.reason_desc,
SD.Duration,
SD.maj_reason_desc,
SD.[Connection],
SD.aolEnteredBy

FROM dbo.[Sherlock Data] SD

Where SD.[Date] > Dateadd(Month,-2,Getdate())

takes over 2 minutes to run ! I realise the table contains several
large columns which make the table quite large but unfortunately this
cannot be changed for the moment.

How can I assess what is causing the length of query time? And what
could I possibly do to speed this table up? The database itself is
running on a dedicated server which has some other databases. None of
which have this performance issue.

Anyone have any ideas?|||Do other queries which benefit from indexes also have bad performance?

Where SD.[Date] > Dateadd(Month,-2,Getdate())

does a row-based comparison, not using the indexes. What does the
query plan tell you about it?

(Mark the query in QA and press CTRL+L to see it.)

Jens Suessmeyer.|||One possibility is to change the PK to nonclustered and the index on
[Date] to clustered. If you often do range/grouping queries based on
[Date] then that should be useful, but it might also impact queries
using TroubleshootId, so you need to test any change with a number of
representative queries.
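A rough sketch of that swap against the SH_Data definition posted above (the new index name is invented); test it against representative queries first:
ALTER TABLE dbo.SH_Data DROP CONSTRAINT PK_SH_Data
CREATE CLUSTERED INDEX IX_SH_Data_Date ON dbo.SH_Data ([Date])
ALTER TABLE dbo.SH_Data ADD CONSTRAINT PK_SH_Data
    PRIMARY KEY NONCLUSTERED (TroubleshootId)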

Other general advice would be to review the query plan in QA (Ctrl+K),
run UPDATE STATISTICS on the table, and also try tracing a typical
workload and running it through the Index Tuning Wizard to see what it
can recommend.

If you need more specific comments, you should post the query plan
(using SET SHOWPLAN_TEXT), and it might also be useful to know how many
rows are returned by the query.
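For example, to capture a text plan for the query from the original post (query text kept as posted):
SET SHOWPLAN_TEXT ON
GO
SELECT SD.reason_desc, SD.Duration, SD.maj_reason_desc, SD.[Connection], SD.aolEnteredBy
FROM dbo.[Sherlock Data] SD
WHERE SD.[Date] > DATEADD(MONTH, -2, GETDATE())
GO
SET SHOWPLAN_TEXT OFF
GO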

Simon|||Yeah, I've tried a test on both and a query using the indexes takes about
20 seconds less to run.

I tried the CTRL + L but it's not making much sense to me.|||garydevstore (GaryDataStore@.gmail.com) writes:
> Which contains 5.6 Million rows and has non clustered indexes on Date,
> ReasonID, maj_Reason, Connection. Compared to other tables on the same
> server this one is extremely slow. A simple query such as :

Maybe some terminology is in order here. A road can be fast, but that
does not help you if your car has a steering wheel out of order causing
you to zig-zag over the road. A car can be fast, but that does not help
if the road is in poor condition, so you cannot drive faster than 30 km/h
anyway.

In this case, the table is the road, and the query plan is the car. A
table itself does not move, but it can be badly fragmented in which case
it can be slow to drive through.

More likely, the query plan is not the best for the query. This is your
query:

> SELECT
> SD.reason_desc,
> SD.Duration,
> SD.maj_reason_desc,
> SD.[Connection],
> SD.aolEnteredBy
> FROM dbo.[Sherlock Data] SD
> Where SD.[Date] > Dateadd(Month,-2,Getdate())

There is a non-clustered index on Date. Assuming that rows are added
to this table regularly, there are presumably quite a few rows that
fits this condition. There are two ways for the optimizer to evaluate
this query: using the index, or scanning the table. The index is good
if only few rows are hit, but if many rows are hit the table scan is
faster. This is because, with the index you will need to read the same
page more than once.

The optimizer makes it choice of plan from the statistics SQL Server
has sampled about the table. The statistics may be out of date (even
if by default SQL Server auto-updates statistics). Try an UPDATE
STATISTICS WITH FULLSCAN, to see if this makes any difference.

But the road can also be in poor condition, that is the table can be
badly fragmented. This can be analysed with DBCC SHOWCONTIG and
remedied with DBCC DBREINDEX.
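Against the table from the original post those checks would look something like this (the fill factor is only an example value):
UPDATE STATISTICS dbo.SH_Data WITH FULLSCAN
DBCC SHOWCONTIG ('SH_Data')        -- check scan density and logical fragmentation
DBCC DBREINDEX ('SH_Data', '', 90) -- rebuild all indexes on the table with a 90% fill factor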

As suggested in other posts, you should look at the query plan, and see
if it says Clustered Index Scan or Index Seek + Bookmark Lookup.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||Simon Hayes wrote:
> One possiblility is to change the PK to nonclustered and the index on
> [Date] to clustered. If you often do range/grouping queries based on
> [Date] then that should be useful, but it might also impact queries
> using TroubleshootId, so you need to test any change with a number of
> representative queries.

A covering index might be an option, too, especially if there are several
queries with multiple criteria.

One question to the OP: why do you have Date and Time both as timestamp
columns? Other remarkable things: all char columns seem to be unicode
(nvarchar) and have length (255). You might save space by changing to
varchar (if possible) and / or reducing the length. Also, this doesn't
really look like a normalized schema. I would at least expect having ids
for EnteredBy and probably some others.

> Other general advice would be to review the query plan in QA (Ctrl+K),
> run UPDATE STATISTICS on the table, and also try tracing a typical
> workload and running it through the Index Tuning Wizard to see what it
> can recommend.
> If you need more specific comments, you should post the query plan
> (using SET SHOWPLAN_TEXT), and it might also be useful to know how
> many rows are returned by the query.

Plus complete index DDL.

robert

Sunday, February 19, 2012

Extracting the date from a datetime value

Hi,
Is there a single command to extract the whole date part (day, month and
year) from a datetime value?
I have come across the Datepart function, which allows extracting the day
or month or year depending on the parameters passed to it. Converting all of
the above to strings, concatenating the strings and converting to the datetime
datatype seems to be a long procedure. Is there any other way to do this?
Thanks.
kd|||kd
SELECT CONVERT(CHAR(10),GETDATE(),121)
Note: take a look at 'style', the third parameter of the function.
"kd" <kd@.discussions.microsoft.com> wrote in message
news:F330C878-4B5A-406C-B17B-7757CA603668@.microsoft.com...
> Hi,
> Is there a single command to extract the whole date part (day, month and
> year) from a datetime value?
> I have come across the Datepart function, which allows to extract, the day
> or month or year depending on the parameters passed to it. Converting all
of
> the above to string, concatenating all the strings and converting to
datetime
> datatype seems to be a long procedure. Is there any other way to do this?
> Thanks.
|||Hi
Just go through this. This might be helpful:
http://msdn.microsoft.com/library/d...2f3o.asp
thanks and regards
Chandra
"kd" wrote:

> Hi,
> Is there a single command to extract the whole date part (day, month and
> year) from a datetime value?
> I have come across the Datepart function, which allows to extract, the day
> or month or year depending on the parameters passed to it. Converting all
of
> the above to string, concatenating all the strings and converting to datet
ime
> datatype seems to be a long procedure. Is there any other way to do this?
> Thanks.
|||Here you go.
SELECT CONVERT(NVARCHAR, GETDATE(), 103)
Style 103 is the British/French output style, which gives you the following
output:
03/05/2005
Now to do this in the mm/dd/yy output you would do it as follows:
SELECT CONVERT(NVARCHAR, GETDATE(), 101)
Which gives you the output of:
05/03/2005
Just open Books Online and in the search box type "convert date style", which
will give you the first returned result of "cast and convert" in Transact-SQL.
This shows you several ways to convert the date into specific output
formats. Just pick one that is easiest for you to work with. The first one
I listed above gives you exactly what you were asking for.
Hope this helps.
"kd" wrote:

> Hi,
> Is there a single command to extract the whole date part (day, month and
> year) from a datetime value?
> I have come across the Datepart function, which allows to extract, the day
> or month or year depending on the parameters passed to it. Converting all
of
> the above to string, concatenating all the strings and converting to datet
ime
> datatype seems to be a long procedure. Is there any other way to do this?
> Thanks.
|||This will give the date part of @dt as a datetime value, without
requiring any conversions to character formats.
dateadd(day,datediff(day,0,@dt),0)
Steve Kass
Drew University
|||The convert() solutions presented all convert the datetime value to a
string, and truncate the characters representing the time portion...
Another option to add to your bag of tricks relies on the fact that
DateTimes are stored internally as two integers, which present as a decimal,
the integer portion of which is the date, and the fractional portion of which
is the time. So if you just strip off the fractional portion, you have a
date by itself.
Set NoCount On
Declare @D DateTime Set @D = '20050503 14:23:12'
Select @D OrigDate
Set @D = Floor(Cast(@D As Float))
Select @D ChangedDate

|||While your solution works, it is not based on the internal
representation of datetime (and that's probably a good thing).
Converting datetime to float and back is a lot of work, and
your example converts 0x0000964900ED15C0 to
0x40E2C9332EA61D95, then applies floor to yield
0x40E2C92000000000, then converts back to the internal
datetime value 0x0000964900000000.
The basis for your solution is the interpretation SQL Server
provides to the user who is asking for a numerical value for
a datetime - SQL Server presents and understands numbers
as dates in terms of number of days from January 1, 1900.
That's part of the abstraction seen by users, and has nothing
to do with the implemenation. Your solution only needs
the cast from datetime to float and vice versa to work as
documented, which it should regardless of the internals.
A (less wise) solution that does depend on the 8-byte
(calling it two-integer is a stretch - bytes are bytes)
implementation of datetime values could use
Substring(Cast(@D As Binary(8)),1,4)+0x00000000
In any case, I think it's best to avoid implicit and explicit
conversions altogether, except for the one that converts
something typed in the query into a datetime value. I think
it was Gert-Jan who did some testing and found that this
is faster, too.
dateadd(d,datediff(d,0,@D),0)
SK
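A quick check of that expression with an assumed sample value:
DECLARE @D datetime
SET @D = '20050503 14:23:12'
SELECT DATEADD(d, DATEDIFF(d, 0, @D), 0)   -- returns 2005-05-03 00:00:00.000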
|||Actually it sets the time to midnight that day. And depending on the
application you are using to view it, it may show 12:00 PM in addition to the
date.
KD, the real question is what do you intend to do with it? Is it for use in a
query or for presentation? If for presentation and you have mixed locales
where the standard date format is different, then it is many times better to do
the formatting on the client side for distributed apps. If web based, then you
need to take their locale into account for the proper format, or opt for a
specific format that you can enforce.
"Steve Kass" wrote:

> This will give the date part of @.dt as a datetime value, without
> requiring any conversions to character formats.
> dateadd(day,datediff(day,0,@.dt),0)
> Steve Kass
> Drew University
> kd wrote:
>
|||Steve,
I never thought to directly read the binary values and see what they
were... This is interesting... The binary value of '20050503 14:23:12' is
0x00009649 00ED15C0 -- I separated the high 4 bytes from the low 4 bytes.
If you take the high 4 bytes, 0x9649, that is indeed hex for 38,473
(9*16*16*16 + 6*16*16 + 4*16 + 9), which is the integer for May 3rd 2005, but
no matter what I do, I can't figure out what SQL is doing with the low 4
bytes...
0x00ED15C0 hex is decimal 15,537,600, which is only .1798333333 of a full
24 hour day (86,400,000 milliseconds) and it should be 51,791,971
(0.599444111110 of a 24 hr day).
The only thing I'm not clear on is why casting to a Float should be
"expensive"... Is it because the system has to convert the binary data into a
binary IEEE mantissa/exponent structure? I would have thought that would be
fairly performant, because it's a binary representation, which I "imagine"
all computers would somehow be internally optimized for, outside the scope of
whatever software was initiating this task. Not so? Why not? (If you know!)
Anyway,
"Steve Kass" wrote:

> While your solution works, it is not based on the internal
> representation of datetime (and that's probably a good thing).
> Converting datetime to float and back is a lot of work, and
> your example converts 0x0000964900ED15C0 to
> 0x40E2C9332EA61D95, then applies floor to yield
> 0x40E2C92000000000, then converts back to the internal
> datetime value 0x0000964900000000.
> The basis for your solution is the interpretation SQL Server
> provides to the user who is asking for a numerical value for
> a datetime - SQL Server presents and understands numbers
> as dates in terms of number of days from January 1, 1900.
> That's part of the abstraction seen by users, and has nothing
> to do with the implemenation. Your solution only needs
> the cast from datetime to float and vice versa to work as
> documented, which it should regardless of the internals.
> A (less wise) solution that does depend on the 8-byte
> (calling it two-integer is a stretch - bytes are bytes)
> implementation of datetime values could use
> Substring(Cast(@.D As Binary(8)),1,4)+0x00000000
> In any case, I think it's best to avoid implicit and explicit
> conversions altogether, except for the one that converts
> something typed in the query into a datetime value. I think
> it was Gert-Jan who did some testing and found that this
> is faster, too.
> dateadd(d,datediff(d,0,@.D),0)
> SK
> CBretana wrote:
>
|||I think this is the thread that Steve is referring to:
http://groups.google.nl/groups?hl=n...dy.nl&rnum=4
(url may wrap)
HTH,
Gert-Jan

Extracting Month Name from Date

Is there a function for getting the Month name from a date string? For
example, '01/20/2004' would return 'January'|||I think it's MonthName()
MonthName(getdate()) would return October
"Blake Gremillion" <Blake Gremillion@.discussions.microsoft.com> wrote in
message news:C3E99BBD-C664-4E50-B849-D19B72B12B97@.microsoft.com...
> Is there a function for getting the Month name from a date string? For
> example, '01/20/2004' would return 'January'
|||DATENAME(MONTH, GETDATE())
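For the original example, converting the string explicitly with the US style (101) to avoid language-setting surprises:
SELECT DATENAME(MONTH, CONVERT(datetime, '01/20/2004', 101))   -- returns 'January'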
"Kelly" <kelly.hauser@.amerock.com> wrote in message
news:ectKTi#qEHA.2764@.TK2MSFTNGP11.phx.gbl...
> I think it's MonthName()
> MonthName(getdate()) would return October
>
> "Blake Gremillion" <Blake Gremillion@.discussions.microsoft.com> wrote in
> message news:C3E99BBD-C664-4E50-B849-D19B72B12B97@.microsoft.com...
> > Is there a function for getting the Month name from a date string? For
> > example, '01/20/2004' would return 'January'
> >
> >
>

Extracting Latest Date

I have a db which stores all my sales transactions.
The task is to extract the buy price(price) for each item(ItemCode) where the
date is the latest for each item.
For Example:
Sales Table
ItemCode Qty Price Date
-- -- -- --
XYZ 2 23.50 12/03/05
XYZ 3 23.50 13/03/05
XYZ 2 99.99 14/03/05
ABC 14 88.88 01/01/05
RDB 12 77.30 21/04/05
RDB 23 23.35 02/05/05
I would then require only
ItemCode Qty Price Date
-- -- -- --
XYZ 2 99.99 14/03/05
ABC 14 88.88 01/01/05
RDB 23 23.35 02/05/05
How do I write a query to extract the data?
Need help with a project of mine... really stuck with this one. Thank you.
On Thu, 04 Aug 2005 07:31:53 GMT, Jan S via droptable.com wrote:

>I have a db which stores all my sales transactions.
>The task is to extract the buy price(price) for each item(ItemCode) where the
>date is the latest for each item.
(snip)
>How do i write a query to extract the data?
> Need help with a project of mine...really stuck with this one..Thank you
Hi Jan,
Method #1:
SELECT a.ItemCode, a.Qty, a.Price, a.[Date]
FROM Sales AS a
WHERE NOT EXISTS
(SELECT *
FROM Sales AS b
WHERE b.ItemCode = a.ItemCode
AND b.[Date] > a.[Date])
Method #2:
SELECT a.ItemCode, a.Qty, a.Price, a.[Date]
FROM Sales AS a
WHERE a.[Date] =
(SELECT MAX(b.[Date])
FROM Sales AS b
WHERE b.ItemCode = a.ItemCode)
Method #3:
SELECT a.ItemCode, a.Qty, a.Price, a.[Date]
FROM Sales AS a
INNER JOIN (SELECT ItemCode, MAX([Date]) AS MaxDate
FROM Sales
GROUP BY ItemCode) AS b
ON b.ItemCode = a.ItemCode
AND b.MaxDate = a.[Date]
(And there might even be more methods...)
If performance is important, then test each of these queries a few times
and use the one that's the fastest. Otherwise, use the one that you find
the easiest to understand, as you'll have to maintain it later.
Best, Hugo
(Remove _NO_ and _SPAM_ to get my e-mail address)
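For anyone who wants to try Hugo's three queries, here is a minimal setup matching the sample data from the question (dates written as unambiguous ISO literals):
CREATE TABLE Sales (ItemCode varchar(10), Qty int, Price decimal(10,2), [Date] datetime)
INSERT INTO Sales VALUES ('XYZ', 2, 23.50, '20050312')
INSERT INTO Sales VALUES ('XYZ', 3, 23.50, '20050313')
INSERT INTO Sales VALUES ('XYZ', 2, 99.99, '20050314')
INSERT INTO Sales VALUES ('ABC', 14, 88.88, '20050101')
INSERT INTO Sales VALUES ('RDB', 12, 77.30, '20050421')
INSERT INTO Sales VALUES ('RDB', 23, 23.35, '20050502')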
Thanks Hugo. Much appreciated.
Message posted via droptable.com
http://www.droptable.com/Uwe/Forums...-mseq/200508/1

Extracting from a DATETIME data type

Hi,
I can't figure out how to extract just the date or just the time from a DATETIME data type. Any suggestions?|||SELECT CONVERT(varchar(10),GetDate(),101)
SELECT CONVERT(varchar(10),GetDate(),108)|||Thanks.

Friday, February 17, 2012

Extracting dates from XML stored in SQL Server 2005

I have recently started storing xml files within SQL 2005. I have a problem extracting a date of format 20-02-2007 out of the XML and storing it in a field of datetime. The database is in American format (2007-02-20) and it fails when I try to insert the date. In the past when extracting the date at .Net level and passing it through to SQL Server this was never a problem. I have tried to convert the date first but it still fails. Is there an easy way to get around this problem or do I need to store the date as a string?

Thanks

Danny

You could use the CONVERT function in your T-SQL statements with style 105:

declare @dt varchar(10)

set @dt = '20-02-2007'

select convert(datetime, @dt, 105)

--RESULT

2007-02-20 00:00:00.000

|||

Yes this worked, but I had to convert to varchar before converting to datetime (cannot convert XML directly to datetime).
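A sketch of that two-step conversion with a made-up XML shape; the element names are not from the actual documents:
DECLARE @x xml
SET @x = '<invoice><invoiceDate>20-02-2007</invoiceDate></invoice>'
SELECT CONVERT(datetime,
               @x.value('(/invoice/invoiceDate)[1]', 'varchar(10)'),
               105)
-- returns 2007-02-20 00:00:00.000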

Thanks

Danny