
Friday, February 24, 2012

Extremely bad performance running on a huge box?

Hi all!
I have SQL Server 2000 running on a 4-processor, 2 GB RAM Dell box running
Windows Server 2003. No other apps run on this machine.
A table with 20 million records is truncated and recreated once every week.
Records are quite simple: (key, value). The table is indexed on the key field.
The problem is that running "select count(*) from table" takes more than 2
minutes.
Even worse, doing a "select * from table where (condition on indexed key
field)" also takes more than 2 minutes.
During these queries, which I run from Query Analyzer, overall CPU usage is
less than 3%.
What could be wrong? Any help would be greatly appreciated.
Thanks in advance,
ant.

Please post your table DDL (including constraints and indexes), sample data
and problem queries so that we can better help.

> The problem is that running "select count(*) from table" takes more than 2
> minutes.
This will require a scan and the duration will be proportional to table
size. If you have only a clustered index and data are not cached, then all
data pages will be read. 2 minutes may or may not be reasonable depending
on your row size and disk configuration.
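
If an approximate count is enough, SQL Server 2000 keeps a row count in sysindexes that can be read without scanning the table at all (a quick sketch; dbo.KeyValue stands in for your actual table name):

-- Approximate row count without scanning the table (SQL Server 2000).
-- The value is maintained by the engine and can drift slightly.
SELECT rows
FROM sysindexes
WHERE id = OBJECT_ID('dbo.KeyValue')
  AND indid < 2   -- 0 = heap, 1 = clustered index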

> Even worse, doing a "select * from table where (condition on indexed key
> field)" also takes more than 2 minutes.
This doesn't seem right. If your 'indexed key field' is unique, I would
expect sub-second response. With a non-unique index, I'd expect the
response time to be roughly proportional to the number of rows that satisfy
the condition as long as the expression is sargable.
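
For example, assuming the table is dbo.KeyValue(k int, v varchar(100)) with an index on k (made-up names, purely for illustration), the first query below can seek on the index, while the second forces a scan because the predicate is not sargable:

-- Sargable: the optimizer can seek on the index on k.
SELECT k, v
FROM dbo.KeyValue
WHERE k = 42

-- Not sargable: wrapping the indexed column in a function forces a scan.
SELECT k, v
FROM dbo.KeyValue
WHERE ABS(k) = 42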
Hope this helps.
Dan Guzman
SQL Server MVP
"Antonio Lopez Arredondo" <adog@.someplace.com> wrote in message
news:uzLV4A%238DHA.3348@.TK2MSFTNGP09.phx.gbl...
> hi all !!!
> I have a SQL Server 2000 running on a 4 processor 2 GB ram DELL box
running
> Windows 2003 Server. no other apps run in this machine.
> a table witn 20 million records is truncated and recreated once every
week.
> records are quite simple: (key,value). Table in indexes on key field.
> the problem is thar running a "select count(*) from table" takes more than
2
> minutes.
> even worst, doing a "select * from table where (condition on indexed key
> field)" also takes more than 2 minutes.
> during these queries, which I do from the Query Analyzer, the overall CPUs
> usage is less than 3%.
> what could be wrong ? any help would be greatly appreciated.
> thanks in advance,
> ant.
>

Are there 4 physical processors, or is this a Xeon box with Hyper-Threading?
If HT is enabled, then regardless of the number of actual processors, try
OPTION (MAXDOP 1) or (MAXDOP 2).
Suppose your average row size is 100 bytes; then 20M rows means the table is
about 2 GB in size.
A select count(*) requires a full table scan.
If your data is on a single disk, you can probably achieve a sequential
transfer rate of 50 MB/sec, meaning your query should take about 40 sec.
If you have 2 disks striped, you ought to be closer to 100 MB/sec, or 20 sec.
If the entire table is in memory, the scan rate should be around 800 MB/sec,
so something is not right.
On the select *, I need to know the index and the full query.
Also keep in mind the time it takes Query Analyzer to receive a large number
of rows.
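
One way to see where the time goes (a sketch; dbo.KeyValue is a stand-in for the real table name):

-- Report I/O and elapsed time for each statement.
SET STATISTICS IO ON
SET STATISTICS TIME ON

-- Baseline count, parallelism left to the optimizer.
SELECT COUNT(*) FROM dbo.KeyValue

-- Same count limited to one scheduler, to rule out
-- Hyper-Threading-related parallelism problems.
SELECT COUNT(*) FROM dbo.KeyValue OPTION (MAXDOP 1)

SET STATISTICS IO OFF
SET STATISTICS TIME OFF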

Sunday, February 19, 2012

Extracting from one database to another automatically hourly

For a GPS utility project, we are planning to extract certain attributes from a huge, read-only "GPS Raw Data" database that we have access to, containing several years of GPS data from several devices attached to vehicles.

The data is time-stamped. Where the time gap between pieces of data is more than 10 minutes, a new trip instance is assumed, and in our write-access "Trip" database we create a new record for that data clump with a new trip id, along with the time range of the data.

The process is to be run hourly to update the "Trip" database with new trips and append to overlapping trips.

We've some questions:

a) Is it easy to read from one database and write into another in C# hourly?

b) How would one go about running a C# program automatically every hour on the server?

c) Is there a better way to do this than an hourly update? (dynamically perhaps??)

d) When querying the database and comparing the time stamps, how, for instance, would we go about identifying a 10-minute gap when the time/date is in the format "22/12/2007 11:25:00"? I can't get my head around actually writing this - it's probably ridiculously simple.

What type of database are you using?

If it's SQL Server, why not create a stored procedure and run it in a job every hour? That will be best for performance.

|||

Perhaps you can look into solutions like Replication, where your GPS Raw Data can be the publisher and your target db can be the subscriber. Any transactions on the publisher are replicated to the subscriber, and the subscriber can be used for querying.

>>a) Is it easy to read from one database and write into another in C# hourly?

Depends on how much data there is to read/write.

>>b) How would one go about running a C# program automatically every hour on the server?

You can write a stored procedure that has the logic to query for modified/new records and does an INSERT into the target db.
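
Roughly, the core of such a procedure might look like this (GpsRaw and TripStaging are hypothetical names used only for illustration):

-- Copy anything newer than what the target already holds.
CREATE PROCEDURE dbo.usp_CopyNewGpsData
AS
BEGIN
    DECLARE @since datetime
    SELECT @since = ISNULL(MAX(ReadingTime), '19000101') FROM dbo.TripStaging

    INSERT INTO dbo.TripStaging (DeviceID, ReadingTime)
    SELECT DeviceID, ReadingTime
    FROM dbo.GpsRaw
    WHERE ReadingTime > @since
END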

>>c) Is there a better way to do this than an hourly update? (dynamically perhaps??)

Perhaps. Look into the replication technology and see if it suits your requirements.

>>d) When querying the database and comparing the time stamps, how, for instance, would we go about identifying a 10-minute gap when the time/date is in the format "22/12/2007 11:25:00"? I can't get my head around actually writing this - it's probably ridiculously simple.

You can use the DATEDIFF function: IF DATEDIFF(mi, <valueinrow>, GETDATE()) = 10. But you'd have to have some kind of service running constantly to trigger this data transfer when the 10-minute interval happens.
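
If the timestamps are stored as text in that dd/mm/yyyy format rather than as datetime, they can be converted first; a small illustration using the literal from the question:

-- Style 103 tells CONVERT the string is dd/mm/yyyy.
DECLARE @t1 datetime, @t2 datetime
SET @t1 = CONVERT(datetime, '22/12/2007 11:25:00', 103)
SET @t2 = CONVERT(datetime, '22/12/2007 11:41:00', 103)

-- Is the gap between the two readings more than 10 minutes?
IF DATEDIFF(mi, @t1, @t2) > 10
    PRINT 'new trip starts here'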

|||

Thank you for your help Ramzi!
We are using MS SQL 2005 with Microsoft Visual Web Dev 2005.

I must read up on stored procedures if you think this is the best way forward. Is there a particular utility you would advise for creating/managing stored procedures?

I just downloaded MS SQL Management Studio. However, I cannot find the ASPNETDB.MDF database created in Visual Web Developer on the local machine when I look for it.

|||

Thanks ndinakar

Perhaps you can look into solutions like Replication, where your GPS Raw Data can be the publisher and your target db can be the subscriber. Any transactions on the publisher are replicated to the subscriber, and the subscriber can be used for querying.

>>a) Is it easy to read from one database and write into another in C# hourly?

Depends on how much data there is to read/write.

Once the main "lot" from the past year is indexed, each hourly update would be rather small, with only perhaps a hundred co-ordinates coming in over an hour. At times there will be no new data.

b) How would one go about running a C# program automatically every hour on the server?

>>You can write a stored procedure that has the logic to query for modified/new records and does an INSERT into the target db.

We will look into stored procedures. This looks like it could be very useful. Would this be used alongside or instead of replication? (Bear in mind I've not yet read up on replication.)

>>d) When querying the database and comparing the time stamps, how, for instance, would we go about identifying a 10-minute gap when the time/date is in the format "22/12/2007 11:25:00"? I can't get my head around actually writing this - it's probably ridiculously simple.

You can use the DATEDIFF function: IF DATEDIFF(mi, <valueinrow>, GETDATE()) = 10. But you'd have to have some kind of service running constantly to trigger this data transfer when the 10-minute interval happens.

In the stored procedure that runs hourly, could this be used to run through the data from the past hour and identify if/when these intervals happen, or are you saying that it needs to be checking constantly (every few seconds)?

|||

I'm looking for info on stored procedures now.
On the MS website there is mention of a "CREATE TRIGGER" statement... Is this what you were referring to as something we could use with the DATEDIFF comparison?

Is it possible to set a timed execution? Could someone give a small example of this?


All of your help has been much appreciated!

|||

Going by your replies, perhaps it's best to create a job and have the job call a stored procedure. You can schedule the job to run every 10 minutes and, in your stored procedure, query all the new/modified records after the previous run and migrate the data. You might need to store the timestamp of the last run in case the job does not run for any reason or you need to stop the job for a few hours. You can always catch up if you have the timestamp of the previous run.

Check out Books Online (the documentation that comes with SQL Server) for CREATE PROCEDURE.

For creating/scheduling/managing jobs, look for "Jobs" under SQL Server Agent.
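
For reference, such a job can also be created in T-SQL through the msdb stored procedures; a rough sketch (the job name and database name are made up, and usp_CopyNewGpsData is the hypothetical procedure sketched earlier):

USE msdb

EXEC sp_add_job @job_name = 'Load GPS trips'

EXEC sp_add_jobstep
    @job_name = 'Load GPS trips',
    @step_name = 'Run load procedure',
    @subsystem = 'TSQL',
    @database_name = 'TripDB',
    @command = 'EXEC dbo.usp_CopyNewGpsData'

-- Every 10 minutes, all day, every day.
EXEC sp_add_jobschedule
    @job_name = 'Load GPS trips',
    @name = 'Every 10 minutes',
    @freq_type = 4,              -- daily
    @freq_interval = 1,
    @freq_subday_type = 4,       -- units of minutes
    @freq_subday_interval = 10

EXEC sp_add_jobserver @job_name = 'Load GPS trips'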

|||

I have managed to do some basic copying from one database to another; however, I'm having some problems with the DATEDIFF function.

We are selecting the time and vehicle ID from the "GPS raw data" database. We want to compare each entry with the previous one and, where there is a gap of more than 15 mins, return it.

We will group the associated data into another "trip" table, with a primary key of "trip id". So any gap in GPS transmission of more than 15 mins will be considered the end/start of a trip.

This is just vague code.
I want to compare each row, really, but I've no idea how to. (I've put in time & time+1, where time+1 is meant to be the next row; I don't know how to express this.)

SELECT Time, DeviceID
FROM [gps.db]
WHERE DATEDIFF(n, Time, Time+1) > 15
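
One way to get at "the next row" in SQL Server 2005 is to number the readings per vehicle with ROW_NUMBER() and join each row to the one after it; a sketch built on the column names from the fragment above (adjust names as needed):

-- Number each reading per vehicle in time order, then join row n to row n+1.
WITH Ordered AS (
    SELECT DeviceID,
           Time,
           ROW_NUMBER() OVER (PARTITION BY DeviceID ORDER BY Time) AS rn
    FROM [gps.db]
)
SELECT cur.DeviceID,
       cur.Time AS TripEnd,
       nxt.Time AS NextTripStart
FROM Ordered AS cur
JOIN Ordered AS nxt
  ON nxt.DeviceID = cur.DeviceID
 AND nxt.rn = cur.rn + 1
WHERE DATEDIFF(mi, cur.Time, nxt.Time) > 15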

Friday, February 17, 2012

Extracting Data from Lotus Notes into SQL Server

I have a form in Notes that is huge. I need to extract a part/field from that form into SQL Server. If I use NotesSQL through SQL DTS packages, I am afraid it will run very long. Is there any other option?

Any help is appreciated.

Are you using the Notes ODBC driver at all? If not, you can use that along with a DataReader source in SSIS.

|||

You need the right connector to be able to deal with Notes as a source/destination. Please check our Data Sources matrix in our connectivity wiki to see possible alternatives for Lotus Notes integration:

http://ssis.wik.is/Data_Sources