Hardware performance
Jim Mack
Guest





Posted: Sat Nov 05, 2005 8:59 am    Post subject: Hardware performance

Looking for a sanity check here --

Given a single (hyperthreaded) 3GHz CPU, 833MHz FSB, 1GB RAM and a dual channel SCSI 320 card with 64MB cache (PCI66 / 64bit), what sort of sustained read / write performance should I expect to see to an external HW RAID5 array of SCSI 320 drives?

This would be with no applications running, a light load of services -- essentially zero CPU usage -- and a separate SATA system drive. The use is a file server with numerous fairly large files being transferred over two high-performance GigE connections. Assume we can saturate the net cards, and assume W2K3 Server and NTFS.

I'm trying to get some idea of what would be a realistic expectation from such a setup, and where/why any bottlenecks might arise.

Thanks for any insights and real-world war stories.

--
Jim
Pat [MSFT]
Guest





Posted: Mon Nov 07, 2005 9:15 am    Post subject: Re: Hardware performance

1) Make sure you have SP1 installed. There are perf enhancements for SMB.

CPU is rarely a factor in storage scalability. The read/write perf will
actually depend more on the characteristics of the reading & writing.

Once you saturate the NICs, that is your bottleneck, unless local reads and
writes are also going on (which you state won't be happening). If you are
using teamed NICs (and your switch supports it) you should be able to
saturate a 2GigE pipe. Your most likely problem will be the relatively
small amount of RAM installed, though it depends on the actual size of the
files & how they are accessed.

For example, if the application doing the writes specifies write-through,
then the bottleneck will likely be on the spindles, but it is hard to say
without a lot more info.

For this PerfMon is your friend.

For writes, look at the LogicalDisk counters Avg. Disk sec/Write & Current
Disk Queue Length. The disk queue should be < (spindle count assigned to the
LUN + 2). An occasional spike is OK, but it should recover quickly. Look at
the Avg. Disk sec/Write counter to see how long a write is taking. This will
tell you if you are saturating your array.

For reads, you can look at Cache: Copy Read Hits %. If it is low, adding RAM
could help performance. If it is higher than 80%, then additional RAM is
unlikely to help.
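
The two rules of thumb above can be written down as a small check. This is only a sketch: the function names and the sample readings are made up for illustration, and treating everything at or below 80% as "RAM may help" is an assumption, since the advice only says "low".

```python
def write_queue_ok(current_queue: float, spindles: int) -> bool:
    """Rule of thumb from the post: queue should stay below spindles + 2."""
    return current_queue < spindles + 2


def more_ram_may_help(copy_read_hit_pct: float, threshold: float = 80.0) -> bool:
    """Above ~80% hits, more RAM is unlikely to help (threshold per the post).

    Assumption: anything at or below the threshold is flagged as worth a look.
    """
    return copy_read_hit_pct <= threshold


# Hypothetical readings, not taken from any real system.
print("queue 12 on 14 spindles OK:", write_queue_ok(12, 14))
print("queue 29 on 14 spindles OK:", write_queue_ok(29, 14))
print("35% read hits, RAM may help:", more_ram_may_help(35.0))
print("92% read hits, RAM may help:", more_ram_may_help(92.0))
```

An occasional spike past the limit is fine according to the advice above, so a check like this is more useful on an average over a sampling window than on a single reading.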



Pat
Jim Mack
Guest





Posted: Mon Nov 07, 2005 5:16 pm    Post subject: Re: Hardware performance

Pat [MSFT] wrote:

Thanks for the reply. As I mentioned, I'm primarily interested in knowing if our results are anomalous or whether we're seeing what we should -- i.e., ignoring network effects, even CPU effects, what sort of sustained rate should a 14-spindle RAID5 SCSI 320 array be capable of?


Quote:
1) Make sure you have SP1 installed. There are perf enhancements for
SMB.

Yes, it's installed. This is not a database application, it's in graphics rendering and retrieval -- generally, writes come over SMB, and reads from NFS, and there's not much overlap between the two. As far as we can gauge, disk performance is well below network performance, at least on the SMB side.


Quote:
CPU is rarely a factor in storage scalability. The read/write perf
will actually depend more on the characteristics of the reading &
writing.

Once you saturate the NICs, unless there is local read & writes going
on (which you state won't be happening) then that is your bottleneck.
If you are using teamed NICs (and your switch supports it) you should
be able to saturate a 2GigE pipe. Your most likely problem will be
the relatively small amount of RAM installed - though it depends on
the actual size of the files & how they are accessed.

For array testing we've tried local transfers only. There are three arrays, each a RAID5 with 1.7TB. One comprises 14 spindles (146GB drives), the other two are 7 spindles each (300GB drives), and those two are striped RAID0 in Windows.

We see fairly slow and erratic performance. Our test bed consists of a 1GB file and a series of 100 50MB files, transferred sequentially from an array to itself, from array to array, and between the system (SATA) drive and either array, as well as from the system drive to itself. The system SATA drive outperforms the SCSI arrays in every case.

Read and write of a 1GB file from one array to the other, which should be the fastest since it's between two channels on the SCSI controller, averages maybe 10-12MB/sec, nowhere near the peak rated speed of 320MB/s that the controller and arrays are capable of.
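
To put the 10-12MB/sec figure in perspective, here is a back-of-the-envelope comparison against the nominal ceilings of the parts named in this thread. It is a sketch only: the ceilings are peak ratings (Ultra320 at 320 MB/s per channel, a 64-bit PCI bus at roughly 66 MHz, two GigE links at 125 MB/s each), not rates any real transfer will sustain, and the variable names are mine.

```python
# Nominal peak rates in MB/s (decimal megabytes).
ceilings_mb_s = {
    "Ultra320 SCSI, one channel": 320.0,
    "PCI 64-bit / 66 MHz bus": 8 * 66.0,      # 8 bytes per clock, ~66 MHz
    "2 x Gigabit Ethernet": 2 * 1000 / 8,     # 2 links x 125 MB/s
}

measured_mb_s = (10.0, 12.0)  # range reported for the array-to-array copy


def fraction_of(ceiling: float, measured: float) -> float:
    """Measured rate as a fraction of a nominal ceiling."""
    return measured / ceiling


for name, ceiling in ceilings_mb_s.items():
    low, high = (fraction_of(ceiling, m) for m in measured_mb_s)
    print(f"{name}: {ceiling:.0f} MB/s peak, measured is {low:.1%} to {high:.1%}")
```

Even against the slowest ceiling in the list, the measured rate is a single-digit percentage, which is why the result looks like a problem rather than a normal overhead.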

I'm trying to get a handle on whether this should be considered normal.


Quote:
For this PerfMon is your friend.

I'll take a look at those specific tasks. Watching Disk Bytes/Sec shows a 'square wave' graph, where there is a burst of activity followed by a period of no (or minimal) activity. Using the SATA array only, we see a sustained medium-to-high level of activity, with no large peaks and valleys.

Thanks again to anyone who can comment on what we should expect to see, and where there might be opportunities for improvement.

--
Jim
Pat [MSFT]
Guest





Posted: Tue Nov 08, 2005 1:17 am    Post subject: Re: Hardware performance

If you are using 10k RPM spindles, you should be able to do ~100 _random_
IOPS per spindle (i.e. lots of seeking). This is what we use for Exchange
sizing (and for those keeping score at home, figure 0.5 IOPS/user for heavy
users and 0.25 for light users, so one 10k RPM spindle supports ~200 heavy
Exchange users).
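
The sizing arithmetic above, as a small sketch. The function name is hypothetical; the per-spindle and per-user figures are the ones quoted in the post.

```python
def users_per_spindle(spindle_iops: float, iops_per_user: float) -> int:
    """How many users one spindle can carry at a given per-user I/O rate."""
    return int(spindle_iops // iops_per_user)


heavy = users_per_spindle(100, 0.5)    # heavy users at 0.5 IOPS each
light = users_per_spindle(100, 0.25)   # light users at 0.25 IOPS each
print(f"heavy users per 10k RPM spindle: {heavy}")
print(f"light users per 10k RPM spindle: {light}")
```

This only covers random I/O; the sequential case discussed next is a different regime and should not be sized this way.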

I happen to have set up storage for some graphics apps and they should do
_much_ better than 100IOPS/spindle b/c they tend to be very large files and
the reads/writes are heavily weighted towards sequential access. Unless
there is an application level problem (i.e. it writes lots of little blocks
instead of just saving the entire file - which is a possibility) you should
be able to sustain somewhere in the neighborhood of 80% of whatever the
network speed is. I can easily fill a 100MBit pipe copying a 120MB file to
my laptop (7200RPM spindle) over SMB.
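
The "about 80% of network speed" estimate can be turned into numbers like this. It is a rough sketch: the 0.8 factor is the figure from the post, the function name is made up, and link speeds are nominal.

```python
def expected_mb_per_s(link_mbit: float, links: int = 1,
                      efficiency: float = 0.8) -> float:
    """Sustained MB/s to expect from `links` network links of `link_mbit` each."""
    raw_mb_s = link_mbit * links / 8   # megabits per second -> megabytes per second
    return raw_mb_s * efficiency


fast_ethernet = expected_mb_per_s(100)           # the 100MBit laptop example
teamed_gige = expected_mb_per_s(1000, links=2)   # two teamed GigE ports

print(f"100 Mbit link: about {fast_ethernet:.0f} MB/s")
print(f"2 x GigE team: about {teamed_gige:.0f} MB/s")
```

By this estimate the 120MB laptop copy would take on the order of 12 seconds, and a healthy array behind two teamed GigE ports would need to sustain roughly 200 MB/s to keep the network the bottleneck.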

Check out the Perfmon Sec./Write. It measures how long the disk access
takes between handoff to the controller and a response back that the write
occurred. If you have a low disk queue, but high Sec./Write then the
bottleneck is in the storage HW (or driver) and you can pursue that. Check

Pat
Jeff Goldner [MS]
Guest





Posted: Thu Nov 10, 2005 9:16 am    Post subject: Re: Hardware performance

Are you using the system RAID5 capability or do you have a RAID adapter?
Jim Mack
Guest





Posted: Thu Nov 10, 2005 5:01 pm    Post subject: Re: Hardware performance

Jeff Goldner [MS] wrote:
Quote:
Are you using the system RAID5 capability or do you have a RAID
adapter?

We actually have two arrays, one on each channel of a two-channel LSI Logic controller, forming three logical drives. Each logical drive is about 1.6TB using HW RAID5. One of them comprises 14 10K spindles x 146GB, and the other two have 7 10K spindles each x 300GB (the controller has a 2TB limit on a logical drive). The 14-spindle drive is used as-is in Windows, and the other two are striped (RAID0) in Windows to form a RAID50 volume (HW5, SW0). Both arrays perform pretty much identically.

Following advice here, we looked at perfmon stats and found that copying 15GB worth of 50MB files gave us zero cache hits, no matter what we did -- whatever perfmon is measuring there, it isn't in use in this configuration.

Sec/write is consistently 0.020, give or take 0.001.

Disk writes/sec averages 730, but ranges from 360 to 1050 in a see-saw pattern.

Average write queue length varies by target -- copying to the RAID50 volume shows an average value of 29, and a range of 14 to 40. To the RAID5 volume, it's about half that for all values. Disk writes/sec stays the same.
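
One way to cross-check these counters against each other is Little's law: the average number of requests in flight equals the completion rate times the time each request takes. A sketch using the figures above follows; the function name is mine, and whether every counter was read on the same volume during the same run is an assumption.

```python
def outstanding_requests(writes_per_sec: float, sec_per_write: float) -> float:
    """Little's law: average requests in flight = rate x time per request."""
    return writes_per_sec * sec_per_write


sec_per_write = 0.020                               # reported Sec/write
avg = outstanding_requests(730, sec_per_write)      # reported average writes/sec
low = outstanding_requests(360, sec_per_write)      # bottom of the see-saw
high = outstanding_requests(1050, sec_per_write)    # top of the see-saw

print(f"implied queue: avg {avg:.1f}, range {low:.1f} to {high:.1f}")

spindles = 14
print("average below spindles + 2:", avg < spindles + 2)
```

The implied average of roughly 14.6 is close to the queue length seen on the RAID5 volume (about half of 29). The RAID50 figure of 29 does not follow from the same rate and latency, so it may be worth re-reading Sec/write per volume.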

--
Jim
Tom Stewart
Guest





Posted: Thu Nov 10, 2005 5:16 pm    Post subject: Re: Hardware performance

1. Check out IOMeter (http://www.iometer.org/) for performance testing.
2. RAID5 has a write penalty. If you're doing lots of writes, don't expect great performance. When you write a disk block, the other blocks in the group have to be read, parity recalculated, then both the data block and parity block must be written.
3. 64 MB of cache is pretty small. Your see-saw pattern of write performance probably has to do with cache getting saturated then de-staged.
4. 20 ms per write is pretty slow.
5. See if your HW RAID solution has any performance monitoring tools. Expensive solutions do tend to come with lots of performance monitoring capabilities. Inexpensive ones often do not.
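
To make the write penalty in point 2 concrete, here is a toy sketch of RAID5 parity as a byte-wise XOR. The block contents and the four-block stripe are invented for illustration, and real controllers work on much larger stripe units. It shows the path described in point 2 (read the other data blocks and recompute) next to the commonly used alternative (read only the old data and old parity). Both arrive at the same parity, and both cost extra reads before the two writes.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy stripe: three data blocks plus one parity block.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(*data)

new_block = b"\xff\x00\xff\x00"   # replaces data[0]

# Path from point 2: read the other data blocks, recompute parity from scratch.
parity_reconstruct = xor_blocks(new_block, data[1], data[2])

# Alternative: read the old data block and old parity only (read-modify-write).
parity_rmw = xor_blocks(parity, data[0], new_block)

print("parities agree:", parity_reconstruct == parity_rmw)

# Sanity check: the new block can be rebuilt from parity and the other blocks.
print("rebuild works:", xor_blocks(parity_rmw, data[1], data[2]) == new_block)
```

Either way, one logical write turns into several physical I/Os, which is the penalty a write-back cache tries to hide until it fills up.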

--
Tom