using POPCNT in C/C++ on Itanium
CASTalk.com Forum Index CASTalk.com
Discussion of DSP, FPGA, storage and embedded system.
 
 FAQFAQ   MemberlistMemberlist     RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 
 
Google
 
Web castalk.com
using POPCNT in C/C++ on Itanium

 
Post new topic   Reply to topic    CASTalk.com Forum Index -> Computer Architecture
Author Message
gacek
Guest





Posted: Sun Mar 06, 2005 4:32 pm    Post subject: using POPCNT in C/C++ on Itanium Reply with quote

Hi,

I have a vector of integers, I need to find how many bits are set to 1.
I'd like to use processor built in popcnt function, what should the
following code look like (I know I could check more than one bit at
time, but its an example and thats not the issue)


long long numbers[10];
long long i, j, sum=0;

for (i=0;i<10;i++)
{
j= numbers[i];
while (j)
{
if (j&1) sum++;
j >>= 1;
}
}




jacek
Back to top
Norbert Juffa
Guest





Posted: Mon Mar 07, 2005 1:51 am    Post subject: Re: using POPCNT in C/C++ on Itanium Reply with quote

"gacek" <gaceknews@wp.pl> wrote in message news:d0epnv$4qa$1@korweta.task.gda.pl...
Quote:
Hi,

I have a vector of integers, I need to find how many bits are set to 1.
I'd like to use processor built in popcnt function, what should the
following code look like (I know I could check more than one bit at
time, but its an example and thats not the issue)


long long numbers[10];
long long i, j, sum=0;

for (i=0;i<10;i++)
{
j= numbers[i];
while (j)
{
if (j&1) sum++;
j >>= 1;
}
}

Take a look at the section "Counting the 1-bits in an Array" in the
following document: http://www.hackersdelight.org/revisions.pdf. It
shows how to reduce the complexity of a popultation count for arrays
down to O(log(n)) using CSAs. A population count instruction is still
useful as a building block for this approach. If you use the Intel
C/C++ compiler, Itanium's popcnt instruction should be accessible via
an intrinsic in ia64intrin.h: __int64 _m64_popcnt(__int64 a)

-- Norbert
Back to top
gacek
Guest





Posted: Mon Mar 07, 2005 4:22 am    Post subject: Re: using POPCNT in C/C++ on Itanium Reply with quote

thanks,




jacek
Back to top
Guest






Posted: Mon Mar 07, 2005 5:27 am    Post subject: Re: using POPCNT in C/C++ on Itanium Reply with quote

Norbert Juffa wrote:

Quote:
Take a look at the section "Counting the 1-bits in an Array" in the
following document: http://www.hackersdelight.org/revisions.pdf. It

On a related note...

http://www.jjj.de/fxt/fxtpage.html#fxtbook

also has a collection of interesting algorithms similar to that. The
pdf file is at the bottom of the page.

-t
Back to top
Eric Gouriou
Guest





Posted: Mon Mar 07, 2005 5:54 am    Post subject: Re: using POPCNT in C/C++ on Itanium Reply with quote

Norbert Juffa wrote:
Quote:
"gacek" <gaceknews@wp.pl> wrote in message news:d0epnv$4qa$1@korweta.task.gda.pl...
I have a vector of integers, I need to find how many bits are set to 1.
I'd like to use processor built in popcnt function, what should the
following code look like (I know I could check more than one bit at
time, but its an example and thats not the issue)


long long numbers[10];
long long i, j, sum=0;

for (i=0;i<10;i++)
{
j= numbers[i];
while (j)
{
if (j&1) sum++;
j >>= 1;
}
}


Take a look at the section "Counting the 1-bits in an Array" in the
following document: http://www.hackersdelight.org/revisions.pdf. It
shows how to reduce the complexity of a popultation count for arrays
down to O(log(n)) using CSAs. A population count instruction is still
useful as a building block for this approach. If you use the Intel
C/C++ compiler, Itanium's popcnt instruction should be accessible via
an intrinsic in ia64intrin.h: __int64 _m64_popcnt(__int64 a)

In HP's compiler, use the intrinsic _Asm_popcnt()
(pseudo signature "uint64_t _Asm_popcnt (uint64_t r3);")

In gcc:
uint64_t output;

__asm__ __volatile__ ("popcnt %0 = %1"
: "=%r" (output)
: "r" (input));

Writing a macro (/ inline function (hint)) that would present
a single interface on any of those compilers is left as an
exercise to the reader.

If the array is often really sparse (respectively really full),
you might want to check if the whole entry is 0 (respectively
0xFFF..F) and avoid a popcnt in those cases, yet this won't buy
much given the 3-cycle latency of a popcnt on Itanium2.

Eric
Back to top
 
Post new topic   Reply to topic    CASTalk.com Forum Index -> Computer Architecture All times are GMT
Page 1 of 1

 
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum




VoIP Electronics Powered by phpBB