"zero phase" FFT windows

 
FA (Guest)
Posted: Sun Sep 11, 2005 4:16 pm    Post subject: "zero phase" FFT windows

Hi All,
Has anyone else played around with "zero phase" windows for STFT?

The idea here is that instead of the "conventional" way of applying the
window where the zero'th window coefficient is applied to the zero'th input
sample in the FFT buffer etc. and all of the zero padding (if any) is
applied at the end, with a zero phase window you apply the centre
coefficient of the window to the zero'th input sample and circular index the
window applying any zero padding in the centre of the FFT buffer.
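
A minimal sketch of the buffer layout described above, in Python using only the standard library. The function name `zero_phase_buffer` and the restriction to odd window lengths (so that there is a single centre coefficient) are assumptions made for illustration, not something stated in the post.

```python
def zero_phase_buffer(frame, window, n_fft):
    """Window `frame` and lay it out so the window centre sits at index 0.

    Assumes an odd window length no longer than n_fft (illustrative choice).
    """
    m = len(window)
    if len(frame) != m or m % 2 == 0 or m > n_fft:
        raise ValueError("expects an odd window length no longer than n_fft")
    half = (m - 1) // 2
    windowed = [x * w for x, w in zip(frame, window)]
    buf = [0.0] * n_fft
    # Centre sample and the second half of the frame go at the start...
    buf[:half + 1] = windowed[half:]
    # ...the first half wraps round to the end; any zero padding stays
    # in the middle of the buffer.
    buf[n_fft - half:] = windowed[:half]
    return buf


# Rectangular "window" so the sample positions are easy to follow.
example = zero_phase_buffer([1.0, 2.0, 3.0, 4.0, 5.0], [1.0] * 5, 8)
# example == [3.0, 4.0, 5.0, 0.0, 0.0, 0.0, 1.0, 2.0]
```

In this layout the samples before the window centre end up at the tail of the FFT buffer, and the padding zeros sit between the two halves.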

The reason that I ask is because I've noticed a few interesting side effects
of this approach - the most interesting of which is that if you work out the
"real" frequency at each FFT bin by phase differentiation (pretty much using
the phase vocoder approach) what you will see is that there are "clusters"
of bins that all register the same "real" frequency.
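
The phase differentiation step can be sketched as follows. This is the textbook phase vocoder estimate, not necessarily the poster's code: the helper names, the periodic Hann window, the hop of a quarter frame and the 1010 Hz test tone are all assumptions, and a naive DFT is used only to keep the example free of external libraries.

```python
import cmath
import math


def dft_bin(x, k):
    """Naive DFT of x evaluated at a single bin k."""
    n = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))


def wrap(phi):
    """Wrap an angle into [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi


def bin_frequencies(signal, start, n, hop, fs, bins):
    """Estimate the frequency (Hz) seen by each bin from two frames `hop` apart."""
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]
    a = [signal[start + i] * window[i] for i in range(n)]
    b = [signal[start + hop + i] * window[i] for i in range(n)]
    out = {}
    for k in bins:
        # Phase advance expected if the component sat exactly on bin k.
        expected = 2.0 * math.pi * k * hop / n
        deviation = wrap(cmath.phase(dft_bin(b, k)) - cmath.phase(dft_bin(a, k)) - expected)
        out[k] = (k / n + deviation / (2.0 * math.pi * hop)) * fs
    return out


fs, n, hop = 8000.0, 256, 64
tone = [math.cos(2.0 * math.pi * 1010.0 * i / fs) for i in range(n + hop)]
estimates = bin_frequencies(tone, 0, n, hop, fs, [31, 32, 33, 34])
```

With a single stationary tone, the bins on either side of the peak all report a value close to the tone's frequency rather than their own bin centres, which is the clustering described above.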

This has two advantages:

The first is if you want to carry out peak picking to identify partials then
you will get much less frequency jitter than a conventional window in the
case where poor resolution causes subsequent frames to move up and down a
bin or two.

The second advantage is really cool - basically you can identify partials
that aren't actually FFT peaks - e.g. partials obscured by spectral
smearing. What I've done is to write a loop which looks for frequency
clusters, and if it finds one it then looks for a frequency cluster around
the first and second harmonic. The reason for the harmonic test is that
this approach isn't completely infallible. However, with a few heuristics
like the harmonic test it has given significantly higher perceived
frequency resolution for low pitches in my sinusoidal modelling
application than I would otherwise have been able to get with a given window
size - which has obvious time resolution benefits.
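
One way such a loop might look, as a rough sketch only. The post does not give the actual algorithm, so the function names, the tolerances, the minimum cluster size, and the reading of "first and second harmonic" as 2x and 3x the candidate frequency are all assumptions.

```python
def find_clusters(freqs, tol_hz=2.0, min_bins=3):
    """Group runs of adjacent bins whose frequency estimates agree within tol_hz.

    Returns (mean_frequency, first_bin, last_bin) tuples.
    """
    clusters = []
    start = 0
    for k in range(1, len(freqs) + 1):
        # Close the current run at the end of the list or when the estimate jumps.
        if k == len(freqs) or abs(freqs[k] - freqs[k - 1]) > tol_hz:
            if k - start >= min_bins:
                run = freqs[start:k]
                clusters.append((sum(run) / len(run), start, k - 1))
            start = k
    return clusters


def has_harmonic_support(f0, clusters, multiples=(2, 3), rel_tol=0.03):
    """True if some cluster lies near every listed multiple of f0."""
    centres = [c[0] for c in clusters]
    return all(
        any(abs(c - m * f0) <= rel_tol * m * f0 for c in centres)
        for m in multiples
    )


def supported_fundamentals(freqs, tol_hz=2.0, min_bins=3):
    """Clusters that also have clusters near their assumed harmonics."""
    clusters = find_clusters(freqs, tol_hz, min_bins)
    return [c for c in clusters if has_harmonic_support(c[0], clusters)]


# Hypothetical per-bin frequency estimates (Hz), one entry per FFT bin.
per_bin = [5.0, 40.0, 100.2, 100.0, 99.9, 100.1, 160.0, 200.3, 200.1,
           200.0, 250.0, 300.2, 299.9, 300.0, 330.0, 431.0]
clusters = find_clusters(per_bin)
candidates = supported_fundamentals(per_bin)
```

Here three clusters are found (around 100, 200 and 300 Hz), and only the one near 100 Hz passes the harmonic test, since it alone has clusters near both of its assumed multiples.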

Anyway - I just wanted to see if anyone else has tried this, I've seen the
zero phase window used in a couple of places, but I've not seen anyone else
use the "frequency clustering" property.

Interested in comments.

TTFN
Fraser.

Rune Allnor (Guest)
Posted: Sun Sep 11, 2005 11:33 pm    Post subject: Re: "zero phase" FFT windows

FA wrote:
Quote:
Hi All,
Has anyone else played around with "zero phase" windows for STFT?

The idea here is that instead of the "conventional" way of applying the
window where the zero'th window coefficient is applied to the zero'th input
sample in the FFT buffer etc. and all of the zero padding (if any) is
applied at the end, with a zero phase window you apply the centre
coefficient of the window to the zero'th input sample and circular index the
window applying any zero padding in the centre of the FFT buffer.

OK... the center window coefficient of an M length window is applied
to x[0]. You use the wrap-around effect to map the "leading edge" of
the window to the end of the data buffer... so far so good.

What part of the data window is mapped to the end? x[-M/2]..x[-1]?

Quote:
The reason that I ask is because I've noticed a few interesting side effects
of this approach - the most interesting of which is that if you work out the
"real" frequency at each FFT bin by phase differentiation (pretty much using
the phase vocoder approach) what you will see is that there are "clusters"
of bins that all register the same "real" frequency.

I don't understand. What do you mean by "real frequency"? How is
this approach different from examining the magnitude of the spectrum?

Quote:
This has two advantages:

The first is if you want to carry out peak picking to identify partials then
you will get much less frequency jitter than a conventional window in the
case where poor resolution causes subsequent frames to move up and down a
bin or two.

How do "conventional windows" shift the frequency? It is well known
that windowing broadens the frequency peaks, but do they shift?

Quote:
The second advantage is really cool - basically you can identify partials
that aren't actually FFT peaks - e.g. partials obscured by spectral
smearing. What I've done is to write a loop which looks for frequency
clusters and if it finds one it then looks for a frequency cluster around
the first and second harmonic - the reason for the harmonic test is that
this approach isn't completely infallible, however with a few heuristics
like the harmonic test it has managed to give significantly higher perceived
frequency resolution for low pitches when applied to my sinusoidal modelling
application than I would otherwise have been able to get with a given window
size - which has obvious time resolution benefits.

I don't understand what you mean. What is a "partial"? What is a
"frequency cluster"?

Quote:
Anyway - I just wanted to see if anyone else has tried this, I've seen the
zero phase window used in a couple of places, but I've not seen anyone else
use the "frequency clustering" property.

I don't understand much of what you do here. As far as I can tell,
doing a time shift in the DFT as you do here, would only affect the
phase term of the spectrum, not the magnitudes. So basically, I don't
see any reason why anything interesting should happen here.
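
The point about the time shift can be checked numerically. The sketch below uses a naive DFT and an arbitrary test signal (both chosen only for illustration) to confirm that a circular rotation of the buffer leaves the magnitudes unchanged and multiplies bin k by a fixed phase factor.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) DFT, adequate for a short demonstration."""
    n = len(x)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))
        for k in range(n)
    ]


def rotate_left(x, s):
    """Circularly shift x so that x[s] moves to index 0."""
    return x[s:] + x[:s]


n, s = 16, 5
x = [math.sin(0.7 * i) + 0.3 * math.cos(2.1 * i) for i in range(n)]
X = dft(x)
Y = dft(rotate_left(x, s))

# Magnitudes should agree bin by bin.
max_mag_diff = max(abs(abs(a) - abs(b)) for a, b in zip(X, Y))

# Shift theorem: Y[k] = X[k] * exp(+2j*pi*k*s/n) for a left rotation by s.
max_theorem_err = max(
    abs(b - a * cmath.exp(2j * math.pi * k * s / n))
    for k, (a, b) in enumerate(zip(X, Y))
)
```

Both `max_mag_diff` and `max_theorem_err` come out at rounding-error level. Since the phase factor depends only on the bin index and the shift, it is the same in every frame, which suggests that for a fixed window and hop the frame-to-frame phase difference at a given bin should be unaffected as well.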

If, on the other hand, you have been examining the real parts of the
spectrum, I would not be surprised if you see some effects.

Could you provide some more details about what you do, and how?

Rune

Richard Dobson (Guest)
Posted: Mon Sep 12, 2005 12:16 am    Post subject: Re: "zero phase" FFT windows

FA wrote:

Quote:
Hi All,
Has anyone else played around with "zero phase" windows for STFT?

The idea here is that instead of the "conventional" way of applying the
window where the zero'th window coefficient is applied to the zero'th input
sample in the FFT buffer etc. and all of the zero padding (if any) is
applied at the end, with a zero phase window you apply the centre
coefficient of the window to the zero'th input sample and circular index the
window applying any zero padding in the centre of the FFT buffer.

The reason that I ask is because I've noticed a few interesting side effects
of this approach - the most interesting of which is that if you work out the
"real" frequency at each FFT bin by phase differentiation (pretty much using
the phase vocoder approach) what you will see is that there are "clusters"
of bins that all register the same "real" frequency.
...

I am a bit puzzled by this observation; I have found this clustering happens in
all the phase vocoders I have studied (e.g. the CARL one written by Mark Dolson,
which I have based all my own work on), and I have assumed that it is a natural
emergent feature from the delta-phase calculation between frames (in pvoc, to
determine amplitude+frequency for each bin). So I am intrigued that there could
be a phase vocoder where this doesn't happen - seems like a contradiction in
terms. None of the pvocs that I am aware of does zero-padding (maybe they
should!), but some do "double-windowing" (option in CARL pvoc, default in
F.R.Moore pvoc).

The clustering of bins around a peak has been described by several authors in
the context of the goal of reducing phase vocoder transient smearing. A paper by
Miller Puckette (of PD fame) on the "phase-locked vocoder" was followed by a
paper by Dolson and Jean Laroche offering a more advanced method; both seem to
exploit this clustering effect (though Dolson and Laroche describe their
calculations in terms of phase tracking rather than frequency bunching; whether
this is a meaningful distinction is another matter).

So you may find it useful to study the Dolson/Laroche paper:

"Improved Phase Vocoder Time-Scale Modification of Audio",
IEEE Transactions on Speech and Audio Processing, Vol 7:3, 1999.

If you Google on that title you will find lots of other useful related material,
as that paper has been very widely cited.

A modern source for Dolson's phase vocoder is the Csound sources (look for the
streaming pvoc opcodes), or my own updated versions at:

http://dream.cs.bath.ac.uk/researchdev/pvocex/pvocex.html

I would be interested to have your observations on this in the context of your
"conventional" description - I thought CARL pvoc was already "conventional"!

Note that the method Dolson and Laroche describe is patented on behalf of
Creative Labs/Emu; it is employed for example in the "Audigy" range of
soundcards for their time-scaling facilities. I would have to re-read all the
literature on partial tracking to be sure that bin clustering per se has not
been exploited, but in all the pvocs I know of it is such an obvious phenomenon
that it is most unlikely it has escaped attention. The main problem as you
indicate is that of tracking weak partials (which may be genuine even if not
harmonically related to a suspected fundamental). So the research interest is not
so much that it happens, but in how best/accurately to exploit the fact that it
happens.

Richard Dobson

FA (Guest)
Posted: Wed Sep 14, 2005 9:26 pm    Post subject: Re: "zero phase" FFT windows

Hi Richard,
As you rightly observe, what I'm doing is pretty similar to a phase vocoder.
Basically I'm applying the phase vocoder phase differentiation to calculate
the "true" frequency at each bin. My app isn't a true phase vocoder: although
I use the phase differentiation that one does in a phase vocoder, I am
actually using McAulay-Quatieri Sinusoidal Modelling. That is to say, I do
peak detection and pruning (using a psychoacoustic model), but the frequency
(in my magnitude/frequency/phase parameterisation of each partial) is
obtained using the pvoc phase differentiation approach.
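
As a rough illustration of that combination, the sketch below picks local maxima of the magnitude spectrum and attaches the phase-derived frequency of each peak bin. The function name is hypothetical, and the fixed level floor relative to the strongest bin is only a stand-in for the psychoacoustic pruning mentioned above, which is not described in the thread.

```python
def pick_partials(mags, freqs, floor_db=-60.0):
    """Return (magnitude, frequency) pairs at local magnitude maxima.

    mags:  per-bin magnitudes
    freqs: per-bin frequency estimates from phase differentiation (Hz)
    floor_db: peaks weaker than this, relative to the strongest bin, are
              discarded (a crude placeholder for psychoacoustic pruning)
    """
    if len(mags) != len(freqs):
        raise ValueError("mags and freqs must have one entry per bin")
    peak = max(mags, default=0.0)
    if peak <= 0.0:
        return []
    threshold = peak * 10.0 ** (floor_db / 20.0)
    partials = []
    for k in range(1, len(mags) - 1):
        is_local_max = mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]
        if is_local_max and mags[k] >= threshold:
            partials.append((mags[k], freqs[k]))
    return partials


# Hypothetical spectrum: two strong peaks and one peak below the floor.
mags = [0.0, 0.2, 1.0, 0.3, 0.0001, 0.0005, 0.0001, 0.4, 0.1]
freqs = [0.0, 1009.0, 1010.0, 1010.5, 130.0, 160.0, 190.0, 2020.0, 2050.0]
partials = pick_partials(mags, freqs)
```

The frequency stored for each partial comes from the per-bin estimate rather than from the bin centre, so it is not limited to the FFT bin spacing.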

With respect to the zero phase window, my observation was simply that using
this type of window instead of the more conventional window gives less
frequency jitter when tracking between frames. I suspect that this is less of
an issue in a conventional pvoc because all of the bins are considered,
whereas in Sinusoidal Modelling only those bins corresponding to true
sinusoidal partials are considered for modification and resynthesis.

Apologies - my posting was intended to stimulate discussion, but as you
observe I had probably combined two concepts - that of the zero phase window
and that of the frequency clustering effect - which probably just confused
matters :-(

I'll take a look at some of the references you have mentioned - I'm
interested to see how others have made use of this phenomenon - as you say
it's such a noticeable effect when pvoc'ing.

It has certainly given noticeable improvements to my pitch shifting
application so I reckon that there is at least some mileage in it.

Regards,
Frase.


FA (Guest)
Posted: Wed Sep 14, 2005 9:29 pm    Post subject: Re: "zero phase" FFT windows

Howdy Rune,
If you take a look at Richard Dobson's reply in this thread, he's on the
right track: when I refer to "true" frequencies I mean the frequencies that
have been obtained on each bin by using the phase vocoder phase
differentiation approach.

The term "partial" is quite a common term from Sinusoidal Modelling. It
refers simply to spectral peaks, or more accurately to real spectral peaks
as opposed to sidelobe peaks; the trick in Sinusoidal Modelling is to
extract the partials but not the sidelobes.

The frequency clustering that I refer to is an artefact of the phase vocoder
algorithm, and as I said in my original post I've made use of that effect in
my application to identify likely partials that aren't actually spectral
peaks. (When I said earlier that partials are spectral peaks, what I was
trying to say was that partials are perceptually significant sinusoidal
components - impulses in the frequency domain if you like.)

Regards,
Frase.

