16K pentium level one cache
CASTalk.com Forum Index CASTalk.com
Discussion of DSP, FPGA, storage and embedded system.
 
 FAQFAQ   MemberlistMemberlist     RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 
 
Google
 
Web castalk.com
16K pentium level one cache
Goto page Previous  1, 2, 3, 4  Next
 
Post new topic   Reply to topic    CASTalk.com Forum Index -> Computer Architecture
Author Message
karl malbrain
Guest





Posted: Fri Dec 03, 2004 9:51 pm    Post subject: Re: 16K pentium level one cache Reply with quote

Terje Mathisen <terje.mathisen@hda.hydro.com> wrote in message news:<cop37l$ben$1@osl016lin.hda.hydro.com>...
Quote:
karl malbrain wrote:

Terje Mathisen <terje.mathisen@hda.hydro.com> wrote in message news:<coju23$3oi$1@osl016lin.hda.hydro.com>...
Karl, all modern cpus, and in this regard even the Pentium is somewhat
modern (:-(), make it effectively impossible to get totally
deterministic timing.


Actually I just need to avoid DATA-DEPENDENT timings

Ah!!!

You're trying to avoid all _repeatable_ forms of DATA-DEPENDENT timings,
since these could be used to determine the current key value, right?

Right.

Quote:
Having really random variations would be OK, but anything that leaks
info is bad. :-(

(BTW, have you considered a (really) random length delay loop on the
exit of these function calls? Alternatively, time the current run, and
spin-loop until a minimum value is achieved. This would effectively
remove nearly all possibilities to do timing-related crypt-analysis!)

I believe this is done in implementations that need to withstand
physical access and protect their keys, like the standalone PC cards
used for electronic banking.

Quote:
Re. your bank conflict problems: Simply rearrange your Pentium version
of the code to never do more than one L1 cache access/cycle. On the P5
this is quite easy to do, and you really shouldn't give away any
significant performance by doing it.

Right. Does the NOP instruction stall only a single "pipeline"? Do
you know how the FENCE instruction interacts with LEVEL ONE cache?
Thanks, karl m
Back to top
Terje Mathisen
Guest





Posted: Fri Dec 03, 2004 11:43 pm    Post subject: Re: 16K pentium level one cache Reply with quote

karl malbrain wrote:

Quote:
Terje Mathisen <terje.mathisen@hda.hydro.com> wrote in message news:<cop37l$ben$1@osl016lin.hda.hydro.com>...
Re. your bank conflict problems: Simply rearrange your Pentium version
of the code to never do more than one L1 cache access/cycle. On the P5
this is quite easy to do, and you really shouldn't give away any
significant performance by doing it.

Right. Does the NOP instruction stall only a single "pipeline"? Do
you know how the FENCE instruction interacts with LEVEL ONE cache?

On the P5 either pipe can run a NOP, so it is perfect for keeping the
desired pairing after a branch or other u-pipe only operation has caused
a particular starting alignment.

FENCE doesn't exist on P5 afaik.

Terje

--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
Back to top
Terje Mathisen
Guest





Posted: Fri Dec 03, 2004 11:55 pm    Post subject: Re: 16K pentium level one cache Reply with quote

David Wagner wrote:

Quote:
Terje Mathisen wrote:

(BTW, have you considered a (really) random length delay loop on the
exit of these function calls? Alternatively, time the current run, and
spin-loop until a minimum value is achieved. This would effectively
remove nearly all possibilities to do timing-related crypt-analysis!)


I'm skeptical. It seems unlikely to me that this would be enough.
You've just decreased the S/N ratio a bit, but an attacker can still
pick the signal out of the noise by just taking more observations and
averaging them. (The noise will start to average out, eventually
leaving only the signal.)

Not really, or rather: If 'eventually' is past the lifetime of the key
(10 seconds to 25 years?), then it doesn't matter, right?

A spin loop doing RDTSC would keep going until past a certain value,
reducing the maximum possible information left to be modulo the length
of said loop.

However, since all timing measurements (by an attacker) would also have
to be either via the same RDTSC timestamps, or by observing network
packets, it would be exceedingly hard to pick anything out of the noise.

Using the remaining time until deadline as an index for a calculated
loop, setup in such a way as to get within a cycle or two of the desired
total value, would be enough to get well below the 10-25 cycles
(variable!) RDTSC latency.

I.e. I believe this is doable to a level where timing-dependent
cryptanalysis is effectively impossible without hw to observe the
system, but at that level of access it would be much easier to attack
the sw directly.

Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
Back to top
David Wagner
Guest





Posted: Sat Dec 04, 2004 12:23 am    Post subject: Re: 16K pentium level one cache Reply with quote

Terje Mathisen wrote:
Quote:
Would you then accept my other premise which was that crypto running on
a Pentium-class cpu, with direct attacker access to the machine, would
be _much_ easier to break by simply running under a debugger like SoftIce?

Yes, if your threat model is physical attacks, there are other better
attacks.

If your threat model is a malicious process running on the same machine,
it might be possible for that process to talk to the crypto engine but not
to run it under a debugger. Then the timing attack might be serious.

Quote:
For someone attacking this over the network, please suggest a scenario
where you could recover enough bits to break the key in less than, say
10 years?

See Boneh and Brumley's Usenix Security paper for evidence that remote
timing attacks may be possible (even in scenarios where nobody would have
expected them to be).
Back to top
Karl Malbrain
Guest





Posted: Sat Dec 04, 2004 2:36 am    Post subject: Re: 16K pentium level one cache Reply with quote

Terje Mathisen wrote:
Quote:
karl malbrain wrote:

Terje Mathisen <terje.mathisen@hda.hydro.com> wrote in message
news:<cop37l$ben$1@osl016lin.hda.hydro.com>...
Re. your bank conflict problems: Simply rearrange your Pentium
version
of the code to never do more than one L1 cache access/cycle. On the
P5
this is quite easy to do, and you really shouldn't give away any
significant performance by doing it.

Right. Does the NOP instruction stall only a single "pipeline"?
Do
you know how the FENCE instruction interacts with LEVEL ONE cache?

On the P5 either pipe can run a NOP, so it is perfect for keeping the

desired pairing after a branch or other u-pipe only operation has
caused
a particular starting alignment.

FENCE doesn't exist on P5 afaik.

I think I'm clear enough on the P5. The newer pentiums also issue
multiple instructions per cycle. Are there bank-conflict-stalls on
them? Emperically I'd say yes. Thanks, karl m
Back to top
Terje Mathisen
Guest





Posted: Sat Dec 04, 2004 3:20 am    Post subject: Re: 16K pentium level one cache Reply with quote

David Wagner wrote:

Quote:
Terje Mathisen wrote:

A spin loop doing RDTSC would keep going until past a certain value,
reducing the maximum possible information left to be modulo the length
of said loop.


That's different from a random delay, which is what you proposed
previously. Slowing the cipher down to its worst-case execution time
should work to defend against timing attacks. However, for AES, this
is extremely expensive, as the worst-case execution time is much, much
higher than the average-case.

In other words, I remain skeptical that you can get security at a
reasonable performance overhead.

OK.

Would you then accept my other premise which was that crypto running on
a Pentium-class cpu, with direct attacker access to the machine, would
be _much_ easier to break by simply running under a debugger like SoftIce?

For someone attacking this over the network, please suggest a scenario
where you could recover enough bits to break the key in less than, say
10 years?

Terje

--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
Back to top
Tony Nelson
Guest





Posted: Sat Dec 04, 2004 4:24 am    Post subject: Re: 16K pentium level one cache Reply with quote

In article <coqd60$rkn$1@agate.berkeley.edu>,
daw@taverner.cs.berkeley.edu (David Wagner) wrote:

Quote:
Terje Mathisen wrote:
A spin loop doing RDTSC would keep going until past a certain value,
reducing the maximum possible information left to be modulo the length
of said loop.

That's different from a random delay, which is what you proposed
previously. Slowing the cipher down to its worst-case execution time
should work to defend against timing attacks. However, for AES, this
is extremely expensive, as the worst-case execution time is much, much
higher than the average-case.

In other words, I remain skeptical that you can get security at a
reasonable performance overhead.

A delay loop that extended execution to some, say, 10th of worst-case
time would probably erase nearly all the escaping timing information, at
1/10th the cost of the worst-case delay. I don't think much information
is escaping through timing anyway, so lots of averaging and many trials
would normally be needed. If most of the timing information were
erased, it would certainly take many more trials to build up the key.
____________________________________________________________________
TonyN.:' tonynlsn@shore.net
'
Back to top
Karl Malbrain
Guest





Posted: Sat Dec 04, 2004 8:35 am    Post subject: Re: 16K pentium level one cache Reply with quote

David Wagner wrote:
Quote:
Terje Mathisen wrote:
For someone attacking this over the network, please suggest a
scenario
where you could recover enough bits to break the key in less than,
say
10 years?

See Boneh and Brumley's Usenix Security paper for evidence that
remote
timing attacks may be possible (even in scenarios where nobody would
have
expected them to be).

You are comparing an attack that measures differences of 10 million
cycles between RSA encryptions over a local network, and Professor
Bernstein's attack which is accumulating just a few cycles per AES
encryption. I don't see how this can be pulled out of the general
network noise.

On Monday I'll post an exact number.

N.b the paper is located at http://crypto.stanford.edu/~dabo under
papers/ssl-timing.pdf, and my figure comes from Table 1.

karl m
Back to top
Anton Ertl
Guest





Posted: Sat Dec 04, 2004 4:38 pm    Post subject: Timing-based attacks (was: 16K pentium level one cache) Reply with quote

Terje Mathisen <terje.mathisen@hda.hydro.com> writes:
Quote:
You're trying to avoid all _repeatable_ forms of DATA-DEPENDENT timings,
since these could be used to determine the current key value, right?
....
(BTW, have you considered a (really) random length delay loop on the
exit of these function calls? Alternatively, time the current run, and
spin-loop until a minimum value is achieved.

The second option is certainly much better against timing-based
attacks. However, the P5 and later CPUs have performance monitororing
counters that offer additional ways of unwanted information transfer;
a simple delay loop is probably not good enough, but a delay-loop that
does similar things to the AES loop might be good enough.

- anton
--
M. Anton Ertl Some things have to be seen to be believed
anton@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html
Back to top
Karl Malbrain
Guest





Posted: Sat Dec 04, 2004 11:23 pm    Post subject: Re: 16K pentium level one cache Reply with quote

Tony Nelson wrote:
Quote:
In article <coqd60$rkn$1@agate.berkeley.edu>,
daw@taverner.cs.berkeley.edu (David Wagner) wrote:

Terje Mathisen wrote:
A spin loop doing RDTSC would keep going until past a certain
value,
reducing the maximum possible information left to be modulo the
length
of said loop.

That's different from a random delay, which is what you proposed
previously. Slowing the cipher down to its worst-case execution
time
should work to defend against timing attacks. However, for AES,
this
is extremely expensive, as the worst-case execution time is much,
much
higher than the average-case.

In other words, I remain skeptical that you can get security at a
reasonable performance overhead.

A delay loop that extended execution to some, say, 10th of worst-case

time would probably erase nearly all the escaping timing information,
at
1/10th the cost of the worst-case delay. I don't think much
information
is escaping through timing anyway, so lots of averaging and many
trials
would normally be needed. If most of the timing information were
erased, it would certainly take many more trials to build up the key.

Unfortunately in cryptography it is assumed that one's adversary has
enough resources to expend in "many more" trials -- the current
standard is that anything below a 2^^128 trial barrier is not enough.

The information escaping through timing is just one cycle in 500 cycles
total. karl m
Back to top
Karl Malbrain
Guest





Posted: Sat Dec 04, 2004 11:38 pm    Post subject: Re: Timing-based attacks (was: 16K pentium level one cache) Reply with quote

Anton Ertl wrote:
Quote:
Terje Mathisen <terje.mathisen@hda.hydro.com> writes:
You're trying to avoid all _repeatable_ forms of DATA-DEPENDENT
timings,
since these could be used to determine the current key value, right?
...
(BTW, have you considered a (really) random length delay loop on the

exit of these function calls? Alternatively, time the current run,
and
spin-loop until a minimum value is achieved.

The second option is certainly much better against timing-based
attacks. However, the P5 and later CPUs have performance
monitororing
counters that offer additional ways of unwanted information transfer;
a simple delay loop is probably not good enough, but a delay-loop
that
does similar things to the AES loop might be good enough.

The fear is that a stream of 16 byte messages with programmed content
can be sent to a server and the resultant timing differences due to
level one and level two cache loads and cache-bank-conflicts can be
used to determine the key being used on the remote server.

karl m
Back to top
BRG
Guest





Posted: Fri Dec 10, 2004 3:06 pm    Post subject: Re: Timing-based attacks Reply with quote

Anton Ertl wrote:
Quote:
Terje Mathisen <terje.mathisen@hda.hydro.com> writes:

You're trying to avoid all _repeatable_ forms of DATA-DEPENDENT timings,
since these could be used to determine the current key value, right?

...

(BTW, have you considered a (really) random length delay loop on the
exit of these function calls? Alternatively, time the current run, and
spin-loop until a minimum value is achieved.

The second option is certainly much better against timing-based
attacks. However, the P5 and later CPUs have performance monitororing
counters that offer additional ways of unwanted information transfer;
a simple delay loop is probably not good enough, but a delay-loop that
does similar things to the AES loop might be good enough.

I wondered when someone would notice just how much information can be
obtained from the performance counters :-)

Brian Gladman
Back to top
Terje Mathisen
Guest





Posted: Sat Dec 11, 2004 4:06 am    Post subject: Re: Timing-based attacks Reply with quote

BRG wrote:
Quote:
I wondered when someone would notice just how much information can be
obtained from the performance counters :-)

Here in comp.arch I'm fairly certain it was discussed around the time it
was introduced on the P5/Pentium cpu.

AFAIR I studied the tech docs, which while they didn't say anything
about all the performance counters, did specify both how the RDTSC
opcode worked, and the fact that a secure OS could choose to block it by
setting the TSD (Time Stamp Disable) bit in one of the control registers.

Terje
PS. (For sci.crypt readers) I reverse engineered those EMON counters and
wrote an article about them: "Pentium Secrets", Byte July 1994, pp 191-192
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
Back to top
BRG
Guest





Posted: Sat Dec 11, 2004 5:42 pm    Post subject: Re: Timing-based attacks Reply with quote

Terje Mathisen wrote:
Quote:
BRG wrote:

I wondered when someone would notice just how much information can be
obtained from the performance counters :-)

Here in comp.arch I'm fairly certain it was discussed around the time it
was introduced on the P5/Pentium cpu.

Yes - I was only thinking of this realisation in the context of this
specific thread on sci.crypt.

Quote:
AFAIR I studied the tech docs, which while they didn't say anything
about all the performance counters, did specify both how the RDTSC
opcode worked, and the fact that a secure OS could choose to block it by
setting the TSD (Time Stamp Disable) bit in one of the control registers.

Terje
PS. (For sci.crypt readers) I reverse engineered those EMON counters and
wrote an article about them: "Pentium Secrets", Byte July 1994, pp 191-192

Is there a version of this available anywhere on the web?

Brian Gladman
Back to top
Terje Mathisen
Guest





Posted: Sun Dec 12, 2004 3:08 am    Post subject: Re: Timing-based attacks Reply with quote

BRG wrote:

Quote:
Terje Mathisen wrote:
PS. (For sci.crypt readers) I reverse engineered those EMON counters
and wrote an article about them: "Pentium Secrets", Byte July 1994, pp
191-192

Is there a version of this available anywhere on the web?

checking...

It seems like I don't have any copies myself. :-(

I know there used to be one on the byte.com web site, but that's years
ago...

Yes, it is still there!

http://www.byte.com/art/9407/sec12/art3.htm

It is slightly embarrasing reading it 10+ years later, but most of the
information, except the statement that you'd need kernel mode code to
access RDTSC, was correct.

Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
Back to top
 
Post new topic   Reply to topic    CASTalk.com Forum Index -> Computer Architecture All times are GMT
Goto page Previous  1, 2, 3, 4  Next
Page 3 of 4

 
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum




VoIP Electronics Powered by phpBB