
Re: [oc] New 64bit instructions for 32bit processor cores analogy



> 8bit cpus? The goal with the enhanced (vs. new) instructions
> is performance. The original message was to detail an idea

Before starting to experiment with these kinds of things, you definitely
should get hold of a copy of Hennessy and Patterson's
'Computer Architecture: A Quantitative Approach'.
It's _the_ standard computer architecture book.

> that 32bit processor cores could work smarter not harder
> (RISC or CISC) using 64BIT INSTRUCTIONS to reduce the

While 32 bit instructions have some advantages over 16 bit ones, I
really can't see how you could make use of 64 bit. It's not like a
normal processor would have a need for 65536 registers or something
like that. VLIWs and such (like the Itanium) are of course another matter.

> amount of clock cycles required to achieve the same task
> with the processor. The examples I citied were a CMP/JMPcc

The standard form of compare-and-branch is to have it only compare
a register against 0. That way you can do without the ALU delay and
still cover the vast majority of conditional branches without any
extra instructions (take a look at the book mentioned above for numbers).
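
To make the point concrete, here is a rough Python sketch of why a
compare against zero is cheap: every condition can be derived from
"all bits clear" and the sign bit, with no subtraction involved. The
32 bit width and the condition names (eqz, nez, ltz, ...) are my own
assumptions for illustration, not the mnemonics of any particular CPU.

```python
WIDTH = 32                      # assumed register width
MASK = (1 << WIDTH) - 1
SIGN = 1 << (WIDTH - 1)

def is_zero(reg):
    """All bits clear; in hardware this is a wide NOR, not an ALU operation."""
    return (reg & MASK) == 0

def is_negative(reg):
    """Two's complement sign test: just look at the top bit."""
    return (reg & SIGN) != 0

def branch_taken(cond, reg):
    """Decide a compare-with-zero branch from the two cheap tests above."""
    z, n = is_zero(reg), is_negative(reg)
    conditions = {
        "eqz": z,
        "nez": not z,
        "ltz": n,
        "gez": not n,
        "lez": z or n,
        "gtz": not (z or n),
    }
    return conditions[cond]
```

Comparing two arbitrary registers for less-than, by contrast, needs
the subtraction first.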

> combination and a memory/register move whilst performing
> another operation on the code, ADD, SUB, OR, AND, etc...

Three address instructions such as
add r1, r2, r3   ; r3 = r1 + r2
are standard on most RISCs I'm aware of.

Things like automatic post-increment on moves to/from memory are
also available, for example on the PPC.

> > True all our instructions today are based upon the instructions
> of years ago. I don't think since the 68000 appeared there has
> been any major jumps in instruction development (to the best of

You definitely need to look at some modern architectures.
PPC, SPARC and ARM are a couple of examples.
 
> tested old-faithful ones. Even when RISC arrived, yes they reduced
> the instruction count but the ones left were just the same old same

The idea with RISC is to have simple instructions, not few.

Having few instructions does not buy you anything as long as the
more feature rich instruction set can still be easily decoded, which
is generally not a problem with 32 bit instructions.
Not having to deal with instructions that can cause multiple page
faults (more than one memory access) is quite another matter.

> Who says because we have a 32bit processor core it has to use 32bit
> data path? Anybody who says that had better look at every Pentium

What definitions of 32 bit processor core and 32 bit data path do you
use here?
The size of the instructions has nothing to do with the size of data
a CPU can deal with. I'm not aware of any current CPU that can deal
with 64 bit integers that does not have 32 bit instructions (except the
Itanium). 'Data path' is usually used to describe the ALU and such.
Also, the size of integer data a CPU can deal with has nothing to do
with the size of its external bus. The external bus could be a single
bit wide and it wouldn't matter at all, as long as it was fast.
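
A small Python sketch of that last point, assuming a hypothetical bus
that moves fixed-size chunks, least significant first (the function
names are made up for the example). The same 64 bit value survives the
trip whatever the bus width; only the number of transfers changes.

```python
def bus_transfers(value, data_bits, bus_bits):
    """Split a data_bits-wide value into bus_bits-wide chunks, LSB chunk first."""
    mask = (1 << bus_bits) - 1
    count = -(-data_bits // bus_bits)   # ceiling division
    return [(value >> (i * bus_bits)) & mask for i in range(count)]

def reassemble(chunks, bus_bits):
    """Rebuild the original value on the other side of the bus."""
    value = 0
    for i, chunk in enumerate(chunks):
        value |= chunk << (i * bus_bits)
    return value

word = 0x0123456789ABCDEF
for width in (1, 8, 64):
    chunks = bus_transfers(word, 64, width)
    # A narrower bus needs more transfers, but the data is identical.
    print(width, len(chunks), reassemble(chunks, width) == word)
```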

> ever made. They all do it but still they have what I consider 'dumb'
> instructions.

The Pentiums have what I'd call a 32 bit data path. They currently
have 64 bit bus interfaces, though, and instructions can be anything
from a single byte to something like ten or so bytes.

> where you carry out the instructions sent to you. In the 'old school'
> way of doing it a CMP/JMPcc is like the following:
...

You wouldn't need 64 bits for that even with your proposed actual
comparison between two registers, much less for the almost as efficient
compare register to zero. 

As others have said here, you can't get away from the fact that the
branch can't be decided until your subtraction has completed.
Doing both in a single pipeline stage will obviously have implications
for the clock speed, since the subtraction and the branch decision
then have to fit in the same cycle.
Doing them in separate pipeline stages means that your branch, if
mispredicted, will cause an extra cycle delay due to the larger
'pipeline bubble'.
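
The trade-off can be put into a back-of-the-envelope model. This is
only a sketch: the formula is the usual "base CPI plus stall cycles"
estimate, and every number below (branch fraction, misprediction rate,
clock periods) is a made-up assumption, not a measurement.

```python
def cycles_per_instruction(branch_fraction, mispredict_rate,
                           penalty_cycles, base_cpi=1.0):
    """Average cycles per instruction including misprediction bubbles."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles

def time_per_instruction(cpi, clock_period_ns):
    """Average time per instruction for a given clock period."""
    return cpi * clock_period_ns

# Hypothetical: 20% branches, 10% of them mispredicted.
# Fused compare+branch stage: 1 cycle bubble, but a slower clock.
fused = time_per_instruction(cycles_per_instruction(0.2, 0.1, 1), 1.2)
# Separate stages: 2 cycle bubble, but a faster clock.
split = time_per_instruction(cycles_per_instruction(0.2, 0.1, 2), 1.0)
print(fused, split)
```

With these particular assumptions the slower clock costs more than the
extra bubble cycle, but different numbers can tip it the other way.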

I really like the way the PPC does it, with flag registers that you
can set at any time and then use later for your conditional branches.
IIRC, that processor also has a counter register for loops.

I don't know if any of the commonly used desktop/workstation RISCs have
a single flag register like the m68k and x86 (the ARM does). I'd expect
them to have several flag registers, like the PPC, or to set bits in
normal registers according to comparison results.

> 'Old School' for the move/add (add could be any other ALU operation)
...

As I mentioned above, you'll find three-address (r1 = r2 + r3)
instructions in lots of current RISCs.
This does not cause any slow-down compared to the old two-address
style (r1 = r1 + r2), but it does require a few more bits for the
third register field. With the normal 32 entry register files, the
three fields together still take only 15 bits (out of 32, usually),
which is hardly a problem.
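
The arithmetic is easy to check. Below is a Python sketch with a
made-up field layout (opcode in the top 17 bits, then three 5 bit
register fields); it is not the encoding of any real RISC, just an
illustration of how much room is left in a 32 bit word.

```python
def specifier_bits(num_registers, operands=3):
    """Total bits needed to name `operands` registers out of num_registers."""
    return operands * (num_registers - 1).bit_length()

REG_BITS = 5                            # 32 entry register file
OPCODE_BITS = 32 - 3 * REG_BITS         # what is left in a 32 bit word

def encode(opcode, ra, rb, rd):
    """Pack a three-address instruction (hypothetical layout) into 32 bits."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode out of range")
    for reg in (ra, rb, rd):
        if not 0 <= reg < (1 << REG_BITS):
            raise ValueError("register out of range")
    return (opcode << 15) | (ra << 10) | (rb << 5) | rd

def decode(word):
    """Unpack the fields again: (opcode, ra, rb, rd)."""
    reg_mask = (1 << REG_BITS) - 1
    return (word >> 15, (word >> 10) & reg_mask, (word >> 5) & reg_mask,
            word & reg_mask)

print(specifier_bits(32))       # three fields for a 32 entry register file
print(specifier_bits(65536))    # the 65536 register case mentioned earlier
```

Even the absurd 65536 register case needs 48 bits for its three
fields, so it is hard to see what a normal processor would do with the
rest of a 64 bit instruction.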
 
> Hopefully you can now see how by expanding the instructions to 64bit can
> produce more powerful versions of the 'old school' instructions. Whilst

Not really.

> As most memory now ships in 64bit wide DIMM modules just how long is it
> before
> CPU designers start taking proper advantage of it?

Uhm. No modern chip should have any difficulty saturating its memory bus
as it is. Indeed, a lot of time is already spent waiting for memory in
many applications.

-- 
  Chalmers University   | Why are these |  e-mail:   rand@cd.chalmers.se
     of Technology      |  .signatures  |            johank@omicron.se
                        | so hard to do |  WWW:      rand.thn.htu.se
   Gothenburg, Sweden   |     well?     |            (fVDI, MGIFv5, QLem)