Re: [oc] Processor Instruction reply for Andreas



On Tuesday 11 December 2001 05:19 pm, you wrote:
> Rudi,
>
> <snip>
>
> > The advantages and disadvantages are:
> >
> > For 1:
> > + single instruction, saves program space
>
> My single instruction would probably be as big as two old ones
> but overall the saving would be around 30% in tests so far.

OK, fine, I'll give you this one; there are different ways of defining
the instructions and their sizes ...

> > - very long pipeline, branches execute many cycles after
> >   issuing (long delay slots)
>
> Who has a very long pipeline? My design doesn't affect the length
> of the pipelining. It may allow it to be reduced without any loss
> in speed. I haven't coded the whole thing yet (working on translator
> at the moment) but as soon as I do I'll let you know how it goes.

Hmm, let's see. We are talking about a CPU implementation in an IC, right?
Assuming your answer is yes, let me ask you some questions:
1) How are you going to implement the compare and branch
    instructions in a short pipeline and run at high speed?
2) Have you ever coded an adder/subtracter?

Assuming you have something like a 0.18 micron technology available to
you, with an average gate delay of 100 ps (approx.), how long do you
think your pipeline would be and what speed do you think you could
run it at?
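
As a rough way to frame that question: if a pipeline stage has N levels
of logic and each gate takes about 100 ps, the stage delay bounds the
clock rate. The sketch below is back-of-the-envelope only; the function
name and the overhead_ps parameter (register setup, clock skew, wiring)
are assumptions for illustration, not figures from any real process.

```python
def max_clock_mhz(logic_levels, gate_delay_ps=100.0, overhead_ps=0.0):
    """Upper bound on clock rate for one pipeline stage.

    logic_levels  -- gates on the longest path between two registers
    gate_delay_ps -- average delay per gate (100 ps is the figure quoted above)
    overhead_ps   -- assumed per-stage overhead; 0 means an ideal register
    """
    cycle_ps = logic_levels * gate_delay_ps + overhead_ps
    # A period of 1 ps corresponds to 1e6 MHz, so MHz = 1e6 / period_in_ps.
    return 1e6 / cycle_ps


if __name__ == "__main__":
    for levels in (10, 20, 40, 65):
        print(f"{levels:3d} levels -> {max_clock_mhz(levels):7.1f} MHz")
```

With these assumptions, 20 levels of logic per stage gives a 2 ns cycle,
i.e. 500 MHz, before any register or wiring overhead is counted.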


> > - Depending on implementation, may take different number of
> >  cycles to execute, when a branch is taken, vs. not taken.
>
> Look-ahead as implemented for years now automatically scans for
> jumps to make sure it has the 'alternative' piece of code ready
> if the jump is actually committed thus jumps don't 'surprise+delay'
> the pipelining. I think they even use 2+ integer pipelines in most
> processors these days maybe even for this reason?

I didn't think you were going to make it that complex. OK, you can
prefetch both ways, and not lose anything either way ...
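
To put numbers on the taken vs. not-taken point, a simple cycle-count
model may help. This is a sketch under stated assumptions: one cycle per
instruction when nothing stalls, a fixed penalty per branch outcome, and
hypothetical names and workload figures chosen only for illustration.

```python
def total_cycles(instructions, branch_fraction, taken_fraction,
                 taken_penalty, not_taken_penalty=0):
    """Estimate cycles for a run of instructions with branch stalls.

    Assumes one cycle per instruction plus a fixed stall per branch,
    which may differ between taken and not-taken branches.
    """
    branches = instructions * branch_fraction
    stall_per_branch = (taken_fraction * taken_penalty
                        + (1.0 - taken_fraction) * not_taken_penalty)
    return instructions + branches * stall_per_branch


if __name__ == "__main__":
    # Hypothetical workload: 1000 instructions, 20% branches, half taken.
    stalled = total_cycles(1000, 0.2, 0.5, taken_penalty=2)
    both_paths = total_cycles(1000, 0.2, 0.5, taken_penalty=0)
    print(stalled, both_paths)
```

In this model, prefetching both paths corresponds to a penalty of zero
for either outcome, so the cycle count equals the instruction count; the
cost moves into the extra fetch hardware, which the model does not show.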

> > For 2:
> > - Two instructions, need larger program memory
>
> Given that not every instruction is suitable for conversion, yes: tests
> show an average 60-70% increase in size. Instructions like
> INT 21h are a bit wasteful in 64-bit instructions, but I would rather waste
> one byte of program space than one processor clock cycle. If it's a choice
> between using a '200MHz performance' processor with a 2MByte program or a
> '360MHz performance' processor with a 3.2MByte program, I would go for the
> faster one. I usually find it easier to add another DIMM than to overclock
> my processor by 80%.

Perhaps I misunderstood your entire idea. From what I read above, are you
thinking about a VLIW implementation?

> > + short pipeline
>
> There's no reason why the pipeline should be any shorter on either design,
> but if you want it shorter then my design may aid that.

All the designs I have worked on required a short pipeline at very high speed.

>
> > + faster execution
>
> Mine? Of course, fewer clock cycles to do the same job means faster
> execution of the same task.
>
> > To summarize, a RISC architecture typically tries to keep its instructions
> > very simple and easy to execute. The goal for a RISC is to execute a
> > small number of instructions at a very high speed. This results in
> > overall faster execution, even though one might argue that the complex
> > CISC instructions "do more". Compared to a RISC, a CISC is typically
> > a lot slower.
>
> If anything, my design, depending on how you view it, either reduces the
> instruction set or keeps the same number of instructions. I'll have more
> details and a sample design when I finish my translator. I want to get that
> done and finished before jumping into implementing a complete CPU core!

You should consider actual implementation details at an early stage. I
would even consider doing a trial implementation to see how many levels
of logic you have in the worst case.
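
A quick software model can give a first estimate of logic depth before
any HDL is written. The sketch below builds a ripple-carry adder out of
2-input gates and tracks the longest gate path to each signal. It is an
illustration only: the Sig class and helper names are made up here, every
gate (XOR included) is counted as one level, and a real design would
likely use a faster carry scheme and a synthesis tool's timing report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sig:
    value: int  # logic value, 0 or 1
    depth: int  # 2-input gates on the longest path leading to this signal


def gate(op, a, b):
    # Output depth is one more than the deeper of the two inputs.
    return Sig(op(a.value, b.value), max(a.depth, b.depth) + 1)


def AND(a, b): return gate(lambda x, y: x & y, a, b)
def OR(a, b): return gate(lambda x, y: x | y, a, b)
def XOR(a, b): return gate(lambda x, y: x ^ y, a, b)


def ripple_add(a, b, width):
    """Add two width-bit integers; return (sum with carry-out, worst depth)."""
    carry = Sig(0, 0)
    total = 0
    worst = 0
    for i in range(width):
        ai = Sig((a >> i) & 1, 0)
        bi = Sig((b >> i) & 1, 0)
        p = XOR(ai, bi)                 # propagate
        g = AND(ai, bi)                 # generate
        s = XOR(p, carry)               # sum bit
        carry = OR(g, AND(p, carry))    # carry into the next bit
        total |= s.value << i
        worst = max(worst, s.depth, carry.depth)
    return total | (carry.value << width), worst


if __name__ == "__main__":
    result, levels = ripple_add(0xFFFFFFFF, 1, 32)
    print(hex(result), levels)
```

Under these assumptions the carry chain adds two gate levels per bit, so
a 32-bit ripple-carry add comes out at 65 levels; at roughly 100 ps per
gate that is about 6.5 ns for the compare alone, which is why the
compare-and-branch question matters for a short, fast pipeline.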

> Paul

rudi