
Matlab vectorization - why is it faster?

So I've read all sorts of things on the virtues of vectorizing Matlab code. It's got serious speed advantages (one listing showed a 40x speed increase on operations over a vector of 100,000 elements), but nobody seems to want to explain quite why.

So that's my question. I could understand it if, for instance, it were operating on shorts and could fit several in a register at a time, so we could perform operations on four values with a single processor instruction, but Matlab shows massive speed increases on long arrays of floating-point numbers (each more than one word wide).

It's my understanding that Matlab is implemented in C++, so I really can't see why it doesn't simply run for loops extremely quickly - and I can't work out how it performs these long vector operations without essentially running a for loop over the length of the array.
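The effect being asked about is easy to reproduce. A minimal sketch, in Python/NumPy rather than Matlab (an assumption: NumPy works the same way, dispatching whole-array operations to compiled loops, so the interpreted-loop-versus-compiled-kernel trade-off is directly analogous; the function names here are made up for illustration):

```python
import numpy as np
import timeit

n = 100_000
x = np.random.rand(n)

def loop_square(v):
    out = np.empty_like(v)
    for i in range(len(v)):      # every iteration pays interpreter overhead
        out[i] = v[i] * v[i]
    return out

def vec_square(v):
    return v * v                 # one call into a compiled per-element loop

t_loop = timeit.timeit(lambda: loop_square(x), number=10)
t_vec = timeit.timeit(lambda: vec_square(x), number=10)
print(f"loop: {t_loop:.4f}s  vectorized: {t_vec:.4f}s")
```

Both versions compute the same result; on a typical machine the vectorized form is one to two orders of magnitude faster, which is in line with the 40x figure mentioned above.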

If anyone could offer help, that would be much appreciated...

Thanks,

James

2 Answers

  • Pfo
    Lv 7
    1 decade ago
    Favorite Answer

    You've fallen for a little trick :)

    32 bit processors work best with 32 bits of data. You might think that using 16 bit integers would be faster, because the processor could handle two at a time, but that's totally false. It's actually slower to use values that aren't 32 bit, because the processor has to do some extra bookkeeping. When processing 32 bit values, the processor can stream them through its pipeline. When processing 16 bit values, the processor has to stall and wait for another 16 bit value to come into the pipeline, or execute without one if none arrives.

    I assume by vectorizing you mean using the streaming instructions of the processor? Well, the simple answer is that the streaming instructions are very quick for loops. Instead of the processor being dumb and shuffling values between memory and its registers, it knows that it's going to process N 32 bit values and do <this> with them. It doesn't bother using instructions to increment indices; that all happens internally. The net result is fewer CPU instructions to perform the same task as a for loop.
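The "fewer instructions" point can be made concrete at the interpreter level too. A rough sketch using CPython bytecode (an assumption: Matlab's interpreter differs in detail, but the shape of the overhead is similar) - the loop version re-executes its whole bytecode body once per element, while the vectorized form is a single dispatch into compiled code:

```python
import dis

def loop_square(out, v, n):
    for i in range(n):           # index bookkeeping done in interpreted bytecode
        out[i] = v[i] * v[i]

def vec_square(v):
    return v * v                 # one dispatch; the per-element loop is compiled

# Count the bytecode instructions the interpreter must juggle in each case.
loop_ops = len(list(dis.get_instructions(loop_square)))
vec_ops = len(list(dis.get_instructions(vec_square)))
print(loop_ops, vec_ops)
```

The loop's instruction count is paid on every iteration; the vectorized call's small count is paid once for the whole array.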

  • ?
    Lv 4
    4 years ago

    C++, C, and assembly are like speaking Russian without using an interpreter: learning Russian as a second language isn't easy, but in the end you'll speak faster. Java, Python, and Matlab are like hiring an interpreter to translate every word you say: you don't have to go through learning Russian, but it costs time. In C++, for example, you deal directly with memory addresses; there's nothing faster than that. With today's computers, though, the difference only really shows up when you have to solve and compute a lot of equations, say for 3-D graphics that must be rendered in near real time, or when you're running search techniques to find optimal solutions; using C or C++ for those tasks makes a great deal of difference.

    Processors run machine programs, and they only understand a low-level language like assembly. The trouble with a low-level language is that it consumes your time dealing with the technicalities of the processor, which is why compiled and interpreted languages exist. A compiled language generates code that deals directly with the machine; an interpreted language has to pass through a man-made interface, which slows the machine down but builds your intuition and keeps you from losing time on low-level memory issues and so on.
