Talk:Vectorization (parallel computing)
The contents of the page were merged into Automatic vectorization. For the contribution history and old versions of the merged article please see its history.
Pipeline, run-time and other issues
I gave this article a good refactoring, but some rough edges remain.
- old vector machines are still in use and some simpler (embedded) processors are not as powerful as cutting-edge versions. This is why I said "some times". Feel free to re-phrase that, of course.
- some processors can detect repetition in the load/store pattern and automatically change from normal instructions to vector instructions and the dynamic vectorization in the paper you mention is really interesting, but I was trying to cover "software" vectorizations. In the text I explain that it's not "vectorizing at run-time" but actually "choosing the vectorized version at run-time". Maybe a better title for the topic is needed, but I can't think of one that is as "catchy" as that... ;)
- I'll add memory reference optimizations (induction, reduction, etc) too, but I've been busy with loop dependence analysis and normalized loop.
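To make the "choosing the vectorized version at run-time" point concrete, here is a minimal sketch in C. It is only an illustration under stated assumptions: the names no_overlap, add_scalar, add_vector4 and add_versioned are hypothetical, and the 4-element step merely stands in for real SIMD instructions. The compiler would emit both loop versions plus the check; nothing is vectorized at run time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 1 if [p, p+n) and [q, q+n) do not overlap in memory. */
static int no_overlap(const float *p, const float *q, size_t n)
{
    uintptr_t a = (uintptr_t)p, b = (uintptr_t)q;
    size_t bytes = n * sizeof(float);
    return a + bytes <= b || b + bytes <= a;
}

/* Scalar version: correct for any aliasing between dst and src. */
static void add_scalar(float *dst, const float *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = dst[i] + src[i];
}

/* Stand-in for the vector version: 4 elements per step, all loads
   of a step done before its stores. Only valid without overlap. */
static void add_vector4(float *dst, const float *src, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        float t0 = dst[i]     + src[i];
        float t1 = dst[i + 1] + src[i + 1];
        float t2 = dst[i + 2] + src[i + 2];
        float t3 = dst[i + 3] + src[i + 3];
        dst[i] = t0; dst[i + 1] = t1; dst[i + 2] = t2; dst[i + 3] = t3;
    }
    for (; i < n; i++)              /* leftover elements */
        dst[i] = dst[i] + src[i];
}

/* The run-time check picks a version. Returns 1 if the vector
   version ran, 0 if the scalar fallback ran. */
int add_versioned(float *dst, const float *src, size_t n)
{
    if (no_overlap(dst, src, n)) {
        add_vector4(dst, src, n);
        return 1;
    }
    add_scalar(dst, src, n);
    return 0;
}
```

Calling add_versioned(c + 1, c, 8) on a single array takes the scalar path, because each result feeds the next iteration.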
Some issues on the descriptions added on Dec. 1. 2009
- "pipeline synchronization" in the last line of the Background section: Pipeline synchronization is not a factor for slow vector code for modern SIMD architectures. It used to be the case for old vector machines but not anymore.
- The description of runtime vectorization in the Run-time vs. Compile-time vectorization section: The description of runtime vectorization is misleading. The current description is about compile-time check-code generation for runtime values, which is part of 'code versioning'. I've seen some people use the term this way, but to me runtime vectorization refers to techniques that convert scalar instructions to their vector counterparts at run time, such as the hardware technique described in the following paper.
So, the whole issue seems to stem from the fact that the two different SIMD architectures (one multimedia extension type and the other old pipelined vector machines) are described together in one place. Another example of this is that you say 'a[i] = a[i+1];' cannot be vectorized in the 'Building the dependency graph' section, but below, in the 'Detecting idioms' section, you say 'a[i] = a[i] + a[i+1];' can be vectorized. As you know, 'a[i] = a[i] + a[i+1];' cannot be vectorized in modern SIMD architectures in a normal sense. So, my hope is that someone could make this distinction when referring to features that are supported in only one of the two types of SIMD architectures. —Preceding unsigned comment added by Vermin8tr (talk • contribs) 17:07, 6 December 2009 (UTC)
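For anyone who wants to experiment with the 'a[i] = a[i] + a[i+1];' loop, here is a small sketch in C. It compares the scalar loop with a 4-element chunked version that loads a whole chunk before storing it. The function names and the chunk width of 4 are assumptions made for illustration. The sketch only checks the data dependence; it says nothing about alignment or about whether a particular SIMD instruction set supports the shifted load.

```c
#include <stddef.h>

/* Scalar reference: a[i] = a[i] + a[i+1] for i in [0, n-1). */
void shift_add_scalar(int *a, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
        a[i] = a[i] + a[i + 1];
}

/* Chunked version: each chunk of 4 loads a[i..i+3] and a[i+1..i+4]
   first, then stores a[i..i+3]. The element a[i+4] is still the old
   value at that point, because the next chunk has not been stored. */
void shift_add_chunked(int *a, size_t n)
{
    size_t i = 0;
    for (; i + 4 < n; i += 4) {     /* a[i+4] must exist */
        int lo[4], hi[4];
        for (size_t k = 0; k < 4; k++) {
            lo[k] = a[i + k];
            hi[k] = a[i + k + 1];
        }
        for (size_t k = 0; k < 4; k++)
            a[i + k] = lo[k] + hi[k];
    }
    for (; i + 1 < n; i++)          /* leftover elements, scalar */
        a[i] = a[i] + a[i + 1];
}
```

Under the load-before-store assumption the two functions produce the same array, since every iteration reads only values that have not yet been overwritten.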
Needs an example
- scalar program: for (i=0;i<10;i++) c[i]=a[i]+b[i]
and the equivalent vector version and how the vector version would run faster.
- x86's MMX instruction set provides facilities for adding multiple numbers together. MMX registers are 64 bits wide and can operate on 1 quadword, 2 doublewords, 4 words or 8 bytes. In your example, if we are allowed to assume c/a/b are all bytes (8 bits), then we can use the PADDB (packed add bytes) instruction to add 8 numbers at the same time. For simplicity, I'll change your example to loop to i<8 instead. So the equivalent code, using ASM syntax:
 MOVQ mm0, [a]   ; Load a[0] to a[7] into the mm0 register
 MOVQ mm1, [b]   ; Load b[0] to b[7] into mm1
 PADDB mm0, mm1  ; Add the bytes of mm0 and mm1 together, storing result in mm0
 MOVQ [c], mm0   ; Store the result into c[0] to c[7]
- For simplicity, we'll assume (incorrectly) that each instruction takes about the same time.
- Therefore, the scalar version requires:
- 1 initialisation (i = 0)
- 8 checks (i < 8)
- 8 increments (i++)
- 16 loads (get a[i] and b[i])
- 8 adds (a[i] + b[i])
- 8 stores (c[i] = result)
- = 49 instructions*
- The MMX version requires:
- 2 loads (MOVQ mm0/mm1, [a]/[b])
- 1 add (PADDB)
- 1 store (MOVQ [c], mm0)
- = 4 instructions
- The number of instructions is reduced to 4/49 ≈ 8%. Going by instruction count alone, the code is 12.25 times faster (not really, but probably at least 8x in real life).
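For readers without an MMX toolchain at hand, the same idea can be sketched in portable C. This is not MMX code: it imitates the byte-wise wrap-around add of PADDB inside an ordinary 64-bit integer, using a masking trick so that carries never cross from one byte lane into the next. The function names are made up for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Scalar version: one add per element, 8 iterations. */
void add_bytes_scalar(uint8_t *c, const uint8_t *a, const uint8_t *b)
{
    for (int i = 0; i < 8; i++)
        c[i] = (uint8_t)(a[i] + b[i]);   /* wraps modulo 256 */
}

/* Packed version: 8 byte lanes handled in one 64-bit word.
   The low 7 bits of every lane are added normally; the top bit of
   each lane is then fixed up with XOR, so no carry leaves a lane. */
void add_bytes_packed(uint8_t *c, const uint8_t *a, const uint8_t *b)
{
    const uint64_t LOW7 = 0x7f7f7f7f7f7f7f7fULL;
    const uint64_t HIGH = 0x8080808080808080ULL;
    uint64_t x, y, sum;

    memcpy(&x, a, 8);                    /* like MOVQ mm0, [a] */
    memcpy(&y, b, 8);                    /* like MOVQ mm1, [b] */
    sum = ((x & LOW7) + (y & LOW7)) ^ ((x ^ y) & HIGH);
    memcpy(c, &sum, 8);                  /* like MOVQ [c], mm0 */
}
```

Both functions give the same eight result bytes. The packed one does it with two loads, a handful of integer operations and one store instead of a loop, which is the effect the instruction count above is getting at.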
deleted to include the features of modern SIMD architectures
I deleted the following sentence because it describes a feature of conventional vector machines. Such a feature is better described in vector processor. Modern SIMD instructions, for example those of AltiVec and SSE, are not much different from scalar instructions in terms of instruction latency. As with scalar instructions, ILP can be exploited for SIMD instructions as well.
since while there may be some overhead to starting up a vector operation, once it starts each individual operation is faster (in part because it avoids the need for instruction decoding).
I deleted the following paragraph, because it is unrelated to the theme. Maybe it should fit a "vectorization (computer graphics)" or something :
Raster-to-vector conversion is an important operation for computer graphics software. Raster images are the natural output of scanners and digital photography, they are the mainstay of TV and digital printing, and they can be created (with varying levels of skill) by users of bitmap-handling software such as Adobe PhotoShop or Corel PhotoPaint. However, raster images do not scale well, and for many purposes vector graphics are to be preferred. Raster-to-vector conversion software needs to: decode raster file formats, detect colour boundaries in images, simplify boundaries into smaller numbers of vectors (typically lines, arcs and Bezier curves), and write out vector files in suitable formats. Well-known raster-to-vector converters include Corel Trace, Adobe Streamline, and most sign-making programs.
I just created a page Vectorization (mathematics) which discusses a different use of the term. I put some disambiguation at the beginning of this article for now. Would it be OK with those who watch this article if it were moved to something like Vectorization (computer science) so that the unmodified Vectorization could be used as a disambiguation page? I don't know how this will affect the proposed merge. Michael Kinyon 17:10, 24 September 2006 (UTC)
- My thoughts exactly. I do not think either use is significantly more prominent than the other. --CyHawk (talk) 00:02, 6 March 2008 (UTC)
computer graphics is part of computer science: still a problem
We have a problem with the present disambiguation, which needs to be sorted out between vectorization, this article, and Vectorization_(computer_graphics). The problem is that computer graphics is part of computer science. So we really should rename this page with a (subject) that distinguishes it from the vectorisation used in computer graphics. It still seems a bit weird to me that the heterogeneous representation of an image is called "vector graphics", while raster representations, which can much more obviously be thought of as series of vectors (or matrices), are "non-vectorised", but that's an etymology problem for that other page, not this one. Maybe the idea is that the different pre-defined component objects of an image are something like basis vectors? Anyway....
So what should this article be? i don't see it fitting obviously into any of the main subdivisions in the computer science article. Template:Computer Science has a longer list of subdivisions of computer science. Concurrent computing and Concurrency (computer science) seem to be rather closely related to each other and to this article. My suggestions are:
- Vectorization (concurrent computing) or
- Vectorization (parallel computing)
- I like those. I would suggest renaming the other page first, or even both pages. I like Vectorization (vector graphics) or maybe Vectorization (digital illustration) for the other page. When I see "Vectorization (computer graphics)", the first thing that comes to mind is the hardware vector processing that goes on in a GPU. —Ben FrantzDale (talk) 14:33, 29 October 2010 (UTC)
- Regarding the hardware vector processing in a GPU, i see your point. However, i think the idea of a (subject) for title disambiguation is supposed to preferably use a wider subject rather than a narrower one. Computer graphics is one of the main subjects in the lefthand section of Template:Computer Science. My guess for what is now Vectorization (computer graphics) is that there is probably not much case for it to be a separate article from vector graphics, i.e. it should be a redirect to a subsection of vector graphics, and the content merged into that article. But that would require more work than just a move. And it would require discussion over at the talk page there: Talk:Vectorization (computer graphics). IMHO it's less urgent than renaming this page. And it's not that obvious that a merge+redirect would be a good idea, since vector graphics is reasonably big.
Unconventional old fashion computers?
This sentence is a mess:
Vector processing is a major feature of both conventional and modern supercomputers.
- Good point. Be bold and feel free to rephrase it. —Ben FrantzDale (talk) 00:55, 16 April 2009 (UTC)
image with example code
code in Building the dependency graph section
This code is not a good example. When i=0, a[0] = a[-16]. Reading an element from an array with an index < 0 is not a good idea. — Preceding unsigned comment added by 184.108.40.206 (talk) 21:03, 11 March 2014 (UTC)
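One possible fix, sketched in C on the assumption that the article's loop has the form a[i] = a[i-16] over a 128-element array (the loop itself is not quoted here, so N, DIST and the function name are guesses for illustration): start the index at 16 so that every read stays inside the array.

```c
#include <stddef.h>

#define N    128   /* assumed array length */
#define DIST 16    /* assumed dependence distance */

/* Each element is copied from the one DIST positions earlier.
   Starting at i = DIST keeps the read index i - DIST >= 0. */
void copy_back(int *a)
{
    for (size_t i = DIST; i < N; i++)
        a[i] = a[i - DIST];
}
```

The dependence distance is still 16, so the loop keeps whatever property the section wants to illustrate; only the out-of-bounds read is gone.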