G5-altivec

Best scalar floating point routines on PowerPC G5 Processor

Compile with GCC using -faltivec (the flag accepted by Apple's GCC; FSF GCC builds use -maltivec instead).

The code depends on an algebraic substitution that re-expresses the tight loop as written, so that successive instructions no longer have to wait on one another (breaking the instruction serialization).
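As a rough illustration of that kind of substitution (not the routine in G5.pim), consider a first-order recurrence `y[i] = a*y[i-1] + x[i]`. The recurrence, the function names, and the two-element unrolling below are all assumptions chosen for the sketch. Substituting `y[i]` into the expression for `y[i+1]` gives `y[i+1] = a*a*y[i-1] + (a*x[i] + x[i+1])`, so the loop-carried chain advances two elements per dependent multiply-add, and the bracketed term can be computed independently of the previous result.

```c
#include <stddef.h>

/* Reference form: every iteration waits on the previous result. */
void recur_serial(float a, const float *x, float *y, size_t n, float y0)
{
    float prev = y0;
    for (size_t i = 0; i < n; i++) {
        prev = a * prev + x[i];
        y[i] = prev;
    }
}

/* Re-expressed form (hypothetical example): two outputs per iteration,
 * with only one multiply-add on the loop-carried dependency chain. */
void recur_unrolled(float a, const float *x, float *y, size_t n, float y0)
{
    const float a2 = a * a;          /* a^2, computed once */
    float prev = y0;
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        float t      = a * x[i] + x[i + 1]; /* does not depend on prev */
        float y_even = a * prev + x[i];     /* off the critical path */
        float y_odd  = a2 * prev + t;       /* the only carried dependency */
        y[i]     = y_even;
        y[i + 1] = y_odd;
        prev     = y_odd;
    }

    /* Scalar tail for an odd element count. */
    for (; i < n; i++) {
        prev = a * prev + x[i];
        y[i] = prev;
    }
}
```

The two forms are algebraically equal, but floating-point rounding can differ slightly between them for general inputs, so a caller that needs bit-identical results should stay with the serial form.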

Edited as noted in the italicized comments above; this also fixes an error in the comment about the minimum loop work for the loop shown. A general library routine could simply test N at the top and, when called with a small N, branch past the whole vector/unrolled loop to the scalar tail loop at the bottom.
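A minimal sketch of that small-N guard, written in portable scalar C rather than AltiVec. The function name `sum_guarded`, the threshold `MIN_VECTOR_N`, and the choice of a simple sum are assumptions for illustration, not taken from G5.pim.

```c
#include <stddef.h>

/* Hypothetical threshold below which the unrolled loop is not worth entering. */
enum { MIN_VECTOR_N = 16 };

float sum_guarded(const float *x, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    /* Branch test at the top: small calls skip straight to the tail loop. */
    if (n >= MIN_VECTOR_N) {
        /* Four independent accumulators stand in for the vector/unrolled loop. */
        for (; i + 4 <= n; i += 4) {
            s0 += x[i];
            s1 += x[i + 1];
            s2 += x[i + 2];
            s3 += x[i + 3];
        }
    }

    float total = (s0 + s1) + (s2 + s3);

    /* Scalar tail loop: handles leftovers, or everything when n is small. */
    for (; i < n; i++) {
        total += x[i];
    }
    return total;
}
```

Because the tail loop starts from wherever `i` was left, the same code path serves both the leftover elements after the unrolled loop and the entire input when the guard is not taken.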

A general library routine would also deal with unaligned data, via permutes in the loop with a permute constant built by vec_lvsl. This adds considerable complexity and costs some performance, particularly on G4 and to a lesser degree on G4+.
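The unaligned-load idiom mentioned above can be sketched roughly as follows. This is not the code from G5.pim: the function name `sum_unaligned` and the choice of a sum are assumptions, and the AltiVec path assumes a big-endian G4/G5 target built with AltiVec enabled. When `__ALTIVEC__` is not defined, the function falls back to plain scalar code so the sketch still compiles elsewhere.

```c
#include <stddef.h>
#ifdef __ALTIVEC__
#include <altivec.h>
#endif

float sum_unaligned(const float *x, size_t n)
{
    float total = 0.0f;
    size_t i = 0;

#ifdef __ALTIVEC__
    if (n >= 4) {
        /* Permute constant from the low address bits of x; it stays valid
         * for the whole loop because x advances 16 bytes per iteration. */
        vector unsigned char perm = vec_lvsl(0, x);
        vector float acc = vec_ctf(vec_splat_s32(0), 0); /* all zeros */
        float lanes[4] __attribute__((aligned(16)));

        for (; i + 4 <= n; i += 4) {
            /* Two aligned loads covering the 16 bytes at x + i. Offset 15
             * keeps the second load inside the data when x is aligned. */
            vector float lo = vec_ld(0, x + i);
            vector float hi = vec_ld(15, x + i);
            /* Select the 16 wanted bytes out of the 32 loaded. */
            acc = vec_add(acc, vec_perm(lo, hi, perm));
        }

        /* Horizontal sum through an aligned scratch buffer. */
        vec_st(acc, 0, lanes);
        total = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
    }
#endif

    /* Scalar tail, or the whole input when AltiVec is unavailable. */
    for (; i < n; i++) {
        total += x[i];
    }
    return total;
}
```

This version issues two loads per iteration for clarity. A tuned routine would more likely carry one loaded vector over to the next iteration, which is part of the added complexity the text refers to.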

AltiVec is a single-precision floating point and integer SIMD instruction set designed and owned by Apple, IBM, and Freescale Semiconductor (formerly Motorola's Semiconductor Products Sector) — the AIM alliance. It is implemented on versions of the PowerPC processor architecture, including Motorola's G4, IBM's G5 and POWER6 processors, and P.A. Semi's PWRficient PA6T. AltiVec is a trademark owned solely by Freescale, so the system is also referred to as Velocity Engine by Apple and VMX (Vector Multimedia Extension) by IBM and P.A. Semi.
