Auto-Parallelization & Auto-Vectorization Tools
VAST is a family of powerful software tools designed to automatically parallelize or vectorize code for improved application performance. All code restructuring can be done automatically at the click of a mouse. The original source files are retained in unmodified form.
VAST-F/Parallel automatically converts serial Fortran code into code designed for multi-core or multi-processor systems, and is ideal for use with compilers that do not include auto-parallelization capabilities. It is also an excellent complement to auto-parallelizing compilers (such as Absoft), adding further optimizations and letting you explore additional multi-threading opportunities. VAST-F/Parallel also provides OpenMP support.
More Features:
- Full Loop Nest Analysis. Loops are analyzed within both simple and complicated loop nests, and the loops containing the largest amount of work are parallelized. Loops do not have to be tightly nested.
- Extended Parallel Regions. VAST/Parallel extends parallel regions to include multiple parallel loops and intervening scalar code. This cuts down on parallel overhead.
- Threshold Testing. All parallel systems have some overhead. When VAST/Parallel finds a parallel region whose amount of work is not known at compile time, it generates a run-time test: the parallel region executes only if there is enough work, and otherwise the original serial version runs.
- Dependence Analysis. VAST/Parallel has very sophisticated data dependency analysis capabilities that allow it to optimize complicated situations. All loop nests are examined to see if they can be executed in parallel safely. VAST/Parallel can resolve ambiguous subscripting by examining variable assignments outside of loops, and restructure the use of variables to avoid certain other dependencies.
- Potential Dependence Testing. When dependencies are unclear at compile time, VAST/Parallel can sometimes generate run-time tests that allow the parallel version to proceed safely.
- Special Reduction Optimization. Summations and other reductions are parallelized through the use of locks or critical regions.
- Shared/Private Determination. All variables in a parallel loop are categorized as shared (seen by all threads) or private (copy in each thread). VAST/Parallel can detect and create private arrays.
- Interprocedural Analysis for Parallel Calls. VAST/Parallel can examine call chains to determine their dependencies, and then parallelize loops that contain calls, or groups of calls outside loops.
- Automatic recognition of parallel cases. When sections of code deal with disjoint operations, VAST/Parallel can process each section in a separate parallel case.
- Superscalar optimizations. VAST/Parallel includes scalar optimizations to boost performance even in a single thread. Parallel optimizations can be done to outer loops while inner loops are optimized for efficient execution on one thread.
- Array Syntax. VAST/Parallel can in general parallelize and optimize multi-dimensional array syntax just as efficiently as loop nests.
- Choice of static or dynamic partitioning of loop iterations. Load balancing trades off against loop overhead: use dynamic partitioning when you need better load balancing, and static partitioning when you are concerned about overhead.
- Number of threads can be set with an environment variable. This allows the degree of parallelism to be changed from run to run: when the system is busy you can run with two threads, and when it is idle you can run with eight, without recompiling your program.
- Choice of thread waiting strategy. You can select either busy waiting or sleep waiting for threads, so that the parallel program can adapt to loaded or dedicated workloads on the target system. Use busy waiting on a lightly loaded system, and sleep waiting when another job might need the cycles.
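As a rough illustration of the threshold-test and reduction features above, the sketch below shows the equivalent OpenMP idioms in C: the run-time work test corresponds to the if() clause, and the parallelized summation to a reduction clause. This is illustrative code, not VAST output; the function name and the 10000-iteration cutoff are invented for the example. Compiled without OpenMP, the pragma is ignored and the loop simply runs serially.

```c
#include <assert.h>

/* Illustrative sketch (not VAST output): the run-time threshold test
 * corresponds to OpenMP's if() clause, and the summation is a reduction.
 * The 10000-iteration cutoff is an arbitrary example value. */
double sum_of_squares(const double *a, int n) {
    double total = 0.0;
    /* reduction(+:total): each thread accumulates a private partial sum,
     * combined at the end of the loop.  if(n > 10000): run in parallel
     * only when the trip count justifies the threading overhead;
     * otherwise the serial version executes. */
    #pragma omp parallel for reduction(+:total) if(n > 10000)
    for (int i = 0; i < n; i++)
        total += a[i] * a[i];
    return total;
}
```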
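The static/dynamic partitioning choice above surfaces in OpenMP as the schedule clause. The hypothetical example below uses a triangular loop nest, where later iterations cost more, which is exactly the case where dynamic partitioning pays for its extra overhead:

```c
#include <assert.h>

/* Illustrative sketch: the inner loop bound grows with i, so iterations
 * have uneven cost.  schedule(dynamic) hands iterations out on demand for
 * better load balance at some overhead; schedule(static) would pre-assign
 * contiguous chunks with lower overhead but possible imbalance. */
long triangular_work(int n) {
    long total = 0;
    #pragma omp parallel for schedule(dynamic) reduction(+:total)
    for (int i = 0; i < n; i++)
        for (int j = 0; j <= i; j++)    /* cost grows with i */
            total += 1;
    return total;
}
```

The thread count for such a loop is then controlled at run time (for example via the OMP_NUM_THREADS environment variable), with no recompilation needed.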
VAST/Parallel fully supports the OpenMP standard. For calculations where you know exactly what you want parallelized, OpenMP provides a portable way to specify it. VAST/Parallel supports all OpenMP directives/pragmas and functions, and provides diagnostics on incorrect use of the directives.
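As one concrete example of the directives involved (and of the "parallel cases" feature above), the sketch below uses OpenMP sections to run two disjoint scans concurrently. The function is invented for illustration; each section writes a different variable, so the cases are independent and need no synchronization.

```c
#include <assert.h>

/* Illustrative OpenMP sections: two disjoint computations, each placed in
 * its own section, matching the "separate parallel case" idea.  Without
 * OpenMP the pragmas are ignored and the scans run one after the other. */
void min_and_max(const int *a, int n, int *mn, int *mx) {
    int lo = a[0], hi = a[0];
    #pragma omp parallel sections
    {
        #pragma omp section
        for (int i = 1; i < n; i++)
            if (a[i] < lo) lo = a[i];   /* only this section touches lo */
        #pragma omp section
        for (int i = 1; i < n; i++)
            if (a[i] > hi) hi = a[i];   /* only this section touches hi */
    }
    *mn = lo;
    *mx = hi;
}
```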
Special Features:
- Thread private common (choice of methods)
- Orphan directives
- Nested parallelism
- Reduction optimizations
- Environment variables
- Efficient library implementation
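The "thread private common" feature corresponds, in OpenMP terms, to the threadprivate directive. A minimal C sketch follows; the actual feature targets Fortran common blocks, and the counter here is an invented stand-in:

```c
#include <assert.h>

/* Minimal sketch of thread-private data: with OpenMP enabled, each thread
 * gets its own copy of `counter`; compiled serially, the directive is
 * ignored and `counter` behaves as an ordinary file-scope static. */
static int counter = 0;
#pragma omp threadprivate(counter)

int bump(void) {
    counter++;              /* increments this thread's private copy */
    return counter;
}
```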