Release History

This page lists the Arm Allinea Studio release history.

To download and install the latest version of Arm Allinea Studio, see our downloads page and follow the installation steps given there.

Details on Release versions and links to the Release Notes and Documentation of Arm C/C++ Compiler, Arm Fortran Compiler, and Arm Performance Libraries are provided below.

Arm Allinea Studio also includes Arm Forge (Release History). 

For more compatibility information, see our supported platforms topic.

Arm Allinea Studio

Version 20.2

Released: June 26, 2020

  • Arm Allinea Studio: 20.2 June 26, 2020

    What's new in 20.2

    Arm Allinea Studio 20.2 includes:

    • Arm Compiler for Linux 20.2 - Released 26th June 2020
    • Arm Forge 20.1 - Released 26th June 2020

    Arm Compiler for Linux 20.2

    Additions and changes:

    • Arm Compiler for Linux suite 20.2:

      • Arm Compiler for Linux no longer supplies a ‘suite’ environment module (previously available as ‘<architecture>/<OS>/<OS_Version>/suites/arm-compiler-for-linux/<version>’). Instead, to prepare your system environment for Arm C/C++/Fortran Compiler, load and use the ‘<architecture>/<OS>/<OS_Version>/arm-linux-compiler/<version>’ environment module.

      • The version of GCC that is packaged with Arm Compiler for Linux is now GCC 9.3.0 instead of GCC 9.2.0. A version of Arm Performance Libraries that is compatible with GCC 9.3.0 is also included.

    • Arm C/C++/Fortran Compiler 20.2:

      • We have restructured the Arm C/C++ Compiler and Arm Fortran Compiler reference guides. If you have bookmarked a topic whose URL has changed, you will need to update the bookmark.

    • Arm Performance Libraries 20.2.0:

      • Improved BLAS level 2 performance for symmetric matrices.

      • Implemented improvements to FFT performance, including faster planning.

      • Implemented improvements to the SVE versions of libamath functions, namely exp, expf, log, logf, sin, sinf, cos, and cosf.

    Resolved issues:

    • Arm Compiler for Linux suite 20.2:

      • Fixed an issue that could occur when installing the package in non-English locales.

    • Arm C/C++/Fortran Compiler 20.2:

      • The libbfd libraries that were previously included in the Arm Compiler for Linux package have been removed. Arm recommends that you use the libraries on your system.

      • Fixed an issue in Arm C/C++ Compiler that caused a compilation failure. The failure occurred when the compiler parsed an OpenMP declare variant pragma in which every function argument was decorated with the linear clause.

      • Fixed further bugs that were preventing the correct build and execution of some workloads on SVE-based systems.

    • Arm Performance Libraries 20.2.0:

      • Fixed a bug in the LAPACK *POTRF routines that would cause a crash when using multiple threads, and when operating on large matrices.

    Open technical issues:

    • Arm Compiler for Linux suite 20.2:

      • The module swap command does not work when using newer versions of the module builtin. To swap the module, you must unload the old module, and then load the new module.

    • Arm C/C++/Fortran Compiler 20.2:

      • To function correctly on RHEL 8 systems, Arm Compiler for Linux relies on using the LD_LIBRARY_PATH environment variable. If you need to unset this environment variable, the compiler might fail to find libstdc++.so.6. If you are affected by this issue, contact the Arm Support team who will help determine a workaround for your system.

      • Copying module files to a different directory after installation, or using a symlink to the module files, might cause 'module load' to fail with an error. If this issue affects you, contact Arm support for advice that is specific to your system configuration. Module configuration flexibility will be improved in a later release of Arm Compiler for Linux.
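
As a rough illustration of the environment module change and the 'module swap' workaround described for 20.2, the following Python sketch builds a module name from the new pattern and prints the unload/load pair to run in place of 'module swap'. The helper names and the example values (aarch64, RHEL, 8, 20.1) are hypothetical placeholders, not names taken from a real installation; substitute the values that 'module avail' reports on your system.

```python
# Pattern taken from the 20.2 release note. The placeholder values used
# below are hypothetical and must be replaced with those for your system.
MODULE_TEMPLATE = "{architecture}/{os_name}/{os_version}/arm-linux-compiler/{version}"


def compiler_module(architecture, os_name, os_version, version):
    """Return the arm-linux-compiler environment module name."""
    return MODULE_TEMPLATE.format(
        architecture=architecture,
        os_name=os_name,
        os_version=os_version,
        version=version,
    )


def swap_commands(old_module, new_module):
    """Return the unload/load pair to use instead of 'module swap'."""
    return [f"module unload {old_module}", f"module load {new_module}"]


if __name__ == "__main__":
    old = compiler_module("aarch64", "RHEL", "8", "20.1")  # hypothetical values
    new = compiler_module("aarch64", "RHEL", "8", "20.2")  # hypothetical values
    for command in swap_commands(old, new):
        print(command)
```

The sketch only prints the commands; run them in the shell session whose environment you want to change.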

      Arm Forge 20.1

      Arm DDT additions and changes:

      • 20.1
        • Performance Reports is now packaged alongside DDT and MAP in the Forge installation. Performance Reports users should now download and install Forge, and can find documentation in the Forge user guide.
        • Stepping and breakpoints are now supported in Python 3 code.
        • Interpreted Python stacks are now merged inline with the native stack to give a simplified view of Python applications. The evaluation window supports full Python expressions, including assignments, calculations and function calls.
        • Python interpreter debug symbols are no longer required for Python debugging.
        • Added support for Arm Compiler for Linux 20.1.
        • Added support for Arm Compiler for Linux 20.2.
        • Added support for GNU 8.3.0 compiler.
        • Changed the default DDT debugger to Forge GDB 8.2.
        • Added support for MVAPICH 2.3.x.
        • Static analysis for C++ updated to use cppcheck 1.89.
        • Added support for Red Hat Enterprise Linux/CentOS 8.
        • Added support for Open MPI 4.0.x.
        • Deprecated ddt-client and allinea-client (used for manual launch); use forge-client instead.
        • Deprecated ddt-mpirun (used in .qtf scripts); use forge-mpirun instead.
        • The backend daemon ddt-debugger has been renamed forge-backend and moved to the libexec directory. The frontend process ddt.bin has been renamed forge.bin.
        • Added support for CUDA 10.2 debugging.
        • The MPI implementation can now be specified via the command-line options --mpi and --list-mpis.
        • Added support for PGI 20.1.
        • Forge and Licence Server are now built with Qt5, which comes with better OS X support.
        • Removed support for some obsolete compilers and MPIs. See "Supported Platforms" in the user guide for more information.
        • Added the version number to the default install path on Windows and Linux, to allow multiple versions to be installed alongside each other.

      Arm DDT resolved issues:

      • 20.1
        • [FOR-10146] Fixed a bug that caused the Windows installer to crash when opening the release notes.
        • [FOR-6608] Fixed an issue where std::string did not display correctly when using the PGI compiler.
        • [FOR-9981] Fixed a help link in the MDA Viewer.
        • [FOR-4149] Fixed a bug where the "Compare across threads" button in "View Pointers Details" did not work when accessed via the "Current Memory Usage" Tool.
        • [FOR-8854] A document history page is now included in the Arm Forge User Guide listing the version history.
        • [FOR-8977] Fixed an array sorting issue with the variable viewer.
        • [FOR-9068] Fixed a race condition from within cuda-gdb which caused DDT to occasionally fail when launching a kernel.
        • [FOR-9236] Fixed an issue which occasionally displays this benign warning message "'libmap-sampler.so' from LD_PRELOAD cannot be preloaded".
        • [FOR-9271] Fixed an issue where OpenMPI occasionally displays a SIGPIPE signal during startup, causing a failure.
        • [FOR-6070] Fixed an issue with example builds to allow spaces, ampersand (&), and hash (#) in the pathname.
        • [FOR-9934] Fixed a bug where using fence post checking with aligned memory allocators, such as posix_memalign, would lead to buffer overflows.
        • [FOR-9735] Fixed a bug whereby the C++ standard library used by Forge could interfere with target programs.
        • [FOR-9828] Fixed an issue whereby std::ofstream usage would not be correctly categorised as I/O when profiling.
        • [FOR-10036] Fixed a possible crash when closing the assembly view.
        • [FOR-10106] Fixed a crash when accepting Reverse Connect requests.
        • [FOR-9918] Fixed a bug where DDT could crash if the program being debugged was signalled while evaluating expressions in GDB.
        • [FOR-9519] Improved the security of the Windows Remote Client.
        • [FOR-10096] The stereo rendering mode has been removed from the Array Viewer visualisation tool.
        • [FOR-10144] Fixed a crash that can occur in the Arm Forge remote client when connecting to a remote session.
        • [FOR-9827] Fixed a crash that occurred when viewing Fortran arrays (PGI 19.5, 20.1 and Arm Compiler for Linux 20.2).
        • [FOR-10279] Fixed a bug which caused segfaults while opening and closing some context menus.

      Arm MAP additions and changes:

      • 20.1
        • Performance Reports is now packaged alongside DDT and MAP in the Forge installation. Performance Reports users should now download and install Forge, and can find documentation in the Forge user guide.
        • Added support for Python 3.8 profiling.
        • Added GPU Metrics for ppc64le systems.
        • Removed the following GPU Metrics to increase performance and stability of GPU Profiling: "temperature" and "time spent in global memory accesses".
        • Added support for Arm Compiler for Linux 20.1.
        • Added support for Arm Compiler for Linux 20.2.
        • Added support for GNU 8.3.0 compiler.
        • Added support for CUDA 10.2 Profiling.
        • Added support for MVAPICH 2.3.x.
        • forge-probe now shows a warning if security settings may prevent the collection of Perf-based metrics.
        • Static analysis for C++ updated to use cppcheck 1.89.
        • Added support for Red Hat Enterprise Linux/CentOS 8.
        • Added support for Open MPI 4.0.x.
        • Deprecated ddt-client and allinea-client (used for manual launch); use forge-client instead.
        • Deprecated ddt-mpirun (used in .qtf scripts); use forge-mpirun instead.
        • The backend daemon ddt-debugger has been renamed forge-backend and moved to the libexec directory. The frontend process ddt.bin has been renamed forge.bin.
        • A warning displays if GPU profiling is prevented by the nvidia module parameter NVreg_RestrictProfilingToAdminUsers.
        • The MPI implementation can now be specified via the command-line options --mpi and --list-mpis.
        • Added support for PGI 20.1.
        • Forge and Licence Server are now built with Qt5, which comes with better OS X support.
        • Removed support for some obsolete compilers and MPIs. See "Supported Platforms" in the user guide for more information.
        • Added the version number to the default install path on Windows and Linux, to allow multiple versions to be installed alongside each other.

      Arm MAP resolved issues:

      • 20.1
        • [FOR-10146] Fixed a bug that caused the Windows installer to crash when opening the release notes.
        • [FOR-8547] Fixed an issue with a mismatch between some line breakdowns and adjacent sparkline graphs.
        • [FOR-8854] A document history page is now included in the Arm Forge User Guide listing the version history.
        • [FOR-9236] Fixed an issue which occasionally displays this benign warning message "'libmap-sampler.so' from LD_PRELOAD cannot be preloaded".
        • [FOR-9271] Fixed an issue where OpenMPI occasionally displays a SIGPIPE signal during startup, causing a failure.
        • [FOR-6070] Fixed an issue with example builds to allow spaces, ampersand (&), and hash (#) in the pathname.
        • [FOR-9735] Fixed a bug whereby the C++ standard library used by Forge could interfere with target programs.
        • [FOR-10106] Fixed a crash when accepting Reverse Connect requests.
        • [FOR-9377] Fixed a bug in physical core count reported when profiling on a machine with disabled cores.

      Arm Performance Reports additions and changes:

      • 20.1
        • Performance Reports is now packaged alongside DDT and MAP in the Forge installation. Performance Reports users should now download and install Forge, and can find documentation in the Forge user guide.
        • Added support for Python 3.8 profiling.
        • Added GPU Metrics for ppc64le systems.
        • Removed the following GPU Metrics to increase performance and stability of GPU Profiling: "temperature" and "time spent in global memory accesses".
        • Added support for Arm Compiler for Linux 20.1.
        • Added support for Arm Compiler for Linux 20.2.
        • Added support for GNU 8.3.0 compiler.
        • Added support for CUDA 10.2 Profiling.
        • Added support for MVAPICH 2.3.x.
        • Added support for Red Hat Enterprise Linux/CentOS 8.
        • Added support for Open MPI 4.0.x.
        • The MPI implementation can now be specified via the command-line options --mpi and --list-mpis.
        • Added support for PGI 20.1.
        • Forge and Licence Server are now built with Qt5, which comes with better OS X support.
        • Removed support for some obsolete compilers and MPIs. See "Supported Platforms" in the user guide for more information.
        • Added the version number to the default install path on Windows and Linux, to allow multiple versions to be installed alongside each other.

      Arm Performance Reports resolved issues:

      • 20.1
        • [FOR-8854] A document history page is now included in the Arm Forge User Guide listing the version history.
        • [FOR-9271] Fixed an issue where OpenMPI occasionally displays a SIGPIPE signal during startup, causing a failure.
        • [FOR-6070] Fixed an issue with example builds to allow spaces, ampersand (&), and hash (#) in the pathname.
        • [FOR-9735] Fixed a bug whereby the C++ standard library used by Forge could interfere with target programs.

      Arm Forge deprecated features:

      The following features have been deprecated in the stated release, and might be removed in a future version:

      • 20.1
        • VisIt Visualization.
        • Automatically adding breakpoints and tracepoints based on version control information.
        • Support for CUDA 8.x.
        • Support for Python 2.x.x.
        • Support for the following MPIs: SGI MPT (prior to HPE MPI), Open MPI on Cray X-series systems, Open MPI 2.x.x, Parastation MPI.

      Arm Forge known issues

      Please refer to the known issues page.



      • Release Note
      • EULA
      • Documentation
    • Arm Allinea Studio: 20.0 - latest update 20.0.3 April 23, 2020

      What's new in 20.0 - latest update 20.0.3

      Arm Allinea Studio 20.0.3 includes:

      • Arm Compiler for Linux 20.1 - Released 23rd April 2020
      • Arm Forge 20.0 (latest 20.0.3) - Released 1st April 2020

      Arm Compiler for Linux 20.1

      Additions and changes:

      • Arm Compiler for Linux suite 20.1:

        • The Fujitsu A64FX processor is added as a new microarchitecture target with specific optimizations. To optimize your code for an A64FX target, compile with the '-mcpu=a64fx' option (or '-mcpu=native' if you are running on an A64FX machine). Optionally, also add the '-armpl' option to use the A64FX-tuned version of Arm Performance Libraries. Note that this version is an SVE implementation, so it will not run natively on other current microarchitectures.

      • Arm C/C++/Fortran Compiler 20.1:

        • The -fsimdmath and -armpl options no longer imply -fopenmp by default.

        • Arm Compiler for Linux 20.1 now warns against using the deprecated SVE/SVE2 ACLE features. Support for these features will be removed in the next major release. For clarity, the SVE/SVE2 ACLE specification has deprecated two features: use of the svcdot function with unsigned arguments, and accessing individual elements of ACLE vector structs using the '.' operator. As an example of the second feature, code such as ((svint8x2_t) foo).v1 is now deprecated, and you should use svget2((svint8x2_t) foo, 0) instead.

        • Arm Fortran Compiler now supports the Fortran 2008 BLOCK construct.

        • Performance improvements for some Fortran workloads.

      • Arm Performance Libraries 20.1.0:

        • FFT implementations are now constructed at planning time, incurring a small additional overhead for each different length of FFT on its first creation. Subsequent plans of the same length are unaffected. Execution performance remains similar to previous releases. This enhancement considerably reduces the size of the library.

        • Implemented improvements to ?GEMM performance, especially for small cases.

        • Added support for LAPACK version 3.9.0. In addition, the CBLAS and LAPACKE interfaces have been updated to use the correct types for complex variables. All documentation has been updated to include these changes.

        • In libamath there are newly-optimized versions of erf and erfc, both for scalar and vector uses, and for single and double precision.
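
The FFT planning note above says that the construction cost is paid once per transform length, and that later plans of the same length are unaffected. The Python sketch below is a conceptual analogy only: it caches per-length setup data (here, twiddle factors for a naive DFT) so that only the first plan of each length does the work. It says nothing about how Arm Performance Libraries is implemented internally, and make_plan and execute are hypothetical names, not library functions.

```python
import cmath
from functools import lru_cache


@lru_cache(maxsize=None)
def make_plan(n):
    """Build the twiddle factors for a length-n transform (once per length)."""
    return tuple(cmath.exp(-2j * cmath.pi * k / n) for k in range(n))


def execute(plan, data):
    """Naive O(n^2) DFT that reuses the precomputed twiddle factors."""
    n = len(plan)
    return [sum(data[t] * plan[(k * t) % n] for t in range(n)) for k in range(n)]


if __name__ == "__main__":
    first = make_plan(8)   # pays the construction cost for length 8
    second = make_plan(8)  # served from the cache, no reconstruction
    print(first is second)
    print(make_plan.cache_info().misses)  # one construction so far
```

In this sketch, a second request for the same length returns the cached object, while a request for a new length triggers one further construction.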

      Resolved issues:

      • Arm C/C++/Fortran Compiler 20.1:

        • Arm Compiler for Linux does not support "omp declare simd" pragmas on C++ class member functions (declarations or definitions), and will ignore them in your code.

        • Fixed an issue with the dynamic scheduling of OpenMP do loops in Arm Fortran Compiler. Incorrect step sizes were being passed to the OpenMP runtime function.

        • Fixed an issue where Fortran VLAs had incomplete debug information. Debug handling code that captures upper and lower bounds of Fortran arrays has been improved.

        • Fixed an issue in Arm Fortran Compiler that prevented the 'Meso-NH' weather code from compiling successfully.

        • Fixed an issue in Arm Fortran Compiler that prevented code containing one or more "arithmetic if" statements from compiling with the -i8 compiler option.

        • Fixed an interaction issue between fixed length arrays and SVE ACLE builtins.

        • Fixed an issue where the compiler could fail to obtain a license when multiple possible licenses were available, and where one license had no remaining seats.

        • Fixed an issue in arm-opt-report, where reports would incorrectly describe when loops were inlined and then unrolled.

        • To highlight the vectorization factor when scalable vectors are used, a new annotation option, VS<F,I>, has been added to arm-opt-report. VS<F,I> indicates that a loop has been vectorized using scalable vectors. Each vector iteration performs the equivalent of N*F*I scalar iterations, where N is the number of vector granules, which can vary according to the machine the program is run on. For example, LLVM assumes a granule size of 128 bits when targeting SVE. F (Vectorization Factor) and I (Interleave count) are as described for V<F,I>.

      • Arm Performance Libraries 20.1.0:

        • Fixed an issue where log10f (single precision, base-10 logarithm function) reported inaccuracies when used in a vectorizable loop.
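
To make the VS<F,I> annotation described above more concrete, this small Python sketch computes the N*F*I figure for a given hardware vector length. It assumes the 128-bit granule size that the note attributes to LLVM when targeting SVE. The function name and the sample values of F, I, and vector length are illustrative only.

```python
GRANULE_BITS = 128  # granule size LLVM assumes when targeting SVE (per the note)


def scalar_iterations_per_vector_iteration(vector_bits, factor, interleave):
    """Return N*F*I, where N is the number of granules in one hardware vector."""
    if vector_bits % GRANULE_BITS != 0:
        raise ValueError("vector length must be a multiple of the granule size")
    granules = vector_bits // GRANULE_BITS  # N depends on the machine
    return granules * factor * interleave


if __name__ == "__main__":
    # Hypothetical annotation VS<4,2> evaluated for several vector lengths.
    for bits in (128, 256, 512):
        print(bits, scalar_iterations_per_vector_iteration(bits, 4, 2))
```

For a hypothetical VS<4,2> loop, a 128-bit machine covers 8 scalar iterations per vector iteration, and a 512-bit machine covers 32.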

      Open technical issues:

      • Arm C/C++/Fortran Compiler 20.1:

        • Copying module files to a different directory after installation, or using a symlink to the module files, might cause 'module load' to fail with an error. If this issue affects you, contact Arm support for advice that is specific to your system configuration. Module configuration flexibility will be improved in a later release of Arm Compiler for Linux.

      Arm Forge 20.0 (latest 20.0.3)

      Arm DDT new features and enhancements:

      • 20.0
        • CUDA GDB is now shipped on AArch64.
        • Added Assembly debugging mode.
        • Added GDB 8.2 as an optional debugger.
        • Updated the MPICH3 auto-detection heuristics.
        • Fixed an issue where the program was unresponsive when debugging or profiling applications that expect user input from a pseudo terminal.
        • Added support for Ubuntu 18.04 on AArch64.
        • Added support for Ubuntu 18.04 on x86_64.
        • Upgraded CUDA GDB 10.1 to update 2.
        • Added support for Arm Compiler for Linux 19.3.
        • Added support for Arm Compiler for Linux 20.0.
        • Added support for Intel 19.x Compiler and MPI.
        • Updated the Support email address from allinea-support@arm.com to support-hpc-sw@arm.com.
        • Because of issues with the Data Address Watchpoint Register (DAWR) on POWER9, hardware watchpoints are not available from Linux kernel version 4.17. In this case, DDT falls back to software watchpoints which are significantly slower than using hardware watchpoints. More information can be found at https://github.com/torvalds/linux/blob/master/Documentation/powerpc/dawr-power9.rst.

      • 20.0.1
        • Fixed an issue with two dimensional array slicing when using the Intel and GNU Fortran compilers.

      • 20.0.2
        • None in this release.

      • 20.0.3
        • None in this release.

      Arm DDT bug fixes

      • 20.0
        • [FOR-8193] Improved the security of the X11 display server used in Offline Mode.
        • [FOR-8194] Fixed a security issue with Reverse Connect which affected the authentication between Remote Daemons.
        • [FOR-7972] Fixed preloading of the memory debugging library on Cray with Slurm.
        • [FOR-3369] Fixed editor lookups involving user supplied file paths when double-clicking a breakpoint item in the Breakpoint view.
        • [FOR-5400] Fixed a crash in the Attach dialog.
        • [FOR-6148] Fixed an issue with displaying std::map iterators in the variables view.
        • [FOR-8522] Fixed a segmentation fault when printing C++ static constexpr class members.
        • [FOR-8280] Fixed an issue resulting in extra items displaying in the Memory Usage dialog chart legend.

      • 20.0.1
        • [FOR-4263] Addressed an inconsistency between Process selection and Parallel Stack View frame selection.
        • [FOR-8283] Fixed various PSV related bugs which occurred in Assembly Mode.
        • [FOR-8562] Updated the icon for assembly debugging mode.

      • 20.0.2
        • [FOR-8903] Fixed a bug with startup using srun on Arm machines.
        • [FOR-7460] Fixed an issue which caused the DDT offline logs to be incomplete.
        • [FOR-8974] Fixed a bug where selecting memory allocations from the chart in the "Current Memory Usage" window was showing data for the wrong process.
        • [FOR-9006] Fixed an issue where information from some processes was sometimes missing from offline mode log files.
        • [FOR-8769] Now able to set Memory Debugging settings on a subset of processes when launching in Express Launch mode.
        • [FOR-8961] Improved error reporting when there is a failure on startup.
        • [FOR-8858] Fixed an issue with the Mac installer window not displaying at the correct size.

      • 20.0.3
        • [FOR-9068] Fixed a race condition from within cuda-gdb which caused DDT to occasionally fail when launching a kernel.
        • [FOR-9374] The value of pointer items in the locals, current line, and evaluate views can now be modified.

      Arm MAP new features and enhancements

      • 20.0
        • Added support for counting hardware events using Linux perf.
        • Extended the MAP and Performance Reports custom metric API to improve usability and to support passing configuration data to custom metric libraries.
        • Reduced MAP and Performance Reports memory usage when profiling large applications.
        • Updated the MPICH3 auto-detection heuristics.
        • Fixed an issue where the program was unresponsive when debugging or profiling applications that expect user input from a pseudo terminal.
        • Clarified metrics descriptions for branch misses and cache misses on AArch64 and Power architectures.
        • Added support for Ubuntu 18.04 on AArch64.
        • Added support for Ubuntu 18.04 on x86_64.
        • Added support for Arm Compiler for Linux 19.3.
        • Added support for Arm Compiler for Linux 20.0.
        • Added support for Intel 19.x Compiler and MPI.
        • Updated the Support email address from allinea-support@arm.com to support-hpc-sw@arm.com.

      • 20.0.1
        • None in this release.

      • 20.0.2
        • Added Perf Metrics support for A76 and N1 CPUs.
        • Added configurable perf metric support to the static sampler library.

      • 20.0.3
        • None in this release.

      Arm MAP bug fixes

      • 20.0
        • [FOR-8193] Improved the security of the X11 display server used in Offline Mode.
        • [FOR-8194] Fixed a security issue with Reverse Connect which affected the authentication between Remote Daemons.
        • [FOR-8337] Fixed an issue which caused MAP to fail to create a MAP file at the end of a profiling run.
        • [FOR-7508] Fixed an issue where MAP hangs on start-up when running on an Ubuntu 18.04, AArch64 system without libc debug symbols installed.
        • [FOR-8529] Fixed a bug where a crash during profiling prevented the Run button on MAP from working.
        • [FOR-8343] Fixed an issue where an error message displayed about "unavailable metrics" when metrics were unlicensed.

      • 20.0.1
        • [FOR-8540] Fixed an issue where local MAP files could not be opened on OSX Catalina.

      • 20.0.2
        • [FOR-8903] Fixed a bug with startup using srun on Arm machines.
        • [FOR-8961] Improved error reporting when there is a failure on startup.
        • [FOR-8858] Fixed an issue with the Mac installer window not displaying at the correct size.

      • 20.0.3
        • None in this release.

      Arm Performance Reports new features and enhancements

      • 20.0
        • Added support for counting hardware events using Linux perf.
        • Reduced MAP and Performance Reports memory usage when profiling large applications.
        • Fixed an issue where the program was unresponsive when debugging or profiling applications that expect user input from a pseudo terminal.
        • Added support for Ubuntu 18.04 on AArch64.
        • Added support for Ubuntu 18.04 on x86_64.
        • Added support for Arm Compiler for Linux 19.3.
        • Added support for Arm Compiler for Linux 20.0.
        • Added support for Intel 19.x Compiler and MPI.
        • Updated the Support email address from allinea-support@arm.com to support-hpc-sw@arm.com.

      • 20.0.1
        • None in this release.

      • 20.0.2
        • Added Perf Metrics support for A76 and N1 CPUs.
        • Added configurable perf metric support to the static sampler library.

      • 20.0.3
        • None in this release.

      Arm Performance Reports bug fixes

      • 20.0
        • [FOR-8337] Fixed an issue which caused MAP to fail to create a MAP file at the end of a profiling run.

      • 20.0.1
        • None in this release.

      • 20.0.2
        • [FOR-8903] Fixed a bug with startup using srun on Arm machines.
        • [FOR-8961] Improved error reporting when there is a failure on startup.
        • [FOR-8858] Fixed an issue with the Mac installer window not displaying at the correct size.

      • 20.0.3
        • None in this release.

      Arm Forge known issues

      Please refer to the known issues page.

      • Release Note
      • EULA
      • Documentation
    • Arm Allinea Studio: 19.3 August 30, 2019

      What's new in 19.3

      Arm C/C++/Fortran Compiler 19.3.0

      New features and enhancements:

      • D-987 : Product reference guides are now available in HTML format in the <install_location>/<package_name>/share directory.

      • D-983 : Support for the F2008 optional argument "BACK=" was added to the MINLOC and MAXLOC intrinsics.

      • D-865 : When targeting SVE, the option '-z now' is passed implicitly to the compile/link flags, to disable a feature known as lazy binding. The implementation of lazy binding in the GNU dynamic loader is currently incompatible with the draft Procedure Call Standard (PCS) for the ARM 64-bit Architecture (AArch64) with SVE support [1]. Once compatible loaders are generally available, this change will be reverted. If needed, lazy binding can be re-enabled by passing the '-z lazy' option during compilation. [1] https://developer.arm.com/docs/100986/latest/procedure-call-standard-for-the-arm-64-bit-architecture-aarch64-with-sve-support

      • D-812 : Arm Optimization Report is a new, beta-quality feature of Arm Compiler for Linux. Arm Optimization Report makes it easier to see what optimization decisions the compiler is making, in-line with your source code. For more information, see https://developer.arm.com/tools-and-software/server-and-hpc/arm-architecture-tools/documentation/arm-opt-report.

      • D-804 : The Arm Compiler for Linux package now includes a SLES15 installer.

      • D-803 : The Arm Compiler for Linux package now includes a RHEL 8 installer.

      Bug fixes:

      • H-697 : Fixed a security vulnerability affecting users of the community stack protection feature. For more information, please see https://kb.cert.org/vuls/id/129209/

      • H-692 : Fixed a crash that could occur when compiling Fortran OpenMP loops with reductions on boolean variables (which can generate atomic instructions).

      • H-689 : Fixed an issue where the LEN Fortran intrinsic could return an argument of INTEGER(8) type in cases where INTEGER was expected.

      • H-681 : Improved vectorization of reductions, for current generation processors.

      • H-667 : Fixed an issue where using a parameter array of type INTEGER*8, defined with implied-do notation in a module, caused an internal compiler error "interf:new_symbol_and_link, symbol not found".

      • H-657 : The compiler now automatically vectorizes sincos and sincosf functions, which are not standard math.h functions, when the -fsimdmath or -armpl option is passed on the command line.

      • H-634 : Improved the vectorization of Fortran loops that have calls to math routines.

      • H-587 : Fixed an issue where the compiler libraries had the psmisc package as a dependency. The psmisc package is no longer required.

      • H-456 : When using the --install-to option, the installer no longer appends /opt/arm to the user-specified path.

      • H-428 : Fixed an issue where Modulefiles did not work correctly when the compiler installation was relocated.

      Arm Performance Libraries 19.3.0

      New features and enhancements:

      • D-846 : A new interface for sparse matrix-matrix multiplication (SpMM) has been added to Arm Performance Libraries.

        - SpMM is an extension of the existing sparse interface supporting all functionality from the usual BLAS GEMM interface.

        - Matrices can be supplied in CSR, CSC, COO, or dense formats.

        - SpMM is available for both C and Fortran, and examples are included for both.

        - A sparse matrix addition function, and functions to generate identity and null matrices, are also provided for convenience. Sparse matrix-vector multiplication (SpMV) functions have been optimized for parallel cases.

      • D-843 : FFT performance improvements.

        - Optimizations for transform lengths involving large prime factors.

        - Parallel performance improvements for multi-dimensional problems.

      • D-675 : Performance enhancements for BLAS level 3 calls. In particular, updated versions of both ?SYMM and ?HEMM are included for all microarchitectures.

      • D-674 : A generic, SVE-enabled version of Arm Performance Libraries is now provided. The SVE-enabled version has not been tuned for any particular microarchitecture, and is available to experiment with SVE in an emulated mode, ahead of silicon deployments. Examples are provided which demonstrate how to run SVE code.
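
For readers unfamiliar with the CSR format mentioned under D-846, the sketch below multiplies a CSR sparse matrix by a dense matrix in plain Python. It illustrates only the storage layout and the arithmetic. It is not the Arm Performance Libraries interface, and the function and argument names are made up for this example.

```python
def csr_times_dense(row_ptr, col_idx, values, dense):
    """Multiply a CSR matrix (row_ptr, col_idx, values) by a dense matrix.

    row_ptr[i]..row_ptr[i + 1] delimits the nonzeros of row i;
    col_idx and values hold their column indices and values.
    """
    n_rows = len(row_ptr) - 1
    n_cols = len(dense[0]) if dense else 0
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            a_ik = values[k]
            b_row = dense[col_idx[k]]
            for j in range(n_cols):
                result[i][j] += a_ik * b_row[j]
    return result


if __name__ == "__main__":
    # A = [[1, 0, 2],
    #      [0, 3, 0]] stored in CSR form
    row_ptr = [0, 2, 3]
    col_idx = [0, 2, 1]
    values = [1.0, 2.0, 3.0]
    b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(csr_times_dense(row_ptr, col_idx, values, b))
```

Only the stored nonzeros are visited, which is the reason sparse formats such as CSR are used for matrices that are mostly zero.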

       

      Bug fixes:
      • None in this release.

      Arm Forge 19.1.3
      (includes 19.1, 19.1.1, 19.1.2, and 19.1.3)

      Arm DDT new features and enhancements:

      • Support for Arm C/C++/Fortran Compiler up to version 19.2.
      • Fixed an issue where GDB 8.1 would not start on an Ubuntu 16.04 system without libmpfr installed.
      • Support for debugging of IBM Spectrum MPI jobs launched with Spindle.
      • GDB 8.1 is now the default DDT debugger.
      • Support for the GDB 7.10.1 debugger has been removed.
      • Memory Debugging support for PMDK.
      • Support for debugging CUDA 10.0 and 10.1 binaries.
      • Remote connect network traffic is now compressed by default so some actions will now be faster when using this feature.
      • Support for Cray Shasta detection.

      Arm DDT bug fixes:

      • [FOR-7342] Fixed an issue with memory debugging aligned allocations.
      • [FOR-6659] Clarified information in the user guide about startup issues with OpenMPI 3.0 and 3.1.
      • [FOR-6503] Fixed an issue where variables named "array" in a struct were not evaluated.
      • [FOR-6142] Fixed an issue with memory debugging, where the total number of free calls was double counted when using memkind_realloc.
      • [FOR-6049] Fixed an issue with remote client messages when X11 is not available.
      • [FOR-7236] Fixed an issue where MPI auto-detection did not work with HPE MPT 2.18+.
      • [FOR-7195] Fixed an issue that occurs when output brackets are present in the output file argument.
      • [FOR-7417] Memory allocations are now tracked for Fortran applications compiled with the -mkl option to the Intel compiler.
      • [FOR-7140] Fixed an issue where allocated Fortran arrays are reported as not allocated.
      • [FOR-7376] Fixed an issue that caused the offline report memory leak bar charts to flow onto multiple lines.
      • [FOR-6985] Fixed an issue that prevented the expansion of Fortran arrays in the Current Line window. 
      • [FOR-7660] Fixed an issue when launching an application with Slurm 19.0.5.
      • [FOR-7661] Arm Forge now auto-detects HPE HMPT.

      Arm MAP new features and enhancements:

      • Support for Arm C/C++/Fortran Compiler up to version 19.2.
      • Improved GUI performance and reduced memory consumption when viewing large .map files.
      • New CPU metrics for Armv8-A platforms.
      • New CPU metrics for IBM Power9 platforms.
      • MAP now displays stacks from Python code on non-main threads.
      • Architecture information is now stored to the generated .map file.
      • CPU metrics on Power and Armv8-A are now available with a standard Arm Forge license.
      • Support for displaying Caliper instrumented regions (https://github.com/LLNL/Caliper) in Arm MAP. Refer to section 32, 'Performance Analysis with Caliper Instrumentation', in the Arm Forge user guide.
      • Section 24.1 in the Arm Forge user guide has been updated to better describe the CPU instruction metrics available on x86_64, Armv8-A and IBM Power 8 and Power 9 platforms.
      • Remote connect network traffic is now compressed by default so some actions will now be faster when using this feature.
      • Support for Cray Shasta detection.

      Arm MAP bug fixes:

      • [FOR-6642] Improved unwinding for PGI-compiled binaries on IBM Power systems.
      • [FOR-6414] Fixed an issue that occurs when profiling applications that were statically compiled by the PGI compiler.
      • [FOR-6659] Clarified information in the user guide about startup issues with OpenMPI 3.0 and 3.1.
      • [FOR-5518] Fixed an issue that caused a slowdown of the analysis phase when profiling Python scripts.
      • [FOR-7236] Fixed an issue where MPI auto-detection did not work with HPE MPT 2.18+.
      • [FOR-7195] Fixed an issue that occurs when output brackets are present in the output file argument.
      • [FOR-7366] You can now see a breakdown of time spent calling functions and executing instructions per line on AArch64- and ppc64le-based systems.
      • [FOR-7660] Fixed an issue when launching an application with Slurm 19.0.5.
      • [FOR-7661] Arm Forge now auto-detects HPE HMPT.
      • [FOR-7663] MAP GPU profiling now works as expected when GPUs are in "Exclusive Process" mode.
      • [FOR-7879] Autodetection of make-profiler-libraries now works as expected on Cray systems.

      Arm Performance Reports 19.1.3
      (includes 19.1, 19.1.1, 19.1.2, and 19.1.3)

      New features and enhancements:

      • Support for Arm C/C++/Fortran Compiler up to version 19.2.
      • Architecture information is now stored to the generated .map file.
      • New CPU metrics for IBM Power9 and Armv8-A platforms.
      • Support for Cray Shasta detection.

      Bug fixes:

      • [FOR-6642] Improved unwinding for PGI-compiled binaries on IBM Power systems.
      • [FOR-6414] Fixed an issue that occurs when profiling applications that were statically compiled by the PGI compiler.
      • [FOR-6659] Clarified information in the user guide about startup issues with OpenMPI 3.0 and 3.1.
      • [FOR-7236] Fixed an issue where MPI auto-detection did not work with HPE MPT 2.18+.
      • [FOR-7195] Fixed an issue that occurs when output brackets are present in the output file argument.
      • [FOR-7660] Fixed an issue when launching an application with Slurm 19.0.5.
      • [FOR-7661] HPE HMPT is now auto-detected.
      • [FOR-7879] Autodetection of make-profiler-libraries now works as expected on Cray systems.
      • Release Note
      • EULA
      • Documentation
    • Arm Allinea Studio: 19.2 June 07, 2019

      What's new in 19.2

      Arm C/C++/Fortran Compiler 19.2

      New features and enhancements:

      • D-866 : The -insights flag is no longer supported.
      • D-771 : Added experimental scheduler improvements that can give performance benefits on large processors, such as ThunderX2.  By default, the scheduler improvements are disabled. To enable them, include the "-mllvm -misched-favour-latency=true" option at compile time.
      • D-612 : The Fortran 2008 ERROR STOP statement is now supported.

      Bug fixes:

      • H-629 : Fixes a problem where -armpl did not locate the correct include directory.
      • H-585 : Fixes a problem where images that were compiled on ThunderX2 platforms for a different target platform failed when run on that target platform.
      • H-538 : Fixes a problem that caused a compiler error when using OpenMP Taskloop.
      • H-536 : The performance of the Fortran BACKSPACE statement has been improved.
      • H-527 : Issues using Fortran ISO C-bindings and Arm Fortran Compiler are now resolved.
      • H-525 : Performance of OpenMP ATOMIC in Fortran has been enhanced by using native atomic load/store instructions where possible.
      • H-508 : armflang now correctly displays the source location for vectorisation reports generated by -Rpass, when compiling without using the -g option.
      • H-401 : The compiler now correctly infers template types for SVE datatypes like 'svfloat32_t'.

      Arm Performance Libraries 19.2.0

      New features and enhancements:

      • D-746 : A new library, libastring, is included by Arm Compiler by default. This library provides optimized versions of a number of common string functions, such as memcpy and memset. libastring is also provided for the GCC compiler, and can be found in $ARMPL_DIR/lib.
      • D-676 : A number of FFT performance improvements have been implemented, especially in single-precision.
      • D-673 : libamath performance improvements including vectorized versions of sin, cos, exp, and log, in both single and double precision.
      • D-671 : Half precision interfaces have been added to libarmpl for matrix-matrix multiplication and FFTs.

        The half precision matrix-matrix multiplication function is called hgemm_. This interface follows the usual *GEMM interface with half precision matrices and floating point scalars.

        The naming scheme for the FFTW interfaces has been extended, such that all functions are prefixed fftwh_. An example of how to use these functions would be based upon:
                  /* Include the Arm Performance Libraries FFT interface. Make sure you include the header file provided by Arm PL and not the header provided by FFTW3. */
                  #include "fftw3.h"
                  /* Declare half-precision arrays to be used */
                  __fp16 *in;
                  fftwh_complex *out;
                  fftwh_plan plan;
                  /* Plan, execute and destroy */
                  plan = fftwh_plan_many_dft_r2c(...);
                  fftwh_execute(plan);
                  fftwh_destroy_plan(plan);
      Bug fixes:
      • None in this release.

      Arm Forge 19.1

      Arm DDT new features and enhancements:

      • Support for Arm C/C++/Fortran Compiler up to version 19.2.
      • Fixed an issue where GDB 8.1 would not start on an Ubuntu 16.04 system without libmpfr installed.
      • Support for debugging of IBM Spectrum MPI jobs launched with Spindle.
      • GDB 8.1 is now the default DDT debugger.
      • Support for the GDB 7.10.1 debugger has been removed.
      • Memory Debugging support for PMDK.
      • Support for debugging CUDA 10.0 and 10.1 binaries.
      • Remote connect network traffic is now compressed by default so some actions will now be faster when using this feature.

      Arm DDT bug fixes:

      • [FOR-7342] Fixed an issue with memory debugging aligned allocations.
      • [FOR-6659] Clarified information in the user guide about startup issues with OpenMPI 3.0 and 3.1.
      • [FOR-6503] Fixed an issue where variables named "array" in a struct were not evaluated.
      • [FOR-6142] Fixed an issue with memory debugging, where the total number of free calls was double counted when using memkind_realloc.
      • [FOR-6049] Fixed an issue with remote client messages when X11 is not available.
      • [FOR-7236] Fixed an issue where MPI auto-detection did not work with HPE MPT 2.18+.
      • [FOR-7195] Fixed an issue that occurs when output brackets are present in the output file argument.

      Arm MAP new features and enhancements:

      • Support for Arm C/C++/Fortran Compiler up to version 19.2.
      • Improved GUI performance and reduced memory consumption when viewing large .map files.
      • New CPU metrics for Armv8-A platforms.
      • New CPU metrics for IBM Power9 platforms.
      • MAP now displays stacks from Python code on non-main threads.
      • Architecture information is now stored to the generated .map file.
      • CPU metrics on Power and Armv8-A are now available with a standard Arm Forge license.
      • Support for displaying Caliper instrumented regions (https://github.com/LLNL/Caliper) in Arm MAP. Refer to section 32, 'Performance Analysis with Caliper Instrumentation', in the Arm Forge user guide.
      • Section 24.1 in the Arm Forge user guide has been updated to better describe the CPU instruction metrics available on x86_64, Armv8-A and IBM Power 8 and Power 9 platforms.
      • Remote connect network traffic is now compressed by default so some actions will now be faster when using this feature.

      Arm MAP bug fixes:

      • [FOR-6642] Improved unwinding for PGI-compiled binaries on IBM Power systems.
      • [FOR-6414] Fixed an issue that occurs when profiling applications that were statically compiled by the PGI compiler.
      • [FOR-6659] Clarified information in the user guide about startup issues with OpenMPI 3.0 and 3.1.
      • [FOR-5518] Fixed an issue that caused a slowdown of the analysis phase when profiling Python scripts.
      • [FOR-7236] Fixed an issue where MPI auto-detection did not work with HPE MPT 2.18+.
      • [FOR-7195] Fixed an issue that occurs when output brackets are present in the output file argument.

      Arm Performance Reports 19.1

      New features and enhancements:

      • Support for Arm C/C++/Fortran Compiler up to version 19.2.
      • Architecture information is now stored to the generated .map file.
      • New CPU metrics for IBM Power9 and Armv8-A platforms.

      Bug fixes:

      • [FOR-6642] Improved unwinding for PGI-compiled binaries on IBM Power systems.
      • [FOR-6414] Fixed an issue that occurs when profiling applications that were statically compiled by the PGI compiler.
      • [FOR-6659] Clarified information in the user guide about startup issues with OpenMPI 3.0 and 3.1.
      • [FOR-7236] Fixed an issue where MPI auto-detection did not work with HPE MPT 2.18+.
      • [FOR-7195] Fixed an issue that occurs when output brackets are present in the output file argument.
      • Release Note
      • EULA
    • Arm Allinea Studio: 19.1 March 08, 2019

      What's new in 19.1

      New features and enhancements

      Arm C/C++/Fortran Compiler 19.1:
        - D-669 : Arm Compiler for HPC now supports the Fortran 'TRAILZ' intrinsic, which finds the number of trailing zero bits in an integer.  Please refer to the Fortran Reference Guide for more information.

        - D-668 : Arm Compiler for HPC now supports the Fortran 'UNROLL' directive.  This is a hint to the compiler to unroll the preceding loop.  Please refer to the Fortran Reference Guide for more information.

        - D-632 : A new flag -fno-realloc-lhs has been added, for consistency with GNU compilers. Use this flag in place of -Mallocatable=95, which is no longer documented but is still supported. Refer to the Fortran Reference Guide for information about this flag.

        - D-552 : libamath is now the default library used by Arm Compiler for HPC to provide optimized scalar and vector math functions.

                  - The compiler will link to libamath by default before libm in order to provide better performing implementations.

                  - libamath is also provided for GCC. GCC users must link to the library explicitly to make use of the optimized math functions.

                  - Always use the correct build of libamath for the compiler you are using. For example, do not compile and link code with GCC using the version of libamath supplied for Arm Compiler for HPC; use the GCC version instead.

        - D-513 : Arm Compiler for HPC now supports the Huawei Kunpeng 920 CPU.  Tuning for Kunpeng 920-based platforms is automatically selected with the -mcpu=native option, when the compiler is run on a Kunpeng 920-based platform. To select this explicitly, use the -mcpu=tsv110 option.
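For readers more familiar with C than Fortran, the behavior of the TRAILZ intrinsic (D-669) can be sketched as below. This is an illustrative equivalent only, not how the compiler implements the intrinsic; the function name `trailz_sketch` and its `bit_size` parameter are hypothetical. It follows the Fortran convention that a zero argument returns the number of bits in the integer.

```c
#include <assert.h>
#include <stdint.h>

/* Count the trailing zero bits of `value`, treated as a `bit_size`-bit integer.
 * As with Fortran TRAILZ, a zero argument yields the bit size itself. */
unsigned trailz_sketch(uint64_t value, unsigned bit_size)
{
    assert(bit_size >= 1 && bit_size <= 64);
    unsigned count = 0;
    /* Walk up from the least significant bit until a set bit is found. */
    while (count < bit_size && ((value >> count) & 1u) == 0)
        ++count;
    return count;
}
```

For example, calling `trailz_sketch(8, 32)` examines the binary value 1000 and counts three zero bits below the lowest set bit.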


      Arm Performance Libraries 19.1.0:
        - D-581 : Improved *GEMV performance.

        - D-552 : libamath is now the default library used by Arm Compiler for HPC to provide optimized scalar and vector math functions.

                  - The compiler will link to libamath by default before libm in order to provide better performing implementations.

                  - libamath is also provided for GCC. GCC users must link to the library explicitly to make use of the optimized math functions.

                  - Always use the correct build of libamath for the compiler you are using. For example, do not compile and link code with GCC using the version of libamath supplied for Arm Compiler for HPC; use the GCC version instead.

        - D-499 : Performance improvements for [SCZ]GEMM, including stabilized performance for ThunderX2 systems configured in SMT > 1 mode.

        - D-498 : Improved MPI FFT parallel scaling.

        - D-497 : Support for Fortran MPI FFTW interface.

        - D-496 : FFT performance improvements, including for input lengths involving large prime factors.

        - D-495 : Single precision real SpMV performance optimizations.

        - D-494 : SpMV support for Compressed Sparse Column (CSC) and Coordinate (COO) sparse matrix formats with both C and Fortran interfaces.

        - D-493 : Added sparse matrix-vector multiplication (SpMV) interfaces for Fortran, including an example.
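The two storage formats added for SpMV in D-494 differ only in how the non-zero entries are indexed. The sketch below shows a plain reference y = A*x for each format. It is not the Arm Performance Libraries interface: `coo_spmv` and `csc_spmv` are hypothetical names, and the loops are serial and unoptimized.

```c
#include <assert.h>
#include <stddef.h>

/* y = A * x for an m-row matrix in coordinate (COO) format:
 * entry p has value vals[p] at position (row_idx[p], col_idx[p]). */
void coo_spmv(size_t m, size_t nnz,
              const size_t *row_idx, const size_t *col_idx,
              const double *vals, const double *x, double *y)
{
    assert(x && y);
    for (size_t i = 0; i < m; ++i)
        y[i] = 0.0;
    for (size_t p = 0; p < nnz; ++p)
        y[row_idx[p]] += vals[p] * x[col_idx[p]];
}

/* y = A * x for an m x n matrix in Compressed Sparse Column (CSC) format:
 * column j owns entries col_ptr[j] .. col_ptr[j + 1] - 1. */
void csc_spmv(size_t m, size_t n,
              const size_t *col_ptr, const size_t *row_idx,
              const double *vals, const double *x, double *y)
{
    assert(x && y);
    for (size_t i = 0; i < m; ++i)
        y[i] = 0.0;
    for (size_t j = 0; j < n; ++j)
        for (size_t p = col_ptr[j]; p < col_ptr[j + 1]; ++p)
            y[row_idx[p]] += vals[p] * x[j];
}
```

Both routines scatter into y, which is one reason parallel implementations of these formats need more care than a row-oriented CSR loop.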

      Bug fixes

      Arm C/C++/Fortran Compiler 19.1:
        - H-489 : The armflang runtime library no longer exposes symbols that conflict with libnuma.

        - H-464 : A problem that occurs when a shared variable is accessed in a taskloop has now been fixed.

        - H-400 : Fixed an issue where getting the member of sizeless struct rvalue prevented successful compilation.

        - H-397 : RPMs and debs now correctly report what libraries they provide.

        - H-392 : The runtime performance of the Fortran TRANSFER function has been improved.

        - H-296 : Fixed a runtime segmentation fault in subroutines that contain OMP CRITICAL and have one or more ENTRY statements.

        - H-98 : The installation should now be properly relocatable on RPM-based platforms and will register with the system RPM database if the user has appropriate permissions.

        - H-59 : Fixed an issue where using the DATA statement to assign a value to a Cray pointer caused the compiler to abort with the following message: "Error: integer constant must have integer type".

      Arm Performance Libraries 19.1.0:
        - No fixed issues

      Known Issues

      Arm C/C++/Fortran Compiler 19.1:
        - H-571 : If you have multiple versions of Arm Compiler for HPC installed that depend on the same GCC version, running the uninstall.sh script will fail. Instead, remove the packages manually, using the Package Manager, or modify the uninstall.sh script to prevent removal of the GCC package.

        - H-421 : When the uninstaller is run, it does not remove all of the files. It is safe to remove the remaining files manually.

        - H-411 : There is a regression in SVE vectorization which may result in miscompiles of loops with loop-carried dependencies.

        - H-310 : -fsimdmath is incompatible with a dynamic linker optimization known as 'lazy binding'.  When using -fsimdmath, Arm recommends that you also add '-z now' to the compile/link flags, in order to disable this optimization during linking. For more information, see Vector math routines.

      Arm Performance Libraries 19.1.0:
        - No known issues

       

      • Release Note
      • EULA
    • Arm Allinea Studio: 19.0 November 02, 2018

      What's new in 19.0

      New features and enhancements

      Arm C/C++/Fortran Compiler 19.0:

      • D-545 : Partial support for the do concurrent Fortran 2008 feature. Support is partial because serial code is generated.
      • D-544 : Support for the submodules Fortran 2008 feature.
      • D-394 : Improvements to the performance of Fortran NINT and DNINT intrinsics.
      • D-393 : Improvements to the performance of Fortran math intrinsics, including the ability to auto-vectorize scalar math intrinsics. To benefit from these improvements, add the new compiler option -armpl to your compile and link arguments, and use optimization level -O2 or higher.
      • D-388 : Arm Compiler for HPC is now based on LLVM 7.0.
      • D-374 : Support for the Fortran 'NOVECTOR' directive, which enables users to disable auto-vectorization on individual loops.
      • D-373 : Support for the Fortran 'VECTOR ALWAYS' directive, which enables a user to request that a loop be auto-vectorized, irrespective of the compiler's internal cost-model, if it is safe to do so.
      • D-329 : A new C/C++ Compiler Reference Manual is available in <install_location>/<package_name>/share.

      Arm Performance Libraries 19.0:

      • D-492 : Various changes to C header files:
        • BLAS, CBLAS and LAPACK function prototypes have been modified to use 'const' where appropriate, for example, for input array pointers and char * specifiers.
        • We now use C-style _Complex numbers instead of our own structure for complex numbers in the armpl.h header. If required, you can use #define to override armpl_singlecomplex_t and armpl_doublecomplex_t to something else that is bitwise-compatible (e.g. C++ std::complex type). This change is bitwise-compatible with the structure we have replaced.
        • Complex number manipulation functions have been removed from the header. You are advised to use standard C-style _Complex operations instead (or those appropriate to any redefinition such as C++ std::complex).
        • cdotc_, cdotu_, zdotc_, zdotu_, cladiv_ and zladiv_ prototypes have been modified to reflect the correct C-to-Fortran calling convention for a given compiler toolchain.
      • D-486 : Libraries tuned for Qualcomm Falkor are no longer provided.
      • D-461 : The GCC version of the library is now compatible with GCC 8.2 (previously 7.1).
      • D-430 : Enhancements to existing libamath functions.
      • D-429 : Support for LAPACK version 3.8.0.
      • D-428 : LAPACK parallel scalability tuning has been performed for the following routines on ThunderX2 CN99 systems: *POTRF, *GEQRF, *GETRF.
      • D-426 : The FFT interface documented in the Arm Performance Libraries User Manual versions up to v18.4.0 has been deprecated. Users are instead encouraged to use the FFTW interface within Arm Performance Libraries for best performance. This release also includes optimizations to key FFT kernels.
      • D-425 : Added FFTW MPI single and double precision interfaces in C.
      • D-424 : Execution of advanced and guru FFTW plans is now parallelized.
      • D-423 : Added FFTW guru single and double precision interfaces in C and Fortran.
      • D-422 : Added a new suite of sparse matrix routines in C that support sparse matrix-vector multiplication for matrices supplied in Compressed Sparse Row format, including an optimized double-precision real kernel. Also added the WAXPBY BLAS extension routine (w = a*x + b*y, for vectors w, x and y and scalars a and b).
      • D-421 : Performance enhancements to parallel DGEMM, especially for small to medium-sized problems.
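The WAXPBY operation named in D-422 (w = a*x + b*y) can be written as a short reference loop. The sketch below only illustrates the arithmetic: `waxpby_sketch` is a hypothetical name, it assumes unit-stride double-precision vectors, and it does not reproduce the argument list of the routine shipped in Arm Performance Libraries.

```c
#include <assert.h>
#include <stddef.h>

/* w[i] = a * x[i] + b * y[i] for i in [0, n); unit stride, double precision. */
void waxpby_sketch(size_t n, double a, const double *x,
                   double b, const double *y, double *w)
{
    assert(n == 0 || (x && y && w));
    for (size_t i = 0; i < n; ++i)
        w[i] = a * x[i] + b * y[i];
}
```

Unlike AXPBY, which overwrites y, this form writes the result to a separate vector w and leaves both inputs unchanged.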

      Bug fixes

      Arm C/C++/Fortran Compiler 19.0:

      • H-423 : Support, by default, for Fortran 2003 semantics for assignments to allocatable variables.
      • H-407 : In some corner cases, memory usage has increased because of the switch to Fortran 2003 semantics for allocatable variables. This can result in a segfault. In these cases, the recommended workaround is to use the armflang option -Mallocatable=95 during compilation.
      • H-361 : Arm Fortran Compiler now handles the -fsave-optimization-record flag correctly.
      • H-333 : Improvements to DWARF source-level debug information for Fortran.
      • H-130 : Added missing man page for armclang++.
      • H-96 : Fixes an issue with armflang's handling of OpenMP 'threadprivate' module variables.

      Arm Performance Libraries 19.0:

      • No fixed issues

      Refer to the Release Notes for further information about this release.

      • Release Note
      • EULA
    • Arm Allinea Studio: 18.4 - latest update 18.4.2 October 10, 2018

      What's new in 18.4 - latest update 18.4.2

      Arm Compiler for HPC 18.4 covers the following releases:

      • Arm C/C++/Fortran Compiler and Arm Performance Libraries version 18.4 - released 26th July 2018.
      • Arm C/C++/Fortran Compiler and Arm Performance Libraries version 18.4.1 - released 7th September 2018.
      • Arm C/C++/Fortran Compiler and Arm Performance Libraries version 18.4.2 - released 10th October 2018.

      New features and enhancements

      Arm C/C++/Fortran Compiler 18.4:

      • The -fstack-arrays option is enabled at the -Ofast optimization level.
      • Arm Fortran Compiler now supports the general-purpose ivdep directive, and partially supports the OpenMP-specific omp simd directive. These directives instruct the compiler to ignore memory dependencies and can enable a loop to be vectorized.
      • The Arm Fortran Compiler Reference guide is now available in /opt/arm/<package_name>/share.
      • The new vector procedure call standard has been implemented and is used by the SLEEF math library.

      Arm C/C++/Fortran Compiler 18.4.1:

      • No new features or enhancements.

      Arm C/C++/Fortran Compiler 18.4.2:

      • No new features or enhancements.

      Arm Performance Libraries 18.4:

      • Performance improvements for batched CGEMM and ZGEMM.
      • Performance improvements for small-to-medium-sized SGEMM problems.
      • Significantly less time spent planning FFTW transforms for levels of rigor greater than FFTW_ESTIMATE.
      • Performance enhancements for complex-to-real FFTW transforms, especially multidimensional problems.
      • Libraries for Cortex-A57 and Cavium ThunderX are no longer provided.
      • New functions in libamath: sinf, cosf, sincosf (single precision).
      • Updated functions in libamath: exp, pow, log (double precision).

      Arm Performance Libraries 18.4.1:

      • No new features or enhancements.

      Arm Performance Libraries 18.4.2:

      • D-490 : The version of libamath built has been updated.

      Bug fixes

      Arm C/C++/Fortran Compiler 18.4:

      • H-52: Segfault on large array allocation.
      • H-87: Arm Fortran Compiler not vectorizing a loop.
      • H-92: Fixes for some debug issues related to subroutine arguments.
      • H-105: Flag (-E) to run only the preprocessor does not work in the Fortran compiler.
      • H-115: Problems with direct access I/O in Fortran programs.

      Arm C/C++/Fortran Compiler 18.4.1:

      • H-317: In the previous release, when armflang was used to link objects without compilation, it generated unnecessary warnings about unused compilation flags. These warnings have been removed.
      • H-149: A problem caused by unaligned offsets in stack layout for SVE replicating loads, which caused the compiler to crash, has been resolved.

      Arm C/C++/Fortran Compiler 18.4.2:

      • H-361 : Arm Fortran Compiler now handles the -fsave-optimization-record flag correctly.

      Arm Performance Libraries 18.4:

      • H-126: Some multidimensional FFTW transforms return incorrect results.

      Arm Performance Libraries 18.4.1:

      • No fixed issues

      Arm Performance Libraries 18.4.2:

      • No fixed issues

      Refer to the Release Note for further details.

      • Release Note
      • EULA
    • Arm Allinea Studio: 18.3 May 29, 2018

      What's new in 18.3

      New features and enhancements

      • Arm C/C++/Fortran Compiler 18.3:
        • Support for the following Fortran 2008 feature: pointers to internal procedures, and internal procedures passed as arguments.
        • Automatic arrays can be allocated on the stack with the -fstack-arrays flag.
      • Arm Performance Libraries 18.3:
        • Support for FFTW wisdom included for the first time.
        • Performance enhancements to FFTW functions: complex-to-complex and real-to-complex functions using both basic and advanced interfaces; some complex-to-real performance differences too.
        • Parallel performance improvements for S/D/C/ZTRSV and S/DTRMV.
        • New library, 'libamath', in the 'lib' directory for each microarchitecture for Arm Compiler builds.  This contains optimized versions of exp, pow and log functions in single and double precision.  For more information on libamath, see Getting started with Arm Performance Libraries.

      Bug fixes

      • Arm HPC Compiler 18.3:
        • H-14 : Fixed two preprocessor issues. Transpose intrinsic is now supported during initialization.
        • H-58 : Fixed failure when a module variable was used to set real kind in two different functions.
        • H-61 : __ARM_ARCH macro is now defined in armflang compiler.
        • H-74 : Disabled generation of FMA instructions at -O0 in armflang. This matches armclang behaviour and passes floating-point accuracy tests at -O0.
        • H-86 : Fixed issue with capturing procedure pointers to OpenMP parallel regions, which was preventing the TeaLeaf mini-app from running correctly.
      • Arm Performance Libraries 18.3:
        • H-7: Nested parallelism performance improvements.

      Known Issues

      Arm Compiler 18.3:

      • H-105: The '-E' option to armflang does not work. This will be fixed in the next release.
      • H-114: Debugging arrays with negative lower bounds is not currently supported.

       

      • Release Note
      • EULA
    • Arm Allinea Studio: 18.2 March 22, 2018

      What's new in 18.2

      Arm Compiler for HPC contains the following packages:

      • Arm Compiler v18.2
      • Arm Performance Libraries v18.2
      • GNU GCC 7.1

      New features and enhancements

      Arm C/C++/Fortran Compiler 18.2:

      • License management is now switched on by default. Please refer to Arm Allinea Studio licensing for more information about licensing.

      • SIMD math library 'libsimdmath.so' now provides the same set of functions for targeting Vector Length Agnostic (VLA) SVE instructions as it provides for ARM Advanced SIMD instructions. For example, a loop invoking 'double sin(double)' can be auto-vectorized with calls to a VLA implementation of 'sin', which is provided in 'libsimdmath.so'.
        'libsimdmath.so' has increased coverage of vectorized routines from math.h and GLIBC math.h.
        Please refer to Vector math routines for more information about this feature.

      • Debug information has been added for Fortran adjustable arrays and imported modules.

      Arm Performance Libraries 18.2:

      • FFT performance improvements. Improvements have been made to a selection of FFTW routines in the library. Users should see enhanced performance for a wide range of transform sizes for 1D complex-to-complex transforms in single and double precision via the basic interface. Improvements have also been made to the advanced interfaces for complex-to-complex, real-to-complex and complex-to-real transforms in single and double precision for transforms of any dimensionality. From this release users are advised to target the FFTW interface in Arm Performance Libraries rather than the FFT routines documented in the Arm Performance Libraries Reference Manual.

      • Thread tuning for level 1 BLAS routines *AXPY, *AXPBY, *SCAL, *COPY. Where possible the number of threads used for these routines may be throttled, compared with the number of threads requested, in order to improve performance.

      Refer to the Release Note for details of bug fixes and further information.

      • Release Note
      • EULA
      • Documentation
    • Arm Allinea Studio: 18.1 January 17, 2018

      What's new in 18.1

      Arm Compiler for HPC contains the following packages:

      • Arm Compiler v18.1
      • Arm Performance Libraries v18.1
      • GNU GCC 7.1

      New features and enhancements

      This release contains the following new features and enhancements:

      Arm Compiler 18.1

      Red Hat 7 support is now provided as a single package, rather than having individual packages for each point release.

      Compiler flag documentation (output with --help, the armflang manpage and the online documentation) has been simplified, by no longer documenting PGI-style Fortran flags when these flags have an exact GCC-style equivalent flag. Although no longer documented, the PGI-style flags are still supported as in previous releases.

      A new flag -fsimdmath enables vectorization of some scalar libm functions, by automatically replacing calls to these functions with a vectorized form inside of vectorized loops.  These vectorized forms are included in a new library (libsimdmath.so), which is included in the release and automatically linked in during compilation.

      License management for Arm Compiler is available as a default-off feature for beta testing.  If you wish to try this feature in your environment, please contact your Arm representative.

      The OpenMP runtime library (libomp.so) has been improved for platforms supporting the ARMv8.1-a architecture. Two versions of this library are included with the release, with the most appropriate library selected automatically.

      Debug information has now been enabled for module variables. With this change, users can now print/access these variables whilst debugging. We also generate debug information for modules even if they contain variables only.

      Arm Performance Libraries 18.1

      Optimizations for very small double precision real matrix-matrix multiplication, improving DGEMM and DGEMM_BATCH performance. Optimizations for complex Hermitian and symmetric matrix-matrix multiplication for Cavium ThunderX2.

      • Release Note
      • EULA
      • Documentation
    • Arm Allinea Studio: 18.0 November 09, 2017

      What's new in 18.0

      Arm Compiler for HPC contains the following packages:

      • Arm Compiler v18.0
      • Arm Performance Libraries v18.0
      • GNU GCC 7.1

      New features and enhancements

      =============================

      Arm Compiler 18.0

      • Increased coverage for Fortran 2003 and Fortran 2008. Please see the following page for more details:
         https://developer.arm.com/products/software-development-tools/hpc/arm-fortran-compiler
      • Runtime performance and stability improvements.
      • Tuning for the host platform is now easily done using '-mcpu=native'.
      • Improved user documentation. The Arm Compiler now includes a man-page and has a more accurate and descriptive '--help' command line option.
      • Added support for vector math routines using -fsimdmath.
      • Implemented more features to improve debugging of Fortran applications.
      • -ffp-contract=fast is now the default behavior for Fortran workloads. This allows FP instructions to be fused (e.g. into FMA instructions), and makes Arm Compiler consistent with other Fortran compilers (e.g. gfortran). To maintain consistency with most C/C++ compilers (e.g. Clang and GCC), C/C++ workloads have a more restrictive default of -ffp-contract=on and only perform this operation in the presence of an FP_CONTRACT pragma.
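      The -ffp-contract behavior described above can be illustrated with a minimal C sketch. The function names scaled_sum and dot are hypothetical and chosen only for this example. Each function contains a multiply followed by an add, which is the pattern a compiler may contract into a single FMA instruction; the standard FP_CONTRACT pragma is what permits this for C code under the default described above. Whether a fused instruction is actually emitted depends on the compiler, the target, and the flags in use.

```c
#include <stddef.h>

/* Hypothetical helper: one multiply followed by one add. With contraction
   permitted, a compiler may emit a single fused multiply-add here. */
double scaled_sum(double a, double b, double c)
{
    /* Standard C pragma; it must precede declarations and statements
       in the enclosing block. */
    #pragma STDC FP_CONTRACT ON
    return a * b + c;
}

/* Hypothetical helper: a dot product, where every loop iteration is a
   multiply-add and therefore a candidate for contraction. */
double dot(const double *x, const double *y, size_t n)
{
    #pragma STDC FP_CONTRACT ON
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i)
        acc += x[i] * y[i];
    return acc;
}
```

      A fused multiply-add rounds once instead of twice, so results can differ in the last bits from the unfused form when the intermediate product is not exactly representable. This is why the default differs between Fortran and C/C++ workloads.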

      Arm Performance Libraries 18.0

      • The Qualcomm Falkor core is added as a new microarchitecture target with specific tunings.
      • New support for the following BLAS extension routines (see the Arm Performance Libraries Reference manual for details):
        • *AXPBY and cblas_*axpby for single and double precision real and complex data.
        • *GEMM_BATCH and cblas_*gemm_batch for single and double precision real and complex data. Examples for SGEMM_BATCH and cblas_zgemm_batch are provided.
        • *GEMM3M and cblas_*gemm3m for single and double precision complex data.
          Note that these *GEMM3M and cblas_*gemm3m routines are included in the API, but currently offer no performance advantages over the regular *GEMM and cblas_*gemm routines.
      • Support for LAPACK version 3.7.1.
      • The C prototypes for Fortran BLAS routines in armpl.h have changed. Where strings are passed as arguments, the interface no longer requires string lengths to be passed after the standard arguments to the BLAS routines. Note that we recommend that users include these string lengths when calling the Fortran interface directly from C.
      • Various performance improvements.
      • Release Note
      • EULA
      • Documentation
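      For readers unfamiliar with the *AXPBY extension listed under Arm Performance Libraries 18.0, the operation it conventionally denotes is y := alpha*x + beta*y. The sketch below is a plain reference loop for the double precision real case. The name ref_daxpby is hypothetical; the routine is not the library's implementation or its exact prototype, and it assumes positive strides only.

```c
#include <stddef.h>

/* Hypothetical reference routine (not part of Arm Performance Libraries):
   computes y := alpha * x + beta * y over n elements.
   incx and incy are element strides and are assumed to be positive. */
void ref_daxpby(size_t n, double alpha, const double *x, size_t incx,
                double beta, double *y, size_t incy)
{
    for (size_t i = 0; i < n; ++i)
        y[i * incy] = alpha * x[i * incx] + beta * y[i * incy];
}
```

      With beta set to 1 this reduces to the standard AXPY update, and with beta set to 0 it overwrites y with a scaled copy of x. For the actual prototypes, consult the Arm Performance Libraries Reference manual.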