Applied Parallel Computing LLC: Coding Blog

OpenACC workshop at TU Dortmund

Applied Parallel Computing LLC has delivered the OpenACC Workshop at the Technical University of Dortmund, Germany. The workshop has been kindly supported by NVID...

Oct 07, 2015 Software Engineering, Trainings, OpenACC

4-day CUDA Course at Airbus Defence and Space

Applied Parallel Computing LLC has delivered the 4-day CUDA Course at Airbus Defence and Space, Ulm, Germany.

Oct 07, 2015 Software Engineering, Trainings, CUDA

Use CUDA 7.0 NVRTC with Thrust

Runtime Compilation (NVRTC), introduced in CUDA 7.0, allows CUDA kernels to be compiled dynamically during program execution (see example). This functionality allows to...

Apr 29, 2015 Software Engineering, CUDA, Thrust

Get extra 8% perf in bilinear interpolation on GPU using __restrict__ keyword

Starting from GK110 (Tesla Kepler), the “const __restrict__” annotation on a kernel argument has an extra GPU-specific meaning: accesses to that argument should go through ...

Mar 26, 2015 Software Engineering, CUDA
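
To make the qualifier placement in the post above concrete, here is a minimal host-side C++ sketch of bilinear interpolation. It is not the kernel from the post: the function name `bilinear_resize`, the row-major single-channel layout, and the corner-aligned coordinate mapping are all assumptions made for illustration. The `__restrict__` spelling is a compiler extension accepted by GCC and Clang as well as nvcc, so the same signature could plausibly be carried over to a `__global__` kernel with one thread per output pixel.

```cpp
#include <algorithm>

// Hypothetical helper (not from the post): resample a row-major,
// single-channel image of size sw x sh into a dw x dh image.
// `src` is only read and `dst` is only written, and the two never alias,
// which is exactly what `const ... __restrict__` / `__restrict__` promise.
void bilinear_resize(const float* __restrict__ src, int sw, int sh,
                     float* __restrict__ dst, int dw, int dh)
{
    for (int y = 0; y < dh; ++y) {
        // Map the destination row to a fractional source row (corners aligned).
        const float fy = (dh > 1) ? y * float(sh - 1) / float(dh - 1) : 0.0f;
        const int y0 = static_cast<int>(fy);
        const int y1 = std::min(y0 + 1, sh - 1);  // clamp at the bottom edge
        const float wy = fy - y0;

        for (int x = 0; x < dw; ++x) {
            const float fx = (dw > 1) ? x * float(sw - 1) / float(dw - 1) : 0.0f;
            const int x0 = static_cast<int>(fx);
            const int x1 = std::min(x0 + 1, sw - 1);  // clamp at the right edge
            const float wx = fx - x0;

            // Blend horizontally on the two neighbouring rows, then vertically.
            const float top = (1.0f - wx) * src[y0 * sw + x0] + wx * src[y0 * sw + x1];
            const float bot = (1.0f - wx) * src[y1 * sw + x0] + wx * src[y1 * sw + x1];
            dst[y * dw + x] = (1.0f - wy) * top + wy * bot;
        }
    }
}
```

Each output pixel reads four neighbouring source values and writes one destination value, so the source image is the natural candidate for the `const __restrict__` annotation the post discusses.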

Thrust/CUDA tip: reuse temporary buffer across multiple transforms

Thrust is a very handy STL-like template library for rapid data processing on GPUs.

Oct 09, 2014 Software Engineering, CUDA, Thrust

On-the-fly modification of LLVM IR code of CUDA sources

Largely thanks to LLVM, recent years have seen a significant increase in interest in the research and development of domain-specific compilation tools. With the re...

Sep 23, 2014 Software Engineering, LLVM

5-day GPU computing workshop at TÜBİTAK UZAY

Applied Parallel Computing LLC has delivered the GPU Computing Workshop at Space Technologies Research Institute (TÜBİTAK UZAY), Ankara, Turkey. We would like to ...

Sep 01, 2014 Software Engineering, Trainings, CUDA, OpenACC

How to find the version of CUDA's LLVM backend

It is well known that the CUDA toolkit uses an LLVM backend, but the version number it uses is not shown. We can use gdb and an LLVM API function to print the version string:

Jul 14, 2014 Software Engineering, LLVM

NVIDIA Visual Profiler can connect to a 64-bit Linux server from 32-bit Windows

The CUDA 6.0 release added an extremely handy feature to Visual Profiler: support for remote profiling. This means that you can run the profiler GUI from ...

Jul 13, 2014 Software Engineering, CUDA

Efficient CPU-GPU data transfers, CUDA 6.0 Unified Virtual Memory

Juraj Kardoš, a University of Lugano summer intern and our collaborator, presents a talk on efficient CPU-GPU data transfers and CUDA 6.0 Unified Virtual Memory o...

Jul 11, 2014 Software Engineering, CUDA
