Ph.D.-Seminar:
High Performance Computing I
- Contents:
- We will start with an introduction to the basic principles and algorithms
of parallel computing, followed by transferring selected algorithms onto
many-core architectures. We will focus on NVIDIA GPUs from the GTX 280/480
series using the CUDA library, including on multiple GPUs. The students will
compare the performance of the algorithms on the GPU with the performance
on multi-core CPUs (Intel/AMD) and with OpenCL. Compilers and tools
supporting GPU programming will be used as available.
- A technical report has to be submitted by the end of the term.
- Lecturer:
- Prof. Gundolf Haase, Heinrichstr. 36, Room 506, Tel. 5178
- Appointments:
- Monday 13:00 - 14:30 in Heinrichstr. 36, SR 11.34 (starts on Oct 1, 2012)
- Presentations:
- Projects: WS12/13, WS11/12, WS10/11
- Download:
- Getting Started with CUDA (pdf)
- first simple code incl. makefile
- Code on mephisto for CUDA/PGI-OpenACC (add-on for ~/.bashrc)
- Hardware (login from outside KFU only via VPN):
- gpu4u (incl. 4x GTX 280): 166
- fermi (incl. 2x GTX 480): 245
- gpu11 (Sandy Bridge + GTX 580): 68
- mephisto (5 x [4x Tesla 2070]): 128
- Material for OpenACC (PGI):
- Material for OpenACC (CAPS): Don't use that buggy OpenACC compiler!
- Material for CUDA:
- CUDA Toolkit Documentation: all
- CUDA Toolkit/SDK 5.0 Download, Getting Started Guide, Programming Guide, Best Practices Guide (more docs), CUDA 5 (video)
- new: Tutorials on CUDA, OpenCL, Thrust, Nsight, PGI
- NVIDIA, CUDA, OpenCL, OpenCL for NVIDIA
- Nicely written, older installation guide
- NVIDIA Tutorial
- CUDA Programming: Getting Started, Guide, Reduction in CUDA
- Slides by M. Liebmann (1, 2, 3, 4, 5, 6)
- AMD, Radeon: Developer Center
- List of GPU-accelerated libraries; Thrust 1.5 (C++ STL in CUDA, see modules), ppt
- cuPrintf (printf from kernel)
- GPGPU.org
- Material for OpenCL:
- Software/Compiler/Hardware:
- OpenACC (Cray, NVIDIA, PGI, CAPS), Quick Ref
- CUBLAS, CUFFT, CUSPARSE, CURAND, Thrust 1.5
- LAPACK on GPU (Info): cuLA
- PGI compiler with CUDA pragmas [prices]
- CUDA programs can also run on CPUs
- HMPP workbench with pragmas for CUDA/OpenCL [prices]
- OP2 project by Mike Giles, great course by Mike Giles (see also the guest talks)
- PETSc on GPU
- LIBJACKET, C++/C library for GPU computing (Download)
- Kepler-GK110: 1, 2, 3, 4
- Tesla K20: 1, 2, 3, Top 500
- AMD: Firepro S1000
- Intel® Xeon® Phi: 1, 2, 3, 4, Stampede
- Further Links
- gpgpu.org, gpucomputing.net
- MultiCoreInfo, GPGPU
- Comparison GPU/CPU
- New: NVIDIA OpenCL 1.0 (download)
- New: Intel OpenCL SDK 1.1
- New: GTC [30.09.2009, 01.10.2009, Fermi in c't 22/09], NEXUS (Visual Studio based), Radeon HD 5800 [23.09.2009], comparison
- Tesla with Fermi [16.11.2009], GF100 [18.01.2010], Tesla C2050, Quadro 6000
- Vienna supercomputer [VSC-2] [Nov. 27, 2009], login to Tesla in Vienna
- Chinese GPU supercomputers [May 31, 2010], Nebulae, auto-tuning
- Aubrey Isle / Knights Ferry co-processor card by Intel [June 1, 2010; c't 13/2010, p. 20], comparison with Fermi, Compiler, 1 TFLOP DGEMM [Nov. 16, 2011]
- AMD Llano [Oct. 19, 2010], current [June 2011]
- NVIDIA GPUs in servers by Cray [Sept. 22, 2010]
- next generation: Kepler and Maxwell [Sept. 22, 2010]
- Chinese Tianhe-1A on rank 1 in Top 500 [Oct. 28, 2010]: 14336 Xeon + 7168 Tesla (2.5 PFLOPS, 4.04 MW) located at NSC in Tianjin (see Spiegel, Heise, nvidia). More details.
- BlueGene/Q with 17 cores (Heise)
- Mathematica 8 supports GPU computing
- low-energy supercomputer at Uni Frankfurt (Heise)
- Top 500 [June 2011] (heise)
- Oak Ridge plans with 18000 Kepler-GPUs [Oct. 11, 2011]; Titan [Oct. 30, 2012]
- MS-AMP [Feb 2012]
- Cray XC30 using Xeon Phi [Nov. 9, 2012]; Titan
- NVIDIA: Volta (1 TB/s bandwidth)
20.03.2013