Comments on: Try GPU computing with WebCL

By: Tomi Engdahl

Tomi Engdahl — Thu, 15 Dec 2022 11:16:41 +0000

What’s the Difference Between CUDA and ROCm for GPGPU Apps?
Dec. 2, 2022
NVIDIA’s CUDA and AMD’s ROCm provide frameworks to take advantage of the respective GPU platforms.
https://www.electronicdesign.com/technologies/embedded-revolution/article/21254328/electronic-design-whats-the-difference-between-cuda-and-rocm-for-gpgpu-apps?utm_source=EG+ED+Connected+Solutions&utm_medium=email&utm_campaign=CPS221209019&o_eid=7211D2691390C9R&rdx.identpull=omeda|7211D2691390C9R&oly_enc_id=7211D2691390C9R

By: Tomi Engdahl

Tomi Engdahl — Tue, 24 Oct 2017 14:04:20 +0000

Solving Mazes with Graphics Cards
https://hackaday.com/2017/10/23/solving-mazes-with-graphics-cards/

Software that runs on a GPU is called a shader. In this example a shader is shown that finds the way through a maze. We also get to catch a glimpse at the limitations that make this field of software special: [Viktor]’s solution has to work with only four variables, because all information is stored in the red, green, blue and alpha channels of an image. The alpha channel represents the boundaries of the maze. Red and green channels are used to broadcast waves from the beginning and end points of the maze. Where these two waves meet is the shortest solution, a value which is captured through the blue channel.
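
The two-wave idea described above can be sketched on the CPU. This is not [Viktor]'s shader: it is a minimal Python illustration that uses breadth-first search in place of the per-pixel GPU passes, and the maze layout, the function names and the `#` wall marker are all assumptions made for the example. The `alpha`, `red`, `green` and `blue` variables only mirror the roles the channels play in the description.

```python
from collections import deque

def wave(walls, source):
    """Spread a wavefront from source; return the step count for each open cell."""
    rows, cols = len(walls), len(walls[0])
    dist = [[None] * cols for _ in range(rows)]
    dist[source[0]][source[1]] = 0
    queue = deque([source])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not walls[nr][nc] and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist

def solve(maze, start, end):
    """Return (shortest length, cells on a shortest path), or (None, set())."""
    alpha = [[ch == "#" for ch in row] for row in maze]  # maze boundaries
    red = wave(alpha, start)    # wave broadcast from the start point
    green = wave(alpha, end)    # wave broadcast from the end point
    shortest = red[end[0]][end[1]]
    if shortest is None:
        return None, set()
    # "Blue": cells where the two waves add up to the shortest length.
    blue = {
        (r, c)
        for r in range(len(maze))
        for c in range(len(maze[0]))
        if red[r][c] is not None
        and green[r][c] is not None
        and red[r][c] + green[r][c] == shortest
    }
    return shortest, blue

# Hypothetical maze: '#' is a wall, '.' is open; (1, 4) is a dead end.
MAZE = [
    "######",
    "#....#",
    "###.##",
    "#...##",
    "######",
]
shortest, blue = solve(MAZE, start=(1, 1), end=(3, 1))
print(shortest, sorted(blue))
```

A cell lies on a shortest route exactly when its distance from the start plus its distance from the end equals the shortest length, which is why the dead-end cell is left out of the result.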

By: Tomi Engdahl

Tomi Engdahl — Fri, 11 Sep 2015 08:57:50 +0000

The Most Under-rated FPGA Design Tool Ever
http://www.eetimes.com/author.asp?section_id=36&doc_id=1327664&

There is a design tool that is being quietly adopted by FPGA engineers because, in many cases, it produces results that are better than hand-coded counterparts.

FPGAs keep getting larger, the designs more complex, and the need for high level design (HLD) flows never seems to go away. C-based design for FPGAs has been promoted for over two decades and several such tools are currently on the market. Model-based design has also been around for a long time from multiple vendors. OpenCL for FPGAs has been getting lots of press in the last couple of years. Yet, despite all of this, 90+% of FPGA designs continue to be built using traditional Verilog or VHDL.

No one can deny the need for HLD. New FPGAs contain over 1 million logic elements, with thousands of hardened DSP and memory blocks. Some vendors’ devices can even support floating-point as efficiently as fixed-point arithmetic. Data converters and interface protocols routinely run at multiple GSPS (gigasamples per second), requiring highly parallel or vectorized processing. Timing closure, simulation, and verification become ever more time-consuming as design sizes grow. But HLD adoption still lags, and FPGAs are primarily programmed by hardware-centric engineers using traditional hardware description languages (HDLs).

The primary reason for this is quality of results (QoR). All high-level design tools have two key challenges to overcome. One is to translate the designer’s intent into implementation when the design is described in a high-level format. This is especially difficult when software programming languages are used (C++, MATLAB, or others), which are inherently serial in nature. It is then up to the compiler to decide by how much and where to parallelize the hardware implementation. This can be aided by adding special intrinsics into the design language, but this defeats the purpose. OpenCL addresses this by having the programmer describe serial dependencies in the datapath, which is why OpenCL is often used for programming GPUs. It is then up to the OpenCL compiler to decide how to balance parallelism against throughput in the implementation. However, OpenCL programming is not exactly a common skillset in the industry.
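
To make the contrast with a serial language concrete, here is a rough Python sketch of the kernel style that OpenCL encourages. It is not OpenCL code and not taken from the article: `saxpy_kernel`, `run_kernel` and the sample data are hypothetical names chosen for illustration, and `gid` only plays the role of OpenCL's `get_global_id(0)`. The programmer writes the work for one element, and the runtime is free to schedule the elements in any order or all at once.

```python
def saxpy_kernel(gid, a, x, y, out):
    """Work for a single element; gid stands in for get_global_id(0)."""
    out[gid] = a * x[gid] + y[gid]

def run_kernel(kernel, global_size, *args):
    """Hypothetical stand-in for a runtime.

    It visits the work-items in reverse order to show that the result
    does not depend on scheduling, since no item reads another's output.
    """
    for gid in reversed(range(global_size)):
        kernel(gid, *args)

def saxpy_serial(a, x, y):
    """The same computation written as an ordinary serial loop."""
    return [a * xi + yi for xi, yi in zip(x, y)]

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)
run_kernel(saxpy_kernel, len(x), 2.0, x, y, out)
print(out)
```

Because each work-item touches only its own index, a compiler targeting a GPU or an FPGA can decide how many copies of the datapath to build or run in parallel without changing the result.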

By: Tomi Engdahl

Tomi Engdahl — Thu, 06 Nov 2014 09:32:44 +0000

GPU Compute and OpenCL: an introduction.
http://www.edn-europe.com/en/gpu-compute-and-opencl-an-introduction..html?cmp_id=7&news_id=10005133&vID=209#.VFs-cclsUik

This article provides, to the reader unfamiliar with the subject, an introduction to the GPU evolution, current architecture, and suitability for compute intensive applications.

By: Tomi Engdahl

Tomi Engdahl — Tue, 28 Oct 2014 09:42:01 +0000

Implementing FPGA Design with the OpenCL Standard
http://www.altera.com/literature/wp/wp-01173-opencl.pdf

Utilizing the Khronos Group’s OpenCL™ standard on an FPGA may offer significantly higher performance and at much lower power than is available today from hardware architectures such as CPUs, graphics processing units (GPUs), and digital signal processing (DSP) units. In addition, an FPGA-based heterogeneous system (CPU + FPGA) using the OpenCL standard has a significant time-to-market advantage compared to traditional FPGA development using lower level hardware description languages (HDLs) such as Verilog or VHDL.

By: Monster Warlord

Monster Warlord — Tue, 11 Mar 2014 23:56:04 +0000

I was wondering if you ever considered changing the structure
of your site? It’s very well written; I love what you’ve got to say.
But maybe you could add a little more in the way of content so people could connect with it
better. You’ve got an awful lot of text for only having one or two pictures.
Maybe you could space it out better?

By: mozila fire

mozila fire — Tue, 10 Dec 2013 09:48:00 +0000

Wow! It’s a real shame more folks don’t know about this site,
it covered everything I needed!!!

By: Tomi Engdahl

Tomi Engdahl — Tue, 06 Aug 2013 13:40:06 +0000

Using OpenCL for Network Acceleration
http://rtcmagazine.com/articles/view/103209

Investigating the practicality of using OpenCL to accelerate AES and DES encryption and decryption by leveraging the GPU engines of the APU reveals a realm of possibilities for exploiting parallelism on hybrid processors.

Microprocessor designs are trending in favor of a higher number of cores per socket versus increased clock speed. Increasingly, more cores are being integrated on the same die to fully take advantage of high-speed interconnects for interprocessor communications. Companies like Advanced Micro Devices (AMD) are innovating high-performance computing by integrating graphics with x86 CPUs to create what AMD refers to as Accelerated Processing Units (APUs).

The advent of the APU creates opportunities for designers to develop solutions not possible a few years ago. These solutions utilize multiple languages and execute across hardware execution domains to enable a wide variety of new applications. One such application is the use of GPU resources as a massively parallel “off-load” engine for computationally intense algorithms in security and networking.

While the results for each of the algorithms differed slightly on both the CPU (blue lines) and GPU (red lines), the overall trend for each was very consistent. It is clear that when the network traffic load was relatively light—meaning that there were not many concurrent threads required to support the algorithm—the CPUs were more than adequate and in fact more efficient than using OpenCL and the GPU cores. However, as the workload and number of concurrent threads increased, the OpenCL and GPU proved to be a significantly better solution.

The beauty of the APU-based architecture is that it allows the designer to decide when and if to use the GPU resources and how.

By: Tomi Engdahl

Tomi Engdahl — Tue, 05 Mar 2013 08:55:32 +0000

OpenCL drivers discovered on Nexus 4 and Nexus 10 devices
http://www.anandtech.com/show/6804/opencl-drivers-discovered-on-nexus-4-and-nexus-10-devices

Companies such as AMD, Intel and Nvidia have been shipping OpenCL drivers on the desktop for some time now. On the mobile side, vendors such as ARM, Imagination, Qualcomm, Samsung and TI have been promising OpenCL on mobile and often show off demos using OpenCL. Drivers from vendors such as ARM, Qualcomm and Imagination have also passed official conformance tests, certifying that they do have working drivers in at least development firmwares.

However, none of the vendors have publicly announced whether or not they are already shipping OpenCL in stock firmware on any device.

Recently, though, we have seen several stories that OpenCL drivers are in fact present on both Nexus 4 and Nexus 10 stock firmware.

By: Tomi

Tomi — Sat, 09 Feb 2013 22:12:28 +0000

Fall Fury: Part 2 – Shaders
http://channel9.msdn.com/coding4fun/articles/Fall-Fury-Part-2-Shaders

In simple terms, shaders are small programs that are executed on the Graphics Processing Unit (GPU) instead of the Central Processing Unit (CPU). In recent years, we’ve seen a major spike in the capabilities of graphics devices, allowing hardware manufacturers to design an execution layer tied to the GPU, therefore being able to target device-specific manipulations to a highly optimized unit. Shaders are not used for simple calculations, but rather for image processing. For example, a shader can be used to adjust image lighting or colors.

Modern GPUs give access to the rendering pipeline that allows developers to execute arbitrary image processing code.

Vertex shaders – these programs translate the coordinates of a vertex in 3D space in relation to the 2D frame. Vertex shaders are executed one time per vertex passed to the GPU.

Pixel shaders – these programs are executed on the GPU in relation to every passed pixel. If you want specific pixels adjusted for lighting or 3D bump mapping, a pixel shader can provide the desired effect for a surface.

Geometry shaders – these shaders are the next progression from vertex shaders, introduced with DirectX 10. The developer is able to pass specific primitives as input and either have the output represent the modified version of what was passed to the program or have new primitives, such as triangles, be generated as a result. Geometry shaders are always executed after vertex processing in the rendering pipeline.
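
The per-vertex and per-pixel execution model described above can be imitated in plain Python. This is only a sketch under stated assumptions: real shaders are written in a shading language such as HLSL and usually transform vertices with 4x4 matrices, whereas here `vertex_shader` uses a simple perspective divide with a hypothetical `focal_length`, and `pixel_shader` applies a hypothetical brightness `gain` to 8-bit RGB values. Points are assumed to lie in front of the camera (z greater than zero).

```python
def vertex_shader(vertex, focal_length=1.0):
    """Map one 3D point onto the 2D frame with a perspective divide."""
    x, y, z = vertex
    return (focal_length * x / z, focal_length * y / z)

def pixel_shader(pixel, gain):
    """Brighten one RGB pixel, clamping each channel to the 8-bit maximum."""
    return tuple(min(255, int(channel * gain)) for channel in pixel)

def shade_image(image, gain):
    """Run the pixel shader once for every pixel, as the GPU would."""
    return [[pixel_shader(p, gain) for p in row] for row in image]

# Run the vertex shader once per vertex of a sample triangle.
triangle = [(2.0, 4.0, 2.0), (0.0, 1.0, 1.0), (-3.0, 3.0, 3.0)]
projected = [vertex_shader(v) for v in triangle]

# Run the pixel shader over a tiny 1x2 image.
image = [[(100, 150, 200), (0, 0, 0)]]
brightened = shade_image(image, 1.5)

print(projected)
print(brightened)
```

Each call depends only on its own vertex or pixel, which is what lets a GPU run thousands of these invocations at the same time.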