
Reconfigurable computing encompasses many different architectures that allow configurability at the hardware level. The main motivation is to combine the high performance of hardwired Application Specific Integrated Circuits (ASICs) with the programming flexibility of microprocessors. The architectures offer different compromises between these two extremes, but the focus is often on highly parallel processing optimized for high data throughput rather than low-latency response. A popular arrangement of parallel processing elements (PEs) is a tile architecture with a configurable interconnect between the tiles. The functionality of the PEs can range from Boolean functions on individual bits to entire processors. In this project we have so far worked with an FPGA (Field Programmable Gate Array), which has fine-grained PEs (look-up tables with 4-bit inputs), and a computing array with coarse-grained PEs (24-bit ALUs).
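To make the notion of a fine-grained PE concrete, the following minimal sketch models a look-up table with a 4-bit input as a 16-bit configuration word: reconfiguring the PE means nothing more than loading a different word. The function names `make_lut4` and `eval_lut4` are hypothetical and chosen for illustration only; the sketch does not describe the internal layout of any particular device.

```python
def make_lut4(func):
    """Build the 16-bit configuration word of a 4-input Boolean function.

    Bit i of the word stores the function value for the input pattern
    whose bits (a, b, c, d) encode the index i, with a as the lowest bit.
    """
    config = 0
    for idx in range(16):
        a, b, c, d = ((idx >> k) & 1 for k in range(4))
        if func(a, b, c, d):
            config |= 1 << idx
    return config


def eval_lut4(config, a, b, c, d):
    """Evaluate a configured LUT: the four input bits select one stored bit."""
    idx = a | (b << 1) | (c << 2) | (d << 3)
    return (config >> idx) & 1


# Two different "configurations" of the same hypothetical PE.
parity = make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)
all_set = make_lut4(lambda a, b, c, d: a & b & c & d)
```

Here `eval_lut4(parity, 1, 0, 0, 0)` selects the stored bit for the input pattern with only `a` set. A coarse-grained PE such as an ALU would instead be configured by selecting one of a few word-level operations.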
On the relatively small FPGA (XC4085 XLA from
Xilinx) we have implemented a solver for the level set equation
and used it for segmentation of medical images (Fig. 2). The FPGA was operated on a low-cost PCI card in a standard PC. We have used the computing array (XPP from PACT)
for denoising images with a non-linear diffusion model
(Fig. 4). Because the actual hardware was not available at first,
the configurations were tested with a clock-accurate simulator and the
results were generated with a software simulation.
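As an illustration of what a software simulation of non-linear diffusion can look like, here is a minimal sketch of one explicit time step. The text above does not specify the diffusivity or the discretization, so everything concrete here is an assumption: the Perona-Malik style edge-stopping function, the four-neighbor stencil, the zero-flux boundary, and the parameter names `tau` (time step) and `lam` (contrast scale).

```python
def diffusivity(diff, lam):
    """Assumed edge-stopping function: close to 1 in flat regions, small across edges."""
    return 1.0 / (1.0 + (diff / lam) ** 2)


def diffusion_step(img, tau=0.2, lam=10.0):
    """One explicit non-linear diffusion step on a 2D image given as a list of rows.

    Each pixel exchanges intensity with its four neighbors; the exchange is
    weighted by the diffusivity of the local difference, so strong edges are
    smoothed less than noise. Pixels outside the image contribute no flux.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            flux = 0.0
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    d = img[ny][nx] - img[y][x]
                    flux += diffusivity(abs(d), lam) * d
            out[y][x] = img[y][x] + tau * flux
    return out


# A single noisy pixel on a flat background.
noisy = [[0.0, 0.0, 0.0],
         [0.0, 8.0, 0.0],
         [0.0, 0.0, 0.0]]
smoothed = diffusion_step(noisy)
```

Because the weights are symmetric between neighboring pixels, this scheme redistributes intensity without changing the image sum. The inner loop is a fixed local stencil, which is the kind of computation that maps naturally onto an array of PEs.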
In view of the memory
gap problem, a huge advantage of free configurability is the
possibility of incorporating all sorts of data-flow optimizations and
parallelism into the implementation (Fig. 1). In particular, deep
pipelines can be built that require little bandwidth and execute many
operations in parallel (Fig. 3). So even though the devices run at low
clock frequencies in the tens of MHz, they can easily outperform a GHz PC,
since they do not suffer from a bandwidth problem and a hundred or more
operations are performed in parallel in each clock cycle.
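The bandwidth argument can be sketched in software with a streaming 3x3 filter. The model below reads every pixel of a row-major stream exactly once and keeps only the last two image rows plus three pixels in a buffer, in the spirit of the line buffers of a hardware pipeline. It is a plain software analogy and not the actual XPP or FPGA configuration; the name `stream_filter_3x3` and the buffer layout are assumptions made for illustration.

```python
from collections import deque


def stream_filter_3x3(pixels, width, kernel):
    """Yield 3x3 filter results for all interior pixels of a row-major stream.

    Only 2 * width + 3 pixels are buffered. Once the buffer is full, every
    incoming pixel completes one 3x3 window, so one memory read feeds nine
    multiplications and the additions that sum them; in hardware these
    would all execute in the same clock cycle.
    """
    window = deque(maxlen=2 * width + 3)
    for n, value in enumerate(pixels):
        window.append(value)
        row, col = divmod(n, width)
        if row >= 2 and col >= 2:
            # window[0] is the top-left pixel of the 3x3 neighborhood.
            acc = 0
            for dy in range(3):
                for dx in range(3):
                    acc += kernel[dy][dx] * window[dy * width + dx]
            yield acc


# 4x4 test image with values 0..15 and a box kernel (plain neighborhood sum).
image = list(range(16))
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
result = list(stream_filter_3x3(image, width=4, kernel=box))
```

The kernel is applied without flipping (a correlation), and border pixels are skipped to keep the sketch short. A sequential processor executes the nine multiply-accumulate steps one after another, whereas a configured pipeline performs them concurrently, which is how a device clocked at tens of MHz can keep up with a much faster clocked CPU.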
The main disadvantage is
the more tedious programming model based on hardware description
languages. For the image processing examples presented below the
complexity is manageable, and therefore reconfigurable computing is
gaining popularity in this area. In principle the implemented PDE
solvers could also be applied to scientific simulations, but these
simulations usually comprise many interacting PDE systems, which lead to
a much higher program complexity. Therefore, we are interested in the
ongoing research on high-level languages for reconfigurable computing
that can still efficiently utilize the massive parallelism available.

1. Some optimizations available in reconfigurable computing.

2. Segmentation of a brain tumor computed with the FPGA.



3. Configuration of a 3x3 filter for a 2D image in the XPP.

4. Non-linear diffusion as implemented on the XPP.




