Lars Struyf, Stijn De Beugher, Dong Hoon Van Uytsel, Frans Kanters, Toon Goedemé
This paper presents a thorough comparison of the two main hardware targets for real-time optimization of computer vision algorithms: GPU and FPGA. Based on a complex case study algorithm for threaded isle detection, the implementations on both hardware targets are compared in terms of resulting time performance, code translation effort, hardware cost, power efficiency and ease of integration. A real-life case study such as the one described in this paper is a useful addition to more theoretical discussions, as it goes beyond artificial experiments. In our experiments, we show the speed-up gained by porting our algorithm to FPGA using manually written VHDL and to a heterogeneous GPU/CPU architecture using the OpenCL language. We also detail the issues and problems that occurred during the code porting.