Some people believe that multi-core processors such as graphics processing units (GPUs) and Tilera processors are gradually replacing field-programmable gate arrays (FPGAs) in some applications, because these multi-core processors offer much higher processing performance. For example, because the GPU is primarily designed for graphics rendering, it is especially good at single-precision (SP) and, in some cases, double-precision (DP) floating-point (FP) operations. Tilera's TILE devices currently do not support FP operations in hardware; they must be emulated in software, which is costly. In general, the same is true for FPGAs, where FP operations are built from multiple resources: to achieve acceptable performance, the IP blocks require many gates and deep pipelining. For example, current Tesla-class GPUs can execute up to 10^12 floating-point operations per second (1 TFLOPS), compared to 150 GFLOPS for Xilinx Virtex-6 devices.
The situation is different for fixed-point operations. A new generation of GPUs can perform integer operations at the same rate as floating-point operations, that is, 10^12 operations per second (1 TOPS), while the Virtex-6 device rises to 500 GOPS. Integer performance is the strength of the TILE processor: TILE-Gx (Figure 1) peaks at 750 GOPS for 8-bit data and 188 GOPS for 32-bit data.
FPGAs can take advantage of their parallelism and their adaptability to many algorithms to achieve performance closer to the theoretical maximum. However, FPGAs require more silicon area and longer development time to approach these theoretical maximums. For algorithms that map well onto the GPU's hardware parallelism, GPUs have been able to reach 20 to 30% of peak performance. They also have reasonable silicon densities (40nm process, with 32nm in development) and development times (usually only a few weeks, whereas FPGAs take several months).
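The peak and sustained figures quoted above can be combined in a short calculation. The following Python sketch uses only the numbers given in the text (1 TFLOPS and 1 TOPS for the GPU, 150 GFLOPS and 500 GOPS for the Virtex-6, and 20 to 30% of peak for the GPU); the function and variable names are illustrative, and the comparison assumes the quoted peaks are measured in comparable ways, which the text does not confirm.

```python
# Peak rates quoted in the text, in giga-operations per second.
GPU_PEAK_FP = 1000    # Tesla-class GPU: 1 TFLOPS
GPU_PEAK_INT = 1000   # same rate for integer operations: 1 TOPS
FPGA_PEAK_FP = 150    # Xilinx Virtex-6, floating point
FPGA_PEAK_INT = 500   # Xilinx Virtex-6, fixed point


def sustained(peak_gops, percent_of_peak):
    """Sustained rate when only a given percentage of peak is reached."""
    return peak_gops * percent_of_peak / 100


def break_even_percent(target_gops, peak_gops):
    """Percentage of its own peak a device needs to match target_gops."""
    return 100 * target_gops / peak_gops


# GPU sustained range at 20% and 30% of peak, as stated in the text.
gpu_fp_range = (sustained(GPU_PEAK_FP, 20), sustained(GPU_PEAK_FP, 30))
gpu_int_range = (sustained(GPU_PEAK_INT, 20), sustained(GPU_PEAK_INT, 30))

# Share of its own peak the FPGA would need to match that range.
fpga_fp_needed = tuple(break_even_percent(r, FPGA_PEAK_FP) for r in gpu_fp_range)
fpga_int_needed = tuple(break_even_percent(r, FPGA_PEAK_INT) for r in gpu_int_range)

print("GPU sustained FP (GFLOPS):", gpu_fp_range)
print("FPGA % of peak needed, FP:", fpga_fp_needed)
print("FPGA % of peak needed, integer:", fpga_int_needed)
```

Under these figures, the GPU's sustained floating-point rate already exceeds the FPGA's quoted floating-point peak, whereas for fixed-point work an FPGA running at roughly half of its peak would match the GPU's sustained rate.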
The TILEPro64 processor offers FPGA-like adaptability and GPU-like programmability, but its coarse, task-level problem decomposition prevents it from achieving the fine-grained parallelism of FPGAs and GPUs. Memory bandwidth is equally important in evaluating processor performance, and GPUs can deliver three times the bandwidth of FPGAs and six times that of the TILEPro64. However, this bandwidth is only available under two conditions: the large memory latencies must be hidden by interleaved processing, and memory accesses must be coalesced so that they follow the optimal access pattern. With an FPGA, developers need to manage memory locality themselves. The new generation of GPUs and the TILEPro64 processor have a traditional cache hierarchy, which helps exploit memory locality and reduces development time.
Perhaps the factor most likely to rule out GPGPU is latency. For example, the time required to launch a kernel and the long access time of main memory can cause long delays. In many cases this delay can be mitigated somewhat, but it cannot be avoided completely. GPUs are therefore best suited to large data sets that involve a large number of operations, in other words, to workloads with high computational intensity. In environments with strict latency requirements (such as closed-loop control), FPGAs are preferred. The TILE processor has good latency.
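The argument that fixed latency favors large, computation-heavy workloads can be illustrated with a simple model. The Python sketch below is a minimal illustration, not a measurement: it treats kernel-launch and memory-access delay as a single fixed cost per call, and the 10 microsecond latency is a hypothetical placeholder chosen only for the example. The 250 GOPS sustained rate is the midpoint of the 20 to 30% of 1 TFLOPS range mentioned earlier in the article.

```python
def gpu_efficiency(num_ops, ops_per_second, fixed_latency_s):
    """Fraction of wall-clock time spent computing rather than waiting.

    Assumes a simplified model: total time = fixed latency + compute time.
    """
    compute_s = num_ops / ops_per_second
    return compute_s / (compute_s + fixed_latency_s)


def min_ops_for_efficiency(target, ops_per_second, fixed_latency_s):
    """Smallest workload (in operations) that reaches the target efficiency.

    Solves compute / (compute + latency) = target for the compute time.
    """
    compute_s = fixed_latency_s * target / (1 - target)
    return compute_s * ops_per_second


SUSTAINED_OPS_PER_S = 250e9   # midpoint of 20-30% of 1 TFLOPS (from the text)
ASSUMED_LATENCY_S = 10e-6     # hypothetical fixed delay per kernel call

for ops in (1e3, 1e6, 1e9):
    eff = gpu_efficiency(ops, SUSTAINED_OPS_PER_S, ASSUMED_LATENCY_S)
    print(f"{ops:.0e} operations: {eff:.4%} of time spent computing")

needed = min_ops_for_efficiency(0.5, SUSTAINED_OPS_PER_S, ASSUMED_LATENCY_S)
print(f"operations needed for 50% efficiency: {needed:.3g}")
```

In this model a small workload spends almost all of its time waiting, while a sufficiently large one amortizes the fixed delay. A closed-loop controller that must respond within a fixed deadline cannot batch work in this way, which is consistent with the preference for FPGAs in such settings.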
Will multi-core processors replace FPGAs?
Figure 1: Tilera's TILE-Gx processor has a maximum execution capacity of 750 GOPS for 8-bit data.