GPU accelerated rectilinear steiner tree construction

Download
2015
Özcan, Dinçer
The Rectilinear Steiner Tree (RST) problem is one of the fundamental problems in circuit design automation. The problem is to find the tree structure that connects all points in the input set such that total tree length is minimized. In order to reduce the total length, extra points, called Steiner points, are introduced to the tree. Since the RST problem is NP-complete, developing heuristic algorithms which can produce near optimal solutions is the primary approach to solve the problem. This thesis accelerates the Modified RST algorithm through parallelizing and using a state of art Graphics Processing Unit (GPU) platform. GPUs contain many computational units and they can provide massive computational power. However, it is not trivial to map an RST problem instance to GPU. In order to benefit from the resources of GPU, the problem has to be parallelized and suitably implemented. In this study, we thoroughly investigate two recent rectilinear Steiner tree algorithms, RST and Modified RST, and identify parallel implementation opportunities. Modified RST algorithm is speed oriented and has better performance especially on large problem instances. Moreover, it is suitable for paralel implementation. Thus, we propose a paralel and scalable RST solution based on Modified RST algorithm, which paralellizes the whole algortihm and utilizes GPU resources more efficiently. Computational results for realistic applications and random generated benchmarks are presented to evaluate our implementation. Significant speed up is observed especially for large problem sizes.

Suggestions

Evaluation of compression algorithms for motion command generation
Yaman, Ulaş; Dölen, Melik (2011-04-15)
This paper focuses on a direct command generation technique for Computer Numerical Control (CNC) machine systems. In this paradigm, higher-order differences of a given trajectory (i.e, position) are computed and the resulting data are compacted via data compression techniques. As a part of the command generation scheme, the paper also introduces a new data compression technique titled ΔY10. Apart from this new method, the performances of the proposed generator employing different compression algorithms (suc...
Optimum geometry for torque ripple minimization of switched reluctance motors
Sahin, F; Ertan, HB; Leblebicioğlu, Mehmet Kemal (Emerald, 1995-12-01)
This paper briefly describes an approach to determine the optimum magnetic circuit parameters to minimize low speed torque ripple for switched reluctance (SR) motors. For prediction of the torque ripple, normalized data obtained from field solution and a neural network approach is used. Comparison of experimental results with computations illustrates the accuracy of the method. The optimization method is briefly described and some results are presented.
FPGA IMPLEMENTATION OF TURBO DECODERS USING BCJR ALGORITHM
Atar, Onur; SAZLI, Murat Hüsnü (2011-12-01)
The most difficult design issue for turbo codes, which is the most recent and successful channel coding method to approach the channel capacity limit, is the design of the iterative decoders which perform calculations for all possible states of the encoders. BCJR (MAP) algorithm, which is used for turbo decoders, embodies complex mathematical operations such as division, exponential and logarithm calculations. Therefore, BCJR algorithm was avoided and the sub-optimal derivatives of this algorithm such as Lo...
Parallel-MLFMA Solutions of Large-Scale Problems Involving Composite Objects
Ergül, Özgür Salih (2012-07-14)
We present a parallel implementation of the multilevel fast multipole algorithm (MLFMA) for fast and accurate solutions of large-scale electromagnetics problems involving composite objects with dielectric and metallic parts. Problems are formulated with the electric and magnetic current combined-field integral equation (JMCFIE) and solved iteratively with MLFMA on distributed-memory architectures. Numerical examples involving canonical and complicated objects, such as optical metamaterials, are presented to...
Automated Sizing of Truss Structures Using a Computationally Improved SOPT Algorithm
Hasançebi, Oğuzhan (2013-06-01)
The present study attempts to apply an efficient yet simple optimization (SOPT) algorithm to optimum design of truss structures under stress and displacement constraints. The computational efficiency of the technique is improved through avoiding unnecessary analyses during the course of optimization using the so-called upper bound strategy (UBS). The efficiency of the UBS integrated SOPT algorithm is evaluated through benchmark sizing optimization problems of truss structures and the numerical results are r...
Citation Formats
D. Özcan, “GPU accelerated rectilinear steiner tree construction,” M.S. - Master of Science, Middle East Technical University, 2015.