2013 20th International Conference on High Performance Computing (HiPC) (2013)
Bengaluru (Bangalore), Karnataka, India
Dec. 18, 2013 to Dec. 21, 2013
ISBN: 978-1-4799-0730-4
TABLE OF CONTENTS

Share-o-meter: An empirical analysis of KSM based memory sharing in virtualized systems (Abstract)

Shashank Rachamalla , Department of Computer Science and Engineering, Indian Institute of Technology Bombay, India
Debadatta Mishra , Department of Computer Science and Engineering, Indian Institute of Technology Bombay, India
Purushottam Kulkarni , Department of Computer Science and Engineering, Indian Institute of Technology Bombay, India
pp. 59-68

Revisiting the space-filling curves for storage, reordering and partitioning mesh based data in scientific computing (Abstract)

Pavanakumar M. , Computational and Theoretical Fluid Dynamics Division, CSIR National Aerospace Laboratories, Bangalore, India
Kaushik K. N. , Computational and Theoretical Fluid Dynamics Division, CSIR National Aerospace Laboratories, Bangalore, India
pp. 362-367

MaSiF: Machine learning guided auto-tuning of parallel skeletons (Abstract)

Alexander Collins , School of Informatics, University of Edinburgh, EH8 9AB, UK
Christian Fensch , School of Informatics, University of Edinburgh, EH8 9AB, UK
Hugh Leather , School of Informatics, University of Edinburgh, EH8 9AB, UK
Murray Cole , School of Informatics, University of Edinburgh, EH8 9AB, UK
pp. 186-195

Effects of phase imbalance on data center energy management (Abstract)

Sushil Gupta , HCL Infosystems Ltd, India
Ayan Banerjee , IMPACT Lab at Arizona State University, USA
Zahra Abbasi , IMPACT Lab at Arizona State University, USA
Sandeep K.S. Gupta , IMPACT Lab at Arizona State University, USA
pp. 415-424

HARP: Adaptive abort recurrence prediction for Hardware Transactional Memory (Abstract)

Adria Armejach , Barcelona Supercomputing Center, Spain
Anurag Negi , Chalmers University of Technology, Sweden
Adrian Cristal , Barcelona Supercomputing Center, Spain
Osman Unsal , Barcelona Supercomputing Center, Spain
Per Stenstrom , Chalmers University of Technology, Sweden
Tim Harris , Oracle Labs, Cambridge, UK
pp. 196-205

Minimization of cloud task execution length with workload prediction errors (Abstract)

Sheng Di , INRIA, France
Cho-Li Wang , The University of Hong Kong, Hong Kong
pp. 69-78

Speculative dynamic vectorization to assist static vectorization in a HW/SW co-designed environment (Abstract)

Rakesh Kumar , Department of Computer Architecture, Universitat Politècnica de Catalunya, Barcelona, Spain
Alejandro Martinez , Intel Barcelona Research Center, Intel Labs, Spain
Antonio Gonzalez , Department of Computer Architecture, Universitat Politècnica de Catalunya, Barcelona, Spain
pp. 79-88

LiPS: A cost-efficient data and task co-scheduler for MapReduce (Abstract)

Moussa Ehsan , Computer Science, Stony Brook University, USA
Yao Chen , Computer Science, Stony Brook University, USA
Hui Kang , Computer Science, Stony Brook University, USA
Radu Sion , Computer Science, Stony Brook University, USA
Jennifer Wong , Computer Science, Stony Brook University, USA
pp. 49-58

Multi-tier energy buffering management for IDCs with heterogeneous energy storage devices (Abstract)

Zahra Abbasi , Impact Lab, School of Computing, Informatics and Decision Systems Engineering, ASU, Tempe, Arizona, USA
Madhurima Pore , Impact Lab, School of Computing, Informatics and Decision Systems Engineering, ASU, Tempe, Arizona, USA
Ayan Banerjee , Impact Lab, School of Computing, Informatics and Decision Systems Engineering, ASU, Tempe, Arizona, USA
Sandeep K. S. Gupta , Impact Lab, School of Computing, Informatics and Decision Systems Engineering, ASU, Tempe, Arizona, USA
pp. 368-377

A Branch-and-Bound algorithm using multiple GPU-based LP solvers (Abstract)

Xavier Meyer , Dept. of Computer Science, University of Geneva, Switzerland
Bastien Chopard , Dept. of Computer Science, University of Geneva, Switzerland
Paul Albuquerque , Institute for Informatics & Telecommunications, University of Applied Sciences of Western Switzerland, Geneva, Switzerland
pp. 129-138

MIL: A language to build program analysis tools through static binary instrumentation (Abstract)

Andres S. Charif-Rubial , Exascale Computing Research Laboratory, FR
Denis Barthou , Laboratoire LaBRI, University of Bordeaux, Bordeaux, FR
Cedric Valensi , Exascale Computing Research Laboratory, FR
Sameer Shende , Department of Computer and Information Science, University of Oregon, Eugene, USA
Allen Malony , Department of Computer and Information Science, University of Oregon, Eugene, USA
William Jalby , Exascale Computing Research Laboratory, FR
pp. 206-215

GAGM: Genome assembly on GPU using mate pairs (Abstract)

Ashutosh Jain , Dept. of Computer Science and Engineering, IIT Delhi, New Delhi, India
Anshuj Garg , Dept. of Computer Science and Engineering, IIT Delhi, New Delhi, India
Kolin Paul , Dept. of Computer Science and Engineering, IIT Delhi, New Delhi, India
pp. 176-185

A new parallel algorithm for connected components in dynamic graphs (Abstract)

Robert McColl , College of Computing, Georgia Institute of Technology, Atlanta, USA
Oded Green , College of Computing, Georgia Institute of Technology, Atlanta, USA
David A. Bader , College of Computing, Georgia Institute of Technology, Atlanta, USA
pp. 246-255

Accelerating Strassen-Winograd's matrix multiplication algorithm on GPUs (Abstract)

Pai-Wei Lai , Dept. of Computer Science and Engineering, The Ohio State University, USA
Humayun Arafat , Dept. of Computer Science and Engineering, The Ohio State University, USA
Venmugil Elango , Dept. of Computer Science and Engineering, The Ohio State University, USA
P. Sadayappan , Dept. of Computer Science and Engineering, The Ohio State University, USA
pp. 139-148

Accelerating inclusion-based pointer analysis on heterogeneous CPU-GPU systems (Abstract)

Yu Su , School of Computer Science and Engineering, University of New South Wales, 2052, Australia
Ding Ye , School of Computer Science and Engineering, University of New South Wales, 2052, Australia
Jingling Xue , School of Computer Science and Engineering, University of New South Wales, 2052, Australia
pp. 149-158

Benchmarking MIC architectures with Monte Carlo simulations of spin glass systems (Abstract)

Alessandro Gabbana , University of Ferrara, I-44122 Italy
Marcello Pivanti , University of Ferrara, I-44122 Italy
Sebastiano Fabio Schifano , University of Ferrara, I-44122 Italy
Raffaele Tripiccione , University of Ferrara, I-44122 Italy
pp. 378-385

Conflict-free data access for multi-bank memory architectures using padding (Abstract)

Joar Sohl , Department of Computer Engineering, Linköping University, Sweden
Jian Wang , Department of Computer Engineering, Linköping University, Sweden
Andreas Karlsson , Department of Computer Engineering, Linköping University, Sweden
Dake Liu , Department of Computer Engineering, Linköping University, Sweden
pp. 425-432

Cache-based cross-iteration coherence for speculative parallelization (Abstract)

Andre Baixo , University of Washington, Seattle, USA
Joao Paulo Porto , Google Inc., Mountain View, USA
Guido Araujo , IC-UNICAMP, Campinas, Brazil
pp. 216-225

Performance and energy consumption analysis of a seismic application for three different architectures intended for oil and gas industry (Abstract)

Lucas T. Melo , Informatics Center (CIn), Federal University of Pernambuco (UFPE), Recife, Brazil
Gilliano G. S. Menezes , Informatics Center (CIn), Federal University of Pernambuco (UFPE), Recife, Brazil
Abel G. Silva-Filho , Informatics Center (CIn), Federal University of Pernambuco (UFPE), Recife, Brazil
Manoel E. Lima , Informatics Center (CIn), Federal University of Pernambuco (UFPE), Recife, Brazil
pp. 386-395

Analyzing the performance impact of authorization constraints and optimizing the authorization methods for workflows (Abstract)

Nadeem Chaudhary , Department of Computer Science, University of Warwick, Coventry, CV4 7AL, United Kingdom
Ligang He , Department of Computer Science, University of Warwick, Coventry, CV4 7AL, United Kingdom
pp. 1-9

iFlatLFS: Performance optimization for accessing massive small files (Abstract)

Songling Fu , School of Computer Science, National University of Defense Technology, Changsha, China
Chenlin Huang , School of Computer Science, National University of Defense Technology, Changsha, China
Ligang He , Department of Computer Science, University of Warwick, Coventry, UK
Nadeem Chaudhary , Department of Computer Science, University of Warwick, Coventry, UK
Xiangke Liao , School of Computer Science, National University of Defense Technology, Changsha, China
Shazhou Yang , School of Computer Science, National University of Defense Technology, Changsha, China
Xiaochuan Wang , School of Computer Science, National University of Defense Technology, Changsha, China
Bao Li , School of Computer Science, National University of Defense Technology, Changsha, China
pp. 10-19

Solving tridiagonal systems on a GPU (Abstract)

Brian J. Murphy , Department of Mathematics and Computer Science, Lehman College of the City University of New York, Bronx, 10468 USA
pp. 159-168

The super warp architecture with random address shift (Abstract)

Koji Nakano , Department of Information Engineering, Hiroshima University, Kagamiyama 1-4-1, Higashi, 739-8527 Japan
Susumu Matsumae , Department of Information Science, Saga University, Honjo 1, 840-8502 Japan
pp. 256-265

Adding data parallelism to streaming pipelines for throughput optimization (Abstract)

Peng Li , Department of Computer Science and Engineering, Washington University in St. Louis, MO 63130, USA
Kunal Agrawal , Department of Computer Science and Engineering, Washington University in St. Louis, MO 63130, USA
Jeremy Buhler , Department of Computer Science and Engineering, Washington University in St. Louis, MO 63130, USA
Roger D. Chamberlain , Department of Computer Science and Engineering, Washington University in St. Louis, MO 63130, USA
pp. 20-29

A memory efficient algorithm for adaptive multidimensional integration with multiple GPUs (Abstract)

Kamesh Arumugam , Department of Computer Science, Old Dominion University, Norfolk, Virginia 23529, USA
Alexander Godunov , Department of Physics, Old Dominion University, Norfolk, Virginia 23529, USA
Desh Ranjan , Department of Computer Science, Old Dominion University, Norfolk, Virginia 23529, USA
Balsa Terzic , Center for Advanced Studies of Accelerators, Jefferson Lab, Newport News, Virginia 23606, USA
Mohammad Zubair , Department of Computer Science, Old Dominion University, Norfolk, Virginia 23529, USA
pp. 169-175

Performance evaluation of medical imaging algorithms on Intel® MIC platform (Abstract)

Jyotsna Khemka , Siemens Corporate Research and Technology Center, Bangalore, India 560100
Mrugesh Gajjar , Siemens Corporate Research and Technology Center, Bangalore, India 560100
Sharan Vaswani , Siemens Corporate Research and Technology Center, Bangalore, India 560100
Naga Vydyanathan , Siemens Corporate Research and Technology Center, Bangalore, India 560100
Rama Malladi , Intel Technology India Pvt. Ltd., Bangalore, India 560017
Vinutha S V , Intel Technology India Pvt. Ltd., Bangalore, India 560017
pp. 396-404

Exploring energy and performance behaviors of data-intensive scientific workflows on systems with deep memory hierarchies (Abstract)

Marc Gamell , Rutgers Discovery Informatics Institute, NSF Cloud and Autonomic Computing Center, Rutgers University, Piscataway, NJ, USA
Ivan Rodero , Rutgers Discovery Informatics Institute, NSF Cloud and Autonomic Computing Center, Rutgers University, Piscataway, NJ, USA
Manish Parashar , Rutgers Discovery Informatics Institute, NSF Cloud and Autonomic Computing Center, Rutgers University, Piscataway, NJ, USA
Stephen Poole , Computer Science and Mathematics & NCCS Divisions, Oak Ridge National Laboratory, TN, USA
pp. 226-235

Approximation algorithms for energy minimization in Cloud service allocation under reliability constraints (Abstract)

Olivier Beaumont , Inria, Bordeaux, France
Philippe Duchon , University of Bordeaux, France
Paul Renaud-Goud , Inria, Bordeaux, France
pp. 295-304

Work efficient parallel algorithms for large graph exploration (Abstract)

Dip Sankar Banerjee , International Institute of Information Technology, Hyderabad, Gachibowli, India 500 032
Shashank Sharma , International Institute of Information Technology, Hyderabad, Gachibowli, India 500 032
Kishore Kothapalli , International Institute of Information Technology, Hyderabad, Gachibowli, India 500 032
pp. 433-442

Transaction scheduling using conflict avoidance and Contention Intensity (Abstract)

Marcio M. Pereira , Institute of Computing, UNICAMP, Av. Albert Einstein, 1251, Campinas, SP - Brazil
Alexandro Baldassin , UNESP - Univ Estadual Paulista, Rio Claro, SP - Brazil
Guido Araujo , Institute of Computing, UNICAMP, Av. Albert Einstein, 1251, Campinas, SP - Brazil
Luiz Eduardo Buzato , Institute of Computing, UNICAMP, Av. Albert Einstein, 1251, Campinas, SP - Brazil
pp. 236-245

Algorithms for the relaxed Multiple-Organization Multiple-Machine Scheduling Problem (Abstract)

Anirudh Chakravorty , Indraprastha Institute of Information Technology, Delhi, India
Neelima Gupta , Department of Computer Science, University of Delhi, India
Neha Lawaria , Department of Computer Science, University of Delhi, India
Pankaj Kumar , Department of Computer Science, University of Delhi, India
Yogish Sabharwal , IBM Research - India
pp. 30-38

SCORPIO: A scalable two-phase parallel I/O library with application to a large scale subsurface simulator (Abstract)

Sarat Sreepathi , Oak Ridge National Laboratory, TN, USA
Vamsi Sripathi , Intel Corporation, Hillsboro, OR, USA
Richard Mills , Oak Ridge National Laboratory, TN, USA
Glenn Hammond , Pacific Northwest National Laboratory, Richland, WA, USA
G. Kumar Mahinthakumar , North Carolina State University, Raleigh, USA
pp. 443-451

Can GPUs sort strings efficiently? (Abstract)

Aditya Deshpande , Center for Visual Information Technology, International Institute of Information Technology, Hyderabad, India
P J Narayanan , Center for Visual Information Technology, International Institute of Information Technology, Hyderabad, India
pp. 305-313

Parallel branch-and-bound for two-stage stochastic integer optimization (Abstract)

Akhil Langer , Department of Computer Science, University of Illinois at Urbana-Champaign, USA
Ramprasad Venkataraman , Department of Computer Science, University of Illinois at Urbana-Champaign, USA
Udatta Palekar , College of Business, University of Illinois at Urbana-Champaign, USA
Laxmikant V. Kale , Department of Computer Science, University of Illinois at Urbana-Champaign, USA
pp. 266-275

Compiler generation and autotuning of communication-avoiding operators for geometric multigrid (Abstract)

Protonu Basu , University of Utah, Salt Lake City, 84112, USA
Anand Venkat , University of Utah, Salt Lake City, 84112, USA
Mary Hall , University of Utah, Salt Lake City, 84112, USA
Samuel Williams , Lawrence Berkeley National Laboratory, CA 94720, USA
Brian Van Straalen , Lawrence Berkeley National Laboratory, CA 94720, USA
Leonid Oliker , Lawrence Berkeley National Laboratory, CA 94720, USA
pp. 452-461

Loop level speculation in a task based programming model (Abstract)

Rahulkumar Gayatri , Barcelona Supercomputing Center, Spain
Rosa M. Badia , Barcelona Supercomputing Center, Spain
Eduard Ayguadé , Barcelona Supercomputing Center, Spain
pp. 39-48

A self-tuning system based on application Profiling and Performance Analysis for optimizing Hadoop MapReduce cluster configuration (Abstract)

Dili Wu , ISIS, Dept of EECS, Vanderbilt University, 1025 16th Ave S, Nashville, TN 37212, USA
Aniruddha Gokhale , ISIS, Dept of EECS, Vanderbilt University, 1025 16th Ave S, Nashville, TN 37212, USA
pp. 89-98

A dynamic schema to increase performance in many-core architectures through percolation operations (Abstract)

Elkin Garcia , Computer Architecture and Parallel System Laboratory (CAPSL) - Electrical and Computer Engineering Department, University of Delaware, Newark, 19716, U.S.A.
Daniel Orozco , Computer Architecture and Parallel System Laboratory (CAPSL) - Electrical and Computer Engineering Department, University of Delaware, Newark, 19716, U.S.A.
Rishi Khan , ET International, Newark, DE 19711, U.S.A.
Ioannis E. Venetis , Department of Computer Engineering and Informatics, University of Patras, Rion 26500, Greece
Kelly Livingston , Computer Architecture and Parallel System Laboratory (CAPSL) - Electrical and Computer Engineering Department, University of Delaware, Newark, 19716, U.S.A.
Guang R. Gao , Computer Architecture and Parallel System Laboratory (CAPSL) - Electrical and Computer Engineering Department, University of Delaware, Newark, 19716, U.S.A.
pp. 276-285

Efficient sparse matrix multiple-vector multiplication using a bitmapped format (Abstract)

Ramaseshan Kannan , School of Mathematics, The University of Manchester, M13 9PL, UK
pp. 286-294

Parallel distributed breadth first search on GPU (Abstract)

Koji Ueno , Tokyo Institute of Technology / JST CREST, Japan
Toyotaro Suzumura , Tokyo Institute of Technology, IBM Research - Tokyo / JST CREST, Japan
pp. 314-323

Web-scale entity annotation using MapReduce (Abstract)

Shashank Gupta , IIT Bombay, India
Varun Chandramouli , NetApp, India
Soumen Chakrabarti , IIT Bombay, India
pp. 99-108

Evaluation and enhancement of weather application performance on Blue Gene/Q (Abstract)

Gurbinder Singh Gill , IBM Research - India
Vaibhav Saxena , IBM Research - India
Rashmi Mittal , IBM Research - India
Thomas George , IBM Research - India
Yogish Sabharwal , IBM Research - India
Lalit Dagar , Universiti Brunei Darussalam, Brunei
pp. 324-332

Efficient homology computations on multicore and manycore systems (Abstract)

N. Anurag Murty , Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, India
Vijay Natarajan , Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, India
Sathish Vadhiyar , Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, India
pp. 333-342

A hybrid shared memory heterogeneous execution platform for PCIe-based GPGPUs (Abstract)

Sambit K. Shukla , Computer Science and Engineering, UC Riverside, CA, USA
Laxmi N. Bhuyan , Computer Science and Engineering, UC Riverside, CA, USA
pp. 343-352

GPU-enabled efficient executions of radiation calculations in climate modeling (Abstract)

Sai Kiran Korwar , Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, India
Sathish Vadhiyar , Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, India
Ravi S Nanjundiah , Centre for Atmospheric & Oceanic Sciences, Indian Institute of Science, Bangalore, India
pp. 353-361

A hybrid parallelization approach for high resolution operational flood forecasting (Abstract)

Swati Singhal , IBM Research India, New Delhi, India
Lucas Villa Real , IBM Research Brazil, Sao Paulo, Brazil
Thomas George , IBM Research India, Bangalore, India
Sandhya Aneja , Universiti Brunei Darussalam, Brunei
Yogish Sabharwal , IBM Research India, New Delhi, India
pp. 405-414

X10-based distributed and parallel betweenness centrality and its application to social analytics (Abstract)

Charuwat Houngkaew , Tokyo Institute of Technology, Japan
Toyotaro Suzumura , Tokyo Institute of Technology, Japan
pp. 109-118

SymSig: A low latency interconnection topology for HPC clusters (Abstract)

Dhananjay Brahme , High Performance Computing, Center of Excellence, Tata Consultancy Services, Pune, India 411057
Onkar Bhardwaj , Department of Electrical, Computers and Systems Engg, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
Vipin Chaudhary , Department of Computer Science and Engineering, SUNY Buffalo, New York, USA
pp. 462-471

Author index (PDF)

pp. 1-13

Front cover (PDF)

pp. c1

Program (PDF)

pp. 1-10