Publications by area
My publications on DBLP.
Programming models
Dynamic load balancing
Fault tolerance
Optimizations for accelerators
Performance tools
Compile-time + runtime optimizations
Out-of-core algorithms
Programming models
-
Global trees: a framework for linked data structures on distributed memory parallel systems
B. Larkins, J. Dinan, S. Krishnamoorthy, S. Parthasarathy, A. Rountev, P. Sadayappan
Supercomputing (SC) 2008, November 2008.
BibTeX
-
Non-collective parallel I/O for global address space programming models
S. Krishnamoorthy, J. P. Canovas, V. Tipparaju, J. Nieplocha, and P. Sadayappan.
Proceedings of the International Conference on Cluster Computing (CLUSTER 2007). September 2007
BibTeX
-
Design and implementation of a one-sided communication interface for the IBM eServer Blue Gene supercomputer
Michael Blocksome, Charles Archer, Todd Inglett, Pat McCarthy, Mike Mundy, Joe Ratterman, Albert Sidelnik, Brian Smith, Gheorghe Almasi, Jose Castanos, Derek Lieber, Jose Moreira, Sriram Krishnamoorthy, and Vinod Tipparaju
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2006). November 2006
BibTeX
-
An extensible global address space framework with decoupled task and data abstractions
S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, A. Rountev, and P. Sadayappan.
IPDPS Workshop on Next Generation Software (NGS 2006).
BibTeX
Dynamic load balancing
-
Load Balancing on Single- and Multi-GPU Systems
L. Chen, O. Villa, S. Krishnamoorthy, and G. Gao
Proceedings of the 24th IEEE International Parallel & Distributed Processing Symposium, April 2010 (To Appear)
-
High Performance Molecular Dynamic Simulation on Single and Multi-GPU Systems
O. Villa, L. Chen, and S. Krishnamoorthy
Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS) 2010 (To Appear)
-
Scalable Work Stealing
J. Dinan, S. Krishnamoorthy, B. Larkins, J. Nieplocha, P. Sadayappan
Supercomputing (SC) 2009, November 2009
-
An Integrated Approach to Locality-Conscious Processor Allocation and Scheduling of Mixed-Parallel Applications
N. Vydyanathan, S. Krishnamoorthy, G.M. Sabin, U.V. Catalyurek, T.M. Kurc, P. Sadayappan, J.H. Saltz
IEEE Transactions on Parallel and Distributed Systems 20(8): 1158-1172, 2009
BibTeX
-
Solving large, irregular graph problems using adaptive work-stealing
G. Cong, S. Kodali, S. Krishnamoorthy, D. Lea, V. Saraswat, T. Wen
Proceedings of the International Conference on Parallel Processing (ICPP'08), September 2008.
BibTeX
-
Scioto: a framework for global-view task parallelism
J. Dinan, S. Krishnamoorthy, B. Larkins, J. Nieplocha, and P. Sadayappan
Proceedings of the International Conference on Parallel Processing (ICPP'08), September 2008.
BibTeX
-
Integrated Data and Task Management for Scientific Applications
J. Nieplocha, S. Krishnamoorthy, M. Valiev, M. Krishnan, B. Palmer, and P. Sadayappan
Proceedings of the 8th International Conference on Computational Science (ICCS 2008), June 2008, Krakow, Poland.
BibTeX
-
Hypergraph partitioning for automatic memory hierarchy management
S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, and P. Sadayappan.
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2006). November 2006
BibTeX
-
Locality conscious processor allocation and scheduling for mixed-parallel applications
N. Vydyanathan, S. Krishnamoorthy, G. Sabin, U. Catalyurek, T. Kurc, P. Sadayappan, and J. Saltz.
Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER 2006). September 2006
BibTeX
-
An integrated approach for processor allocation and scheduling of mixed-parallel applications
N. Vydyanathan, S. Krishnamoorthy, G. Sabin, U. Catalyurek, T. Kurc, P. Sadayappan, and J. Saltz.
The 35th International Conference on Parallel Processing (ICPP 2006)
BibTeX
-
An approach to locality-conscious load balancing and transparent memory hierarchy management with a global-address-space parallel programming model
S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, and P. Sadayappan.
IPDPS Workshop on Performance Optimization for High-Level Languages and Libraries (POHLL 2006)
BibTeX
-
An extensible global address space framework with decoupled task and data abstractions
S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, A. Rountev, and P. Sadayappan.
IPDPS Workshop on Next Generation Software (NGS 2006).
BibTeX
-
Task scheduling and file replication for data-intensive jobs with batch-shared I/O
G. Khanna, N. Vydyanathan, U. Catalyurek, T. Kurc, S. Krishnamoorthy, P. Sadayappan, J. Saltz
The 15th IEEE International Symposium on High Performance Distributed Computing (HPDC 2006)
BibTeX
-
Data and computation abstractions for dynamic and irregular computations
S. Krishnamoorthy, J. Nieplocha, P. Sadayappan.
The 12th Annual International Conference on High Performance Computing (HiPC 2005)
BibTeX
-
Locality-aware load balancing for dynamic and irregular computations
S. Krishnamoorthy, P. Sadayappan, J. Nieplocha, and M. Krishnan
Workshop on Patterns in High Performance Computing. May 2005
Fault tolerance
-
Data-driven Fault Tolerance for Work Stealing Computations
W. Ma, S. Krishnamoorthy
26th International Conference on Supercomputing (ICS), June 2012
-
Multi-fault Tolerance for Cartesian Data Distributions
N. Ali, S. Krishnamoorthy, M. Halappanavar, J. Daily
International Journal of Parallel Programming, Computing Frontiers special issue (Accepted)
-
Application-Specific Fault Tolerance via Data Access Characterization
N. Ali, S. Krishnamoorthy, N. Govind, K. Kowalski, P. Sadayappan
Euro-Par 2011, August 2011
-
Tolerating Correlated Failures for Generalized Cartesian Distributions via Bipartite Matching
N. Ali, S. Krishnamoorthy, M. Halappanavar, and J. Daily
ACM International Conference on Computing Frontiers (CF'11), May 2011 (Accepted)
-
A Redundant Communication Approach to Scalable Fault Tolerance in PGAS Programming Models
N. Ali, S. Krishnamoorthy, N. Govind, B. Palmer
19th Euromicro International Conference on Parallel, Distributed and Network-Based Computing, February 2011
-
Selective Recovery From Failures In A Task Parallel Programming Model
J. Dinan, A. Singri, P. Sadayappan, and S. Krishnamoorthy
Proceedings of the 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing -- Resilience Workshop. May 2010 (To Appear)
-
Scalable transparent checkpoint-restart of global address space applications on virtual machines over InfiniBand
O. Villa, S. Krishnamoorthy, J. Nieplocha, D.M. Brown Jr.
Conference on Computing Frontiers 2009, April 2009.
BibTeX
Optimizations for accelerators
-
GPU-Based Implementations of the Noniterative Regularized-CCSD(T) Corrections: Applications to Strongly Correlated Systems
W. Ma, S. Krishnamoorthy, O. Villa, K. Kowalski
Journal of Chemical Theory and Computation, (Accepted)
-
Practical Loop Transformations for Tensor Contraction Expressions on Multi-Level Memory Hierarchies
W. Ma, S. Krishnamoorthy, G. Agrawal
International Conference on Compiler Construction (CC'11), April 2011
-
Efficient Sparse Matrix-Matrix Multiplication on Heterogeneous High Performance Systems
J. Siegel, O. Villa, S. Krishnamoorthy, A. Tumeo, and X. Li
Workshop on Application/Architecture Co-design for Extreme-scale Computing (AACEC). September 2010
-
Acceleration of Streamed Tensor Contraction Expressions on GPGPU-based Clusters
W. Ma, S. Krishnamoorthy, O. Villa, and K. Kowalski
IEEE International Conference on Cluster Computing (CLUSTER). September 2010
-
Load Balancing on Single- and Multi-GPU Systems
L. Chen, O. Villa, S. Krishnamoorthy, and G. Gao
Proceedings of the 24th IEEE International Parallel & Distributed Processing Symposium (IPDPS), April 2010
Performance tools
-
Scalable communication trace compression
S. Krishnamoorthy and K. Agarwal
Proceedings of the 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. May 2010 (To Appear)
Compile-time + runtime optimizations
-
Performance Optimization of Tensor Contraction Expressions for Many Body Methods in Quantum Chemistry
Q. Lu, A. Hartono, T. Henretty, S. Krishnamoorthy, H. Zhang, G. Baumgartner, D.E. Bernholdt, M. Nooijen, R.M. Pitzer, J. Ramanujam, and P. Sadayappan
The Journal of Physical Chemistry (accepted)
-
Data Layout Transformation for Enhancing Locality on NUCA Chip Multiprocessors
Q. Lu, C. Alias, U. Bondhugula, T. Henretty, S. Krishnamoorthy, J. Ramanujam, A. Rountev, P. Sadayappan, Y. Chen, H. Lin, and T. Ngai
18th International Symposium on Parallel Architectures and Compilation Techniques (PACT-18), September 2009
-
Parametric multi-level tiling of imperfectly nested loops
A. Hartono, M. Baskaran, C. Bastoul, A. Cohen, S. Krishnamoorthy, B. Norris, J. Ramanujam, P. Sadayappan
ICS 2009: 147-157, June 2009.
BibTeX
-
A compiler framework for optimization of affine loop nests for GPGPUs
M. Baskaran, U. Bondhugula, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan
Proceedings of the International Conference on Supercomputing (ICS'08), June 2008, Island of Kos, Greece.
BibTeX
-
Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model
Uday Bondhugula, Muthu Manikandan Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan
Proceedings of the International Conference on Compiler Construction (ETAPS CC'08) April 2008, Budapest, Hungary.
BibTeX
-
Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories
M. Baskaran, Uday Bondhugula, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan.
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'08) February 2008
BibTeX
-
Efficient search-space pruning for integrated fusion and tiling transformations
X. Gao, S. Krishnamoorthy, S. Sahoo, C. Lam, G. Baumgartner, J. Ramanujam, and P. Sadayappan.
Concurrency and Computation: Practice and Experience, 2007
BibTeX
-
Effective automatic parallelization of stencil computations
S. Krishnamoorthy, M. Baskaran, U. Bondhugula, J. Ramanujam, A. Rountev, and P. Sadayappan.
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2007). June 2007
BibTeX
-
Identifying cost-effective common subexpressions to reduce operation count in tensor contraction evaluations
A. Hartono, Q. Lu, X. Gao, S. Krishnamoorthy, M. Nooijen, G. Baumgartner, V. Choppella, D. E. Bernholdt, R. M. Pitzer, J. Ramanujam, A. Rountev, and P. Sadayappan.
The 6th International Conference on Computational Science (ICCS 2006)
BibTeX
-
Combining analytical and empirical approaches in tuning matrix transposition
Q. Lu, S. Krishnamoorthy, and P. Sadayappan.
Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques (PACT 2006)
BibTeX
-
Search-based performance-model driven optimization for compilation of tensor contraction expressions
X. Gao, S. Krishnamoorthy, Q. Lu, V. Choppella, G. Baumgartner, J. Ramanujam, and P. Sadayappan.
The 12th Workshop on Compilers for Parallel Computers (CPC 2006). Coruna, Spain.
BibTeX
-
Efficient search-space pruning for integrated fusion and tiling transformations
X. Gao, S. Krishnamoorthy, S. K. Sahoo, C. Lam, G. Baumgartner, J. Ramanujam, P. Sadayappan.
The 18th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2005)
BibTeX
-
Automatic code generation for many-body electronic structure methods: the tensor contraction engine
A. Auer, G. Baumgartner, D. E. Bernholdt, A. Bibireata, V. Choppella, D. Cociorva, X. Gao, R. Harrison, S. Krishnamoorthy, S. Krishnan, C. Lam, M. Nooijen, R. Pitzer, J. Ramanujam, P. Sadayappan and A. Sibiryakov.
Molecular Physics vol:104(2), pp:211-228. January 2006
BibTeX
-
Integrated loop optimizations for data locality enhancement of tensor contraction expressions
S. K. Sahoo, S. Krishnamoorthy, R. Panuganti, P. Sadayappan.
Supercomputing (SC 2005)
BibTeX
-
Cache miss characterization and data locality optimization for imperfectly nested loops on shared memory multiprocessors
S. K. Sahoo, R. Panuganti, S. Krishnamoorthy, P. Sadayappan.
19th IEEE International Parallel & Distributed Processing Symposium. (IPDPS 2005)
BibTeX
-
Synthesis of high-performance parallel programs for a class of ab initio quantum chemistry models
G. Baumgartner, A. Auer, D.E. Bernholdt, A. Bibireata, V. Choppella, D. Cociorva, X. Gao, R.J. Harrison, S. Hirata, S. Krishnamoorthy, S. Krishnan, C. Lam, Q. Lu, M. Nooijen, R.M. Pitzer, J. Ramanujam, P. Sadayappan and A. Sibiryakov.
Proceedings of the IEEE. vol: 93(2) pp:276-292 February 2005.
BibTeX
-
Empirical performance-model driven data layout optimization
Q. Lu, X. Gao, S. Krishnamoorthy, G. Baumgartner, J. Ramanujam and P. Sadayappan.
The 17th International Workshop on Languages and Compilers for Parallel Computing. (LCPC 2004)
BibTeX
Out-of-core algorithms
-
Layout transformation support for the disk resident arrays framework
S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Nieplocha, and P. Sadayappan.
Journal of Supercomputing. vol: 36(2) pp:153-170 May 2006
BibTeX
-
Efficient synthesis of out-of-core algorithms using a nonlinear optimization solver
Sandhya Krishnan, Sriram Krishnamoorthy, Gerald Baumgartner, Chi-Chung Lam, J. Ramanujam, P. Sadayappan, and Venkatesh Choppella
Journal of Parallel and Distributed Computing (IPDPS Special Issue) vol:66(5) pp:659-673. May 2006
BibTeX
-
Efficient synthesis of out-of-core algorithms using a nonlinear optimization solver (Best Paper Award)
S. Krishnan, S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Ramanujam, P. Sadayappan and V. Choppella.
The 18th International Parallel & Distributed Processing Symposium. (IPDPS 2004).
BibTeX
-
Layout transformation support for the disk resident arrays framework
S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Nieplocha and P. Sadayappan.
The Los Alamos Computer Science Initiative Symposium. (LACSI 2004)
BibTeX
-
Efficient layout transformation support for disk-based multidimensional arrays
S. Krishnamoorthy, G. Baumgartner, C. Lam, J. Nieplocha and P. Sadayappan.
The 11th Annual International Conference on High Performance Computing. (HiPC 2004)
BibTeX
-
Efficient parallel out-of-core matrix transposition
S. Krishnamoorthy, G. Baumgartner, Daniel Cociorva, C. Lam and P. Sadayappan.
International Journal of High Performance Computing and Networking. vol:2(2/3/4) pp:110-119 2004
BibTeX
-
Data locality optimization for synthesis of efficient out-of-core algorithms (Best Paper Award)
Sandhya Krishnan, Sriram Krishnamoorthy, G. Baumgartner, D. Cociorva, C. Lam, P. Sadayappan, J. Ramanujam, David E. Bernholdt and V. Choppella.
The 10th Annual International Conference on High Performance Computing. (HiPC 2003). December 2003.
BibTeX
-
Efficient parallel out-of-core matrix transposition
S. Krishnamoorthy, G. Baumgartner, D. Cociorva, C. Lam and P. Sadayappan.
IEEE International Conference on Cluster Computing (CLUSTER 2003). December 2003
BibTeX