Supercomputing for Modeling and Simulation

Training Objectives

This training program develops a comprehensive understanding of high-performance computing (HPC) and its role in modern modeling and simulation. It also emphasizes the national importance of building indigenous computational capabilities and reducing reliance on external resources.

The program equips participants with the theoretical knowledge and practical skills required to use, manage, and optimize workloads on supercomputing systems.

Training Outcomes

By the end of this training, participants will understand the importance of simulation across various domains and the fundamental concepts of high-performance computing. They will be able to work confidently in a Linux-based HPC environment, write and submit jobs through a workload manager (Slurm), and run real-world scientific applications.

Participants will also be able to profile their computational tasks and apply optimization techniques to improve performance, making more efficient use of HPC resources.

Training Modules

Module 1: Importance of Simulation and National Need for Indigenous Capability

1.1 Role of simulation in science, engineering, defense, industry, healthcare, agriculture, and design

1.2 Risks of reliance on external software and computational resources

1.3 Need for local expertise and national supercomputing capability

Module 2: Introduction to Supercomputing for Modeling and Simulation

2.1 Overview of High-Performance Computing (HPC)

2.2 Difference between conventional computing and HPC

2.3 Key components: nodes, CPUs, cores, memory, storage, interconnects

2.4 Importance of HPC in simulation workloads

Module 3: HPC Architectures and Toolkits for Simulation Workloads

3.1 Cluster architecture and system design

3.2 Shared vs distributed memory models

3.3 Parallel programming models: MPI, OpenMP, and hybrid approaches

3.4 Compilers, numerical libraries, and toolchains

Module 4: HPC Software Stack for Scientific Computing

4.1 Linux-based HPC environment

4.2 Module systems and package management

4.3 Compilers and scientific libraries

4.4 Runtime environment and dependency management

Module 5: Hands-on Job Management with Slurm

5.1 Cluster access and user environment setup

5.2 Writing and submitting job scripts

5.3 Monitoring job execution and queue status

5.4 Resource allocation and management

5.5 Debugging and handling job failures
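
As a rough illustration of topics 5.2 and 5.4, the sketch below assembles the text of a small Slurm batch script from a resource request. The helper name render_job_script, the job name md_test, and the executable ./my_simulation are hypothetical placeholders, and the resource values are arbitrary. Real clusters usually also need site-specific settings (such as a partition or account) that are not shown here.

```python
def render_job_script(job_name, nodes, tasks_per_node, walltime, command):
    """Return the text of a Slurm batch script for the given resource request.

    All arguments are illustrative; adapt them to the target cluster.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",                      # number of nodes
        f"#SBATCH --ntasks-per-node={tasks_per_node}",   # tasks (e.g. MPI ranks) per node
        f"#SBATCH --time={walltime}",                    # wall-clock limit, HH:MM:SS
        "#SBATCH --output=%x-%j.out",                    # %x = job name, %j = job ID
        "",
        f"srun {command}",                               # launch the tasks within the allocation
    ]
    return "\n".join(lines) + "\n"


# Hypothetical request: 2 nodes, 16 tasks per node, one hour of wall time.
script = render_job_script("md_test", 2, 16, "01:00:00", "./my_simulation")
print(script)
```

The printed text can be saved to a file and submitted with sbatch. After submission, squeue shows the job's position in the queue (topic 5.3).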

Module 6: Scientific Workflows on HPC (MD & DFT)

6.1 Molecular Dynamics (MD) using GROMACS

6.2 Density Functional Theory (DFT) using Quantum ESPRESSO

6.3 Execution of real-world simulation workflows on HPC systems

Module 7: Performance Optimization Techniques

7.1 Performance profiling and analysis

7.2 Scalability and load balancing

7.3 Memory optimization and efficient resource utilization

7.4 Job tuning and performance improvement strategies
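
To give a feel for the scalability discussion in topic 7.2, the sketch below evaluates Amdahl's law, a standard idealized model in which a fixed fraction of the run time is parallelizable and the rest stays serial. The function names and the 95% parallel fraction are assumptions chosen for illustration. Real applications also pay communication and load-imbalance costs that this model ignores.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Ideal speedup under Amdahl's law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def parallel_efficiency(parallel_fraction, workers):
    """Speedup divided by the number of workers (1.0 means perfect scaling)."""
    return amdahl_speedup(parallel_fraction, workers) / workers


# Assumed example: a code whose run time is 95% parallelizable.
for workers in (1, 4, 16, 64, 256):
    speedup = amdahl_speedup(0.95, workers)
    efficiency = parallel_efficiency(0.95, workers)
    print(f"{workers:4d} workers: speedup {speedup:6.2f}, efficiency {efficiency:6.1%}")
```

Under this model, the speedup of a 95% parallel code can never exceed 20 however many workers are added. This is why requesting more cores for a job does not always shorten its run time.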