
Materials Science Workflows

Introduction:

In materials science and the fields it draws on (physics, chemistry, nanotechnology, and data science), computational workflows have become indispensable for automating, managing, and analyzing complex simulations to accelerate materials discovery and design. These workflows range from flexible simulation environments to robust high-throughput automation systems, and they let researchers handle atomistic calculations such as density functional theory (DFT), molecular dynamics (MD), and machine learning (ML)/artificial intelligence (AI) driven predictions efficiently. Open-source workflow systems provide modular, scalable solutions that integrate with popular computational codes and support applications from small-scale studies to large-scale materials screening. The table below summarizes prominent open-source workflows, detailing their fields of application, development teams, supported simulations, and key features. It serves as a guide for researchers selecting the most suitable workflow for their computational needs, fostering reproducible, data-driven advances in materials science.


Motivation:

Computational workflows are critical because they streamline repetitive tasks, reduce human error, and make it possible to process the vast datasets needed to discover new materials with tailored properties. By automating simulations and recording data provenance, these tools enhance reproducibility, accelerate research cycles, and facilitate collaboration across disciplines. In an era of big data and AI-driven discovery, workflows help researchers tackle complex challenges, such as designing sustainable energy materials or advanced semiconductors, efficiently and cost-effectively.


Purpose:

This table compiles prominent open-source, free-to-use computational workflows to guide researchers in selecting tools that best fit their simulation needs, whether for small-scale prototyping or large-scale high-throughput screening. It provides a concise overview of each workflow's applications, supported simulations, scalability, and unique features, so that users can compare options and adopt solutions that enhance productivity and reproducibility. Designed for materials scientists, physicists, chemists, and data scientists, this resource aims to foster data-driven advances by promoting accessible, community-supported tools.


Key Term Definitions:

[Figure: database cartoon]



Columns: Name; Field/Area; Scale of Workflows; Supported Simulations; Key Tools/Features; Tutorials/Resources; Responsible Group; Institute/Affiliation; Active; Description
Name: ASE
Field/Area: Atomistic Simulations
Scale of Workflows: Small-scale (e.g., single molecules) to large-scale high-throughput computations
Supported Simulations: DFT, MD, NEB, Transport, Structure Optimization, Thermochemistry, Phonon calculations using all the major first-principles codes
Key Tools/Features: Python interfaces, ASE Calculators, Structure optimization, Visualization, ASE ecosystem, other ASE modules
Tutorials/Resources: Tutorials-ASE, Video-Tutorials
Responsible Group: ASE Developers including Ask Hjorth Larsen
Institute/Affiliation: DTU, Denmark and others
Active: Yes
Description: An open-source Python library for setting up, manipulating, running, visualizing, and analyzing atomistic simulations.

Name: AiiDA
Field/Area: High-Throughput Simulations, Workflow Management
Scale of Workflows: Medium-scale to large-scale high-throughput workflows
Supported Simulations: DFT, MD, High-Throughput Materials Screening, Monte Carlo Simulations, Machine-Learning Potential Calculations; supports all major materials science codes
Key Tools/Features: AiiDA plugins, Workflows, Provenance, AiiDAlab
Tutorials/Resources: Tutorials-AiiDA, Video-Tutorials, Hands-on-video
Responsible Group: AiiDA Team, led by Prof. Nicola Marzari and Giovanni Pizzi
Institute/Affiliation: École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
Active: Yes
Description: An open-source Python framework for managing, automating, and preserving computational materials science workflows with full data provenance.

Name: Jobflow
Field/Area: Materials Science, Computational Chemistry and Physics, and High-Throughput Computing
Scale of Workflows: Medium-scale to large-scale high-throughput workflows, run locally or on HPC
Supported Simulations: DFT, MD, Machine learning-based property predictions, High-throughput materials screening, Custom workflows for user-defined computational tasks
Key Tools/Features: Job and Flow Objects, Dynamic Workflows, Database Integration, Tutorials and Support
Tutorials/Resources: Tutorials-JobFlow
Responsible Group: Materials Project team including Alex Ganose, Anubhav Jain, Gian-Marco Rignanese
Institute/Affiliation: Lawrence Berkeley National Laboratory (LBNL), University of California, Berkeley, UCLouvain, UC San Diego
Active: Yes
Description: An open-source Python library for creating and executing computational workflows. It allows users to define dynamic and complex workflows to accelerate computational materials science.

Name: Atomate
Field/Area: High-Throughput Materials Discovery, Workflow Automation
Scale of Workflows: Single material simulations to large-scale high-throughput campaigns
Supported Simulations: DFT, Elastic Property Calculations, BoltzTraP, Dielectric Properties, NEB, XAS and EELS spectra, Equation of State, Piezoelectric tensor, Ferroelectricity
Key Tools/Features: Atomate Workflow
Tutorials/Resources: Tutorials-Atomate, Create-Customize-Workflow
Responsible Group: Atomate Development Team led by Anubhav Jain
Institute/Affiliation: Lawrence Berkeley National Laboratory (LBNL), University of California, Berkeley
Active: Yes
Description: An open-source Python framework for automating high-throughput materials science simulations, built on pymatgen, Custodian, and FireWorks, with a focus on pre-built, reproducible workflows.

Name: pymatgen
Field/Area: Materials Informatics and Materials Analysis
Scale of Workflows: Analysis and preprocessing of small to large-scale high-throughput workflows
Supported Simulations: DFT, Phase diagrams, Pourbaix diagrams, Diffusion analyses, Reactions, Structure analysis; supports VASP, ABINIT, QE, etc.
Key Tools/Features: pymatgen-analysis-diffusion, pymatgen-analysis-defects, I/O add-ons
Tutorials/Resources: Tutorials-pymatgen, Video-Tutorials
Responsible Group: Pymatgen Development Team led by Prof. Shyue Ping Ong
Institute/Affiliation: Lawrence Berkeley National Laboratory (LBNL), University of California, Berkeley
Active: Yes
Description: An open-source Python library for materials analysis, simulation setup, and data processing, designed to accelerate computational materials science and informatics.

Name: Custodian
Field/Area: Job management workflows for Materials Science
Scale of Workflows: Small to large-scale (high-throughput) error-handling workflows
Supported Simulations: Various DFT codes (VASP, QE, QChem, NWChem, etc.), MD simulations, high-throughput error handling
Key Tools/Features: Error Correction, Job Management, Modular Framework, Validation, custodian package
Tutorials/Resources: Tutorials-Custodian, Custodian-documentation
Responsible Group: Custodian Development Team led by Prof. Shyue Ping Ong
Institute/Affiliation: Lawrence Berkeley National Laboratory (LBNL), University of California, Berkeley
Active: Yes
Description: An open-source Python package that automates error correction and job management for computational materials science simulations.

Name: FireWorks
Field/Area: High-Throughput Computing and Workflow Management
Scale of Workflows: Small-scale (single jobs) to large-scale high-throughput workflows
Supported Simulations: DFT, MD, Monte Carlo Simulations, and Custom Computational Tasks
Key Tools/Features: Firetasks, Dynamic Workflows, Duplicate Detection, Multi-Firetask
Tutorials/Resources: Tutorials-FireWorks, Video-Tutorial
Responsible Group: FireWorks Development Team led by Anubhav Jain
Institute/Affiliation: Lawrence Berkeley National Laboratory (LBNL), University of California, Berkeley
Active: Yes
Description: An open-source Python-based workflow management system for defining, executing, and monitoring high-throughput computational workflows.

Name: pyiron
Field/Area: Atomistic Modeling and Materials Informatics
Scale of Workflows: Small-scale (interactive prototyping) to large-scale high-throughput workflows
Supported Simulations: DFT, MD, Thermodynamic Integration; supports VASP, LAMMPS, etc.
Key Tools/Features: pyiron package, compatible with ASE, Feedback Loops, Integrated visualization (NGLview)
Tutorials/Resources: Tutorials-pyiron, Video-Tutorial, YouTube-Channel
Responsible Group: Pyiron Development Team led by Prof. Jörg Neugebauer
Institute/Affiliation: Computational Materials Design (CM) department, Max-Planck-Institut für Eisenforschung (MPIE)
Active: Yes
Description: An open-source Python-based integrated development environment (IDE) for computational materials science, combining simulation, workflow management, and data handling in a unified platform.

Name: ASR
Field/Area: Computational Materials Science, High-Throughput Materials Modeling
Scale of Workflows: Small-scale (single material simulations) to large-scale high-throughput
Supported Simulations: DFT, Many-Body Perturbation Theory (GW, BSE), Phonon Calculations, Structural Optimization, and Band Topology; supports GPAW
Key Tools/Features: Atomic Simulation Recipes, Source code of ASR
Tutorials/Resources: Video-Tutorial
Note: ASE and ASR are related.
Responsible Group: Atomic Simulation Recipes Team led by Prof. Kristian Sommer Thygesen
Institute/Affiliation: CAMD, DTU, Denmark
Active: Yes
Description: An open-source Python framework for automating high-throughput atomistic simulations, built on ASE and optimized for GPAW, with a focus on materials discovery.

Name: signac
Field/Area: Materials Science, Data Management for Scientific Workflows
Scale of Workflows: Small-scale to large-scale high-throughput workflows (hundreds of thousands of jobs)
Supported Simulations: DFT, MD, Monte Carlo Simulations, and General Computational Workflows
Key Tools/Features: API Reference, FlowProject, other/different packages
Tutorials/Resources: Tutorials-signac, Video-Tutorial
Responsible Group: signac Development Team led by Prof. Sharon C. Glotzer
Institute/Affiliation: University of Michigan, USA
Active: Yes
Description: An open-source Python framework for managing file-based computational workflows, designed to streamline data management.

Name: MyQueue
Field/Area: Computational Science, HPC, Task and workflow scheduling system
Scale of Workflows: Small-scale (single jobs) to large-scale workflows (thousands of tasks)
Supported Simulations: General Computational Tasks, including Density Functional Theory (DFT) and Molecular Dynamics (MD)
Key Tools/Features: Command-Line Interface (mq), Python API, Scheduler Support, Personal Queue
Tutorials/Resources: Quick-start, Workflow-Tutorials
Responsible Group: MyQueue Development Team led by Prof. Kristian Sommer Thygesen
Institute/Affiliation: CAMD, DTU, Denmark
Active: Yes
Description: An open-source Python tool for simplifying task submission and workflow management on HPC clusters, acting as a frontend for SLURM, PBS, and LSF.

Name: CWL-Airflow
Field/Area: Bioinformatics, Computational Biology, and Workflow Management
Scale of Workflows: Suitable for smaller or less computationally intensive pipelines
Supported Simulations: ChIP-Seq data analysis, Super-enhancer identification, RNA sequencing (RNA-Seq)
Key Tools/Features: Common Workflow Language (CWL), Apache Airflow, Docker Support, REST API and Airflow UI
Tutorials/Resources: Quick-start, Tutorials-CWL-Airflow
Responsible Group: CWL-Airflow Development Team (Michael Kotliar, Andrey V. Kartashov, and Artem Barski)
Institute/Affiliation: Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
Active: Yes
Description: A Python package that extends Apache Airflow to execute CWL workflows, providing scalable pipeline management for bioinformatics and data-intensive research.

Name: nanoHUB
Field/Area: Computational nanotechnology research, education, and collaboration
Scale of Workflows: Small to medium scale
Supported Simulations: Nanoelectronics, Metal-Oxide-Semiconductor, ABACUS for semiconductor, DFT
Key Tools/Features: Rappture Toolkit, Other tools, Educational courses, Machine Learning Lab Module
Tutorials/Resources: YouTube-Channel
Responsible Group: Network for Computational Nanotechnology (NCN) Team including Prof. Mark S. Lundstrom
Institute/Affiliation: Purdue University, West Lafayette, Indiana, USA, Rosen Center for Advanced Computing (RCAC), Network for Computational Nanotechnology (NCN)
Active: Yes
Description: An open-source, free-to-use hybrid platform (a data repository with workflow capabilities) that offers various tools and educational courses.

Name: wfl (libAtoms Workflow)
Field/Area: Computational Materials Science, Atomistic Simulations, Cheminformatics, and MD
Scale of Workflows: Small to large-scale
Supported Simulations: Energy and Force Calculations, Geometry optimization (DFT), MD, Gaussian Approximation Potential (GAP), Atomic Cluster Expansion (ACE)
Key Tools/Features: ConfigSet/OutputSpec, Autoparallelize, Calculator Integration, Normal Mode Generation, GAP/ACE Fitting
Tutorials/Resources: Example-Tutorial, Task-parallelize, Jobs-queue
Responsible Group: LibAtoms developer community, with contributors like N. Bernstein, T. K. Stenczel, and E. Gelzinyte
Institute/Affiliation: University of Warwick, and others (libAtoms GitHub organization)
Active: Yes
Description: An open-source Python package for parallelizing materials science workflows, processing atomic structures for tasks with tools such as CASTEP and VASP, and scaling computations from local to cluster environments.
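
Because the table is meant to help readers choose a tool by the simulations it supports, the comparison can also be done programmatically. The sketch below is only an illustration, not part of any of the listed packages: the names Workflow, CATALOG, and select are hypothetical, and the catalog is a hand-transcribed, abbreviated subset of the "Supported Simulations" column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    """One table row, reduced to its name and supported simulation types."""
    name: str
    simulations: frozenset


# Hand-transcribed, abbreviated subset of the table (illustrative only).
CATALOG = [
    Workflow("ASE", frozenset({"DFT", "MD", "NEB", "Phonon"})),
    Workflow("AiiDA", frozenset({"DFT", "MD", "Monte Carlo"})),
    Workflow("Atomate", frozenset({"DFT", "NEB"})),
    Workflow("FireWorks", frozenset({"DFT", "MD", "Monte Carlo"})),
    Workflow("ASR", frozenset({"DFT", "GW", "BSE", "Phonon"})),
]


def select(catalog, required):
    """Return the names of workflows that list every required simulation type."""
    required = frozenset(required)
    # A workflow qualifies when the required set is a subset of what it supports.
    return [w.name for w in catalog if required <= w.simulations]


print(select(CATALOG, {"DFT", "NEB"}))         # ['ASE', 'Atomate']
print(select(CATALOG, {"MD", "Monte Carlo"}))  # ['AiiDA', 'FireWorks']
```

A match here only means the table lists that simulation type for the tool; scale, scheduler support, and the tutorials linked in each entry still need to be checked by hand.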


Created by Manoar Hossain