Introduction:
In the rapidly evolving field of materials science, which draws on physics, chemistry, nanotechnology, and data science, computational workflows have become indispensable for automating, managing, and analyzing complex simulations that accelerate materials discovery and design. These workflows range from flexible simulation environments to robust high-throughput automation systems, and they let researchers efficiently handle atomistic calculations such as density functional theory (DFT), molecular dynamics (MD), and predictions driven by machine learning (ML) and artificial intelligence (AI). Open-source workflow systems provide modular, scalable solutions that integrate with popular computational codes and support diverse applications, from small-scale studies to large-scale materials screening. The following table summarizes prominent open-source workflows, detailing their fields of application, development teams, supported simulations, and key features. This compilation serves as a guide for researchers selecting the most suitable workflow for their computational needs, fostering reproducible, data-driven advances in materials science.
Motivation:
Computational workflows are critical because they streamline repetitive tasks, reduce human error, and enable the processing of vast datasets, all of which are essential for discovering new materials with tailored properties. By automating simulations and ensuring data provenance, these tools enhance reproducibility, accelerate research cycles, and facilitate collaboration across disciplines. In an era of big data and AI-driven discovery, workflows empower researchers to tackle complex challenges, such as designing sustainable energy materials or advanced semiconductors, efficiently and cost-effectively, making them vital for scientific and technological breakthroughs.
Purpose:
This table compiles prominent open-source, free-to-use computational workflows to guide researchers in selecting tools that best fit their simulation needs, whether for small-scale prototyping or large-scale high-throughput screening. It provides a concise overview of each workflow's applications, supported simulations, scalability, and unique features, so that users can compare options and adopt solutions that enhance productivity and reproducibility. Designed for materials scientists, physicists, chemists, and data scientists, this resource aims to foster data-driven advances by promoting accessible, community-supported tools.
Key Term Definitions:

Name | Field/Area | Scale of Workflows | Supported Simulations | Key Tools/Features | Tutorials/Resources | Responsible Group | Institute/Affiliation | Active | Description |
---|---|---|---|---|---|---|---|---|---|
ASE | Atomistic Simulations | Small-scale (e.g., single molecules) to large-scale high-throughput computations | DFT, MD, NEB, Transport, Structure Optimization, Thermochemistry, Phonon calculations using all the major first-principles codes | Python interfaces, ASE Calculators, Structure optimization, Visualization, ASE ecosystem, other ASE modules | | ASE Developers including Ask Hjorth Larsen | DTU, Denmark and others | Yes | An open-source Python library for setting up, manipulating, running, visualizing, and analyzing atomistic simulations. |
AiiDA | High-Throughput Simulations, Workflow Management | Medium-scale to large-scale high-throughput workflows | DFT, MD, High-Throughput Materials Screening, Monte Carlo Simulations, Machine-Learning Potential Calculations; supports all major materials science codes | AiiDA plugins, Workflows, Provenance, AiiDAlab | | AiiDA Team, led by Prof. Nicola Marzari and Giovanni Pizzi | École Polytechnique Fédérale de Lausanne (EPFL), Switzerland | Yes | An open-source Python framework for managing, automating, and preserving computational materials science workflows with full data provenance. |
jobflow | Materials Science, Computational Chemistry and Physics, High-Throughput Computing | Medium-scale to large-scale high-throughput workflows, run locally or on HPC | DFT, MD, Machine learning-based property predictions, High-throughput materials screening, Custom workflows for user-defined computational tasks | Job and Flow Objects, Dynamic Workflows, Database Integration, Tutorials and Support | | Materials Project team including Alex Ganose, Anubhav Jain, Gian-Marco Rignanese | Lawrence Berkeley National Laboratory (LBNL), University of California, Berkeley, UCLouvain, UC San Diego | Yes | An open-source Python library for creating and executing computational workflows. It allows users to define dynamic and complex workflows to accelerate computational materials science. |
Atomate | High-Throughput Materials Discovery, Workflow Automation | Single material simulations to large-scale high-throughput campaigns | DFT, Elastic Property Calculations, BoltzTraP, Dielectric Properties, NEB, XAS and EELS spectra, Equation of State, Piezoelectric tensor, Ferroelectricity | Atomate Workflow | | Atomate Development Team led by Anubhav Jain | Lawrence Berkeley National Laboratory (LBNL), University of California, Berkeley | Yes | An open-source Python framework for automating high-throughput materials science simulations, built on Pymatgen, Custodian, and FireWorks, with a focus on pre-built, reproducible workflows. |
Pymatgen | Materials Informatics and Materials Analysis | Analysis and preprocessing for small to large-scale high-throughput workflows | DFT, Phase diagrams, Pourbaix diagrams, Diffusion analyses, Reactions, Structure analysis; supports VASP, ABINIT, QE, etc. | pymatgen-analysis-diffusion, pymatgen-analysis-defects, I/O add-ons | | Pymatgen Development Team led by Prof. Shyue Ping Ong | Lawrence Berkeley National Laboratory (LBNL), University of California, Berkeley | Yes | An open-source Python library for materials analysis, simulation setup, and data processing, designed to accelerate computational materials science and informatics. |
Custodian | Job management workflows for Materials Science | Small to large-scale (high-throughput) error-handling workflows | Various DFT codes (VASP, QE, QChem, NWChem, etc.), MD simulations, high-throughput error handling | Error Correction, Job Management, Modular Framework, Validation, custodian package | | Custodian Development Team led by Prof. Shyue Ping Ong | Lawrence Berkeley National Laboratory (LBNL), University of California, Berkeley | Yes | An open-source Python package that automates error correction and job management for computational materials science simulations. |
FireWorks | High-Throughput Computing, Workflow Management | Small-scale (single jobs) to large-scale high-throughput workflows | DFT, MD, Monte Carlo Simulations, and Custom Computational Tasks | Firetasks, Dynamic Workflows, Duplicate Detection, Multi-Firetask | | FireWorks Development Team led by Anubhav Jain | Lawrence Berkeley National Laboratory (LBNL), University of California, Berkeley | Yes | An open-source Python-based workflow management system for defining, executing, and monitoring high-throughput computational workflows. |
pyiron | Atomistic Modeling, Materials Informatics | Small-scale (interactive prototyping) to large-scale high-throughput workflows | DFT, MD, Thermodynamic Integration; supports VASP, LAMMPS, etc. | pyiron package, compatible with ASE, Feedback Loops, Integrated visualization (NGLview) | | Pyiron Development Team led by Prof. Jörg Neugebauer | Computational Materials Design (CM) department, Max-Planck-Institut für Eisenforschung (MPIE) | Yes | An open-source Python-based integrated development environment (IDE) for computational materials science, combining simulation, workflow management, and data handling in a unified platform. |
ASR | Computational Materials Science, High-Throughput Materials Modeling | Small-scale (single material simulations) to large-scale high-throughput | DFT, Many-Body Perturbation Theory (GW, BSE), Phonon Calculations, Structural Optimization, Band Topology; supports GPAW | Atomic Simulation Recipes, Source code of ASR | Note: ASE and ASR are related | Atomic Simulation Recipes Team led by Prof. Kristian Sommer Thygesen | CAMD, DTU, Denmark | Yes | An open-source Python framework for automating high-throughput atomistic simulations, built on ASE and optimized for GPAW, with a focus on materials discovery. |
signac | Materials Science, Data Management for Scientific Workflows | Small-scale to large-scale high-throughput workflows (hundreds of thousands of jobs) | DFT, MD, Monte Carlo Simulations, and General Computational Workflows | API Reference, FlowProject, other packages | | signac Development Team led by Prof. Sharon C. Glotzer | University of Michigan, USA | Yes | An open-source Python framework for managing file-based computational workflows, designed to streamline data management. |
MyQueue | Computational Science, HPC, Task and Workflow Scheduling | Small-scale (single jobs) to large-scale workflows (thousands of tasks) | General Computational Tasks, including Density Functional Theory (DFT) and Molecular Dynamics (MD) | Command-Line Interface (mq), Python API, Scheduler Support, Personal Queue | | MyQueue Development Team led by Prof. Kristian Sommer Thygesen | CAMD, DTU, Denmark | Yes | An open-source Python tool for simplifying task submission and workflow management on HPC clusters, acting as a frontend for SLURM, PBS, and LSF. |
CWL-Airflow | Bioinformatics, Computational Biology, Workflow Management | Suitable for smaller or less computationally intensive pipelines | ChIP-Seq data analysis, Super-enhancer identification, RNA sequencing (RNA-Seq) | Common Workflow Language (CWL), Apache Airflow, Docker Support, REST API and Airflow UI | | CWL-Airflow Development Team (Michael Kotliar, Andrey V. Kartashov, and Artem Barski) | Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA | Yes | A Python package that extends Apache Airflow to execute CWL workflows, providing scalable pipeline management for bioinformatics and data-intensive research. |
nanoHUB | Computational nanotechnology research, education, and collaboration | Small to medium scale | Nanoelectronics, Metal-Oxide-Semiconductor, ABACUS for semiconductors, DFT | Rappture Toolkit, other tools, Educational courses, Machine Learning Lab Module | | Network for Computational Nanotechnology (NCN) Team including Prof. Mark S. Lundstrom | Purdue University, West Lafayette, Indiana, USA; Rosen Center for Advanced Computing (RCAC); Network for Computational Nanotechnology (NCN) | Yes | An open-source, free-to-use hybrid platform (a data repository with workflow capabilities) that offers various tools and educational courses. |
wfl | Computational Materials Science, Atomistic Simulations, Cheminformatics, MD | Small to large-scale | Energy and Force Calculations, Geometry optimization (DFT), MD, Gaussian Approximation Potential (GAP), Atomic Cluster Expansion (ACE) | ConfigSet/OutputSpec, Autoparallelize, Calculator Integration, Normal Mode Generation, GAP/ACE Fitting | | libAtoms developer community, with contributors such as N. Bernstein, T. K. Stenczel, and E. Gelzinyte | University of Warwick and others (libAtoms GitHub organization) | Yes | An open-source Python package for parallelizing materials science workflows, processing atomic structures for tasks with tools such as CASTEP and VASP, and scaling computations from local to cluster environments. |
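
Several features that recur in the table (job dependencies, automatic error handling, and provenance tracking) can be illustrated with a minimal sketch in plain Python. It does not use the API of any tool listed above; the function `run_workflow`, the toy jobs `relax`, `static`, and `analyze`, and their numerical values are all hypothetical placeholders for real simulation steps.

```python
import hashlib
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(jobs, dependencies, max_retries=1):
    """Run jobs in dependency order, retry failures, and log provenance.

    jobs: dict mapping job name -> callable taking a dict of upstream results
    dependencies: dict mapping job name -> set of upstream job names
    """
    # Make sure every job is a node in the graph, even without dependencies.
    graph = {name: set(dependencies.get(name, ())) for name in jobs}
    results, provenance = {}, []
    for name in TopologicalSorter(graph).static_order():
        inputs = {dep: results[dep] for dep in graph[name]}
        for attempt in range(1, max_retries + 2):
            try:
                results[name] = jobs[name](inputs)
                break
            except RuntimeError:
                if attempt > max_retries:
                    raise  # give up once the retry budget is exhausted
        # Record what went into the job and how many attempts it needed.
        digest = hashlib.sha256(
            json.dumps(inputs, sort_keys=True).encode()
        ).hexdigest()
        provenance.append(
            {"job": name, "inputs_sha256": digest, "attempts": attempt}
        )
    return results, provenance


# Hypothetical toy jobs standing in for real simulation steps.
calls = {"static": 0}


def relax(inputs):
    return {"lattice_constant": 4.0}


def static(inputs):
    calls["static"] += 1
    if calls["static"] == 1:
        # Simulate a transient failure on the first attempt.
        raise RuntimeError("simulated convergence failure")
    a = inputs["relax"]["lattice_constant"]
    return {"volume": a ** 3}


def analyze(inputs):
    return {"volume_per_atom": inputs["static"]["volume"] / 4}


jobs = {"relax": relax, "static": static, "analyze": analyze}
dependencies = {"static": {"relax"}, "analyze": {"static"}}

results, provenance = run_workflow(jobs, dependencies)
for record in provenance:
    print(record["job"], record["attempts"], record["inputs_sha256"][:8])
```

Running the sketch executes the three toy jobs in dependency order, retries the deliberately failing `static` job once, and records one provenance entry per job. The tools in the table build persistence, scheduling on HPC resources, and code-specific error handlers on top of this basic pattern, each with its own interface.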