ML-Based Models

Most of the following models (with the exception of UNIFAC2 models) are provided by MLThermoProperties.jl and require the package to be installed.

mod. UNIFAC 2.0 and UNIFAC 2.0

UNIFAC 2.0 and mod. UNIFAC 2.0 are enhanced versions of the classical group-contribution methods UNIFAC and mod. UNIFAC (Dortmund), respectively. Missing group-interaction parameters are predicted using matrix completion, which significantly extends the applicability of the methods and yields higher prediction accuracy than the original versions.
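To illustrate the idea behind matrix completion, the following toy sketch fills in missing entries of a small matrix by fitting a rank-1 factorization to the known entries with alternating least squares. It is only a conceptual illustration: the function name `complete_rank1`, the rank-1 assumption, and the example data are all hypothetical, and the actual UNIFAC 2.0 parameters were obtained with the method described in the references, not with this code.

```julia
# Toy rank-1 matrix completion by alternating least squares (ALS).
# `A` holds the known entries; `observed[i, j]` is true where A[i, j] is known.
function complete_rank1(A::AbstractMatrix, observed::AbstractMatrix{Bool}; iters::Int = 200)
    n, m = size(A)
    u = ones(n)
    v = ones(m)
    for _ in 1:iters
        # Update the row factors with the column factors held fixed.
        for i in 1:n
            cols = findall(observed[i, :])
            u[i] = sum(A[i, j] * v[j] for j in cols) / sum(v[j]^2 for j in cols)
        end
        # Update the column factors with the row factors held fixed.
        for j in 1:m
            rows = findall(observed[:, j])
            v[j] = sum(A[i, j] * u[i] for i in rows) / sum(u[i]^2 for i in rows)
        end
    end
    return u * v'   # completed matrix, including the previously missing entries
end

# Hypothetical example: a 4×3 rank-1 matrix with two entries treated as unknown.
u_true = [1.0, 2.0, 3.0, 4.0]
v_true = [1.0, 0.5, 2.0]
A_true = u_true * v_true'

observed = trues(4, 3)
observed[1, 3] = false
observed[4, 1] = false

A_known = A_true .* observed                  # unknown entries are set to zero
A_completed = complete_rank1(A_known, observed)
```

In this idealized setting the missing entries can be recovered because every row and column still has at least one known entry; real interaction-parameter matrices are far sparser and noisier.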

Clapeyron.UNIFAC2 - Type
UNIFACModel <: ActivityModel

UNIFAC2(components;
puremodel = PR,
userlocations = String[],
group_userlocations = String[],
pure_userlocations = String[],
verbose = false,
reference_state = nothing)

Input parameters

  • R: Single Parameter (Float64) - Normalized group Van der Waals volume
  • Q: Single Parameter (Float64) - Normalized group Surface Area
  • A: Pair Parameter (Float64, asymmetrical, defaults to 0) - Binary group Interaction Energy Parameter
  • B: Pair Parameter (Float64, asymmetrical, defaults to 0) - Binary group Interaction Energy Parameter
  • C: Pair Parameter (Float64, asymmetrical, defaults to 0) - Binary group Interaction Energy Parameter

Input models

  • puremodel: model to calculate pure pressure-dependent properties

Description

Modified UNIFAC 2.0 (Dortmund) activity model. The method is identical to UNIFAC, but with new parameters fitted by matrix completion methods.
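Example

The following is a minimal usage sketch, not taken from the package documentation. It assumes that both components have group assignments in the Clapeyron.jl database and uses the generic `activity_coefficient(model, p, T, z)` function from Clapeyron.jl; the state point and composition are arbitrary illustrative values.

```julia
using Clapeyron

# Build the model with the default pure-component model (PR).
model = UNIFAC2(["water", "ethanol"])

p, T = 101325.0, 298.15      # pressure in Pa, temperature in K (illustrative values)
x = [0.3, 0.7]               # mole fractions of water and ethanol

# Activity coefficients of both components at the given state.
γ = activity_coefficient(model, p, T, x)
```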

References

  1. Hayer, N., Hasse, H., Jirasek, F.: Modified UNIFAC 2.0-A Group-Contribution Method Completed with Machine Learning, Ind. Eng. Chem. Res. 64 (2025) 10304–10313, DOI: 10.1021/acs.iecr.5c00077.
Clapeyron.ogUNIFAC2 - Type
ogUNIFACModel <: UNIFACModel

ogUNIFAC2(components;
puremodel = PR, 
userlocations = String[],
group_userlocations = String[],
pure_userlocations = String[],
verbose = false,
reference_state = nothing)

Input parameters

  • R: Single Parameter (Float64) - Normalized group Van der Waals volume
  • Q: Single Parameter (Float64) - Normalized group Surface Area
  • A: Pair Parameter (Float64, asymmetrical, defaults to 0) - Binary group Interaction Energy Parameter

Input models

  • puremodel: model to calculate pure pressure-dependent properties

Description

UNIFAC 2.0 (UNIQUAC Functional-group Activity Coefficients) activity model, original formulation. The method is identical to ogUNIFAC, but with new parameters fitted by matrix completion methods.
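Example

As a hedged sketch of how the `puremodel` enters, the snippet below computes a bubble point with Clapeyron.jl's `bubble_pressure(model, T, x)`. It assumes that this function is supported for the model and that both components have group assignments in the database; the temperature and composition are arbitrary illustrative values.

```julia
using Clapeyron

# The pure-component model supplies the pressure-dependent pure properties.
model = ogUNIFAC2(["water", "ethanol"]; puremodel = PR)

T = 350.0                    # temperature in K (illustrative value)
x = [0.5, 0.5]               # liquid mole fractions

# Bubble pressure, liquid and vapor molar volumes, and vapor composition.
p, vl, vv, y = bubble_pressure(model, T, x)
```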

References

  1. Hayer, N., Wendel, T., Mandt, S., Hasse, H., Jirasek, F.: Advancing Thermodynamic Group-Contribution Methods by Machine Learning: UNIFAC 2.0, Chemical Engineering Journal 504 (2025) 158667, DOI: 10.1016/j.cej.2024.158667.

HANNA activity models

HANNA is a hard-constraint neural network model for the excess Gibbs energy g^E that predicts activity coefficients in a strictly thermodynamically consistent manner. It only requires the SMILES of the components and the temperature as input. The model satisfies thermodynamic boundary conditions by construction, ensuring consistency of the predicted activity coefficients.
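For orientation, the standard textbook relations that link g^E to the activity coefficients, and that a thermodynamically consistent model has to respect, are summarized below. They are general thermodynamics rather than a statement of the HANNA architecture; how the model enforces them is described in the references.

\[\frac{g^E}{RT} = \sum_i x_i \ln \gamma_i, \qquad \ln \gamma_i = \left(\frac{\partial (n g^E / RT)}{\partial n_i}\right)_{T, p, n_{j \neq i}}\]

\[\sum_i x_i \, \mathrm{d} \ln \gamma_i = 0 \quad (\text{constant } T, p), \qquad g^E \to 0 \ \text{and} \ \gamma_i \to 1 \ \text{as} \ x_i \to 1\]

The first line defines the activity coefficients as partial molar quantities of g^E, the second line states the Gibbs-Duhem equation and the pure-component limits.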

Two versions of the model are available:

  • ogHANNA: The original version (HANNA v1.0.0), trained on binary VLE data (up to 10 bar) and limiting activity coefficients from the Dortmund Data Bank. This version is limited to binary mixtures.
  • HANNA (alias multHANNA): The latest version, trained on VLE and LLE data, and applicable to multi-component mixtures.
MLThermoProperties.ogHANNA - Type
ogHANNA <: ActivityModel

ogHANNA(components;
puremodel = nothing,
userlocations = String[],
pure_userlocations = String[],
verbose = false,
reference_state = nothing)

Input parameters

  • SMILES: canonical SMILES (using RDKit) representation of the components
  • Mw: Single Parameter (Float64) (Optional) - Molecular Weight [g·mol⁻¹]

Input models

  • puremodel: model to calculate pure pressure-dependent properties

Description

Hard-Constraint Neural Network for Consistent Activity Coefficient Prediction (HANNA v1.0.0). The implementation is based on this GitHub repository. ogHANNA was trained on all available binary VLE data (up to 10 bar) and limiting activity coefficients from the Dortmund Data Bank. ogHANNA was developed for binary mixtures only; use HANNA for multicomponent mixtures.

Example

using MLThermoProperties, Clapeyron

components = ["water","isobutanol"]
Mw = [18.01528, 74.1216]
smiles = ["O", "CC(C)CO"]

model = ogHANNA(components, userlocations=(; Mw=Mw, SMILES=smiles))
# model = ogHANNA(components) # also works if components are in the database
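
Once the model is built, activity coefficients can presumably be evaluated with Clapeyron.jl's generic `activity_coefficient(model, p, T, z)`. The sketch below scans a few compositions of the binary mixture; the pressure, temperature, and composition grid are arbitrary illustrative values.

```julia
using MLThermoProperties, Clapeyron

components = ["water", "isobutanol"]
Mw = [18.01528, 74.1216]
smiles = ["O", "CC(C)CO"]
model = ogHANNA(components, userlocations = (; Mw = Mw, SMILES = smiles))

p, T = 101325.0, 298.15      # pressure in Pa, temperature in K (illustrative values)
x1_grid = 0.1:0.2:0.9        # mole fractions of water to scan

# One vector [γ_water, γ_isobutanol] per composition.
γ = [activity_coefficient(model, p, T, [x1, 1 - x1]) for x1 in x1_grid]
```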

References

  1. Specht, T., Nagda, M., Fellenz, S., Mandt, S., Hasse, H., Jirasek, F.: HANNA: Hard-Constraint Neural Network for Consistent Activity Coefficient Prediction, Chemical Science (2024), DOI: 10.1039/D4SC05115G.
MLThermoProperties.multHANNA - Type
HANNA <: ActivityModel
multHANNA

HANNA(components;
puremodel = nothing,
userlocations = String[],
pure_userlocations = String[],
verbose = false,
reference_state = nothing)

Input parameters

  • SMILES: canonical SMILES (using RDKit) representation of the components
  • Mw: Single Parameter (Float64) (Optional) - Molecular Weight [g·mol⁻¹]

Input models

  • puremodel: model to calculate pure component properties

Description

Hard-Constraint Neural Network for Consistent Activity Coefficient Prediction (HANNA). HANNA was trained on all available binary VLE data (up to 10 bar) and limiting activity coefficients from the Dortmund Data Bank.

Example

using MLThermoProperties, Clapeyron

components = ["dmso", "ethanol", "aspirin"]
Mw = [78.13, 46.068, 180.158]
smiles = ["CS(=O)C", "CCO", "CC(=O)Oc1ccccc1C(=O)O"]

model = HANNA(components, userlocations=(; Mw=Mw, SMILES=smiles))
# model = HANNA(components) # also works if components are in the database
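
As a hedged follow-up, the sketch below evaluates the ternary model at one state point with Clapeyron.jl's generic `activity_coefficient(model, p, T, z)` and then forms the dimensionless excess Gibbs energy from the identity g^E/(RT) = Σ x_i ln γ_i. The state point and composition are arbitrary illustrative values.

```julia
using MLThermoProperties, Clapeyron

components = ["dmso", "ethanol", "aspirin"]
Mw = [78.13, 46.068, 180.158]
smiles = ["CS(=O)C", "CCO", "CC(=O)Oc1ccccc1C(=O)O"]
model = HANNA(components, userlocations = (; Mw = Mw, SMILES = smiles))

p, T = 101325.0, 298.15      # pressure in Pa, temperature in K (illustrative values)
x = [0.4, 0.5, 0.1]          # mole fractions, must sum to one

γ = activity_coefficient(model, p, T, x)

# Dimensionless excess Gibbs energy from the activity coefficients.
gE_RT = sum(x .* log.(γ))
```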

References

  1. M. Hoffmann, T. Specht, Q. Göttl, J. Burger, S. Mandt, H. Hasse, and F. Jirasek: A Machine-Learned Expression for the Excess Gibbs Energy, (2025), DOI: https://doi.org/10.48550/arXiv.2509.06484.

GRAPPA saturation model

GRAPPA is a graph neural network model for predicting vapor pressures and boiling points of pure components. The model predicts the parameters A, B, and C of the Antoine equation:

\[\ln(p^s / \text{kPa}) = A - \frac{B}{T / \text{K} + C}\]

On model construction, the Antoine parameters are predicted and a SaturationModel is automatically created, which enables the calculation of the vapor pressure via saturation_pressure for a given temperature.
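
To make the Antoine form above concrete, the following self-contained sketch evaluates the equation and its inverse in plain Julia. The function names and the parameter values are assumptions chosen for illustration (roughly water-like); they are not GRAPPA predictions and not part of the package API.

```julia
# Antoine equation in the form used above: ln(p^s / kPa) = A - B / (T / K + C).
antoine_pressure(A, B, C, T) = exp(A - B / (T + C))        # returns p^s in kPa

# Solving the equation for T gives the boiling temperature at pressure p (in kPa).
antoine_temperature(A, B, C, p) = B / (A - log(p)) - C     # returns T in K

# Illustrative, roughly water-like parameters (assumed values, not GRAPPA output).
A, B, C = 16.3872, 3885.70, -42.98

ps = antoine_pressure(A, B, C, 373.15)        # vapor pressure at 373.15 K
Tb = antoine_temperature(A, B, C, 101.325)    # temperature at which p^s = 101.325 kPa
```

Note that the equation itself is written in kPa; the units returned by `saturation_pressure` for a GRAPPA model follow the package's own conventions.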

MLThermoProperties.GRAPPA - Type
GRAPPA{T} <: SaturationModel

GRAPPA(
    components;
    userlocations = String[],
    verbose::Bool=false
)

Description

GRAPPA model for calculating the vapor pressure of pure components based on the Antoine equation. On model construction, the Antoine parameters are predicted using a Python implementation of the GRAPPA model.

Requires loading the package `PythonCall.jl`

GRAPPA uses a modified Python implementation taken from https://github.com/marco-hoffmann/GRAPPA. Therefore, to use the GRAPPA model, you need to install and load the package PythonCall.jl:

using Pkg; Pkg.add("PythonCall")    # Installation
using PythonCall                    # Loading

To predict the Antoine parameters, only the SMILES of the molecule is required. It is retrieved automatically from the Clapeyron.jl database, but can also be provided via the userlocations keyword (see the example below).

Example

using MLThermoProperties, Clapeyron, PythonCall

model = GRAPPA("propanol")
model = GRAPPA("propanol"; userlocations=(; smiles="CCCO"))

ps, _, _ = saturation_pressure(model, 300.)         # Vapor pressure at 300 K

References

  1. M. Hoffmann, H. Hasse, and F. Jirasek: GRAPPA—A Hybrid Graph Neural Network for Predicting Pure Component Vapor Pressures, Chemical Engineering Journal Advances 22 (2025) 100750, DOI: https://doi.org/10.1016/j.ceja.2025.100750.
source

ML utilities

The package ChemBERTa.jl contains encoder language models from the ChemBERTa model family. It is a registered package and can be used independently of MLThermoProperties.jl.
