ML-Based Models

With the exception of the UNIFAC2 models, the following models are provided by MLThermoProperties.jl and require that package to be installed.

mod. UNIFAC 2.0 and UNIFAC 2.0

UNIFAC 2.0 and mod. UNIFAC 2.0 are enhanced versions of the classical group-contribution methods UNIFAC and mod. UNIFAC (Dortmund), respectively. Missing interaction parameters are predicted using matrix completion, which significantly extends the applicability of the methods and leads to a higher prediction accuracy compared to the original versions.

Clapeyron.UNIFAC2 — Type

UNIFACModel <: ActivityModel

UNIFAC2(components;
puremodel = PR,
userlocations = String[],
group_userlocations = String[],
pure_userlocations = String[],
verbose = false,
reference_state = nothing)

Input parameters

R: Single Parameter (Float64) - Normalized group Van der Waals volume
Q: Single Parameter (Float64) - Normalized group Surface Area
A: Pair Parameter (Float64, asymmetrical, defaults to 0) - Binary group Interaction Energy Parameter
B: Pair Parameter (Float64, asymmetrical, defaults to 0) - Binary group Interaction Energy Parameter
C: Pair Parameter (Float64, asymmetrical, defaults to 0) - Binary group Interaction Energy Parameter

Input models

puremodel: model to calculate pure pressure-dependent properties

Description

UNIFAC 2.0 activity model. Modified UNIFAC 2.0 (Dortmund) implementation. The method is identical to UNIFAC, but with new parameters fitted by matrix completion methods.
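
A minimal usage sketch. It assumes that water and ethanol have group assignments in the bundled database (not confirmed here) and uses the generic Clapeyron.jl function activity_coefficient(model, p, T, z) to evaluate the model; the component choice and state point are purely illustrative.

```julia
using Clapeyron

# Assumption: both components have group assignments in the bundled database
model = UNIFAC2(["water", "ethanol"])

p = 101325.0     # Pa; required by the call signature
T = 298.15       # K
z = [0.5, 0.5]   # mole fractions, same order as the component list

# One activity coefficient per component
γ = activity_coefficient(model, p, T, z)
```

The same call pattern should apply to ogUNIFAC2, since both are activity models.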

References

Hayer, N., Hasse, H., Jirasek, F.: Modified UNIFAC 2.0-A Group-Contribution Method Completed with Machine Learning, Ind. Eng. Chem. Res. 64 (2025) 10304–10313, DOI: 10.1021/acs.iecr.5c00077.


Clapeyron.ogUNIFAC2 — Type

ogUNIFACModel <: UNIFACModel

ogUNIFAC2(components;
puremodel = PR, 
userlocations = String[],
group_userlocations = String[],
pure_userlocations = String[],
verbose = false,
reference_state = nothing)

Input parameters

R: Single Parameter (Float64) - Normalized group Van der Waals volume
Q: Single Parameter (Float64) - Normalized group Surface Area
A: Pair Parameter (Float64, asymmetrical, defaults to 0) - Binary group Interaction Energy Parameter

Input models

puremodel: model to calculate pure pressure-dependent properties

Description

UNIFAC 2.0 (UNIQUAC Functional-group Activity Coefficients) activity model. Original formulation. The method is identical to ogUNIFAC, but with new parameters fitted by matrix completion methods.

References

Hayer, N., Wendel, T., Mandt, S., Hasse, H., Jirasek, F., Advancing Thermodynamic Group-Contribution Methods by Machine Learning: UNIFAC 2.0, Chemical Engineering Journal 504 (2025) 158667. 10.1016/j.cej.2024.158667.


HANNA activity models

HANNA is a hard-constraint neural network model for the excess Gibbs energy g^E that predicts activity coefficients in a strictly thermodynamically consistent manner. It only requires the SMILES of the components and the temperature as input. The model satisfies thermodynamic boundary conditions by construction, ensuring consistency of the predicted activity coefficients.
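
For orientation, the textbook relations behind this statement are sketched below. They are general solution thermodynamics, not a description of the HANNA architecture: the activity coefficients follow from the molar excess Gibbs energy g^E by differentiation with respect to the mole numbers n_i (with n the total mole number), they obey the Gibbs-Duhem equation at constant temperature and pressure, and g^E vanishes for each pure component.

\[\ln \gamma_i = \left(\frac{\partial \left(n\, g^E / RT\right)}{\partial n_i}\right)_{T,\,p,\,n_{j \neq i}}\]

\[\sum_i x_i \,\mathrm{d}\ln \gamma_i = 0 \quad (T,\, p = \text{const}), \qquad \lim_{x_i \to 1} g^E = 0\]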

Two versions of the model are available:

ogHANNA: The original version (HANNA v1.0.0), trained on binary VLE data (up to 10 bar) and limiting activity coefficients from the Dortmund Data Bank. This version is limited to binary mixtures.
HANNA (alias multHANNA): The latest version, trained on VLE and LLE data, and applicable to multi-component mixtures.

MLThermoProperties.ogHANNA — Type

ogHANNA <: ActivityModel

ogHANNA(components;
puremodel = nothing,
userlocations = String[],
pure_userlocations = String[],
verbose = false,
reference_state = nothing)

Input parameters

SMILES: canonical SMILES (using RDKit) representation of the components
Mw: Single Parameter (Float64) (Optional) - Molecular Weight [g·mol⁻¹]

Input models

puremodel: model to calculate pure pressure-dependent properties

Description

Hard-Constraint Neural Network for Consistent Activity Coefficient Prediction (HANNA v1.0.0). The implementation is based on this GitHub repository. ogHANNA was trained on all available binary VLE data (up to 10 bar) and limiting activity coefficients from the Dortmund Data Bank. ogHANNA was developed for binary mixtures only; use HANNA for multicomponent mixtures.

Example

using MLThermoProperties, Clapeyron

components = ["water","isobutanol"]
Mw = [18.01528, 74.1216]
smiles = ["O", "CC(C)CO"]

model = ogHANNA(components,userlocations=(;Mw=Mw, SMILES=smiles))
# model = ogHANNA(components) # also works if components are in the database

References

Specht, T., Nagda, M., Fellenz, S., Mandt, S., Hasse, H., Jirasek, F., HANNA: Hard-Constraint Neural Network for Consistent Activity Coefficient Prediction. Chemical Science 2024. 10.1039/D4SC05115G.


MLThermoProperties.multHANNA — Type

HANNA <: ActivityModel
multHANNA

HANNA(components;
puremodel = nothing,
userlocations = String[],
pure_userlocations = String[],
verbose = false,
reference_state = nothing)

Input parameters

SMILES: canonical SMILES (using RDKit) representation of the components
Mw: Single Parameter (Float64) (Optional) - Molecular Weight [g·mol⁻¹]

Input models

puremodel: model to calculate pure component properties

Description

Hard-Constraint Neural Network for Consistent Activity Coefficient Prediction (HANNA), latest version. In contrast to ogHANNA, this version was trained on VLE and LLE data and is applicable to multicomponent mixtures.

Example

using MLThermoProperties, Clapeyron

components = ["dmso", "ethanol", "aspirin"]
Mw = [78.13, 46.068, 180.158]
smiles = ["CS(=O)C", "CCO", "CC(=O)Oc1ccccc1C(=O)O"]

model = HANNA(components,userlocations=(;Mw=Mw, SMILES=smiles))
# model = HANNA(components) # also works if components are in the database
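
Once constructed, the model can be evaluated like other activity models. The following sketch assumes that the generic Clapeyron.jl function activity_coefficient(model, p, T, z) works unchanged for HANNA (it is an ActivityModel); the composition and temperature are arbitrary illustration values.

```julia
using MLThermoProperties, Clapeyron

components = ["dmso", "ethanol", "aspirin"]
Mw = [78.13, 46.068, 180.158]
smiles = ["CS(=O)C", "CCO", "CC(=O)Oc1ccccc1C(=O)O"]
model = HANNA(components, userlocations=(; Mw=Mw, SMILES=smiles))

T = 298.15            # K
p = 1e5               # Pa; required by the call signature
x = [0.2, 0.7, 0.1]   # mole fractions, same order as `components`

# One activity coefficient per component
γ = activity_coefficient(model, p, T, x)
```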

References

M. Hoffmann, T. Specht, Q. Göttl, J. Burger, S. Mandt, H. Hasse, and F. Jirasek: A Machine-Learned Expression for the Excess Gibbs Energy, (2025), DOI: https://doi.org/10.48550/arXiv.2509.06484.


GRAPPA saturation model

GRAPPA is a graph neural network model for predicting vapor pressures and boiling points of pure components. The model predicts the parameters A, B, and C of the Antoine equation:

\[\ln(p^s / \text{kPa}) = A - \frac{B}{T / \text{K} + C}\]

On model construction, the Antoine parameters are predicted and a SaturationModel is automatically created, which enables the calculation of the vapor pressure via saturation_pressure for a given temperature.
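
To make the form of the equation concrete, the following package-free sketch evaluates it and inverts it for the boiling temperature. The function names are hypothetical helpers, and the parameters are commonly tabulated Antoine values for water converted to Kelvin; they are used only for illustration and are not GRAPPA predictions.

```julia
# Antoine equation in the form used above: ln(p / kPa) = A - B / (T / K + C)
antoine_pressure(A, B, C, T) = exp(A - B / (T + C))       # T in K, returns kPa

# Inverting the same equation gives the temperature at which p^s equals p
antoine_temperature(A, B, C, p) = B / (A - log(p)) - C    # p in kPa, returns K

# Illustrative parameters for water (T in K, p in kPa); NOT GRAPPA output
A, B, C = 16.3872, 3885.70, -42.98

p_373 = antoine_pressure(A, B, C, 373.15)         # close to atmospheric pressure
T_boil = antoine_temperature(A, B, C, 101.325)    # close to the normal boiling point

println("p^s(373.15 K) ≈ ", round(p_373; digits=2), " kPa")
println("T_b(101.325 kPa) ≈ ", round(T_boil; digits=2), " K")
```

Because the boiling temperature follows from inverting the same three-parameter expression, a single set of predicted A, B, and C covers both vapor pressures and boiling points.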

MLThermoProperties.GRAPPA — Type

GRAPPA{T} <: SaturationModel

GRAPPA(
    components;
    userlocations = String[],
    verbose::Bool=false
)

Description

GRAPPA model for calculating the vapor pressure of pure components based on the Antoine equation. On model construction, the Antoine parameters are predicted using a Python implementation of the GRAPPA model.

For predicting the Antoine parameters, only the SMILES of the molecule is required. It is retrieved automatically from the Clapeyron.jl database. The SMILES can also be provided via the userlocations keyword (see example below).

Example

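using MLThermoProperties, Clapeyron, PythonCall

# SMILES is looked up in the Clapeyron.jl database
model = GRAPPA("propanol")
# Alternatively, provide the SMILES explicitly
model = GRAPPA("propanol"; userlocations=(; smiles="CCCO"))

ps, _, _ = saturation_pressure(model, 300.0)   # vapor pressure at 300 K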

References

M. Hoffmann, H. Hasse, and F. Jirasek: GRAPPA—A Hybrid Graph Neural Network for Predicting Pure Component Vapor Pressures, Chemical Engineering Journal Advances 22 (2025) 100750, DOI: https://doi.org/10.1016/j.ceja.2025.100750.


ML utilities

The package ChemBERTa.jl contains encoder language models from the ChemBERTa model family. It is a registered package and can be used independently of MLThermoProperties.jl.
