molprop.utils.analyze_mol_data
Functions
Checks for overlaps between data sets. If overlaps are found, returns a pd.Series of the entries that overlap between the two data sets; otherwise returns None. |
|
Extracts all mol features for training, validation and test datasets. |
|
Creates dictionary containing atom and bond features. |
|
Checks for duplicates. If duplicates are found, returns a pd.Series of duplicates (input combinations that occur more than once); otherwise returns None. check_conflicts: checks for conflicting target values among duplicate input combinations (same input with different labels). Note: this can take some time if df is large. |
|
Checks for presence and variation of molecular features. |
|
Performs sanity checks for inputs and compound list.
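
Example

The overlap and duplicate checks described above can be sketched with plain pandas. This is a minimal illustration and not the module's implementation: the helper names (`find_overlaps`, `find_duplicates`, `find_conflicts`), their signatures, and the column names (`smiles`, `solvent`, `y`) are all hypothetical, and the conflict check is split into its own helper here rather than controlled by a `check_conflicts` option.

```python
import pandas as pd


def find_overlaps(df_a, df_b, input_cols):
    """Hypothetical helper: input combinations present in both frames, or None."""
    keys_a = df_a[input_cols].drop_duplicates()
    keys_b = df_b[input_cols].drop_duplicates()
    shared = keys_a.merge(keys_b, on=input_cols, how="inner")
    if shared.empty:
        return None
    # One tuple per overlapping input combination.
    return pd.Series(list(shared.itertuples(index=False, name=None)))


def find_duplicates(df, input_cols):
    """Hypothetical helper: occurrence count per duplicated input combination, or None."""
    dup_mask = df.duplicated(subset=input_cols, keep=False)
    if not dup_mask.any():
        return None
    return df[dup_mask].groupby(input_cols).size()


def find_conflicts(df, input_cols, target_col):
    """Hypothetical helper: duplicated inputs that carry more than one distinct label."""
    n_labels = df.groupby(input_cols)[target_col].nunique()
    conflicting = n_labels[n_labels > 1]
    return None if conflicting.empty else conflicting


# Illustrative data only; the column names are assumptions.
train = pd.DataFrame({
    "smiles": ["CCO", "CCO", "CCN", "CCC"],
    "solvent": ["water", "water", "water", "water"],
    "y": [1.0, 2.0, 0.5, 0.3],
})
test = pd.DataFrame({
    "smiles": ["CCC", "CO"],
    "solvent": ["water", "water"],
    "y": [0.3, 0.9],
})
cols = ["smiles", "solvent"]

overlaps = find_overlaps(train, test, cols)      # ("CCC", "water") is in both sets
duplicates = find_duplicates(train, cols)        # ("CCO", "water") occurs twice
conflicts = find_conflicts(train, cols, "y")     # ... with two different labels
```

Grouping on the input columns is what makes the conflict check comparatively slow on large frames, which is consistent with the note on the duplicate check above.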