molprop.utils.analyze_mol_data.sanity_check_data
- molprop.utils.analyze_mol_data.sanity_check_data(df, input_columns, target_columns, check_conflicts=False)
Checks df for duplicate input combinations.
If duplicates are found: returns a pd.Series of the duplicated input combinations.
Otherwise: returns None.
When check_conflicts is True, checks for conflicting target values among duplicate input combinations (the same input with different labels).
Note: can take some time if df is large.
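The two checks described above can be sketched with plain pandas. This is only an illustration of the idea, not molprop's actual implementation: the helper names duplicated_input_counts and conflicting_inputs, the column names "smiles" and "solubility", and the choice to return occurrence counts are all assumptions made for this example.

```python
import pandas as pd


def duplicated_input_counts(df, input_columns):
    """Hypothetical helper: count each input combination that occurs more than once."""
    # keep=False marks every copy of a duplicated combination, not only the repeats.
    dup_mask = df.duplicated(subset=input_columns, keep=False)
    if not dup_mask.any():
        return None
    # Series indexed by input combination, holding the number of occurrences.
    return df[dup_mask].groupby(input_columns).size()


def conflicting_inputs(df, input_columns, target_columns):
    """Hypothetical helper: duplicated inputs whose target values disagree."""
    dups = df[df.duplicated(subset=input_columns, keep=False)]
    # Number of distinct values per target column within each input combination.
    n_distinct = dups.groupby(input_columns)[target_columns].nunique()
    # Keep combinations where at least one target column has more than one value.
    return n_distinct[n_distinct.gt(1).any(axis=1)]


# Made-up example data: "CCO" is a harmless duplicate, "CCN" has conflicting labels.
df = pd.DataFrame({
    "smiles": ["CCO", "CCO", "CCN", "CCN", "CCC"],
    "solubility": [1.0, 1.0, 0.5, 0.7, 0.2],
})

counts = duplicated_input_counts(df, ["smiles"])
conflicts = conflicting_inputs(df, ["smiles"], ["solubility"])
```

Both helpers group the whole frame by the input columns, which is one plausible reason a check of this kind slows down as df grows.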