X-ray diffraction-X-ray fluorescence (XRD-XRF) data sets obtained from surface scans of synthetic samples have been analyzed using different data clustering algorithms, to propose a methodology for automatic crystallographic and chemical classification of surfaces.

Three data clustering strategies have been evaluated, namely hierarchical, k-means, and density-based clustering; all of them have been applied to the distance matrix calculated from the single XRD and XRF data sets as well as the combined distance matrix.
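As a rough illustration of the combined-matrix idea, the following minimal sketch builds one distance matrix per technique, merges the two, and runs a simple hierarchical clustering on the result. The Euclidean metric, the rescaling of each matrix to a maximum of 1, the equal weighting, the average-linkage rule, and the toy data are all assumptions made for this example; the abstract does not state which choices the authors made.

```python
import math

def distance_matrix(rows):
    """Pairwise Euclidean distances between scan points (one row per point)."""
    return [[math.dist(a, b) for b in rows] for a in rows]

def combine(d_xrd, d_xrf, weight=0.5):
    """Weighted sum of two distance matrices, each rescaled to a maximum of 1.

    Both the rescaling and the weight are illustrative assumptions.
    """
    def rescale(d):
        top = max(max(row) for row in d) or 1.0
        return [[v / top for v in row] for row in d]

    a, b = rescale(d_xrd), rescale(d_xrf)
    n = len(a)
    return [[weight * a[i][j] + (1 - weight) * b[i][j] for j in range(n)]
            for i in range(n)]

def average_linkage(dist, n_clusters):
    """Agglomerative clustering (average linkage) on a precomputed distance matrix."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest mean inter-point distance.
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[p] for j in clusters[q])
                mean = total / (len(clusters[p]) * len(clusters[q]))
                if best is None or mean < best[0]:
                    best = (mean, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

# Toy data (made up): four scan points, two per region.
xrd = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
xrf = [[5.0, 0.5], [5.2, 0.4], [0.5, 5.0], [0.4, 5.1]]
combined = combine(distance_matrix(xrd), distance_matrix(xrf))
labels = average_linkage(combined, n_clusters=2)
```

Rescaling before summing keeps either technique from dominating the combined matrix simply because its raw intensities are larger; other normalizations would be equally plausible.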

Classification performance is reported for each strategy both numerically, as the corrected Rand index, and visually, as a reconstruction of the surface maps. Hierarchical and k-means clustering offered comparable results, depending on both sample complexity and data quality.
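For readers unfamiliar with the metric, here is a minimal sketch of how such an index can be computed, assuming that "corrected Rand index" refers to the chance-adjusted form of Hubert and Arabie (often called the adjusted Rand index). The function name and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(truth, predicted):
    """Chance-corrected agreement between two labelings of the same points."""
    if len(truth) != len(predicted):
        raise ValueError("label sequences must have the same length")
    total_pairs = comb(len(truth), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two points: the partitions agree trivially

    # Pairs of points grouped together in both labelings, in the reference
    # labeling, and in the predicted labeling, respectively.
    together_both = sum(comb(c, 2) for c in Counter(zip(truth, predicted)).values())
    together_truth = sum(comb(c, 2) for c in Counter(truth).values())
    together_pred = sum(comb(c, 2) for c in Counter(predicted).values())

    expected = together_truth * together_pred / total_pairs
    maximum = (together_truth + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (together_both - expected) / (maximum - expected)

# Cluster labels are arbitrary, so a relabeled but identical partition scores 1.
same_partition = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# A partition that splits every reference region scores below zero.
crossed_partition = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

A value of 1 indicates identical partitions, while values near 0 indicate agreement no better than random assignment.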

When applied to XRF data collected on a two-phase test sample, both algorithms yielded Rand index values above 0.8, whereas XRD data collected on the same sample gave values around 0.5; applying them to the combined distance matrix raised the index to about 0.9. For a more complex multi-phase sample, classification performance was found to depend strongly on both data quality and signal contrast between different regions; again, adopting the combined dissimilarity matrix improved classification performance.

Published online by Cambridge University Press: 03 May 2019

Authors: M. Bortolotti, L. Lutterotti, E. Borovin and D. Martorelli

Affiliation: Department of Industrial Engineering, University of Trento, Italy