XRD - X-Ray Diffraction

X-Ray Diffraction (XRD) is used to identify and quantify crystalline mineral phases in solid samples. The diffraction data are processed with Profex (Rietveld refinement), which reports mineral composition as weight percentages.

Method Overview

| Property | Value |
| --- | --- |
| Full Name | X-Ray Diffraction |
| Purpose | Identifies and quantifies crystalline mineral phases in solid samples |
| Output Units | Weight percent (%) |
| Processing Software | Profex (Rietveld refinement) |
| SharePoint Location | Analytical Data > XRD > XRD Data > z_ProfexReportFiles > Compiled Report Files |
| File Formats | CSV, XLSX |

Minerals Measured

| Category | Minerals |
| --- | --- |
| Silicates | quartz, plagioclase, orthopyroxene, clinopyroxene, olivine, serpentine |
| Carbonates | calcite, dolomite |
| Oxides and hydroxides | magnetite, brucite |
| Phyllosilicates | mica, talc, chlorite |
| Other | epidote, amphibole, garnet |

Data Pipeline

SharePoint Source

  • Site: Analytical Data
  • Folder: XRD/XRD Data/z_ProfexReportFiles/Compiled Report Files/
  • File Types: CSV, XLSX (Profex compiled reports)

Dagster Assets

The XRD data pipeline consists of the following assets in apps/datasmart/src/datasmart/assets/analytical/xrd.py:

xrd_profex_incremental (sharepoint_multi_asset)
├── xrd_profex_incremental (backend.xrd_profex_incremental) - melted mineral data
└── xrd_profex_samples (backend.xrd_profex_samples) - sample metadata

xrd_profex (backend.xrd_profex)

xrd_profex_simplified (analytical.xrd_profex_simplified)
xrd_profex_summed (analytical.xrd_profex_summed)
xrd_crystallinity_adjusted (analytical.xrd_crystallinity_adjusted)

Asset Descriptions

| Asset | Schema | Description |
| --- | --- | --- |
| xrd_profex_incremental | backend | Melted mineral phase data (long format) |
| xrd_profex_samples | backend | Sample metadata with validity flags |
| xrd_profex | backend | Wide format with all mineral columns, sample ID normalization |
| xrd_profex_simplified | analytical | Averaged replicates, invalid rows dropped |
| xrd_profex_summed | analytical | Simplified mineral names, sub-types merged |
| xrd_crystallinity_adjusted | analytical | XRD values adjusted by crystallinity measurement |

Database Tables

backend.xrd_profex_samples

Sample metadata from raw files.

| Column | Type | Description |
| --- | --- | --- |
| file_id | string | SharePoint file identifier |
| original_sample_id | string | Raw sample ID from file |
| valid | boolean | Whether measurement is valid |
| time_created | datetime | File creation time |
| file_loc | string | SharePoint file path |

backend.xrd_profex_incremental

Melted mineral data in long format.

| Column | Type | Description |
| --- | --- | --- |
| file_id | string | SharePoint file identifier |
| original_sample_id | string | Raw sample ID from file |
| mineral_phase | string | Name of mineral phase |
| value | float | Weight percentage |
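
To make the long format concrete, the sketch below pivots a few long-format rows into one wide record per sample, which is the shape the downstream wide tables use. It is an illustration only: `pivot_long_to_wide` and the sample rows are invented here, and the pipeline itself presumably does this with DataFrame operations.

```python
from collections import defaultdict


def pivot_long_to_wide(rows):
    """Turn long rows (one per sample and mineral) into one dict per sample.

    Hypothetical helper for illustration; not part of the pipeline.
    """
    wide = defaultdict(dict)
    for row in rows:
        wide[row["original_sample_id"]][row["mineral_phase"]] = row["value"]
    return dict(wide)


# Made-up rows in the shape of backend.xrd_profex_incremental
long_rows = [
    {"file_id": "f1", "original_sample_id": "S-001", "mineral_phase": "quartz", "value": 12.5},
    {"file_id": "f1", "original_sample_id": "S-001", "mineral_phase": "calcite", "value": 40.0},
    {"file_id": "f2", "original_sample_id": "S-002", "mineral_phase": "quartz", "value": 3.0},
]

wide_rows = pivot_long_to_wide(long_rows)
print(wide_rows["S-001"])
```

A mineral that was not reported for a sample is simply absent from that sample's record, which corresponds to a NULL in the wide table.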

analytical.xrd_profex_simplified

Wide format table with one row per sample.

| Column | Type | Description |
| --- | --- | --- |
| sample_id | string | Normalized sample ID |
| quartz | float | Quartz % |
| calcite | float | Calcite % |
| dolomite | float | Dolomite % |
| plagioclase | float | Plagioclase % |
| orthopyroxene | float | Orthopyroxene % |
| clinopyroxene | float | Clinopyroxene % |
| olivine | float | Olivine % |
| serpentine | float | Serpentine % |
| brucite | float | Brucite % |
| magnetite | float | Magnetite % |
| ... | float | Additional mineral columns as detected |

Mineral Name Normalization

The pipeline normalizes mineral names from Profex output:

Leading 'Q' Removal

Profex prefixes many mineral names with a 'Q' (for example, calcite is reported as qcalcite). The pipeline strips this prefix, except from names that genuinely begin with Q, such as quartz, quartzite, and quintinite.
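
The prefix rule can be sketched as a small function. This is a minimal illustration, not the pipeline's code: `strip_q_prefix` and `Q_MINERALS` are hypothetical names, and lower-casing the input is an added assumption.

```python
# Names that genuinely start with "q" and must keep it (taken from the text above).
Q_MINERALS = {"quartz", "quartzite", "quintinite"}


def strip_q_prefix(name):
    """Drop a leading 'q' unless the name is a real Q-mineral.

    Hypothetical helper; the pipeline's actual rule may differ in detail.
    """
    lowered = name.strip().lower()  # assumption: names are compared in lower case
    if lowered in Q_MINERALS:
        return lowered
    if lowered.startswith("q") and len(lowered) > 1:
        return lowered[1:]
    return lowered


for raw in ["qcalcite", "Qdolomite", "quartz", "quintinite"]:
    print(raw, "->", strip_q_prefix(raw))
```

Checking the exception list before stripping is what keeps quartz from turning into "uartz".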

Sub-type Merging

Specific mineral sub-types are merged into parent minerals:

| Original | Mapped To |
| --- | --- |
| anorthite | plagioclase |
| clinozoisite | epidote |
| enstatite | orthopyroxene |
| ferrosilite | orthopyroxene |
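
A minimal sketch of the merge, using the four mappings listed above. It assumes that "merged" means the sub-type percentages are summed into the parent column; `merge_subtypes` and the sample values are hypothetical.

```python
SUBTYPE_TO_PARENT = {
    "anorthite": "plagioclase",
    "clinozoisite": "epidote",
    "enstatite": "orthopyroxene",
    "ferrosilite": "orthopyroxene",
}


def merge_subtypes(sample):
    """Sum sub-type percentages into their parent mineral.

    `sample` maps mineral name -> weight percent. Summing is an assumption
    about how the merge is implemented.
    """
    merged = {}
    for mineral, value in sample.items():
        parent = SUBTYPE_TO_PARENT.get(mineral, mineral)
        merged[parent] = merged.get(parent, 0.0) + value
    return merged


# Made-up sample for illustration
sample = {"enstatite": 10.0, "ferrosilite": 5.0, "anorthite": 20.0, "quartz": 8.0}
print(merge_subtypes(sample))
```

Because the values are summed rather than overwritten, the total weight percent of the sample is unchanged by the merge.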

Sum Columns Dropped

Columns containing (sum) are removed as they represent calculated totals, not actual mineral phases.
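
As a rough sketch, the filter amounts to dropping any column whose name contains the literal text (sum). The helper name, the case-insensitive match, and the example column names are assumptions made for illustration.

```python
def drop_sum_columns(columns):
    """Keep only column names that do not contain the text '(sum)'.

    Hypothetical helper; matching case-insensitively is an assumption.
    """
    return [col for col in columns if "(sum)" not in col.lower()]


# Made-up column names for illustration
columns = ["quartz", "plagioclase (sum)", "calcite", "Pyroxene (Sum)"]
print(drop_sum_columns(columns))
```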

Typo Corrections

Common typos in Profex output are automatically corrected:

  • orthpyroxene → orthopyroxene
  • Malformed parentheses patterns are fixed
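
The spelling fix can be pictured as a lookup of known misspellings. The sketch below covers only the one typo named above; the parenthesis repairs are not specified here, so they are left out. `TYPO_FIXES` and `fix_typos` are hypothetical names.

```python
# Known misspelling -> corrected spelling (only the case documented above).
TYPO_FIXES = {"orthpyroxene": "orthopyroxene"}


def fix_typos(name):
    """Replace known misspellings anywhere in a mineral name (illustrative)."""
    for wrong, right in TYPO_FIXES.items():
        name = name.replace(wrong, right)
    return name


print(fix_typos("orthpyroxene"))
print(fix_typos("orthopyroxene"))  # already correct, left unchanged
```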

Duplicate Handling

Validity Filtering

Rows where valid=False are excluded from simplified outputs. The valid column is set by lab operators during data collection.

Replicate Averaging

When multiple measurements exist for the same sample:

  1. Only valid measurements are considered
  2. Replicates are averaged using average_replicates()
  3. The most recent measurement (by time_created) is prioritized
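
The first two steps can be sketched in plain Python as follows. This is a stand-in written for illustration, not the pipeline's `average_replicates()`: it assumes a simple arithmetic mean per mineral and does not model the recency rule based on time_created. The record layout and sample values are invented.

```python
from collections import defaultdict
from statistics import mean


def average_valid_replicates(measurements):
    """Average each mineral over the valid measurements of each sample.

    Hypothetical stand-in for average_replicates(); the recency rule
    (time_created) is not modelled here.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for m in measurements:
        if not m["valid"]:
            continue  # invalid measurements never reach the average
        for mineral, value in m["minerals"].items():
            grouped[m["sample_id"]][mineral].append(value)
    return {
        sample_id: {mineral: mean(values) for mineral, values in minerals.items()}
        for sample_id, minerals in grouped.items()
    }


# Made-up measurements: two valid replicates and one invalid run
measurements = [
    {"sample_id": "S-001", "valid": True, "minerals": {"quartz": 10.0, "calcite": 30.0}},
    {"sample_id": "S-001", "valid": True, "minerals": {"quartz": 14.0, "calcite": 34.0}},
    {"sample_id": "S-001", "valid": False, "minerals": {"quartz": 90.0, "calcite": 0.0}},
]
averaged = average_valid_replicates(measurements)
print(averaged)
```

Filtering before averaging matters: in this example the invalid run would otherwise pull the quartz value far away from the two valid replicates.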

SCM Sample ID Handling

Special logic handles SCM (Supplementary Cementitious Materials) samples that use a different naming convention:

  • Pattern: PCSC-XRD-...-15x5-...
  • Converted using scm_firstlast_crosswalk
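
Detecting these IDs could look something like the sketch below. It assumes only what the pattern above shows: the ID starts with PCSC-XRD- and contains -15x5- further along. The elided segments are treated as arbitrary text, the example IDs are placeholders, and the crosswalk lookup itself is not modelled because its structure is not described here.

```python
import re

# Assumption: the "..." segments in the documented pattern can be any text.
SCM_PATTERN = re.compile(r"^PCSC-XRD-.+-15x5-.+$")


def is_scm_sample_id(sample_id):
    """Return True if the ID looks like an SCM sample (illustrative check)."""
    return SCM_PATTERN.match(sample_id) is not None


# Placeholder IDs, not real samples
print(is_scm_sample_id("PCSC-XRD-AAA-15x5-BBB"))
print(is_scm_sample_id("P800-0001"))
```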

Crystallinity Adjustment

The xrd_crystallinity_adjusted asset provides XRD values adjusted for sample crystallinity:

Adjusted Value = XRD % × Crystallinity % / 100

This rescales each phase from a share of the crystalline material to a share of the whole sample, so that amorphous content is accounted for.

Usage Examples

Query Simplified Data

from shared.db.sql import SQL

# Get all simplified XRD data
xrd = SQL.read("SELECT * FROM analytical.xrd_profex_simplified")

# Filter by project team
xrd = SQL.read("""
    SELECT * FROM analytical.xrd_profex_simplified
    WHERE sample_id LIKE 'P800%'
""")

Join with Other Analytical Data

from shared.db.sql import SQL

# Join XRD mineralogy with XRF chemistry on the normalized sample ID
combined = SQL.read("""
    SELECT
        xrd.sample_id,
        xrd.calcite, xrd.dolomite, xrd.quartz,
        xrf.CaO, xrf.MgO, xrf.SiO2
    FROM analytical.xrd_profex_simplified xrd
    LEFT JOIN analytical.xrf_simplified xrf
        ON xrd.sample_id = xrf.sample_id
    WHERE xrd.sample_id IS NOT NULL
""")

Common Calculations

import pandas as pd
from shared.db.sql import SQL

xrd = SQL.read("SELECT * FROM analytical.xrd_profex_simplified")

# Total carbonates
xrd['total_carbonates'] = xrd['calcite'] + xrd['dolomite']

# Total silicates
silicate_cols = ['quartz', 'plagioclase', 'orthopyroxene', 'clinopyroxene', 'olivine']
xrd['total_silicates'] = xrd[silicate_cols].sum(axis=1)

# Ca-bearing vs Mg-bearing minerals ratio
# A zero denominator is masked to NaN so the ratio is NaN rather than inf
mg_bearing = xrd['dolomite'] + xrd['brucite'] + xrd['serpentine']
xrd['ca_mg_mineral_ratio'] = (
    (xrd['calcite'] + xrd['plagioclase']) / mg_bearing.where(mg_bearing != 0)
)

Get Crystallinity-Adjusted Values

adjusted = SQL.read("SELECT * FROM analytical.xrd_crystallinity_adjusted")

Related Assets

| Asset | Description |
| --- | --- |
| crystallinity | Crystallinity measurements for adjusting XRD percentages |
| sample_id_crosswalk | Maps messy sample IDs to canonical format |
| leach_crosswalk | Special mappings for leaching experiment IDs |
| scm_firstlast_crosswalk | SCM sample ID mappings |
| project_team_map | Maps old process area codes to new project team codes |

Troubleshooting

Missing Samples

If a sample is missing from xrd_profex_simplified:

  1. Check if it exists in backend.xrd_profex (before simplification)
  2. Check if valid=False (would be filtered out)
  3. Verify sample ID normalization worked (check for NULL sample_id)
  4. Confirm the source file exists in SharePoint
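
The checklist above can be turned into a small triage function over rows from backend.xrd_profex. This is a sketch under stated assumptions: `explain_missing` is a made-up helper, and the rows are assumed to be dictionaries with sample_id and valid keys.

```python
def explain_missing(rows):
    """Suggest why a sample is absent from xrd_profex_simplified.

    `rows` are dicts with `sample_id` and `valid` keys, shaped like rows
    from backend.xrd_profex. Hypothetical helper for illustration.
    """
    if not rows:
        return "not in backend.xrd_profex: check the source file in SharePoint"
    normalized = [row for row in rows if row["sample_id"] is not None]
    if not normalized:
        return "sample ID normalization failed (sample_id is NULL)"
    if not any(row["valid"] for row in normalized):
        return "all measurements are flagged valid=False"
    return "present and valid: check the downstream assets"


# Made-up rows for illustration
print(explain_missing([]))
print(explain_missing([{"sample_id": None, "valid": True}]))
print(explain_missing([{"sample_id": "S-001", "valid": False}]))
```
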

# Check validity status
SQL.read("""
    SELECT sample_id, original_sample_id, valid, time_created
    FROM backend.xrd_profex
    WHERE original_sample_id LIKE '%YOUR_SAMPLE%'
""")

Unexpected Mineral Values

If mineral percentages seem wrong:

  1. Check for multiple versions of the same sample
  2. Verify the correct version was selected (most recent by time_created)
  3. Compare raw values in backend.xrd_profex vs simplified
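
For step 2, it can help to pick out the newest raw row yourself before comparing it with the simplified value. A minimal sketch, assuming the raw rows are available as dictionaries with a datetime time_created; `latest_version` and the sample values are invented for illustration.

```python
from datetime import datetime


def latest_version(rows):
    """Return the row with the newest time_created (hypothetical helper)."""
    return max(rows, key=lambda row: row["time_created"])


# Made-up rows shaped like backend.xrd_profex results
raw_rows = [
    {"sample_id": "S-001", "time_created": datetime(2024, 1, 5), "quartz": 11.0},
    {"sample_id": "S-001", "time_created": datetime(2024, 3, 2), "quartz": 13.5},
]

newest = latest_version(raw_rows)
print(newest["time_created"], newest["quartz"])
```
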

# Compare raw vs simplified for a sample
raw = SQL.read("""
    SELECT * FROM backend.xrd_profex
    WHERE sample_id = 'YOUR_SAMPLE_ID'
    ORDER BY time_created DESC
""")

simplified = SQL.read("""
    SELECT * FROM analytical.xrd_profex_simplified
    WHERE sample_id = 'YOUR_SAMPLE_ID'
""")

Understanding Crystallinity Adjustment

To understand how crystallinity affects XRD values:

# Compare unadjusted vs adjusted
unadjusted = SQL.read("""
    SELECT sample_id, calcite, dolomite, quartz
    FROM analytical.xrd_profex_simplified
    WHERE sample_id = 'YOUR_SAMPLE_ID'
""")

adjusted = SQL.read("""
    SELECT sample_id, calcite, dolomite, quartz
    FROM analytical.xrd_crystallinity_adjusted
    WHERE sample_id = 'YOUR_SAMPLE_ID'
""")

XRD Spectra

Raw XRD spectra (diffraction patterns with 2-theta angles and intensity values) are handled separately in xrd_spectra.py. These are used for:

  • Advanced analysis
  • Model validation
  • Peak identification