XRD - X-Ray Diffraction
X-Ray Diffraction (XRD) is used to identify and quantify crystalline mineral phases in solid samples. Results are reported as mineral composition in weight percent, obtained by Rietveld refinement in Profex.
Method Overview
| Property | Value |
|---|---|
| Full Name | X-Ray Diffraction |
| Purpose | Identifies and quantifies crystalline mineral phases in solid samples |
| Output Units | Weight percent (%) |
| Processing Software | Profex (Rietveld refinement) |
| SharePoint Location | Analytical Data > XRD > XRD Data > z_ProfexReportFiles > Compiled Report Files |
| File Formats | CSV, XLSX |
Minerals Measured
| Category | Minerals |
|---|---|
| Silicates | quartz, plagioclase, orthopyroxene, clinopyroxene, olivine, serpentine |
| Carbonates | calcite, dolomite |
| Oxides and hydroxides | magnetite, brucite |
| Phyllosilicates | mica, talc, chlorite |
| Other | epidote, amphibole, garnet |
Data Pipeline
SharePoint Source
- Site: Analytical Data
- Folder: XRD/XRD Data/z_ProfexReportFiles/Compiled Report Files/
- File Types: CSV, XLSX (Profex compiled reports)
Dagster Assets
The XRD data pipeline consists of the following assets in apps/datasmart/src/datasmart/assets/analytical/xrd.py:
xrd_profex_incremental (sharepoint_multi_asset)
├── xrd_profex_incremental (backend.xrd_profex_incremental) - melted mineral data
└── xrd_profex_samples (backend.xrd_profex_samples) - sample metadata
↓
xrd_profex (backend.xrd_profex)
↓
xrd_profex_simplified (analytical.xrd_profex_simplified)
xrd_profex_summed (analytical.xrd_profex_summed)
xrd_crystallinity_adjusted (analytical.xrd_crystallinity_adjusted)
Asset Descriptions
| Asset | Schema | Description |
|---|---|---|
| xrd_profex_incremental | backend | Melted mineral phase data (long format) |
| xrd_profex_samples | backend | Sample metadata with validity flags |
| xrd_profex | backend | Wide format with all mineral columns, sample ID normalization |
| xrd_profex_simplified | analytical | Averaged replicates, invalid rows dropped |
| xrd_profex_summed | analytical | Simplified mineral names, sub-types merged |
| xrd_crystallinity_adjusted | analytical | XRD values adjusted by crystallinity measurement |
Database Tables
backend.xrd_profex_samples
Sample metadata from raw files.
| Column | Type | Description |
|---|---|---|
| file_id | string | SharePoint file identifier |
| original_sample_id | string | Raw sample ID from file |
| valid | boolean | Whether measurement is valid |
| time_created | datetime | File creation time |
| file_loc | string | SharePoint file path |
backend.xrd_profex_incremental
Melted mineral data in long format.
| Column | Type | Description |
|---|---|---|
| file_id | string | SharePoint file identifier |
| original_sample_id | string | Raw sample ID from file |
| mineral_phase | string | Name of mineral phase |
| value | float | Weight percentage |
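The relationship between this long-format table and the wide tables downstream can be illustrated with a pandas pivot. This is a minimal sketch rather than the pipeline's own code: the rows are made up, and only the column names are taken from the table above.

```python
import pandas as pd

# Made-up rows shaped like backend.xrd_profex_incremental (long format).
long_df = pd.DataFrame(
    {
        "file_id": ["f1", "f1", "f2", "f2"],
        "original_sample_id": ["S-001", "S-001", "S-002", "S-002"],
        "mineral_phase": ["quartz", "calcite", "quartz", "calcite"],
        "value": [60.0, 40.0, 25.0, 75.0],
    }
)

# Pivot to one row per file/sample and one column per mineral phase.
wide_df = (
    long_df.pivot_table(
        index=["file_id", "original_sample_id"],
        columns="mineral_phase",
        values="value",
        aggfunc="first",
    )
    .reset_index()
    .rename_axis(columns=None)
)
print(wide_df)
```

Each distinct mineral_phase value becomes its own column, which is why the wide tables can gain columns as new phases are detected.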
analytical.xrd_profex_simplified
Wide format table with one row per sample.
| Column | Type | Description |
|---|---|---|
| sample_id | string | Normalized sample ID |
| quartz | float | Quartz % |
| calcite | float | Calcite % |
| dolomite | float | Dolomite % |
| plagioclase | float | Plagioclase % |
| orthopyroxene | float | Orthopyroxene % |
| clinopyroxene | float | Clinopyroxene % |
| olivine | float | Olivine % |
| serpentine | float | Serpentine % |
| brucite | float | Brucite % |
| magnetite | float | Magnetite % |
| ... | float | Additional mineral columns as detected |
Mineral Name Normalization
The pipeline normalizes mineral names from Profex output:
Leading 'Q' Removal
Profex prefixes many mineral names with a 'Q' (e.g., qcalcite for calcite). The pipeline strips this prefix, except from names that genuinely begin with Q, such as quartz, quartzite, and quintinite.
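One way to express this rule is sketched below. The function name strip_leading_q and the lowercasing step are assumptions made for illustration; the pipeline's actual implementation may differ.

```python
# Names that genuinely start with "q" and must keep it (listed in the text above).
Q_MINERALS = ("quartz", "quartzite", "quintinite")


def strip_leading_q(name: str) -> str:
    """Drop a leading 'q' unless the name is a real Q-mineral (hypothetical helper)."""
    lowered = name.strip().lower()
    if lowered.startswith("q") and not lowered.startswith(Q_MINERALS):
        return lowered[1:]
    return lowered


print(strip_leading_q("qcalcite"))  # calcite
print(strip_leading_q("quartz"))    # quartz
print(strip_leading_q("qquartz"))   # quartz
```

Checking the exception list before stripping keeps quartz intact while still reducing a prefixed qquartz to quartz.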
Sub-type Merging
Specific mineral sub-types are merged into parent minerals:
| Original | Mapped To |
|---|---|
| anorthite | plagioclase |
| clinozoisite | epidote |
| enstatite | orthopyroxene |
| ferrosilite | orthopyroxene |
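The mapping above can be applied to a single sample as in the following sketch. It assumes that merging means summing the weight percentages of sub-types into the parent mineral; merge_subtypes is a hypothetical helper, not a function from the pipeline.

```python
# Sub-type to parent mapping, copied from the table above.
SUBTYPE_TO_PARENT = {
    "anorthite": "plagioclase",
    "clinozoisite": "epidote",
    "enstatite": "orthopyroxene",
    "ferrosilite": "orthopyroxene",
}


def merge_subtypes(row: dict[str, float]) -> dict[str, float]:
    """Sum sub-type percentages into their parent mineral (assumed behavior)."""
    merged: dict[str, float] = {}
    for mineral, value in row.items():
        parent = SUBTYPE_TO_PARENT.get(mineral, mineral)
        merged[parent] = merged.get(parent, 0.0) + value
    return merged


sample = {"enstatite": 10.0, "ferrosilite": 5.0, "quartz": 20.0}
print(merge_subtypes(sample))  # {'orthopyroxene': 15.0, 'quartz': 20.0}
```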
Sum Columns Dropped
Columns containing (sum) are removed as they represent calculated totals, not actual mineral phases.
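A minimal sketch of this filter is shown below. The helper name drop_sum_columns and the example column "pyroxene (sum)" are illustrative assumptions, not names taken from Profex output.

```python
def drop_sum_columns(columns: list[str]) -> list[str]:
    """Keep only columns that do not contain the '(sum)' marker (hypothetical helper)."""
    return [col for col in columns if "(sum)" not in col.lower()]


print(drop_sum_columns(["quartz", "pyroxene (sum)", "calcite"]))  # ['quartz', 'calcite']
```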
Typo Corrections
Common typos in Profex output are automatically corrected:
- orthpyroxene → orthopyroxene
- Malformed parentheses patterns are fixed
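The spelling correction can be sketched as a simple lookup. Only the orthpyroxene entry comes from this document; fix_typos is a hypothetical helper, and the parentheses fixes are not shown because their exact patterns are not described here.

```python
# Known misspellings and their corrections (single entry taken from the list above).
TYPO_FIXES = {"orthpyroxene": "orthopyroxene"}


def fix_typos(name: str) -> str:
    """Replace known misspellings inside a mineral name (hypothetical helper)."""
    for wrong, right in TYPO_FIXES.items():
        name = name.replace(wrong, right)
    return name


print(fix_typos("orthpyroxene"))   # orthopyroxene
print(fix_typos("orthopyroxene"))  # orthopyroxene (already correct, unchanged)
```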
Duplicate Handling
Validity Filtering
Rows where valid=False are excluded from simplified outputs. The valid column is set by lab operators during data collection.
Replicate Averaging
When multiple measurements exist for the same sample:
- Only valid measurements are considered
- Replicates are averaged using average_replicates()
- The most recent measurement (by time_created) is prioritized
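The first two steps can be approximated in pandas as follows. This is a sketch under stated assumptions, not the body of average_replicates(): the data are made up, a plain per-sample mean is assumed, and the time_created prioritization is left out because its exact rule is not described here.

```python
import pandas as pd

# Made-up replicate measurements; column names follow the tables in this document.
replicates = pd.DataFrame(
    {
        "sample_id": ["A", "A", "A", "B"],
        "valid": [True, True, False, True],
        "quartz": [50.0, 54.0, 10.0, 30.0],
        "calcite": [50.0, 46.0, 90.0, 70.0],
    }
)

# Keep valid rows only, then average the mineral columns per sample.
averaged = (
    replicates[replicates["valid"]]
    .drop(columns="valid")
    .groupby("sample_id", as_index=False)
    .mean()
)
print(averaged)
```

The invalid replicate for sample A is excluded before averaging, so it does not pull the mean toward its outlying values.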
SCM Sample ID Handling
Special logic handles SCM (Supplementary Cementitious Materials) samples that use a different naming convention:
- Pattern: PCSC-XRD-...-15x5-...
- Converted using scm_firstlast_crosswalk
Crystallinity Adjustment
The xrd_crystallinity_adjusted asset provides XRD values adjusted for sample crystallinity:
Adjusted Value = XRD % × Crystallinity % / 100
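Rietveld weight percentages describe only the crystalline part of a sample, so scaling by crystallinity expresses each mineral as a share of the whole sample, amorphous material included.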
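As a worked example of the formula, the sketch below adjusts a single value. The function name adjust_for_crystallinity and the range check are illustrative assumptions, not part of the asset's code.

```python
def adjust_for_crystallinity(xrd_pct: float, crystallinity_pct: float) -> float:
    """Apply: adjusted value = XRD % x crystallinity % / 100 (hypothetical helper)."""
    if not 0.0 <= crystallinity_pct <= 100.0:
        raise ValueError("crystallinity_pct must be between 0 and 100")
    return xrd_pct * crystallinity_pct / 100.0


# A sample reported as 40% quartz by XRD that is 80% crystalline
# contains 32% quartz by total sample weight.
print(adjust_for_crystallinity(40.0, 80.0))  # 32.0
```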
Usage Examples
Query Simplified Data
from shared.db.sql import SQL
# Get all simplified XRD data
xrd = SQL.read("SELECT * FROM analytical.xrd_profex_simplified")
# Filter by project team
xrd = SQL.read("""
SELECT * FROM analytical.xrd_profex_simplified
WHERE sample_id LIKE 'P800%'
""")
Join with Other Analytical Data
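from shared.db.sql import SQL
# Join XRD mineralogy with XRF oxide chemistry on the normalized sample ID
combined = SQL.read("""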
SELECT
xrd.sample_id,
xrd.calcite, xrd.dolomite, xrd.quartz,
xrf.CaO, xrf.MgO, xrf.SiO2
FROM analytical.xrd_profex_simplified xrd
LEFT JOIN analytical.xrf_simplified xrf
ON xrd.sample_id = xrf.sample_id
WHERE xrd.sample_id IS NOT NULL
""")
Common Calculations
import pandas as pd
from shared.db.sql import SQL
xrd = SQL.read("SELECT * FROM analytical.xrd_profex_simplified")
# Total carbonates
xrd['total_carbonates'] = xrd['calcite'] + xrd['dolomite']
# Total silicates
silicate_cols = ['quartz', 'plagioclase', 'orthopyroxene', 'clinopyroxene', 'olivine']
xrd['total_silicates'] = xrd[silicate_cols].sum(axis=1)
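# Ca-bearing vs Mg-bearing minerals ratio
# (a zero denominator is replaced with NaN so the ratio is NaN rather than inf)
mg_bearing = xrd['dolomite'] + xrd['brucite'] + xrd['serpentine']
xrd['ca_mg_mineral_ratio'] = (
    (xrd['calcite'] + xrd['plagioclase']) /
    mg_bearing.replace(0, float('nan'))
)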
Get Crystallinity-Adjusted Values
from shared.db.sql import SQL
# Get all crystallinity-adjusted XRD data
adjusted = SQL.read("SELECT * FROM analytical.xrd_crystallinity_adjusted")
Related Assets
| Asset | Description |
|---|---|
| crystallinity | Crystallinity measurements for adjusting XRD percentages |
| sample_id_crosswalk | Maps messy sample IDs to canonical format |
| leach_crosswalk | Special mappings for leaching experiment IDs |
| scm_firstlast_crosswalk | SCM sample ID mappings |
| project_team_map | Maps old process area codes to new project team codes |
Troubleshooting
Missing Samples
If a sample is missing from xrd_profex_simplified:
- Check if it exists in backend.xrd_profex (before simplification)
- Check if valid=False (would be filtered out)
- Verify sample ID normalization worked (check for NULL sample_id)
- Confirm the source file exists in SharePoint
from shared.db.sql import SQL
# Check validity status
SQL.read("""
SELECT sample_id, original_sample_id, valid, time_created
FROM backend.xrd_profex
WHERE original_sample_id LIKE '%YOUR_SAMPLE%'
""")
Unexpected Mineral Values
If mineral percentages seem wrong:
- Check for multiple versions of the same sample
- Verify the correct version was selected (most recent by time_created)
- Compare raw values in backend.xrd_profex vs simplified
from shared.db.sql import SQL
# Compare raw vs simplified for a sample
raw = SQL.read("""
SELECT * FROM backend.xrd_profex
WHERE sample_id = 'YOUR_SAMPLE_ID'
ORDER BY time_created DESC
""")
simplified = SQL.read("""
SELECT * FROM analytical.xrd_profex_simplified
WHERE sample_id = 'YOUR_SAMPLE_ID'
""")
Understanding Crystallinity Adjustment
To understand how crystallinity affects XRD values:
from shared.db.sql import SQL
# Compare unadjusted vs adjusted
unadjusted = SQL.read("""
SELECT sample_id, calcite, dolomite, quartz
FROM analytical.xrd_profex_simplified
WHERE sample_id = 'YOUR_SAMPLE_ID'
""")
adjusted = SQL.read("""
SELECT sample_id, calcite, dolomite, quartz
FROM analytical.xrd_crystallinity_adjusted
WHERE sample_id = 'YOUR_SAMPLE_ID'
""")
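Once both results are loaded, placing them side by side makes the adjustment easier to read. The sketch below uses made-up frames standing in for the two query results, and the derived column implied_crystallinity_pct is an illustrative name rather than a field in either table.

```python
import pandas as pd

# Made-up stand-ins for the two query results above.
unadjusted = pd.DataFrame(
    {"sample_id": ["YOUR_SAMPLE_ID"], "calcite": [50.0], "quartz": [30.0]}
)
adjusted = pd.DataFrame(
    {"sample_id": ["YOUR_SAMPLE_ID"], "calcite": [40.0], "quartz": [24.0]}
)

# Align the two tables on sample_id, with suffixes marking the source.
comparison = unadjusted.merge(adjusted, on="sample_id", suffixes=("_raw", "_adj"))

# Inverting the adjustment formula recovers the crystallinity that was applied.
comparison["implied_crystallinity_pct"] = (
    comparison["calcite_adj"] / comparison["calcite_raw"] * 100
)
print(comparison)
```

If the implied crystallinity differs between minerals for the same sample, the two tables were probably built from different replicate selections.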
XRD Spectra
Raw XRD spectra (diffraction patterns with 2-theta angles and intensity values) are handled separately in xrd_spectra.py. These are used for:
- Advanced analysis
- Model validation
- Peak identification