Environmental Statistics: Contaminant Source Attribution Using PCA – PFAS, an Example

Environmental pollutants can adversely impact atmospheric, groundwater, surface water, and/or soil conditions. Excess nutrients, pesticides, metals, organic pollutants, and other toxics can be detrimental to aquatic life, wildlife, habitat, and human health. Some contaminants are not only known to be toxic, but also carcinogenic, mutagenic, and/or teratogenic. Due to the potential risks to human health and the environment, and corresponding liabilities associated with environmental contaminants, a clear understanding of their source is key to pollution prevention and management.  

While some contaminants may be ubiquitous in nature, their natural concentrations are often small compared to those from anthropogenic origins. It is possible, however, for natural concentrations to exceed conservative risk-based thresholds, which makes it important to differentiate between natural and anthropogenic contributions. In addition, contaminants can come from multiple and/or different anthropogenic sources, making it difficult to identify a site’s primary contamination source(s). Several statistical methods can be applied to assist in determining source attribution.

Environmental Statistics: How PCA Helps with Source Attribution 

Principal component analysis (PCA) is a multivariate statistical method that reduces the dimensionality of a set of variables, and it is often used to “summarize” complex datasets. For example, PCA can condense analytical results for a large number of constituents in hundreds of samples into a small number (e.g., 2-4) of components while retaining most of the information about variation within the sample population. In turn, each component can help delineate likely contaminant source(s), and the constituents driving each component can be identified and interpreted. PCA can thus be used to recognize the set of compounds that best captures similarities and differences among environmental samples, and those constituents can then be used to establish a profile for each source.
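To make the dimensionality reduction concrete, the following is a minimal sketch of PCA computed with a singular value decomposition in NumPy. It is not the workflow used for the case study: the data are synthetic, the two "sources" and the helper name `pca` are hypothetical, and standardizing each constituent before the decomposition is an assumption (other scalings or transformations may be more appropriate for real concentration data).

```python
import numpy as np

def pca(X, n_components=2):
    """PCA via SVD on standardized columns (samples in rows, constituents in columns)."""
    X = np.asarray(X, dtype=float)
    # Standardize each constituent so high-concentration analytes do not dominate (an assumption).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal axes; S holds the singular values.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)      # fraction of total variance per component
    scores = Z @ Vt[:n_components].T     # sample coordinates on the retained components
    loadings = Vt[:n_components]         # weight of each constituent on each component
    return scores, loadings, explained

# Synthetic data for illustration only: 20 samples x 5 constituents,
# generated as mixtures of two hypothetical source profiles plus noise.
rng = np.random.default_rng(0)
source_a = np.array([5.0, 4.0, 1.0, 0.5, 0.2])
source_b = np.array([0.3, 0.5, 1.0, 4.0, 6.0])
mix = rng.uniform(0.0, 1.0, size=(20, 1))
X = mix * source_a + (1.0 - mix) * source_b + rng.normal(0.0, 0.05, size=(20, 5))

scores, loadings, explained = pca(X, n_components=2)
cumulative = np.cumsum(explained)        # variance captured by the first k components
print("variance explained per component:", np.round(explained, 3))
print("cumulative variance explained:", np.round(cumulative, 3))
```

The `scores` are what would be plotted to look for sample groupings, the `loadings` indicate which constituents drive each component, and `cumulative` is the quantity typically reported as the percentage of variance associated with the first two or three components.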

Ideally, if all contaminant sources have been characterized, PCA can separate each source and identify the likely origin of the contamination at a site. PCA itself does not require prior knowledge of the number of sources or their compositions; it extracts source groupings from the data according to their different source markers.

For instance, PCA has been successfully used to assess polycyclic aromatic hydrocarbon (PAH) and ambient PM2.5 datasets and to identify the probable contamination source(s). PCA can also be applied to identify a holistic per- and polyfluoroalkyl substances (PFAS) signature from source materials. From such PFAS signatures, the probable origin(s) of samples collected from a site could be determined (see case study below).

Scientifically defensible decision-making generally requires multiple lines of evidence, which can include robust statistical analyses when sample size is sufficient and data quality is good. Some contaminant sources have similar profiles, so PCA may fail to separate all sources. As such, PCA can be used as an initial step in source apportionment and then supplemented with another type of analysis, such as multiple linear regression, factor analysis or positive matrix factorization (PMF) modeling, molecular diagnostic ratios, or a chemical mass balance model. For instance, PMF takes into account the uncertainty associated with sampling and laboratory analysis and can add robustness to PCA source apportionment results.

Case Study: Using PCA to Evaluate PFAS Signatures at Four Airports 

Remind Me – What is PFAS? 

PFAS is an umbrella term for thousands of individual per- and polyfluorinated compounds with varying chemical compositions. PFAS are present in products such as aqueous film-forming foams (AFFFs), which have been produced by different manufacturers across multiple decades (1960s to present). Each of these products likely has a unique PFAS chemical fingerprint. When PFAS are detected at a site, litigation may ensue as stakeholders work to determine the discrete source(s) of contamination. The widespread use of PFAS-containing materials can make source attribution a difficult task, especially in industrial areas where multiple sources may contribute. Environmental statistical methods such as PCA can be powerful tools that provide insights into PFAS source attribution.

The Case Study 

Publicly available PFAS groundwater concentration data from four airports in California (Buchanan Airport, Monterey Airport, San Jose [SJ] Airport, and San Luis Obispo [SLO] Airport) were obtained through GeoTracker (downloaded on April 7, 2021). Only data pertaining to the following PFAS were retained for this example:

  • Perfluorooctanoic acid
  • Perfluorooctane sulfonic acid
  • Perfluoroheptanoic acid
  • Perfluorobutanesulfonic acid
  • Perfluorohexanesulfonic acid
  • Perfluoropentanoic acid
  • Perfluorohexanoic acid

These constituents were detected in all 52 samples collected between November 2019 and November 2020 (sample sizes: Buchanan n1 = 10, Monterey n2 = 6, SJ n3 = 5, and SLO n4 = 31).

The PCA (Figures 1 and 2) illustrates the following: 

  • Using the above seven PFAS, 78.26% of the observed variance appears to be associated with 2 Principal Components (PC1 and PC2), and 91.53% with 3 PCs.  

  • SLO Airport and Buchanan Airport have the same overall PFAS signature as SJ Airport, but with different concentration ranges of perfluorobutanesulfonic acid, perfluorohexanesulfonic acid, perfluoropentanoic acid, and perfluorohexanoic acid that could potentially allow differentiation among the three airports. This could be explained by a fundamentally common source and transport process, even if operating at different rates.

  • The Monterey Airport PFAS signature can be isolated from those of the three other airports using the relative concentrations of perfluorooctanoic acid and perfluorooctane sulfonic acid: Monterey Airport exhibits relatively higher concentrations of perfluorooctanoic acid, while the other airports tend to have a higher relative concentration of perfluorooctane sulfonic acid. This difference may be the result of a completely different source or of remediation activities.
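The idea of comparing relative rather than absolute concentrations can be illustrated with a small sketch. The concentrations below are hypothetical and are not values from the GeoTracker dataset; PFOA, PFOS, and PFHxS are the common abbreviations for perfluorooctanoic acid, perfluorooctane sulfonic acid, and perfluorohexanesulfonic acid, and the helper name `relative_profile` is an assumption made for illustration.

```python
def relative_profile(concentrations):
    """Convert concentrations to fractions of the summed total for one sample."""
    total = sum(concentrations.values())
    if total <= 0:
        raise ValueError("total concentration must be positive")
    return {name: value / total for name, value in concentrations.items()}

# Hypothetical concentrations (ng/L), NOT measured values from the case study.
sample_x = {"PFOA": 60.0, "PFOS": 20.0, "PFHxS": 20.0}
sample_y = {"PFOA": 10.0, "PFOS": 70.0, "PFHxS": 20.0}

for label, sample in (("X", sample_x), ("Y", sample_y)):
    profile = relative_profile(sample)
    dominant = max(profile, key=profile.get)  # constituent with the largest share
    rounded = {name: round(share, 2) for name, share in profile.items()}
    print(label, rounded, "dominant:", dominant)
```

Normalizing each sample to its own total removes the effect of overall contamination level, so two samples with very different magnitudes can still be recognized as sharing, or not sharing, a fingerprint.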

Using PCA, it is evident that at least two distinct contamination profiles can be established for these four airports using only the seven PFAS listed above. Using data from additional constituents could help isolate the individual signatures of the SLO, Buchanan, and SJ airports. In the same manner, for contaminated sites, the profile of each potential source may be formulated and compared with the site’s data to determine the probable source(s) of contamination. This may help determine remediation responsibility.
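One simple way to compare a site sample against candidate source profiles is a similarity score. The sketch below uses cosine similarity, which is an assumption chosen for illustration rather than the method used in this case study, and all profiles and names (`source_1`, `source_2`, `site_sample`) are hypothetical. A high score would be one line of evidence only, not proof of origin.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two profiles given as equal-length sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical relative profiles (fractions of total PFAS), same constituent order in each.
source_profiles = {
    "source_1": [0.60, 0.20, 0.20],
    "source_2": [0.10, 0.70, 0.20],
}
site_sample = [0.55, 0.25, 0.20]

# Score the site sample against every candidate source and pick the closest one.
similarities = {
    name: cosine_similarity(site_sample, profile)
    for name, profile in source_profiles.items()
}
best_match = max(similarities, key=similarities.get)

for name, value in similarities.items():
    print(f"{name}: {value:.3f}")
print("closest profile:", best_match)
```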

Figure 1. Results from principal component analysis (PCA) of seven PFAS found in samples from Buchanan Airport, Monterey Airport, San Jose (SJ) Airport, and San Luis Obispo (SLO) Airport.


Figure 2. 3D rendering of the PFAS signatures of four airports in California (Buchanan [red], Monterey [green], San Jose [blue], and San Luis Obispo [purple]) based on PCA.


Contact us 

If you have any questions or want to learn more about PCA (or other forensic methods), source attribution, or PFAS, please contact us. 





