Clustering (Beta)
OceanGraph offers a beta feature that uses machine learning to cluster Argo profiles by their vertical structure. The feature is experimental and is subject to the following limits and processing steps:
Profile Limit
- To reduce server load and memory usage, clustering accepts a maximum of 500 valid profiles per job.
Depth Range & Interpolation
- Only the range between 200 and 1000 dbar is used.
- Profiles are linearly interpolated every 100 dbar within this range to align them on a common vertical grid.
- The upper 200 dbar is omitted to suppress the effects of the seasonal thermocline and surface forcing.
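
The interpolation step can be sketched as follows. This is a minimal illustration, not OceanGraph's actual code: the function name `interp_to_grid` is hypothetical, and the grid is assumed to include both endpoints (200, 300, ..., 1000 dbar). Profiles that do not span the full grid return `None`, because the sketch never extrapolates.

```python
from bisect import bisect_left

# Assumed target grid: 200, 300, ..., 1000 dbar (both endpoints included).
GRID_DBAR = list(range(200, 1001, 100))


def interp_to_grid(pressure, values, grid=GRID_DBAR):
    """Linearly interpolate one profile onto `grid`.

    `pressure` must be strictly increasing. Returns None when the profile
    does not span the whole grid, since no extrapolation is attempted.
    """
    if len(pressure) < 2 or pressure[0] > grid[0] or pressure[-1] < grid[-1]:
        return None
    out = []
    for p in grid:
        i = bisect_left(pressure, p)  # first index with pressure[i] >= p
        if pressure[i] == p:
            out.append(values[i])  # grid level coincides with a sample
            continue
        p0, p1 = pressure[i - 1], pressure[i]
        v0, v1 = values[i - 1], values[i]
        out.append(v0 + (v1 - v0) * (p - p0) / (p1 - p0))
    return out


# Usage: a coarse three-point temperature profile mapped onto the common grid.
temperature_on_grid = interp_to_grid([0, 500, 1500], [20.0, 10.0, 4.0])
```

With NumPy available, `numpy.interp` performs the same linear interpolation, although it clamps values outside the sampled range instead of rejecting the profile.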
Required Variables
- Only profiles containing valid temperature and salinity data are considered.
- Profiles missing these variables or lacking coverage in the specified depth range are excluded.
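
The exclusion rules above might look like the following sketch. The record layout (a dict with `pres`, `temp`, and `psal` lists) and the function names are hypothetical, and "valid" is assumed here to mean finite, non-missing values. The real service may apply additional quality checks.

```python
import math

P_MIN, P_MAX = 200.0, 1000.0  # dbar, the depth range stated above


def is_usable(profile):
    """Return True if the profile passes the assumed validity and coverage rules."""
    pres = profile.get("pres") or []
    for name in ("temp", "psal"):
        vals = profile.get(name)
        # Reject a missing variable or one that does not match the pressure axis.
        if not vals or len(vals) != len(pres):
            return False
    # Keep only levels where pressure, temperature and salinity are all finite.
    good = [
        p
        for p, t, s in zip(pres, profile["temp"], profile["psal"])
        if all(v is not None and math.isfinite(v) for v in (p, t, s))
    ]
    # Coverage: the valid levels must bracket the whole 200 to 1000 dbar range.
    return bool(good) and min(good) <= P_MIN and max(good) >= P_MAX


def split_profiles(profiles):
    """Partition profiles into (kept, excluded) lists."""
    kept = [p for p in profiles if is_usable(p)]
    excluded = [p for p in profiles if not is_usable(p)]
    return kept, excluded
```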
Clustering Feature Vector
- Clustering is based on a feature vector composed of interpolated temperature and salinity values, combined with location data.
- Temperature and salinity vectors are standardized using z-score normalization at each depth level to ensure that variations at all depths contribute equally to the clustering process.
- Latitude is included as an additional feature, linearly scaled from the range -90° to 90° into -1 to 1 (that is, latitude divided by 90).
- Longitude λ is encoded as two features, sin(λ) and cos(λ), which keeps points on either side of the ±180° meridian close together in feature space. Both values already lie between -1 and 1, so no further normalization is applied.
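
A minimal sketch of how such a feature vector could be assembled is shown below. The function names are hypothetical, and two details are assumptions rather than documented behavior: the z-score uses the population standard deviation, and a depth level with zero variance is left at 0 instead of being divided by zero.

```python
import math
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column (one depth level) across all profiles."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    # Assumption: population std; a constant level falls back to a divisor of 1.
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def build_features(temps, salts, lats, lons):
    """Concatenate z-scored T and S with scaled latitude and sin/cos longitude.

    `temps` and `salts` hold one row per profile, already interpolated onto
    the common pressure grid. `lats` and `lons` are in degrees.
    """
    t_z = zscore_columns(temps)
    s_z = zscore_columns(salts)
    feats = []
    for t, s, lat, lon in zip(t_z, s_z, lats, lons):
        lam = math.radians(lon)
        feats.append(t + s + [lat / 90.0, math.sin(lam), math.cos(lam)])
    return feats
```

With nine pressure levels per variable, each profile would yield 9 + 9 + 3 = 21 features. Longitudes of 180° and -180° map to the same (sin, cos) pair up to rounding, which is the circular continuity described above.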
Automatic K Determination
- The number of clusters (K) is selected automatically using a simplified elbow method (with a maximum of 8 clusters).
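
The exact rule behind the "simplified elbow method" is not documented here, so the following is only one plausible reading. It assumes a clustering cost, such as the within-cluster sum of squares, has been computed for K = 1 up to 8, and it picks the K where the cost curve bends most sharply (the largest discrete second difference). The function name `pick_k_elbow` is hypothetical.

```python
MAX_K = 8  # upper bound on the number of clusters, as stated above


def pick_k_elbow(costs):
    """Choose K from clustering costs, where costs[i] belongs to K = i + 1.

    Assumption: the elbow is the K with the largest second difference
    costs[K-1] - 2*costs[K] + costs[K+1] (in terms of K, not list index).
    """
    w = list(costs[:MAX_K])
    if len(w) < 3:
        raise ValueError("need costs for at least K = 1, 2, 3")
    # bends[j] is the second difference at K = j + 2.
    bends = [w[i - 1] - 2 * w[i] + w[i + 1] for i in range(1, len(w) - 1)]
    return bends.index(max(bends)) + 2


# Usage: the cost drops steeply until K = 3 and flattens afterwards.
chosen_k = pick_k_elbow([100, 60, 20, 17, 15, 14, 13, 12.5])
```

One consequence of this particular rule is that it can only return values from 2 to 7 when eight candidates are evaluated, because the second difference is undefined at both ends of the curve.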
This feature is available to signed-in users only. While we are actively improving this system, unexpected results or limitations may occur. We appreciate your understanding during this beta period.
Note: Gray markers indicate profiles that were excluded from clustering.