wavy.triple_collocation
Functions
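- filter_collocation_distance: Filters the datasets according to a maximum collocation distance between satellite and in-situ observations.
- filter_values: Filters the values of each data series given as input.
- filter_dynamic_collocation: Filters data where the two given model series differ by more than a given relative difference.
- remove_nan: Finds the indices of NaN values in each of the three lists and returns the filtered lists.
- triple_collocation: Runs the triple collocation on a dictionary containing three measurements and returns the results in a dictionary.
- get_CDF
- CDF_matching_cal
- calibration_triplets_cdf_matching
- calibration_triplets_tc: Calibrates A and B relative to R using triple collocation calibration constant estimates, following the method of Gruber et al., 2016.
- least_squares_merging: Merges the three data series given as input following the least squares merging method described in Yilmaz et al., 2012.
- get_mean_spectra: Divides a given time series into samples of a given size, applies a window to each sample and calculates the power spectrum of each sample using a fast Fourier transform.
- integrate_r2: Estimates the representativeness error r2 by integrating the difference between the average power spectra of the model and the observations.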
Module Contents
- wavy.triple_collocation.filter_collocation_distance(data, dist_max, name)
Filters the datasets according to a maximum collocation distance between satellite and in-situ observations.
data (dict of wavy objects): wavy objects to filter given the maximum collocation distance. One of the objects must contain the collocation distance.
dist_max (float): maximum collocation distance in km.
name (string): key of the dictionary that refers to the wavy object containing the distance.
returns: data_filtered (dict of wavy objects): dictionary of the wavy objects filtered using the given maximum collocation distance.
- wavy.triple_collocation.filter_values(data, ref_data, min=0.0, max=25.0, return_ref_data=False)
Filters the values of each data series given as input.
data (dict of arrays): data to filter.
ref_data (string or array): either a string corresponding to a key in data or an array. The values of all series are filtered with respect to ref_data.
min (float): minimum value that ref_data should take.
max (float): maximum value that ref_data should take.
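The filtering described above can be pictured as a mask computed on the reference series and applied to every series. The following is a minimal sketch of that idea in plain Python, not wavy's implementation: the helper name `filter_by_reference` is hypothetical, and treating the bounds as inclusive is an assumption.

```python
def filter_by_reference(data, ref_data, vmin=0.0, vmax=25.0):
    """Keep only the positions where the reference series lies in [vmin, vmax].

    Hypothetical helper; inclusive bounds are an assumption.
    """
    # ref_data is either a key of `data` or a sequence of values.
    ref = data[ref_data] if isinstance(ref_data, str) else ref_data
    keep = [vmin <= v <= vmax for v in ref]
    # Apply the same mask to every series so they stay aligned.
    return {name: [v for v, k in zip(values, keep) if k]
            for name, values in data.items()}


data = {
    "sat": [1.2, 30.0, 2.5, 0.8],
    "buoy": [1.1, 2.0, 2.7, -1.0],
    "model": [1.3, 2.1, 2.4, 0.9],
}
# Only the last position is dropped, because the reference ("buoy") is negative there.
filtered = filter_by_reference(data, "buoy")
```

Note that the out-of-range satellite value (30.0) is kept in this sketch, since only the reference series decides which positions survive.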
- wavy.triple_collocation.filter_dynamic_collocation(data, mod_1, mod_2, max_rel_diff=0.05)
Filters data where the two given model series differ by more than a given relative difference. The dynamic collocation filtering method follows Dodet et al., 2025.
data (dict of lists): data to filter.
mod_1 (string or list): either the key in data of the first model series, or the list of model values directly.
mod_2 (string or list): either the key in data of the second model series, or the list of model values directly.
max_rel_diff (float): maximum relative difference (abs(mod_1-mod_2)/mod_1) allowed between values from mod_1 and mod_2.
returns: data_filtered (dict of lists): filtered data.
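The relative-difference criterion can be illustrated with a short, self-contained sketch. It is not wavy's code: the helper name `filter_relative_difference` is hypothetical, the denominator uses abs(mod_1) (identical to the documented formula for positive values such as wave heights), and dropping positions where mod_1 is zero is an assumption made to avoid a division by zero.

```python
def filter_relative_difference(data, mod_1, mod_2, max_rel_diff=0.05):
    """Drop positions where two model series disagree by more than max_rel_diff.

    Hypothetical helper illustrating abs(mod_1 - mod_2) / mod_1 <= max_rel_diff.
    """
    m1 = data[mod_1] if isinstance(mod_1, str) else mod_1
    m2 = data[mod_2] if isinstance(mod_2, str) else mod_2
    # Assumption: positions where mod_1 is zero are dropped.
    keep = [a != 0 and abs(a - b) / abs(a) <= max_rel_diff
            for a, b in zip(m1, m2)]
    return {name: [v for v, k in zip(values, keep) if k]
            for name, values in data.items()}


data = {
    "obs": [2.0, 2.1, 3.0],
    "mod_a": [2.0, 2.0, 3.0],
    "mod_b": [2.05, 2.5, 3.1],
}
# The middle position is removed: the two models differ by 25 % there.
filtered = filter_relative_difference(data, "mod_a", "mod_b", max_rel_diff=0.05)
```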
- wavy.triple_collocation.remove_nan(A, B, C)
Finds the indices of NaN values in each of the three lists and returns the filtered lists.
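As an illustration, the sketch below removes a position from all three lists as soon as any of them holds a NaN there, which keeps the series aligned. That alignment rule is an assumption about the intended behaviour, and `drop_nan_triplets` is a hypothetical name, not part of wavy.

```python
import math


def drop_nan_triplets(A, B, C):
    """Return the three lists without the positions where any of them is NaN."""
    keep = [not (math.isnan(a) or math.isnan(b) or math.isnan(c))
            for a, b, c in zip(A, B, C)]
    return tuple([v for v, k in zip(series, keep) if k]
                 for series in (A, B, C))


nan = float("nan")
A, B, C = drop_nan_triplets(
    [1.0, nan, 3.0, 4.0],
    [1.1, 2.0, nan, 4.1],
    [0.9, 2.1, 3.1, 4.2],
)
```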
- wavy.triple_collocation.triple_collocation(data, metrics=['var', 'rmse', 'si', 'rho', 'mean', 'std'], r2=0, ref=None, dec=3)
Runs the triple collocation on a dictionary containing three measurements and returns the results in a dictionary.
data: {‘name of measurement’: list of values}.
metrics: the string “all” or a list of the metrics to return, among ‘var’, ‘rmse’, ‘si’, ‘rho’, ‘sensitivity’, ‘snr’, ‘snr_db’, ‘fmse’, ‘mean’, ‘std’.
r2: representativeness error or cross-correlation error between the first two measurements in data. Default 0.
ref: name of one of the measurements; must correspond to a key of data. Defaults to the first key of data.
dec: number of decimals to round the results to. Default 3.
returns: dict of dicts of the metrics for each measurement, {‘name of measurement’: {‘metric name’: metric}}.
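To make the ‘var’ metric more concrete, the sketch below estimates the error variance of each series with the standard covariance formulation of triple collocation. It is an independent illustration and not wavy's implementation: `covariance` and `tc_error_variances` are hypothetical helpers, the three error terms are assumed mutually uncorrelated (the r2 = 0 case), and no rescaling to a reference is applied.

```python
import random


def covariance(u, v):
    """Sample covariance of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)


def tc_error_variances(data):
    """Error variance of each of three collocated series.

    Uses var_err(x) = Cov(x, x) - Cov(x, y) * Cov(x, z) / Cov(y, z),
    which assumes mutually uncorrelated errors.
    """
    names = list(data)
    if len(names) != 3:
        raise ValueError("exactly three series are required")
    result = {}
    for i, name in enumerate(names):
        x = data[name]
        y, z = (data[n] for j, n in enumerate(names) if j != i)
        result[name] = (covariance(x, x)
                        - covariance(x, y) * covariance(x, z) / covariance(y, z))
    return result


# Synthetic check: one common signal plus independent Gaussian noise
# with standard deviations 0.2, 0.3 and 0.4.
rng = random.Random(0)
truth = [rng.gauss(2.0, 1.0) for _ in range(20000)]
data = {
    "sat": [t + rng.gauss(0.0, 0.2) for t in truth],
    "buoy": [t + rng.gauss(0.0, 0.3) for t in truth],
    "model": [t + rng.gauss(0.0, 0.4) for t in truth],
}
err_var = tc_error_variances(data)
```

With enough samples, the estimates in `err_var` should approach the squared noise levels used to build the synthetic series (0.04, 0.09 and 0.16).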
- wavy.triple_collocation.get_CDF(data, step, llim=None, ulim=None, data_min=None, data_max=None, dec=3, no_empty_bins=True)
- wavy.triple_collocation.CDF_matching_cal(old, CDF_old, CDF_new)
- wavy.triple_collocation.calibration_triplets_cdf_matching(data, ref, step, seed=5)
- wavy.triple_collocation.calibration_triplets_tc(data, ref, r2=0, return_cal_cst=False)
Calibrates A and B relative to R using triple collocation calibration constant estimates, following the method of Gruber et al., 2016.
data (dict of lists of floats): dictionary of the data to calibrate.
ref (string): name of the reference data to use for calibration.
r2 (float): representativeness error.
return_cal_cst (bool): if True, returns a dictionary of the calibration constants in addition to the calibrated data.
returns: data_cal (dict of lists of floats): dictionary of the calibrated data series.
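One common way to obtain such calibration constants is to rescale each series with a ratio of covariances, so that its signal amplitude matches that of the reference. The sketch below shows this idea under stated assumptions and is not wavy's code: `calibrate_to_reference` is a hypothetical helper, errors are assumed uncorrelated (r2 = 0), and matching the mean of the reference is an additional assumption.

```python
import random


def covariance(u, v):
    """Sample covariance of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)


def calibrate_to_reference(data, ref):
    """Rescale the two non-reference series to the reference data space.

    For a series A and third series B: scale_A = Cov(R, B) / Cov(A, B).
    """
    names = [n for n in data if n != ref]
    if len(names) != 2:
        raise ValueError("expected a reference and two other series")
    r = data[ref]
    mean_r = sum(r) / len(r)
    calibrated = {ref: list(r)}
    for name, other in (names, names[::-1]):
        s = data[name]
        scale = covariance(r, data[other]) / covariance(s, data[other])
        mean_s = sum(s) / len(s)
        # Assumption: the calibrated series is also shifted to the reference mean.
        calibrated[name] = [mean_r + scale * (v - mean_s) for v in s]
    return calibrated


# Synthetic example: "A" is biased and over-scaled relative to "R".
rng = random.Random(1)
truth = [rng.gauss(2.0, 1.0) for _ in range(5000)]
data = {
    "R": [t + rng.gauss(0.0, 0.2) for t in truth],
    "A": [0.5 + 1.5 * t + rng.gauss(0.0, 0.3) for t in truth],
    "B": [t + rng.gauss(0.0, 0.3) for t in truth],
}
calibrated = calibrate_to_reference(data, "R")
```

By construction, the calibrated "A" has the same covariance with "B" as the reference does, which is the property the scaling constant is chosen to enforce.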
- wavy.triple_collocation.least_squares_merging(data, tc_results=None, return_var=False, **kwargs)
Merges the three data series given as input following the least squares merging method described in Yilmaz et al., 2012.
data (dict of lists of floats): dictionary of the data to merge.
tc_results (pandas DataFrame): table of the triple collocation results for the given data. Must contain the variance. If None, the triple collocation is performed using the given data and kwargs.
return_var (bool): if True, returns the variance of the error of the merged data in addition to the merged data.
returns: least_squares_merge (numpy array): series of merged data; least_squares_var (float): variance of the error of the merged data, returned when return_var is True.
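For series with uncorrelated, zero-mean errors, a least squares merge reduces to an average weighted by the inverse of each error variance. The sketch below illustrates that weighting in plain Python; `merge_least_squares` is a hypothetical helper rather than wavy's function, and the error variances are passed in directly instead of being read from a triple collocation result table.

```python
def merge_least_squares(data, err_var):
    """Inverse-error-variance weighted merge of collocated series.

    Returns the merged series and the error variance of the merge.
    Assumes calibrated series with mutually uncorrelated errors.
    """
    inverse = {name: 1.0 / err_var[name] for name in data}
    total = sum(inverse.values())
    weights = {name: inverse[name] / total for name in data}
    n = len(next(iter(data.values())))
    merged = [sum(weights[name] * data[name][i] for name in data)
              for i in range(n)]
    return merged, 1.0 / total


data = {
    "sat": [2.0, 4.0],
    "buoy": [4.0, 4.0],
    "model": [0.0, 8.0],
}
# Illustrative error variances: the satellite is trusted twice as much.
err_var = {"sat": 1.0, "buoy": 2.0, "model": 2.0}
merged, merged_var = merge_least_squares(data, err_var)
```

The error variance of the merge is never larger than the smallest input error variance, which is the motivation for merging in the first place.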
- wavy.triple_collocation.get_mean_spectra(ds, varname, fs, nsample, median_step=None, mode='average', window='hamming')
Divides a given time series into samples of a given size, applies a window to each sample and calculates the power spectrum of each sample using a fast Fourier transform. Returns the frequencies and either the list of spectra for each sample or the average spectrum over all samples.
ds (xarray dataset): xarray dataset with a time dimension.
varname (str): name of the variable for which the spectra are to be computed; it must be present in ds and indexed by time.
fs (float): sampling frequency.
nsample (int): number of points in each sample.
median_step (np.timedelta64): time step to consider between consecutive points of the time series.
mode (str): either ‘average’ to return the mean power spectrum or ‘list’ to return the list of power spectra.
window (str): window to apply to the samples before applying the FFT. See scipy.signal.periodogram for options.
returns: df_spectra (pandas DataFrame): dataframe containing the mean power spectrum, or the power spectrum of each sample, and the corresponding frequencies.
- wavy.triple_collocation.integrate_r2(PS_mod, PS_obs, f, threshold=np.inf, threshold_type='inv_freq')
Estimates the representativeness error r2 by integrating the difference between the average power spectra of the model and the observations. Integrates up to a given threshold resolution (inverse of the frequency), or from a given threshold frequency.
PS_mod (numpy array): power spectrum of the model.
PS_obs (numpy array): power spectrum of the observations.
f (numpy array): frequencies of the power spectra.
threshold (float): either the upper resolution to integrate to, or the minimum frequency to integrate from.
threshold_type (str): indicates whether the threshold is a minimum frequency (‘freq’) or a maximum resolution (‘inv_freq’).
returns: r2 (float): representativeness error estimate.
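The two threshold types select the same kind of frequency band: a maximum resolution T keeps the frequencies with 1/f <= T, while a minimum frequency keeps f >= threshold. The sketch below illustrates this with a trapezoidal integration. It is not wavy's implementation: `integrate_spectral_difference` is a hypothetical helper, the sign convention (observations minus model) is an assumption, frequencies are assumed sorted in increasing order, and the zero frequency is excluded in ‘inv_freq’ mode.

```python
import math


def integrate_spectral_difference(ps_mod, ps_obs, f,
                                  threshold=math.inf,
                                  threshold_type="inv_freq"):
    """Trapezoidal integral of (ps_obs - ps_mod) over the selected band."""
    if threshold_type == "inv_freq":
        # Keep frequencies whose resolution 1/f does not exceed the threshold.
        keep = [fi > 0 and 1.0 / fi <= threshold for fi in f]
    elif threshold_type == "freq":
        # Keep frequencies at or above the threshold frequency.
        keep = [fi >= threshold for fi in f]
    else:
        raise ValueError("threshold_type must be 'freq' or 'inv_freq'")
    freqs = [fi for fi, k in zip(f, keep) if k]
    diff = [o - m for o, m, k in zip(ps_obs, ps_mod, keep) if k]
    return sum(0.5 * (diff[i] + diff[i + 1]) * (freqs[i + 1] - freqs[i])
               for i in range(len(freqs) - 1))


f = [0.0, 1.0, 2.0, 3.0, 4.0]
ps_obs = [5.0, 4.0, 3.0, 2.0, 1.0]
ps_mod = [5.0, 3.0, 1.0, 0.0, 0.0]

# Resolutions up to 0.5 correspond to frequencies from 2.0 upwards.
r2_res = integrate_spectral_difference(ps_mod, ps_obs, f, 0.5, "inv_freq")
r2_freq = integrate_spectral_difference(ps_mod, ps_obs, f, 2.0, "freq")
```

In this example both calls cover the same band, so they return the same value.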