Quality Control

Tests

We have developed four kinds of quality control (QC) tests. All of them are applied to minutely (one-minute) data.

  1. Range Test

    A datum is flagged when it falls outside a predefined interval, that is, when it is abnormally high or abnormally low.

  2. Step Test

    A datum is flagged when the absolute difference between it and the previous value falls outside a predefined interval. This difference is called a “step”.

  3. Persistency Test

    A datum is flagged when it is exactly equal to every preceding value over at least the past 3 hours. Such persistence is abnormal because any real time series is expected to show some fluctuation.

  4. Spatial Regression Test

    A datum is flagged when the difference between the value observed at a station and the values observed at other stations is inconsistent with historical statistics.
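
The first three tests can be sketched in a few lines of Python. This is a minimal illustration only, not the project's implementation: the function names, the lower bound of -10, the 5-degree step limit and the treatment of "3 hours" as a count of consecutive minutely samples are all assumptions made for the example, and the series is assumed to be regularly sampled with no gaps.

```python
def range_test(values, low, high):
    """Flag each value that lies outside the closed interval [low, high]."""
    return [not (low <= v <= high) for v in values]


def step_test(values, max_step):
    """Flag each value whose absolute step from the previous value exceeds
    max_step (assumed form of the 'predefined interval').
    The first value has no predecessor and is never flagged."""
    flags = [False]
    for prev, cur in zip(values, values[1:]):
        flags.append(abs(cur - prev) > max_step)
    return flags[:len(values)]  # handles an empty input list


def persistence_test(values, window):
    """Flag a value when it and the previous (window - 1) values are identical.
    For minutely data, a 3-hour window would be roughly window=180 (assumption)."""
    flags = []
    run = 0  # length of the current run of identical values
    for i, v in enumerate(values):
        run = run + 1 if i > 0 and v == values[i - 1] else 1
        flags.append(run >= window)
    return flags


# Hypothetical minutely temperatures (degrees Celsius).
temps = [25.0, 25.1, 60.0, 25.2, 25.2, 25.2]
range_flags = range_test(temps, low=-10.0, high=45.0)
step_flags = step_test(temps, max_step=5.0)
persist_flags = persistence_test(temps, window=3)  # short window for the demo
```

In this toy series the spike to 60.0 fails the range test, both the jump up to it and the drop back fail the step test, and the third repeated 25.2 fails the persistence test with the shortened window.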

Note that a datum flagged by the Spatial Regression Test (SRT) does not necessarily indicate a large spatial difference between the target station and other stations. Rather, it indicates a spatial difference that is statistically inconsistent with the historical record.
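
The distinction can be illustrated with a simplified stand-in for the SRT. The sketch below is not the published regression algorithm: it replaces the regression with a plain z-score check on the difference between the target station and the mean of its neighbours. The function name, the factor `k=3.0` and all numbers are hypothetical.

```python
from statistics import mean, stdev


def spatial_consistency_flag(target, neighbours, hist_diffs, k=3.0):
    """Return True when the current target-minus-neighbours difference departs
    from its historical mean by more than k historical standard deviations.

    hist_diffs holds past values of the same difference and needs at least
    two entries for the standard deviation to be defined."""
    diff = target - mean(neighbours)
    mu = mean(hist_diffs)
    sigma = stdev(hist_diffs)
    return abs(diff - mu) > k * sigma


# Hypothetical station that has historically read about 3 degrees warmer
# than its neighbours.
hist_diffs = [2.8, 3.1, 3.0, 2.9, 3.2]
neighbours = [25.0, 25.2, 24.8]

usual_offset = spatial_consistency_flag(28.0, neighbours, hist_diffs)
no_offset = spatial_consistency_flag(25.0, neighbours, hist_diffs)
```

Here a reading 3 degrees above the neighbours is not flagged because that offset is normal for the station, while a reading equal to the neighbours is flagged: the spatial difference is small, but it is inconsistent with the station's history.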

Meaning of QC Results

  • QC flags for minutely data
    • '3': baseline threshold exceeded (e.g. temperature > 45 ℃)
    • '2': threshold based on HKO monthly data exceeded (e.g. temperature > 33 ℃ in December)
    • '1': threshold based on the station's own CoWIN monthly data exceeded
    • '0': good data
    • 'X': test is not carried out
  • QC flags for hourly data
    • 'X': no data at all
    • 'M': missing too many time steps
    • 'F': too many minutely data failed QC
    • 'G': good data
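
As a rough illustration of how these flags could be assigned, the sketch below maps one value to a minutely flag and one hour of minutely flags to an hourly flag. Several details are assumptions rather than documented behaviour: only upper thresholds are checked, the most severe threshold takes precedence, a missing value is mapped to 'X', flags '1' to '3' all count as failures, and the cut-offs `min_count` and `max_failed_fraction` are hypothetical placeholders for "too many".

```python
def minutely_flag(value, baseline_max, hko_monthly_max, own_monthly_max):
    """Return the minutely flag for one value, checking the most severe
    threshold first (assumed precedence)."""
    if value is None:
        return 'X'  # assumption: no value means the test is not carried out
    if value > baseline_max:
        return '3'
    if value > hko_monthly_max:
        return '2'
    if value > own_monthly_max:
        return '1'
    return '0'


def hourly_flag(minute_flags, min_count=45, max_failed_fraction=0.2):
    """Return the hourly flag from the minutely flags available in that hour.
    Minutes with no data are simply absent from minute_flags."""
    if not minute_flags:
        return 'X'
    if len(minute_flags) < min_count:
        return 'M'
    failed = sum(f in ('1', '2', '3') for f in minute_flags)
    if failed / len(minute_flags) > max_failed_fraction:
        return 'F'
    return 'G'


# December temperature example using the thresholds quoted in the list above;
# the station's own threshold of 31 is hypothetical.
flag_hot = minutely_flag(46.0, baseline_max=45.0, hko_monthly_max=33.0, own_monthly_max=31.0)
flag_warm = minutely_flag(34.0, baseline_max=45.0, hko_monthly_max=33.0, own_monthly_max=31.0)
hour_sparse = hourly_flag(['0'] * 30)
hour_bad = hourly_flag(['0'] * 40 + ['3'] * 20)
```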

Reference

For details of the algorithms behind the QC tests, see Chang et al. (2021): https://doi.org/10.1016/j.uclim.2021.100816