Quality Control
Tests
We have developed four kinds of quality control (QC) tests. All of them are applied to minutely data.
-
Range Test
Data is flagged when it falls outside a predefined interval, that is, when a value is abnormally high or abnormally low.
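As a minimal sketch of the range test, assuming the data arrive as a plain list of numbers: the function name `range_test` and the bounds `lower` and `upper` are hypothetical, and treating missing values as unflagged is an assumption made here for illustration.

```python
import math


def range_test(values, lower, upper):
    """Return one boolean per value; True means the value is flagged."""
    flags = []
    for v in values:
        if v is None or math.isnan(v):
            # Assumption: missing values are skipped rather than flagged.
            flags.append(False)
        else:
            # Flag anything outside the closed interval [lower, upper].
            flags.append(v < lower or v > upper)
    return flags


temps = [24.1, 24.3, 51.0, 24.2, -12.0]
print(range_test(temps, lower=0.0, upper=45.0))
# [False, False, True, False, True]
```

The bounds used in the example are illustrative only; in practice they would come from the predefined interval for each variable.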
-
Step Test
Data is flagged when the absolute difference between a value and the value immediately before it falls outside a predefined interval. This difference is called a “step”.
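The step test can be sketched in the same way. This version assumes the predefined interval reduces to a single upper limit on the step, here called `max_step` (a hypothetical name), and that the first value is never flagged because it has no predecessor.

```python
def step_test(values, max_step):
    """Return one boolean per value; True means the step into it is too large."""
    if not values:
        return []
    flags = [False]  # the first value has no previous value to compare with
    for prev, cur in zip(values, values[1:]):
        # The "step" is the absolute difference from the previous value.
        flags.append(abs(cur - prev) > max_step)
    return flags


temps = [24.0, 24.2, 29.5, 29.6]
print(step_test(temps, max_step=2.0))
# [False, False, True, False]
```

Only the value that follows the jump is flagged here; whether the value before the jump should also be treated as suspect is a design choice this sketch does not address.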
-
Persistency Test
Data is flagged when the value has remained exactly the same as the values before it for at least 3 consecutive hours. Such persistence is abnormal because any real time series is expected to fluctuate to some degree.
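One way to sketch the persistency test is to count the length of the current run of identical values. The sketch assumes gap-free one-minute sampling, so that 3 hours corresponds to 180 consecutive samples; the name `min_run` is hypothetical.

```python
def persistency_test(values, min_run=180):
    """Return one boolean per value; True once a run of identical values
    has reached min_run consecutive samples."""
    flags = []
    run = 0
    prev = object()  # sentinel that compares unequal to any reading
    for v in values:
        # Extend the run if the value is unchanged, otherwise start a new run.
        run = run + 1 if v == prev else 1
        prev = v
        flags.append(run >= min_run)
    return flags


# A short window is used here only to keep the example readable.
print(persistency_test([5.0, 5.0, 5.0, 5.0, 5.1], min_run=3))
# [False, False, True, True, False]
```

Note that this sketch flags values only from the point where the run reaches `min_run`; the earlier values in the same run are not flagged retroactively.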
-
Spatial Regression Test
Data is flagged when the difference between the value observed at a station and the values observed at other stations is not consistent with the historical statistics.
Note that a datum flagged by the Spatial Regression Test (SRT) does not necessarily differ strongly from the values at other stations. The flag indicates that the spatial difference is statistically inconsistent with its historical behaviour, not that the difference is large.
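The distinction between a large difference and a statistically inconsistent one can be illustrated with a deliberately simplified stand-in. This is not the SRT algorithm itself: it merely compares the current difference between the target station and the mean of the other stations with the historical mean and standard deviation of that difference. The function name, the argument names, and the threshold `k` are all hypothetical.

```python
from statistics import mean, stdev


def spatial_consistency_flag(target, neighbours, hist_diffs, k=3.0):
    """Simplified illustration, not the published SRT.

    target     -- current value at the target station
    neighbours -- current values at the other stations
    hist_diffs -- past values of (target - mean of neighbours), at least two
    k          -- assumed tolerance in historical standard deviations
    """
    diff = target - mean(neighbours)
    mu = mean(hist_diffs)
    sigma = stdev(hist_diffs)
    # Flag when today's difference departs from its usual behaviour.
    return abs(diff - mu) > k * sigma


# Hypothetical station that has historically read about 3 degrees cooler.
history = [-3.1, -2.9, -3.0, -3.2, -2.8]
others = [25.0, 25.1, 24.9]

print(spatial_consistency_flag(22.0, others, history))  # False
print(spatial_consistency_flag(25.0, others, history))  # True
```

In the first call the target is 3 ℃ cooler than the other stations, which is a large difference but matches the history, so it is not flagged. In the second call the target agrees with the other stations, yet it is flagged because that agreement is historically unusual.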
Meaning of QC Results
-
QC flags for minutely data
- '3': baseline threshold exceeded (e.g. temperature > 45 ℃)
- '2': threshold based on HKO monthly data exceeded (e.g. temperature > 33 ℃ in December)
- '1': threshold based on the station's own CoWIN monthly data exceeded
- '0': good data
- 'X': test is not carried out
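The tiered flags above could be assigned along the following lines. This is only a sketch under stated assumptions: it handles upper thresholds only, it returns the most severe flag that applies, and it returns 'X' when no value is available. The function and parameter names are hypothetical, and the station threshold of 30.0 in the example is invented for illustration.

```python
def minutely_flag(value, baseline_max, hko_monthly_max, station_monthly_max):
    """Return the QC flag for one minutely value (upper thresholds only)."""
    if value is None:
        return "X"  # assumption: the test is not carried out without a value
    if value > baseline_max:
        return "3"  # baseline threshold exceeded
    if value > hko_monthly_max:
        return "2"  # HKO monthly threshold exceeded
    if value > station_monthly_max:
        return "1"  # station's own monthly threshold exceeded
    return "0"      # good data


for temp in (46.0, 34.0, 31.0, 25.0, None):
    print(temp, minutely_flag(temp, 45.0, 33.0, 30.0))
# 46.0 3
# 34.0 2
# 31.0 1
# 25.0 0
# None X
```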
-
QC flags for hourly data
- 'X': no data at all
- 'M': missing too many time steps
- 'F': too many minutely values failed QC
- 'G': good data
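An hourly flag could be derived from the minutely flags roughly as follows. The limits `max_missing` and `max_failed` are hypothetical placeholders, since the actual limits are not stated here, and the sketch assumes that the minutely flags '1', '2' and '3' count as failures while '0' and 'X' do not.

```python
def hourly_flag(minute_flags, expected=60, max_missing=10, max_failed=5):
    """Return the hourly QC flag from the minutely flags reported in that hour.

    minute_flags -- one flag per minute that actually reported data
    """
    if not minute_flags:
        return "X"  # no data at all
    if expected - len(minute_flags) > max_missing:
        return "M"  # missing too many time steps
    failed = sum(1 for f in minute_flags if f in ("1", "2", "3"))
    if failed > max_failed:
        return "F"  # too many minutely values failed QC
    return "G"      # good data


print(hourly_flag([]))                        # X
print(hourly_flag(["0"] * 40))                # M
print(hourly_flag(["0"] * 52 + ["3"] * 8))    # F
print(hourly_flag(["0"] * 58 + ["1"] * 2))    # G
```

The checks are ordered so that missing data takes precedence over failed data, which mirrors the order of the flags in the list but is itself an assumption.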
Reference
For details of the algorithms behind the QC tests, see Chang et al. (2021): https://doi.org/10.1016/j.uclim.2021.100816