Dataset statistics
| Number of variables | 1 |
|---|---|
| Number of observations | 13880 |
| Missing cells | 2191 |
| Missing cells (%) | 15.8% |
| Duplicate rows | 2451 |
| Duplicate rows (%) | 17.7% |
| Total size in memory | 216.9 KiB |
| Average record size in memory | 16.0 B |
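The percentages and the average record size in the table above follow from the raw counts. A minimal sketch of that arithmetic in plain Python, assuming that percentages are taken over the number of observations and that 1 KiB = 1024 B (both are assumptions, and `percent` is a hypothetical helper, not part of the profiling tool):

```python
# Figures copied from the dataset statistics table.
n_observations = 13880
missing_cells = 2191
duplicate_rows = 2451
total_size_kib = 216.9


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


missing_pct = percent(missing_cells, n_observations)
duplicate_pct = percent(duplicate_rows, n_observations)

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(missing_pct, duplicate_pct, avg_record_bytes)
```

Under these assumptions the three results agree with the reported 15.8%, 17.7% and 16.0 B.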
Variable types
| TimeSeries | 1 |
|---|---|
Timeseries statistics
| Number of series | 1 |
|---|---|
| Time series length | 13880 |
| Starting point | 1983-01-01 00:00:00 |
| Ending point | 2020-12-31 00:00:00 |
| Period | 1 day |
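The reported series length can be cross-checked against the start date, end date and period. A short sketch using only the standard library, assuming both endpoints are included in the count:

```python
from datetime import date

# Endpoints copied from the timeseries statistics table.
start = date(1983, 1, 1)
end = date(2020, 12, 31)

# With a 1-day period and both endpoints included, the number of
# timestamps is the day difference plus one.
expected_length = (end - start).days + 1

print(expected_length)
```

The result equals the reported length of 13880, which suggests that every calendar day has a row and that the missing values are stored as NaN entries rather than as absent dates.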
Alerts
| Dataset has 2451 (17.7%) duplicate rows | Duplicates |
|---|---|
| Flow has 2191 (15.8%) missing values | Missing |
| Flow is non stationary | Non stationary |
| Flow is seasonal | Seasonal |
Reproduction
| Analysis started | 2024-05-12 18:18:35.721017 |
|---|---|
| Analysis finished | 2024-05-12 18:18:37.353830 |
| Duration | 1.63 seconds |
| Missing | Q_Station_NA_25017010_ok_Missing.csv |
| Download configuration | config.json |
Flow
Numeric time series
Alerts: Missing, Non stationary, Seasonal
| Distinct | 6059 |
|---|---|
| Distinct (%) | 51.8% |
| Missing | 2191 |
| Missing (%) | 15.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 217.19506 |
| Minimum | 11.67 |
| Maximum | 1191 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 216.9 KiB |
Quantile statistics
| Minimum | 11.67 |
|---|---|
| 5-th percentile | 35.744 |
| Q1 | 84.65 |
| median | 175.6 |
| Q3 | 295.7 |
| 95-th percentile | 565.08 |
| Maximum | 1191 |
| Range | 1179.33 |
| Interquartile range (IQR) | 211.05 |
Descriptive statistics
| Standard deviation | 173.39424 |
|---|---|
| Coefficient of variation (CV) | 0.79833419 |
| Kurtosis | 3.2712754 |
| Mean | 217.19506 |
| Median Absolute Deviation (MAD) | 100.29 |
| Skewness | 1.5781312 |
| Sum | 2538793 |
| Variance | 30065.562 |
| Monotonicity | Not monotonic |
| Augmented Dickey-Fuller test p-value | 3.366965046 × 10⁻¹⁶ |
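Several of the derived statistics can be recomputed from the other reported values, which is a quick way to check the tables for consistency. A minimal sketch in plain Python; the inputs are copied from the tables above, and the definitions used (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1) are the conventional ones, assumed here rather than confirmed by the report:

```python
import math

# Values copied from the quantile and descriptive statistics tables.
mean = 217.19506
std = 173.39424
total = 2538793
minimum, maximum = 11.67, 1191
q1, q3 = 84.65, 295.7

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
n_valid = round(total / mean)    # implied number of non-missing values

print(cv, variance, value_range, iqr, n_valid)

# Compare with the reported figures, allowing for display rounding.
assert math.isclose(cv, 0.79833419, rel_tol=1e-6)
assert math.isclose(variance, 30065.562, rel_tol=1e-6)
assert math.isclose(value_range, 1179.33, abs_tol=1e-6)
assert math.isclose(iqr, 211.05, abs_tol=1e-6)
```

The implied count of non-missing values, 11689, equals 13880 observations minus 2191 missing values, so the sum, the mean and the missing count are mutually consistent.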
Histogram with fixed size bins (bins=50)
Gap statistics
| number of gaps | 49 |
|---|---|
| min | 3 days |
| max | 2 years and 4 days |
| mean | 5 weeks, 5 days and 17 hours |
| std | 17 weeks, 1 day and 10 hours |
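The report does not state how a gap is defined. As an illustration only, the sketch below treats a gap as a run of consecutive missing daily values and returns the start, end and length of each run. `find_gaps` and its `min_days` threshold are hypothetical names introduced here; the profiler's own definition may differ, so this code is not expected to reproduce the 49 gaps reported above.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple

Gap = Tuple[date, date, int]  # (first missing day, last missing day, length in days)


def find_gaps(start: date, values: List[Optional[float]], min_days: int = 1) -> List[Gap]:
    """Find runs of consecutive None values in a daily series starting at `start`."""
    gaps: List[Gap] = []
    run_start: Optional[int] = None

    def close_run(end_offset: int) -> None:
        # Record the run [run_start, end_offset) if it is long enough.
        length = end_offset - run_start
        if length >= min_days:
            first = start + timedelta(days=run_start)
            last = start + timedelta(days=end_offset - 1)
            gaps.append((first, last, length))

    for offset, value in enumerate(values):
        if value is None:
            if run_start is None:
                run_start = offset
        elif run_start is not None:
            close_run(offset)
            run_start = None

    if run_start is not None:  # the series ends inside a gap
        close_run(len(values))
    return gaps


# Toy series (made-up values, not taken from the dataset).
toy = [None, None, 5.0, 6.0, None, 7.0, None, None, None]
all_gaps = find_gaps(date(1983, 1, 1), toy)
long_gaps = find_gaps(date(1983, 1, 1), toy, min_days=2)
print(all_gaps)
print(long_gaps)
```

Raising `min_days` drops short runs, which is one possible reason a reported minimum gap can be longer than a single day.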
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 44.66 | 63 | 0.5% |
| 159.5 | 51 | 0.4% |
| 191.8 | 22 | 0.2% |
| 208 | 19 | 0.1% |
| 203.9 | 18 | 0.1% |
| 183.7 | 17 | 0.1% |
| 167.6 | 17 | 0.1% |
| 79.22 | 17 | 0.1% |
| 199.9 | 16 | 0.1% |
| 368.8 | 15 | 0.1% |
| Other values (6049) | 11434 | 82.4% |
| (Missing) | 2191 | 15.8% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 11.67 | 1 | < 0.1% |
| 11.88 | 3 | < 0.1% |
| 12 | 14 | 0.1% |
| 12.3 | 1 | < 0.1% |
| 12.6 | 5 | < 0.1% |
| 12.75 | 2 | < 0.1% |
| 12.8 | 1 | < 0.1% |
| 12.9 | 1 | < 0.1% |
| 13.01 | 1 | < 0.1% |
| 13.05 | 1 | < 0.1% |
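Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1191 | 1 | < 0.1% |
| 1187 | 1 | < 0.1% |
| 1163 | 1 | < 0.1% |
| 1161 | 1 | < 0.1% |
| 1159 | 1 | < 0.1% |
| 1155 | 1 | < 0.1% |
| 1145 | 1 | < 0.1% |
| 1109 | 2 | < 0.1% |
| 1104 | 2 | < 0.1% |
| 1087 | 1 | < 0.1% |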
ACF and PACF
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
First rows
| Date | Flow |
|---|---|
| 1983-01-01 | NaN |
| 1983-01-02 | NaN |
| 1983-01-03 | NaN |
| 1983-01-04 | NaN |
| 1983-01-05 | NaN |
| 1983-01-06 | NaN |
| 1983-01-07 | NaN |
| 1983-01-08 | NaN |
| 1983-01-09 | NaN |
| 1983-01-10 | NaN |
Last rows
| Date | Flow |
|---|---|
| 2020-12-22 | 88.050 |
| 2020-12-23 | 81.075 |
| 2020-12-24 | 75.888 |
| 2020-12-25 | 75.263 |
| 2020-12-26 | 74.700 |
| 2020-12-27 | 82.875 |
| 2020-12-28 | 98.075 |
| 2020-12-29 | 88.200 |
| 2020-12-30 | 78.813 |
| 2020-12-31 | 71.325 |
Most frequently occurring
| | Flow | # duplicates |
|---|---|---|
| 2450 | NaN | 2191 |
| 246 | 44.66 | 63 |
| 1017 | 159.50 | 51 |
| 1239 | 191.80 | 22 |
| 1346 | 208.00 | 19 |
| 1317 | 203.90 | 18 |
| 504 | 79.22 | 17 |
| 1075 | 167.60 | 17 |
| 1188 | 183.70 | 17 |
| 1293 | 199.90 | 16 |