Dataset statistics
| Statistic | Value |
|---|---|
| Number of variables | 1 |
| Number of observations | 13880 |
| Missing cells | 3254 |
| Missing cells (%) | 23.4% |
| Duplicate rows | 1311 |
| Duplicate rows (%) | 9.4% |
| Total size in memory | 216.9 KiB |
| Average record size in memory | 16.0 B |
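The percentages and the average record size above follow directly from the raw counts. A minimal Python sketch of that arithmetic, using only the numbers reported in this table (the variable names are illustrative, and 1 KiB is assumed to mean 1024 bytes):

```python
# Counts copied from the dataset statistics table.
n_observations = 13880
missing_cells = 3254
duplicate_rows = 1311
total_size_kib = 216.9

# Shares are relative to the number of observations
# (the dataset has a single variable, so cells == rows).
missing_pct = 100 * missing_cells / n_observations
duplicate_pct = 100 * duplicate_rows / n_observations

# Average record size in bytes, assuming 1 KiB = 1024 B.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells:       {missing_pct:.1f}%")
print(f"Duplicate rows:      {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal, the results agree with the reported 23.4%, 9.4% and 16.0 B.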
Variable types
| Type | Count |
|---|---|
| TimeSeries | 1 |
Timeseries statistics
| Statistic | Value |
|---|---|
| Number of series | 1 |
| Time series length | 13880 |
| Starting point | 1983-01-01 00:00:00 |
| Ending point | 2020-12-31 00:00:00 |
| Period | 1 day |
Alerts
| Alert | Type |
|---|---|
| Dataset has 1311 (9.4%) duplicate rows | Duplicates |
| Flow has 3254 (23.4%) missing values | Missing |
Reproduction
| Field | Value |
|---|---|
| Analysis started | 2024-05-12 19:33:17.351815 |
| Analysis finished | 2024-05-12 19:33:19.358309 |
| Duration | 2.01 seconds |
| Missing | Q_Station_NA_24017640_ok_Missing.csv |
| Download configuration | config.json |
Flow
Numeric time series
MISSING 
| Statistic | Value |
|---|---|
| Distinct | 8382 |
| Distinct (%) | 78.9% |
| Missing | 3254 |
| Missing (%) | 23.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.13673998 |
| Minimum | -1740.8 |
| Maximum | 1841.4 |
| Zeros | 32 |
| Zeros (%) | 0.2% |
| Memory size | 216.9 KiB |
Quantile statistics
| Statistic | Value |
|---|---|
| Minimum | -1740.8 |
| 5-th percentile | -287.595 |
| Q1 | -44.32925 |
| median | 5.95 |
| Q3 | 62 |
| 95-th percentile | 258.675 |
| Maximum | 1841.4 |
| Range | 3582.2 |
| Interquartile range (IQR) | 106.32925 |
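Quartiles such as -44.32925 are interpolated between neighbouring observations, and the range and IQR are simple differences of the quantiles. The sketch below illustrates this with Python's standard library on the eight non-missing values from the first ten days of the series (listed in the sample rows later in this report), so its results will not match the full-series figures above. It assumes linear interpolation (`method="inclusive"`); the interpolation rule used to produce this report is not stated.

```python
import statistics

# Non-missing Flow values for 1983-01-03 .. 1983-01-10.
sample = [0.8, -4.7, 1.421085e-14, 14.6, -16.5, -0.4, 4.8, -1.2]

# Quartiles with linear interpolation between neighbouring order statistics.
q1, median, q3 = statistics.quantiles(sample, n=4, method="inclusive")

iqr = q3 - q1                            # interquartile range
value_range = max(sample) - min(sample)  # maximum minus minimum

print(f"Q1={q1:.4f} median={median:.4f} Q3={q3:.4f}")
print(f"IQR={iqr:.4f} range={value_range:.1f}")
```

Applied to the reported quantiles, the same definitions give IQR = 62 - (-44.32925) = 106.32925 and range = 1841.4 - (-1740.8) = 3582.2, matching the table.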
Descriptive statistics
| Statistic | Value |
|---|---|
| Standard deviation | 193.65542 |
| Coefficient of variation (CV) | 1416.2312 |
| Kurtosis | 13.300936 |
| Mean | 0.13673998 |
| Median Absolute Deviation (MAD) | 53.55 |
| Skewness | -1.1424242 |
| Sum | 1452.999 |
| Variance | 37502.421 |
| Monotonicity | Not monotonic |
| Augmented Dickey-Fuller test p-value | 0 |
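Several of the descriptive statistics are functions of one another, which gives a quick consistency check on the table. The sketch below recomputes them from the reported figures only; it assumes the mean is taken over non-missing observations, which the numbers appear to support.

```python
import math

# Values copied from the tables in this report.
std = 193.65542
mean = 0.13673998
total = 1452.999
n_observations = 13880
n_missing = 3254

n_valid = n_observations - n_missing  # non-missing observations
cv = std / mean                       # coefficient of variation
variance = std ** 2                   # variance is the squared standard deviation
mean_from_sum = total / n_valid       # mean recovered from the sum

print(n_valid, round(cv, 2), round(variance, 2), round(mean_from_sum, 5))
```

The coefficient of variation is very large here only because the mean is close to zero relative to the spread, so it is probably less informative for this series than the standard deviation or the MAD.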
Histogram with fixed size bins (bins=50)
Gap statistics
| Statistic | Value |
|---|---|
| number of gaps | 191 |
| min | 4 days |
| max | 2 years and 1 week |
| mean | 2 weeks, 4 days and 37 minutes |
| std | 10 weeks, 1 day and 9 hours |
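The report does not say how a gap is defined. One plausible reading, sketched below with the standard library, treats a gap as the time between two consecutive non-missing observations whenever it exceeds the 1-day period. The function name `find_gaps` and the ten-day series are hypothetical and are not taken from the dataset.

```python
import math
from datetime import date, timedelta

def find_gaps(dates, values, period=timedelta(days=1)):
    """Return (last_valid, next_valid, length) for every gap longer than `period`."""
    valid = [d for d, v in zip(dates, values) if not math.isnan(v)]
    gaps = []
    for prev, nxt in zip(valid, valid[1:]):
        length = nxt - prev
        if length > period:
            gaps.append((prev, nxt, length))
    return gaps

# Hypothetical ten-day series with two leading NaNs and one interior gap.
nan = float("nan")
dates = [date(1983, 1, 1) + timedelta(days=i) for i in range(10)]
values = [nan, nan, 0.8, -4.7, nan, nan, nan, 14.6, -16.5, -0.4]

gaps = find_gaps(dates, values)
for prev, nxt, length in gaps:
    print(prev, "->", nxt, length)

# Consistency check against the report under this definition: each gap spans
# its missing days plus one, and the 2 leading NaNs are not inside any gap.
expected_mean_days = (3254 - 2 + 191) / 191
print(timedelta(days=expected_mean_days))
```

Under this definition the expected mean gap is about 18 days and 37 minutes, which appears consistent with the reported mean of 2 weeks, 4 days and 37 minutes. It would also imply that every interior run of missing values is at least 3 days long, in line with the reported minimum gap of 4 days.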
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 32 | 0.2% |
| 4 | 18 | 0.1% |
| 14 | 12 | 0.1% |
| 18 | 12 | 0.1% |
| 1 | 12 | 0.1% |
| 16 | 11 | 0.1% |
| 20 | 11 | 0.1% |
| -12 | 11 | 0.1% |
| 23 | 11 | 0.1% |
| -6 | 11 | 0.1% |
| Other values (8372) | 10485 | 75.5% |
| (Missing) | 3254 | 23.4% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -1740.8 | 1 | < 0.1% |
| -1725.2 | 1 | < 0.1% |
| -1643.7 | 1 | < 0.1% |
| -1589.6 | 1 | < 0.1% |
| -1585.66 | 1 | < 0.1% |
| -1566.6 | 1 | < 0.1% |
| -1517.2 | 1 | < 0.1% |
| -1509 | 1 | < 0.1% |
| -1391.4 | 1 | < 0.1% |
| -1386.1 | 1 | < 0.1% |
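Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1841.4 | 1 | < 0.1% |
| 1319.8 | 1 | < 0.1% |
| 1315 | 1 | < 0.1% |
| 1209.68 | 1 | < 0.1% |
| 1169.6 | 1 | < 0.1% |
| 1167 | 1 | < 0.1% |
| 1166.2 | 1 | < 0.1% |
| 1128.6 | 1 | < 0.1% |
| 1126.4 | 1 | < 0.1% |
| 1110.81 | 1 | < 0.1% |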
ACF and PACF
Missing values
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
First rows
| Date | Flow |
|---|---|
| 1983-01-01 | NaN |
| 1983-01-02 | NaN |
| 1983-01-03 | 8.000000e-01 |
| 1983-01-04 | -4.700000e+00 |
| 1983-01-05 | 1.421085e-14 |
| 1983-01-06 | 1.460000e+01 |
| 1983-01-07 | -1.650000e+01 |
| 1983-01-08 | -4.000000e-01 |
| 1983-01-09 | 4.800000e+00 |
| 1983-01-10 | -1.200000e+00 |
Last rows
| Date | Flow |
|---|---|
| 2020-12-22 | -60.850 |
| 2020-12-23 | 107.940 |
| 2020-12-24 | -95.640 |
| 2020-12-25 | 187.020 |
| 2020-12-26 | -234.030 |
| 2020-12-27 | 33.840 |
| 2020-12-28 | 38.443 |
| 2020-12-29 | 3.841 |
| 2020-12-30 | 29.295 |
| 2020-12-31 | 72.815 |
Most frequently occurring
| | Flow | # duplicates |
|---|---|---|
| 1310 | NaN | 3254 |
| 491 | 0.0 | 32 |
| 586 | 4.0 | 18 |
| 524 | 1.0 | 12 |
| 731 | 14.0 | 12 |
| 778 | 18.0 | 12 |
| 351 | -12.0 | 11 |
| 374 | -9.0 | 11 |
| 398 | -6.0 | 11 |
| 451 | -2.0 | 11 |