Overview

Dataset statistics

Number of variables              1
Number of observations           13880
Missing cells                    1623
Missing cells (%)                11.7%
Duplicate rows                   1257
Duplicate rows (%)               9.1%
Total size in memory             216.9 KiB
Average record size in memory    16.0 B
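
The percentages and the average record size follow from the raw counts. The sketch below is a rough consistency check that uses only figures reported in this section; the variable names are illustrative, and it assumes that, with a single variable, the number of cells equals the number of observations.

```python
# Figures copied from the dataset statistics in this report.
n_obs = 13880          # number of observations
missing_cells = 1623   # missing cells
duplicate_rows = 1257  # duplicate rows
total_kib = 216.9      # total size in memory, KiB

# Assumption: one variable, so cells == observations.
missing_pct = 100 * missing_cells / n_obs
duplicate_pct = 100 * duplicate_rows / n_obs

# 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_obs

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The three results round to the reported 11.7%, 9.1% and 16.0 B.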

Variable types

TimeSeries    1

Timeseries statistics

Number of series      1
Time series length    13880
Starting point        1983-01-01 00:00:00
Ending point          2020-12-31 00:00:00
Period                1 day
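
The reported length can be cross-checked against the start and end points. A minimal sketch, assuming a complete daily index with both endpoints included:

```python
from datetime import date

# Start and end points as reported above.
start = date(1983, 1, 1)
end = date(2020, 12, 31)

# With a 1-day period and both endpoints included, the number of
# time steps is the day difference plus one.
expected_length = (end - start).days + 1

print(expected_length)
```

The result, 13880, matches the reported time series length, so the index has one entry per calendar day and the missing values are NaN entries rather than absent dates.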

Alerts

Dataset has 1257 (9.1%) duplicate rows    Duplicates
Flow has 1623 (11.7%) missing values      Missing
Flow has 906 (6.5%) zeros                 Zeros

Reproduction

Analysis started     2024-05-12 19:35:19.872021
Analysis finished    2024-05-12 19:35:22.280527
Duration             2.41 seconds
MissingQ_Station_NA_28037030_ok_Missing.csv
Download configuration    config.json

Variables

Flow
Numeric time series

MISSING  ZEROS 

Distinct        6649
Distinct (%)    54.2%
Missing         1623
Missing (%)     11.7%
Infinite        0
Infinite (%)    0.0%
Mean            0.063454344
Minimum         -357.32
Maximum         196.59
Zeros           906
Zeros (%)       6.5%
Memory size     216.9 KiB

Quantile statistics

Minimum                      -357.32
5-th percentile              -23.42
Q1                           -0.52
median                       0.1
Q3                           2.376
95-th percentile             21.92
Maximum                      196.59
Range                        553.91
Interquartile range (IQR)    2.896

Descriptive statistics

Standard deviation                      19.776896
Coefficient of variation (CV)           311.67127
Kurtosis                                33.074187
Mean                                    0.063454344
Median Absolute Deviation (MAD)         1.4
Skewness                                -1.7325433
Sum                                     777.7599
Variance                                391.12562
Monotonicity                            Not monotonic
Augmented Dickey-Fuller test p-value    0
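
Several of the statistics above are derived from others, so they can be checked against each other. The sketch below recomputes them from the reported figures using the standard definitions (variance as the square of the standard deviation, CV as standard deviation over mean). It assumes the mean and sum are taken over the non-missing observations only.

```python
# Figures copied from the Flow statistics in this report.
mean = 0.063454344
std = 19.776896
minimum, maximum = -357.32, 196.59
q1, q3 = -0.52, 2.376
n_valid = 13880 - 1623  # observations minus missing values (assumption)

variance = std ** 2              # reported: 391.12562
cv = std / mean                  # reported: 311.67127
value_range = maximum - minimum  # reported: 553.91
iqr = q3 - q1                    # reported: 2.896
total = mean * n_valid           # reported: 777.7599

for name, value in [("variance", variance), ("cv", cv),
                    ("range", value_range), ("iqr", iqr), ("sum", total)]:
    print(f"{name}: {value:.4f}")
```

The very large CV is a consequence of a mean close to zero relative to the spread, not of a separate property of the data.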
Histogram with fixed size bins (bins=50)
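
To illustrate what fixed-size binning implies for this variable, the sketch below builds 50 equal-width bins. It assumes the bins span the reported minimum to the reported maximum, which the report does not state explicitly; the helper names `fixed_width_edges` and `bin_index` are hypothetical.

```python
def fixed_width_edges(lo, hi, n_bins):
    """Return the n_bins + 1 edges of equal-width bins covering [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def bin_index(x, lo, hi, n_bins):
    """Return the 0-based bin for x; the last bin includes the upper edge."""
    if x == hi:
        return n_bins - 1
    return int((x - lo) / (hi - lo) * n_bins)


# Reported minimum and maximum of Flow; bins=50 as in the caption.
lo, hi, n_bins = -357.32, 196.59, 50
edges = fixed_width_edges(lo, hi, n_bins)
bin_width = edges[1] - edges[0]

print(f"bin width: {bin_width:.4f}")
print(f"bin containing 0.0: {bin_index(0.0, lo, hi, n_bins)}")
```

Under this assumption each bin is about 11.08 units wide. The interquartile range is only 2.896, so most observations would fall into one or two bins around zero.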

Gap statistics

number of gaps    36
min               4 days
max               2 years and 6 days
mean              6 weeks, 4 days and 40 minutes
std               17 weeks, 5 days and 17 hours
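
The report does not state how a gap is defined. One common reading, sketched below, treats a gap as the interval between two consecutive non-missing timestamps that is longer than some threshold. The function `find_gaps`, its `threshold` parameter and the toy dates are all hypothetical and are not taken from the dataset.

```python
from datetime import date, timedelta


def find_gaps(valid_dates, threshold=timedelta(days=1)):
    """Return (previous, current, length) for every pair of consecutive
    valid dates that are further apart than `threshold`."""
    ordered = sorted(valid_dates)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        delta = cur - prev
        if delta > threshold:
            gaps.append((prev, cur, delta))
    return gaps


# Toy example: dates on which a value was observed.
valid = [date(2000, 1, 1), date(2000, 1, 2), date(2000, 1, 6),
         date(2000, 1, 7), date(2000, 1, 20)]

gaps = find_gaps(valid)
lengths = [length for _, _, length in gaps]

print(len(gaps), min(lengths), max(lengths))
```

On the toy data this finds two gaps, of 4 and 13 days. Summary statistics such as the min, max, mean and std reported above would then be computed over the list of gap lengths.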
Value                     Count    Frequency (%)
0                         906      6.5%
0.141                     124      0.9%
-0.141                    112      0.8%
0.1                       96       0.7%
-0.1                      82       0.6%
0.2                       60       0.4%
0.1                       59       0.4%
-0.1                      54       0.4%
-0.2                      48       0.3%
-8.881784197 × 10^-16     48       0.3%
Other values (6639)       10668    76.9%
(Missing)                 1623     11.7%
Value      Count    Frequency (%)
-357.32    1        < 0.1%
-229.9     1        < 0.1%
-208.36    1        < 0.1%
-204.6     1        < 0.1%
-188.7     1        < 0.1%
-185.4     1        < 0.1%
-184.3     1        < 0.1%
-180.2     1        < 0.1%
-179.7     1        < 0.1%
-171.68    1        < 0.1%
Value     Count    Frequency (%)
196.59    1        < 0.1%
180.76    1        < 0.1%
173.8     1        < 0.1%
161.82    1        < 0.1%
158.46    1        < 0.1%
154.8     1        < 0.1%
154.1     1        < 0.1%
150.1     1        < 0.1%
147.57    1        < 0.1%
145.2     1        < 0.1%
ACF and PACF

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Date          Flow
1983-01-01    NaN
1983-01-02    NaN
1983-01-03    0.000000e+00
1983-01-04    -3.000000e-01
1983-01-05    4.000000e-01
1983-01-06    -8.881784e-16
1983-01-07    -1.000000e-01
1983-01-08    3.000000e-01
1983-01-09    -4.000000e-01
1983-01-10    -1.600000e+00

Date          Flow
2020-12-22    1.5380
2020-12-23    -3.3780
2020-12-24    -2.3070
2020-12-25    0.4420
2020-12-26    -2.1110
2020-12-27    1.4480
2020-12-28    -1.4330
2020-12-29    1.2120
2020-12-30    1.5483
2020-12-31    -0.7971

Duplicate rows

Most frequently occurring

        Flow      # duplicates
1256    NaN       1623
484     0.000     906
548     0.141     124
419     -0.141    112
522     0.100     96
449     -0.100    82
563     0.200     60
524     0.100     59
447     -0.100    54
406     -0.200    48
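
The table lists each repeated value once, together with how often it occurs. A minimal sketch of that kind of summary is shown below. It assumes a row counts as duplicated when its value appears more than once and that missing values are grouped together; the helper `duplicate_summary` and the toy data are hypothetical.

```python
from collections import Counter


def duplicate_summary(values):
    """Return (value, count) pairs for values occurring more than once,
    most frequent first. Missing values are represented as None."""
    counts = Counter(values)
    repeated = [(value, n) for value, n in counts.items() if n > 1]
    return sorted(repeated, key=lambda pair: pair[1], reverse=True)


# Toy column, not taken from the dataset; None stands in for NaN.
toy = [None, None, None, 0.0, 0.0, 0.141, -0.3, 0.4]
summary = duplicate_summary(toy)

print(summary)
print(len(summary), "distinct repeated values")
```

If the report counts duplicates this way, the highest index shown (1256) is consistent with 1257 distinct repeated values, with the missing-value group (1623 occurrences) being the largest.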