Overview

Dataset statistics

Number of variables: 1
Number of observations: 13880
Missing cells: 2191
Missing cells (%): 15.8%
Duplicate rows: 2451
Duplicate rows (%): 17.7%
Total size in memory: 216.9 KiB
Average record size in memory: 16.0 B
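
The percentages and the average record size above follow from the raw counts. A minimal Python sketch that re-derives them from the reported figures; the variable names are illustrative, and reading KiB as 1024 bytes is an assumption:

```python
# Figures copied from the dataset statistics above.
n_observations = 13880
missing_cells = 2191
duplicate_rows = 2451
total_size_kib = 216.9  # assumption: 1 KiB = 1024 bytes

# With a single variable, the number of cells equals the number of observations.
missing_pct = 100 * missing_cells / n_observations
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```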

Variable types

TimeSeries: 1

Timeseries statistics

Number of series: 1
Time series length: 13880
Starting point: 1983-01-01 00:00:00
Ending point: 2020-12-31 00:00:00
Period: 1 day
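
The reported series length can be cross-checked against the start point, end point, and period. A short sketch, assuming the index is a regular daily grid that includes both endpoints (the variable names are illustrative):

```python
from datetime import date, timedelta

# Values copied from the time series statistics above.
start = date(1983, 1, 1)
end = date(2020, 12, 31)
period = timedelta(days=1)

# Number of timestamps on a regular grid, counting both endpoints.
expected_length = (end - start) // period + 1
print(expected_length)
```

If the result matches the reported length, the index has no missing or repeated dates; the missing cells are then NaN values on existing dates rather than absent rows.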

Alerts

Dataset has 2451 (17.7%) duplicate rows    Duplicates
Flow has 2191 (15.8%) missing values       Missing
Flow is non stationary                     Non stationary
Flow is seasonal                           Seasonal

Reproduction

Analysis started: 2024-05-12 18:18:35.721017
Analysis finished: 2024-05-12 18:18:37.353830
Duration: 1.63 seconds
MissingQ_Station_NA_25017010_ok_Missing.csv
Download configuration: config.json

Variables

Flow
Numeric time series

MISSING  NON STATIONARY  SEASONAL 

Distinct: 6059
Distinct (%): 51.8%
Missing: 2191
Missing (%): 15.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 217.19506
Minimum: 11.67
Maximum: 1191
Zeros: 0
Zeros (%): 0.0%
Memory size: 216.9 KiB

Quantile statistics

Minimum: 11.67
5-th percentile: 35.744
Q1: 84.65
Median: 175.6
Q3: 295.7
95-th percentile: 565.08
Maximum: 1191
Range: 1179.33
Interquartile range (IQR): 211.05
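
The last two rows are derived from the others: the range is maximum minus minimum, and the IQR is Q3 minus Q1. A minimal check using the reported quantiles (variable names are illustrative):

```python
# Quantiles copied from the table above.
minimum, q1, q3, maximum = 11.67, 84.65, 295.7, 1191.0

value_range = maximum - minimum   # spread of the full sample
iqr = q3 - q1                     # spread of the middle 50% of values

print(f"Range: {value_range:.2f}")
print(f"IQR: {iqr:.2f}")
```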

Descriptive statistics

Standard deviation: 173.39424
Coefficient of variation (CV): 0.79833419
Kurtosis: 3.2712754
Mean: 217.19506
Median Absolute Deviation (MAD): 100.29
Skewness: 1.5781312
Sum: 2538793
Variance: 30065.562
Monotonicity: Not monotonic
Augmented Dickey-Fuller test p-value: 3.366965046 × 10^-16
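
Several of these figures are tied together by simple identities: CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum divided by the mean gives the number of non-missing values. A small sketch that checks this with the reported numbers (names are illustrative; small differences come from rounding in the report):

```python
# Figures copied from the report.
std = 173.39424
mean = 217.19506
total = 2538793
n_observations = 13880
missing = 2191

cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance as squared standard deviation
n_valid = round(total / mean)   # count of values that entered the mean

print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.3f}")
print(f"Non-missing values: {n_valid} (expected {n_observations - missing})")
```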
Histogram with fixed size bins (bins=50)

Gap statistics

Number of gaps: 49
Min: 3 days
Max: 2 years and 4 days
Mean: 5 weeks, 5 days and 17 hours
Std: 17 weeks, 1 day and 10 hours
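
To illustrate how gap lengths of this kind can be obtained, here is a minimal sketch. It assumes a gap is the interval between two consecutive observed dates whenever that interval exceeds one sampling period; the report's own definition may differ (for example, it may apply a larger tolerance). The function `find_gaps` and the toy data are hypothetical, not part of the profiling tool.

```python
from datetime import date, timedelta

def find_gaps(dates, values, period=timedelta(days=1)):
    """Return intervals between consecutive observed dates that exceed `period`."""
    # Keep only the dates that carry a value (None marks a missing value).
    observed = [d for d, v in zip(dates, values) if v is not None]
    gaps = []
    for previous, current in zip(observed, observed[1:]):
        step = current - previous
        if step > period:
            gaps.append(step)
    return gaps

# Toy daily series: values are missing on days 3-4 and on day 8.
dates = [date(2020, 1, day) for day in range(1, 11)]
values = [5.0, 5.1, None, None, 4.9, 4.8, 4.7, None, 4.6, 4.5]

gaps = find_gaps(dates, values)
print(len(gaps), min(gaps), max(gaps))
```

Under this definition, missing values before the first observation or after the last one are not counted, because there is no observed neighbour on one side.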
Value                 Count   Frequency (%)
44.66                    63   0.5%
159.5                    51   0.4%
191.8                    22   0.2%
208                      19   0.1%
203.9                    18   0.1%
183.7                    17   0.1%
167.6                    17   0.1%
79.22                    17   0.1%
199.9                    16   0.1%
368.8                    15   0.1%
Other values (6049)   11434   82.4%
(Missing)              2191   15.8%

Value   Count   Frequency (%)
11.67       1   < 0.1%
11.88       3   < 0.1%
12         14   0.1%
12.3        1   < 0.1%
12.6        5   < 0.1%
12.75       2   < 0.1%
12.8        1   < 0.1%
12.9        1   < 0.1%
13.01       1   < 0.1%
13.05       1   < 0.1%

Value   Count   Frequency (%)
1191        1   < 0.1%
1187        1   < 0.1%
1163        1   < 0.1%
1161        1   < 0.1%
1159        1   < 0.1%
1155        1   < 0.1%
1145        1   < 0.1%
1109        2   < 0.1%
1104        2   < 0.1%
1087        1   < 0.1%

ACF and PACF
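
For reference, the sample autocorrelation plotted in an ACF chart can be sketched in a few lines. This is a minimal illustration on a gap-free toy sequence, not the profiling tool's implementation; the function name `autocorrelation` is hypothetical, missing values would need to be handled first, and the partial autocorrelation (PACF) is not covered here.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a gap-free numeric sequence at a given lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    # Lag-0 autocovariance (up to a factor 1/n); zero for a constant series.
    denominator = sum(d * d for d in deviations)
    # Products of deviations that are `lag` steps apart.
    numerator = sum(deviations[t] * deviations[t - lag] for t in range(lag, n))
    return numerator / denominator

toy = [1.0, 2.0, 3.0, 4.0, 5.0]
print(autocorrelation(toy, 0), autocorrelation(toy, 1))
```

Values that decay slowly with increasing lag, or that peak again at a fixed lag such as one year of daily steps, are the patterns usually read as trend and seasonality.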

Interactions


Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.

Sample

Date         Flow
1983-01-01   NaN
1983-01-02   NaN
1983-01-03   NaN
1983-01-04   NaN
1983-01-05   NaN
1983-01-06   NaN
1983-01-07   NaN
1983-01-08   NaN
1983-01-09   NaN
1983-01-10   NaN

Date         Flow
2020-12-22   88.050
2020-12-23   81.075
2020-12-24   75.888
2020-12-25   75.263
2020-12-26   74.700
2020-12-27   82.875
2020-12-28   98.075
2020-12-29   88.200
2020-12-30   78.813
2020-12-31   71.325

Duplicate rows

Most frequently occurring

        Flow     # duplicates
2450    NaN      2191
246     44.66    63
1017    159.50   51
1239    191.80   22
1346    208.00   19
1317    203.90   18
504     79.22    17
1075    167.60   17
1188    183.70   17
1293    199.90   16
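
With a single variable, a duplicate row is simply a repeated value, so a table like the one above can be approximated by counting occurrences. A minimal sketch with hypothetical toy data; the function `most_frequent_duplicates` is illustrative, and `None` stands in for missing values because a float NaN does not compare equal to itself and would not group reliably:

```python
from collections import Counter

def most_frequent_duplicates(values, top=10):
    """Return (value, count) pairs for values seen more than once, most frequent first."""
    counts = Counter(values)
    # most_common() sorts by count, keeping first-seen order among ties.
    return [(value, n) for value, n in counts.most_common() if n > 1][:top]

# Toy data: None marks a missing value.
flows = [44.66, 44.66, None, None, None, 159.5, 12.3, 159.5]
print(most_frequent_duplicates(flows))
```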