Overview

Dataset statistics

Number of variables: 1
Number of observations: 13880
Missing cells: 2769
Missing cells (%): 19.9%
Duplicate rows: 2579
Duplicate rows (%): 18.6%
Total size in memory: 216.9 KiB
Average record size in memory: 16.0 B
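
The percentages and the average record size above follow from the raw counts. The sketch below recomputes them from the reported figures only; the variable names are illustrative, and the reading of "Duplicate rows" as a count out of all observations is an assumption that happens to match the reported 18.6%.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 1
n_observations = 13880
missing_cells = 2769
duplicate_rows = 2579
memory_bytes = 216.9 * 1024  # 216.9 KiB expressed in bytes


def percent(part, whole):
    """Return part / whole as a percentage."""
    return 100.0 * part / whole


# A "cell" is one value of one variable, so cells = variables * observations.
total_cells = n_variables * n_observations
missing_pct = percent(missing_cells, total_cells)
duplicate_pct = percent(duplicate_rows, n_observations)  # assumed denominator
avg_record_bytes = memory_bytes / n_observations

print(f"Missing cells:       {missing_pct:.2f}%")
print(f"Duplicate rows:      {duplicate_pct:.2f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The results agree with the table to within the one-decimal precision the report displays.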

Variable types

TimeSeries: 1

Timeseries statistics

Number of series: 1
Time series length: 13880
Starting point: 1983-01-01 00:00:00
Ending point: 2020-12-31 00:00:00
Period: 1 day
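
As a quick consistency check, the series length can be derived from the start point, end point, and period. This is a minimal sketch using only the values reported above; it assumes both endpoints are included in the series.

```python
from datetime import date, timedelta

start = date(1983, 1, 1)
end = date(2020, 12, 31)
period = timedelta(days=1)

# Both endpoints are counted, hence the +1.
expected_length = (end - start) // period + 1

print(expected_length)
```

The result equals the reported time series length, so the daily index has no missing dates; the missing cells are missing values on dates that are present.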

Alerts

Dataset has 2579 (18.6%) duplicate rows [Duplicates]
Flow has 2769 (19.9%) missing values [Missing]
Flow is non-stationary [Non-stationary]
Flow is seasonal [Seasonal]

Reproduction

Analysis started: 2024-05-12 18:17:16.018604
Analysis finished: 2024-05-12 18:17:17.723837
Duration: 1.71 seconds
Missing: Q_Station_NA_25027410_ok_Missing.csv
Download configuration: config.json

Variables

Flow
Numeric time series

MISSING, NON-STATIONARY, SEASONAL

Distinct: 5291
Distinct (%): 47.6%
Missing: 2769
Missing (%): 19.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 4332.5033
Minimum: 1235
Maximum: 9092
Zeros: 0
Zeros (%): 0.0%
Memory size: 216.9 KiB

Quantile statistics

Minimum: 1235
5th percentile: 1986.1
Q1: 3026
Median: 4193
Q3: 5483
95th percentile: 7280.5
Maximum: 9092
Range: 7857
Interquartile range (IQR): 2457
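
The last two rows are derived from the quantiles listed before them. A minimal sketch of that arithmetic, using the reported values (variable names are illustrative):

```python
# Quantiles copied from the "Quantile statistics" table.
minimum, q1, median, q3, maximum = 1235, 3026, 4193, 5483, 9092

value_range = maximum - minimum  # spread of the whole series
iqr = q3 - q1                    # spread of the middle 50% of values

print(f"Range: {value_range}")
print(f"IQR:   {iqr}")
```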

Descriptive statistics

Standard deviation: 1616.5721
Coefficient of variation (CV): 0.37312658
Kurtosis: -0.65577078
Mean: 4332.5033
Median Absolute Deviation (MAD): 1227
Skewness: 0.36493801
Sum: 48138444
Variance: 2613305.5
Monotonicity: Not monotonic
Augmented Dickey-Fuller test p-value: 7.709848253 × 10⁻¹⁹
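
Several of these statistics are related by simple identities, which makes the table easy to sanity-check. The sketch below recomputes the mean, variance, and coefficient of variation from the other reported figures. It assumes the statistics are taken over the non-missing observations only, which the numbers support; small differences come from the rounding of the displayed values.

```python
# Figures copied from the report.
n_observations = 13880
n_missing = 2769
total = 48138444          # "Sum"
std_dev = 1616.5721       # "Standard deviation"
reported_mean = 4332.5033

# Assumption: missing values are excluded before computing statistics.
n_valid = n_observations - n_missing

mean = total / n_valid    # mean = sum / number of valid observations
variance = std_dev ** 2   # variance is the square of the standard deviation
cv = std_dev / reported_mean  # coefficient of variation = std / mean

print(f"Valid observations: {n_valid}")
print(f"Mean:     {mean:.4f}")
print(f"Variance: {variance:.1f}")
print(f"CV:       {cv:.8f}")
```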
Histogram with fixed size bins (bins=50)

Gap statistics

Number of gaps: 75
Min: 3 days
Max: 1 year, 4 weeks and 5 days
Mean: 5 weeks, 2 days and 7 hours
Std: 8 weeks, 1 day and 19 hours
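
To illustrate how gaps like these can be located, the sketch below finds runs of consecutive missing values in a daily series. It is an assumption-laden example, not the report's own method: the function name `find_gaps` is hypothetical, the data is a toy series, and the report may define gap length differently (for example, as the time between the valid observations on either side).

```python
from datetime import date, timedelta


def find_gaps(dates, values):
    """Return (first_missing, last_missing, n_days) for each run of
    consecutive missing values. Assumes sorted, daily-spaced dates."""
    gaps = []
    run_start = None
    prev = None
    for day, value in zip(dates, values):
        if value is None:
            if run_start is None:
                run_start = day  # a new gap begins here
        elif run_start is not None:
            # A valid value closes the gap that ended on the previous day.
            gaps.append((run_start, prev, (prev - run_start).days + 1))
            run_start = None
        prev = day
    if run_start is not None:  # the series ends inside a gap
        gaps.append((run_start, prev, (prev - run_start).days + 1))
    return gaps


# Toy data (not from the report): ten days containing two gaps.
start = date(1983, 1, 1)
dates = [start + timedelta(days=i) for i in range(10)]
values = [3250.0, None, None, None, 2922.0, 2830.0, None, None, 3041.0, 3098.0]

gaps = find_gaps(dates, values)
lengths = [n_days for _, _, n_days in gaps]
print(gaps)
print(lengths)
```

Summary statistics such as the minimum, maximum, and mean gap length can then be computed from `lengths`.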
Most frequent values

Value                  Count   Frequency (%)
3518                      13   0.1%
5215                      12   0.1%
3985                      11   0.1%
3080                      11   0.1%
3643                      10   0.1%
4311                      10   0.1%
2632                      10   0.1%
4584                      10   0.1%
5305                      10   0.1%
5892                      10   0.1%
Other values (5281)    11004   79.3%
(Missing)               2769   19.9%

Ten smallest values

Value   Count   Frequency (%)
1235        2   < 0.1%
1253        1   < 0.1%
1270        1   < 0.1%
1283        1   < 0.1%
1287        1   < 0.1%
1290        1   < 0.1%
1292        1   < 0.1%
1293        1   < 0.1%
1298        1   < 0.1%
1301        1   < 0.1%

Ten largest values

Value   Count   Frequency (%)
9092        1   < 0.1%
8994        1   < 0.1%
8978        1   < 0.1%
8970        1   < 0.1%
8913        1   < 0.1%
8873        1   < 0.1%
8799        1   < 0.1%
8791        1   < 0.1%
8736        1   < 0.1%
8728        1   < 0.1%

ACF and PACF

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First 10 rows

Date         Flow
1983-01-01   3250.0
1983-01-02   3002.0
1983-01-03   2830.0
1983-01-04   2770.0
1983-01-05   2922.0
1983-01-06   2830.0
1983-01-07   2530.0
1983-01-08   2623.0
1983-01-09   3041.0
1983-01-10   3098.0

Last 10 rows

Date         Flow
2020-12-22   NaN
2020-12-23   NaN
2020-12-24   NaN
2020-12-25   NaN
2020-12-26   NaN
2020-12-27   NaN
2020-12-28   NaN
2020-12-29   NaN
2020-12-30   NaN
2020-12-31   NaN

Duplicate rows

Most frequently occurring

        Flow     # duplicates
2578    NaN      2769
936     3518.0   13
1804    5215.0   12
683     3080.0   11
1185    3985.0   11
441     2632.0   10
1005    3643.0   10
1361    4311.0   10
1504    4584.0   10
1756    5105.0   10
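
Because the dataset has a single variable, a row is a duplicate exactly when its Flow value repeats, and all missing rows count as identical to each other. That is why NaN tops the list with 2769 occurrences. The sketch below reproduces this kind of tally on toy data (not taken from the report), using `None` to stand in for a missing value; how the report itself counts duplicates is not shown in the source, so this is an illustration only.

```python
from collections import Counter

# Toy single-column data; None marks a missing value.
flow = [3518.0, None, 3518.0, 5215.0, None, 2770.0, None, 5215.0, 3518.0]

counts = Counter(flow)

# Keep only values that occur more than once, most frequent first.
duplicates = sorted(
    ((value, n) for value, n in counts.items() if n > 1),
    key=lambda item: item[1],
    reverse=True,
)

for value, n in duplicates:
    print(value, n)

# Number of distinct values that are repeated at least once.
n_repeated_values = len(duplicates)
print(n_repeated_values)
```

In this toy example the value 2770.0 occurs once and is therefore absent from the duplicate list, while the missing marker is listed alongside the repeated numeric values.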