Overview

Dataset statistics

Number of variables: 1
Number of observations: 13880
Missing cells: 3542
Missing cells (%): 25.5%
Duplicate rows: 1370
Duplicate rows (%): 9.9%
Total size in memory: 216.9 KiB
Average record size in memory: 16.0 B
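The percentages and the memory total above follow from the raw counts. As a quick sanity check, the sketch below recomputes them. The variable names are illustrative, and it assumes the total size is simply the number of observations times the reported average record size.

```python
# Recompute the derived dataset statistics from the raw counts in the report.
n_variables = 1
n_observations = 13880
missing_cells = 3542
duplicate_rows = 1370
record_size_bytes = 16.0  # reported average record size in memory

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
total_kib = n_observations * record_size_bytes / 1024  # assumed relation

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Total size in memory: {total_kib:.1f} KiB")
```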

Variable types

TimeSeries: 1

Timeseries statistics

Number of series: 1
Time series length: 13880
Starting point: 1983-01-01 00:00:00
Ending point: 2020-12-31 00:00:00
Period: 1 day
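The series length is consistent with a complete daily index between the starting and ending points. A minimal check using only the standard library, assuming both endpoints are included in the count:

```python
from datetime import date, timedelta

start = date(1983, 1, 1)
end = date(2020, 12, 31)
period = timedelta(days=1)

# Both endpoints are counted, hence the +1.
expected_length = (end - start) // period + 1
print(expected_length)
```

If the printed value matches the reported time series length, the index has no missing dates, and the missing cells are NaN values on existing dates rather than absent rows.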

Alerts

Dataset has 1370 (9.9%) duplicate rows [Duplicates]
Flow has 3542 (25.5%) missing values [Missing]
Flow has 228 (1.6%) zeros [Zeros]

Reproduction

Analysis started: 2024-05-12 19:35:31.079475
Analysis finished: 2024-05-12 19:35:33.100998
Duration: 2.02 seconds
MissingQ_Station_NA_28037090_ok_Missing.csv
Download configuration: config.json

Variables

Flow
Numeric time series

Alerts: MISSING, ZEROS

Distinct: 5565
Distinct (%): 53.8%
Missing: 3542
Missing (%): 25.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0057741826
Minimum: -190.9
Maximum: 196.8
Zeros: 228
Zeros (%): 1.6%
Memory size: 216.9 KiB

Quantile statistics

Minimum: -190.9
5-th percentile: -7.349
Q1: -1.24475
Median: 0
Q3: 1.3
95-th percentile: 7.2
Maximum: 196.8
Range: 387.7
Interquartile range (IQR): 2.54475
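The range and the interquartile range are simple differences of the quantiles listed above. A small sketch of the arithmetic, with illustrative variable names:

```python
# Quantile values copied from the report.
minimum, maximum = -190.9, 196.8
q1, q3 = -1.24475, 1.3

value_range = maximum - minimum  # spread of the full sample
iqr = q3 - q1                    # spread of the middle 50% of the sample

print(f"Range: {value_range:.1f}")
print(f"IQR: {iqr:.5f}")
```

The range is more than 150 times the IQR, which indicates that a small number of extreme values dominate the overall spread.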

Descriptive statistics

Standard deviation: 7.3116554
Coefficient of variation (CV): 1266.2667
Kurtosis: 130.95208
Mean: 0.0057741826
Median Absolute Deviation (MAD): 1.3
Skewness: -0.21414691
Sum: 59.6935
Variance: 53.460304
Monotonicity: Not monotonic
Augmented Dickey-Fuller test p-value: 0
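Several of the descriptive statistics are tied together by simple relations. The sketch below recomputes the mean, variance, and coefficient of variation from the reported sum, standard deviation, and counts. It assumes the mean is taken over non-missing observations only and that the CV is defined as standard deviation divided by mean.

```python
# Values copied from the report.
n_observations = 13880
n_missing = 3542
total = 59.6935
std = 7.3116554

n_valid = n_observations - n_missing  # observations that enter the statistics
mean = total / n_valid
variance = std ** 2
cv = std / mean  # assumed definition: standard deviation over mean

print(f"Mean: {mean:.10f}")
print(f"Variance: {variance:.4f}")
print(f"CV: {cv:.2f}")
```

Because the mean is very close to zero, the CV is extremely large and tells little about the actual variability; the standard deviation and the IQR are more informative for this series.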
Histogram with fixed size bins (bins=50)

Gap statistics

Number of gaps: 34
Min: 5 days
Max: 4 years, 1 week and 2 days
Mean: 15 weeks, 42 minutes and 21.35 seconds
Std: 39 weeks, 4 days and 17 hours
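To make the notion of a gap concrete, here is a minimal sketch of one plausible way to locate gaps in a daily series: measure the time between consecutive non-missing observations and keep every interval longer than one period. The function `find_gaps` and the toy data are hypothetical, and the profiling tool that produced this report may use a different definition or threshold.

```python
from datetime import date, timedelta


def find_gaps(observations, period=timedelta(days=1)):
    """Return (last_valid, next_valid, length) for each gap.

    `observations` maps a date to a value, with None marking a missing value.
    A gap is any pair of consecutive valid dates more than one period apart.
    """
    valid = sorted(d for d, v in observations.items() if v is not None)
    gaps = []
    for prev, nxt in zip(valid, valid[1:]):
        delta = nxt - prev
        if delta > period:
            gaps.append((prev, nxt, delta))
    return gaps


# Hypothetical toy series: values are missing on 4, 5 and 6 January.
toy = {
    date(2020, 1, d): (None if d in (4, 5, 6) else 0.1 * d)
    for d in range(1, 10)
}
gaps = find_gaps(toy)
for prev, nxt, delta in gaps:
    print(prev, nxt, delta)
```

Under this definition, leading and trailing missing values (such as the NaN entries at the very start and end of the series) are not counted as gaps, because they are not bounded by valid observations on both sides.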
Value                  Count   Frequency (%)
0                      228     1.6%
0.5                    53      0.4%
-0.5                   42      0.3%
-0.1                   37      0.3%
-0.1                   36      0.3%
-1                     33      0.2%
0.2                    31      0.2%
0.1                    29      0.2%
0.2                    27      0.2%
0.1                    25      0.2%
Other values (5555)    9797    70.6%
(Missing)              3542    25.5%

Value                  Count   Frequency (%)
-190.9                 1       < 0.1%
-86.4                  1       < 0.1%
-85.6                  1       < 0.1%
-85.1                  1       < 0.1%
-84.1                  1       < 0.1%
-74.1                  1       < 0.1%
-73.9                  1       < 0.1%
-73.4                  1       < 0.1%
-73                    1       < 0.1%
-67.4                  1       < 0.1%

Value                  Count   Frequency (%)
196.8                  1       < 0.1%
93.9                   1       < 0.1%
90.6                   1       < 0.1%
85.5                   1       < 0.1%
83.6                   1       < 0.1%
79.7                   1       < 0.1%
76                     1       < 0.1%
70.2                   1       < 0.1%
63.4                   1       < 0.1%
58.9                   1       < 0.1%
ACF and PACF

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Date          Flow
1983-01-01    NaN
1983-01-02    NaN
1983-01-03    NaN
1983-01-04    0.0
1983-01-05    0.1
1983-01-06    -0.1
1983-01-07    -0.2
1983-01-08    0.4
1983-01-09    -0.3
1983-01-10    0.1

Date          Flow
2020-12-22    0.620
2020-12-23    -1.437
2020-12-24    0.701
2020-12-25    -0.477
2020-12-26    0.791
2020-12-27    0.107
2020-12-28    -0.713
2020-12-29    0.272
2020-12-30    NaN
2020-12-31    NaN

Duplicate rows

Most frequently occurring

Index    Flow    # duplicates
1369     NaN     3542
689      0.0     228
847      0.5     53
516      -0.5    42
642      -0.1    37
640      -0.1    36
413      -1.0    33
772      0.2     31
743      0.1     29
769      0.2     27