Overview

Dataset statistics

Number of variables: 1
Number of observations: 13880
Missing cells: 159
Missing cells (%): 1.1%
Duplicate rows: 1605
Duplicate rows (%): 11.6%
Total size in memory: 216.9 KiB
Average record size in memory: 16.0 B
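
The percentage and memory figures above follow from the raw counts. The short check below recomputes them from the reported values only; the variable names are illustrative and not part of the report.

```python
# Values copied from the dataset statistics above.
n_variables = 1
n_observations = 13880
missing_cells = 159
duplicate_rows = 1605
record_size_bytes = 16.0

# Percentages are taken relative to all cells / all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Total memory is the record size times the number of records, in KiB.
total_kib = n_observations * record_size_bytes / 1024

print(f"Missing cells (%): {missing_pct:.1f}")
print(f"Duplicate rows (%): {duplicate_pct:.1f}")
print(f"Total size in memory: {total_kib:.1f} KiB")
```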

Variable types

TimeSeries: 1

Timeseries statistics

Number of series: 1
Time series length: 13880
Starting point: 1983-01-01 00:00:00
Ending point: 2020-12-31 00:00:00
Period: 1 day
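
The series length is consistent with a complete daily index between the two endpoints. A minimal check, assuming both the starting and ending dates are included in the count:

```python
from datetime import date, timedelta

start = date(1983, 1, 1)
end = date(2020, 12, 31)
period = timedelta(days=1)

# Both endpoints are counted, hence the +1.
expected_length = (end - start) // period + 1
print(expected_length)  # 13880
```

Because the expected length equals the number of observations, the 159 missing values are present as rows with an empty Flow value rather than as absent dates.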

Alerts

Dataset has 1605 (11.6%) duplicate rows [Duplicates]
Flow has 159 (1.1%) missing values [Missing]

Reproduction

Analysis started: 2024-05-12 19:33:46.971140
Analysis finished: 2024-05-12 19:33:49.581610
Duration: 2.61 seconds
Missing: Q_Station_NA_23097030_ok_Missing.csv
Download configuration: config.json

Variables

Flow
Numeric time series

MISSING 

Distinct: 3688
Distinct (%): 26.9%
Missing: 159
Missing (%): 1.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.41384739
Minimum: -2959
Maximum: 2342
Zeros: 21
Zeros (%): 0.2%
Memory size: 216.9 KiB

Quantile statistics

Minimum: -2959
5-th percentile: -706
Q1: -201
median: 18.6
Q3: 224
95-th percentile: 631
Maximum: 2342
Range: 5301
Interquartile range (IQR): 425

Descriptive statistics

Standard deviation: 408.49185
Coefficient of variation (CV): 987.05914
Kurtosis: 2.5364123
Mean: 0.41384739
Median Absolute Deviation (MAD): 211.4
Skewness: -0.38061353
Sum: 5678.4
Variance: 166865.59
Monotonicity: Not monotonic
Augmented Dickey-Fuller test p-value: 0
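
Several of the quantile and descriptive statistics are simple functions of others, which also explains the very large coefficient of variation: the standard deviation is divided by a mean close to zero. The sketch below recomputes the derived figures from the reported values only; the variable names are illustrative.

```python
# Values copied from the quantile and descriptive statistics above.
minimum, maximum = -2959, 2342
q1, q3 = -201, 224
mean = 0.41384739
std = 408.49185
total = 5678.4
n_valid = 13880 - 159  # observations minus missing values

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
mean_from_sum = total / n_valid   # Mean over non-missing values

print(f"Range: {value_range}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.5f}")
print(f"Variance: {variance:.2f}")
print(f"Mean from sum: {mean_from_sum:.5f}")
```

The recomputed mean matches the reported one only when the sum is divided by the 13721 non-missing observations, so the summary statistics appear to ignore the missing values.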
Histogram with fixed size bins (bins=50)

Gap statistics

number of gaps: 15
min: 4 days
max: 11 weeks and 1 day
mean: 1 week, 4 days and 11 hours
std: 2 weeks, 4 days and 18 hours
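
The report does not state how a gap is defined. The sketch below shows one plausible way to derive such statistics, assuming a gap is the time between two consecutive non-missing observations whenever that time exceeds a threshold. The function `find_gaps`, its `min_gap` parameter, and the toy series are hypothetical and are not taken from the profiling tool.

```python
import math
from datetime import date, timedelta


def find_gaps(dates, values, min_gap=timedelta(days=1)):
    """Return the time between consecutive non-missing observations
    wherever it exceeds min_gap (an assumed threshold)."""
    valid = [
        d for d, v in zip(dates, values)
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    return [b - a for a, b in zip(valid, valid[1:]) if b - a > min_gap]


# Hypothetical toy series: daily dates, NaN marks a missing reading.
nan = float("nan")
dates = [date(1983, 1, 1) + timedelta(days=i) for i in range(10)]
values = [nan, nan, -186.0, nan, nan, nan, 569.0, 872.0, nan, 264.0]

gaps = find_gaps(dates, values)
mean_gap = sum(gaps, timedelta()) / len(gaps)

print(len(gaps), min(gaps), max(gaps), mean_gap)
```

Under this definition a run of k missing days appears as a gap of k + 1 days, and missing values at the very start of the series are not counted because no earlier observation bounds them.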
Value                Count  Frequency (%)
-6                   30     0.2%
44                   28     0.2%
42                   27     0.2%
195                  26     0.2%
-8                   24     0.2%
66                   24     0.2%
75                   23     0.2%
107                  23     0.2%
-33                  23     0.2%
81                   23     0.2%
Other values (3678)  13470  97.0%
(Missing)            159    1.1%

Value                Count  Frequency (%)
-2959                1      < 0.1%
-2914                1      < 0.1%
-2338                1      < 0.1%
-2316                1      < 0.1%
-2273                1      < 0.1%
-2251                1      < 0.1%
-2205                1      < 0.1%
-2190                1      < 0.1%
-2103                1      < 0.1%
-2099                1      < 0.1%

Value                Count  Frequency (%)
2342                 1      < 0.1%
2137                 1      < 0.1%
1931                 1      < 0.1%
1889                 1      < 0.1%
1803                 1      < 0.1%
1730                 1      < 0.1%
1697                 1      < 0.1%
1696                 1      < 0.1%
1671                 1      < 0.1%
1573                 1      < 0.1%

ACF and PACF

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Date        Flow
1983-01-01  NaN
1983-01-02  NaN
1983-01-03  -186.0
1983-01-04  -720.0
1983-01-05  569.0
1983-01-06  872.0
1983-01-07  -754.0
1983-01-08  -541.0
1983-01-09  195.0
1983-01-10  264.0

Date        Flow
2020-12-22  -140.9
2020-12-23  -136.6
2020-12-24  108.3
2020-12-25  371.2
2020-12-26  -555.0
2020-12-27  320.0
2020-12-28  92.1
2020-12-29  -372.1
2020-12-30  189.4
2020-12-31  113.0

Duplicate rows

Most frequently occurring

      Flow    # duplicates
1604  NaN     159
812   -6.0    30
866   44.0    28
864   42.0    27
1028  195.0   26
809   -8.0    24
889   66.0    24
777   -33.0   23
899   75.0    23
905   81.0    23
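
With a single Flow column, a row counts as a duplicate when its value occurs more than once, which is why the missing values top the table. The sketch below shows one way such a table could be produced. It is a minimal illustration that assumes all NaN values are grouped together, as the table above does; the function `duplicate_counts` and the toy data are hypothetical.

```python
import math
from collections import Counter


def duplicate_counts(values):
    """Return (value, count) pairs for values seen more than once,
    most frequent first. All NaN values are grouped under "NaN"."""
    keys = [
        "NaN" if isinstance(v, float) and math.isnan(v) else v
        for v in values
    ]
    counts = Counter(keys)
    return [(k, c) for k, c in counts.most_common() if c > 1]


# Hypothetical toy data, not taken from the real series.
nan = float("nan")
flow = [nan, nan, -6.0, 44.0, -6.0, nan, 44.0, -6.0, 569.0]

for value, count in duplicate_counts(flow):
    print(value, count)
```

Grouping NaN explicitly is needed here because NaN does not compare equal to itself, so separately created NaN values would otherwise not be counted as the same value.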