```@meta
Author = "Paulito P. Palmes"
```
# Monotonic Detection and Plotting
An important preprocessing step for time series data is detecting
monotonic data and transforming it into a non-monotonic series by applying the finite difference
operator.
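To make the finite difference idea concrete, here is a minimal sketch in plain Julia. It is independent of TSML and only assumes that the monotonic series is a running total of non-negative increments; the variable names are illustrative.

```julia
using Random

Random.seed!(123)

# A cumulative sum of non-negative increments is monotonically increasing.
increments = rand(10)
cumulative = cumsum(increments)

# First-order finite difference: d[i] = x[i+1] - x[i].
# `diff` returns a vector one element shorter than its input.
recovered = diff(cumulative)

# The differences reproduce the original increments, except the first one,
# which has no predecessor to be differenced against.
println(issorted(cumulative))
println(recovered ≈ increments[2:end])
```

The differenced series no longer trends upward, which is the property the rest of this page relies on.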
## Artificial Data Example
Let's create artificial monotonic data and apply the monotonic transformer to normalize it.
First, we use the `Plotter` filter to visualize the generated data.
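```@example mono
using TSML
using Dates, Random
Random.seed!(123)  # fix the seed so the generated data is reproducible
pltr = Plotter(Dict(:pdfoutput => true))
# hourly timestamps covering December 2017
mdates = DateTime(2017,12,1,1):Dates.Hour(1):DateTime(2017,12,31,10) |> collect
# the cumulative sum of non-negative random values is monotonically increasing
mvals = rand(length(mdates)) |> cumsum
df = DataFrame(Date = mdates, Value = mvals)
fit_transform!(pltr,df);
nothing #hide
```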
Now that we have monotonic data, let's use `Monotonicer` to normalize it and plot the result:
```@example mono
using TSML
mono = Monotonicer(Dict())
# chain the normalizer and the plotter so the plot shows the transformed data
pipeline = mono |> pltr
res = fit_transform!(pipeline,df)
nothing #hide
```
## Real Data Example
We will now apply the entire pipeline:
read the csv data, aggregate, impute, normalize
if the data is monotonic, and plot. We will consider three
different data types: a regular time series, a
monotonic series, and a daily monotonic series. The difference between
monotonic and daily monotonic is that the values in a daily monotonic series
increase cumulatively during a day and reset to zero or some baseline value
at the start of the next day. `Monotonicer`
automatically detects these three types and applies the corresponding
normalization.
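The distinction between the three types can be illustrated with a small heuristic. The sketch below is not how `Monotonicer` is implemented; `classify_series` is a hypothetical helper that assumes clean, noise-free readings and labels a series by where its values decrease.

```julia
using Dates

# Hypothetical helper, not part of TSML: classify a series by where its
# values decrease between consecutive readings.
function classify_series(dates::Vector{DateTime}, values::Vector{Float64})
    steps = diff(values)
    drops = findall(<(0), steps)          # indices where the value decreases
    isempty(drops) && return :monotonic   # the series never decreases
    # A drop at index i happens between dates[i] and dates[i + 1];
    # daily monotonic data only drops when the calendar day changes.
    resets_on_new_day = all(i -> Date(dates[i]) != Date(dates[i + 1]), drops)
    ndays = length(unique(Date.(dates)))
    if resets_on_new_day && length(drops) == ndays - 1
        return :dailymonotonic
    end
    return :regular
end

# Three days of hourly timestamps and one synthetic series per type.
dates = collect(DateTime(2017, 12, 1, 0):Hour(1):DateTime(2017, 12, 3, 23))
monotonic = cumsum(ones(length(dates)))        # 1, 2, 3, ...
dailymono = Float64.(hour.(dates))             # 0 to 23, restarting at midnight
regular = [sin(i) for i in 1:length(dates)]    # rises and falls within a day

println(classify_series(dates, monotonic))     # monotonic
println(classify_series(dates, dailymono))     # dailymonotonic
println(classify_series(dates, regular))       # regular
```

Real sensor data contains noise and missing readings, so a production detector would need tolerance thresholds rather than the exact comparisons used here.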
```@example mono
using TSML
regularfile = joinpath(dirname(pathof(TSML)),"../data/typedetection/regular.csv")
monofile = joinpath(dirname(pathof(TSML)),"../data/typedetection/monotonic.csv")
dailymonofile = joinpath(dirname(pathof(TSML)),"../data/typedetection/dailymonotonic.csv")
regularfilecsv = CSVDateValReader(Dict(:filename=>regularfile,:dateformat=>"dd/mm/yyyy HH:MM"))
monofilecsv = CSVDateValReader(Dict(:filename=>monofile,:dateformat=>"dd/mm/yyyy HH:MM"))
dailymonofilecsv = CSVDateValReader(Dict(:filename=>dailymonofile,:dateformat=>"dd/mm/yyyy HH:MM"))
valgator = DateValgator(Dict(:dateinterval=>Dates.Hour(1)))
valnner = DateValLinearImputer(Dict(:dateinterval=>Dates.Hour(1)))
stfier = Statifier(Dict(:processmissing=>true))
mono = Monotonicer(Dict())
pltr = Plotter(Dict(:pdfoutput => true))
nothing #hide
```
## Regular TS Processing
Let's test by feeding the regular time series type to the pipeline. We expect that for this type,
`Monotonicer` will not perform further processing:
- Pipeline with `Monotonicer`: regular time series
```@example mono
pipeline = regularfilecsv |> valgator |> valnner |> mono |> pltr
fit_transform!(pipeline);
nothing #hide
```
- Pipeline without `Monotonicer`: regular time series
```@example mono
pipeline = regularfilecsv |> valgator |> valnner |> pltr
fit_transform!(pipeline);
nothing #hide
```
Notice that the plots are the same with or without the `Monotonicer` instance.
## Monotonic TS Processing
Let's now feed the same pipeline with monotonic csv data.
- Pipeline without `Monotonicer`: monotonic time series
```@example mono
pipeline = monofilecsv |> valgator |> valnner |> pltr
fit_transform!(pipeline);
nothing #hide
```
- Pipeline with `Monotonicer`: monotonic time series
```@example mono
pipeline = monofilecsv |> valgator |> valnner |> mono |> pltr
fit_transform!(pipeline);
nothing #hide
```
Notice that without the `Monotonicer` instance, the data is monotonic. Applying
the `Monotonicer` instance in the pipeline converts the data into
a regular time series but with outliers.
We can use the `Outliernicer` filter to remove outliers. Let's apply this filter after the
`Monotonicer` and plot the result.
- Pipeline with `Monotonicer` and `Outliernicer`: monotonic time series
```@example mono
using TSML: Outliernicer
outliernicer = Outliernicer(Dict(:dateinterval=>Dates.Hour(1)));
pipeline = monofilecsv |> valgator |> valnner |> mono |> outliernicer |> pltr
fit_transform!(pipeline);
nothing #hide
```
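For intuition about what an outlier filter does, the following sketch replaces values outside the Tukey fences with the median of the remaining values. This is an assumed, simplified rule chosen for illustration; `replace_outliers` is a hypothetical helper, and `Outliernicer` may use a different detection and replacement strategy.

```julia
using Statistics

# Hypothetical helper, not TSML's Outliernicer: replace values outside the
# Tukey fences [Q1 - 1.5*IQR, Q3 + 1.5*IQR] with the median of the inliers.
function replace_outliers(values::Vector{Float64})
    q1, q3 = quantile(values, [0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inlier = lower .<= values .<= upper
    fill_value = median(values[inlier])
    return [ok ? v : fill_value for (v, ok) in zip(values, inlier)]
end

noisy = [1.0, 1.2, 0.9, 1.1, 25.0, 1.0, 0.8, 1.3]
cleaned = replace_outliers(noisy)
println(cleaned)   # the spike of 25.0 is replaced by the inlier median, 1.0
```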
## Daily Monotonic TS Processing
Lastly, let's feed the daily monotonic data using a similar pipeline and examine its plot.
- Pipeline without `Monotonicer`: daily monotonic time series
```@example mono
pipeline = dailymonofilecsv |> valgator |> valnner |> pltr
fit_transform!(pipeline);
nothing #hide
```
This plot is characterized by a monotonically increasing trend that resets to a certain baseline value
at the end of each day and then repeats daily. The challenge for the monotonic normalizer
is to distinguish daily monotonic data from typical monotonic data so that it applies
the correct normalization.
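Why the two cases need different handling can be seen in a small example. The sketch below is an assumption about one reasonable normalization, not TSML's actual code: `daily_increments` is a hypothetical helper that differences the series and, at each reset, uses the new reading relative to an assumed `baseline` instead of the large negative jump.

```julia
# Hypothetical helper, not part of TSML: turn cumulative readings that reset
# to a baseline into per-interval increments.
function daily_increments(values::Vector{Float64}; baseline::Float64 = 0.0)
    steps = diff(values)
    for i in eachindex(steps)
        # A negative step marks a reset; the amount accumulated since the
        # reset is the new reading minus the baseline.
        if steps[i] < 0
            steps[i] = values[i + 1] - baseline
        end
    end
    return steps
end

# Two "days" of four readings each; the counter restarts on the second day.
readings = [1.0, 3.0, 6.0, 10.0, 2.0, 5.0, 9.0, 14.0]

println(diff(readings))               # [2.0, 3.0, 4.0, -8.0, 3.0, 4.0, 5.0]
println(daily_increments(readings))   # [2.0, 3.0, 4.0, 2.0, 3.0, 4.0, 5.0]
```

A plain finite difference leaves a spike of -8.0 at the reset, while the reset-aware version yields a consistent series of increments.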
- Pipeline with `Monotonicer`: daily monotonic time series
```@example mono
pipeline = dailymonofilecsv |> valgator |> valnner |> mono |> pltr
fit_transform!(pipeline);
nothing #hide
```
While the `Monotonicer` filter is able to transform the data into a regular time series,
there are significant outliers due to noise and the nature of this kind of data or sensor.
Let's remove the outliers by applying the `Outliernicer` filter and examine the result.
- Pipeline with `Monotonicer` and `Outliernicer`: daily monotonic time series
```@example mono
pipeline = dailymonofilecsv |> valgator |> valnner |> mono |> outliernicer |> pltr
fit_transform!(pipeline);
nothing #hide
```
The `Outliernicer` filter effectively removed the outliers as shown in the plot.