Data Analysis by Driven IFS

Background - Converting time series to symbol strings

Though there are others, here we describe two methods for converting a time series {x1, x2, ... , xn} into a symbol string {i1, i2, ... , in}.
One method is called equal size bins,
the other equal weight bins.
More methods will be presented in the sample.

For equal size bins, begin by finding
M = max{xi: i = 1, ... , n} and
m = min{xi: i = 1, ... , n}.
Then the range of the xi is R = M - m and
the bin boundaries are
B1 = m + (1/4)R,
B2 = m + (1/2)R, and
B3 = m + (3/4)R.
Then for each xk of the time series, the corresponding symbol ik is given by

ik = 4 if B3 <= xk <= M
ik = 3 if B2 <= xk < B3
ik = 2 if B1 <= xk < B2
ik = 1 if m <= xk < B1

We call the intervals [m, B1), [B1, B2), [B2, B3), and [B3, M] bin 1, bin 2, bin 3, and bin 4, respectively.
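
The equal size bin rule above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the original notes: the function name equal_size_symbols and the handling of a constant series (where R = 0) are assumptions made for the example.

```python
def equal_size_symbols(xs):
    """Map each value of a time series to a symbol 1..4 using equal size bins.

    Hypothetical helper: the name and the constant-series behavior are
    assumptions of this sketch, not part of the original description.
    """
    m, M = min(xs), max(xs)      # smallest and largest values
    R = M - m                    # range of the data
    b1 = m + R / 4               # boundary B1
    b2 = m + R / 2               # boundary B2
    b3 = m + 3 * R / 4           # boundary B3

    symbols = []
    for x in xs:
        if x >= b3:              # bin 4 is [B3, M], closed on the right
            symbols.append(4)
        elif x >= b2:            # bin 3 is [B2, B3)
            symbols.append(3)
        elif x >= b1:            # bin 2 is [B1, B2)
            symbols.append(2)
        else:                    # bin 1 is [m, B1)
            symbols.append(1)
    # If every value is identical, R = 0 and all boundaries equal m,
    # so every point receives symbol 4.
    return symbols


print(equal_size_symbols([0.0, 1.0, 2.0, 3.0, 4.0]))  # [1, 2, 3, 4, 4]
```

Here m = 0, M = 4, and R = 4, so the boundaries are B1 = 1, B2 = 2, and B3 = 3. The value 4 equals M and lands in bin 4 because that bin includes its right endpoint.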

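For equal weight bins, select the bin boundaries B1, B2, and B3 so that each bin contains one-quarter of the xk. That is, B1, B2, and B3 are the quartiles of the data. When n is not a multiple of 4, or when some values repeat, the bins can hold only approximately equal numbers of points.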
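
One way to choose equal weight boundaries is to sort the data and read off the values one-quarter, one-half, and three-quarters of the way through the sorted list. The sketch below assumes that particular choice. Other quartile conventions exist and can give slightly different boundaries. The function name equal_weight_symbols is hypothetical, and symbols are assigned with the same half-open bin rule used for equal size bins.

```python
def equal_weight_symbols(xs):
    """Map each value of a time series to a symbol 1..4 using equal weight bins.

    Hypothetical helper: taking the boundaries from the sorted values at
    positions n/4, n/2, and 3n/4 is an assumption of this sketch.
    """
    n = len(xs)
    s = sorted(xs)
    b1 = s[n // 4]               # boundary B1: about a quarter of the points lie below
    b2 = s[n // 2]               # boundary B2: about half of the points lie below
    b3 = s[(3 * n) // 4]         # boundary B3: about three-quarters lie below

    symbols = []
    for x in xs:
        if x >= b3:              # bin 4
            symbols.append(4)
        elif x >= b2:            # bin 3
            symbols.append(3)
        elif x >= b1:            # bin 2
            symbols.append(2)
        else:                    # bin 1
            symbols.append(1)
    return symbols


print(equal_weight_symbols([5, 1, 9, 3, 7, 2, 8, 4]))  # [3, 1, 4, 2, 3, 1, 4, 2]
```

For these eight distinct values the sorted list is 1, 2, 3, 4, 5, 7, 8, 9, so B1 = 3, B2 = 5, and B3 = 8, and each bin receives exactly two points. With repeated values the counts may be unequal, because equal values always fall in the same bin.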
