However, we could also incorporate temporal information by using $P_Y$ to smooth the output:
$$y_t = f(G_Y(x_t),P_Y(G_Y(x_{1:t-1})))$$
Here $f$ could be simple averaging:
$$y_t = \frac{G_Y(x_t) + P_Y(G_Y(x_{1:t-1}))}{2}$$
Alternatively, $f$ could be a non-linear function, possibly one that is learned.
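
To make the smoothing step concrete, here is a minimal Python sketch of the averaging case. It rests on several assumptions that are not in the text above: outputs are treated as scalars, `toy_G_Y` and `toy_P_Y` are hypothetical stand-ins for the real $G_Y$ and $P_Y$, and at the first time step (where there is no history for $P_Y$) the raw output $G_Y(x_1)$ is passed through unchanged.

```python
from typing import Callable, List, Sequence


def average(current: float, predicted: float) -> float:
    """The simple-averaging choice of f."""
    return (current + predicted) / 2.0


def smooth_outputs(
    xs: Sequence[float],
    G_Y: Callable[[float], float],
    P_Y: Callable[[List[float]], float],
    f: Callable[[float, float], float] = average,
) -> List[float]:
    """Compute y_t = f(G_Y(x_t), P_Y(G_Y(x_{1:t-1}))) for every t."""
    raw: List[float] = []       # G_Y(x_1), ..., G_Y(x_{t-1})
    smoothed: List[float] = []  # y_1, ..., y_t
    for x in xs:
        g = G_Y(x)
        if raw:
            # Combine the current output with the prediction from past outputs.
            y = f(g, P_Y(raw))
        else:
            # Assumption: with no history, fall back to the raw output.
            y = g
        raw.append(g)
        smoothed.append(y)
    return smoothed


# Hypothetical stand-ins, for illustration only.
def toy_G_Y(x: float) -> float:
    return 2.0 * x


def toy_P_Y(history: List[float]) -> float:
    # Predict that the next output equals the most recent one.
    return history[-1]


example = smooth_outputs([1.0, 2.0, 4.0], toy_G_Y, toy_P_Y)
print(example)  # [2.0, 3.0, 6.0]
```

Note that $P_Y$ is fed the raw outputs $G_Y(x_{1:t-1})$, as in the equation, rather than the already smoothed values. Because `f` is an ordinary argument, a non-linear or learned combiner could be passed in place of `average` without changing the loop.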