Why was this so hard? It gets at the binding problem and, more scientifically, Treisman and Gelade's feature integration theory. In cognition, we conflate object and form. We perceive something shaped like a dog, and we assign it to the dog-concept. When we perceive an object, we see it as a whole object, composed of concept and form, not a mere representation or projection of that object. When programming the game in object-oriented JavaScript that updated an HTML5 canvas, I needed to shift my cognitive framework to think in the latter way. The shapes on the screen are mere projections of the game's behind-the-scenes logical operations. The objects themselves might change their position logically, but unless the forms are redrawn, you would never know. And that's really weird to think about: you might change the location of an object without that change being reflected in your environment. When a player plays the game, they see the moving ball as a complete ball, not a projection of a JavaScript ball object that is merely a bunch of numbers in a program. (And this doesn't even get to the lower-level representation of the ball-object and ball-picture by the computer...)
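To make the split between object and form concrete, here is a minimal sketch. It is not the game's actual code: the `Ball` class, its `move` and `draw` methods, and the `makeRecordingContext` helper are all hypothetical names I'm using for illustration. The helper stands in for a real canvas context so the example can run anywhere and simply remembers the last circle it was asked to draw.

```javascript
// Hypothetical Ball: the "real" ball is just a few numbers held in memory.
class Ball {
  constructor(x, y, radius) {
    this.x = x;
    this.y = y;
    this.radius = radius;
  }

  // Changes the logical position only; nothing on screen moves.
  move(dx, dy) {
    this.x += dx;
    this.y += dy;
  }

  // Projects the current state onto a 2D canvas context.
  draw(ctx) {
    ctx.beginPath();
    ctx.arc(this.x, this.y, this.radius, 0, Math.PI * 2);
    ctx.fill();
  }
}

// Hypothetical stand-in for a canvas 2D context: it records the last
// circle it was asked to draw instead of painting any pixels.
function makeRecordingContext() {
  return {
    lastCircle: null,
    beginPath() {},
    arc(x, y, r) {
      this.lastCircle = { x, y, r };
    },
    fill() {},
  };
}

const ball = new Ball(10, 10, 5);
const screen = makeRecordingContext();

ball.draw(screen);  // the form now matches the object
ball.move(20, 0);   // the object moves, but the form is stale
```

After `move`, the object says the ball is somewhere new while the picture still shows it in the old spot. The two only agree again once `draw` is called.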
In the beginning stages of my game, there was an even more off-putting bug. Because of how the HTML5 canvas works, anything you draw on it stays there until you "paint over" it or clear it. So, as the ball moved, it left a trail behind it, a ghost in full color, a map of time passed as an object moved through the environment.
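A common way to get rid of that kind of trail is to wipe the canvas at the start of every frame and then redraw. The sketch below shows the idea, assuming a ball described by `x`, `y`, and `radius`. The `renderFrame` function and the `demoBall` object are hypothetical names, not the game's own code, and the browser wiring assumes the page contains a `<canvas>` element.

```javascript
// Hypothetical per-frame renderer: erase the old picture, then draw the new one.
function renderFrame(ctx, ball, width, height) {
  ctx.clearRect(0, 0, width, height); // without this line, old frames pile up as a trail
  ctx.beginPath();
  ctx.arc(ball.x, ball.y, ball.radius, 0, Math.PI * 2);
  ctx.fill();
}

// Browser wiring, guarded so the file also loads outside a browser.
if (typeof document !== "undefined") {
  const canvas = document.querySelector("canvas"); // assumes a <canvas> on the page
  if (canvas) {
    const ctx2d = canvas.getContext("2d");
    const demoBall = { x: 20, y: 20, radius: 8 };

    const step = () => {
      // Update the logical position, wrapping at the right edge.
      demoBall.x = (demoBall.x + 2) % canvas.width;
      renderFrame(ctx2d, demoBall, canvas.width, canvas.height);
      requestAnimationFrame(step);
    };
    requestAnimationFrame(step);
  }
}
```

Commenting out the `clearRect` call should bring the ghost back, since every frame's circle would be painted on top of all the earlier ones.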
Challenges like these are why I love programming.