Great that you are trying to do manual processing as well. It is a good way to learn the possibilities and limitations of using gaze data as an input modality. I am sure there will always be situations where the high-level interactions included in the EyeX Engine are not enough, so it is good to have a backup plan and know how to handle the raw gaze data.
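When working with raw gaze data, the first thing you usually need is some smoothing, since the samples jitter around the true fixation point. A minimal sketch of a one-pole low-pass (exponential moving average) filter, assuming raw (x, y) samples in screen pixels; this class and its names are hypothetical, not part of the EyeX SDK:

```python
class GazeSmoother:
    """Exponential moving average filter for noisy raw gaze samples.
    Hypothetical sketch; not an EyeX Engine API."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha  # smoothing factor: lower = smoother output, more lag
        self.x = None
        self.y = None

    def update(self, gx, gy):
        """Feed one raw gaze sample; return the smoothed position."""
        if self.x is None:
            # First sample initialises the filter state.
            self.x, self.y = float(gx), float(gy)
        else:
            self.x += self.alpha * (gx - self.x)
            self.y += self.alpha * (gy - self.y)
        return self.x, self.y


smoother = GazeSmoother(alpha=0.5)
smoother.update(100, 100)        # first sample passes through unchanged
sx, sy = smoother.update(200, 100)  # jumpy sample is pulled halfway: (150.0, 100.0)
```

The trade-off is the usual one: a smaller alpha suppresses more jitter but makes the cursor feel sluggish when the user actually moves their gaze.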
However, in most cases where you want to know “which object is the user looking at”, it is very helpful to use the features that the EyeX Engine provides. The object tracking/snapping, also discussed in this thread, is a way to make it easier for the user to hit small objects even though the gaze data is noisy. A similar technique is used on touch interfaces. I cannot give you any more implementation details right now, but maybe we can share more info later on if there is demand.
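To give a rough idea of what snapping means in practice (this is my own illustrative sketch, not how the EyeX Engine actually implements it): the gaze point is matched against known object positions, and if it lands within some radius of an object, the hit is attributed to that object. All names and the radius value below are hypothetical:

```python
import math


def snap_gaze_to_object(gaze_point, objects, snap_radius=60.0):
    """Return the id of the object whose centre is closest to the gaze
    point, if it lies within snap_radius pixels; otherwise None.

    objects: dict mapping object id -> (centre_x, centre_y) in pixels.
    Hypothetical helper; the real engine uses its own interactor geometry.
    """
    gx, gy = gaze_point
    best_id, best_dist = None, snap_radius
    for obj_id, (cx, cy) in objects.items():
        d = math.hypot(cx - gx, cy - gy)
        if d <= best_dist:
            best_id, best_dist = obj_id, d
    return best_id


buttons = {"ok": (100.0, 100.0), "cancel": (300.0, 100.0)}
snap_gaze_to_object((110, 95), buttons)   # noisy sample near "ok" snaps to it
snap_gaze_to_object((500, 500), buttons)  # far from everything -> None
```

In effect this enlarges each target's hit area, which is the same idea touch interfaces use to compensate for fat fingers; here it compensates for gaze noise and calibration offsets instead.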