
Similar questions have been asked in the past, but I haven't been satisfied with any of the answers and none matches what I require. I believe the problem has been that the questions have been ambiguously posed. I'll try to do better.

Imagine that you have data describing a process with two components: a signal component and a background component. Each record has x, y, and a label indicating whether the (x, y) point is signal or background. A 2D histogram of the data (perhaps plotted with ggplot2) looks roughly like a flat inclined plane (the background component) with a bump or peak (the signal component) that could be modelled as a 2D Gaussian or something similar.

I want to subtract the background component from the data so that only the signal component remains. The bump may not be obvious, though; it may be very small compared to the background. So the point of subtracting the background is to be able to search for bumps, and I'd rather not manufacture a bump through some unfortunate choice of parameters for a difference of two KDEs.

I just want to bin the data, stratify it by label (signal/background), and subtract one 2D histogram from the other. Then I want to plot the result with ggplot2, because I want a nice publication-ready plot. A hexbin plot, tile plot, 2D contour plot, or poly plot would do nicely. Any ideas on how to do this?

Other similar questions have answers that suggest creating two KDEs and subtracting those, but I don't want this extra layer of complexity. I just want the difference of the raw 2D bin counts. Surely there is a simple way to do this?
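
To make the operation concrete, here is a minimal sketch of the kind of thing I have in mind, run on made-up data. The data frame `df`, its column names (`x`, `y`, `label`), the 20 x 20 grid on [0, 1], and the names `edges`, `mids`, `delta`, and `plot_df` are all assumptions for illustration. It bins both strata on the same grid with base R's `cut()` and `table()`, subtracts the raw counts, and draws the result with `geom_tile()`. I am not sure this is the idiomatic ggplot2 way, which is partly why I am asking.

```r
library(ggplot2)

# Hypothetical example data: flat background plus a small Gaussian bump.
set.seed(42)
n_bg <- 5000
n_sig <- 500
df <- data.frame(
  x = c(runif(n_bg), rnorm(n_sig, mean = 0.5, sd = 0.05)),
  y = c(runif(n_bg), rnorm(n_sig, mean = 0.5, sd = 0.05)),
  label = rep(c("background", "signal"), times = c(n_bg, n_sig))
)

# Shared bin edges so both histograms use exactly the same grid.
edges <- seq(0, 1, length.out = 21)
mids <- head(edges, -1) + diff(edges) / 2

# Integer bin index for each point along each axis.
ix <- cut(df$x, edges, labels = FALSE, include.lowest = TRUE)
iy <- cut(df$y, edges, labels = FALSE, include.lowest = TRUE)

# 20 x 20 x 2 array of raw counts: x bin, y bin, label.
counts <- table(factor(ix, levels = 1:20), factor(iy, levels = 1:20), df$label)

# Difference of raw bin counts, signal minus background.
delta <- counts[, , "signal"] - counts[, , "background"]

# Long format for ggplot2; expand.grid() varies x fastest, which matches
# the column-major order of as.vector(delta).
plot_df <- expand.grid(x = mids, y = mids)
plot_df$count_diff <- as.vector(delta)

p <- ggplot(plot_df, aes(x, y, fill = count_diff)) +
  geom_tile() +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red", midpoint = 0) +
  labs(fill = "signal - background")
print(p)
```

This subtracts raw counts without any normalisation, so when the two strata have very different sizes the difference is dominated by the larger one.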
