Sunday, October 14, 2012

Linear Filter

I've been reading about linear filters that separate complex signals into contributions from individual components.  They work in a very simple manner and it seems a bit odd that they are effective at all, but they do work.

Here is an example. Suppose you observe meals being sold at a cafeteria. At the end of each day you record how many of each food item were sold (pizza slices, bottled water, salads, and fortune cookies) and the total amount of cash collected at the register. However, when you return from your data collection you realize you forgot to write down how much each item costs. Here is a plot of the data.

So, for example, on day 25 a total of $85.76 was made and 9 slices of pizza, 5 bottles of water, 5 salads and 6 cookies were sold. So how much did each item cost? (The individual signals between 1 and 10 in the plot contribute in different proportions to the combined signal.) We can work out the individual prices with a simple procedure: first guess a price of 100 cents for every item, then go through the days one at a time, raising the guesses if the predicted total for that day comes out too low and lowering them if it comes out too high. The key is to adjust the current price estimate of items that sold in larger numbers on a particular day a bit more than the estimate of items that sold in smaller numbers, because they contributed more to the total. Doing this over a 200-day period gives the following plot.


The y-axis here is in pennies.  You can see that the price estimates approach and stabilize around certain values over time.  In fact, the prices I used in this test example are $5.74 for pizza, $4.63 for salad, $1.89 for water and $0.25 for cookies, which are almost exactly the prices arrived at (within a few pennies) by the linear filtering procedure. 

In other words, the relative contributions of the individual components (individual prices) to the overall signal (total price) have been estimated with reasonable accuracy: a slice of pizza has a larger effect on the total than a cookie does, and so on.
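
The procedure described above can be sketched in a few lines of Python. This is only a rough sketch, not the exact code behind the plots: it assumes the update is the standard normalized least-mean-squares rule, it simulates its own sales data (1 to 10 of each item per day, with exact totals), and the names `simulate_days`, `estimate_prices` and `day_total` are made up for illustration.

```python
import random

# True prices in cents, taken from the post. The filter never sees these;
# they are only used to generate the daily totals and to check the result.
TRUE_PRICES = {"pizza": 574, "salad": 463, "water": 189, "cookie": 25}
ITEMS = list(TRUE_PRICES)


def day_total(counts):
    """Cash collected (in cents) for one day's item counts."""
    return sum(counts[item] * TRUE_PRICES[item] for item in ITEMS)


def simulate_days(n_days, seed=0):
    """Made-up sales data (an assumption): 1 to 10 of each item per day."""
    rng = random.Random(seed)
    days = []
    for _ in range(n_days):
        counts = {item: rng.randint(1, 10) for item in ITEMS}
        days.append((counts, day_total(counts)))
    return days


def estimate_prices(days, start=100.0, step=1.0):
    """Go through the days once, nudging each price guess after every day."""
    prices = {item: start for item in ITEMS}
    for counts, total in days:
        predicted = sum(counts[item] * prices[item] for item in ITEMS)
        error = total - predicted  # positive means the guesses are too low
        norm = sum(c * c for c in counts.values())
        for item in ITEMS:
            # Items sold in larger numbers that day get a larger correction.
            prices[item] += step * error * counts[item] / norm
    return prices


# Sanity check against the day 25 example: 8576 cents, i.e. $85.76.
print(day_total({"pizza": 9, "water": 5, "salad": 5, "cookie": 6}))

estimates = estimate_prices(simulate_days(200))
for item in ITEMS:
    print(f"{item:6s} true {TRUE_PRICES[item]:4d}  estimated {estimates[item]:7.1f}")
```

With exact totals and `step=1.0`, each update makes that day's predicted total match the real one. A smaller `step` moves more cautiously and needs more days to settle, which is probably the safer choice if the totals are noisy.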

In a simple case like this we could work out the individual prices by hand by sifting through the data (e.g. by finding examples where only one component changed). However, the filtering procedure can be automated on a computer and works for more complex datasets, even in cases where items are correlated (for example, if you tended to sell more bottled water with salads, or more cookies with pizza).
