How do I create a function in Excel to fit a trendline for this data? The answer will power a cool analysis that I've been putting together.
Tuesday, April 28, 2009
Calling amateur mathematicians
How do I create a function in Excel to fit a trendline for this data? The answer will power a cool analysis that I've been putting together.
Posted by Consultant Ninja at 8:41 PM Labels: Analysis, Consulting, Excel
8 comments:
 Consultant Insider said...

Try y = 7/(x1)
Seems to fit pretty well. Don't know helpful it will be though.  April 28, 2009 at 11:28 PM


Try these:
1) y = 0,3115exp(0,5227x)
(R2 = 0,9595)
2) y = 0,0398x6  1,6883x5 + 27,6x4  217,62x3 + 847,63x2  1480,7x + 851,89
(R2 = 0,9901)
Fits good but for the practical use... it depends
Regards  April 29, 2009 at 3:51 AM
 Steve Shu said...

akznetsov looks like he is on the right track. #1 looks better to me (presuming the coefficients are right  seems so since R squared value was given). #2 seems like it could be an overfit regression. Maybe try the regression with fewer independent variables as opposed to 6.
 April 29, 2009 at 12:53 PM
 Steve Shu said...

Also, I don't know if you need to get a plugin for Excel to do the regression analysis. See this link as one possible source.
http://www.jeremymiles.co.uk/regressionbook/extras/appendix2/excel
I am on the new Excel. I can barely figure out how to format cells anymore, the interface is so different.  April 29, 2009 at 12:58 PM
 Mason said...
 This comment has been removed by the author.
 April 29, 2009 at 4:13 PM
 Mason said...

From the looks of the data I can see (assumptions may be violated/need to be checked based on the data sources), I have a few suggestions.
The most immediate models that come to mind are:
ln(y)=B0+B1ln(x1)
OR
ln(y)=B0+B1x1
I would also try a piecewise regressionperhaps try a cubic until a point k, and then try either of the log models for >k.
As far as the Lorenz curve...I don't think I can give an intelligent answer without knowing the nature of the data, but I would only try it if the other models fail.
Hope that helps! I enjoy reading your blog :)  April 29, 2009 at 4:20 PM
 Consultant Ninja said...

The
 April 29, 2009 at 4:22 PM
 Consultant Ninja said...

The data is IRS tax data.
http://www.irs.gov/taxstats/indtaxstats/article/0,,id=134951,00.html
Xvalue: Percentile of the US population below a certain income.
Yvalue: That income value.
Interesting stuff here. I'll have to chew through it. Thanks.  April 29, 2009 at 4:25 PM