Given a set of x-y values like the above, where X=0%, Y=0%*, and at X=100%, Y=infinity.
How do I create a function in Excel to fit a trendline for this data? The answer will power a cool analysis that I've been putting together.
Management Consultant | Excel Jockey | Slide Monkey | Corporate Insurgent | One-Eyed Man in the Valley of the Blind
Mckinsey | Bain | BCG | Booz | Oliver Wyman
Given a set of x-y values like the above, where X=0%, Y=0%*, and at X=100%, Y=infinity.
Posted by
Consultant Ninja
at
8:41 PM
Labels:
Analysis,
Consulting,
Excel
Try y = -7/(x-1)
Seems to fit pretty well. Don't know helpful it will be though.
Try these:
1) y = 0,3115exp(0,5227x)
(R2 = 0,9595)
2) y = 0,0398x6 - 1,6883x5 + 27,6x4 - 217,62x3 + 847,63x2 - 1480,7x + 851,89
(R2 = 0,9901)
Fits good but for the practical use... it depends
Regards
akznetsov looks like he is on the right track. #1 looks better to me (presuming the coefficients are right - seems so since R squared value was given). #2 seems like it could be an overfit regression. Maybe try the regression with fewer independent variables as opposed to 6.
Also, I don't know if you need to get a plug-in for Excel to do the regression analysis. See this link as one possible source.
http://www.jeremymiles.co.uk/regressionbook/extras/appendix2/excel
I am on the new Excel. I can barely figure out how to format cells anymore, the interface is so different.
From the looks of the data I can see (assumptions may be violated/need to be checked based on the data sources), I have a few suggestions.
The most immediate models that come to mind are:
ln(y)=B0+B1ln(x1)
OR
ln(y)=B0+B1x1
I would also try a piecewise regression--perhaps try a cubic until a point k, and then try either of the log models for >k.
As far as the Lorenz curve...I don't think I can give an intelligent answer without knowing the nature of the data, but I would only try it if the other models fail.
Hope that helps! I enjoy reading your blog :)
The
The data is IRS tax data.
http://www.irs.gov/taxstats/indtaxstats/article/0,,id=134951,00.html
X-value: Percentile of the US population below a certain income.
Y-value: That income value.
Interesting stuff here. I'll have to chew through it. Thanks.