Wednesday, April 4, 2018

lendingclub strategy backtesting and comparison

About three years ago, I developed and deployed automatic lendingclub note picker models.  Those models were trained and deployed with only limited testing.  Since then I have accumulated 2500+ notes at $25 apiece.  With enough time and data, it is time to review the results and get a better understanding of my models' performance.
One obvious measure is IRR, which many other lendingclub investors use.  I decided not to use it for a few reasons:
1. I've added and withdrawn cash from my account multiple times, and the timing and amounts are hard to track.
2. I'd like to compare the performance of different models and different note vintages.
3. I am only interested in a relative performance measure.
So I chose a simpler ROI measure.  For loans that are still alive, I apply a 2% liquidity penalty to the remaining principal of current loans (delinquent loans are excluded) to be conservative.  To liquidate loans on Foliofn, you have to pay a 1% fee, and you will likely have to discount the note to find a buyer; hence the 2% cost assumption.
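To make the measure concrete, here is a minimal Python sketch of one way to compute it. It is not my production code: the Note fields, the status labels, and the exact formula (payments received plus penalized remaining principal, divided by the amount invested, minus one) are assumptions made for illustration. In particular, I read "delinquent loans are excluded" as counting their remaining principal as zero.

```python
from dataclasses import dataclass

# From the post: 1% selling fee plus an assumed ~1% discount to find a buyer.
LIQUIDITY_PENALTY = 0.02


@dataclass
class Note:
    # All field names and status labels here are hypothetical.
    invested: float             # amount paid for the note, e.g. 25.0
    received: float             # payments received so far (principal + interest)
    remaining_principal: float  # outstanding principal
    status: str                 # "current", "late", "paid" or "charged_off"


def note_value(note: Note) -> float:
    """Cash received so far plus the assumed liquidation value of the note."""
    if note.status == "current":
        return note.received + note.remaining_principal * (1 - LIQUIDITY_PENALTY)
    # Assumption: remaining principal of delinquent or finished notes counts as zero.
    return note.received


def portfolio_roi(notes) -> float:
    """Simple ROI: total value divided by total invested, minus one."""
    invested = sum(n.invested for n in notes)
    if invested == 0:
        raise ValueError("no invested capital")
    return sum(note_value(n) for n in notes) / invested - 1
```

Calling portfolio_roi on the notes of one portfolio and one issue year gives one number per group. Because every group is measured the same way, the numbers are comparable across models and vintages even though they are not time-weighted returns.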




portfolioName  issue_year    roi
filter         2015         9.6%
filter         2016         2.2%
filter         2017         2.8%
filter         2018        -2.3%
model1         2015         8.1%
model1         2016         5.1%
model1         2017         3.2%
model1         2018        -2.4%
model2         2018        -2.0%
Again, the absolute numbers don't matter.  The machine learning model outperforms the simple filter in 2016 and 2017.  As noted in a previous analysis, the quality of borrowers on the lendingclub platform deteriorated during that period.  It is likely that my simple filter was not able to screen out bad borrowers when there were so many of them, while my ML model was able to discriminate and select relatively better borrowers.  A grade breakdown shows that my ML model (upper figure) preferred B-grade notes, while my filter (lower figure) represents the filtered population.  According to lendingclub's statistics, B-grade notes returned the most among all grades in 2015-2016, which explains the difference between the filter and the ML model.


These existing models and a newly developed model (model2) went through a new backtest using the same ROI estimate.  All loans issued after 2015 went through selection on a monthly basis, and the better loans were chosen according to each model.  The resulting ranking is model2 > model1 > filter > random selection.  Many research papers have suggested inefficiency in lendingclub's grading and interest rate assignment model, and my backtesting and actual returns support their findings.
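The monthly selection loop can be sketched roughly as follows. This is an illustrative outline, not the actual backtest: the dictionary keys (issue_month, invested, value), the picks_per_month parameter, and the scoring functions are all hypothetical, and value stands for whatever the ROI estimate above assigns to the loan.

```python
import random
from collections import defaultdict


def backtest(loans, score, picks_per_month):
    """Pick the top-scored loans in each issue month and return the ROI of the picks.

    loans: iterable of dicts with hypothetical keys
           'issue_month', 'invested' and 'value'.
    score: function mapping a loan to a number, higher meaning better.
           It should only use information available at issue time.
    """
    by_month = defaultdict(list)
    for loan in loans:
        by_month[loan["issue_month"]].append(loan)

    chosen = []
    for month in sorted(by_month):
        ranked = sorted(by_month[month], key=score, reverse=True)
        chosen.extend(ranked[:picks_per_month])

    invested = sum(loan["invested"] for loan in chosen)
    if invested == 0:
        raise ValueError("no loans selected")
    return sum(loan["value"] for loan in chosen) / invested - 1


def random_score(seed=0):
    """Baseline scorer: a random rank per loan, i.e. random selection."""
    rng = random.Random(seed)
    return lambda loan: rng.random()
```

Running backtest once per strategy (each model's score, a filter expressed as a 0/1 score, and random_score) on the same loans gives ROI figures that can be ranked against each other.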
