Since LendingClub announced its intention to discontinue its loan offering by the end of this year, I have started looking for alternatives. I used to have a small Prosper balance but stopped adding new cash a long time ago because of its awkward website design and the low returns from its automatic investment tool. Now that LendingClub will be unavailable, I took a second look at Prosper. It was fairly easy to set up its API and adapt my existing automatic investment framework, so the next step is to download historical data and build models. At first glance, Prosper's data are good and clean; however, I found that Prosper separates listing data from loan data and provides no key to link them. As a result, I cannot directly link borrower and loan features with their outcomes. Prosper does this intentionally to "protect" its internal model.
After looking at the data for a while, I realized I can still match the two data sets without a key. About half of the data points can be matched uniquely; the other half have multiple possible matches (usually 2-3), and I can simply average the candidates to create reasonably good synthetic data for modelling.
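To make the idea concrete, here is a minimal pandas sketch of this kind of keyless matching. It assumes the two files share a few attributes (origination date, amount, rate, term, rating) that can act as a composite key. Those column names, the `match_loans_to_listings` helper, and the toy rows are all made up for illustration and are not Prosper's actual schema or my production code. Listings that share the same composite key are averaged, so a uniquely matched loan gets its listing's features unchanged and an ambiguous loan gets the mean of its candidates.

```python
import pandas as pd

# Hypothetical shared columns; the real Prosper files may name them differently.
KEYS = ["origination_date", "amount", "borrower_rate", "term", "rating"]


def match_loans_to_listings(loans, listings, keys=KEYS, feature_cols=None):
    """Attach listing features to loans by joining on shared attributes.

    Loans with several candidate listings receive the mean of the candidates'
    numeric features; n_candidates records how ambiguous each match was.
    """
    if feature_cols is None:
        numeric = listings.select_dtypes("number").columns
        feature_cols = [c for c in numeric if c not in keys]
    grouped = listings.groupby(keys)
    synthetic = grouped[feature_cols].mean()      # average over candidates
    synthetic["n_candidates"] = grouped.size()    # 1 means a unique match
    return loans.merge(synthetic.reset_index(), on=keys, how="inner")


# Toy data, invented for this example.
loans = pd.DataFrame({
    "loan_id": [1, 2, 3],
    "origination_date": ["2020-01-02", "2020-01-02", "2020-01-03"],
    "amount": [5000, 10000, 5000],
    "borrower_rate": [0.12, 0.15, 0.12],
    "term": [36, 36, 60],
    "rating": ["B", "C", "B"],
    "defaulted": [0, 1, 0],
})
listings = pd.DataFrame({
    "origination_date": ["2020-01-02", "2020-01-02", "2020-01-02", "2020-01-03"],
    "amount": [5000, 10000, 10000, 5000],
    "borrower_rate": [0.12, 0.15, 0.15, 0.12],
    "term": [36, 36, 36, 60],
    "rating": ["B", "C", "C", "B"],
    "fico": [700, 660, 680, 720],
    "dti": [0.20, 0.30, 0.40, 0.10],
})

matched = match_loans_to_listings(loans, listings)
print(matched[["loan_id", "fico", "dti", "n_candidates", "defaulted"]])
```

Keeping `n_candidates` around is a deliberate choice: it makes it easy to train on unique matches only, or to down-weight the averaged rows, and compare the results.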