Written: July 2021 (GitHub)

** Update **

The dataset shared on Kaggle is a subset of the data available to the public. There is more data on Kiva's website.

I'll release a second analysis with the expanded dataset soon. Apologies for the intermediate analysis in the meantime.


In 2018, Kiva released some of its data to the website Kaggle. If you haven't heard about Kaggle, don't worry about it. Kiva is a non-profit organization that gives out micro-loans to entrepreneurs in developing economies. These crowdfunded loans help grow the economy in these regions and have a very high repayment rate (~96%).

According to the dataset on Kaggle, Kiva has executed 671,205 loans totaling $527.56 million. There are around 1.35 million borrowers, 79.58% of whom are women.
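For anyone who wants to reproduce headline numbers like these, here is a minimal sketch using only the Python standard library. It assumes the loans file has a "funded_amount" column and a "borrower_genders" column holding comma-separated values such as "female, female, male". The file name "kiva_loans.csv", the helper names, and the choice of "funded_amount" for the total are my assumptions, not necessarily how the figures above were produced.

```python
import csv


def load_rows(path):
    """Read a CSV file into a list of dicts (one dict per loan)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def summarize_loans(rows, amount_column="funded_amount"):
    """Count loans and borrowers, total the amounts, and compute the share of women.

    Assumes 'borrower_genders' lists one gender per borrower, separated by commas.
    """
    n_loans = 0
    total_amount = 0.0
    n_borrowers = 0
    n_women = 0
    for row in rows:
        n_loans += 1
        total_amount += float(row.get(amount_column) or 0)
        # Split "female, female, male" into individual borrowers, skipping blanks.
        raw = row.get("borrower_genders") or ""
        genders = [g.strip() for g in raw.split(",") if g.strip()]
        n_borrowers += len(genders)
        n_women += sum(1 for g in genders if g == "female")
    pct_women = 100.0 * n_women / n_borrowers if n_borrowers else 0.0
    return {
        "n_loans": n_loans,
        "total_amount": total_amount,
        "n_borrowers": n_borrowers,
        "pct_women": pct_women,
    }
```

Calling `summarize_loans(load_rows("kiva_loans.csv"))` would then give the four numbers in one dictionary. Loans with an empty gender field still count as loans but add no borrowers.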

Plotting is always super helpful: in this case, it makes it obvious to me that there are some outliers here! One project has over 800 lenders for $10,000. This project was for "creat[ing] more than 300 jobs for women and farmers in rural Haiti." It seems to be the work of Yunus Social Business (YSB); however, I can't find this partner in "loan_themes_by_region.csv". Their Wikipedia page does list them as serving Haiti.

This may just be a quirk of the dataset. I also noticed that "loan_themes_by_region.csv" only accounts for $315.34 million of the total $506 million in the dataset with "partner_id"s. [Hopefully this resolves in the updated data.]
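One way to dig into a gap like this is to check which partner IDs in the loans file never show up in the themes file, and how much funding they account for. The sketch below is only an illustration of that check, not the calculation behind the figures above. It assumes loan rows are dicts with "partner_id" and "funded_amount" keys, and that you have already collected the set of partner IDs from "loan_themes_by_region.csv" yourself; the function names are hypothetical.

```python
def normalise_id(value):
    """Turn IDs such as '247.0', '247' or 247 into the string '247'."""
    return str(int(float(value)))


def partner_coverage(loan_rows, known_partner_ids):
    """Split funded amounts by whether the loan's partner appears in the themes file.

    Returns (covered_total, uncovered_total, missing_ids).
    Loans with no partner_id at all are skipped.
    """
    known = {normalise_id(p) for p in known_partner_ids}
    covered = 0.0
    uncovered = 0.0
    missing = set()
    for row in loan_rows:
        raw_id = str(row.get("partner_id") or "").strip()
        if not raw_id:
            continue  # no field partner recorded for this loan
        pid = normalise_id(raw_id)
        amount = float(row.get("funded_amount") or 0)
        if pid in known:
            covered += amount
        else:
            uncovered += amount
            missing.add(pid)
    return covered, uncovered, missing
```

The `missing` set is the interesting part: it lists the partners that fund loans but have no entry in the themes file.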

Globe of Kiva's influence

The current populations of the regions are given too. If this is too much information, you may remove them from the figure by clicking the relevant dots in the legend.

Frequency of tags used

It may be surprising to see that the most used tag is "#Parent", although in hindsight it makes sense. Having responsibility outside of yourself, especially for your child, can fire your motivation. Moreover, it's a shared experience between lenders and borrowers that can build sympathy. A combination of these two reasons is probably why the "#Parent" tag correlates so strongly with the "funded_amount". To philosophize some more, all the other tags are indirectly related to this one, so it is a commonality shared by all. All themes are directed towards making the future a better place for our children. The "#Parent" tag is just the literal manifestation of this desire.

Correlations of Tags

It's interesting that "#Health and Sanitation", "#Technology", and "#Eco-friendly" correlate negatively with the funded amounts, meaning projects without these tags tend to be funded more. I'm currently working in a neuroscience lab, so the existence of p-values < 0.01 is pretty shocking.
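To make the idea of a tag "correlating" with the funded amount concrete, here is a rough sketch: turn each tag into a 0/1 indicator per loan and compute its Pearson correlation with "funded_amount" (with a binary variable this is the point-biserial correlation). The p-value uses a large-sample normal approximation to the usual t-test, which is reasonable for hundreds of thousands of loans but not for small samples. The "tags" column format (comma-separated, e.g. "#Parent, #Technology") and the function name are assumptions, and this is not necessarily the method behind the plot above.

```python
import math


def tag_correlation(rows, tag, amount_column="funded_amount"):
    """Pearson correlation between 'loan has this tag' (0/1) and the amount.

    Returns (r, p), where p is a two-sided p-value from a normal
    approximation to the t-statistic (intended for large n only).
    """
    xs, ys = [], []
    for row in rows:
        tags = {t.strip() for t in (row.get("tags") or "").split(",")}
        xs.append(1.0 if tag in tags else 0.0)
        ys.append(float(row.get(amount_column) or 0))

    n = len(xs)
    if n < 3:
        raise ValueError("need at least 3 loans")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("tag indicator or amount is constant")

    # Clamp to [-1, 1] to guard against floating-point overshoot.
    r = max(-1.0, min(1.0, sxy / math.sqrt(sxx * syy)))
    if abs(r) == 1.0:
        return r, 0.0
    t_stat = r * math.sqrt((n - 2) / (1.0 - r * r))
    # Two-sided tail probability of a standard normal at |t_stat|.
    p = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return r, p
```

A negative `r` for a tag means loans carrying it tend to have lower funded amounts than loans without it. If SciPy is available, `scipy.stats.pearsonr` gives the same coefficient with an exact p-value.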

Again, notice how well the "#Parent" tag correlates with the "funded_amount". It's hard to tell whether this is due to lenders being more sympathetic to parents or whether parents are more motivated entrepreneurs than non-parents.