The other day I was curious: does every programming language have one of these? I decided to find out. I pointed my crawler at the question and trained a model to check whether each of the 3,006 languages I am tracking has a package repository. The results surprised me.
★ Only 1% have them
My model found only 39 languages with central package repositories (CRs). (For comparison, Wikipedia lists ~20.) That’s just ~1% of languages. I thought it would be higher.
★ ~30% of the Top 100 have them
If a programming language is popular enough to appear in my top 100 list, it is about 15-30x more likely to have a CR. Among languages outside the top 100, fewer than 1% have one.
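The multiplier is easy to sanity-check with the rounded numbers in this post. Here is a minimal sketch, assuming the comparison is between the top-100 share (~30%) and the overall base rate (39 of 3,006); that reading is my assumption, and the function names are just illustrative.

```python
# Figures quoted in this post (rounded; the top-100 share is approximate).
TOTAL_LANGUAGES = 3006
LANGUAGES_WITH_CR = 39
TOP_100_SHARE_WITH_CR = 0.30  # "~30% of the Top 100"


def base_rate(with_cr: int, total: int) -> float:
    """Share of all tracked languages that have a central repository."""
    return with_cr / total


def lift(segment_rate: float, overall_rate: float) -> float:
    """How many times more likely a segment is to have a CR than the base rate."""
    return segment_rate / overall_rate


overall = base_rate(LANGUAGES_WITH_CR, TOTAL_LANGUAGES)
print(f"overall share with a CR: {overall:.1%}")
print(f"top-100 lift over the base rate: {lift(TOP_100_SHARE_WITH_CR, overall):.0f}x")
```

With these inputs the lift comes out around 23x, which sits inside the 15-30x range. Comparing against only the languages outside the top 100 would give a larger multiplier.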
★ 2 Million+ Packages Total
There are over 2,000,000 packages (aka modules or libraries) across these CRs. That means there are hundreds of times more packages than there are programming languages (at least ~665x, taking 2,000,000 packages against the 3,006 languages I track).
With that many packages, name collision is certainly a problem (maybe a subject for another post), though not as much of a problem as in the domain system where 130,000,000+ “.coms” alone are registered.
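Both comparisons above are simple ratios of the rounded figures in this post. A quick sketch, treating 2,000,000 packages and 130,000,000 ".com" domains as lower bounds (the variable names are mine):

```python
# Rounded figures from this post; both totals are stated as "over" / "+",
# so treat the results as rough, order-of-magnitude values.
TOTAL_PACKAGES = 2_000_000
TRACKED_LANGUAGES = 3006
DOT_COM_DOMAINS = 130_000_000

packages_per_language = TOTAL_PACKAGES / TRACKED_LANGUAGES
domains_per_package = DOT_COM_DOMAINS / TOTAL_PACKAGES

print(f"packages per tracked language: ~{packages_per_language:.0f}")
print(f"'.com' domains per package: ~{domains_per_package:.0f}")
```

So the ".com" namespace is roughly 65x more crowded than all of these package registries combined, and unlike a single domain namespace, package names are split across dozens of separate registries.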
★ The top 5 account for ~80% of all packages
★ GitHub has over 100,000,000 repositories
Consider the size of GitHub and its growth as somewhat of a universal central package repository (though a totally unmoderated one). Many (if not the majority) of the packages in these CRs are also listed on GitHub. So it’s conceivable that GitHub is the largest CR and that the number of packages out there is easily 10x bigger than 2M.
★ Newer languages are not more likely to have CRs
This surprised me. The median age of a language with a CR is 24 years (created in 1995). All of the top 5 languages I mentioned above had been created by then. The creation of the CR almost always follows the launch of the language, sometimes by months and sometimes by years. I expected most CRs to belong to newer languages, but that wasn’t the case. While some new languages like Rust and Julia have CRs, others like Go and Kotlin do not.
★ A Visual
★ The List
Here’s my list of the main central package repositories for languages that have them. I cut the list a bit to only include CRs with more than 100 packages available. As always, let me know if you spot any omissions or mistakes!
Notes: as Jay18001 pointed out, a few of these repositories serve packages for more than one language. CocoaPods => Objective-C, and NuGet => F# and other .NET langs. In this post I collapsed things so each repo is listed under only one language.
Update: 8/26/2019. Multiple readers pointed out that my stat for Ruby was off by 10x. I used the # I found on this page (https://rubygems.org/gems), which turned out to be just the gems beginning with the letter A. I apologize for the mistake and am very grateful for the corrections.