Today I came across the need to clone a bunch of git repos from an organization’s page on GitHub. A perfect chance for my all-time favorite quick challenge: a down-and-dirty bash script!

Here’s how to use the GitHub public API to list all the public repos for an org (quote the URL, or the shell will treat the & as a background operator):

curl "https://api.github.com/orgs/<ORG_NAME>/repos?per_page=100&page=1"

The maximum per_page value allowed by the GitHub API is 100. To check whether you can grab all the results in one shot, we can use jq’s length filter.

I’m using Confluent Inc, the company founded by the original creators of Apache Kafka. (warning…they have a few large repos)

curl -s "https://api.github.com/orgs/confluentinc/repos?per_page=100&page=1" | jq 'length'

At publishing time they have 38 repos, so we can grab them all in one shot. If your org has more than 100, just page through the results by incrementing the page parameter in the URL.
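If you do need to page, here’s a rough sketch (the loop is my own; past the last page the API simply returns an empty array, which is what we test for):

page=1
while true; do
  # grab one page of repos
  results=$(curl -s "https://api.github.com/orgs/confluentinc/repos?per_page=100&page=${page}")
  # an empty array means we’ve run out of pages
  [ "$(echo "$results" | jq 'length')" -eq 0 ] && break
  echo "$results" | jq -r '.[].clone_url'
  page=$((page + 1))
done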

A quick scan of the original response body shows an array of repo JSON objects, each with a clone_url field at the top level. That’s the field we need to grab.
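Each element of that array looks roughly like this (heavily abbreviated; the real objects carry dozens of other fields, and the values here are placeholders):

[
  {
    "name": "some-repo",
    "clone_url": "https://github.com/confluentinc/some-repo.git",
    ...
  }
]

We can pull that field out easily with jq (the -r flag prints raw strings, dropping the double quotes):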

curl -s "https://api.github.com/orgs/confluentinc/repos?per_page=100&page=1" | jq -r '.[].clone_url'

Now we just need to loop over each URL and clone it. We can do that with a quick while/read loop, fed by process substitution.

more on while/read loops in this post

while read -r repo; do
  git clone "$repo"
done < <(curl -s "https://api.github.com/orgs/confluentinc/repos?per_page=100&page=1" | jq -r '.[].clone_url')
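Putting it all together, a minimal version of the whole script might look like this (parameterizing the org name is my own addition; assumes git and jq are on the PATH):

#!/usr/bin/env bash
# clone-org.sh: clone every public repo in a GitHub org (first 100 repos only)
# usage: ./clone-org.sh confluentinc
org="${1:?usage: $0 <org>}"

while read -r repo; do
  git clone "$repo"
done < <(curl -s "https://api.github.com/orgs/${org}/repos?per_page=100&page=1" | jq -r '.[].clone_url')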

Final note…if you want to clone some private repos, you can pass curl a -u flag with your GitHub username, and you’ll be prompted for a password before the API responds (these days GitHub expects a personal access token in place of your account password).

curl -u <YOUR_USER_NAME> -s "https://api.github.com.....
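For example, hitting the same endpoint with basic auth:

curl -u <YOUR_USER_NAME> -s "https://api.github.com/orgs/<ORG_NAME>/repos?per_page=100&page=1" | jq -r '.[].clone_url'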