Ever wondered who is coding what in your city and how to keep track of it, maybe grep
projects by keyword?
This is your tool! (babashka >= 1.0.171 mandatory)
EDN is a data format similar to JSON, but closer to Clojure's native data notation.
A result can look like this:
{:name "Simon Neutert"
:hireable true
:languages ["HTML"]
:bio "I'm an HTML hacker."
:location "Area 50++"
:public-repos 123
:repos-url "https://api.github.com/users/simonneutert/repos"
:type "User"}
If you need JSON instead, I can highly recommend jet for the conversion.
- up to 1000 users per city + language combination (sorted by "users' public repositories count")
- if a city has fewer than 1000 users in total, you can download by location only
- concurrency built-in 🚀
- get all users (not just 1000)
- implement automatic bucketing, sliding through the limits
- PROBLEM: GitHub sets the limit here 🥴
- tests?! 🧌
- sort by active last week? OR created in year?
- speed isn't crucial, but utilizing some of clojure.core.async magic could speed things up 10x maybe 🤔 pmap ftw 🎉
- babashka (the latest version this code is known to support is currently 1.0.171)
- GitHub API Token (Personal Access Tokens)
- Java doesn't hurt, either
Make sure your ENV has the GITHUB_HIRE_TOKEN at hand.
I do it like this:
In a terminal, enter
$ export GITHUB_HIRE_TOKEN="<my-token-here>"
then, from that same terminal, open your IDE of choice, like
$ code .
or have the export in your .zshrc 🤗 or whatever your shell loads at start.
🥳 happy times in the REPL
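A missing token is easy to overlook, so a small guard can help. This is a minimal sketch, not part of the tool: the function name require_token is made up for illustration, and it only checks that GITHUB_HIRE_TOKEN is non-empty, not that GitHub accepts it.

```shell
# Hypothetical helper (not shipped with this repo): fail early when the
# GITHUB_HIRE_TOKEN environment variable is unset or empty.
require_token() {
  if [ -z "${GITHUB_HIRE_TOKEN:-}" ]; then
    echo "GITHUB_HIRE_TOKEN is not set, export it first" >&2
    return 1
  fi
}

# Usage: only start scraping when the token is present.
# require_token && bb scrape mainz
```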
All of the following should work in Docker, too.
The simplest way for you is to use the given Dockerfile.
$ docker build --build-arg github_hire_token=${GITHUB_HIRE_TOKEN} -t git-hire .
$ docker run -it --rm git-hire
If you need to store the profiles, you can mount a docker volume, but this goes beyond the scope of this README.
$ bb scrape <location-like-city-or-country>
This saves the GitHub profiles as .edn files into the profiles directory.
One caveat, as GitHub Support let me know:
When using the language qualifier when searching for users, it will only return users where the majority of their repositories use the specified language. (please, see documentation)
Narrow the search further by adding a language:
$ bb scrape <location-like-city-or-country> <language>
Be warned! This might not find a PHP dev who switched to Rust recently, as described by GitHub's Support.
Or if the city is too crowded, try loading mainstream languages for a given city.
Watch your rate limits
After having built a pool of profiles, use
$ bb search-keyword "rust"
and/or see examples given below.
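For a rough idea of what a keyword search over the downloaded pool amounts to, here is a plain grep sketch. It is an approximation under assumptions, not necessarily what bb search-keyword does internally: the helper name profiles_matching is hypothetical, and it assumes the profiles are .edn files inside the profiles directory.

```shell
# Hypothetical helper (not part of the tool): list profile files that
# mention a keyword, ignoring case. Assumes .edn files under ./profiles.
profiles_matching() {
  keyword="$1"
  dir="${2:-profiles}"
  # -r recurse, -i ignore case, -l print file names only, -F literal keyword
  grep -rilF --include='*.edn' -- "$keyword" "$dir"
}

# Usage:
# profiles_matching "ruby on rails"
```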
$ bb scrape mainz
$ bb scrape "Bad Schwalbach"
$ bb scrape wiesbaden java
$ bb scrape wiesbaden php
$ bb scrape mainz javascript
$ bb search-keyword <search term skill framework else>
$ bb search-keyword android
$ bb search-keyword "ruby on rails"
$ bb search-keyword nuxt
You might go further by piping to bb again, unimaginable possibilities...
$ mkdir -p rails && grep -ril --null rails profiles | xargs -0 -I{} cp {} rails/
$ bb search-keyword "ios" | bb -e '(map #(str/upper-case %) *input*)'
$ bb read-profile.clj simonneutert
go further, by piping
$ bb read-profile.clj simonneutert | bb -e '(:languages *input*)'
read many profiles
$ bb search-keyword ruby | bb -e '(mapv #(edn/read-string (slurp %)) *input*)'
map out name and bio, where bio is provided
$ bb search-keyword ruby |\
bb -e '(mapv #(edn/read-string (slurp %)) *input*)' |\
bb -e '(mapv #(select-keys % [:name :bio]) *input*)' |\
bb -e '(remove #(nil? (:bio %)) *input*)'
map out name and bio, where bio is provided, filter by bio containing "apple"
$ bb search-keyword ruby |\
bb -e '(mapv #(edn/read-string (slurp %)) *input*)' |\
bb -e '(mapv #(select-keys % [:name :bio]) *input*)' |\
bb -e '(remove #(nil? (:bio %)) *input*)' |\
bb -e '(filter #(clojure.string/includes? (clojure.string/lower-case (:bio %)) "apple") *input*)' |\
bb -e '(clojure.pprint/pprint *input*)'
what you came here for 🔥 find all hireable
search-keyword git is sort of a hack returning all profiles you downloaded at this point
$ bb search-keyword git |\
bb -e '(mapv #(edn/read-string (slurp %)) *input*)' |\
bb -e '(remove #(nil? (:hireable %)) *input*)'
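The same question can be roughed out with grep alone, skipping the search-keyword git hack. This is a sketch under assumptions: the helper name hireable_profiles is hypothetical, and it relies on the profiles being written as in the sample result above, with :hireable followed by true on the same line.

```shell
# Hypothetical helper (not part of the tool): list profile files that
# contain ":hireable true". Assumes .edn files under ./profiles.
hireable_profiles() {
  dir="${1:-profiles}"
  # -r recurse, -l print file names only, -E extended regex
  grep -rlE --include='*.edn' -- ':hireable[[:space:]]+true' "$dir"
}

# Usage:
# hireable_profiles
```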
# using httpie
GITHUB_HIRE_SINCE_YEAR=2019;
GITHUB_HIRE_LOCATION=wiesbaden;
https -A bearer -a ${GITHUB_HIRE_TOKEN} \
"https://api.github.com/search/users?q=created%3A%3E${GITHUB_HIRE_SINCE_YEAR}-01-01+location%3A${GITHUB_HIRE_LOCATION}+repos%3A%3E1&type=Users" \
"Accept":"application/vnd.github.v3+json"
# using httpie and jq
GITHUB_HIRE_SINCE_YEAR=2019;
GITHUB_HIRE_LOCATION=wiesbaden;
https -A bearer -a ${GITHUB_HIRE_TOKEN} \
"https://api.github.com/search/users?q=created%3A%3E${GITHUB_HIRE_SINCE_YEAR}-01-01+location%3A${GITHUB_HIRE_LOCATION}+repos%3A%3E1&type=Users" \
"Accept":"application/vnd.github.v3+json" |\
jq '.items | map(select(.type == "User")) | .[] |.repos_url'
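Building on the created: qualifier used in the requests above, one way to get past the 1000-result cap could be to split a location into one query per account-creation year. The following is only a sketch of that bucketing idea, not something the tool implements: the function name bucket_urls is made up, it only prints the URLs, and a single year in a crowded city may still exceed the limit and need narrower ranges.

```shell
# Hypothetical helper (not part of the tool): print one user-search URL per
# account-creation year, using GitHub's "created:FROM..TO" range syntax.
bucket_urls() {
  location="$1"
  year="$2"
  last_year="$3"
  while [ "$year" -le "$last_year" ]; do
    # %% is a literal percent sign in the printf format string
    printf 'https://api.github.com/search/users?q=created%%3A%s-01-01..%s-12-31+location%%3A%s+repos%%3A%%3E1&type=Users\n' \
      "$year" "$year" "$location"
    year=$((year + 1))
  done
}

# Prints three URLs, one each for 2019, 2020 and 2021.
bucket_urls wiesbaden 2019 2021
```

Each printed URL can then be requested the same way as in the httpie examples.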
Some stuff you would want to know/read as a beginner.
- REPL fails and outputs
; : Can't set!: *current-length* from non-binding thread user
pmap and curl don't play well with each other in the shell (I guess).
Don't worry, run the tool from the shell:
$ bb scrape berlin ruby
it will fire up some threads 🔥
https://clojure.org/guides/editors#_vs_code_rapidly_evolving_beginner_friendly
CLI to transform between JSON, EDN and Transit, powered with a minimal query language.
https://github.com/borkdude/jet
$ bb search-keyword ruby |\
bb -e '(mapv #(edn/read-string (slurp %)) *input*)' |\
jet --to json