Use Babashka to work with data in Excel Spreadsheets!
bb-excel is a simple Clojure library to work with data in from Excel files without relying on the Apache POI OOXML library. See below for rationale.
Add the following dependency to your project.clj
, deps.edn
, or bb.edn
You can also install it as a bbin utility to convert Excel Sheets to EDN
bbin install com.github.kbosompem/bb-excel --latest-sha
Please note that being a beta version indicates the provisional status of this library, and that features are subject to change.
This library is meant for the simplest of spreadsheets.
The primary function is get-sheet
In the first example below we pull data from a specific sheet within the workbook. It returns a list of hash-maps with :_r key indicating the row within the sheet.
#!/usr/bin/env bb
(require '[babashka.deps :as deps])
(deps/add-deps
'{:deps {com.github.kbosompem/bb-excel {:mvn/version "0.1.1"}}})
(ns demo
(:require [clojure.java.io :as io]
[clojure.pprint :refer [print-table pprint]]
[bb-excel.core :refer [get-sheet get-sheets]]))
;; To specify file can use either filename or file object
;; To specify sheet can use either sheetname or sheet index(started form 1)
(get-sheet "test/data/simple.xlsx" "Shows" )
(get-sheet (io/file "test/data/simple.xlsx") 1)
;=>
({:_r 2, :A 1.0, :B "Sesame Street"}
{:_r 3, :A 2.0, :B "La Femme Nikita"}
{:_r 4, :A 3.0, :B "House M.D"}
{:_r 5, :A 4.0, :B "Breaking Bad"})
Let's print out the results for better readability
(print-table
(get-sheet "test/data/simple.xlsx" "Shows" ))
| :_r | :A | :B |
|-----+------+-----------------|
| 1 | Rank | TV Show |
| 2 | 1.0 | Sesame Street |
| 3 | 2.0 | La Femme Nikita |
| 4 | 3.0 | House M.D |
| 5 | 4.0 | Breaking Bad |
If there is a header row we can use the values from that row as keys.Set :hdr flag in opts to true and specify the row number.
(print-table
(get-sheet "test/data/simple.xlsx" "Shows" {:hdr true :row 1}))
| :_r | Rank | TV Show |
|-----+------+-----------------|
| 2 | 1.0 | Sesame Street |
| 3 | 2.0 | La Femme Nikita |
| 4 | 3.0 | House M.D |
| 5 | 4.0 | Breaking Bad |
You can also format the keys by providing a function. In the example below we keywordize each heading
(print-table
(get-sheet "test/data/simple.xlsx" "Shows" {:hdr true :fxn keyword :row 1}))
| :_r | :Rank | :TV Show |
|-----+-------+-----------------|
| 2 | 1.0 | Sesame Street |
| 3 | 2.0 | La Femme Nikita |
| 4 | 3.0 | House M.D |
| 5 | 4.0 | Breaking Bad |
Another example. skeyword Lowercases, replaces spaces and slashes, excises non-ascii characters and limits to 50 characters every row heading
(print-table
(get-sheet "test/data/simple.xlsx" "Shows"
{:hdr true
:fxn bb-excel.cli/skeyword}))
| :_r | :rank | :tv_show |
|-----+-------+-----------------|
| 2 | 1.0 | Sesame Street |
| 3 | 2.0 | La Femme Nikita |
| 4 | 3.0 | House M.D |
| 5 | 4.0 | Breaking Bad |
That's all well and good but how do we use the same cell addressing format used in excel? For instance if we want B2:C4 then we can leverage the get-range function instead. Now that expects a sheet so
(print-table
(get-range
(get-sheet "test/data/simple.xlsx" "Shows")
"A3:B4"))
| :_r | :A | :B |
|-----+-----+-----------------|
| 3 | 2.0 | La Femme Nikita |
| 4 | 3.0 | House M.D |
To create an Excel Spreadsheet call create-xlsx with a vector of maps. Each map represents a tab/sheet in Excel It requires a :name (String) and a :sheet (Vector of maps) There is an optional :cmap for renaming columns
(create-xlsx "sample.xlsx" [{:name "TestSheet"
:sheet [{:A "1" :B "One" :C "Baako"}
{:A "2" :B "Two" :C "Mienu"}
{:A "3" :B "Three" :C "Miensa"}]}])
To validate the data was saved accurately use the get-sheet and print-table to extract and print.
(print-table
(get-sheet "sample.xlsx" "TestSheet" {:hdr true})
| :A | :B | :C | :_r |
|----+-------+--------+-----|
| 1 | One | Baako | 2 |
| 2 | Two | Mienu | 3 |
| 3 | Three | Miensa | 4 |
Please note that Excel is an intricate and complex file format and this library performs the most basic writes possible. Only Strings, Numbers and Dates.
No styles, formulas, lambdas,images, embeddings, tables etc can be supported.
- Does not support older xls format
- Write is super basic and does not include styles, formulas, lambdas,images, embeddings, tables etc
Why create another excel library in clojure when you can use docjure or wrap the venerable Apache POI library. Answer is simple. Babashka, my scripting language of choice does not support Apache POI. This library currently reads xlsx files and returns a vector of hashmaps. Each hashmap representing a row in a selected sheet. This is also experimental support for write.
- If you are using clojure and need to read from and write to Excel files I highly recommend you take a look at Martin Jul's DOCJURE
- If you need to generate excel files take a look at Mathew Downey's EXCEL-CLJ
- If you need to generate multi-format reports try JasperReport