Skip to content

Commit

Permalink
Update to DimensionalData.jl (#84)
Browse files Browse the repository at this point in the history
* just port dd to 0.20

* add `gnv` function

* LookUp -> LookupArrays

* climarray can be initialized and printed

* fix missing format function

* fix all statistical tests

* maybe fix all tests...?

* correct rebuild with format

* even more fixes hopefully

* replace boardcast typo

* change all between usages to ..

* correct lookup array specification for intervalset

* use `gnv` in `Coord` dimensions

* split tests into files

* add some basic tests for Coordinates

* better naming for space types

Closes #79

* fix remaining small tests
  • Loading branch information
Datseris committed Jan 5, 2022
1 parent d7ce9dc commit 1d88bfb
Show file tree
Hide file tree
Showing 21 changed files with 558 additions and 465 deletions.
2 changes: 1 addition & 1 deletion Project.toml
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"

[compat]
DimensionalData = "0.18"
DimensionalData = "0.20"
Interpolations = "0.13.2"
NCDatasets = "0.10, 0.11"
Requires = "1"
Expand Down
2 changes: 1 addition & 1 deletion docs/src/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ Make sure your installed version coincides with the one in this docs (see bottom

## `ClimArray`: the core data structure
This project treats "climate data" as an instance of [`ClimArray`](@ref).
At the moment `ClimArray` is a subtype of `DimensionalArray` from DimensionalData.jl.
At the moment `ClimArray` is a subtype of `DimArray` from DimensionalData.jl.
A brief introduction to DimensionalData.jl is copied here from its docs, because basic handling of a `ClimArray` comes from DimensionalData.jl.
DimensionalData.jl allows to dimensionally-index data by their values.

Expand Down
2 changes: 1 addition & 1 deletion docs/src/plotting.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,4 +5,4 @@

## Python based plotting with `cartopy`
For now, you can use PyCall.jl, matplotlib, and the Python library cartopy.
In the file [`ClimateBase/plotting/python.jl`](https://github.com/JuliaClimate/ClimateBase.jl/tree/master/plotting/python.jl) we provide two functions that plot maps of `ClimArray` in arbitrary projections: `earthsurface` for `LonLatGrid` and `earthscatter` for `UnstructuredGrid`. You can incorporate these in your source code as a temporary solution.
In the file [`ClimateBase/plotting/python.jl`](https://github.com/JuliaClimate/ClimateBase.jl/tree/master/plotting/python.jl) we provide two functions that plot maps of `ClimArray` in arbitrary projections: `earthsurface` for `OrthogonalSpace` and `earthscatter` for `CoordinateSpace`. You can incorporate these in your source code as a temporary solution.
10 changes: 5 additions & 5 deletions docs/src/statistics.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ sametimespan

## Spatial

All functions in this section work for both types of space, see [Types of spatial coordinates](@ref).
All functions in this section work for both types of space, see [Types of spatial information](@ref).
```@docs
zonalmean
latmean
Expand All @@ -32,11 +32,11 @@ lonlatfirst
longitude_circshift
```

### Types of spatial coordinates
At the moment the following type of spatial coordinates are supported:
### Types of spatial information
Spatial information (excluding height/pressure dimensions) in ClimateBase.jl exists in one of two forms:
```@docs
LonLatGrid
UnstructuredGrid
OrthogonalSpace
CoordinateSpace
```

ClimateBase.jl works with either type of spatial coordinate system.
Expand Down
16 changes: 8 additions & 8 deletions src/core/aggregation.jl
Original file line number Diff line number Diff line change
Expand Up @@ -153,7 +153,7 @@ end
#########################################################################
#=
"""
dimwise(f, A, B)
broadcast_dims(f, A, B)
Apply bivariate function `f` (e.g. multiplication `*`) across the (first) matching dimension
of `A` and `B`, where `B` has only one dimension.
For normal Julia arrays match is done on the first matching dimension size.
Expand All @@ -168,18 +168,18 @@ for every `i` across the matching dimension.
Useful when wanting to weight a field `A` based on its latitude or time.
"""
function dimwise(f, A::AbDimArray, B::AbDimArray)
@assert length(size(B)) == 1 "3rd argument of dimwise must be a vector"
function broadcast_dims(f, A::AbDimArray, B::AbDimArray)
@assert length(size(B)) == 1 "3rd argument of broadcast_dims must be a vector"
m = find_matching_dim(A, B)
# Array(data) necessary because reshapes somewhow fail to play well with the code...
C = dimwise(f, Array(data(A)), Array(data(B)), m)
C = broadcast_dims(f, Array(data(A)), Array(data(B)), m)
return DimensionalData.basetypeof(A)(C, dims(A))
end
dimwise(f, A::AbstractArray, B, m::Int = find_matching_dim(A, B)) =
dimwise!(copy(A), f, A, B, m)
broadcast_dims(f, A::AbstractArray, B, m::Int = find_matching_dim(A, B)) =
broadcast_dims!(copy(A), f, A, B, m)
function dimwise!(C, f, A::AbstractArray{T, N}, B::AbstractVector, m::Int) where {T, N}
function broadcast_dims!(C, f, A::AbstractArray{T, N}, B::AbstractVector, m::Int) where {T, N}
shape = ntuple(i -> (i==m) ? length(B) : 1, N)
b = reshape(B, shape)
C .= f.(A, b)
Expand All @@ -200,4 +200,4 @@ function find_matching_dim(A::AbstractArray, B::AbstractVector)::Int
end
=#
export dimwise
export broadcast_dims
86 changes: 43 additions & 43 deletions src/core/coredefs.jl
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,8 @@
# Basic imports and dimension definitions
##########################################################################################
using DimensionalData
using DimensionalData: @dim, hasdim, Dimension, IndependentDim
using DimensionalData: basetypeof
using DimensionalData.Dimensions, DimensionalData.LookupArrays
using DimensionalData: basetypeof, broadcast_dims
using Dates

Time = DimensionalData.Ti
Expand All @@ -19,18 +19,18 @@ dimindex(A, dim) = DimensionalData.dimnum(A, dim)

"""
gnv(object) → x
Short for "get numeric value", this function will return the pure numeric value
of the given object. Convenience function for quickly gettin the numeric data of
Short for "get numeric value", this function will return the pure numeric value(s)
of the given object. Convenience function for quickly getting the numeric data of
dimensional arrays or dimensions.
"""
gnv(x) = x
gnv(x::AbDimArray) = DimensionalData.parent(x)
gnv(x::Dimension) = x.val
gnv(x::Union{AbDimArray, LookupArray}) = parent(x)
gnv(x::Dimension) = parent(parent(x))

export At, Between, Near # Selectors from DimensionalArrays.jl
export At, (..), Between, Near # Selectors from DimensionalArrays.jl
export hasdim, dims, dimindex
export Time, Lon, Lat, dims, Coord, Hei, Pre, Ti
export UnstructuredGrid, LonLatGrid, spacestructure
export CoordinateSpace, OrthogonalSpace, spacestructure
export DimensionalData # for accessing its functions
export gnv

Expand Down Expand Up @@ -69,55 +69,53 @@ const COMMONNAMES = Dict(
##########################################################################################

# the following traits for the the way space is configured. currently the options are
# UnstructuredGrid, which is for equal area "grids" (or better points/coordinates)
# while the LonLatGrid is for standard Longitude x Latitude dimensions.
# Dispatch on `UnstructuredGrid` or other types is type-stable
# CoordinateSpace, which is for points/coordinates
# while the OrthogonalSpace is for standard Longitude x Latitude dimensions.
# Dispatch on `CoordinateSpace` or other types is type-stable
# because it comes from the underlying dimension

abstract type SpaceType end

"""
Space coordinates are represented by two orthogonal dimensions `Lon, Lat`,
Space information is represented by two orthogonal dimensions `Lon, Lat`,
one being longitude and the other being latitude.
"""
struct LonLatGrid <: SpaceType end
struct OrthogonalSpace <: SpaceType end

"""
Space coordinates are represented by a single dimension `Coord`, whose
elements are coordinate locations, i.e. 2-element `SVector(longitude, latitude)`.
Each coordinate represents an **equal area polygon** corresponding to the point in space.
Space information is represented by a single dimension `Coord`, whose
elements are coordinates, i.e. 2-element `SVector(longitude, latitude)`.
Each coordinate represents the center of an arbitrary polygon in space.
The actual limits of each polygon are not included in the dimension for performance reasons.
This dimension allows indexing according to the underlying `Lon, Lat` representation,
e.g. you can do
In statistical functions such as [`spaceagg`](@ref), it is assumed that entry of the
coordinates covers **an equal amount of area**. If this is not the case, you can simply
provide an additional weights vector which would correspond to the area covered.
This dimension also allows indexing by latitude ranges, e.g. you can do
```julia
A # some `ClimArray` with unstructured grid type.
A[Coord(Lon(Between(0, 30)), Lat(Between(-30, 30)))]
A # some `ClimArray` with a `Coord` dimension
A[Coord(Lat(-30..30)))]
```
To use functions such as [`zonalmean`](@ref) or [`hemispheric_means`](@ref) with this grid,
you must first sort the `ClimArray` so that the latitudes
of its coordinates are sorted in ascending order. I.e.
Most functions of ClimateBase.jl implicitly assume that the coordinates are
sorted by latidude. You can achieve this with the following code:
```julia
A # some `ClimArray` with unstructured grid type.
coords = dims(A, Coord).val
si = sortperm(coords, by = reverse)
A # some `ClimArray` with a `Coord` dimension
coords = gnv(dims(A, Coord))
si = sortperm(coords; by = reverse)
A = A[Coord(si)]
```
**This is done automatically by [`ncread`](@ref).**
!!! warn
`UnstructuredGrid` functionality is currently in an **experimental phase**!
Notice that non-equal area unstructured grids are not supported yet.
"""
struct UnstructuredGrid <: SpaceType end
struct CoordinateSpace <: SpaceType end

spacestructure(a::AbDimArray) = spacestructure(dims(a))
function spacestructure(dims)
if hasdim(dims, Coord)
UnstructuredGrid()
CoordinateSpace()
elseif hasdim(dims, Lon) || hasdim(dims, Lat)
LonLatGrid()
OrthogonalSpace()
else
error("Array does not have any spatial dimensions: `Lon`, `Lat`, or `Coord`.")
end
Expand All @@ -126,17 +124,18 @@ end
##########################################################################################
# ClimArray definition and DimensionalData.jl extensions
##########################################################################################
# TODO: remove `refdims` field
export ClimArray
struct ClimArray{T,N,D<:Tuple,R<:Tuple,A<:AbstractArray{T,N},Me} <: AbstractDimensionalArray{T,N,D,A}
struct ClimArray{T,N,D<:Tuple,R<:Tuple,A<:AbstractArray{T,N},Me} <: AbDimArray{T,N,D,A}
data::A
dims::D
refdims::R
name::Symbol
attrib::Me
end
ClimArray(A::DimensionalArray) = ClimArray(A.data, A.dims, A.refdims, A.name, A.metadata)
ClimArray(A::AbDimArray) = ClimArray(A.data, A.dims, A.refdims, A.name, A.metadata)
ClimArray(A::ClimArray, dims::Tuple = A.dims; name = A.name, attrib = A.attrib) =
ClimArray(A.data, DimensionalData.formatdims(A.data, dims), A.refdims, Symbol(name), attrib)
ClimArray(A.data, format(dims, A.data), A.refdims, Symbol(name), attrib)

"""
ClimArray(A::Array, dims::Tuple; name = "", attrib = nothing)
Expand All @@ -146,7 +145,7 @@ information, a name and an `attrib` field (typically a dictionary) that holds ge
attributes.
You can think of `ClimArray` as a in-memory representation of a CFVariable.
At the moment, a `ClimArray` is using `DimensionalArray` from DimensionalData.jl, and
At the moment, a `ClimArray` is using `DimArray` from DimensionalData.jl, and
all basic handling of `ClimArray` is offered by `DimensionalData` (see below).
`ClimArray` is created by passing in standard array data `A` and a
Expand All @@ -167,23 +166,24 @@ A = ClimArray(data, dimensions)
```
"""
ClimArray(A::AbstractArray, dims::Tuple; refdims=(), name="", attrib=nothing) =
ClimArray(A, DimensionalData.formatdims(A, dims), refdims, Symbol(name), attrib)
ClimArray(A, format(dims, A), refdims, Symbol(name), attrib)
ClimArray(A::AbstractArray, dims::Tuple, name; refdims=(), attrib=nothing) =
ClimArray(A, DimensionalData.formatdims(A, dims), refdims, Symbol(name), attrib)
ClimArray(A, format(dims, A), refdims, Symbol(name), attrib)

Base.parent(A::ClimArray) = A.data
Base.@propagate_inbounds Base.setindex!(A::ClimArray, x, I::Vararg{DimensionalData.StandardIndices}) =
setindex!(A.data, x, I...)

DimensionalData.metadata(A::ClimArray) = A.attrib
DimensionalData.rebuild(A::ClimArray, data::Any, dims::Tuple=dims(A), refdims=DimensionalData.refdims(A),
name="", attrib=nothing) = ClimArray(data, dims, refdims, Symbol(name), attrib)
DimensionalData.basetypeof(::ClimArray) = ClimArray

DimensionalData.rebuild(
::ClimArray, data, dims, refdims, name, metadata
) = ClimArray(data, format(dims, data), refdims, name, metadata)
DimensionalData.rebuild(
A::ClimArray;
data=data(A), dims=dims(A), refdims=refdims(A), name=name(A), metadata=metadata(A)
) = ClimArray(data, dims, refdims, name, metadata)
data=gnv(A), dims=dims(A), refdims=refdims(A), name=name(A), metadata=metadata(A)
) = ClimArray(data, format(dims, data), refdims, name, metadata)


# The following basic methods allow indexing with tuples, (Time(5), Lon(3))
Expand Down
9 changes: 3 additions & 6 deletions src/core/interpolation.jl
Original file line number Diff line number Diff line change
Expand Up @@ -86,24 +86,21 @@ function interpolate_height2pressure(A::ClimArray,pressure_levels::Vector; extra
error("Height is not an ascending coordinate.")
end

pressure = height2pressure.(dims(A,Hei()).val)
pressure = height2pressure.(gnv(dims(A,Hei())))

# construct output Array:
pre = Pre(pressure_levels; metadata = Dict())
dims_no_height = otherdims(A, Hei())
out_dims = (dims_no_height...,pre)
out_dims = (dims_no_height..., pre)
dimension_lengths = length.(out_dims)

int_array = ClimArray(zeros(eltype(A), Tuple(dimension_lengths)), out_dims ; name = A.name, attrib = A.attrib)
int_array = ClimArray(zeros(eltype(A), Tuple(dimension_lengths)), out_dims; name = A.name, attrib = A.attrib)

for i in otheridxs(A, Hei())

itp = LinearInterpolation(pressure,A[i],extrapolation_bc=extrapolation_bc)
int_array[i] = itp(pressure_levels)

end
return int_array

end


Expand Down
26 changes: 13 additions & 13 deletions src/core/nc_io.jl
Original file line number Diff line number Diff line change
Expand Up @@ -113,8 +113,8 @@ NCDatasets.jl and then convert to some kind of dimensional container.
## Keywords
* `name` optionally rename loaded array.
* `grid = nothing` optionally specify whether the underlying grid is `grid = LonLatGrid()`
or `grid = UnstructuredGrid()`, see [Types of spatial coordinates](@ref).
* `grid = nothing` optionally specify whether the underlying grid is `grid = OrthogonalSpace()`
or `grid = CoordinateSpace()`, see [Types of spatial information](@ref).
If `nothing`, we try to deduce automatically based on
the names of dimensions and other keys of the `NCDataset`.
* `lon, lat`. These two keywords are useful in unstructured grid data where the grid
Expand All @@ -126,7 +126,7 @@ NCDatasets.jl and then convert to some kind of dimensional container.
lon = Array(ds["clon"]);
lat = Array(ds["clat"]);
```
If `lon, lat` are given, `grid` is automatically assumed `UnstructuredGrid()`.
If `lon, lat` are given, `grid` is automatically assumed `CoordinateSpace()`.
"""
function ncread(path::Union{String, Vector{String}}, args...; kwargs...)
NCDataset(path) do ds
Expand All @@ -147,14 +147,14 @@ function ncread(ds::NCDatasets.AbstractDataset, var, selection = nothing;
name = var2name(var), grid = nothing, lon = nothing, lat = nothing,
)
if lon isa Vector && lat isa Vector
gridtype = UnstructuredGrid()
gridtype = CoordinateSpace()
elseif isnothing(grid)
gridtype = autodetect_grid(ds)
else
gridtype = grid
end

if gridtype == UnstructuredGrid()
if gridtype == CoordinateSpace()
return ncread_unstructured(ds, var, name, lon, lat, selection)
else
return ncread_lonlat(ds, var, name, selection)
Expand All @@ -163,19 +163,19 @@ end

function autodetect_grid(ds)
if haskey(ds, "reduced_points") || haskey(ds.dim, "ncells") || haskey(ds, "clon")
return UnstructuredGrid()
return CoordinateSpace()
elseif haskey(ds, "lat") && length(size(ds["lat"])) > 1
return UnstructuredGrid()
return CoordinateSpace()
else
return LonLatGrid()
return OrthogonalSpace()
end
end

var2name(var) = string(var)
var2name(var::Tuple) = join(var, "_")

#########################################################################
# Reading: LonLatGrid and main reading functionality
# Reading: OrthogonalSpace and main reading functionality
#########################################################################
# Notice that this function properly loads even without any spatial coordinate
function ncread_lonlat(ds::NCDatasets.AbstractDataset, var, name, selection)
Expand Down Expand Up @@ -291,7 +291,7 @@ vector2range(r::AbstractRange) = r


#########################################################################
# Reading: UnstructuredGrid
# Reading: CoordinateSpace
#########################################################################
export SVector

Expand Down Expand Up @@ -342,7 +342,7 @@ function ncread_unstructured(

# Make coordinate dimension
lonlat = lonlat[sel[i]]
si = sortperm(lonlat, by = reverse)
si = sortperm(lonlat; by = reverse)
coords = Coord(lonlat, (Lon, Lat))
insert!(actualdims, i, coords)

Expand Down Expand Up @@ -474,7 +474,7 @@ function ncwrite(file::String, Xs; globalattr = Dict())
# into longitude and latitude.
if any(X -> hasdim(X, Coord), Xs)
error("""
Outputing `UnstructuredGrid` coordinates to .nc files is not yet supported,
Outputing `CoordinateSpace` coordinates to .nc files is not yet supported,
but it is an easy fix, see source of `ncwrite`.
""")
end
Expand Down Expand Up @@ -511,7 +511,7 @@ function add_dims_to_ncfile!(ds::NCDatasets.AbstractDataset, dimensions::Tuple)
eltype(v) == Date && (v = DateTime.(v))
l = length(v)
NCDatasets.defDim(ds, d, l) # add dimension entry
attrib = dimensions[i].metadata
attrib = DimensionalData.metadata(dimensions[i])
if (isnothing(attrib) || attrib == DimensionalData.NoMetadata()) && haskey(DEFAULT_ATTRIBS, d)
@warn "Dimension $d has no attributes, adding default attributes (mandatory)."
attrib = DEFAULT_ATTRIBS[d]
Expand Down
Loading

0 comments on commit 1d88bfb

Please sign in to comment.