
Updated OperatorEnum to use any data type (not just Numbers) #85

Merged (52 commits) on Jul 7, 2024

Conversation

@gca30 (Collaborator) commented Jun 25, 2024

Changes:

  • Adds a ValueInterface module. If you want to use an OperatorEnum on a specific data type, you have to implement all the functions from that module.
  • Adds an on_type argument to @extend_operators, which restricts the type parameter of AbstractNodes that have the operators defined on them.
  • Adds an example with the Scalar-Vector-Matrix type as a test
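Putting the pieces above together, usage might look roughly like the following. This is a hedged sketch only: `MyValue` is a hypothetical user type, the `ValueInterface` method implementations are omitted, and the exact `on_type` calling convention may differ from what is shown.

```julia
using DynamicExpressions

# A hypothetical non-Number value type. To use it with an OperatorEnum,
# every function from the ValueInterface module must be implemented for
# it (implementations omitted in this sketch).
struct MyValue
    x::Float64
end

operators = OperatorEnum(; binary_operators=[+, *], unary_operators=[-])

# The new `on_type` argument restricts the generated operator overloads
# to AbstractNodes whose type parameter is MyValue:
@extend_operators(operators, on_type = MyValue)
```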

github-actions bot commented Jun 25, 2024

Benchmark Results

Benchmark | master | 1b576fe... (PR) | ratio (master / 1b576fe...)
eval/ComplexF32/evaluation 7.48 ± 0.44 ms 7.45 ± 0.43 ms 1
eval/ComplexF64/evaluation 9.7 ± 0.65 ms 9.7 ± 0.75 ms 0.999
eval/Float32/derivative 11.2 ± 2.3 ms 10.9 ± 2 ms 1.03
eval/Float32/derivative_turbo 11.2 ± 2.3 ms 10.9 ± 2 ms 1.02
eval/Float32/evaluation 2.76 ± 0.23 ms 2.77 ± 0.22 ms 0.997
eval/Float32/evaluation_bumper 0.541 ± 0.013 ms 0.531 ± 0.013 ms 1.02
eval/Float32/evaluation_turbo 0.719 ± 0.026 ms 0.717 ± 0.025 ms 1
eval/Float32/evaluation_turbo_bumper 0.543 ± 0.013 ms 0.53 ± 0.012 ms 1.02
eval/Float64/derivative 15.1 ± 0.74 ms 14.6 ± 0.73 ms 1.03
eval/Float64/derivative_turbo 14.9 ± 0.7 ms 14.6 ± 0.6 ms 1.02
eval/Float64/evaluation 2.92 ± 0.23 ms 2.9 ± 0.23 ms 1.01
eval/Float64/evaluation_bumper 1.22 ± 0.044 ms 1.21 ± 0.041 ms 1
eval/Float64/evaluation_turbo 1.22 ± 0.061 ms 1.19 ± 0.057 ms 1.02
eval/Float64/evaluation_turbo_bumper 1.22 ± 0.045 ms 1.21 ± 0.042 ms 1.01
utils/combine_operators/break_sharing 0.0394 ± 0.0015 ms 0.0392 ± 0.00069 ms 1.01
utils/convert/break_sharing 23.9 ± 1.2 μs 23.2 ± 1.1 μs 1.03
utils/convert/preserve_sharing 0.125 ± 0.0035 ms 0.126 ± 0.003 ms 0.992
utils/copy/break_sharing 24.3 ± 1.3 μs 23.4 ± 1 μs 1.04
utils/copy/preserve_sharing 0.125 ± 0.0035 ms 0.125 ± 0.0032 ms 1
utils/count_constant_nodes/break_sharing 9.36 ± 0.14 μs 8.65 ± 0.14 μs 1.08
utils/count_constant_nodes/preserve_sharing 0.111 ± 0.0029 ms 0.111 ± 0.0027 ms 0.994
utils/count_depth/break_sharing 10.5 ± 0.15 μs 12.6 ± 0.26 μs 0.828
utils/count_nodes/break_sharing 8.81 ± 0.12 μs 8.56 ± 0.11 μs 1.03
utils/count_nodes/preserve_sharing 0.112 ± 0.003 ms 0.113 ± 0.0026 ms 0.995
utils/get_set_constants!/break_sharing 0.0341 ± 0.0011 ms 0.0339 ± 0.0013 ms 1.01
utils/get_set_constants!/preserve_sharing 0.228 ± 0.0055 ms 0.224 ± 0.0052 ms 1.02
utils/get_set_constants_parametric 0.0481 ± 0.0021 ms 0.0481 ± 0.0016 ms 1
utils/has_constants/break_sharing 4.28 ± 0.068 μs 4.33 ± 0.054 μs 0.989
utils/has_operators/break_sharing 1.96 ± 0.022 μs 2.11 ± 0.032 μs 0.93
utils/hash/break_sharing 25.3 ± 0.53 μs 25.8 ± 0.48 μs 0.982
utils/hash/preserve_sharing 0.133 ± 0.003 ms 0.132 ± 0.0025 ms 1.01
utils/index_constant_nodes/break_sharing 23 ± 0.97 μs 22.5 ± 1.1 μs 1.02
utils/index_constant_nodes/preserve_sharing 0.126 ± 0.0034 ms 0.125 ± 0.0029 ms 1
utils/is_constant/break_sharing 4.2 ± 0.056 μs 3.65 ± 0.055 μs 1.15
utils/simplify_tree/break_sharing 0.168 ± 0.0016 ms 0.177 ± 0.0013 ms 0.95
utils/simplify_tree/preserve_sharing 0.287 ± 0.005 ms 0.285 ± 0.0038 ms 1
utils/string_tree/break_sharing 0.395 ± 0.013 ms 0.393 ± 0.012 ms 1
utils/string_tree/preserve_sharing 0.532 ± 0.018 ms 0.535 ± 0.015 ms 0.995
time_to_load 0.232 ± 0.0041 s 0.228 ± 0.0057 s 1.01

@MilesCranmer MilesCranmer self-requested a review June 25, 2024 23:31
@MilesCranmer (Member) left a comment

Looking pretty good, nice work! I left some comments below.

Also, to make the formatter happy you can install JuliaFormatter and then run this in the root of the repository:

julia --startup-file=no --threads=auto -O0 --compile=min -e 'using JuliaFormatter; format("."; verbose=true)'

and it should format all of the code so the formatting test passes. (The extra flags just make it start up faster; they aren't necessary.)

Review comments were left on src/Evaluate.jl, test/test_non_number_eval_tree_array.jl, src/Utils.jl, and src/DynamicExpressions.jl (now resolved).
@MilesCranmer (Member) commented Jun 26, 2024

By the way, I wonder if using a 0-dimensional array instead of a scalar could simplify some code? I'm not sure if it would or not (see bottom example for a caveat), but you could try it if interested.

You can make 0-dimensional arrays with:

julia> zeros(Float64)
0-dimensional Array{Float64, 0}:
0.0

julia> fill(0.0)
0-dimensional Array{Float64, 0}:
0.0

julia> x = Array{Float64}(undef); x .= 0.0; x
0-dimensional Array{Float64, 0}:
0.0

To index a zero-dimensional array, you use []:

julia> x = zeros(Float64)
0-dimensional Array{Float64, 0}:
0.0

julia> x[] = 1.0
1.0

julia> x
0-dimensional Array{Float64, 0}:
1.0

Perhaps this simplifies some things like splatting because size(x) is an empty tuple in the case of 0-dimensional arrays.

However there are some issues with zero-dimensional arrays... For example, op.(x) does not return a 0-dimensional array, but a scalar:

julia> x = zeros(Float64)
0-dimensional Array{Float64, 0}:
0.0

julia> x .+ x
0.0

julia> x + x
0-dimensional Array{Float64, 0}:
0.0

which I find a bit weird, and might make other parts of the code more complex and unintuitive.

So, completely up to you! I'm not sure it's worth it or not.

Also, I think storing data in a zero-dimensional array may incur extra garbage collection cycles whereas a scalar might just get inlined.

@MilesCranmer (Member)

It seems like 0-dimensional arrays aren't used much so it might run into some weird behavior... Maybe just better to stick with your existing approach of scalars.

@MilesCranmer (Member)

I think the line causing an error:

        @test_throws ErrorException ex(randn(1, 1))

can just be removed. This line just asserts that (up to this point) you couldn't use this type of behavior. But now you can! So that check can be turned off.

@MilesCranmer (Member)

Also, I wonder whether SVM is the right name for it, given the overlap with https://en.wikipedia.org/wiki/Support_vector_machine. Maybe just "Tensor" is a better name for that struct, or perhaps DynamicTensor (emphasising that the dimensionality is "dynamic": stored as a runtime value rather than fixed in the type).

Review comments were left on src/TypeInterface.jl and src/DynamicExpressions.jl (now resolved).
Comment on lines 13 to 29
struct Max2Tensor{T}
    dims::UInt8  # number of dimensions: 0 = scalar, 1 = vector, 2 = matrix
    scalar::T
    vector::Vector{T}
    matrix::Matrix{T}
    Max2Tensor{T}() where {T} = new(UInt8(0), zero(T), T[], Matrix{T}(undef, 0, 0))
    function Max2Tensor{T}(scalar::W) where {T,W<:Number}
        return new(UInt8(0), convert(T, scalar), T[], Matrix{T}(undef, 0, 0))
    end
    function Max2Tensor{T}(vector::Vector{W}) where {T,W<:Number}
        return new(UInt8(1), zero(T), Vector{T}(vector), Matrix{T}(undef, 0, 0))
    end
    function Max2Tensor{T}(matrix::Matrix{W}) where {T,W<:Number}
        return new(UInt8(2), zero(T), T[], Matrix{T}(matrix))
    end
end

Member review comment:

Not sure if you are still planning to go in this direction (from the other struct you showed me) but just an idea: you could write this as:

julia> struct MultiTensor{T,N,A<:Tuple{Vararg{<:Any,N}}}
           dims::UInt8
           data::A
       
           function MultiTensor{T,N}() where {T,N}
               @assert N >= 1
               data = ntuple(i -> i == 1 ? zero(T) : zeros(T, ntuple(_ -> 0, Val(i-1))), Val(N))
               return new{T,N,typeof(data)}(zero(UInt8), data)
           end
       end

For example, for N=3, this would expand to:

function MultiTensor{T,N}() where {T,N}
    data = (zero(T), zeros(T, 0), zeros(T, 0, 0))
    return new{T,N,typeof(data)}(zero(UInt8), data)
end

and (I think) be type stable too, because ntuple(..., Val(N)) will expand to exactly an N-tuple when N is known to the compiler.

Then you can access the data with val.data[1] for scalars, val.data[2] for vectors, and so on.

(But I'm assuming you are going the other way with the fixed array?)
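A quick usage sketch of the MultiTensor idea above (the struct from the review comment is repeated so the snippet is self-contained; output comments reflect the default zero-initialized state):

```julia
struct MultiTensor{T,N,A<:Tuple{Vararg{<:Any,N}}}
    dims::UInt8
    data::A

    function MultiTensor{T,N}() where {T,N}
        @assert N >= 1
        # Slot i holds an (i-1)-dimensional container: scalar, vector, matrix, ...
        data = ntuple(i -> i == 1 ? zero(T) : zeros(T, ntuple(_ -> 0, Val(i - 1))), Val(N))
        return new{T,N,typeof(data)}(zero(UInt8), data)
    end
end

mt = MultiTensor{Float64,3}()
mt.data[1]        # scalar slot: 0.0
mt.data[2]        # vector slot: empty Vector{Float64}
size(mt.data[3])  # matrix slot: (0, 0)
```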

Review comments were left on src/ParametricExpression.jl and src/NodeUtils.jl (now resolved).
@coveralls commented Jul 5, 2024

Pull Request Test Coverage Report for Build 9806989095

Details

  • 145 of 166 (87.35%) changed or added relevant lines in 7 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.6%) to 94.904%

Changes Missing Coverage (columns: File, Covered Lines, Changed/Added Lines, %):
src/NodeUtils.jl 25 26 96.15%
src/Strings.jl 4 6 66.67%
src/Evaluate.jl 26 29 89.66%
src/OperatorEnumConstruction.jl 60 66 90.91%
src/TypeInterface.jl 10 19 52.63%
Files with Coverage Reduction (columns: File, New Missed Lines, %):
src/OperatorEnumConstruction.jl 1 90.74%
Totals:
Change from base Build 9793838744: -0.6%
Covered Lines: 2272
Relevant Lines: 2394

💛 - Coveralls

@coveralls commented Jul 6, 2024

Pull Request Test Coverage Report for Build 9821555509

Details

  • 193 of 210 (91.9%) changed or added relevant lines in 12 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.5%) to 95.065%

Changes Missing Coverage (columns: File, Covered Lines, Changed/Added Lines, %):
src/NodeUtils.jl 28 29 96.55%
src/Expression.jl 10 12 83.33%
src/Strings.jl 4 6 66.67%
src/Evaluate.jl 26 29 89.66%
src/TypeInterface.jl 10 13 76.92%
src/OperatorEnumConstruction.jl 60 66 90.91%
Files with Coverage Reduction (columns: File, New Missed Lines, %):
src/OperatorEnumConstruction.jl 1 90.74%
Totals:
Change from base Build 9793838744: -0.5%
Covered Lines: 2273
Relevant Lines: 2391

💛 - Coveralls

@coveralls commented Jul 6, 2024

Pull Request Test Coverage Report for Build 9822135489

Details

  • 211 of 229 (92.14%) changed or added relevant lines in 12 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.5%) to 95.062%

Changes Missing Coverage (columns: File, Covered Lines, Changed/Added Lines, %):
src/NodeUtils.jl 28 29 96.55%
src/Expression.jl 10 12 83.33%
src/Strings.jl 4 6 66.67%
src/Evaluate.jl 26 29 89.66%
src/ValueInterface.jl 28 32 87.5%
src/OperatorEnumConstruction.jl 60 66 90.91%
Files with Coverage Reduction (columns: File, New Missed Lines, %):
src/OperatorEnumConstruction.jl 1 90.74%
Totals:
Change from base Build 9793838744: -0.5%
Covered Lines: 2291
Relevant Lines: 2410

💛 - Coveralls

@coveralls commented Jul 6, 2024

Pull Request Test Coverage Report for Build 9822262239

Details

  • 226 of 243 (93.0%) changed or added relevant lines in 15 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.4%) to 95.137%

Changes Missing Coverage (columns: File, Covered Lines, Changed/Added Lines, %):
src/NodeUtils.jl 28 29 96.55%
src/Expression.jl 10 12 83.33%
src/Strings.jl 4 6 66.67%
src/Evaluate.jl 26 29 89.66%
src/ValueInterface.jl 29 32 90.63%
src/OperatorEnumConstruction.jl 60 66 90.91%
Files with Coverage Reduction (columns: File, New Missed Lines, %):
src/OperatorEnumConstruction.jl 1 90.74%
Totals:
Change from base Build 9793838744: -0.4%
Covered Lines: 2289
Relevant Lines: 2406

💛 - Coveralls

@coveralls commented Jul 6, 2024

Pull Request Test Coverage Report for Build 9822341017

Details

  • 226 of 243 (93.0%) changed or added relevant lines in 15 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.4%) to 95.137%

Changes Missing Coverage (columns: File, Covered Lines, Changed/Added Lines, %):
src/NodeUtils.jl 28 29 96.55%
src/Expression.jl 10 12 83.33%
src/Strings.jl 4 6 66.67%
src/Evaluate.jl 26 29 89.66%
src/ValueInterface.jl 29 32 90.63%
src/OperatorEnumConstruction.jl 60 66 90.91%
Files with Coverage Reduction (columns: File, New Missed Lines, %):
src/OperatorEnumConstruction.jl 1 90.74%
Totals:
Change from base Build 9793838744: -0.4%
Covered Lines: 2289
Relevant Lines: 2406

💛 - Coveralls

@coveralls commented Jul 7, 2024

Pull Request Test Coverage Report for Build 9828697768

Details

  • 224 of 239 (93.72%) changed or added relevant lines in 15 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.1%) to 95.371%

Changes Missing Coverage (columns: File, Covered Lines, Changed/Added Lines, %):
src/NodeUtils.jl 28 29 96.55%
src/Expression.jl 10 12 83.33%
src/Strings.jl 4 6 66.67%
src/Evaluate.jl 26 29 89.66%
src/ValueInterface.jl 29 32 90.63%
src/OperatorEnumConstruction.jl 58 62 93.55%
Totals:
Change from base Build 9793838744: -0.1%
Covered Lines: 2287
Relevant Lines: 2398

💛 - Coveralls

@MilesCranmer (Member)

There are some issues with defining an OperatorEnum on a BroadcastedFunction: it seems you can't re-define an OperatorEnum on the non-broadcasted version. But we can fix that later; for now I'll just print a warning that this feature is unstable.

@MilesCranmer MilesCranmer self-requested a review July 7, 2024 17:11
@MilesCranmer MilesCranmer merged commit db556dc into SymbolicML:master Jul 7, 2024
10 checks passed
@coveralls commented Jul 7, 2024

Pull Request Test Coverage Report for Build 9828915053

Details

  • 229 of 243 (94.24%) changed or added relevant lines in 15 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.1%) to 95.417%

Changes Missing Coverage (columns: File, Covered Lines, Changed/Added Lines, %):
src/NodeUtils.jl 28 29 96.55%
src/Expression.jl 10 12 83.33%
src/Strings.jl 4 6 66.67%
src/Evaluate.jl 26 29 89.66%
src/OperatorEnumConstruction.jl 63 66 95.45%
src/ValueInterface.jl 29 32 90.63%
Totals:
Change from base Build 9793838744: -0.1%
Covered Lines: 2290
Relevant Lines: 2400

💛 - Coveralls
