I spent the week at the SAMSI workshop for Uncertainty Quantification. Random variables were everywhere: assimilating data, representing uncertainty in climate models, and being represented by samples, polynomials, gradients, etc. For me there has been a lot of thinking but no new code.
I would like to be able to do two things:
- Compute distributions of expressions like f(X), given that some other condition, such as g(X+Y)>z, is true.
- Do the above (and all other functionality) on multivariate random variables.
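For finite random variables, the first goal amounts to filtering the sample space by the condition and renormalizing. Below is a minimal plain-Python sketch of that idea; it is not the SymPy implementation. The two dice, the choice f(X) = X**2, the threshold z = 8, and the helper name `conditional_distribution` are all assumptions made up for illustration.

```python
from fractions import Fraction
from itertools import product


def conditional_distribution(f, condition, outcomes):
    """Distribution of f(x, y) over equally likely outcomes, given condition(x, y)."""
    # Keep only the outcomes where the condition holds.
    kept = [(x, y) for x, y in outcomes if condition(x, y)]
    if not kept:
        raise ValueError("condition has probability zero")
    # Each surviving outcome is equally likely after renormalizing.
    weight = Fraction(1, len(kept))
    dist = {}
    for x, y in kept:
        value = f(x, y)
        dist[value] = dist.get(value, 0) + weight
    return dist


# Two fair six-sided dice X and Y (hypothetical example, not from the post).
faces = range(1, 7)
outcomes = list(product(faces, faces))

# Distribution of f(X) = X**2 given that X + Y > 8.
dist = conditional_distribution(
    f=lambda x, y: x**2,
    condition=lambda x, y: x + y > 8,
    outcomes=outcomes,
)

for value, prob in sorted(dist.items()):
    print(value, prob)
```

The same filter does not carry over to continuous variables under a condition like X+Y==z, because the kept set would have probability zero and there would be nothing to renormalize by.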
Issue (1) is done in the discrete case but fails on any complex continuous expression. It’s challenging to represent events like (X+Y==z). The PDF includes delta functions and isn’t easily representable using the current SymPy-Set backend.
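To see where the delta functions come from, here is a sketch that assumes X and Y have a joint density, written f_{X,Y} (notation introduced here, not used elsewhere in the post). Conditioning on the event X+Y==z restricts that density to a line, which can be written with a delta function and then integrated out:

```latex
p(x \mid X + Y = z)
  = \frac{\int f_{X,Y}(x, y)\, \delta(x + y - z)\, dy}
         {\iint f_{X,Y}(x', y)\, \delta(x' + y - z)\, dy\, dx'}
  = \frac{f_{X,Y}(x,\, z - x)}
         {\int f_{X,Y}(x',\, z - x')\, dx'}
```

The last form has no delta function left, but getting there means manipulating a density supported on a measure-zero set, which is the part a set-based description of events does not express naturally.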
Issue (2) is challenging because the symbols in the equations will have multi-dimensional domains. This isn’t so much complex as it is annoying to build onto the current system.
In general, the code isn’t set up to handle joint distributions over several variables elegantly. Both of these issues are things I’d like in my research, and neither is simple to implement on top of the current system. I’m considering rewriting the backend to define events using conditionals rather than sets. This is daunting (lots of work) but exciting (potentially lots of new results).