I’ve been neglecting my GSoC project this week. This is what’s on the burner though:

- Write up a blog post on my implementation of Matrix Expressions, covering what they can and can’t do. I’d like to generate discussion on this topic.
- Test my code against Tom’s integration code. This has been happening over the last 24 hours actually. It’s cool to see lots of new things work and work well – I feel like I’m driving a sports car. I think that this cross-branch testing has been helpful for locating bugs in both of our codebases.
- After I check what will and won’t work with Tom’s code, I need to fill out tests and polish documentation for my main Discrete and Continuous RV branch. It’d be nice to have it presentable to the community for review.

## About mrocklin

PhD student studying Computational Mathematics at the University of Chicago.

I think it’s great that you and Tom are helping each other. What Tom really needs for his project are some integrals that people actually use (as opposed to some random integrals from some huge book of integration tables, or one that he just made up), and it’s good that you can provide those for him.

Indeed, the integrals you provided were great and pointed out some important bugs.

I have been going through Wikipedia’s list of continuous probability distributions and added code that tests computing the first few (usually two) non-central moments for a lot of distributions (I even found one containing a Bessel function ^^):

https://github.com/ness01/sympy/blob/gsoc-3/sympy/integrals/tests/test_meijerint.py#L283

You will see that there is a TODO note about some distributions I did not add yet … I just got bored at some stage. It is also pleasing to see that of the many distributions I looked at, about 90% of those that have an analytically expressible PDF are actually doable using the G-function method.
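For readers who have not seen such a test, here is a minimal sketch of what computing the first few non-central moments looks like. It uses the exponential distribution as an example and SymPy's general `integrate`; the helper name `raw_moment` is hypothetical and is not taken from the linked test file, and `integrate` may or may not pick the G-function method internally for this integrand.

```python
from sympy import symbols, exp, integrate, oo, simplify

# Positivity assumptions let SymPy resolve convergence conditions
# instead of returning a Piecewise result.
x = symbols('x', positive=True)
lam = symbols('lambda', positive=True)

# PDF of the exponential distribution on [0, oo).
pdf = lam * exp(-lam * x)

def raw_moment(density, var, n):
    """Hypothetical helper: n-th non-central moment of a density on [0, oo)."""
    return integrate(var**n * density, (var, 0, oo))

total = raw_moment(pdf, x, 0)  # normalization check
m1 = raw_moment(pdf, x, 1)     # first moment (the mean)
m2 = raw_moment(pdf, x, 2)     # second non-central moment

# Compare against the known closed forms 1, 1/lambda and 2/lambda**2.
assert simplify(total - 1) == 0
assert simplify(m1 - 1/lam) == 0
assert simplify(m2 - 2/lam**2) == 0
```

Comparing via `simplify(result - expected) == 0` rather than `==` keeps the check from depending on the exact form in which the integrator returns its answer.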

Of course things get more interesting when computing with multivariate distributions and characteristics that are not polynomial functions of the variable, but I suppose you will add relevant tests to your branch.

You bring up a point about testing. Integral tests should be in test_integrals.py (or somewhere similar), and statistics tests should be in test_statistics.py. The statistics tests should only be providing coverage for the statistics code. This means that you don’t need to have a test for every kind of function for each probability distribution if all it does is call a different integral; those tests belong in the integration code.
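One way to keep the two layers apart, sketched here under assumptions, is for a statistics-level test to check only that the right integral gets built and to leave its evaluation to the integration tests. The helper `expectation_integral` is hypothetical and does not come from either branch.

```python
from sympy import Integral, symbols, exp, oo

x = symbols('x', positive=True)

def expectation_integral(density, var):
    """Hypothetical statistics-layer helper: build, but do not evaluate, E[X]
    for a density supported on [0, oo)."""
    return Integral(var * density, (var, 0, oo))

def test_expectation_builds_expected_integral():
    # Statistics test: only checks that the correct integral is constructed.
    # Whether that integral can be evaluated is the integrator's concern.
    pdf = exp(-x)
    expected = Integral(x * exp(-x), (x, 0, oo))
    assert expectation_integral(pdf, x) == expected

test_expectation_builds_expected_integral()
```

With this split, a regression in the integrator shows up in the integration tests, while the statistics test keeps passing as long as the statistics code itself is unchanged.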

Similarly, in your own code, you should have a combination of so-called “black box” and “white box” tests. A black box test knows nothing of the implementation. It just calls the user interface (or the interface of whatever level it is testing), and tests the result. A “white box” test does know about the implementation. This means having tests that specifically target the internal, implementation-specific functions. If your implementation changes, the white box tests will have to be updated, but the black box tests should not (other than if an answer comes out a little different as a result). Personally, I think the vast majority of your tests should be white box. You should only test the interface itself for the high-level functions. You’ll see that this is what Mateusz does in the polys, and it’s what I have done in my Risch code.
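To make the distinction concrete, here is a small sketch with one test of each kind. The functions `moment` and `_moment_integrand` are made up for illustration and are not part of SymPy or of either branch.

```python
from sympy import symbols, exp, integrate, oo, simplify

x = symbols('x', positive=True)

def _moment_integrand(density, var, n):
    """Hypothetical internal helper: the integrand of the n-th raw moment."""
    return var**n * density

def moment(density, var, n):
    """Hypothetical public function: n-th raw moment of a density on [0, oo)."""
    return integrate(_moment_integrand(density, var, n), (var, 0, oo))

def test_moment_black_box():
    # Black box: calls only the public interface and checks the answer.
    # The mean of the unit-rate exponential distribution is 1.
    assert simplify(moment(exp(-x), x, 1) - 1) == 0

def test_moment_integrand_white_box():
    # White box: knows that moment() is built on _moment_integrand and
    # checks that internal step directly.
    assert _moment_integrand(exp(-x), x, 2) == x**2 * exp(-x)

test_moment_black_box()
test_moment_integrand_white_box()
```

If `moment` were later rewritten without `_moment_integrand`, the white box test would need updating while the black box test would stay as it is.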

I’m not saying either of you are doing or not doing this already; I just want you to be aware of the concept.