1 Introduction

This report gives the results of running the computer algebra independent integration problems (Lite version) obtained from https://rulebasedintegration.org.

The following versions of Mathematica were tested.

Version 14.1 (August 1, 2024) (on Windows 10, 64-bit)
Version 14 (January 9, 2024) (on Windows 10, 64-bit)
Version 13.3 (July 2023) (on Windows 10, 64-bit)
Version 12.3.1 (on Windows 10, 64-bit)
Version 12.1 (on Windows 10, 64-bit)
Version 12 (on Windows 10, 64-bit)
Version 11.3 (on Windows 7, 64-bit)
Version 11.2 (on Windows 7, 64-bit)
Version 10.3 (on Windows 7, 64-bit)
Version 9 (on Windows 7, 64-bit)
Version 8 (on Windows 7, 64-bit)
Version 7 (on Windows 7, 64-bit)
Version 6.0.1 (on Windows 7, 64-bit)
Version 5.2 (on Windows 7, 64-bit)

The command AbsoluteTiming[] was used in Mathematica to measure the elapsed (wall-clock) time of each integration; this is the time reported as CPU time in this report.

A time limit of 3 minutes is used for all integrals in each CAS. If the integration does not complete within this time limit then the integral is considered to have failed.
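The time-limit rule above can be illustrated with a small sketch. It is written in Python purely for illustration and is not the harness used for this report; the function name `run_with_time_limit` and the status strings are hypothetical. It times a task with a wall-clock timer and reports a timeout when the task does not finish within the limit.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

TIME_LIMIT_SECONDS = 180  # the 3-minute limit stated in the report


def run_with_time_limit(task, limit=TIME_LIMIT_SECONDS):
    """Run task() and return (status, elapsed_seconds, result).

    status is "done" if task() returned within the limit, "timeout" if it
    did not, and "error" if it raised an exception. The names and status
    strings are assumptions made for this sketch.
    """
    executor = ThreadPoolExecutor(max_workers=1)
    start = time.perf_counter()  # wall-clock timer
    future = executor.submit(task)
    try:
        result = future.result(timeout=limit)
        status = "done"
    except FutureTimeout:
        result, status = None, "timeout"
    except Exception:
        result, status = None, "error"
    elapsed = time.perf_counter() - start
    # Do not wait for a task that is still running. A thread cannot be
    # killed, so a real harness would more likely use a separate process.
    executor.shutdown(wait=False)
    return status, elapsed, result
```

Under the rule stated above, a "timeout" or "error" status would count as a failed integral.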

The table below gives an additional breakdown of the grading of the quality of the antiderivatives generated by each CAS. The grading is given using the letters A, B, C and F, with A being the best quality. The grading is accomplished by comparing the generated antiderivative with the optimal antiderivative included in the test suite. The following table describes the meaning of these grades.


grade	description


A	Integral was solved and antiderivative is optimal in quality and leaf size.

B	Integral was solved and antiderivative is optimal in quality but leaf size is larger than twice the optimal antiderivative's leaf size.

C	Integral was solved and antiderivative is non-optimal in quality. This can be due to one or more of the following reasons: the antiderivative contains a hypergeometric function and the optimal antiderivative does not; the antiderivative contains a special function and the optimal antiderivative does not; the antiderivative contains the imaginary unit and the optimal antiderivative does not.

F	Integral was not solved. Either the integral was returned unevaluated within the time limit, or it timed out, or the CAS hung or crashed, or an exception was raised.
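The grade table can be read as a decision procedure. The following Python sketch is one possible reading of it, not the grading code used for this report. The function name, the feature labels, and the rule that a leaf size of at most twice the optimal counts as grade A are assumptions inferred from the wording of grades A and B.

```python
# Features that make a result non-optimal when the optimal antiderivative
# does not contain them (labels are hypothetical names for this sketch).
QUALITY_FEATURES = {"hypergeometric", "special_function", "imaginary_unit"}


def grade_antiderivative(solved, leaf_size=0, optimal_leaf_size=0,
                         result_features=frozenset(),
                         optimal_features=frozenset()):
    """Return "A", "B", "C" or "F" following the grade table."""
    if not solved:
        # Unevaluated, timed out, hung, crashed, or raised an exception.
        return "F"
    # Features present in the result but absent from the optimal answer.
    extra = (set(result_features) - set(optimal_features)) & QUALITY_FEATURES
    if extra:
        return "C"
    # Assumption: "optimal in leaf size" means at most twice the optimal.
    if leaf_size > 2 * optimal_leaf_size:
        return "B"
    return "A"
```

For example, a solved integral with leaf size 50 against an optimal leaf size of 20 and no extra features would be graded B under this reading.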

Based on the above, the following tables summarize the grading of the test suite for each version.

This table shows the percentage and count of solved and unsolved integrals for each version. There are a total of 14944 integrals in the test suite.


Version	percentage solved	number solved	number failed


14.1	98.715	14752	192

14	98.675	14746	198

13.3	98.675	14746	198

12.3.1	98.588	14733	211

12.1	98.548	14727	217

12	98.555	14728	216

11.3	97.223	14529	415

11.2	97.511	14572	372

10.3	93.395	13957	987

9	92.867	13878	1066

8	91.783	13716	1228

7	91.823	13722	1222

6.0.1	90.645	13546	1398

5.2	88.477	13222	1722

Table 1: Solved percentage over versions
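The columns of Table 1 are related by simple arithmetic: the number failed is the total minus the number solved, and the percentage is the solved fraction rounded to three decimals. A minimal Python sketch, with a hypothetical function name, reproduces the first and last rows:

```python
TOTAL_INTEGRALS = 14944  # total number of integrals in the test suite


def solved_summary(number_solved, total=TOTAL_INTEGRALS):
    """Return (percentage solved, number solved, number failed)."""
    number_failed = total - number_solved
    percentage = round(100 * number_solved / total, 3)
    return percentage, number_solved, number_failed


print(solved_summary(14752))  # version 14.1 -> (98.715, 14752, 192)
print(solved_summary(13222))  # version 5.2  -> (88.477, 13222, 1722)
```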

This figure shows the percentage of passed integrals in each version.

This plot shows the number of A-graded results for each version.

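This table shows the grading performance for each version. Each entry gives the percentage of integrals that received the grade, with the count in parentheses.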


Version	%A	%B	%C	%F


14.1	79.611 (11897)	5.567 (832)	13.537 (2023)	1.285 (192)

14	79.979 (11952)	5.179 (774)	13.517 (2020)	1.325 (198)

13.3	79.865 (11935)	5.333 (797)	13.477 (2014)	1.325 (198)

12.3.1	77.048 (11514)	5.822 (870)	15.719 (2349)	1.412 (211)

12.1	77.001 (11507)	5.795 (866)	15.752 (2354)	1.452 (217)

12	76.807 (11478)	5.989 (895)	15.759 (2355)	1.445 (216)

11.3	75.174 (11234)	7.889 (1179)	14.16 (2116)	2.777 (415)

11.2	75.147 (11230)	7.314 (1093)	15.05 (2249)	2.489 (372)

10.3	71.079 (10622)	7.347 (1098)	14.969 (2237)	6.605 (987)

9	72.022 (10763)	7.02 (1049)	13.825 (2066)	7.133 (1066)

8	70.865 (10590)	6.939 (1037)	13.979 (2089)	8.217 (1228)

7	71.112 (10627)	7.515 (1123)	13.196 (1972)	8.177 (1222)

6.0.1	70.323 (10509)	7.194 (1075)	13.129 (1962)	9.355 (1398)

5.2	68.389 (10220)	7.254 (1084)	12.835 (1918)	11.523 (1722)

Table 2: Performance grading summary table over versions
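Each row of Table 2 can be checked for consistency: the four grade counts should sum to the 14944 integrals in the test suite, and the A, B and C counts together should equal the number solved in Table 1. The Python sketch below, with a hypothetical function name, performs this check for the version 14.1 row and recomputes its percentages.

```python
TOTAL_INTEGRALS = 14944  # total number of integrals in the test suite


def grade_percentages(counts, total=TOTAL_INTEGRALS):
    """Check that grade counts sum to the total and return percentages."""
    if sum(counts.values()) != total:
        raise ValueError("grade counts do not sum to the total")
    return {g: round(100 * n / total, 3) for g, n in counts.items()}


row_14_1 = {"A": 11897, "B": 832, "C": 2023, "F": 192}
print(grade_percentages(row_14_1))
# -> {'A': 79.611, 'B': 5.567, 'C': 13.537, 'F': 1.285}

# A + B + C equals the number solved for version 14.1 in Table 1.
print(row_14_1["A"] + row_14_1["B"] + row_14_1["C"])  # -> 14752
```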

This figure shows the normalized mean leaf size for each version. This was normalized to the size of the optimal result.

This figure shows the mean leaf size for each version.

This figure shows the median leaf size for each version.

This figure shows the mean CPU time (sec) for each version.
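The leaf size statistics named in these captions can be sketched as follows. This is an illustration in Python with made-up numbers, not the report's data or code. In particular, it assumes that the normalized mean is the mean of each result's leaf size divided by the leaf size of the corresponding optimal result; the report does not spell out the exact formula.

```python
from statistics import mean, median


def leaf_size_summary(leaf_sizes, optimal_leaf_sizes):
    """Return mean, median and normalized mean leaf size.

    Assumption: the normalized mean is the mean of the per-integral
    ratios leaf_size / optimal_leaf_size.
    """
    if len(leaf_sizes) != len(optimal_leaf_sizes):
        raise ValueError("both lists must have the same length")
    ratios = [size / optimal
              for size, optimal in zip(leaf_sizes, optimal_leaf_sizes)]
    return {
        "mean": mean(leaf_sizes),
        "median": median(leaf_sizes),
        "normalized_mean": mean(ratios),
    }


# Hypothetical leaf sizes for three integrals and their optimal results.
summary = leaf_size_summary([10, 20, 60], [10, 10, 20])
print(summary["mean"], summary["median"], summary["normalized_mean"])
```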
