GSoC 2017 Application Shikhar Jaiswal: Improving SymEngine's Python Wrappers and SymPy SymEngine Integration
- Personal Background
- Project Overview
- Project Details
- Additional Goals (Time Permitting)
- Timeline
- References
Name : Shikhar Jaiswal
University : Indian Institute of Technology, Patna
Email : jaiswalshikhar87@gmail.com
GitHub : ShikharJ
Blog : shikharj.github.io
Time-zone : IST (UTC+5:30)
Age : 18
I am a first-year undergraduate student pursuing a Bachelor of Technology in Computer Science and Engineering at the Indian Institute of Technology, Patna. I was introduced to programming about three years ago. I have previously programmed a file-based organic and inorganic chemical analysis module in C++, which presents the user with an initial set of characteristics to select from, and returns the name of the test to be carried out to determine the functional group of that compound exactly.
I also had the opportunity to implement a steganography tool (in C++), using the algorithm employed in the original Enigma machine used by the Axis powers during World War II.
I am comfortable with the STL and algorithms. Apart from software development, I am also currently improving my skills in competitive programming. I was introduced to Python programming through the book “The Python Crash Course” by Eric Matthes, and have developed a 2-D space shooting game using the PyGame library, and also worked on data visualisation using Pygal. I am familiar with git for version control, and am currently working on a project involving image processing and gesture recognition, requiring the use of MATLAB/GNU Octave and OpenCV respectively. I am also currently trying my hand at Cython through the book “Cython: A Guide for Python Programmers” by Kurt W. Smith.
OS : Ubuntu 16.10
Hardware Configuration : i7 7700HQ/ 16GB
IDE : C/C++ - CLion
Python - PyCharm
Editor : SublimeText 3
- Removed unimplemented constructor declaration in SymEngine::Min (Merged)
- Added more functions to LLVMDoubleVisitor (Pending)
- Worked on changing the test clause to CHECK() (Pending)
- Worked on increasing code coverage (Merged)
- Implemented the derivative of Dirichlet_eta function and added tests (Merged)
- Improved test cases in test_infinity.cpp (Merged)
- Updated functions.cpp and added tests (Merged)
- Implemented NaN class and made subsequent changes in the code-base (Merged)
- Implemented automatic evaluation of powers to and of constants (with @isuruf) (Merged)
- Implemented class ComplexBase and is_a_complex() virtual function in class Number (Merged)
- Improved upon Zeta function derivative and added tests (Merged)
- Implemented Infty::pow and Infty::rpow functions (Merged)
- Worked on replacing calls to rcp_static_cast() with calls to down_cast() (Merged)
- Implemented the derivative of LeviCivita function (Merged)
- Incorporated the derivative of Zeta, UpperGamma and LowerGamma functions and added tests (with @isuruf) (Merged)
- Restructured the code-base to convert public and protected data members of select classes to private (Merged)
- Applied -Wconversion flag and reported the errors (with @isuruf) (Pending)
- Restructured the mul_dense_dense() function (Merged)
- Added -ftrapv flag to clang builds for checking integer overflows (Merged)
- Wrapped as_numer_denom() function in C (Merged)
- Added a GCC 6 build to TravisCI (with @certik) (Merged)
- Worked on changing the print format of exponential expressions and added tests (Merged)
- Added support for down_cast() and made changes in the code-base (with @isuruf) (Merged)
- Implemented complex_double_get() function (Merged)
- Added clang sanitizer checks and worked on test consolidation (Pending)
- Implemented symengine_have_component() function (Merged)
- Implemented rational_get_mpq() function (Merged)
- Minor change in UndefinedError class (Merged)
Issues
- Added CodeCov check (Pending)
- Wrapped sech, csch, acsch and asech functions (Merged)
- Wrapped DenseMatrix::reshape() function (Merged)
- Formatted .py files according to PEP8 (Merged)
- Refactored DenseMatrix class and introduced MutableDenseMatrix and ImmutableDenseMatrix classes (Pending)
- Refactored Subs and Derivative classes (Pending)
- Wrapped SymEngine::Min and SymEngine::Max functions and added tests (Pending)
- Minor improvement in atoms() function and added tests (Merged)
- Added a bunch of constants to sympy/physics/units (Merged)
- Refactored printing/pretty/pretty.py to use pretty_symbology.py (Pending)
- Worked on porting SymEngine to physics/optics (Pending)
Speed is of the utmost importance for any Computer Algebra System. SymEngine was initially developed with the aim of serving as an optional core for the SymPy CAS in the future. Over the years, it has matured enough to be used as a symbolic backend.
Using SymEngine can significantly increase the speed of various symbolic operations, and hence make SymPy an ideal choice for projects requiring fast manipulations, by giving them the option to switch over to SymEngine’s routines.
On the other hand, this will also lead to the development of a number of features currently lacking in SymEngine and its Python wrapper, which would be ported over from SymPy in order to provide smooth wrappers for optional use.
An added advantage is that SymEngine can be used in SymPy with minimal programming effort (as clearly demonstrated here), requiring less time in porting, so more time can be dedicated to expanding and implementing additional functionality in SymEngine and SymEngine.py, which can again be integrated between SymEngine and SymPy.
I initially wanted to implement my proposal on a module-by-module basis (i.e. working on improving the backend of a single module at a time). However, after talks with Isuru, it soon turned out that this was a longer approach. As such, this proposal currently takes on a routine-by-routine approach, for which a number of routines are shortlisted through our discussions. The main idea of this approach is that the majority of the implementation and wrapping related work should occur first, succeeded by introducing changes and tackling conflicts in the SymPy repository. Thanks to Isuru, this proposal now has a much improved layout and timeline breakup.
Currently, there are roughly 16 modules (or specialised directories), out of a total of 37, in SymPy that are under the present scope of improvement. Since other contributions during the GSoC period may add or change modules and sub-modules, the exact figure may vary. I plan on executing this proposal in three inter-mixing phases (list of specific functions is given later):
In the first phase, the idea is to refurbish (completely or partially) all the modules that currently import routines that are already implemented in SymEngine and available in the SymEngine.py wrapper. As such, no new functionality is expected to be implemented in either SymEngine or its Python wrapper, though minor changes may be required. Only those modules are worked upon in which all of the imported routines are either available in the SymEngine.py wrapper, or are beyond the scope of the development of this project (for example, integration heuristics and assumption routines). This work should require making trivial changes such as changing:
from sympy.core import ...
to
from sympy.core.backend import ...
Testing (for compatibility issues) and benchmarking, if required, for these modules will also occur during this period. This phase will also serve as a warm-up for the next two phases, which would be more coding intensive, and it can be started before and during the Community Bonding period.
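The import switch above can be pictured as a small backend selector. The following is a minimal sketch, not SymPy's actual backend module: it assumes that the USE_SYMENGINE environment variable mentioned in this proposal decides which library supplies the names, and the helper names backend_name and load_symbols, as well as the accepted flag values, are hypothetical.

```python
import importlib
import os


def backend_name(env=None):
    """Return the name of the module that should supply the core symbols.

    Assumption: USE_SYMENGINE set to "1", "t" or "true" selects SymEngine.
    """
    env = os.environ if env is None else env
    flag = env.get("USE_SYMENGINE", "0").lower()
    return "symengine" if flag in ("1", "t", "true") else "sympy"


def load_symbols(names, env=None, importer=importlib.import_module):
    """Fetch `names` from the selected backend and return them as a dict.

    `importer` is injectable only so that the sketch can be exercised
    without either library installed.
    """
    module = importer(backend_name(env))
    return {name: getattr(module, name) for name in names}
```

A module that previously imported Symbol and sqrt from sympy.core would then obtain them through one such selector, so that the choice of backend is made in a single place rather than in every module.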
The second phase will primarily focus on implementing specific functionalities that aren’t currently available in SymEngine, in SymEngine.py, or in both, but can be implemented within a stipulated amount of time. This includes implementing routines in SymEngine in a manner similar to SymPy, updating the Python wrapper with the latest development, and testing all the implementations. Having worked extensively with SymEngine, I expect this to be an extensive yet moderately challenging task. Since most of the work will be centered around implementation in SymEngine and SymEngine.py, no major change is expected in the SymPy repository. The USE_SYMENGINE clause requires the latest version of SymEngine’s Python wrapper, and as such, the SymEngine.py release will also have to be updated.
The third phase would largely be a follow-up of the first two phases, especially the second. All the new functionality implemented in SymEngine and SymEngine.py will be ported over to SymPy. All of the modules left uncovered in the first phase will also be updated here, along with final testing and benchmarking of the changes made up till then. Remaining compatibility issues, if they arise, will also be dealt with during this phase. The proposal would be finished off with a final update to the documentation and instructions wiki.
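For the benchmarking mentioned above, a small timing harness is enough to compare a routine before and after switching backends. This is a sketch using only the standard library timeit module; the function name compare and the two stand-in workloads are hypothetical placeholders for a SymPy routine and its SymEngine-backed equivalent.

```python
import timeit


def compare(baseline, candidate, repeat=5, number=100):
    """Time two zero-argument callables and return (t_base, t_cand, speedup).

    Each timing is the best of `repeat` runs of `number` calls, which
    reduces noise from other processes.
    """
    t_base = min(timeit.repeat(baseline, repeat=repeat, number=number))
    t_cand = min(timeit.repeat(candidate, repeat=repeat, number=number))
    # Guard against a zero reading on very coarse timers.
    speedup = t_base / t_cand if t_cand > 0 else float("inf")
    return t_base, t_cand, speedup


if __name__ == "__main__":
    # Stand-in workloads; in practice these would call the same routine
    # through the SymPy backend and through the SymEngine backend.
    def slow():
        return sum(i * i for i in range(2000))

    def fast():
        return sum(range(2000))

    t_base, t_cand, speedup = compare(slow, fast)
    print("baseline: %.6fs  candidate: %.6fs  speedup: %.2fx"
          % (t_base, t_cand, speedup))
```

Taking the minimum over several repeats, rather than the mean, is the usual choice with timeit because the slower runs mostly reflect interference from the rest of the system.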
PHASE I: MIGHT THROW COMPATIBILITY ISSUES WITH SYMPY
The following functions are to be inspected once for conflicts with their SymPy counterparts and for minor changes. Also, some of these are yet to be made available through the sympy_compat.py file:
Symbol
Integer
sympify
S
SympifyError
exp
log
gamma
sqrt
I
E
pi
Matrix
lambdify
symarray
diff
zeros
eye
symbols
diag
ones
expand
AppliedUndef
Function
symbols
var
Add
Mul
Derivative
Basic
Pow
Rational
Abs
Number
Float
Dict
factorial
sieve
gcd
lcm
factor
nextprime
mod_inverse
totient
primitive_root
atan2
MatrixBase
DenseMatrix
- Trigonometric Functions (sin, cos, tan, cot, csc, sec, asin, acos, atan, acot, acsc, asec, sinh, cosh, tanh, coth, asinh, acosh, atanh, acoth)
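One way to carry out the inspection of the names listed above is to check mechanically which of them a compatibility module already exposes. The sketch below uses only the standard library; the module path symengine.sympy_compat is an assumption based on the sympy_compat.py file named above, the helper names are hypothetical, and only a sample of the Phase I names is included.

```python
import importlib

# A sample of the Phase I names; the full list is given above.
PHASE_ONE_SAMPLE = ["Symbol", "Integer", "sympify", "exp", "log",
                    "sqrt", "diff", "expand", "symbols", "var"]


def missing_names(module, names):
    """Return the names from `names` that `module` does not expose."""
    return [name for name in names if not hasattr(module, name)]


def report(module_name, names):
    """Return the missing names, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return missing_names(module, names)


if __name__ == "__main__":
    # Assumed module path; prints None when SymEngine.py is not installed.
    print(report("symengine.sympy_compat", PHASE_ONE_SAMPLE))
```

A check like this only confirms that a name exists; conflicts in behaviour or output format with the SymPy counterpart still have to be found by running the relevant tests.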
PHASE II: TO BE IMPLEMENTED IN A MANNER SIMILAR TO SYMPY
These functions, after being implemented, will have to be selectively checked for compatibility (initially between SymEngine and SymEngine.py, and later between SymEngine.py and SymPy). Some of these are pre-implemented, but need to be refurbished:
- Relational Operators (Rel, Eq, Ne, Lt, Le, Gt, Ge)
Nor
sign
NumberSymbol
isprime
Range
Intersection
Complement
Mod
_symbol
floor
igcdex
_symbol
ceiling
igcd
ilcm
isqrt
Tuple
integer_nthroot
perfect_power
sqrt_mod
gcdex
divisors
ProductSet
conjugate
_sympify
And
Not
Or
Expr
Interval
FiniteSet
Union
EmptySet
Set
as_int
KroneckerDelta
Zeta
MutableDenseMatrix
MatrixSymbol
- Error Functions (erf, erfc, erfi)
NaN
Infinity
NegativeInfinity
LambertW
Piecewise
expand_mul
BooleanAtom
nan
oo
zoo
Lambda
Min
Max
Contains
Xor
Nand
Nor
col
rowadd
rowmul
SparseMatrix
MutableMatrix
mgamma
diophantine
EulerGamma
lowergamma
uppergamma
ImmutableMatrix
ImmutableSparseMatrix
The modules list has been added only to portray the potential benefit of implementing this proposal. What I intend to do is port over and wrap only the functions and classes mentioned above (which are universal to use). I do not plan on implementing anything else that is unique to these modules and not already implemented in SymEngine.
- Parsing
- Physics
- Categories
- CodeGen
- Combinatorics
- Crypto
- DiffGeom
- Geometry
- LieAlgebras
- Ntheory
- Polys
- Sets
- Simplify
- Strategies
- Tensor
- Utilities
Code coverage is one of the most fundamental methods of software testing. A program with high code coverage has a lower chance of containing undetected bugs. Currently, the SymEngine master branch stands at ~82.75% coverage as reported by Codecov. Raising code coverage, though not explicitly challenging, is a very time-consuming task, requiring the implementation of proper test cases and conditions, and subsequently, debugging the errors obtained, if any. Though SymEngine currently deploys the Codecov check as a pre-condition for incoming pull requests, some of the already existing files in the code-base have a coverage as low as 20-25%. Hence I would like to devote some time to implementing tests to increase SymEngine’s coverage.
From my experience of contributing to SymEngine over the course of the past 5 months, I have felt that the SymEngine library currently lacks proper documentation. As a newbie to SymEngine, it took me a considerable amount of time to completely understand some of the very core functionality of the library, along with frequent clarifications from Ondřej and Isuru. Better documentation would also help a lot of people who would like to integrate SymEngine into their projects, but find the lack of documentation to be a hindrance. Though I won’t be able to document the entire library, as it would still require a lot of time, I would certainly like to work on documenting the functionality I have worked on, both pre-GSoC and while implementing my proposal, as a side task. Also, as suggested by Isuru, I would like to write a tutorial (possibly a Jupyter notebook) for using SymEngine in Python through SymEngine.py.
Thread safety is a current issue with SymEngine.py that needs to be worked upon. One possible way of implementing thread safety is through setting up a GIL acquisition routine in pywrapper.cpp.
Though I don’t know exactly how this is likely to be implemented, I plan on helping in its development and working alongside my mentors to get it finished.
I have no major commitments for the coming summer, except for maybe a couple of days of family vacation during the first week of May. As such, I will be able to contribute a total of 50 hours per week, or more if required. My summer break starts on the 29th of April, and regular classes commence on the 31st of July. I also do not have any examinations before mid-September. Hence, the following timeline is planned to finish up on major areas of work before my college semester kicks off. I will also maintain my GitHub blog to show my progress and get feedback from the SymPy community.
Specific Goals
- Make all the pre-implemented functions available through sympy_compat.py.
- Introduce SymEngine as a backend in Parsing and Physics modules (Phase I Modules).
Side Goals
- Talk to Isuru Fernando, Ondřej Certik, Aaron Meurer and others regarding the feasibility (with respect to time) of porting functionalities (in addition to the ones mentioned above) from SymPy to SymEngine and back.
- Finalise the complete set of classes and routines to be ported, so as to save time later on.
Specific Goals
- Implement Relational Operations in SymEngine as an initiation to Phase II.
- Import changes for pre-implemented functions/classes in the remaining modules (Phase I and II Modules).
- Benchmark the results obtained, and make changes to the Phase II and Phase III work approaches if required.
Side Goal
- Examine the SymPy source code for the implementation of the proposed functions in SymEngine and SymEngine.py wrapper, in preparation for later phases.
Specific Goals
- Implement the rest of the mentioned Phase II functionalities to be ported over to SymEngine, along with tests (classes NumberSymbol, Complement, Mod and others).
- Finish up on all the conflicted (with respect to naming, internal representation or output types/formats) Phase II functionalities pre-implemented in SymEngine (such as oo, zoo, rowadd, rowmul and others).
My goal up till Phase 1 evaluations would be to finish off with Phase I and SymEngine’s side of implementation work in Phase II.
Specific Goals
- Work on the existing issues in the SymEngine.py repository related to the proposal such as #17, #76 and #91.
- Wrap up functionalities (first 20 under Phase II) in SymEngine.py along with extensive testing.
Specific Goals
- Finish up on the (28 remaining) shortlisted functionalities and classes, effectively finishing up on Phase II.
- Exhaustively clear out compatibility issues between the SymEngine repository and the wrapper, if they arise.
Side Goal
- Set up CodeCov check for SymEngine.py to maintain a healthy coverage.
By the time for Phase 2 evaluations, I plan to be finished off with SymEngine.py’s side of implementation work and Phase II.
Specific Goals
- Update the first 9 mentioned Phase I and II modules in SymPy after finishing up on SymEngine and SymEngine.py work, as a part of Phase III.
- Fix the compatibility issues thrown up.
Side Goal
- Finish off miscellaneous implementations that might be required along the path of a future release.
Specific Goals
- Update the remaining 5 mentioned Phase I and II modules in SymPy, bringing an end to Phase III.
- Investigate SparseMatrix algorithms in SymPy, for a possible update on their usage with SymEngine.py objects.
- Final check for any issues or conflicts between the wrapper and SymPy repository.
- Final benchmarking and update to the SymPy and SymEngine wikis.
- Buffer time for finishing up on documentation or any other piece of implementation that needs refactoring, or wrapping up any functionality left untouched due to delays.
- Work on the additional goals planned (subject to the availability of time).
SymEngine is the first open-source project that I contributed to, and the journey has been simply amazing. Over time, I have realised that collaborating with the sharpest minds of the world is a pleasure beyond words. The experience I have gained so far is enriching in itself. I have the following plans post-GSoC:
- I realise that the entire SymPy library cannot be covered presently due to various constraints. As such, I hope to continue my work on the modules and functionalities left untouched by the changes proposed above.
- While thinking of a project idea, I had the opportunity to go through some amount of work done/required in SymPy and SymEngine (related to assumptions). Hence I wish to be a part of the team that develops the assumptions module in SymEngine.
- Lastly, I also hope to represent team SymEngine at the upcoming conventions (PyCon India and SciPy India 2017), and talks organised at my college.
- I would also like to mention that the structure and format of this proposal are inspired by a number of outstanding proposals from previous years' GSoCers, available at SymPy's wiki.