Google Summer of Code 2018 : Machine Learning features in Scilab
The project aims to enhance machine learning features in Scilab, for Scilab and by Scilab. The main focus is to integrate deep learning functionality, with the possibility of working on data science usability as well (R/SAS integration). There are tidbits here and there that (well, at least Soumitra thinks so) would drive the developer pool towards Scilab in the future. The project draws inspiration from last year's GSoC project with the same header, but is not necessarily an extension of it.
Community bonding period
The community bonding period was focused on integrating and understanding the previous implementation (done during GSoC 2016). Another major task during the period was going through the Jupyter documentation, since Jupyter was to be the bread and butter of the proposed approach. Accordingly, an instance of JupyterHub was set up on a local machine.
The coding effort was divided into two streams, namely:
- Development: an initiative to create a standalone machine learning toolbox written entirely in Scilab
- Experimentation: an initiative to run machine learning scripts already written in Python using a feeder-subscriber mechanism, which a user can invoke with scripts residing on a server. It also covers any other effort, beyond the development stream, to make machine learning easier to do in Scilab.
The standalone machine learning toolbox presently contains the following algorithms, implemented as macros in the GitHub repository:
- Decision Tree classification (CART)
- Linear regression
- Logistic regression
- Naive Bayes (Gaussian)
- Polynomial regression
- K-Means clustering
- Scaling (Zero mean, unit variance)
- Train Test split
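For illustration, the scaling and train-test-split macros behave roughly like the following Python sketch. The Scilab macros in the repository are the authoritative implementation; the function names, the default test ratio, and the seeding here are only assumptions made for the example.

```python
import random

def scale(X):
    # Standardize each column to zero mean and unit variance,
    # mirroring what the toolbox's scaling macro does.
    # (Constant columns, which would give a zero std, are not handled here.)
    n = len(X)
    cols = len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(cols)]
    stds = [(sum((row[j] - means[j]) ** 2 for row in X) / n) ** 0.5
            for j in range(cols)]
    return [[(row[j] - means[j]) / stds[j] for j in range(cols)] for row in X]

def train_test_split(X, y, test_ratio=0.25, seed=0):
    # Shuffle the row indices, then carve off the last test_ratio
    # fraction of rows as the test set.
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(len(X) * (1 - test_ratio))
    train, test = idx[:cut], idx[cut:]
    return ([X[i] for i in train], [X[i] for i in test],
            [y[i] for i in train], [y[i] for i in test])
```

After scaling, every column sums to zero and has unit variance, which is the precondition several of the other macros (e.g. K-Means and the regressions) benefit from.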
The work under the experimentation domain began with the setup of a GCP server running an IPython (Jupyter) server, with only a set of specific keys able to log into the machine. This machine acts as our server and does the computation for the pre-written Python scripts that contain the machine learning algorithms. Our client logs into the machine, starts up a kernel, and copies the kernel configuration file to its local machine. The scripts for this can be found in the GitHub sub-repository. This can then be integrated with the approach used in last year's project to run the script as an interim step within a larger Scilab code.
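The kernel configuration file copied back to the client is a small JSON document listing the kernel's ZeroMQ ports, bind address, and signing key, per the standard Jupyter connection-file format. A minimal sketch of reading and sanity-checking one on the client side (the helper name is hypothetical; the field names are the standard ones):

```python
import json

# Fields a client needs before it can attach to a remote Jupyter kernel.
REQUIRED_FIELDS = {"shell_port", "iopub_port", "stdin_port",
                   "control_port", "hb_port", "ip", "key", "transport"}

def load_connection_file(path):
    # Parse a Jupyter kernel connection file and verify it carries
    # everything required to connect to the kernel's channels.
    with open(path) as f:
        info = json.load(f)
    missing = REQUIRED_FIELDS - info.keys()
    if missing:
        raise ValueError(f"connection file missing fields: {sorted(missing)}")
    return info
```

With the file validated, a client library such as jupyter_client can use it to open the shell and iopub channels and submit code for execution on the server.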
The next step was to ensure an authentication mechanism so that a user does not have permission to do anything other than run a kernel and copy its configuration file. How to tell which kernel a given user has started still eludes us, but using the command option in the authorized_keys file of the OpenSSH mechanism we were able to lock down a user's ability to execute arbitrary commands on the server.
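Concretely, locking a key down to a single forced command looks like the following authorized_keys entry. The wrapper-script path and the shortened key are illustrative; the option names are standard OpenSSH.

```
# In ~/.ssh/authorized_keys on the server: force one wrapper script
# and disable forwarding and interactive terminals for this key.
command="/usr/local/bin/start_kernel.sh",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAAB3Nza... user@client
```

Whatever command the client actually requested is passed to the forced script in the SSH_ORIGINAL_COMMAND environment variable, so the wrapper can whitelist just the kernel-start and configuration-copy operations.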
Efforts by Soumitra Agarwal, under mentors:
- Aashay Singhal
- Mandar Deshpande