GSoC 2017 - Machine Learning Toolbox in Scilab

Student and Mentors

Student Name -

Mentors -

Introduction

This project aims to develop machine learning features in Scilab, which will be available to the end user as a toolbox or as direct function calls. The project has been divided into two sections:

ML libraries to be adapted from Python to Scilab:

Community Bonding Period

(5th to 30th May, 2017)

Task: Getting to know my mentors and finalizing a rough starting point for the project
Description: I was asked to revise all major machine learning models and algorithms and to get clarity about their mathematical modeling.
Status: Done

Task: Getting in-depth knowledge of the scikit-learn library
Description: I went through the documentation and tutorials on the scikit-learn portal, and got hands-on experience with the various sklearn modules by trying them on Kaggle datasets.
Status: Done

Task: Getting acquainted with deep learning libraries like TensorFlow and Keras
Description: Since a neural network module already exists in Scilab, the current need is a deep learning module/library implementation through Scilab. I studied how neural networks work and are modeled using the TensorFlow and Keras libraries for Python.
Status: Done

Task: Scilab syntax and toolboxes
Description: I was required to become comfortable using Scilab for the entire development period, and to learn how and where Scilab differs from MATLAB (which I am used to).
Status: Done

Coding Period Begins

(30th May 2017)

The first task assigned to me was to study the existing machine learning toolboxes in Scilab, so as to get a clear understanding of what exists and what needs to be developed.

Here are some of the modules that are present in ATOMS right now:

Week 1-2 Report

(31st May 2017 - 14th June 2017)

Week 3-4 Report

(15th June 2017 - 30th June 2017)

The following are the important points of this approach:

  1. It is essential to note that we are not planning to have an interactive Python environment within Scilab, as it would be infeasible to manage so many libraries and versions efficiently.
  2. Python scripts will be written outside the Scilab interface and will be called only when their outputs or trained machine learning models need to be used in the Scilab context.
  3. This would involve the following steps:
    1. Writing the required machine learning script and saving it as a '.py' file.
    2. Sending the ML script to the Python kernel running on the Jupyter server.
    3. Once execution completes, passing the Python objects, such as a regression model, back to Scilab and converting them to the Scilab context.
    4. Using the converted object for any required operation in Scilab, such as solving a differential equation.
  4. The two major parts of this approach are:
    • [1] Passing the Python script file to the Jupyter server, and/or letting the Python kernel know where this file exists
    • [2] Conversion of Python objects to a Scilab-compatible form
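
Handing a trained model back to Scilab (step 3 in the list, and part [2]) could look roughly like the following. This is a minimal sketch, not the toolbox's actual conversion code: it assumes a linear regression whose parameters have already been extracted as plain, finite Python numbers, and the helper names `to_scilab_matrix` and `model_to_scilab` are hypothetical. It renders the parameters as Scilab assignment statements that Scilab could then evaluate.

```python
def to_scilab_matrix(rows):
    """Render a list of rows (lists of finite numbers) as a Scilab matrix literal."""
    # Commas separate elements and semicolons separate rows; commas avoid
    # any ambiguity around negative numbers such as "2.0 -0.5".
    return "[" + "; ".join(
        ", ".join(repr(float(x)) for x in row) for row in rows
    ) + "]"


def model_to_scilab(coefficients, intercept, name="model"):
    """Build Scilab statements that recreate a linear model's parameters.

    `name` is a hypothetical prefix for the generated Scilab variables.
    """
    return "\n".join([
        f"{name}_coef = {to_scilab_matrix([coefficients])};",
        f"{name}_intercept = {float(intercept)!r};",
    ])


# Made-up parameters for y = 2*x1 - 0.5*x2 + 1 (not results from the project).
script = model_to_scilab([2.0, -0.5], 1.0)
print(script)
```

On the Scilab side, predictions for a data matrix `X` could then be computed as `X * model_coef' + model_intercept`. More complex models would need a richer exchange format than two variables.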

Part [1] involves passing the path of the script file to the Jupyter server so that the Python kernel can execute it. This can be achieved through Python code that transfers or copies the script file to the Python kernel's path. Whether we decide to follow the PIMS approach or continue working on this Jupyter server method, we would still be required to handle part [2].
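
The copying step for part [1] can be sketched with the Python standard library alone. This is an illustrative sketch under stated assumptions: `stage_script` and `run_request` are hypothetical helper names, the kernel's working directory is assumed to be reachable as a local path, and the transport of the request to the Jupyter server is not shown (a local `exec` stands in for the remote kernel).

```python
import shutil
import tempfile
from pathlib import Path


def stage_script(script_path, kernel_dir):
    """Copy an ML script into the directory the Python kernel works in."""
    source = Path(script_path)
    if source.suffix != ".py":
        raise ValueError(f"expected a .py file, got {source.name}")
    target_dir = Path(kernel_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy returns the path of the newly created file.
    return Path(shutil.copy(source, target_dir))


def run_request(staged_path):
    """Build the Python source a kernel could execute to run the staged script."""
    # repr() quotes and escapes the path so the snippet is valid Python.
    return f"exec(open({str(staged_path)!r}).read())"


# Local demonstration with a throwaway script and directory.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "train_model.py"
    script.write_text("result = 1 + 1\n")
    staged = stage_script(script, Path(tmp) / "kernel_cwd")
    request = run_request(staged)
    namespace = {}
    exec(request, namespace)  # stand-in for the remote kernel running the request
    staged_name = staged.name
    result = namespace["result"]
```

Keeping the staging step separate from the execution request means the same request string works whether the file was copied locally or transferred to a remote machine by some other means.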

jupyter_ml.png

An illustration of the Jupyter client approach for machine learning in Scilab

Comparison of prediction results on Python and Scilab side

Week 5-6 Report

(1st July 2017 - 15th July 2017)

Daily Reports can be found here

jupyter_approach.jpg

16th July - 21st August 2017

Daily Reports can be found here

Source Code

Link

The Scilab forge is currently down, so I have created a copy of my repo on GitHub: https://github.com/mandroid6/machine-learning-Toolbox-SCILAB/blob/master/Final%20Submission/

Ideas/Direction To Work in Future

At the present stage, machine learning can be implemented through Scilab using the Python libraries supported by this toolbox. However, the steps involved may prove somewhat elaborate for a non-Python programmer. The current version of the toolbox is at an early stage and would require several iterations before it can be finalized for the end user. Right now, the user can run Python scripts on a remote server as planned, but this approach does not support multiple users working on the same server at the same time, because the IPython kernel on the remote machine offers a single common workspace.
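
The shared-workspace limitation can be illustrated without a server. The sketch below is a local simulation, not the toolbox's code: a plain dictionary stands in for the kernel's single namespace, `exec` stands in for remote execution, and the `Session` class is a hypothetical illustration of per-user isolation.

```python
# One dictionary plays the role of the kernel's single, shared workspace.
shared_workspace = {}

# Two users run scripts that both happen to use the name `model`.
exec("model = 'regression fitted by user A'", shared_workspace)
exec("model = 'classifier fitted by user B'", shared_workspace)

# User A now reads back user B's object: the names collided.
seen_by_a = shared_workspace["model"]


class Session:
    """Hypothetical per-user session that keeps its own namespace."""

    def __init__(self):
        self.workspace = {}

    def run(self, code):
        # Each session executes code against its own dictionary only.
        exec(code, self.workspace)
        return self.workspace


user_a, user_b = Session(), Session()
user_a.run("model = 'regression fitted by user A'")
user_b.run("model = 'classifier fitted by user B'")
```

With separate namespaces each user gets back their own `model`. Running one kernel per user would have the same isolating effect on a real server.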

These are a few ideas which, if implemented later, would significantly improve ML in Scilab:

  1. Making a JupyterHub implementation within Scilab, but offering non-interactive execution.
  2. Automating the transfer of the connection_file corresponding to the kernel running on the remote server.
  3. Adding support for Python 3, which would open the door to many other ML libraries unavailable in Python 2.7.
  4. Opening Scilab to well-known ML libraries like TensorFlow (currently it supports execution on the server side, but learned models cannot be used on the local Scilab machine).
  5. Trying out Jupyter kernels to run R scripts, since R is widely used in the field of data science.
  6. Support for inline visualizations within Scilab using libraries such as matplotlib and seaborn (Python) and ggplot2 (R).
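
For idea 2, any automated transfer would probably want to check the connection file before using it. The following is a minimal sketch under assumptions: the field names follow the standard Jupyter kernel connection file format, the helper names `parse_connection_info` and `channel_endpoints` are hypothetical, and the sample values are made up. How the file reaches the Scilab machine is not shown.

```python
import json

# Fields found in a standard Jupyter kernel connection file.
REQUIRED_FIELDS = (
    "transport", "ip", "key", "signature_scheme",
    "shell_port", "iopub_port", "stdin_port", "control_port", "hb_port",
)


def parse_connection_info(text):
    """Parse connection-file JSON and fail early if a field is missing."""
    info = json.loads(text)
    missing = [field for field in REQUIRED_FIELDS if field not in info]
    if missing:
        raise ValueError("connection file lacks: " + ", ".join(missing))
    return info


def channel_endpoints(info):
    """Map each kernel channel to the address a client would connect to."""
    return {
        channel: f"{info['transport']}://{info['ip']}:{info[channel + '_port']}"
        for channel in ("shell", "iopub", "stdin", "control", "hb")
    }


# Made-up sample content; a real file is written by the kernel at startup.
sample = json.dumps({
    "transport": "tcp", "ip": "127.0.0.1", "key": "not-a-real-key",
    "signature_scheme": "hmac-sha256",
    "shell_port": 5001, "iopub_port": 5002, "stdin_port": 5003,
    "control_port": 5004, "hb_port": 5005,
})
endpoints = channel_endpoints(parse_connection_info(sample))
```

Validating the file first means a truncated or stale transfer fails with a clear message before any connection attempt is made.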

public: Machine Learning Toolbox in Scilab (last edited 2017-08-28 17:51:45 by mandar061095@gmail.com)