Scilab function variables: representation, manipulation.

Functions in Scilab are stored as variables on the stack. Apparently, functions are generated by parsing Scilab source text in a process called "compilation", which seems rather to be a translation into a condensed, tokenized internal representation (improperly called pseudocode).

There is no published description of either the parsing of the function text that produces the function pseudocode, or the storage conventions and their implications. It is not clear to me whether this reflects an unstated intention to keep the innermost proprietary details of Scilab deliberately cryptic, or is just a result of Scilab's development history.

Nevertheless, a better understanding of the workings of the parser, and of the way it stores and reads back code data, would benefit any attempt at designing or improving modern Scilab code tools, such as a lexer, a profiler, a debugger, a cross-compiler, a code differentiator, and so on.

What follows are a few personal deductions, for reference.

There are three sorts of functions: uncompiled (type 11), compiled (type 13) and compiled with provisions for profiling (type 13 as well).
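The three sorts can be told apart from the Scilab prompt. The lines below are a minimal sketch, assuming a Scilab 5.x session where deff still accepts the "n", "c" and "p" options; the names f_n, f_c and f_p are made up for illustration.

```scilab
// Sketch for Scilab 5.x; f_n, f_c and f_p are hypothetical example functions.
deff("y=f_n(x)","y=x+1","n");   // "n": kept as program text, not compiled
deff("y=f_c(x)","y=x+1","c");   // "c": compiled
deff("y=f_p(x)","y=x+1","p");   // "p": compiled with provisions for profiling
t=[type(f_n) type(f_c) type(f_p)];
disp(t)                         // the text above predicts 11, 13, 13
```

Since type() alone cannot separate the last two sorts, telling a profilable function from a plain compiled one requires looking inside the pseudocode.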

The storage seems to consist of a header, detailing the function type, the input and output arguments and the size, followed by a function body, stored either as program text (type 11) or as pseudocode (type 13). Some of this information can be conjectured from help save_format, since it is likely that, for economy, binary Scilab files are essentially a dump of the stack structures. Historically, function variables were stripped of the comment text in the function definition; now comments are preserved and stored along with the code.

In all cases, the function body seems to be organized in elementary chunks corresponding to individual code lines. Both breakpointing and profiling operate at that granularity.

Functions compiled for profiling apparently differ from those "just compiled" in that two extra words are added per function line (this can be roughly deduced from the function size reported by who). I gather that one of these words stores a cumulative call count and the other the cumulative time spent. Besides that, profilable functions are compatible with breakpointing (correctly, the time spent waiting for user input at a breakpoint is *not* added to the timing), and the performance impact of making a function profilable seems negligible.
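The per-line counters can be looked at directly. This is a small sketch, assuming Scilab 5.x, where profile() returns one row per function line; g_p is a made-up example function, and the lst(5) entry it displays is the one described further down this page.

```scilab
// Sketch for Scilab 5.x; g_p is a hypothetical example function.
deff("y=g_p(n)",["y=0";"for k=1:n";"  y=y+k;";"end"],"p");  // "p": profilable
g_p(10); g_p(20);     // two calls, so that the counters have something to show
lst=macr2lst(g_p);
disp(lst(5))          // profiling entry attached to the first line
c=profile(g_p);       // one row per line: executions, cpu time, interpreter effort
disp(c)
```

The first two columns of c presumably correspond to the two extra words conjectured above, but that mapping is a guess and not something the Scilab documentation states.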

There are a few Scilab functions that provide some degree of access to the function types and make it possible to manipulate them. These are:

Interfacing with files:

Operating on functions, or producing functions:

For the m2sci suite:

The format of a tlist of type "program" is:

The pair tree2code(macr2tree) used to be at odds with particular syntax constructs, which are gradually being sorted out. Does it support profilable functions (bug 1619)?


Format of lst=macr2lst(foo)

All members of lst are either string vectors or lists (nested, of the same sort)

lst(1)= function name

lst(2)= output arguments names

lst(3)= input argument names

lst(4)= "15" (end of line 1, corresponding to the function header)

lst(5)= ["25" "x" "y"] for profilable functions; for plain compiled functions, the pseudocode begins here
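The layout above can be checked on a small function. The following is a minimal sketch, assuming Scilab 5.x; sd is a made-up two-line function, and nested lists are only flagged, not expanded.

```scilab
// Sketch for Scilab 5.x; sd is a hypothetical example function.
deff("[s,d]=sd(a,b)",["s=a+b";"d=a-b"],"c");
lst=macr2lst(sd);
disp(lst(1))              // function name
disp(lst(2))              // output argument names
disp(lst(3))              // input argument names
for i=4:length(lst)       // from the end of the header line onwards
  if type(lst(i))==10 then
    disp(lst(i))          // string vector: opcode first, then its operands
  else
    disp("(nested list)") // control structures are stored as sublists
  end
end
```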

A partial opcode list

op(1)  meaning
0      deleted operation
1      stackp (i.e. stack put, retained for compatibility with 2.7 and earlier version)
2      stackg (i.e. stack get)
3      string
4      empty matrix
5      allops (i.e. operations)
6      number
7      "for-end" control instruction
8      "if-then-else" control instruction
9      "while-end" control instruction
10     "select-case-end" control instruction
11     "try-catch-end" control instruction
12     pause
13     break
14     abort
15     EOL
16     set line number
17     quit
18     named variable
19     mkindx (make recursive index list: start of a new opcode index structure)
20     functions
21     beginning of rhs
22     set print mode
23     create variable from name
24     create object with type 0
25     profiling information
26     vector of strings
27     funptr variable
28     continue
29     affectation (assignment)
30     expression evaluation short circuiting
31     comment "in multiline matrix definition a=[..." (?)
99     return
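To see which of these opcodes a given function actually uses, the list returned by macr2lst can be walked recursively. The helper below, optally, is hypothetical and only a sketch for Scilab 5.x: it tallies the first string of every entry as-is, so any non-numeric marker found inside a nested list would simply show up under its own key.

```scilab
// Sketch for Scilab 5.x; optally and h are hypothetical names.
function [codes,counts]=optally(lst,start,codes,counts)
  // tally the leading string of every entry of lst from index start on,
  // descending into nested lists
  for i=start:length(lst)
    e=lst(i);
    if type(e)==15 then
      [codes,counts]=optally(e,1,codes,counts);
    elseif type(e)==10 then
      k=find(codes==e(1));
      if isempty(k) then
        codes($+1)=e(1); counts($+1)=1;   // first occurrence of this code
      else
        counts(k)=counts(k)+1;
      end
    end
  end
endfunction

deff("y=h(x)",["y=0";"for k=1:x";"  y=y+k";"end"],"c");
// start at 4 to skip the name and the argument lists
[codes,counts]=optally(macr2lst(h),4,[],[]);
disp([codes string(counts)])
```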


A fast Scilab function for listing all the function variables in a namespace, together with their kind:

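function [flist,compiled,profilable,called]=listfunctions()
  // flist: names of all function variables in the current namespace
  // compiled, profilable: boolean flags, one per entry of flist
  // called: call count recorded for each profilable function, 0 otherwise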
  nam=who("get")'
  called=uint32(zeros(nam)); afun=(called==1); pfun=afun; cfun=pfun;
  for i=1:size(nam,2)
    clear rvar lst;
    // rvar is cleared to avoid function redefinition warning
    // lst (topmost, variable size) is cleared to speed up garbage collection
    execstr("rvar="+nam(i));
    if type(rvar)==11 then afun(i)=%t; end
    if type(rvar)==13 then
      afun(i)=%t; cfun(i)=%t;
      lst=macr2lst(rvar)
      pfun(i)=and(lst(5)(1)=="25")
      if pfun(i) then execstr("called(i)="+lst(5)(2)); end
    end
  end
  flist=nam(afun)
  compiled=cfun(afun)
  profilable=pfun(afun)
  called=called(afun)
endfunction


Tricks for converting functions of one kind into another:

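// compiled -> noncompiled (type 11): regenerate the text, redefine with "n"
funtext=fun2string(foo,"foo")
deff(strsubst(funtext(1),"function ",""),funtext(2:$-1),"n")

// noncompiled -> compiled (type 13)
comp(foo)

// noncompiled -> compiled for profiling
comp(foo,2)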
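A round trip through these conversions might look as follows. This is only a sketch, assuming Scilab 5.x (where comp and noncompiled functions still exist); the body of foo is made up, and foo is cleared before the redefinition merely to avoid the redefinition warning.

```scilab
// Sketch for Scilab 5.x; the body of foo is a hypothetical example.
deff("y=foo(x)","y=2*x","c");     // start from a compiled function
funtext=fun2string(foo,"foo");    // regenerate the function text
clear foo                         // avoids "redefining function" warning
deff(strsubst(funtext(1),"function ",""),funtext(2:$-1),"n");
t1=type(foo);                     // 11 expected: noncompiled
comp(foo);                        // compile in place
t2=type(foo);                     // 13 expected: compiled
disp([t1 t2 foo(3)])
```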

All together in nice function form:

function recompilefunction(funname,kind,force)
  if ~exists("force","local") then force=%f; end
  if ~exists("kind","local") then kind="c"; end
  if ~exists(funname) then
     error("No variable named: "+funname)
  end
  clear fvar funtext tempfun
  execstr("fvar="+funname)
  if ~or(type(fvar)==[11 13]) then
     error(funname+" must be the name of a scilab function variable")
  end
  if type(fvar)==11 then
    oldkind="n"
    // with force set, fall through and redefine the function anyway
    if kind=="n" & ~force then
      warning(funname+" is already noncompiled, nothing to do!")
      return
    end
//can't avoid "Warning: redefining function: fvar", sorry
//    if kind=="c" then comp(fvar); end
//    if kind=="p" then comp(fvar,2); end
//    execstr(funname+"=resume(fvar)")
//or:
    [out,in,funtext]=string(fvar);
    deff("["+strcat(out,",")+"]=tempfun("+strcat(in,",")+")",..
          funtext,kind)
    execstr(funname+"=resume(tempfun)")
  elseif type(fvar)==13 then
    lst=macr2lst(fvar)
    if lst(5)(1)=="25" then oldkind="p"; else oldkind="c"; end
    if kind=="c" & oldkind=="c" & ~force then
      warning(funname+" is already compiled, nothing to do!")
      return
    end
    if kind=="p" & oldkind=="p" & ~force then
      warning(funname+" is already compiled for profiling, nothing to do!")
      return
    end
    funtext=fun2string(lst,"tempfun")
    deff(strsubst(funtext(1),"function ",""),funtext(2:$-1),kind)
    execstr(funname+"=resume(tempfun)")
  end
endfunction

public: Scilab function variables: representation, manipulation (last edited 2011-03-30 16:17:55 by localhost)