SEP 2017: upgrading vectorfind()

Abstract

Like the other search functions, vectorfind() is a key one. However, it raises errors in cases that could be handled gracefully, and above all it is quite limited: it supports neither hypermatrix haystacks, nor needles shorter than the haystack size (see this discussion on users@), nor wildcards in the needle. This SEP aims to remove this needless strictness and to extend vectorfind() with important new features.

Current status: 6.0 help page

Proposed improvements

Avoiding needless errors

--> vectorfind(rand(3,3), [])
at line    21 of function vectorfind ( SCI\modules\elementary_functions\macros\vectorfind.sci line 34 )

vectorfind: Wrong size for input argument #2: Vector expected.

Instead, we propose returning []: when nothing is searched for, nothing is found, so nothing is returned.

--> vectorfind(rand(3,3), [1 2 3 4])
at line    26 of function vectorfind ( SCI\modules\elementary_functions\macros\vectorfind.sci line 39 )

vectorfind: Wrong size for input arguments: Incompatible sizes.

Instead, we propose returning []: a needle longer than the haystack dimension cannot match anywhere.
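
The two proposed behaviors can be emulated today with a thin wrapper. This is only a minimal sketch: vectorfind_lenient() is a hypothetical name, not an existing Scilab function, and it only short-circuits the empty and too-long needle cases before delegating to the current vectorfind().

```scilab
// Hypothetical wrapper, NOT the actual vectorfind() source.
// It returns [] where vectorfind() 6.0 raises an error for an empty
// needle or for a needle longer than the searched dimension.
function ind = vectorfind_lenient(haystack, needle, direction)
    // direction: "r" (search along rows) or "c" (search along columns)
    if direction == "c" then
        len = size(haystack, 1)   // a column has as many components as there are rows
    else
        len = size(haystack, 2)   // a row has as many components as there are columns
    end
    n = size(needle, "*")         // number of components of the needle
    if n == 0 then
        ind = []                  // nothing searched for: nothing found
    elseif n > len then
        ind = []                  // needle too long: it cannot match
    else
        ind = vectorfind(haystack, needle, direction)
    end
endfunction
```

A needle shorter than the searched dimension is still passed to vectorfind() unchanged, so it still triggers the current "Incompatible sizes" error.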

Supporting partial vector as needle

Presently, vectorfind() expects a needle as long as the haystack dimension along which it is searched. A partial vector shorter than this size is not accepted. This limitation has been reported on the mailing list. It should be removed, whether the haystack is a matrix or a hypermatrix. For instance, with the hypermatrix H defined further below (in the section on hypermatrix haystacks), we will have the following:

--> vectorfind(H, [0 0], "r")
 ans  =
   3.   8.   13.   28.   34.

--> vectorfind(H, [0 0], 3)
 ans =
   5.  10.   28.   30.

The values returned are the linear indices of the haystack components at which a match of the needle starts.
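
For a plain matrix searched along its rows, this behavior can be sketched as follows. rowsfind_partial() is a hypothetical helper written for illustration only; it is not the proposed implementation and it does not handle hypermatrices or column-wise searches.

```scilab
// Hypothetical helper for illustration; not part of Scilab.
// Searches a needle, possibly shorter than a row, along the rows of a
// 2D matrix M and returns the linear indices of the components where
// a match starts.
function ind = rowsfind_partial(M, needle)
    [nr, nc] = size(M)
    needle = needle(:).'          // force a row vector
    n = size(needle, "*")
    ind = []
    if n == 0 then
        return                    // nothing searched for: nothing found
    end
    // Loop over start columns first, then over rows, so that the linear
    // indices i + (j-1)*nr are produced in increasing order.
    for j = 1:(nc - n + 1)        // empty range when the needle is too long
        for i = 1:nr
            if and(M(i, j:(j+n-1)) == needle) then
                ind = [ind, i + (j-1)*nr]
            end
        end
    end
endfunction

// First page H(:,:,1,1) of the hypermatrix used in this SEP
M = [1 0 0
     2 2 1];
disp(rowsfind_partial(M, [0 0]))   // match starting at M(1,2), linear index 3
```

The index 3 is also the first value returned by the vectorfind(H, [0 0], "r") example above, since the first page of H holds its first six components.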

Supporting wildcards in the needle

Option indType tuning the type of indices returned as results

Extending the haystack to hypermatrix

Presently, using a hypermatrix as haystack is not documented. However, vectorfind() silently accepts any hypermatrix as haystack. Nowadays, N-dimensional arrays are a basic form of data, and Scilab 6 encodes them natively. vectorfind() should therefore become able to work with them.

Searching along rows or columns of a hypermatrix

Unfortunately, the search is currently not performed properly: only the first page of the hypermatrix is considered:

--> H = [1 0 0 1 2 0 0 0 1 0 2 1 2 2 1 0 1 2   
  >      2 2 1 0 0 2 1 0 2 0 1 2 1 0 0 1 0 0];
--> H = matrix(H, [2 3 3 2])
 H  = 
(:,:,1,1)
   1.   0.   0.
   2.   2.   1.
(:,:,2,1)
   1.   2.   0.
   0.   0.   2.
(:,:,3,1)
   0.   0.   1.
   1.   0.   2.

(:,:,1,2)
   0.   2.   1.
   0.   1.   2.
(:,:,2,2)
   2.   2.   1.
   1.   0.   0.
(:,:,3,2)
   0.   1.   2.
   1.   0.   0.

--> vectorfind(H, [0 1], "c")
 ans  =
   3.

Here, the occurrences in columns #7 (:,1,3,1) and #16 (:,1,3,2) are not detected. vectorfind() will now return [3  7  16] by default (unless the indType option is used; see the section on that option).

Searching along any dimension > 2

vectorfind() should then become able to search for occurrences along any of the haystack dimensions. Presently, only a search along rows with the "r" option or along columns with the "c" option is possible. The direction should be generalized to dimension numbers: 1, 2, ..., ndims(haystack). Using the above haystack, we shall for instance write and get:

--> vectorfind(H, [0 2 0], 3)  // in (1,2,:,1) and (1,1,:,2)
 ans = 
  3.   19.

--> vectorfind(H, [0 0], 4)    // in (2,2,2,:), (1,1,3,:) and (2,2,3,:)
 ans = 
  10.   13.   16.

--> ind2sub(size(H), [3 10 13 16]')
 ans  =
   1.   2.   1.   1.
   2.   2.   2.   1.
   1.   1.   3.   1.
   2.   2.   3.   1.
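
The search along an arbitrary dimension can be sketched with linear-index strides: two consecutive components along dimension d are separated by prod(size(H)(1:d-1)) positions in linear indexing. find_along_dim() below is a hypothetical helper, not the proposed implementation. It is a brute-force loop, it assumes a numeric haystack and no wildcards, and it always returns linear indices of the first matching component (so for a full-length search along columns it returns linear indices rather than column numbers).

```scilab
// Hypothetical helper for illustration; not part of Scilab.
// Returns the linear indices of the components of H where the needle
// starts to match along dimension d.
function ind = find_along_dim(H, needle, d)
    s = size(H)                    // sizes along all dimensions
    n = size(needle, "*")          // number of components of the needle
    ind = []
    if n == 0 then
        return                     // empty needle: nothing found
    end
    if d > length(s) then
        return                     // no such dimension in this haystack
    end
    if n > s(d) then
        return                     // needle longer than the dimension
    end
    // Distance, in linear indices, between two neighbours along dimension d
    stride = 1
    for q = 1:d-1
        stride = stride * s(q)
    end
    offsets = (0:n-1) * stride
    for k = 1:size(H, "*")
        // Position of component #k along dimension d
        pos = modulo(floor((k-1) / stride), s(d)) + 1
        if pos + n - 1 <= s(d) then   // the needle fits from this position
            v = H(k + offsets)
            if and(v(:) == needle(:)) then
                ind = [ind, k]
            end
        end
    end
endfunction

H = [1 0 0 1 2 0 0 0 1 0 2 1 2 2 1 0 1 2
     2 2 1 0 0 2 1 0 2 0 1 2 1 0 0 1 0 0];
H = matrix(H, [2 3 3 2]);
disp(find_along_dim(H, [0 0], 4))   // 10. 13. 16.
disp(find_along_dim(H, [0 0], 3))   // 5. 10. 28. 30.
```

The loop visits components in increasing linear order, so the result is already sorted. A real implementation would more likely vectorize the comparison instead of looping over every component.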

Authors

2017 - Samuel GOUGEON



public: SEP/2017 - vectorfind() upgrade (last edited 2017-08-13 01:03:27 by sgougeon@free.fr)