01.07 report

Over the past few weeks I've run more tests comparing the various patchers. However, after discussing it with my mentor, I've decided to get a version working with bsdiff first and add more algorithms afterwards.

Since the .jar patches were the biggest, I asked the Eclipse developers ([1]) how they handle automatic updates. They said they just resend the whole .jar file, which wasn't quite what I had in mind.

Therefore, I've decided to improve the patch creation algorithm a bit. It turns out that .jar files are just ZIP archives, so instead of applying patchapi/courgette/bsdiff directly to a jar file, I first extract it and then apply the patch creation algorithm recursively to the extracted folder. Afterwards, I zip the generated patches into a "jar patch". If this "jar patch" turns out to be smaller than the patch created by applying patchapi/courgette/bsdiff directly to the jar file, I pick it instead. This has made the patches somewhat smaller, but I used 5.2 vs. 5.3 for the comparison, and the jar files differ quite a lot between those versions, so the improvement wasn't as big as I had expected.
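To make the jar idea a bit more concrete, here is a minimal Python sketch of it (the toolbox itself is not written in Python, and this is not its actual code). The names placeholder_diff, read_entries, make_jar_patch and best_patch are made up for illustration, and placeholder_diff is only a stand-in for a real differ such as bsdiff: it just compresses the new content. Entries deleted from the new jar are not recorded here.

```python
import io
import zipfile
import zlib


def placeholder_diff(old: bytes, new: bytes) -> bytes:
    """Stand-in for bsdiff/courgette/patchapi; NOT a real binary diff.

    Returns an empty patch for identical inputs, otherwise the
    zlib-compressed new content.
    """
    return b"" if old == new else zlib.compress(new)


def read_entries(jar_bytes: bytes) -> dict:
    """Return {entry name: uncompressed content} for a jar (a ZIP archive)."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        return {name: jar.read(name) for name in jar.namelist()}


def make_jar_patch(old_jar: bytes, new_jar: bytes, diff=placeholder_diff) -> bytes:
    """Diff each entry separately and zip the per-entry patches together."""
    old_entries = read_entries(old_jar)
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as patch_zip:
        for name, new_content in read_entries(new_jar).items():
            if old_entries.get(name) == new_content:
                continue  # entry is unchanged, so it needs no patch
            # Entries that are new in this jar are diffed against empty input.
            patch_zip.writestr(name, diff(old_entries.get(name, b""), new_content))
    return out.getvalue()


def best_patch(old_jar: bytes, new_jar: bytes, diff=placeholder_diff):
    """Return ("jar", patch) or ("direct", patch), whichever is smaller."""
    direct = diff(old_jar, new_jar)
    per_entry = make_jar_patch(old_jar, new_jar, diff)
    if len(per_entry) < len(direct):
        return "jar", per_entry
    return "direct", direct
```

A real version would pass an actual binary differ as the diff argument and would also have to store which kind of patch was chosen, so the client knows how to apply it.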

I've started doing more work on the updater toolbox ([2]), where I've run into a few problems. I'm developing on Windows (MSVC compiler), which caused some linking problems, such as:

SCI/modules/dynamic_link/src/scripts/Makefile.incl.mak links DLLs with -MT instead of -MD. -MD links dynamically against the C/C++ runtime, whereas -MT links it statically. Static linking is problematic because each DLL then gets its own copy of the data the C runtime uses. This includes the allocator's data, so freeing memory that was allocated by another DLL (that is, by another instance of the C runtime) leads to heap corruption. Fixed with s/-MT/-MD/.
core is also built as a static lib, which causes similar problems: getSCIpath() returns its value from a global variable, so each instance of core (i.e. each DLL that links with core) stores its own copy of that global. Since the value is never initialized in my instance of core, getSCIpath() returns NULL for me. Functions such as getmodules() rely on getSCIpath(), so I can't use those yet either.
There were also some other problems with core not being linked against everything it should be, which gave linker errors whenever I used getmodules() (more info in my builder_cpp.sce).

Besides the problems I've encountered, I've also made some changes to my initial design. Initially I chose not to use version numbers to decide whether a client is up to date, but md5/sha1 sums instead. This offered some advantages (such as making sure that the client has the official binaries, since otherwise sending binary diffs is useless). However, I've decided to switch back to version numbers (as taken from version.xml). The md5 sums felt pretty unnatural and different from what everyone else is using.

Although version.xml will be removed from Scilab soon, my understanding is that getmoduleversion uses it if it's available (otherwise it just returns the version of Scilab, which is fine). The version has the form major.minor.maintenance.revision, so whenever a dev wants to release a quick fix to one of the modules, that dev increases the revision number of that module (so the version of Scilab will be e.g. 5.4.0.0 and the version of that module will be 5.4.0.1). Ensuring that the client has the official binaries shouldn't be difficult: we could just set a special #define whenever we compile the official ones.
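As a small illustration of the version check, here is a Python sketch that compares two major.minor.maintenance.revision strings component by component. The function names parse_version and module_needs_update are hypothetical and not part of Scilab or of the toolbox.

```python
def parse_version(text: str) -> tuple:
    """Parse 'major.minor.maintenance.revision' into a tuple of four ints."""
    parts = text.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four components, got {text!r}")
    return tuple(int(part) for part in parts)


def module_needs_update(client_version: str, server_version: str) -> bool:
    """True if the server offers a strictly newer version of the module."""
    # Tuples compare left to right, so the revision only matters when
    # major, minor and maintenance are all equal.
    return parse_version(client_version) < parse_version(server_version)
```

With the example from above, module_needs_update("5.4.0.0", "5.4.0.1") is True. Comparing integers rather than strings matters once a component reaches two digits, since "10" sorts before "9" as text.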

Currently, my idea is to spawn the update checker in a separate process and use Boost.Interprocess ([3]), a neat cross-platform library for IPC and the like. The process will just periodically check for updates (if some updater_auto_check setting is enabled), optionally download them (if some updater_auto_download setting is enabled) and then ask the user whether they want to apply the update.
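The decision logic of one pass of such a checker could look roughly like the Python sketch below. It leaves out the separate process and the IPC entirely (no Boost.Interprocess here). The function run_check_cycle, its callback parameters and the returned status strings are assumptions made for illustration; only the two setting names come from the plan above.

```python
def run_check_cycle(settings, check, download, ask_user, apply_update):
    """Run one pass of the background checker and return a status string.

    settings     -- dict that may contain updater_auto_check and
                    updater_auto_download
    check        -- callable returning a list of available updates
    download     -- callable taking that list, returning downloaded files
    ask_user     -- callable taking that list, returning True to apply
    apply_update -- callable taking the downloaded files
    """
    if not settings.get("updater_auto_check", False):
        return "checking disabled"
    updates = check()
    if not updates:
        return "up to date"
    if not settings.get("updater_auto_download", False):
        return "updates available"
    files = download(updates)
    if ask_user(updates):
        apply_update(files)
        return "applied"
    return "downloaded"
```

The real checker would call something like this periodically from its own process and report the status back to Scilab over IPC.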

My current sci gateway includes the functions check_for_updates, download_updates and apply_downloaded_updates, but a few more will be added, as mentioned in the SEP. The binary patching algorithms are shipped as separate DLLs that attach to an updater manager, so patchers can easily be added or removed later.
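The "patchers attach to a manager" idea can be sketched in Python as a simple registry. This only shows the shape of the design, not the toolbox's DLL interface: the class UpdaterManager, its methods and the toy whole_file_patcher are all hypothetical.

```python
class UpdaterManager:
    """Keeps a name -> patcher mapping so patchers can come and go."""

    def __init__(self):
        self._patchers = {}

    def attach(self, name, apply_patch):
        """Register a callable apply_patch(old: bytes, patch: bytes) -> bytes."""
        self._patchers[name] = apply_patch

    def detach(self, name):
        """Remove a patcher; unknown names are ignored."""
        self._patchers.pop(name, None)

    def available(self):
        """Return the names of all attached patchers, sorted."""
        return sorted(self._patchers)

    def apply(self, name, old, patch):
        """Apply a patch with the named patcher and return the new content."""
        if name not in self._patchers:
            raise KeyError(f"no patcher attached for {name!r}")
        return self._patchers[name](old, patch)


def whole_file_patcher(old: bytes, patch: bytes) -> bytes:
    """Toy patcher: the 'patch' is simply the complete new file."""
    return patch
```

Since the manager only knows patchers by name, each patch would need to carry the name of the algorithm that produced it, so the client can pick the matching patcher.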

Before midterm, I'm hoping to finish the download-updates and apply-updates functionality, so we can actually see Scilab update itself from my localhost HTTP server. Hopefully I won't run into problems.

Links

1. Eclipse thread

2. Updater toolbox

3. http://www.boost.org/doc/libs/1_36_0/doc/html/interprocess.html