17/06/2011 - Second report

Although it has been a busy period (finals and moving back to my hometown), I managed to get several things done over these two weeks. First, I added more information to the SEP about the server side of the updater. I then wrote the Python script that generates the DLL/EXE dependency tree, which can be found at [1].

After e-mailing the Courgette developers with some questions and receiving no response, I forwarded my e-mail to the Courgette-dev mailing list. Some of my questions were answered there, and you can find the discussion at [2]. I have also improved the patch generator script [3], so it can now use various binary patching tools rather than just xdelta.
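As a rough illustration of how a patch generator can stay independent of the patching tool, here is a minimal sketch. It is not the actual script from [3]: the `TOOLS` table and the `build_command` and `generate_patch` helpers are hypothetical names, and it assumes the `xdelta3` and `bsdiff` command-line tools are installed and on the PATH. PatchAPI is left out because it is a Windows library rather than a command-line tool.

```python
import subprocess

# Command templates per tool (assumption: both tools are on the PATH).
# xdelta3: -e encodes, -s names the source (old) file.
# bsdiff:  positional arguments are oldfile, newfile, patchfile.
TOOLS = {
    "xdelta3": ["xdelta3", "-e", "-s", "{old}", "{new}", "{patch}"],
    "bsdiff": ["bsdiff", "{old}", "{new}", "{patch}"],
}


def build_command(tool, old, new, patch):
    """Return the argument list that creates `patch` from `old` to `new`."""
    template = TOOLS[tool]
    return [part.format(old=old, new=new, patch=patch) for part in template]


def generate_patch(tool, old, new, patch):
    """Run the selected tool; raises CalledProcessError if it fails."""
    subprocess.run(build_command(tool, old, new, patch), check=True)
```

Supporting another tool then only means adding one entry to the table; the code that walks the two Scilab installations does not need to change.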

Using the improved patch generator script, I wrote a script that generates a spreadsheet (currently a .csv file) for comparing the various algorithms [4]. So far I have compared xdelta, bsdiff and Windows' PatchAPI. I have not yet added Courgette to the list, but I should be able to do so tomorrow.
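The comparison step could look roughly like the sketch below. This is not the script from [4]: the function name `write_comparison`, the column layout and the sample sizes are made up for illustration. It writes one CSV row per file with the patch size for each algorithm, and counts how often each algorithm produced the smallest patch.

```python
import csv
import io
from collections import Counter


def write_comparison(sizes_by_file, out):
    """Write a CSV comparing patch sizes and return per-algorithm win counts.

    sizes_by_file maps a file name to {algorithm name: patch size in bytes}.
    """
    algorithms = sorted({a for sizes in sizes_by_file.values() for a in sizes})
    writer = csv.writer(out)
    writer.writerow(["file", *algorithms, "best"])
    wins = Counter()
    for name, sizes in sorted(sizes_by_file.items()):
        best = min(sizes, key=sizes.get)  # algorithm with the smallest patch
        wins[best] += 1
        writer.writerow([name, *(sizes.get(a, "") for a in algorithms), best])
    return wins


# Hypothetical sample data, not measured results.
sample = {
    "a.dll": {"xdelta": 900, "bsdiff": 700, "patchapi": 600},
    "b.exe": {"xdelta": 300, "bsdiff": 250, "patchapi": 400},
}
buffer = io.StringIO()
wins = write_comparison(sample, buffer)
```

Dividing each win count by the number of files gives the "performed best for N% of the files" figure for that algorithm.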

The results were strongly in favor of PatchAPI. The patches were created from Scilab 5.2.2 to Scilab 5.3.1, Windows x86. PatchAPI performed best for 97.53% of the binary files, bsdiff for 2.3% and xdelta for 0.17%. In total, PatchAPI produced a 62.78MB patch, bsdiff a 69.04MB patch and xdelta a 71.06MB patch. Note that only the binary files were patched (text files and newly created files were ignored). Scilab 5.3.1 contains a total of 163.98MB of binary files, so PatchAPI achieved a compression ratio of around 2.61:1 and bsdiff around 2.38:1. You can find more info on the results at [5].
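The compression ratios follow directly from the totals above: the size of the binary files divided by the size of the patch. A short check, using only the numbers reported here (the variable names are my own):

```python
TOTAL_BINARY_MB = 163.98  # binary files in Scilab 5.3.1
PATCH_SIZES_MB = {"PatchAPI": 62.78, "bsdiff": 69.04, "xdelta": 71.06}


def compression_ratio(original_mb, patch_mb):
    """Original size divided by patch size, i.e. the N in 'N:1'."""
    return original_mb / patch_mb


ratios = {
    name: round(compression_ratio(TOTAL_BINARY_MB, size), 2)
    for name, size in PATCH_SIZES_MB.items()
}
# ratios == {"PatchAPI": 2.61, "bsdiff": 2.38, "xdelta": 2.31}
```

By the same calculation, xdelta comes out at around 2.31:1.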

A Courgette developer claims that Courgette did a better job than PatchAPI for Chromium, so that is what I will test tomorrow. I'll add Courgette to the script that generates the spreadsheet and see how small its patches are compared to those of the other algorithms. Over the next two weeks, I hope to finally decide exactly which binary patching algorithms I'll use for each platform, and whether or not I'll start from Courgette and improve on it. Afterwards, I will try to better understand how Scilab works with modules and external modules, and I will turn my current proof-of-concept module into an external module that works with the provided patching algorithms. I'm using C++ and the "Abstract Object Factory" idiom, so the code that downloads and applies patches is decoupled from the algorithm actually used, which makes it easy to swap algorithms and add new ones.

  1. Dependency tree generators

  2. Chromium-dev thread

  3. Patch creator

  4. Statistical analysis

  5. Results

  6. Draft SEP

public: Contributor - Binary Patch/17.06 (last edited 2011-06-17 22:31:24 by Stefan Mihaila)