Archive for the 'Random Randomness' Category

Google Me!

What does a Google search for my name tell people about me?

  • Obviously, I own this website.
  • I have LinkedIn and Facebook profiles.
  • I don’t participate much on Stack Overflow or Wikipedia.
  • I wrote some articles on GameDev.net, which were reposted around the web, translated to other languages, helped people, and were published in books.
  • Also, I have been participating on the GameDev.net forums for quite a while.
  • I wrote a Master’s Thesis [pdf] in finance in 2007 (namely, market microstructure and game theory).
  • I wrote another Master’s Thesis [pdf, french] in 2007, this time in computer science (GPGPU using CUDA, to be precise). I ended up presenting that work at Calyon (an investment bank) in early 2008.
  • Back in 2006, I wrote a course evaluation system for one of my schools (the Paris School of Economics). It worked, but the code is certainly not something I’m proud of.
  • Back in 2005, I worked on a grid computing framework in C#, NGrid.
  • Also in 2005, I was an intern [pdf, french] at Exalead on bayesian classification of web sites.
  • I play role-playing games and was the main contact for the RPG society at the Ecole Normale Supérieure. I also ran for treasurer [pdf, french] of the entire student association in 2006 (I was not elected).
  • I dabble in Objective Caml randomness and have some thoughts about programming language design.
  • I use Magento.
  • I work for Tangane (and wrote some stuff on their blog).
  • People find it funny when I mention condoms.
  • I used to be a TA in computer science; some of my old papers are still floating around and being linked to.
  • I once played Diplomacy. I’m not a very good player. :(
  • I regularly play Magic: The Gathering. I’m not a very good player either. :(
  • I was present at the Mobility Party 2004, back when I was a developer with int13 (working on Darklaga). I remember playing Half Life 2 for the first time at that event.
  • I signed a petition against Verisign. You sign one petition, and it remains online forever—I have been much more careful about what I do online since then.
  • I wrote small novellas in an even smaller literary society. Actually, I had started writing in college in a (parody) student newspaper [pdf, french].
  • When in school, I worked at the student help desk for Windows users in my spare time.

All of these are actual links you can find in a Google search for “Victor Nicollet”.

If you look deeper (that is, using your brain instead of Google), you could also find:

  • Extatica (page is down, though it’s still present on this list). A C++ 3D game engine I wrote back in 2003. Took the website down after I received a Cease & Desist from Extatica (my google ranking exceeded theirs). Ironically, on the latter page, there’s a link to a website where the DLL for my old engine can be downloaded.
  • N’improtequoi, an improvisational theater group. I am their webmaster, as well as a team member.

I don’t really know which is scarier—that so much information about me is available online, or that nobody seems to care about it? :)

Happy Birthday, Pizza!

A year ago, I ordered a €9 pizza from Speed Rabbit Pizza, a Paris-based pizza delivery chain. A meal was had by me. Verily.

The next evening, I received a text message on my cell phone about a promotional 3-for-2 offer from that very Speed Rabbit Pizza outlet. I had made the mistake of giving them my cell phone number, because they wanted to call me if the pizza delivery man got stuck in a door. I have been receiving such reminder messages every week for an entire year. And since they also had my home address, they could even send me promotional envelopes by mail.

I am quite protective of my personal contact information, and actively avoid doing business with any company that contacts me without my permission, so I have taken my business to Pizza Hut instead.

Total money made by Speed Rabbit pizza from me: €8.53 (without tax)

Total money spent by Speed Rabbit pizza on sending me promotional messages: 52 × €0.07 (text messages) + 5 × €0.7 (postal mail) = €7.14 (without tax)

Total money earned by Pizza Hut because Speed Rabbit sent me promotional messages: around €105 (without tax)

Making your blog post look like a Mastercard ad: priceless.
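For the record, the promotional spend quoted above can be recomputed in a few lines of OCaml. This is only a sketch: the variable names are made up for the example, and the figures are the ones from the post (euros, without tax).

```ocaml
(* Figures quoted in the post, in euros (without tax). *)
let text_messages = 52 and text_cost = 0.07
let mailings = 5 and mail_cost = 0.70
let revenue = 8.53

(* 52 text messages plus 5 postal mailings. *)
let promo_cost =
  float_of_int text_messages *. text_cost
  +. float_of_int mailings *. mail_cost

let () =
  Printf.printf "promotional spend: %.2f EUR\n" promo_cost ;
  Printf.printf "left over from the sale: %.2f EUR\n" (revenue -. promo_cost)
```

The sum comes to €7.14, which leaves well under two euros of the original sale.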

Smart Spamming

I found an interesting comment on my website today, for the article on last-minute-skinning of a page in HTML from some Javascript. It looks pretty sane:

CT — October 5, 2009 at 22:15

Interesting stuff. I don’t relish the idea of taking the vile HTML our designers produce and creating the skin files. Nice proof of concept though – I’ll have to keep an eye out for an excuse to use it ; )

This comment, while completely adequate and relevant to the article, is spam. How do I know? First, the provided website is a classic credit-rating-improvement web portal. But should I prevent people who work in the credit spam industry from posting relevant comments on my articles? Well, there are other comments on that article, too, such as:

Tom Milsom — September 8, 2009 at 11:41

Interesting stuff. I don’t relish the idea of taking the vile HTML our designers produce and creating the skin files. Nice proof of concept though – I’ll have to keep an eye out for an excuse to use it ; )

So, it looks like the spam-bot found an earlier comment on the article, copied it verbatim, and posted it with a different link. This would ensure that, if the spam domain is fresh enough not to register as such, the Akismet spam detector would let the comment go through unscathed based on its content alone. And as a human, if I did not pay attention to the author’s website while reviewing comments, I would let it go through as well because the comment would look sane. I don’t remember comments from one month ago, and I guess many people don’t.
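A check for this trick could look roughly like the sketch below. It is not what Akismet does; the comment record, its field names and the normalisation rule are all assumptions made for the example. The idea is simply to flag a new comment whose body repeats an earlier comment on the same article while pointing to a different link.

```ocaml
(* Hypothetical comment record; the field names are illustrative. *)
type comment = { author : string; url : string; body : string }

(* Ignore case and collapse runs of spaces, so that trivial edits do not
   hide a copied body. *)
let normalise body =
  String.lowercase_ascii body
  |> String.split_on_char ' '
  |> List.filter (fun word -> word <> "")
  |> String.concat " "

(* A fresh comment looks copied when an earlier comment has the same
   normalised body but a different link. *)
let looks_copied earlier fresh =
  List.exists
    (fun c -> normalise c.body = normalise fresh.body && c.url <> fresh.url)
    earlier
```

A comment flagged this way would still deserve a human look, since people do occasionally repost their own words under a new address.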

Everyone enjoys advertising if they are looking for, or otherwise interested in, the product being advertised. I discovered Cushy CMS because it ran an ad on The Daily WTF, and I am quite happy with the discovery because I was looking for such a product. And nobody enjoys advertising for products they don’t need—I don’t give a cheese about US credit ratings. I have limited space on my screen that I’d rather not fill up with advertising about things I do not need, and my time is even more precious than that.

This spam comment blurs the line between spam comments that are irrelevant to the discussion and point to websites irrelevant to the readers, and ham comments that are relevant to the discussion and point to websites that are relevant to the readers (by virtue of usually being run by the author of the comment and thus sharing at least some elements).

Suppose that tomorrow, someone posts an original and interesting comment on one of my articles, yet links it to a credit rating website. Should I accept the comment as such, block it, or publish it without the link?

One of the main reasons why people comment on the blogs of other people is to improve their visibility on the internet. If I post a comment on a well-known blog, hundreds and thousands of people will browse over that comment, a small percentage of these will find my writing worthy enough to follow the link and end up on my blog, and an even smaller percentage will become regulars, posting comments and subscribing to my feeds. Which is good, of course, because the more comments I get on my blog, the more interesting it becomes.

This means that commenting is often quite similar to advertising one’s own blog or website. People allow commercial advertising on their blogs (ad banners and such) to get money in return, and they allow personal blog/website advertising on their blogs to get comments in return. So, I guess if an irrelevant website was linked to by a genuinely interesting comment, I would publish that comment (of course, restrictions do apply: I would not allow all websites, just like I would not allow all ad banners).

I like the blogs with good comment advertising—where I can browse the comments and find links to interesting websites.

Now we know why…

…the stock market crashed.

[image: failcac]

It’s The Fear

Today, I ran rm -rf *. As root. On a production server.

The problem is not that I lost any important data—my surgical strike removed precisely the outdated files that had to be erased, and nothing else.

The problem is that I did it unconsciously. I did not stop to check that I was doing the right thing, which means that had my pwd been off by an inode or two I would have blasted important files away instead of junk. And would have spent the rest of the evening and night restoring them. And when I noticed this, moments after doing it, I went into a short-lived panic as I checked everything after the deed. Those were my two seconds of fear for the week, I guess.

Being a system administrator is all about being permanently afraid of the next thing that will happen. When you don’t mistype rm -rf *~ as rm -rf * ~ or overwrite original data with an incorrect tab-completed pipe-to-file, you end up with security holes that can and will be exploited at some point.
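One way to take some of the fear out of the rm -rf *~ case is to skip the shell glob entirely and show what is about to be deleted first. The sketch below is only an illustration: the function names are hypothetical, and it handles plain files in a single directory, nothing recursive.

```ocaml
(* Names in [dir] that end with '~' (editor backup files). *)
let backup_files dir =
  Sys.readdir dir
  |> Array.to_list
  |> List.filter (fun name ->
       let n = String.length name in
       n > 0 && name.[n - 1] = '~')

(* List the files that would be deleted and only proceed on an explicit "y". *)
let remove_backups dir =
  let files = backup_files dir in
  Printf.printf "in %s, about to delete:\n" dir ;
  List.iter (Printf.printf "  %s\n") files ;
  print_string "proceed? [yN] " ;
  if read_line () = "y" then
    List.iter (fun name -> Sys.remove (Filename.concat dir name)) files
```

Since the file names never pass through a shell, a stray space cannot turn the pattern into "everything".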

Did you know that changing a user password directly in the mysql.user table only takes effect once the grant tables are reloaded (FLUSH PRIVILEGES, or the next MySQL restart)? And that if the password is not a valid hash, MySQL assumes no password is required? Not all security holes appear right away.

Nice One

void User::DeleteAccount()
{
  // You cannot delete accounts. It's illegal, we must keep a trace of every
  // account on the system. Use User::DisableAccount() instead. 

  // Why is there a DeleteAccount() function then? Because once upon a
  // time, when there was no DeleteAccount() function, a smartass thought
  // "hey, they forgot to write a DeleteAccount()" and promptly wrote it
  // himself. 

  // So, this function remained here as a warning for you: you obviously
  // didn't get the "you cannot delete accounts" memo, because you came
  // looking for a function to do just that. Do not try to delete
  // accounts, my friend. Do not stray from the righteous path. Disable
  // the accounts instead.

  assert (false);
}

Scanners, Again

I’ve already ranted about my document scanner suite. I have recently updated it to add new features.

The basic workflow goes like this:

  • You run the “scan” command. This usually happens by clicking the desktop icon for the launcher, but you can also run it on a command line.
  • The program prompts you for a document name. Aside from being different from any existing document name (to avoid accidental overwriting) you are free to choose any valid file name.
  • The program starts scanning pages. Every time a page is scanned, a preview is shown and you can accept it or try again. Every time a page is accepted, you can scan another page or stop scanning.
  • Every scanned page is saved to TIFF on the fly. Once all pages have been retrieved, they are converted to PNM, then to DJVU. This conversion step takes around two minutes per page on my computer. Then, all DJVU files are bundled together as a single file.
  • The bundled DJVU is stored both locally and on a backup server through FTP.

Once the manual scan-preview-confirm process has ended, the lengthy compression and upload stage starts, but is completely non-interactive. It is therefore possible to start scanning another document (or do something else) while it finishes.

I have also reduced the resolution from 300 dpi to 150 dpi, as it remains quite readable. This has resulted in a reduction in file size from around 8MiB PNG files to 2MiB TIFF files, which are in turn compressed to 1MiB DJVU files. My current library of scanned pages (mostly administrative documents, reports and contracts) weighs in at around 150MiB instead of the previous 1.1GiB.
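As a rough check of the savings, the ratios can be worked out from the figures quoted above. The names below are made up for the example, and the sizes are the approximate ones from the post.

```ocaml
let mib_per_gib = 1024.0

(* Approximate sizes quoted in the post, in MiB. *)
let old_page = 8.0                        (* 300 dpi PNG page *)
let new_page = 1.0                        (* 150 dpi DjVu page *)
let old_library = 1.1 *. mib_per_gib      (* previous library, 1.1GiB *)
let new_library = 150.0                   (* current library *)

let () =
  Printf.printf "per page: %.1fx smaller\n" (old_page /. new_page) ;
  Printf.printf "library:  %.1fx smaller\n" (old_library /. new_library)
```

Both ratios land between seven and eight, so the per-page and whole-library figures agree with each other.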

Below is a scan of Papier d’Arménie made by my delightful assistant:

The Objective Caml source code for running this little baby follows below:

exception CommandFailed of int

let run command =
  print_endline command ;
  let result = Sys.command command in
    if result <> 0 then raise (CommandFailed result)

let ask request =
  print_endline ( "# " ^ request ) ;
  read_line ()

let tmp ext =
  Filename.temp_file "" ext

let say format =
  Printf.printf ("# " ^^ format)

(* Scan a page, display the result, ask if the user wants to keep it
   (tries again until it gets the scan right) and returns the filename
   where the successful scan was saved. *)
let rec scan_to_tiff () =
  let file = tmp ".tiff" in
    run ("scanimage -l 0 -t 0 -x 215 -y 297 --brightness -22 "
         ^ "--contrast 22 --resolution 150 --progress --mode Gray "
         ^ "--format=tiff > " ^ file) ;
    run ("display " ^ file) ;
    if ask "keep this page? [Yn]" <> "n" then
      file
    else
      scan_to_tiff ()

(* Scan individual pages (using scan_to_tiff) until the user decides to
   stop. If an individual scan fails due to system errors, allows retrying.
   Returns the list of all filenames the user agreed with. *)
let rec scan_list_to_tiff () =
  try
    let file = scan_to_tiff () in
      if ask "scan another page? [Yn]" <> "n" then
        file :: scan_list_to_tiff ()
      else
        [file]
  with CommandFailed i ->
    say "command failed with exit code %d\n" i ;
    if ask "try again? [Yn]" <> "n" then
      scan_list_to_tiff ()
    else
      []

(* Turn individual image into djvu image. Returns djvu filename
   if successful. *)
let rec tiff_to_djvu file =
  let pnm = tmp ".ppm" in
  let djvu = tmp ".djvu" in
    run ( "convert " ^ file ^ " " ^ pnm ) ;
    run ( "cpaldjvu " ^ pnm ^ " " ^ djvu ) ;
    djvu

(* Turn a set of images into individual djvu pages. Allow skipping
   or retrying on error during the conversion process. *)
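let rec tiff_list_to_djvu_list = function
  | [] -> []
  | file :: list ->
    (* Convert the head on its own, so that retrying a failed page does
       not reconvert the pages after it. *)
    let converted =
      try Some (tiff_to_djvu file)
      with CommandFailed i ->
        say "command failed with exit code %d\n" i ;
        None
    in
      match converted with
      | Some djvu -> djvu :: tiff_list_to_djvu_list list
      | None ->
        if ask "try again? [Yn]" <> "n" then
          tiff_list_to_djvu_list (file :: list)
        else
          tiff_list_to_djvu_list list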

(* Turn a list of individual djvu files into a bundled djvu file. *)
let rec make_djvu_bundle file list =
  try
    if  list = [] then
      false
    else if List.tl list = [] then
      ( run ( "cp " ^ List.hd list ^ " " ^ file ) ; true )
    else
      ( run ( "djvm " ^ file ^ " " ^ String.concat " " list) ; true )
  with CommandFailed i ->
    say "command failed with exit code %d\n" i ;
    if ask "try again? [Yn]" <> "n" then
      make_djvu_bundle file list
    else
     ( say "scan aborted" ; false )

(* Choose a name for the output djvu file *)
let rec choose_djvu_filename () =
  let path = "/home/arkadir/docs/" in
  let name = ask "document name (extension will be added automatically) ?" in
    if name = "" || name <> Filename.basename name then
      ( say "incorrect filename" ; choose_djvu_filename () )
    else if Sys.file_exists (Filename.concat path (name ^ ".djvu")) then
      ( say "file already exists" ; choose_djvu_filename () )
    else
      Filename.concat path (name ^ ".djvu")

(* Upload a file to an ftp server. *)
let rec upload_file file =
  try
    run ( "ncftpput -f /home/arkadir/docs/ftp.cfg /home/www/blog/docs " ^ file )
  with CommandFailed i ->
    say "command failed with exit code %d\n" i  ;
    if ask "try again? [Yn]" <> "n" then
      upload_file file
    else
      say "upload aborted"

(* Complete process *)
let _ =
  let name = choose_djvu_filename () in
  let files = tiff_list_to_djvu_list (scan_list_to_tiff ()) in
    if make_djvu_bundle name files then
      upload_file name

This requires the classic DjVuLibre utilities to be installed (cpaldjvu and djvm), as well as ImageMagick (convert) and ncftp (ncftpput). Scanning happens with SANE (scanimage). Some files are also uploaded to my web server, where I use “convert -thumbnail” to create thumbnails from DJVU files.
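The thumbnail step could be scripted along the same lines. This is a sketch under assumptions, not the command that runs on the server: the 200x200 bounding box and the output file name are invented, the helper names are hypothetical, and it presumes an ImageMagick build that can read DjVu. The [0] suffix asks convert for the first page only.

```ocaml
(* Build the ImageMagick command that renders the first page of a DjVu
   file as a thumbnail. The 200x200 box is an assumed size. *)
let thumbnail_command djvu png =
  Printf.sprintf "convert %s -thumbnail 200x200 %s"
    (Filename.quote (djvu ^ "[0]")) (Filename.quote png)

(* Write the thumbnail next to the source file, with a .png extension.
   Returns true when convert exits successfully. *)
let make_thumbnail djvu =
  let png = Filename.remove_extension djvu ^ ".png" in
  Sys.command (thumbnail_command djvu png) = 0
```

Quoting both arguments keeps the shell from interpreting the square brackets or any spaces in the document name.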

Scanners!

Every piece of (useful) snail mail I receive is scanned and stored both on my computer and on a remote backup server. The scanner itself cost me around 50€ (it’s a Canon LIDE 50 with which I am quite happy, especially since it is perfectly compatible with the latest SANE libraries on my Ubuntu). To cut down on the time I waste on the process, I have written a short script that interacts with sane (for scanning) and ncftp (for uploading to the backup server) and lets me enter elementary information on the command line.

Here is the code:

let prompt s =
  print_string s ;
  flush stdout

let make_directory dirname =
  let command = "mkdir " ^ dirname in
    0 = Sys.command command

let base_path = "/home/arkadir/docs/"

let scan_command =
  "scanimage -l 0 -t 0 -x 215 -y 297 --brightness -22 --contrast 22 "
  ^ "--resolution 300 --progress --mode Gray --format=tiff 2> /dev/null"

let scan_to_file filename =
  let command = scan_command ^ " | convert tiff:- " ^ filename in
    0 = Sys.command command

let scan_files base =
  let rec aux i =
    let filename = base ^ "/page" ^ string_of_int i ^ ".png" in
      if scan_to_file filename then
        begin
          ignore (Sys.command ("display " ^ filename));
          prompt "Scan successful. Enter any string to continue, nothing to stop. " ;
          let line = read_line () in
            if line <> "" then aux (i+1)
        end
      else
        begin
          prompt "Scan FAILED. Enter any string to retry. " ;
          let line = read_line () in
            if line <> "" then aux i
        end
  in aux 1

let rec upload_files dirname =
  print_endline "Uploading files..." ;
  let command =
    "ncftpput -R -f "^base_path^"ftp.cfg scans "^dirname
  in
    if 0 <> Sys.command command then begin
      prompt "Upload FAILED! Enter any string to retry. " ;
      let line = read_line () in
        if line <> "" then upload_files dirname
    end

let process =
  prompt "Document name: " ;
  let line = read_line () in
    if line = "" then print_endline "No filename entered, aborting."
    else
      let dirname = base_path ^ line in
        if not (make_directory dirname) then
          print_endline ("Could not create directory " ^ dirname ^ ", aborting.")
        else begin
          scan_files dirname;
          upload_files dirname
        end

So far, I’m keeping the data as high-resolution PNG files, which means about 8MiB for every file. I will be moving to the DjVu compression format as soon as possible, and will update my script accordingly.

Jamin-Puech

Jamin-Puech is a French company that designs and sells handbags and jewelry worldwide. Still, people couldn’t buy these handbags online, because the brand had no e-commerce website.

This is where I come in: I was the technical lead of the development team brought together by my employer, Tangane, to build a new e-commerce website for Jamin-Puech. It consists mostly of a custom-skinned Magento website with some additional development in various functionality areas.

The result has been online for a short while now. If you’re interested in handbags:

http://www.jamin-puech.com/eboutique/

Development…

I tend to write code mostly for my own projects (at work, my job consists mostly of understanding third party code, suggesting implementation tactics and gathering requirements), so I get a reasonably free choice of operating systems and development environments.

My basic work environment looks like this:

[screenshot: editor]

This is an xterm running emacs through an SSH connection, showing JavaScript and PHP code for JITBrain in two buffers, with js2-mode, php-mode and global-font-lock-mode enabled. What is shown here is what my laptop screen is able to display—my actual workstation fills two 22″ screens with four buffers of code and a smaller font, and the transparent dark terminal background lets me look at a browser window behind the editor for quick reference.

Why?

The main reason is that I’m used to it. I’m so used to the emacs way of working with code that I actually do counter-productive things when I use other editors: I expect the tab key to move the code to its natural indentation, but most editors just insert a tab, and I routinely save my work with Ctrl-X Ctrl-S, which usually just cuts the text. I can of course work around these limitations (I used to work with Visual C++ a lot when I was younger), but I still don’t have the same training with other IDEs as other developers do, except, for some reason, with the Visual Studio debugger.

The second reason is that xterm-friendly editors are the only editors I can use on both Windows and Linux with the exact same configuration. When I’m on my laptop, I can just SSH to the appropriate server and start hacking away at code. I guess I could use a Windows-based VPN and use a graphical IDE remotely, but in my experience that performs quite poorly.

The third reason is that emacs is designed to be used with only a keyboard. By contrast, it’s certainly possible to use Visual Studio or another graphical IDE with keyboard shortcuts alone, but doing so is an order of magnitude more annoying than the emacs equivalent. The most fundamental thing I do is open source files: this is fairly optimized for mouse users in graphical IDEs (open solution panel, double-click file), but navigating a file tree with only the keyboard is much harder. The basic issue with the mouse is that I don’t have one: my laptop only has a touchpad, and I don’t have the room to carry or use an additional mouse.

And before you point out that this is ultimately a known troll and religious war topic, this is more about my own habits than about whether these habits are better than others. Feel free to discuss your own inferior development environments below ;)