UTF-8 Round Trip for Perl, MySQL and node.js

Doing things the un*x way means keeping a supply of “go-to” tools for the various tasks that spring up during development. For me, that’s a lot of bash, Python and Perl on the dev machine and, more recently, node.js on the server. While acquiring multilingual (Unicode) data for a project, I had to make sure I kept […]

Playing Perl: Counting Occurrences

Ever have a list of phrases and wonder which individual words appear the most? Me too! Here’s a handy Perl command that will get the job done:

perl -F'\t' -lane 'map { $w{$_}++ } split(/ /, $F[0]); END { print qq|$_\t$w{$_}| foreach sort { $w{$b} <=> $w{$a} } keys(%w) }' < INPUT_FILE > OUTPUT_FILE

This assumes: the input contains a TAB separated […]
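
For anyone who finds the one-liner hard to read, here is a rough sketch of the same idea in Python, another of the go-to tools mentioned earlier. It is not the original command, just an illustration under the same assumption that the phrase sits in the first TAB-separated field of each line. The function names and the sample lines are made up for the example.

```python
from collections import Counter


def count_words(lines):
    """Count space-separated words in the first TAB-separated field of each line."""
    counts = Counter()
    for line in lines:
        # Keep only the first field, like $F[0] in the Perl version.
        first_field = line.rstrip("\n").split("\t")[0]
        # Split on single spaces; unlike the Perl one-liner, empty strings
        # produced by repeated spaces are skipped rather than counted.
        counts.update(word for word in first_field.split(" ") if word)
    return counts


def format_counts(counts):
    """Return 'word<TAB>count' strings, most frequent word first."""
    return [f"{word}\t{n}" for word, n in counts.most_common()]


# Hypothetical sample input: a phrase, a TAB, then some other column.
sample = ["red apple\t1\n", "green apple\t2\n", "red red\t3\n"]
for row in format_counts(count_words(sample)):
    print(row)
```

To run it over a real file, pass an open file handle to count_words instead of the sample list; a file object yields lines just like the list does.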