word frequency count
cat {file} | tr '[:upper:]' '[:lower:]' \
| sed 's/[^[:alpha:]][^[:alpha:]]*/\
/g' | sort | uniq -c | sort -rn | more
Note that a newline is explicitly included in the sed replacement via the line break after the backslash, so each word ends up on its own line before sorting.
Thanks to Daniel Fackrell for the inspiration.
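For reuse, the pipeline above can be wrapped in a small shell function. This is only a sketch: the name word_freq is made up here, and the extra sed step that drops empty lines is my addition (without it, a line ending in punctuation contributes an empty "word" to the counts).

```shell
# word_freq FILE: print "count word" pairs, most frequent first.
# word_freq is a hypothetical helper name, not part of the original post.
word_freq() {
  tr '[:upper:]' '[:lower:]' < "$1" |
    # Turn every run of non-letters into a newline (note the literal line break).
    sed 's/[^[:alpha:]][^[:alpha:]]*/\
/g' |
    # Assumption: empty lines left behind by trailing punctuation are unwanted.
    sed '/^$/d' |
    sort | uniq -c | sort -rn
}

# Usage: word_freq somefile.txt | more
```

The numeric flag on the final sort orders by count rather than by the text of the padded count column.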
11 February 2012 at 12:42 pm
cat {file} | tr [:upper:] [:lower:]\ | sed s/[^[:alpha:]][^[:alpha:]]*/\ /g | sort | uniq -c | sort -r | more
I got rid of the typographic apostrophes; use typewriter apostrophes instead. The space before /g is important to separate words.
But it still doesn't work correctly, since it only counts unique lines, not words...
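The difference between the two variants can be seen on a tiny input. The sketch below uses two made-up function names (split_with_space, split_with_newline); both read standard input and differ only in what sed substitutes for the non-letter runs.

```shell
# Hypothetical helpers for comparison only; both read from stdin.

# Replacement is a space: words stay on their original line,
# so sort | uniq -c ends up counting identical lines.
split_with_space() {
  sed 's/[^[:alpha:]][^[:alpha:]]*/ /g' | sort | uniq -c
}

# Replacement is a newline: each word gets its own line,
# so sort | uniq -c counts words.
split_with_newline() {
  sed 's/[^[:alpha:]][^[:alpha:]]*/\
/g' | sort | uniq -c
}

# Usage: printf 'a b a\na b a\n' | split_with_space
#        printf 'a b a\na b a\n' | split_with_newline
```

With the space variant, the two identical input lines are reported as one line seen twice; with the newline variant, the letters are tallied individually.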
11 February 2012 at 12:46 pm
This seems to work:
awk '{gsub(/[^[:alnum:]_[:blank:]]/, "", $0);for (i = 1; i
11 February 2012 at 12:47 pm
Here is one from pravin27:
http://www.unix.com/shell-programming-scripting/156334-word-frequency-sort-printfriendly.html
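For comparison, here is a rough sketch of how an awk-based count along the lines suggested above might look. It is not the commenter's exact command, nor the code from the linked thread: the gsub pattern is taken from the comment, while the function name awk_word_freq, the lowercasing, the field loop, and the final sort are my assumptions.

```shell
# awk_word_freq FILE: print "count word" pairs, most frequent first.
# Hypothetical name; a sketch under the assumptions stated above.
awk_word_freq() {
  awk '{
    $0 = tolower($0)                          # assumption: fold case first
    gsub(/[^[:alnum:]_[:blank:]]/, "", $0)    # strip punctuation (pattern from the comment)
    for (i = 1; i <= NF; i++) count[$i]++     # tally each whitespace-separated field
  }
  END { for (w in count) print count[w], w }' "$1" | sort -rn
}

# Usage: awk_word_freq somefile.txt | more
```

Because punctuation is deleted rather than replaced, "don't" is counted as "dont" and "cat,dog" as "catdog"; digits and underscores are kept as part of words, unlike in the sed version.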