word frequency count

cat {file} | tr '[:upper:]' '[:lower:]' \
| sed 's/[^[:alpha:]][^[:alpha:]]*/\
/g' | sort | uniq -c | sort -rn | more

note that the newline inside the sed replacement is deliberate: the backslash followed by a line break makes sed emit a literal newline, which puts each word on its own line.
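
a minimal sketch of the same count without the embedded line break, assuming a POSIX tr: `tr -cs` turns every run of non-letters into a single newline, which does the job of the sed step. the function name word_freq and the sample sentence are made up for illustration, and the grep step is an addition that drops the empty line produced when the input starts with a non-letter.

```shell
# word_freq: hypothetical helper name, not from the original post.
# Reads text on stdin and prints "count word" lines, most frequent first.
word_freq() {
    tr '[:upper:]' '[:lower:]' |
    tr -cs '[:alpha:]' '\n' |   # squeeze each run of non-letters into one newline
    grep -v '^$' |              # drop the empty line a leading non-letter would leave
    sort | uniq -c | sort -rn   # group identical words, count them, sort by count
}

# sample input, made up for illustration
printf 'The cat saw the other cat. The end.\n' | word_freq
```

sort -rn compares the counts as numbers, so the most frequent word ("the" in the sample) comes out first.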

thanks to Daniel Fackrell for the inspiration

3 Responses to “word frequency count”

  1. äxl Says:

    cat {file} | tr [:upper:] [:lower:]\ | sed s/[^[:alpha:]][^[:alpha:]]*/\ /g | sort | uniq -c | sort -r | more

    I got rid of the typographic apostrophes; use typewriter apostrophes instead. The space before /g is important to separate words.
    But it still doesn't work correctly, since it only counts unique lines, not words ...

  2. äxl Says:

    This seems to work:

    awk '{gsub(/[^[:alnum:]_[:blank:]]/, "", $0);for (i = 1; i

  3. äxl Says:

    Here is one from pravin27:
    http://www.unix.com/shell-programming-scripting/156334-word-frequency-sort-printfriendly.html
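
The awk one-liner in the second response is cut off in this copy, so its loop body is unknown. As a rough sketch of the same idea, assuming the loop walked over every field and tallied it in an array: the function name word_freq_awk, the tolower call, the END block and the final sort are all assumptions, not the commenter's code.

```shell
# word_freq_awk: hypothetical helper name; everything after the gsub call
# is an assumption, since the original awk one-liner is truncated.
word_freq_awk() {
    awk '{
        $0 = tolower($0)                          # fold case so "The" and "the" match
        gsub(/[^[:alnum:]_[:blank:]]/, "", $0)    # same cleanup as the quoted fragment
        for (i = 1; i <= NF; i++) count[$i]++     # tally every whitespace-separated field
    }
    END { for (w in count) print count[w], w }' |
    sort -rn                                      # highest count first
}

# sample input, made up for illustration
printf 'The cat saw the other cat. The end.\n' | word_freq_awk
```

Unlike the sed version, this one deletes punctuation instead of splitting on it, so "don't" becomes "dont" and words joined by a hyphen are counted as one.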
