Bash offers very useful commands for text file analysis. For this example we will use a small part of Leo Tolstoy's War and Peace, available on Project Gutenberg.
This part is placed in the file onepart.txt. To count letters regardless of case, all letters must first be converted to the same case. The tr command translates all lowercase letters to uppercase:
$ cat onepart.txt|tr a-z A-Z
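To see the conversion on a short string without needing the file, here is a minimal sketch. The function name to_upper and the sample text are only illustrations. The POSIX character classes [:lower:] and [:upper:] are an alternative to the a-z A-Z ranges; quoting them keeps the shell from treating the brackets as a filename pattern.

```shell
#!/bin/bash
# to_upper is a hypothetical helper name: it uppercases whatever arrives on stdin.
to_upper() {
    tr '[:lower:]' '[:upper:]'
}

# Sample input, chosen only for illustration.
printf 'War and Peace\n' | to_upper
```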
Next, the output is filtered. The tr command is useful here too: its first argument is the pair of switches -cd and its second is the letter to keep. The switch -c means complement and -d means delete; together they erase every character that is not the letter given in the argument.
Putting both commands in a pipeline we get:
$ cat onepart.txt|tr a-z A-Z|tr -cd A
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Or for B, C, etc.:
$ cat onepart.txt|tr a-z A-Z|tr -cd B
BBBB
$ cat onepart.txt|tr a-z A-Z|tr -cd C
CCCCCCCCCCCCCCCCCCCCCCC
$ cat onepart.txt|tr a-z A-Z|tr -cd D
DDDDDDDDDDDDDDDDDDDDDDDDDD
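The effect of -cd is easier to follow on a short string. The sketch below assumes a hypothetical helper named keep_letter and an arbitrary sample string; neither comes from the text above. Note that tr -cd also deletes spaces, punctuation and newlines, so only the chosen letter is left.

```shell
#!/bin/bash
# keep_letter LETTER TEXT (hypothetical helper)
# Uppercases TEXT, then deletes every character that is not LETTER.
keep_letter() {
    printf '%s' "$2" | tr 'a-z' 'A-Z' | tr -cd "$1"
}

keep_letter A 'Anna Pavlovna'   # prints AAAA, with no trailing newline
echo                            # end the line for readability
keep_letter N 'Anna Pavlovna'   # prints NNN
echo
```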
And of course we need to count those letters. The wc command with the switch -m, which counts characters, does exactly that:
$ cat onepart.txt|tr a-z A-Z|tr -cd A|wc -m
40
$ cat onepart.txt|tr a-z A-Z|tr -cd B|wc -m
4
$ cat onepart.txt|tr a-z A-Z|tr -cd C|wc -m
23
$ cat onepart.txt|tr a-z A-Z|tr -cd D|wc -m
26
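The filter-and-count step can be wrapped in a small function and tried on a short string. This is only a sketch: count_letter is a hypothetical helper name and the sample string is arbitrary. Because tr -cd removes the newline too, wc -m counts just the letters that were kept.

```shell
#!/bin/bash
# count_letter LETTER TEXT (hypothetical helper)
# Prints how many times LETTER occurs in TEXT, ignoring case.
count_letter() {
    printf '%s' "$2" | tr 'a-z' 'A-Z' | tr -cd "$1" | wc -m
}

# Some wc implementations pad the number with leading spaces.
count_letter A 'Anna Pavlovna'   # 4
count_letter N 'Anna Pavlovna'   # 3
```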
And, of course, a script to count the occurrences of all letters:
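#!/bin/bash
# Read the file given as the first argument and convert it to uppercase once.
text=$(tr 'a-z' 'A-Z' < "$1")
echo "Letter occurrences:"
for l in A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
do
    # Keep only the current letter, then count the characters that remain.
    count=$(printf '%s' "$text" | tr -cd "$l" | wc -m)
    echo "$l $count"
done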
The script is a little slow because the whole text is analyzed again for every letter, 26 passes in total.
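The repeated scanning can be avoided by counting all letters in a single pass. The following is a minimal sketch, not a replacement for the script: it assumes the standard fold, sort, uniq and awk utilities, none of which appear in the text above, and letter_histogram is a hypothetical name. Unlike the loop, it does not list letters that never occur.

```shell
#!/bin/bash
# letter_histogram (hypothetical helper): reads text on stdin and prints
# one "LETTER COUNT" line for every letter that occurs at least once.
letter_histogram() {
    tr 'a-z' 'A-Z' |              # same case for all letters
        tr -cd 'A-Z' |            # drop everything that is not a letter
        fold -w 1 |               # put each letter on its own line
        sort |                    # group identical letters together
        uniq -c |                 # count each group
        awk '{ print $2, $1 }'    # reorder to "LETTER COUNT"
}

# Sample input, chosen only for illustration.
printf 'Anna Pavlovna' | letter_histogram
```

For a file, the same function would be fed with a redirect, for example letter_histogram < onepart.txt.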