linux - Probability distribution of each unique number in an array (length unknown) after excluding zeros
A part of my data file looks like this:
ifile.txt
    1
    1
    3
    0
    6
    3
    0
    3
    3
    5
I want to find the probability of each number, excluding zeros, e.g. p(1)=2/8, p(3)=4/8, and so on.
Desired output:
ofile.txt
    1 0.250
    3 0.500
    5 0.125
    6 0.125
where the 1st column shows the unique numbers (except 0) and the 2nd column shows the probability. I am trying the following, but it looks like a lengthy approach. I am facing a problem in the loop, since there are many unique numbers.
    n=$(awk '$1 > 0 {print $0}' ifile.txt | wc -l)
    for i in 1 3 5 6 .....
    do
    n1=$(awk '$1 == $i {print $0}' ifile.txt | wc -l)
    p=$(echo $n1/$n | bc -l)
    printf "%d %.3f\n" "$i $p" >> ofile.txt
    done
Use an associative array in awk to get the count of each unique number in one pass.
    awk '$0 != "0" { count[$0]++; total++ } END { for(i in count) printf("%d %.3f\n", i, count[i]/total) }' ifile.txt | sort -n > ofile.txt
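The same idea can be spelled out against the sample data from the question. This is only a sketch: it assumes one number per line (which the question's use of `$1` and `wc -l` suggests). It also adds two guards that are my own assumptions, not part of the one-liner above: it skips blank lines, and it compares numerically so that values such as `0.0` are excluded too.

```shell
# Recreate the sample input from the question: one number per line.
printf '%s\n' 1 1 3 0 6 3 0 3 3 5 > ifile.txt

# One pass over the file:
#   NF       skips blank lines (assumed guard, not in the original one-liner)
#   $1 != 0  numeric comparison, so 0, 0.0 and 00 are all excluded
awk '
    NF && $1 != 0 { count[$1]++; total++ }
    END {
        # for-in order is unspecified in awk, hence the sort below
        for (v in count)
            printf "%s %.3f\n", v, count[v] / total
    }
' ifile.txt | sort -n > ofile.txt

cat ofile.txt
```

For the sample input this prints the four lines from the desired output: `1 0.250`, `3 0.500`, `5 0.125`, `6 0.125`. If the file contains only zeros, `count` stays empty, the loop body never runs, and there is no division by zero.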