linux - Probability distribution of each unique number in an array (length unknown) after excluding zeros


Part of my data file looks like this:

ifile.txt
1
1
3
0
6
3
0
3
3
5

I want to find the probability of each number, excluding zeros, e.g. p(1)=2/8, p(3)=4/8, and so on.

Desired output:

ofile.txt
1  0.250
3  0.500
5  0.125
6  0.125

Here the first column shows the unique numbers (excluding 0) and the second column shows their probability. I am trying the following, but it seems like a lengthy approach, and I am having trouble with the loop because there are many unique numbers:

n=$(awk '$1 > 0 {print $0}' ifile.txt | wc -l)
for i in 1 3 5 6 .....
do
n1=$(awk '$1 == $i {print $0}' ifile.txt | wc -l)
p=$(echo $n1/$n | bc -l)
printf "%d %.3f\n" "$i $p" >> ofile.txt
done

Use an associative array in awk to get the count of each unique number in a single pass:

awk '$0 != "0" { count[$0]++; total++ }
     END { for(i in count) printf("%d %.3f\n", i, count[i]/total) }' ifile.txt | sort -n > ofile.txt
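For comparison, the loop from the question can also be made to work, although it re-reads the file once per unique value. Two things appear to trip it up: inside single quotes the shell does not expand `$i`, so awk sees its own unset variable `i` and `$i` means the whole line, and `"$i $p"` reaches printf as a single argument rather than two. Below is a minimal sketch of a repaired loop. It assumes one value per line as in the sample, and the variable names `v`, `a`, `b` and `c` are only illustrative. It also swaps `bc` for awk to do the division and derives the list of unique values with `sort -n -u` instead of typing it out.

```shell
#!/bin/sh
# Sample data from the question, one value per line.
printf '%s\n' 1 1 3 0 6 3 0 3 3 5 > ifile.txt

# Number of non-zero values (the denominator of every probability).
n=$(awk '$1 != 0 { c++ } END { print c + 0 }' ifile.txt)

# Start with an empty output file.
: > ofile.txt

# Build the list of unique non-zero values instead of writing it by hand.
for i in $(awk '$1 != 0 { print $1 }' ifile.txt | sort -n -u)
do
    # -v passes the shell value into awk as the awk variable "v".
    n1=$(awk -v v="$i" '$1 == v { c++ } END { print c + 0 }' ifile.txt)
    # Let awk do the division and the three-decimal formatting.
    p=$(awk -v a="$n1" -v b="$n" 'BEGIN { printf "%.3f", a / b }')
    # The value and its probability are two separate printf arguments.
    printf '%s %s\n' "$i" "$p" >> ofile.txt
done

cat ofile.txt
```

With many unique values the single-pass associative-array version is the better fit, since it reads the file only once.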
