shell - To split into fixed sequences and leave extra out
I want to limit the files to the same fixed length, while the last item can be of variable size but not more than 557. This means the number of files can be more than the number determined by the -n flag of the split command.
Code 1 (ok)
$ seq -w 1 1671 > /tmp/k && gsplit -n15 /tmp/k && wc -c xaa && wc -c xao
557 xaa
557 xao
where xaa is the first file of the sequence and xao is the last one. Increasing the sequence by 1 unit causes a 5-unit increase (557 -> 562) in the last file xao, which I do not understand:
$ seq -w 1 1672 > /tmp/k && gsplit -n15 /tmp/k && wc -c xaa && wc -c xao
557 xaa
562 xao
Why does an increase of one unit in the sequence increase the last item (xao) by 5 units?
Code 2
$ seq -w 1 1671 | gsed ':a;N;$!ba;s/\n//g' > /tmp/k && gsplit -n15 /tmp/k && wc -c xaa && wc -c xao
445 xaa
455 xao
$ seq -w 1 1672 | gsed ':a;N;$!ba;s/\n//g' > /tmp/k && gsplit -n15 /tmp/k && wc -c xaa && wc -c xao
445 xaa
459 xao
So increasing the whole length by 1 sequence unit (4 characters) leads to a 4-character increase (455 -> 459), in contrast to the first code, where the increase is 5 characters.
Code 3
Let's keep each unit of the sequence at a fixed width with seq -w 0 0.0001 1 | gsed 's/\.//g':
$ seq -w 0 0.0001 1 | gsed 's/\.//g' | gsed ':a;N;$!ba;s/\n//g' > /tmp/k && gsplit -n15 /tmp/k && wc -c xaa && wc -c xao
3333 xaa
3344 xao
$ seq -w 0 0.0001 1.0001 | gsed 's/\.//g' | gsed ':a;N;$!ba;s/\n//g' > /tmp/k && gsplit -n15 /tmp/k && wc -c xaa && wc -c xao
3334 xaa
3335 xao
So increasing the sequence by 1 unit increases xaa by 1 unit and decreases xao by 9 units. This behavior does not seem logical to me.
How can I limit the sequence length first, for instance fixed at 557, and only later determine the number of files that results?
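Original answer - code 1
Because seq -w 1 1671 generates 5 characters per number: 4 digits and 1 newline. Adding 1 number to the output adds 5 bytes to the output. With 1671 numbers the file holds 8355 = 557 * 15 bytes, so the split is exact; with 1672 numbers it holds 8360 = 557 * 15 + 5 bytes, and the 5 extra bytes all land in the last file.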
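The byte counts behind this can be checked without running split at all. The following is a minimal sketch; the variable names are only illustrative, and the $(( ... )) wrapper is there because some wc implementations pad their output with spaces.

```shell
# Each line of `seq -w` output here is 4 digits plus a newline, i.e. 5 bytes.
bytes_1671=$(( $(seq -w 1 1671 | wc -c) ))
bytes_1672=$(( $(seq -w 1 1672 | wc -c) ))

# Difference between the two inputs: one extra line of 5 bytes.
delta=$((bytes_1672 - bytes_1671))

echo "$bytes_1671 $bytes_1672 $delta"   # prints: 8355 8360 5
```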
Extra answer - code 2
You've asked GNU split (aka gsplit) to split the input file into 15 chunks. It does its best to even the values out. There's a limit to what it can do when the total number of bytes is not a multiple of 15. There are options to control what happens.
However, in its basic form, the -n 15 option means that the first 14 output files each get 445 characters, and the last gets 455 because there are 6685 = 445 * 15 + 10 characters in the output file. When you add 4 characters to the file (because you delete the newlines), the last file gets an additional 4 characters (because 6689 = 445 * 15 + 14).
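The arithmetic above can be written as a small shell function. This is a sketch that only models the sizes seen in the question's output (every file gets total / n bytes and the last file absorbs the remainder); chunk_sizes is a hypothetical helper, not part of split, and other split versions may spread the remainder differently.

```shell
# Hypothetical helper: prints "<size of each of the first n-1 files> <size of last file>"
# under the assumption that the last file absorbs the whole remainder.
chunk_sizes() {
    total=$1
    n=$2
    base=$((total / n))            # integer division
    last=$((base + total % n))     # last file = base + remainder
    echo "$base $last"
}

sizes_1671=$(chunk_sizes 6685 15)
sizes_1672=$(chunk_sizes 6689 15)
echo "$sizes_1671"   # prints: 445 455
echo "$sizes_1672"   # prints: 445 459
```

The same function reproduces code 1 (chunk_sizes 8360 15 gives 557 562), which is why the two cases differ only in how many bytes each added number contributes.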
Extra answer - code 3
First of all, the output of seq -w 0 0.0001 1 looks like:
0.0000
0.0001
0.0002
…
0.9998
0.9999
1.0000
So after the output is edited by the first sed, the numbers 00000 through 10000 are present, 1 per line, with 6 characters per line (including the newline). The second sed eliminates the newlines, again.
There are 50006 bytes in /tmp/k, all on 1 line. That's equal to 15 * 3333 + 11, hence the first output (3333 and 3344). The second variant has 50011 bytes in /tmp/k, which is 15 * 3334 + 1. Hence the difference of one between xaa and xao.
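For the goal stated in the question (fix the chunk length at 557 first and let the number of files follow), one possible approach is to split by size with -b instead of by count with -n. The sketch below is not part of the original answer; the scratch directory and the chunk_ prefix are assumptions made for illustration, and on systems where GNU split is installed as gsplit that name can be substituted.

```shell
# Scratch directory so the chunk files do not clutter the current one.
workdir=$(mktemp -d)

seq -w 1 1672 > "$workdir/k"          # 1672 lines * 5 bytes = 8360 bytes

# -b 557: every output file holds exactly 557 bytes, except possibly the last.
split -b 557 "$workdir/k" "$workdir/chunk_"

# Count the full chunks and drop a trailing partial one to "leave the extra out".
full_chunks=0
for f in "$workdir"/chunk_*; do
    size=$(( $(wc -c < "$f") ))
    if [ "$size" -eq 557 ]; then
        full_chunks=$((full_chunks + 1))
    else
        rm -- "$f"
    fi
done

echo "full 557-byte chunks: $full_chunks"   # prints: full 557-byte chunks: 15
```

Note that splitting by bytes can cut in the middle of a line, just as -n 15 does in the examples above; here 8360 = 557 * 15 + 5, so the 5 leftover bytes form the partial file that is removed.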