The Fast Meme Transform: Convert Audio Into Linux Commands
2018-02-10 - By Robert Elder
Abstract
In recent years the meme economy has grown to have a significant influence on our everyday lives and culture. This influence has non-negligible effects on the measured GDP of the 'real' non-meme economy. Recently, the influence of memes has extended to even affect our political system. Although some might consider memes to be merely a joke not worthy of serious consideration, the significance of memes cannot be under-stated and it is therefore meaningful to explore new ways in which memes can be proliferated.
This article will present a method of applying the Cooley-Tukey Fast Fourier Transform algorithm to convert audio samples from YouTube videos that contain dank memes into small self-contained Linux commands. These self-contained Linux commands describe the coefficients of the discrete Fourier series and are packaged along with a small audio decoder (written in awk) that performs the inverse Fourier transform (along with a few small adjustments) to decode the audio. Expressing several seconds of audio data in terms of standard printable ASCII characters that are simultaneously easily copied, but are also easy for a listener to understand, presents various challenges. Several techniques to improve sound quality and compress the Fourier co-efficients are also discussed to meet these challenges, with an overall satisfactory final outcome. This article is also part of a collection:
- Overlap Add, Overlap Save Visual Explanation
- Endpoint Discontinuities and Spectral Leakage
- Fourier Transform Coefficients Of Real Valued Audio Signals
Triple Collateral Signaling
A meme codec audio too called the Fast Meme Transform has been created which can be used to easily transform audio samples from YouTube videos into Linux commands. The Fast Meme Transform tool itself is a small ISO C99 program with no external dependencies other than the C standard library, so it should be versatile in situations that require rapid meme deployment. For example, you can make use of the 'youtube-dl' tool to quickly download the YouTube video and extract the audio into a standard PCM wav format just like in the example below:
youtube-dl M6PbdJiAK84 -x --audio-format wav -o "tripple.%(ext)s"
Then, pass the wav data through the Fast Meme Transform encoder tool:
./fmt -f tripple.wav -s 4.9 -e 7.9
Here is the output you'll get after running the above commands:
echo -n 'uuuuJC8/CvOoO2/OgO5EGvHLHU:i:a:]<V<3/<4/Dz?3F?_19W19w19U19Z20P20p20[21VOVOdOWnCZ:V:2F:W:6>:_:8I:c<k<1><3/<y20Q20[20v21|21o2G/2H*2D(2?(21]OkOWOmO7E<5X<t<i<b<]D0>D1>D2(DqD4/D5YD6FD7>DaDc?l?{?W?5E9z9fDiDa?k?1=r?2=r?4Y?h?m?[n9z9Q?|?3I?4Br?5I?6>?7}190E9f?P?f?p?4/?5Br?6B5#?7/n8h8a8Z<m<8/D0>D2/DWDgD5>DtD8/Dc?|?o?2/?Ln7h7L7_7a8l8P8WCZ:o:d:p:g:s:b:]<0}<1><f<Q<5/<s<Un6i6v7V71>72(7Q75/7s7UCkCRCwCtC7E5_6f6p6R6L6Un51F5p54E4h4L4b49EJ77/78I8V8z83X8R39z39W39y397Fn80F8f<q40P4G%4C*4<%482F5N*56P56d56Wn8_DfSySmK|KoKp5HF5<*5?(TlTQN4^L87/DhKtKiKcTqTgTLTU6G(6H(78U78[n8b8Z4S%4K(jRjhjL532/533/62l62P62q799^b8Z4S(4K(jQj5/53P532/61v6O%798^a8]4S%4K(jgjsj7M02(80pn7Z80^P82M3M4M6>8i8cKQKmK[j0Xn8h86I8U8b8ZKgKyT{5O%NUN]61V869^y8t8_8[8cKQK_TzT3/T4/N[Nc61l6G^6M7M]9VKmTPTpTgNc61|6G(6H^w8tSRSsSaS]5:*5<*59bN|NoNqNRNyN6^y8m<]DV33vSP4OYS3YS4/S9^R85EuuJ7R7w7s77Y79>8k8W8Qn71X74>75=r76>77/78MP:k:{:3/:5E7U79}80=r8{82MW8R8L:Q:5/:L:Z<k<d<4^d83M4M5M6I87Y88MZ90>9{D3XDRDL26_26v27l272(27QnDiDa?l35R35w35t35[3N%j6Fn?z?f35[35c3N/36P363Xnu8V82Mg8s<f<Q<h<m<7^0M1Md:5X:m:U:8/:]<0=5#<{<d<W<Q<5/<sn7_7bCq:k:d:p:4=r:5>:6I:7Y:[<LnC2FCRChCsC_CaC9I:0Br:o:2>:3>:g:w:tn7fH7/ClCPC4B5#C5BrCLCiCaC]:knGWGwG7IG9>HkHoH2IHgH5}H6>Hin9L9b10q10R11V11{1G(1H%1D*1?*11cG|GPGLGUHlHUH[CoCqC5/CLC_C9F:l:q:h:L:Z<V<d<3/n1:*1?%GdGmGiGbGZHVH1E@n' | hexdump -v -e '/1 "%u\n"' | awk 'function d(f,k,a,b,o,w,i,q,h,j){for(k=32;k<127;k++)o[sprintf("%c", k)]=k;split("e-0,,, 2x,,=1#,=-1,,=2#,=-2,&#,+,,-,.,)#,0,1,2,3,4,5,6,7,8,9,15,;,16,=,=3#,18,0=0,9 3,=-,14,17,%n,BAx,12,13,B3#,@n,43,6*,/8,60,22,1*,4*,4%,42,52,7*,0%,3*,=Ax,B4#,9%,8*,,9(,E8,7%,,8(,8%,9/,2*,e,2%,4(,5*,7(,44,0(,0*,6%,,1/,3(,3%,6#,6/,6(,JJ,9*,5(,x,5%,1%,1(,0/,=4#,,",a,",");split("0,1,2,3,4,5,6,7,8,9,=,e,x,+,-,.,n,;",w,",");q=0;for(g in w)if(o[w[g]]==f)q++;i=f-32+1;if(q>0)printf(f==110?"\n":a[i]);else{h=split(a[i],b,"");for(j=0;j<h;j++)d(o[b[j+1]]);}}d($1)' | awk 'BEGIN{L=2400;N=8192;for(n=0;n<L;n+=1)o[n+1]=0;}{l=split($1,t,"x");for(j=0;j<l;j+=1){split(t[j+1],p,"=");x[j+1]=p[2];f[j+1]=p[1];}M=(NR==60?2400:2*L);for(n=0;n<M;n+=1){R=0;for(j=0;j<l;j+=1){a=cos(2*3.141592*f[j+1]*(n/N)*1.000000);R+=(x[j+1]*a);}if(n<L){R=(R*(n/L))+(o[n+1]*((L-n)/L));R*=2.000000;R=int(R*32767);printf("%04X\n",R==0?0:(R>0?(R>32767?32767:R):(R<-32768?32768:65536+R)));}else if(n<2*L){o[n-L+1]=R;}}}' | awk '{a[i++]=$0}END{while(j<i)print a[j++]}' | xxd -r -p | aplay -q -c 1 -f S16_BE -r 48000
Listen to the audio output of the above command: tripple.wav
The text you see above is a self-contained Linux command that includes all of the audio data from the meme as well as the necessary software to decode it. The final output of the command is raw PCM that needs only to be directed to the system audio driver. As you can see, the above Linux command is rather large, but it is still small enough to easily share through copy and pasting.
Illuminati Confirmation
In situations where the existence of the Illuminati has been confirmed, it is often appropriate to make others aware of this fact by means of brief musical tone:
youtube-dl GRWbIoIR04c -x --audio-format wav -o "confirmed.%(ext)s"
./fmt -f confirmed.wav -e 10 -m 5
you'll get the following output:
echo -n 'UUUUF15M7Y?SrN3a4K5]6ka8K9%|@5V%bM5*16t_*SWZ23aiaR|>31*43C3(6?7Z8T9=7#B=4#13H6Y*GT27]9zC4%5[Z<Z12(E_rZ23a6K8]9m@7?9*<M3H6t*GraiV6}1oyf?9ZBd5YTG]8K9?|Cy[Id3>{Y*GN1qZ23K7]8*7sy*7[?<?E_*SmC6OT8TBZ<d6YN1qa8K9I4Q5@BIJE*Gl*BIM3H8Y*SWa3J29>39Is8*B?<d4V6I2vMS?n8?9ZB?XY*47O4Q2Q^8uI*12v@7Z8?9[T<*G%hoyH2(hlmo8uIf9CwC<2cbI2Q5CEJ<2Q3HSC7(B_kQ9vgYMGHSc8}QMS@<(ECG@UFBf4><2CF1GC9CUBCFB@h@F8uM7M8(Gg[*17_K8@2^bKiC8IN9C9[kN8l><K^bKi(48c28KR46c9k@BICbCBQg*Bl@bCFB@FBChCF^UUUF151M53@{8H4^{8d49?n150H51@151C{7H49M50d52@{7M48d50@151H52@Uj?S>28KR76W7C17H8t*SJ27%|J38V4pV7W5@j?SN8]9O4m*77W1PsXt*SJ38V5pfR90q4P5*njzO7f9P7P8*nXt*74V5u0}1P5Cj(46V5zW4@XYmV6J89>w*<4P5CEI4gY>47(55I3C36>37>76?wMG?G3>A6obV4O7MS(G0?G1C<>75pQ9>E6(E7(G2_3g[O7fR91}2v>G1_2lQV5u0u1JE^bV4(89_1lI_0_1g[Q2I3I5Y9_2C8Q3IsbI3J<s<1Q2vMS(G3W7g[O4mQ9HSW2P4q5?n<9vP1W3P4*n91vP5W6q7?nA6r7@BP1JA2C<2Q3>A3WsBI3P7P8*nFB@270@1GK65K67N68]6^263K65N66]67@267N68k0a71C268@264@265C269J270K72cU17?XZE?S]8a9M70W6P8CBQH4>17H8TETSK7NR|W2W3q5r6cXZEZSN8]9P2P4q5K70CXtT2i(|P2JA5JA7r8*njZGKRA4>A5>A7CXYTG]7]8%39JA1P2@E?Ga8N9P4>A5r6@E*GN8(47%48P8@E(GCE_kN8%47W6CEK9%36H6RA5P7W8CEkW6c28J29oBkgYJ28P7CBC9CB@^BC9CB@^UB@U9oUUU57cUU7}d4O4V5C3(4?5(8?B*<*{H5tmV6C8}*BH2M3V4pV7C5f[?<d3M4?75p>8R90?n5u?Bd2H4M5(h%57pf^7(8(BQ*12H6Jh(57V6g*B*13H4%h>wl[I(hI2Q3*<4g?BM4H5%55c4(8IH3M5Q2?wC9*BpI3MGC5f[IM4zV7J<9>1SCbJ75lImpf9?91c9[H4N3K4o<(8^Rh@75I2I4><5@wC23Q3Is<2>1GMSC<1Q^EK3I9>1G@1G@XCXYN3@EoE@bM4(EJw@X>n' | hexdump -v -e '/1 "%u\n"' | awk 'function d(f,k,a,b,o,w,i,q,h,j){for(k=32;k<127;k++)o[sprintf("%c", k)]=k;split("e-0,,, 2x,,=1#,=-1,,&#,=-2,=2#,+,,-,.,9 3,0,1,2,3,4,5,6,7,8,9,0=0,;,11,=,=/x,)#,%n,22,10,(n,=-/,19,:n,20,%1,(<,Dx,(2,=-3,(1,%2,%7,%A,%<,9(,21,=3#,FF,(7,(A,18,%E,L#,%B,,*2,9@,%G,,?2,RB,>n,?1,e,%8,@9,56,7K8,X?E,N7,@8,O5,,Jn,O6,?A,*A,4C,*E,(9,HG,<3,x,5%6,?76,14,30,%9,,",a,",");split("0,1,2,3,4,5,6,7,8,9,=,e,x,+,-,.,n,;",w,",");q=0;for(g in w)if(o[w[g]]==f)q++;i=f-32+1;if(q>0)printf(f==110?"\n":a[i]);else{h=split(a[i],b,"");for(j=0;j<h;j++)d(o[b[j+1]]);}}d($1)' | awk 'BEGIN{L=2400;N=8192;for(n=0;n<L;n+=1)o[n+1]=0;}{l=split($1,t,"x");for(j=0;j<l;j+=1){split(t[j+1],p,"=");x[j+1]=p[2];f[j+1]=p[1];}M=(NR==200?2400:2*L);for(n=0;n<M;n+=1){R=0;for(j=0;j<l;j+=1){a=cos(2*3.141592*f[j+1]*(n/N)*1.000000);R+=(x[j+1]*a);}if(n<L){R=(R*(n/L))+(o[n+1]*((L-n)/L));R*=5.000000;R=int(R*32767);printf("%04X\n",R==0?0:(R>0?(R>32767?32767:R):(R<-32768?32768:65536+R)));}else if(n<2*L){o[n-L+1]=R;}}}' | awk '{a[i++]=$0}END{while(j<i)print a[j++]}' | xxd -r -p | aplay -q -c 1 -f S16_BE -r 48000
Listen to the audio output of the above command: confirmed.wav
The above command will play the first 10 seconds of the X-files theme music. The -m parameter was added in this case to increase the volume by 5 since it's a bit quiet otherwise.
A Linux Command For Curbing Enthusiasm
For situations that require a Linux command that can highlight moments of unexpected disappointment:
youtube-dl Ag1o3koTLWM -x --audio-format wav -o "curb.%(ext)s"
./fmt -f curb.wav -e 10
you'll get the following output:
echo -n 'AccccGDFUa5k7@<6}N18k5@30F3r(ip2F67@<6}y3H4]P|l5O7*<5^M8@49O6O7zgB8I7[3J6jzbI7>8F<7>8H7V8:40p6:57uP75(77J6M8K9RCa6a9H8[9u6*57*75(76:77T3T4I7B8M9=4#Cy8[9u5u6:57(94I9RC[8IPC*nC:DI8:<8F1PnDV7F4W59O1l4z^k5a6V1f3]4l0O2N63N64%8W83F<7oPD:2L41fw43O0O1:6L63l4*<PCkwUV1H2F41*42*6L63:64Qs*UV0OW63J1%q2*<1F19@CaWU{C(D:nA<4X}(C(D@26*27H3]3*44]6@65:66R8PnUfW44:45:64l7:Sb8b9j1gWnU]3f4O6*67N68g0g1{24O7QSZ3Z4M67b8j0N171Nnd(Sb8^PndZ4o69j0:n88(d(Y%92g0}1*h8F3PYQ34V7p7(d*S*92T2T3b8^Ph3V5H6p5b9j0QS_0g1%htS%Y^8g1I69j0QYo58b0^LnS%YZ4B59^0*nA<5K6R28]8]P5L67:68vtdZ3J5_>8l7F6tSZ1o12T4M18J7%dT2T3:<7Z3T4BiF<6Z2X0BiBmF<5^Z3X0X3zF17B8(Yo|BmJ6_(SMmB54Q35F36]1(d:S(Y%9WnY:92K12K13K15%hrT3K14K34M35J34I36@GS%Y>42M45@nd(S:nqWUa4[5@47F64F65l7v8g0_1:nDf3f4R45O4O6*67M13b9_0I1F43f4O6QA<1b9J0J1>WYF92J0B45%h6[7:54v6*87%d:9L93T2T6Q55u6ut88FS:YK11T2K14Z5X1J12T3R114T5Z6]56I11Z2Z3T4K15I12Z3Z5oiX2JiB5LndFY:9L456{S*YX1f54I4XbR18>PD%d(SB|X3zB8H3V5]8f9O5O7lt84%85Z3K14KmJ7H3H4*68T2Z4T5(cGdRSR143>44{8td*Y(9wh5y6]0v8:d(S*YM|Bm{22*4L45l3l6(ST3T4QDaw44]5l5l6%Y(9Ln6tcnS*YM59^0M61F3Cf81@nS(YI4g(Ck5@26a8F32^9o70_1g2}4I2B3a2H1O1O3*64^8K69j1gWnC(UHrO1Ow63I1%C@D[1]2F169g1_LnDaL41_0_1jLnC]3o69QD%85:86B08J1v4v5K08Q35V7F5w54p5p7v4j0_1jLi0Q36Fm*54*56@84B07>08>26b9_0Qmu4R55:<68F169J69_0:G85*86X9K60:48Wn84%85J5_H0Hr[5F47fP84J6}k7k8H1H2*47f8@84J4^j[0V1:47O1@6L83F<4K5j>8Hr@85v6zM6y1J4^j[2H3]7J4_H1v4X1zM6X0F1iX3zbVW5Lm:55X0I6H2H3H4H5F37V8:4Lmp4*55*h6@cG87%D4{83o61^2J9@C:2WUHL38H9F41O0B05_0g1*nC:DR2L36H7H8^9}0J9*sa2RU[0O0F6w63I0Z%syrQq2y7{CQC4:C5=-4#C6:C7NC8ND6@nCWC4%C5*C6%C8(D0(D1:DLD3(D4:D6*D8RD9:248*24P2ia5L2ma54QD5(D6:2C:2Ua25N226N228V48[49V|Hi%h48R349R3iFh46H48[49=-4#3|Rch47H48H|Q347H48y4Ph47Fh3QC:D:Uk9V1y9f1]2*43pP75JP2L25@28H0[1R33H4HP40]L64I0X_>tC%q4@32[3I1XkWUJ6(CV3J5^%DI0I1B4%DJ0T:cccc<4M5^_(CRD=-4#26*27y8@39u4J9:CRDaW60*61%7t7P81Q9o5B8B9RC:Dk2*<9(q2J8(C%qW78(79B5W455{s*<PsF<0>PC>00Fh4V5[6VP40*41F44p8X3Qi:5w58O0*61M81Fh9O0:61:ccnA<4bN17*C%D%UR24:44(izb(C(D(Uk4ktS(Y@<4N15*C%U:24V0I0o3K4R16}FD(UH1J3M5K6:UI4M5N17@UJ3B4K6g%CJ3XK6:C@52I4XH7[8[9f4@5wmu9Q4Pi:5L5P60{|Q37@5WG80J27B29>30J27>29Qs*40*41p7p8*59*80@n' | hexdump -v -e '/1 "%u\n"' | awk 'function d(f,k,a,b,o,w,i,q,h,j){for(k=32;k<127;k++)o[sprintf("%c", k)]=k;split("e-0,,, 2x,,=1#,=-1,,&#,=-2,=2#,+,,-,.,9 3,0,1,2,3,4,5,6,7,8,9,)#,;,n1,=,%1,/x,=-?,0=0,(1,20,21,=-3,=?,nAn,%3,(<,%<,:1,2(,*1,=3#,(6,9%,(n,E#,90,B1,23,(3,2%,>5,91,>1,:3,,%4,>6,B7,,(2,B6,GA,89,e,(4,>7,n3,51,K7,%2,%6,53,,@1,%5,D*2,L33,C*D,8%,(5,(8,2:,x,*3,I5,%n,50,M7,,",a,",");split("0,1,2,3,4,5,6,7,8,9,=,e,x,+,-,.,n,;",w,",");q=0;for(g in w)if(o[w[g]]==f)q++;i=f-32+1;if(q>0)printf(f==110?"\n":a[i]);else{h=split(a[i],b,"");for(j=0;j<h;j++)d(o[b[j+1]]);}}d($1)' | awk 'BEGIN{L=2400;N=8192;for(n=0;n<L;n+=1)o[n+1]=0;}{l=split($1,t,"x");for(j=0;j<l;j+=1){split(t[j+1],p,"=");x[j+1]=p[2];f[j+1]=p[1];}M=(NR==200?2400:2*L);for(n=0;n<M;n+=1){R=0;for(j=0;j<l;j+=1){a=cos(2*3.141592*f[j+1]*(n/N)*1.000000);R+=(x[j+1]*a);}if(n<L){R=(R*(n/L))+(o[n+1]*((L-n)/L));R*=2.000000;R=int(R*32767);printf("%04X\n",R==0?0:(R>0?(R>32767?32767:R):(R<-32768?32768:65536+R)));}else if(n<2*L){o[n-L+1]=R;}}}' | awk '{a[i++]=$0}END{while(j<i)print a[j++]}' | xxd -r -p | aplay -q -c 1 -f S16_BE -r 48000
Listen to the audio output of the above command: curb.wav
Precision Meme Control Options
Depending on your needs, you may need to choose between high audio quality and large command size versus low audio quality, but small command size. This can be controlled by the '-t' or '--threshold' option. With this parameter, any frequency with a power level contribution lower than the threshold option will be dropped from the output. For example, let's create a command that only keeps frequencies with a power level over 9000:
youtube-dl SiMHTK15Pik -x --audio-format wav -o "9000.%(ext)s"
./fmt -f 9000.wav -s 9.3 -e 12.5 -t 9000
in this case the output is:
echo -n '(((& n' | hexdump -v -e '/1 "%u\n"' | awk 'function d(f,k,a,b,o,w,i,q,h,j){for(k=32;k<127;k++)o[sprintf("%c", k)]=k;split("0=0,,, n,,###,%%%,,&&,,,+,,-,.,,0,1,2,3,4,5,6,7,8,9,,;,,=,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,e,,,,,,,,,,,,,,,,,,,x,,,,,,,",a,",");split("0,1,2,3,4,5,6,7,8,9,=,e,x,+,-,.,n,;",w,",");q=0;for(g in w)if(o[w[g]]==f)q++;i=f-32+1;if(q>0)printf(f==110?"\n":a[i]);else{h=split(a[i],b,"");for(j=0;j<h;j++)d(o[b[j+1]]);}}d($1)' | awk 'BEGIN{L=2400;N=8192;for(n=0;n<L;n+=1)o[n+1]=0;}{l=split($1,t,"x");for(j=0;j<l;j+=1){split(t[j+1],p,"=");x[j+1]=p[2];f[j+1]=p[1];}M=(NR==64?2399:2*L);for(n=0;n<M;n+=1){R=0;for(j=0;j<l;j+=1){a=cos(2*3.141592*f[j+1]*(n/N)*1.000000);R+=(x[j+1]*a);}if(n<L){R=(R*(n/L))+(o[n+1]*((L-n)/L));R*=2.000000;R=int(R*32767);printf("%04X\n",R==0?0:(R>0?(R>32767?32767:R):(R<-32768?32768:65536+R)));}else if(n<2*L){o[n-L+1]=R;}}}' | awk '{a[i++]=$0}END{while(j<i)print a[j++]}' | xxd -r -p | aplay -q -c 1 -f S16_BE -r 48000
Listen to the audio output of the above command: 9000.wav
What 9000? There's no way that can be right, because the command above doesn't actually contain much audio data. The frequency power level threshold of 9000 is simply too high. Let's try again with a power level threshold of 70:
./fmt -f 9000.wav -s 9.3 -e 12.5 -t 70
and the result looks something like this:
echo -n '_____N8*Q1)Q3*Q9*M1)J0*4<)J3&JrJ7)J8&K0(K3&KsK8)4E*5G)5[({5)t3*5c)5j)u0ZvI|506Pi)vI5I6I7&S*500P_O*A)j/1d)V9>O?28ZADE:57&j@g@61Dh(90(1WP29DA(j:g:n26B9&E:i/a(wP29&A)i@vIrmDw(Q*S(69/70)102:103(10|173B05?06U08}09>17?VPaI6@N*G1?17Bl>V>20T21T24FKBoBg>66?89>9rE3Pm/NDQ:nAI4)m)wRNI9=7#M=9#nQ/M@J&G8*G9*XBiFmTN?63>65?66F68U69:O1^O2RO3/O5YO6Dno(<9>03B<F17BVU29FAUaBm>]Tc>t>53B72F78>79Po(d(S&])[5/[6/1N?<>[>14BVT23B26FO?28?EFb?wBNBM?{BgF60?69Z22*S:]D[7:1N*1Q&1M^1J&l3)l5:l6U28?A>E>b?iY2vUWBh>73B75?77/O8fS)])[6&[7/1N>A?E?bF66F74?75:O6?78^O9R280T96)b1I22I23I24I26*b8/3A(3N(3MIJ)3SIc*4A&]7D]8:na3*apasN4)N5I91*Q4I9|M2(M3DM5)J0)K0k08k<&6[&617)6VkwPS)6V(h0Dh2/hph5:h8/h9^6A&6v:6aY6ND6Q=9#6MD6Kko(6d(64r6Sk53Zc(WY{fJ&d*S&]@c@W/50f28f{)t&54D5su/57/g(G0*1<)G4?69>71:O3^O5DO7@O8>79?80?82)3E)4M(c7/c8&W0*W2)W8*501P[2BOBABi>vBaFm?NBo?d?W?{>tB61?80>83>84&t6Z<8(1A>21Fb}i}aY2m?M>K?dBS=7#2]=7#2g(nG6&G8YG9:y*1A/1E>22F25BiFvFm>wTN>M>J>d>45>S>]I85)N6I96fq(X:<3/z/l1&l3:lpl8*V0&V1&V3*V4:VsQ3)Q4I95/Q6/Q9:M0*MpK6Zu(j&g&M7*J0&J3&J5:J7@J9(K2*K3DK5^K7(4A&4Q(4Kf4/|8(9(53:54*u&57/j&6iP6)7/8f57&g(61&63F80B82F84>86Zu:57:109)G0/G1&GrG7&qFhF68B70?71F75B78RO9R280>82B83T84>87?88U89?98B99@A0&A2*E5:E6:b6*3MIu)a7I8pQsK3P5/8*G9(X@y@<6)<9)1A>45BW>50T92T93T94>98>99&A9&E1(b2:b3ZX)z^<5/<6(<7Y2WD2{}tRA3I<&E4*Ernq(X=7#yYz/<6(1E&[4FJ?KUo?c?53T54}uUj>64F91*A2(A3*E3)E4PG8*q&X/y*<3Yz:<s<6?J?K?d>cFWUtF53?55>h>66)A2(A|EpE6fG9*qYXD<3Rz>o}d}SF54?55UuFj>92)A3&AsA6(E4*E8Py/<3%9#z^<6?SU]Y2g?60U61:A9/E0/b3PX:<3^zR[4>45TSTj?gU69?71>72&A8:ErE9:n21)G8(G9YX@y^2J>89B99)A0/A2*E4f15)q=8#X=8#<3>J?K?98)A1&A2/A4:E6Z20(G7&q/X%8#y/<p<6?9rApAsE0)E|E6PG3*G6(G7=8#G8Dq/X&<9BwB91>96>97)A6fV>0&57(G3&G4>71F73B87?88(nW({/tP20)N&M:K/o^d/]:c*50Z679k83k8|687k89)691*n' | hexdump -v -e '/1 "%u\n"' | awk 'function d(f,k,a,b,o,w,i,q,h,j){for(k=32;k<127;k++)o[sprintf("%c", k)]=k;split("03x,,,e- ,,=-,%3#,,=3#,%2#,=2#,+,,-,.,%4#,0,1,2,3,4,5,6,7,8,9,=4#,;,12,=,&2,(2,%5#,30,)2,0=0,=5#,31,*2,11,6#,&3,41,42,Cn,40,38,27,)n,39,%7#,46,/2,:2,19,49,<1,=H,*n,13,,47,%H,LLL,,35,32,48,44,e,&n,59,62,33,58,*6,18,36,,43,4(,<0,5(,5*,52,56,34,37,x,<2,<4,51,5&,@2,,",a,",");split("0,1,2,3,4,5,6,7,8,9,=,e,x,+,-,.,n,;",w,",");q=0;for(g in w)if(o[w[g]]==f)q++;i=f-32+1;if(q>0)printf(f==110?"\n":a[i]);else{h=split(a[i],b,"");for(j=0;j<h;j++)d(o[b[j+1]]);}}d($1)' | awk 'BEGIN{L=2400;N=8192;for(n=0;n<L;n+=1)o[n+1]=0;}{l=split($1,t,"x");for(j=0;j<l;j+=1){split(t[j+1],p,"=");x[j+1]=p[2];f[j+1]=p[1];}M=(NR==64?2399:2*L);for(n=0;n<M;n+=1){R=0;for(j=0;j<l;j+=1){a=cos(2*3.141592*f[j+1]*(n/N)*1.000000);R+=(x[j+1]*a);}if(n<L){R=(R*(n/L))+(o[n+1]*((L-n)/L));R*=2.000000;R=int(R*32767);printf("%04X\n",R==0?0:(R>0?(R>32767?32767:R):(R<-32768?32768:65536+R)));}else if(n<2*L){o[n-L+1]=R;}}}' | awk '{a[i++]=$0}END{while(j<i)print a[j++]}' | xxd -r -p | aplay -q -c 1 -f S16_BE -r 48000
Listen to the audio output of the above command: 9000-70.wav
Let's do some more experimenting and shift the pitch up. We'll also pipe it directly into shell so we don't need to copy paste:
./fmt -f 9000.wav -s 9.3 -e 12.5 -t 70 -c 1.3 | sh
Listen to the audio output of the above command: 9000-high-pitch.wav
Another option is to change the amount of time in the sampling window. If you change the sample interval with the '-p' option you will probably need to adjust the threshold value to get the same size output because the sample window size will affect the relative power levels:
./fmt -f 9000.wav -s 9.3 -e 12.5 -t 200 -p 0.02 | sh
Listen to the audio output of the above command: 9000-short-interval.wav
If you make the window small enough, you'll get to experience all kinds of artifacts and distortion!
HIGH VOLUME WARNING!!! For memes that require volume increases that are high enough to cause distortion, you can use the -m option to increase the volume by a factor of 1000:
./fmt -f 9000.wav -s 9.3 -e 12.5 -t 100 -p 0.02 -m 1000 | sh
Listen to the audio output of the above command: 9000-high-volume.wav
HIGH VOLUME WARNING!!! In order to achieve even higher levels of distortion, you can use the high or low pass filters (high pass in this case):
./fmt -f 9000.wav -s 9.3 -e 12.5 -t 100 -p 0.02 -m 1000 -l h 10 500 | sh
Listen to the audio output of the above command: 9000-high-distortion.wav
Audio Encoding And Decoding
If the thought of running a huge untrusted shell command full of obviously compressed text scares you (like it should), you may rejoice in knowing that the fast meme transform provides you with an option to leave the data uncompressed. To see what the Fourier coefficients look like, you can disable compression with the '-u' flag:
./fmt -f 9000.wav -s 9.3 -e 12.5 -t 70 -u
here is the output of the above command:
echo -en "0=0\n0=0\n0=0\n0=0\n0=0\n0=0\n0=0\n0=0\n0=0\n0=0\n0=0\n0=0\n0=0\n0=0\n0=0\n388=2e-03x391=-2e-03x393=2e-03x399=2e-03x401=-2e-03x410=2e-03x412=-2e-03x413=-3e-03x415=3e-03x417=-2e-03x418=-3e-03x420=3e-03x423=-3e-03x425=2e-03x428=-2e-03x431=2e-03x511=-2e-03x513=3e-03x515=-2e-03x523=2e-03x548=-2e-03x558=-2e-03x560=2e-03x\n34=-3e-03x35=-3e-03x506=-2e-03x\n33=-2e-03x34=-3e-03x35=-3e-03x36=-3e-03x37=-3e-03x46=2e-03x500=-2e-03x\n0=0\n0=0\n0=0\n27=2e-03x30=-2e-03x58=-4e-03x144=-2e-03x199=-3e-03x227=3e-03x228=2e-03x\n30=5e-03x31=4e-03x57=-3e-03x58=-5e-03x59=-5e-03x61=5e-03x62=3e-03x90=3e-03x149=-2e-03x\n29=5e-03x30=3e-03x58=4e-03x59=4e-03x\n26=-2e-03x29=-3e-03x31=4e-03x33=-4e-03x35=3e-03x37=-2e-03x\n29=-3e-03x30=-2e-03x33=-5e-03x34=-3e-03x35=3e-03x36=5e-03x37=3e-03x39=2e-03x46=3e-03x69=-4e-03x70=-2e-03x102=4e-03x103=3e-03x105=-3e-03x173=-2e-03x205=3e-03x206=4e-03x208=-5e-03x209=-3e-03x217=3e-03x219=-2e-03x\n35=-3e-03x36=-5e-03x38=2e-03x111=3e-03x217=-2e-03x218=-3e-03x219=-3e-03x220=-4e-03x221=-4e-03x224=2e-03x242=-2e-03x243=-2e-03x259=-3e-03x266=3e-03x289=-3e-03x295=3e-03x313=-2e-03x\n36=-4e-03x38=5e-03x39=4e-03x\n30=-3e-03x34=-2e-03x36=-2e-03x37=-7e-03x38=-3e-03x39=7e-03x40=9e-03x\n39=-4e-03x40=-5e-03x41=-3e-03x118=2e-03x119=2e-03x121=-2e-03x233=2e-03x236=-4e-03x238=3e-03x263=-3e-03x265=3e-03x266=2e-03x268=4e-03x269=4e-03x271=-6e-03x272=-7e-03x273=-4e-03x275=6e-03x276=5e-03x\n43=3e-03x129=-3e-03x203=-2e-03x212=2e-03x217=-2e-03x219=4e-03x229=2e-03x230=4e-03x235=-2e-03x236=-3e-03x247=-4e-03x248=-3e-03x252=-3e-03x253=-2e-03x272=2e-03x278=-3e-03x279=-2e-03x\n43=3e-03x44=3e-03x46=-3e-03x47=-2e-03x135=-4e-03x136=-4e-03x138=3e-03x212=-3e-03x213=-3e-03x214=-2e-03x219=-4e-03x223=-2e-03x226=2e-03x227=3e-03x228=3e-03x231=2e-03x232=3e-03x237=-2e-03x238=-2e-03x240=3e-03x251=-2e-03x259=2e-03x260=3e-03x269=2e-03x\n22=2e-03x46=4e-03x47=5e-03x137=4e-03x138=2e-03x139=-3e-03x140=-6e-03x141=-3e-03x183=-2e-03x185=4e-03x186=4e-03x228=3e-03x230=-3e-03x231=-3e-03x232=3e-03x233=6e-03x234=4e-03x249=-2e-03x262=-3e-03x273=-2e-03x275=3e-03x277=-4e-03x278=-3e-03x\n46=-2e-03x47=-2e-03x136=-3e-03x137=-4e-03x138=-3e-03x230=3e-03x231=3e-03x232=2e-03x266=2e-03x274=3e-03x275=4e-03x276=3e-03x278=-6e-03x279=-7e-03x280=-4e-03x296=-2e-03x321=-3e-03x322=-3e-03x323=-3e-03x324=-3e-03x326=2e-03x328=-4e-03x330=3e-03x338=3e-03x340=-3e-03x341=-2e-03x346=-3e-03x348=2e-03x430=-3e-03x477=5e-03x478=4e-03x\n353=2e-03x354=3e-03x355=2e-03x384=-2e-03x385=-3e-03x391=2e-03x394=-3e-03x395=-3e-03x402=3e-03x403=5e-03x405=-2e-03x410=-2e-03x420=2e-03x608=2e-03x612=-3e-03x613=-3e-03x617=-2e-03x619=2e-03x637=-2e-03x\n46=-2e-03x619=3e-03x620=5e-03x622=-4e-03x624=3e-03x625=4e-03x628=-4e-03x629=-6e-03x630=-3e-03x634=4e-03x635=6e-03x638=5e-03x639=9e-03x640=5e-03x642=2e-03x643=3e-03x644=3e-03x645=3e-03x646=2e-03x653=2e-03x\n48=3e-03x49=6e-03x51=-3e-03x\n41=-3e-03x44=2e-03x46=-3e-03x47=-5e-03x48=-5e-03x49=-4e-03x50=-3e-03x\n28=-3e-03x\n51=-2e-03x52=-3e-03x54=5e-03x55=2e-03x56=-4e-03x57=-4e-03x59=3e-03x110=2e-03x112=-2e-03x114=3e-03x269=-3e-03x271=4e-03x273=-6e-03x275=5e-03x277=-5e-03x278=-3e-03x279=3e-03x280=3e-03x282=-2e-03x331=-2e-03x440=3e-03x487=-4e-03x488=-3e-03x490=2e-03x492=-2e-03x498=2e-03x501=-2e-03x\n132=-2e-03x227=-2e-03x230=-2e-03x233=-3e-03x234=-2e-03x235=2e-03x236=3e-03x238=-2e-03x243=3e-03x244=3e-03x249=3e-03x251=-3e-03x252=-2e-03x261=3e-03x280=-3e-03x283=-3e-03x284=-3e-03x526=2e-03x\n128=3e-03x130=-3e-03x221=2e-03x232=-5e-03x233=-5e-03x235=6e-03x236=3e-03x240=-3e-03x242=3e-03x244=-2e-03x246=7e-03x247=7e-03x259=3e-03x\n116=-3e-03x118=6e-03x119=4e-03x122=2e-03x130=-4e-03x131=-3e-03x222=2e-03x225=-2e-03x233=2e-03x234=2e-03x236=-3e-03x237=-4e-03x238=-3e-03x240=-3e-03x241=-3e-03x244=-3e-03x245=-3e-03x246=-3e-03x247=-3e-03x385=-2e-03x386=-3e-03x396=-3e-03x\n120=3e-03x121=4e-03x123=-4e-03x124=-4e-03x181=-3e-03x183=4e-03x184=3e-03x188=2e-03x190=-3e-03x191=-3e-03x193=2e-03x194=4e-03x195=2e-03x393=-2e-03x394=-3e-03x395=-4e-03x396=-4e-03x399=4e-03x400=2e-03x404=3e-03x426=2e-03x\n56=3e-03x58=-3e-03x59=-3e-03x407=2e-03x410=-3e-03x413=-3e-03x415=4e-03x417=-5e-03x419=3e-03x422=2e-03x423=5e-03x425=-6e-03x427=3e-03x430=-3e-03x439=3e-03x442=-3e-03x\n4=-4e-03x5=-3e-03x8=3e-03x9=3e-03x53=4e-03x54=2e-03x56=-3e-03x57=-4e-03x58=-3e-03x633=-2e-03x\n6=-2e-03x7=-4e-03x8=-3e-03x\n57=-3e-03x59=3e-03x61=-3e-03x63=2e-03x280=-2e-03x282=2e-03x284=-3e-03x286=2e-03x\n56=4e-03x57=4e-03x109=-2e-03x110=-4e-03x111=-3e-03x115=3e-03x117=-3e-03x120=2e-03x262=2e-03x268=-2e-03x270=3e-03x271=2e-03x275=-2e-03x278=-7e-03x279=-7e-03x280=-3e-03x282=-2e-03x283=-4e-03x284=-3e-03x287=3e-03x288=4e-03x289=3e-03x298=-2e-03x299=-5e-03x300=-3e-03x302=2e-03x315=4e-03x316=4e-03x326=2e-03x340=-3e-03x356=-2e-03x357=-3e-03x384=3e-03x395=2e-03x423=-2e-03x\n5=-4e-03x8=2e-03x119=3e-03x121=-5e-03x122=-5e-03x126=-2e-03x129=-2e-03x130=-3e-03x245=-2e-03x249=-3e-03x250=-4e-03x292=-4e-03x293=-4e-03x294=-3e-03x298=-3e-03x299=-3e-03x309=-3e-03x311=3e-03x322=4e-03x323=2e-03x\n121=-2e-03x124=-6e-03x125=-4e-03x126=3e-03x127=6e-03x249=5e-03x251=-5e-03x252=-7e-03x303=-3e-03x312=-3e-03x314=2e-03x315=3e-03x\n120=3e-03x121=7e-03x122=6e-03x124=-4e-03x126=3e-03x131=-3e-03x134=2e-03x241=3e-03x242=4e-03x243=3e-03x248=3e-03x253=-4e-03x254=-5e-03x256=4e-03x258=-3e-03x264=2e-03x291=2e-03x302=3e-03x303=2e-03x313=-2e-03x314=-2e-03x\n118=2e-03x120=-3e-03x121=-4e-03x122=2e-03x123=6e-03x124=4e-03x125=2e-03x126=3e-03x241=3e-03x242=3e-03x244=-3e-03x248=2e-03x249=4e-03x252=2e-03x253=3e-03x255=-3e-03x262=-3e-03x266=-2e-03x302=3e-03x305=-3e-03x314=3e-03x316=-3e-03x\n119=2e-03x120=6e-03x121=5e-03x123=-7e-03x124=-3e-03x243=-5e-03x244=-5e-03x246=2e-03x254=3e-03x255=4e-03x256=2e-03x258=-3e-03x292=-2e-03x303=-3e-03x305=2e-03x306=3e-03x314=2e-03x318=-2e-03x\n122=-4e-03x123=-9e-03x124=-6e-03x126=3e-03x246=4e-03x247=6e-03x259=3e-03x260=4e-03x261=4e-03x309=-4e-03x310=-4e-03x323=-2e-03x\n121=4e-03x123=-6e-03x124=-7e-03x134=-3e-03x245=-4e-03x246=-4e-03x258=3e-03x259=4e-03x269=3e-03x271=-3e-03x272=-3e-03x308=4e-03x315=3e-03x319=4e-03x\n21=-2e-03x118=3e-03x119=6e-03x121=-5e-03x122=-6e-03x241=-3e-03x289=-2e-03x299=-2e-03x300=-4e-03x302=2e-03x314=-3e-03x\n15=-2e-03x120=8e-03x121=8e-03x123=-3e-03x241=3e-03x242=3e-03x298=-2e-03x301=-3e-03x302=-4e-03x304=4e-03x316=2e-03x\n20=3e-03x117=-3e-03x120=-4e-03x121=-8e-03x122=-4e-03x124=3e-03x126=3e-03x295=3e-03x304=3e-03x305=2e-03x310=-2e-03x315=-3e-03x316=-2e-03x\n113=2e-03x116=3e-03x117=8e-03x118=5e-03x120=-4e-03x121=-3e-03x129=-2e-03x237=-2e-03x291=-3e-03x296=-3e-03x297=-2e-03x306=-3e-03x\n19=-3e-03x20=-3e-03x57=3e-03x113=-3e-03x114=-3e-03x271=2e-03x273=-2e-03x287=3e-03x288=3e-03x\n49=3e-03x51=-4e-03x52=-2e-03x\n20=-2e-03x38=-3e-03x40=4e-03x42=-4e-03x43=-6e-03x44=-4e-03x47=4e-03x48=2e-03x50=2e-03x\n679=2e-03x683=2e-03x685=-3e-03x687=2e-03x689=-2e-03x691=2e-03x\n" | awk 'BEGIN{L=2400;N=8192;for(n=0;n<L;n+=1)o[n+1]=0;}{l=split($1,t,"x");for(j=0;j<l;j+=1){split(t[j+1],p,"=");x[j+1]=p[2];f[j+1]=p[1];}M=(NR==64?2399:2*L);for(n=0;n<M;n+=1){R=0;for(j=0;j<l;j+=1){a=cos(2*3.141592*f[j+1]*(n/N)*1.000000);R+=(x[j+1]*a);}if(n<L){R=(R*(n/L))+(o[n+1]*((L-n)/L));R*=2.000000;R=int(R*32767);printf("%04X\n",R==0?0:(R>0?(R>32767?32767:R):(R<-32768?32768:65536+R)));}else if(n<2*L){o[n-L+1]=R;}}}' | awk '{a[i++]=$0}END{while(j<i)print a[j++]}' | xxd -r -p | aplay -q -c 1 -f S16_BE -r 48000
Listen to the audio output of the above command: 9000-uncompressed.wav
The audio encoding format is a plain text format that uses various delimiters to separate important information. For example, here is a sample of encoded audio data that plays one interval with a 220 hz tone, followed by a 440hz tone, followed by a 110hz tone played at the same time as a 220hz tone:
220=1
440=1
220=1x110=1
In this case, the intervals are separated by a newline, and the individual tones in an interval are separated by an 'x' character. Here is an example of how this information can be fed into the audio decoder to produce actual sounds:
echo -en "220=1\n440=1\n220=1x110=1" | awk '{l=split($1,t,"x");for(j=0;j<l;j+=1){split(t[j+1],p,"=");x[j+1]=p[2];f[j+1]=p[1];}N=48000;r=N/48000;M=N;for(n=0;n<M;n+=r){R=0;for(j=0;j<l;j+=1){a=cos(2*3.141592*f[j+1]*(n/N)*(1/r));R+=(x[j+1]*a);}R=(R>1?1:(R<-1?-1:R));printf("%04X\n", and(0xFFFF, R*0x7FFF));}}' | tac | tac | xxd -r -p | aplay -c 1 -f S16_BE -r 48000
Listen to the audio output of the above command: sample-tone.wav
In the above example, a few of the numerical constants in the audio decoder were set manually to specific values. In practice, they will be different for each audio sample that is produced with the audio codec tool provided here. These constants are things like the sample rate of the audio, and the number of samples produced per interval.
In order to repeat this process for a more complex waveform such as human speech, a Fourier transform is performed to determine what frequency components are most important for representing the signal in any given interval. Once these important frequency components are determined, they are included along with a numerical weighting that describes exactly how important they are. In the previous trivial example, all the weightings were set to 1.
Here is a tiny sample of the encoded audio data for a more complex waveform produced with the audio encoding tool. Note that the decimal weightings are represented in E notation to save space:
0=0
0=0
0=0
147=-2e-03x148=-2e-02x149=-1e-02x150=7e-04
128=-3e-03x129=-1e-02x136=-1e-02x137=-1e-02x138=2e-04
145=-6e-03x146=3e-03x147=9e-03x149=1e-02x150=1e-02x151=1e-03
E notation is used by default to describe the weightings of frequency components in the audio codec tool. A detailed analysis of the audio decoder can be found in the following awk script. This awk script expects E notation frequency weights as shown in the previous example:
awk '
# NOTE: Indices in awk are 1 based! That's why you'll see
# +1 in array accesses everywhere.
# Everything inside the 'BEGIN' section runs once when awk
# starts processing.
BEGIN{
# L represents the number base number of samples
# that occur before any zero padding is performed.
L=2205;
# N is the power of two rounded size of the interval
# that we perform the Fourier transform on to analyse
# our signal. Power of two size is a pre-requisite of
# the fast Fourier transform. Note that, in this case,
# we didn't go for the closest power of two which would
# have been 4096, but instead we chose 8196. This is
# because ensuring that we have at least as many zero
# pads as the base number of samples in our signal
# helps to prevent aliasing due to circular convolution
# of the Fourier transform.
N=8192;
# This for loop is initializing the array that will be
# used to implement the modified version of overlap add.
# We only initialize up to L in this version because it
# this variant of overlap doesn't actually add all of
# the overlap.
for(n=0;n<L;n+=1)
o[n+1]=0;
}
{
# In the next line, 't' becomes an array of frequencies.
# A string like '253=3e-04;-1e-03x254=-1e-03;-2e-04'
# would be split into two strings: '253=3e-04;-1e-03'
# and '254=-1e-03;-2e-04'.
# The variable 'l' is assigned the length of the array
# t, which for the example string above is 2.
l=split($1,t,"x");
# For each of the frequencies (in this case 2 of them)
for(j=0;j<l;j+=1){
# NOTE: This section will sometimes be different
# depending on what options you run the fast
# meme transform tool with.
# Take the frequency and split it up into parts,
# for example with '253=3e-04;-1e-03' you'd have
# parts '253' and '3e-04;-1e-03'. These parts
# go into the 'p' array. In this example, 253
# is the frequency bin number, and '3e-04;-1e-03'
# is the real and imaginary parts for that
# frequency.
split(t[j+1],p,"=");
# The next line is splitting up the '3e-04;-1e-03'
# into the separate real and imaginary parts which
# go into the array 'z'.
split(p[2],z,";");
# Save the real part.
x[j+1]=z[1];
# Save the imaginary part.
y[j+1]=z[2];
# Save the frequency bin number.
f[j+1]=p[1];
}
# This line is assigning to M the upper bound that decides
# how many samples we want to look at for this interval.
# NR == 64, means 'when the current line number is 64', then
# assign 349. Otherwise, review 2 * L samples.
# The case when NR == 64 is special because that is the last
# interval in the audio signal. In order to produce the
# same number of samples as the original signal, we need to
# use this method to remember how many samples to output in
# the very last interval. In all other intervals we output
# L samples. The reason we go up to 2 * L samples is
# only the first L samples we calculate get output for this
# interval. The rest of them are saved for the 'overlap add'
# part of the next interval.
M=(NR==64?349:2*L);
for(n=0;n<M;n+=1){
# We are now about to start performing the inverse
# discrete Fourier transform. Initialize the real
# and imaginary parts:
R=0;
I=0;
# For every relevant frequency
for(j=0;j<l;j+=1){
# Calculate Re(e^(i*2*pi*k*n/N))
a=cos(2*3.141592*f[j+1]*(n/N)*1.000000);
# Calculate Im(e^(i*2*pi*k*n/N))
b=sin(2*3.141592*f[j+1]*(n/N)*1.000000);
# Perform the complex multiply with the
# Fourier coefficient we stored earlier.
R+=(x[j+1]*a)-(y[j+1]*b);
I+=(x[j+1]*b)+(y[j+1]*a);
}
# If we're in the interval where we want output.
if(n<L){
# The is a variant of the 'overlap add'
# that uses a bit of smoothing to
# linearly transition from the signal in
# the last interval to the signal in the
# current interval. Also, this variant
# ignores multi-interval overlap, which
# was found to produce better results
# with large endpoint discontinuities
# that result from very ad-hoc frequency
# domain changes.
R=(R*(n/L))+(o[n+1]*((L-n)/L));
# Volume control is achieved by
# multiplying R here.
R*=2.000000;
# Scale R properly to approximately fit
# the range of a 16 bit signed integer.
R=int(R*32767);
# The ternary expression in the print
# statement does two things: First it
# Prevents 16 bit signed/unsigned overflow
# by capping the integer at either the
# max or min 16 bit integer. Second,
# it converts a negative signed integer
# to a positive unsigned integer with
# the same value to print it with %04X
printf("%04X\n",R==0?0:(R>0?(R>32767?32767:R):(R<-32768?32768:65536+R)));
}else if(n<2*L){
# Save the calculated value for the
# overlap add algorithm to use in the
# next interval.
o[n-L+1]=R;
}
}
}
'
Compression
The terminal environment presents unique challenges when it comes to compressing data: The resulting compressed data must be easily typed (or copied) into the terminal, and one must also be considerate of various control characters that have special meaning to the shell environment itself. Certain characters like '*' can interfere with the expression of the command itself when the shell treats them as control information, but the user's intention is to use them as data. Similar considerations exist for non-printable characters, backticks, single quotes etc. Binary data such as nulls or extended ASCII characters which may not be portable are also off limits.
It can be observed that the information we need to compress consists only of the following characters in all cases: {'0','1','2','3','4','5','6','7','8','9','e','-','=', '.', '+','n','x'}. This means that other printable characters are freely available to express compressed versions of these fundamental characters that represent our data. Furthermore, it was necessary to consider only a method of decompression that could be done quickly and with very little code. The decompressor code must be very small, otherwise any benefit that compression would gain would be defeated by the increased code required to represent the decompressor in a single command. Including the decompressor along with the command is essential since the meme would have little value if it required installing external tools.
The solution that was applied resembles the use of grammar rules in constructing a programming language. For example, the character 'A', which does not appear in the output, can be used to represent the character sequence 'e-0' for a savings of 2 characters for every occurrence (not considering the text added to the decoder). The use of tries was employed to find the most common substrings in the input. Furthermore, this calculation was repeated for a small number of substring lengths in order to greedily calculate the highest benefit string substitution. This process was repeated until all available printable characters were associated with a maximum benefit replacement rule. Once no further substitutions were possible, the compression ends. This algorithm is by no means the most space efficient, but it was well suited to the task since it reduced the total command size by about 50% in most cases without being too complex or slow in decompressing.
Here is a detailed explanation of an example of the de-compression algorithm written in awk:
awk 'function d(f,k,a,b,o,w,i,q,h,j){
# f (integer) - Represents the decimal value of a single ASCII character
# we want to de-compress
# k (integer) - A local variable used for iteration.
# a (array) - A local array used to hold the decompression mappings for
# each ASCII character.
# b (array) - A local array used to the individual characters that our
# decompressed character gets mapped to.
# o (array) - A local array used to hold mappings from standard ASCII
# characters to their equivalent decimal numbers. This is needed since
# awk does not implement an 'ord()' function.
# w (array) - A local array used to hold a list of terminal characters
# that don't expand into other characters, but instead only represent
# themselves.
# i (integer) - A local variable that is used to indicate the index into
# the 'a' array mentioned earlier.
# q (integer) - A local variable used to keep track of terminal characters.
# h (integer) - A local variable used to keep track of the length of the
# list of characters that are currently being decompressed.
# j (integer) - A local variable used in the iteration over the list of
# characters currently being decompressed.
#
# Since awk does not natively have an 'ord()' function that maps characters
# to their decimal equivalents, we create a dictionary to look up these
# values ourselves.
for(k=32;k<127;k++)
o[sprintf("%c", k)]=k;
# This invocation of the split function creates our decompression mapping
# where all possible input characters map to some substituted sequence of
# characters. This decompression of individual characters can happen
# recursively. The specific mapping you see below is just an example, and
# the actual de-compression mapping will be re-calculated for every each
# meme by the audio codec tool.
split("e-0,,, 2x,,#40, 3x,,=-,=2#,(2,+,,-,.,&40,0,1,2,3,4,5,6,7,8,9, 2n,;,=2%,=,*#,0=0,38,=3,*%,(3, 4x,=4,39,=5,?n,21,=2,=1,=7&, 3n,(1,/5,(4,79,%1,A#,(5,(7&,C#,%5,@7,=9&,B5,22,,J:,*:,<5,,%0,@8,26,(6,e,HH,13,42,30,K#,%2,D40,44,,43,@3,)F,3Q,>7,P#,/0,=8&,80,x,41,3>,E#,E&,J&,,",a,",");
# In order to end the recursion, we also maintain an array of terminal
# characters that do not need further decompression.
split("0,1,2,3,4,5,6,7,8,9,=,e,x,+,-,.,n,;",w,",");
# Used to keep track of whether the current character the current input
# character is part of a terminal substitution rule.
q=0;
# For every item in the terminal character array,
for(g in w)
# If the character we were passed is terminal, then increment q.
if(o[w[g]]==f)
q++;
# we subtract 32 from i, since ASCII values 0-31 all decode to control
# characters which we don't care about, so they aren't included in the
# decompression mapping array. The +1 comes from the fact that array
# indices are one based in awk.
i=f-32+1;
# If f represents a terminal character, then print it.
if(q>0)
# If the character is a newline, print a newline, otherwise
# consult the decompression mapping.
printf(f==110?"\n":a[i]);
else{
# Otherwise, was it was non-terminal, recursively de-compress
# everything it maps to:
h=split(a[i],b,"");
for(j=0;j<h;j++)
d(o[b[j]]);
}
}
d($1) # Invoke the decompression on this character
'
Operating System Portability
Different Unix systems have different default methods of interacting with audio drivers. This is why the -k option has been included in the Fast Meme Transform tool to make it easy to create memes that can work of a number of different systems. By default, the meme transform tool assumes that you have aplay installed, and this is usually is by default on Ubuntu. For other platforms, you may be able to use the 'sox' program if you specify an audio driver. If your meme doesn't work at first due to a 'command not found', try again with a different -k option. As of this writing, valid values are currently 0, 1, 2, 3:
./fmt -f 9000.wav -t 99999999 -k 0
./fmt -f 9000.wav -t 99999999 -k 1
./fmt -f 9000.wav -t 99999999 -k 2
./fmt -f 9000.wav -t 99999999 -k 3
The output from the above commands will look something like this:
... | xxd -r -p | sox -q -r 48000 -b 16 -B -e signed -c 1 -t raw - -t `sox -h|grep DRIVERS|sed 's/^.*: \([^ ]\+\).*$/\1/g'`
... | xxd -r -p | aplay -q -c 1 -f S16_BE -r 48000
... | xxd -r -p | sox -q -r 48000 -b 16 -B -e signed -c 1 -t raw - -t coreaudio
... | xxd -r -p | sox -q -r 48000 -b 16 -B -e signed -c 1 -t raw - -t alsa
Mathematical Tricks For Even More Compression
There are two mathematical observations that can give us a bit of compression for free and help to reduce the output size. The first reduction reduces the size by half losslessly, and the second reduction cuts the size in half again, but this time the reduction is lossy. The first observation is related to the fact that our input signal is purely real and this means that the output has symmetries that aren't needed to re-construct the original signal. More precisely, in the frequency domain X(-w) = conjugate(X(w)) when x(n) is purely real. The trick is related to the fact that we can simply drop the imaginary part of the Fourier coefficients only mild distortion of the signal.
These two topics are explored in much greater detail in Fourier Transform Coefficients Of Real Valued Audio Signals.
Endpoint Discontinuities and Spectral Leakage
The effects of endpoint discontinuities and spectral leakage were also explored. Mitigation techniques such as use of the Hann Window were explored. The act of zero padding as is done in the fast Fourier transform has an effect on the spectral leakage which was also explored.
This analysis ended up being quite long, so it was split out into another article specifically on Endpoint Discontinuities and Spectral Leakage.
High Frequency Endpoint Artifacts
One problem that was encountered initially with re-constructing the audio was the presence of high frequency popping sounds from artifacts that occurred around the endpoints of each sampling window in the re-constructed signal. It is quite unpleasant to listen to, and you can enable this as a feature with the fast meme transform tool:
./fmt -f my_file.wav -b
Listen to the audio output of the above command: 9000-popping.wav
It sounds awful!
This problem occurs when filtering is applied in the frequency domain on independent sample intervals without any communication between the intervals once they are reconstructed in the time domain. Since a multiplication in the frequency domain corresponds to convolution in the time domain, it is necessary to communicate some information from the previous sample into the next sample whenever a filter has been applied. The amount of information that needs to be communicated depends on the length of the filter function that is convolved with the input signal. In the trivial case, the filter function could be a simple impulse with value 1 and length 1. In this case, no information would need to be communicated between intervals. This should be intuitive, since multiplying against an impulse filter function in the frequency domain wouldn't change the frequencies at all, and your resulting filtered time domain signal would simply be the same as taking the Fourier transform, them immediately taking the inverse transform. In this case, no extra information from other intervals is necessary.
There are two popular methods to achieve proper filtering in the frequency domain: Overlap Add and Overlap Save. The fast meme transform tool uses a variant of the overlap add algorithm.
The variant of overlap add that was employed in the fast meme transform tool is different from the classical overlap add algorithm in that it only adds in contributions from the previous interval instead of all previous overlapping intervals. In addition, the classical overlap add algorithm will save (len(h) -1) items to be added into values in the next interval, but the version in the meme transform tool only saves up to L values. Finally, since the meme transform tool employs lossy compression and non-multiplicative ad-hoc changes to the frequency domain, the use of pure overlap add or overlap save can still produce endpoint discontinuities. Therefore, a smoothing function is employed: For each interval of length L in the re-constructed signal, the weighting of the output is linearly shifted from considering the overlap audio from the previous interval, to the audio calculated in the current interval. In practice, this completely solved the issue with endpoint discontinuities even in the presence of high amounts of distortion due to dropping frequencies with low power contributions.
Fast Meme Transform Options
Here are some of the command-line options that come with the Fast Meme Transform Tool. Note that if you run the tool without any -f parameter, it will wait for input to come from standard in.
-k, --command-type
A single number that will determine which type of command will be generated to play the final audio. You may need to use this option to generate a command that will play in your operating system without installing additional tools. 0 = use sox to automatically detect audio driver (Tested to work on Ubuntu 16), 1 = use aplay (tested on Ubuntu 16) 2 = sox with coreaudio (Probably works on a Mac, but I don't own one so I can't test this), 3 = sox with alsa.
-h, --help
Display this help menu.
-s, --start-time
The start time in seconds where the output audio clip will begin.
-e, --end-time
The end time in seconds where the output audio clip will end. An end time of zero indicates end of the clip.
-v, --verbose
Be verbose.
-f, --file
The file name of the input PCM .wav file to be processed.
-u, --disable-compression
Disable compression of the output Fourier coefficients.
-w, --enable-hann-window
Enable the application of a Hann window to each audio sample period.
-i, --include-imaginary
Include the imaginary parts of the coefficients of the Fourier transform output.
-a, --full-power-spectrum
Include the entire range of frequencies from the Fourier transform output instead of just including up to N/2.
-m, --volume
A floating point number describing the output volume.
-b, --enable-endpoint-discontinuities
Enable endpoint discontinuities that occur at the ends of sampling intervals (see -p flag). Enable this option if you want the presence of annoying high-frequency popping noises in your audio.
-t, --threshold
A floating point number that describes the cutoff power level, below which any frequency with less than this quantity of power will not be included in the output. If you make this number high, the output will be small, but the audio will sound terrible. If you make this value low, the output will be huge and slow to decode, but the result will sound better. If you make this value large, the output will be small but the sound quality will decrease.
-p, --sample-time-period
A floating point number that describes how much long each sub-sample will be when the audio is broken up into small sections to have a Fourier transform applied to it.
-n, --float-precision
An integer describing how many digits should be printed after the decimal after for each Fourier coefficient that appears in the output.
-l, --filters
A list of length 3*n that one or more sequence of a single character followed by two numbers that describe the parameters of either a low pass or high pass biquad filter. Each biquad filter will be applied one after another on the waveform before the Fourier transform is calculated. For example, using '-l l 5 500' will apply a lowpass filter with a Q value of 5, and a cutoff frequency of 500. The extra character 'l' or 'h' is for low pass and high pass respectively.
-c, --pitch
A positive number that can be used to either shift the frequency up or down. Set this value greater than one to increase the pitch, and set it between 0 and 1 to decrease the pitch.
Conclusion
In this article, it has been shown that you can make use of the Fourier transform to build self-contained Linux commands that play arbitrary audio including human speech. A tool called the Fast Meme Transform that includes many options for meme configuration was presented. A number of different techniques were employed to make the listening and meme distribution experience as good as possible, such as the use of the overlap add algorithm, and ASCII text based compression. Many of the details are not presented in this article, but have instead been moved into articles of their own:
- Overlap Add, Overlap Save Visual Explanation
- Endpoint Discontinuities and Spectral Leakage
- Fourier Transform Coefficients Of Real Valued Audio Signals
A Surprisingly Common Mistake Involving Wildcards & The Find Command
Published 2020-01-21 |
$1.00 CAD |
A Guide to Recording 660FPS Video On A $6 Raspberry Pi Camera
Published 2019-08-01 |
The Most Confusing Grep Mistakes I've Ever Made
Published 2020-11-02 |
Use The 'tail' Command To Monitor Everything
Published 2021-04-08 |
An Overview of How to Do Everything with Raspberry Pi Cameras
Published 2019-05-28 |
An Introduction To Data Science On The Linux Command Line
Published 2019-10-16 |
Using A Piece Of Paper As A Display Terminal - ed Vs. vim
Published 2020-10-05 |
Join My Mailing List Privacy Policy |
Why Bother Subscribing?
|