<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> <HTML><HEAD><TITLE>Perl/Archive-Zip : GyparkWiki</TITLE> <LINK REL="stylesheet" HREF="https://gypark.pe.kr/cgi-bin/wiki/wiki.css"> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8"> <META HTTP-EQUIV="Content-Script-Type" CONTENT="text/javascript"> <link rel="alternate" type="application/rss+xml" title="GyparkWiki" href="http://gypark.pe.kr/cgi-bin/wiki/wiki.pl/action=rss"> <script src="https://gypark.pe.kr/cgi-bin/wiki/wikiscript.js" language="javascript" type="text/javascript" charset="UTF-8"></script> <META NAME='robots' CONTENT='index,follow'/> <link rel="canonical" href="https://gypark.pe.kr/wiki/Perl/Archive-Zip" /> <link rel="shortcut icon" href="https://gypark.pe.kr/favicon.ico" type="image/vnd.microsoft.icon" /> <link rel="pgpkey" type="application/pgp-keys" href="https://gypark.pe.kr/gypark_0xA96AE92C_pub.asc" /> <link rel="stylesheet" media="print" href="https://gypark.pe.kr/cgi-bin/wiki/wiki_print.css" /> <script> <!-- var key = new Array(); key['f'] = "https://gypark.pe.kr/wiki/Diary"; key['i'] = "https://gypark.pe.kr/wiki/action=index"; key['r'] = "https://gypark.pe.kr/wiki/action=rc"; key['l'] = "https://gypark.pe.kr/wiki/action=login&pageid=Perl/Archive-Zip"; key['t'] = "#PAGE_TOP"; key['b'] = "#PAGE_BOTTOM"; key['e'] = "https://gypark.pe.kr/wiki/action=edit&id=Perl/Archive-Zip"; key['v'] = "https://gypark.pe.kr/wiki/action=edit&id=Perl/Archive-Zip&revision="; key['h'] = "https://gypark.pe.kr/wiki/action=history&id=Perl/Archive-Zip"; key['d'] = "https://gypark.pe.kr/wiki/action=browse&diff=5&id=Perl/Archive-Zip"; document.onkeypress = GetKeyStroke; --> </script> </HEAD><BODY BGCOLOR="white" ondblclick="location.href='https://gypark.pe.kr/wiki/action=edit&id=Perl/Archive-Zip'" > <h1 class="pagename"><a accesskey="w" href="https://gypark.pe.kr/wiki"><IMG class='logoimage' src="/raymundo_logo.jpg" alt="[첫화면으로]" border=0 align="right"></a><a rel="nofollow" 
href="https://gypark.pe.kr/wiki/action=reverse&id=Perl/Archive-Zip">Perl/Archive-Zip</a></h1> <div class="gobottom" align="right"><a accesskey="z" name="PAGE_TOP" href="#PAGE_BOTTOM">Go to bottom [b]</a></div> <A class='inter' href='http://metacpan.org/module/'><IMG class='inter' 
src='https://gypark.pe.kr/cgi-bin/wiki/icons-inter//metacpan.ico' alt='Cpan:' title='Cpan:'></A><A class='inter' href='http://metacpan.org/module/Archive::Zip' title='Cpan:Archive::Zip'>Archive::Zip</A> <p></p> A module for creating and extracting Zip archives. <p></p>
<a name="toc"></a><dl><dt> </dt><dd>1. <a href="#H_1">Problems with repeated read() calls and extract()</a></dd> <dt> </dt><dd><dl><dt> </dt><dd>1.1. <a href="#H_1_1">Experiment 1</a></dd> <dt> </dt><dd>1.2. <a href="#H_1_2">Experiment 2</a></dd> <dt> </dt><dd>1.3. <a href="#H_1_3">Experiment 3</a></dd> <dt> </dt><dd>1.4. <a href="#H_1_4">Experiment 4 - creating a new object each time</a></dd> <dt> </dt><dd>1.5. <a href="#H_1_5">Interim conclusion</a></dd> <dt> </dt><dd></dd></dl> 2. <a href="#H_2">Miscellaneous & Comments</a></dd> <dt> </dt><dd></dd></dl> <p></p>
<a name="S_1"></a><H2><a name='H_1' href='#toc'>1. </a>Problems with repeated read() calls and extract()</H2> <p></p>
Found while writing the article <a class="outer" href="http://advent.perl.kr/2013/2013-12-20.html">[Day 20: Gearman use cases -- Perl Christmas Calendar #2013]</A>. <p></p>
<UL> <li> Each time the <code>read</code> method is called on a single Archive::Zip object to read an archive, the object takes up more and more memory. <li> This appears to have a knock-on effect: calls to <code>extractTree</code> or <code>extractMemberWithoutPaths</code> to extract content take progressively longer. <li> So if you handle archives thousands of times or more, the slowdown can become noticeable. The recommendation is to create a new Archive::Zip object each time instead of reusing one. </UL> <p></p> <p></p>
<a name="S_2"></a><H3><a name='H_1_1' href='#toc'>1.1. </a>Experiment 1</H3> <p></p> First, a test that calls read() only once and then extracts 20,000 times in a loop. 
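<p></p> All of the experiments on this page are timed the same way: the elapsed time of each call is accumulated with Time::HiRes, and once every 1,000 iterations the running total is divided by 1,000, printed, and reset. The following is only a rough, core-modules-only sketch of that bookkeeping. The subs <code>work</code> and <code>window_average</code> and the shortened loop of 3,000 iterations are assumptions made up for illustration; <code>work</code> merely stands in for the Archive::Zip calls and is not part of the original scripts.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Time::HiRes qw/gettimeofday tv_interval/;

# Hypothetical placeholder workload; in the real experiments this is
# where read() and/or extractMemberWithoutPaths() would be called.
sub work {
    my $sum = 0;
    $sum += $_ for 1 .. 100;
    return $sum;
}

# Average seconds per call over one reporting window.
sub window_average {
    my ( $total_seconds, $calls ) = @_;
    return $total_seconds / $calls;
}

my $window  = 1000;    # report once per this many iterations
my $elapsed = 0;       # time accumulated in the current window

for my $num ( 1 .. 3000 ) {
    my $t0 = [ gettimeofday ];
    work();
    $elapsed += tv_interval($t0);    # seconds since $t0

    if ( $num % $window == 0 ) {
        printf( "%d, %.6f\n", $num, window_average( $elapsed, $window ) );
        $elapsed = 0;                # start a fresh window
    }
}
```

Resetting the accumulator after every report is what makes each printed row an average for that window only, so a gradual slowdown shows up as rising values from row to row rather than being smoothed into one overall mean.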
<p></p> <style type="text/css"> <!-- .Identifier { color: #00ffff; } .Constant { color: #ff6060; } .Comment { color: #8080ff; } .Statement { color: #ffff00; } --> </style> <pre class="vim">
<span class="Comment"># Create the Archive::Zip object</span>
<span class="Statement">my</span> <span class="Identifier">$zip</span> = Archive::Zip-><span class="Statement">new</span>();

<span class="Comment"># Read the archive only once</span>
<span class="Identifier">$zip</span>-><span class="Statement">read</span>(<span class="Identifier">$file</span>) == AZ_OK <span class="Statement">or</span> <span class="Statement">die</span> <span class="Constant">'</span><span class="Constant">read error</span><span class="Constant">'</span>;

<span class="Statement">for</span> <span class="Statement">my</span> <span class="Identifier">$num</span> ( <span class="Constant">1</span> .. <span class="Constant">20000</span> ) {
    <span class="Comment"># Extract one specific file, repeated 20,000 times</span>
    <span class="Identifier">$zip</span>->extractMemberWithoutPaths(<span class="Identifier">$member</span>);
}
</pre> <p></p> <button class="memo-toggle">Full code (collapsed because it is long)</button><div class="memo-area memo" style="display:none;"> <style type="text/css"> <!-- .Comment { color: #8080ff; } .Identifier { color: #00ffff; } .Special { color: #ff40ff; } .PreProc { color: #ff40ff; } .Statement { color: #ffff00; } .Constant { color: #ff6060; } --> </style> <pre class="vim">
<span class="PreProc">#!/usr/bin/env perl</span>
<span class="Statement">use strict</span>;
<span class="Statement">use warnings</span>;
<span class="Comment"># local $| = 1;</span>

<span class="Statement">use </span>Archive::Zip <span class="Constant">qw/</span><span class="Constant">:ERROR_CODES :CONSTANTS</span><span class="Constant">/</span>;
<span class="Statement">use </span>Time::HiRes <span class="Constant">qw/</span><span class="Constant">gettimeofday tv_interval</span><span class="Constant">/</span>;
<span class="Statement">use </span>Memory::Usage;

<span class="Statement">my</span> <span class="Identifier">$member</span> = <span class="Constant">'</span><span class="Constant">LIST.txt</span><span class="Constant">'</span>; <span class="Comment"># file to pull out of the ZIP archive</span>
<span class="Statement">my</span> <span class="Identifier">$file</span> = <span class="Constant">'</span><span class="Constant">/home/gypark/temp/gearman/data/00001.zip</span><span class="Constant">'</span>;

<span class="Statement">my</span> <span class="Identifier">$mu</span> = Memory::Usage-><span class="Statement">new</span>;

<span class="Comment"># Create the Archive::Zip object</span>
<span class="Statement">my</span> <span class="Identifier">$zip</span> = Archive::Zip-><span class="Statement">new</span>();

<span class="Statement">my</span> <span class="Identifier">$time_read</span> = <span class="Constant">0</span>;
<span class="Statement">my</span> <span class="Identifier">$time_extract</span> = <span class="Constant">0</span>;
<span class="Statement">my</span> <span class="Identifier">$t0</span>;
<span class="Statement">my</span> <span class="Identifier">$elapsed</span>;

<span class="Comment"># Read the archive only once</span>
<span class="Statement">die</span> <span class="Constant">'</span><span class="Constant">read error</span><span class="Constant">'</span> <span class="Statement">unless</span> ( <span class="Identifier">$zip</span>-><span class="Statement">read</span>(<span class="Identifier">$file</span>) == AZ_OK );

<span class="Identifier">$mu</span>->record(<span class="Constant">'</span><span class="Constant">before loop</span><span class="Constant">'</span>);

<span class="Statement">for</span> <span class="Statement">my</span> <span class="Identifier">$num</span> ( <span class="Constant">1</span> .. 
<span class="Constant">20000</span> ) {
    <span class="Comment"># Extract one specific file, repeated 20,000 times</span>
    <span class="Identifier">$t0</span> = [ gettimeofday ];
    <span class="Identifier">$zip</span>->extractMemberWithoutPaths(<span class="Identifier">$member</span>);
    <span class="Identifier">$elapsed</span> = tv_interval(<span class="Identifier">$t0</span>);
    <span class="Identifier">$time_extract</span> += <span class="Identifier">$elapsed</span>;

    <span class="Statement">if</span> ( <span class="Identifier">$num</span> % <span class="Constant">1000</span> == <span class="Constant">0</span> ) {
        <span class="Statement">printf</span>(<span class="Constant">"</span><span class="Identifier">%d</span><span class="Constant">, %.6f, %.6f</span><span class="Special">\n</span><span class="Constant">"</span>, <span class="Identifier">$num</span>, <span class="Identifier">$time_read</span>/<span class="Constant">1000</span>, <span class="Identifier">$time_extract</span>/<span class="Constant">1000</span>);
        <span class="Identifier">$time_read</span> = <span class="Constant">0</span>;
        <span class="Identifier">$time_extract</span> = <span class="Constant">0</span>;
    }

    <span class="Comment"># Delete the file that was just extracted</span>
    <span class="Statement">unlink</span> <span class="Identifier">$member</span> <span class="Statement">or</span> <span class="Statement">die</span> <span class="Constant">"</span><span class="Constant">unlink:</span><span class="Identifier">$!</span><span class="Constant">"</span>;
}

<span class="Identifier">$mu</span>->record(<span class="Constant">'</span><span class="Constant">after loop</span><span class="Constant">'</span>);
<span class="Identifier">$mu</span>-><span class="Statement">dump</span>();
</pre> </div> <p></p> Results: <pre class="code">
1000, 0.000000, 0.000742   <-- average run time of read() and extract..() over each 1,000 iterations
2000, 0.000000, 0.000750
3000, 0.000000, 0.000746
4000, 0.000000, 0.000768
5000, 0.000000, 0.000772
6000, 0.000000, 0.000771
7000, 0.000000, 0.000756
8000, 0.000000, 0.000649
9000, 0.000000, 0.000626
10000, 0.000000, 0.000754
11000, 0.000000, 0.000764
12000, 0.000000, 0.000764
13000, 0.000000, 0.000583
14000, 0.000000, 0.000658
15000, 0.000000, 0.000495
16000, 0.000000, 0.000514
17000, 0.000000, 0.000608
18000, 0.000000, 0.000589
19000, 0.000000, 0.000708
20000, 0.000000, 0.000760
  time    vsz (  diff)    rss (  diff) shared (  diff)   code (  diff)   data (  diff)
     0  94868 ( 94868)   8920 (  8920)   2068 (  2068)   1412 (  1412)   7136 (  7136) before loop
    14  94868 (     0)   8968 (    48)   2096 (    28)   1412 (     0)   7136 (     0) after loop
</pre> <p></p>
<UL> <li> read() is called only once up front and never inside the loop, so its time is 0.0. <li> The average run time of extractMemberWithoutPaths() is the sum over 1,000 loop iterations divided by 1000. In other words, during the first 1,000 iterations one call to this method took 0.742 ms on average, during the next 1,000 iterations 0.750 ms on average, and so on. The results show that <strong>from the first block of 1,000 iterations to the twentieth and last, the average time always stays under 0.8 ms</strong>. <li> The last two lines show memory usage before and after the loop: <strong>the data size does not change at all</strong>. </UL> <p></p> <p></p>
<a name="S_3"></a><H3><a name='H_1_2' href='#toc'>1.2. </a>Experiment 2</H3> <p></p> This time, a test that calls read() and then extracts on every one of the 20,000 loop iterations. <p></p> <style type="text/css"> <!-- .Identifier { color: #00ffff; } .Constant { color: #ff6060; } .Comment { color: #8080ff; } .Statement { color: #ffff00; } --> </style> <pre class="vim">
<span class="Comment"># Create the Archive::Zip object</span>
<span class="Statement">my</span> <span class="Identifier">$zip</span> = Archive::Zip-><span class="Statement">new</span>();

<span class="Statement">for</span> <span class="Statement">my</span> <span class="Identifier">$num</span> ( <span class="Constant">1</span> .. 
<span class="Constant">20000</span> ) {
    <span class="Comment"># Read the archive every time</span>
    <span class="Identifier">$zip</span>-><span class="Statement">read</span>(<span class="Identifier">$file</span>);

    <span class="Comment"># Extract one specific file</span>
    <span class="Identifier">$zip</span>->extractMemberWithoutPaths(<span class="Identifier">$member</span>);
}
</pre> <p></p> <button class="memo-toggle">Full code - collapsed because it is long</button><div class="memo-area memo" style="display:none;"> <style type="text/css"> <!-- .Comment { color: #8080ff; } .Constant { color: #ff6060; } .Identifier { color: #00ffff; } .Special { color: #ff40ff; } .PreProc { color: #ff40ff; } .Statement { color: #ffff00; } --> </style> <pre class="vim">
<span class="PreProc">#!/usr/bin/env perl</span>
<span class="Statement">use strict</span>;
<span class="Statement">use warnings</span>;
<span class="Comment"># local $| = 1;</span>

<span class="Statement">use </span>Archive::Zip <span class="Constant">qw/</span><span class="Constant">:ERROR_CODES :CONSTANTS</span><span class="Constant">/</span>;
<span class="Statement">use </span>Time::HiRes <span class="Constant">qw/</span><span class="Constant">gettimeofday tv_interval</span><span class="Constant">/</span>;
<span class="Statement">use </span>Memory::Usage;

<span class="Statement">my</span> <span class="Identifier">$member</span> = <span class="Constant">'</span><span class="Constant">LIST.txt</span><span class="Constant">'</span>; <span class="Comment"># file to pull out of the ZIP archive</span>
<span class="Statement">my</span> <span class="Identifier">$file</span> = <span class="Constant">'</span><span class="Constant">/home/gypark/temp/gearman/data/00001.zip</span><span class="Constant">'</span>;

<span class="Statement">my</span> <span class="Identifier">$mu</span> = Memory::Usage-><span class="Statement">new</span>;

<span class="Comment"># Create the Archive::Zip object</span>
<span class="Statement">my</span> <span class="Identifier">$zip</span> = Archive::Zip-><span class="Statement">new</span>();

<span class="Statement">my</span> <span class="Identifier">$time_read</span> = <span class="Constant">0</span>;
<span class="Statement">my</span> <span class="Identifier">$time_extract</span> = <span class="Constant">0</span>;
<span class="Statement">my</span> <span class="Identifier">$t0</span>;
<span class="Statement">my</span> <span class="Identifier">$elapsed</span>;

<span class="Identifier">$mu</span>->record(<span class="Constant">'</span><span class="Constant">before loop</span><span class="Constant">'</span>);

<span class="Statement">for</span> <span class="Statement">my</span> <span class="Identifier">$num</span> ( <span class="Constant">1</span> .. <span class="Constant">20000</span> ) {
    <span class="Comment"># Read the archive every time</span>
    <span class="Identifier">$t0</span> = [ gettimeofday ];
    <span class="Identifier">$zip</span>-><span class="Statement">read</span>(<span class="Identifier">$file</span>);
    <span class="Identifier">$elapsed</span> = tv_interval(<span class="Identifier">$t0</span>);
    <span class="Identifier">$time_read</span> += <span class="Identifier">$elapsed</span>;

    <span class="Comment"># Extract one specific file</span>
    <span class="Identifier">$t0</span> = [ gettimeofday ];
    <span class="Identifier">$zip</span>->extractMemberWithoutPaths(<span class="Identifier">$member</span>);
    <span class="Identifier">$elapsed</span> = tv_interval(<span class="Identifier">$t0</span>);
    <span class="Identifier">$time_extract</span> += <span class="Identifier">$elapsed</span>;

    <span class="Statement">if</span> ( <span class="Identifier">$num</span> % <span class="Constant">1000</span> == <span class="Constant">0</span> ) {
        <span class="Statement">printf</span>(<span class="Constant">"</span><span class="Identifier">%d</span><span class="Constant">, %.6f, %.6f</span><span class="Special">\n</span><span class="Constant">"</span>, <span class="Identifier">$num</span>, <span class="Identifier">$time_read</span>/<span class="Constant">1000</span>, 
            <span class="Identifier">$time_extract</span>/<span class="Constant">1000</span>);
        <span class="Identifier">$time_read</span> = <span class="Constant">0</span>;
        <span class="Identifier">$time_extract</span> = <span class="Constant">0</span>;
    }

    <span class="Comment"># Delete the file that was just extracted</span>
    <span class="Statement">unlink</span> <span class="Identifier">$member</span> <span class="Statement">or</span> <span class="Statement">die</span> <span class="Constant">"</span><span class="Constant">unlink:</span><span class="Identifier">$!</span><span class="Constant">"</span>;
}

<span class="Identifier">$mu</span>->record(<span class="Constant">'</span><span class="Constant">after loop</span><span class="Constant">'</span>);
<span class="Identifier">$mu</span>-><span class="Statement">dump</span>();
</pre> </div> <p></p> Results: <pre class="code">
1000, 0.000462, 0.000902
2000, 0.000469, 0.001179
3000, 0.000461, 0.001391
4000, 0.000450, 0.001615
5000, 0.000409, 0.001711
6000, 0.000355, 0.001777
7000, 0.000351, 0.001891
8000, 0.000394, 0.002372
9000, 0.000403, 0.002621
10000, 0.000384, 0.002761
11000, 0.000334, 0.002657
12000, 0.000383, 0.003267
13000, 0.000389, 0.003607
14000, 0.000453, 0.004447
15000, 0.000339, 0.003767
16000, 0.000423, 0.004907
17000, 0.000402, 0.005094
18000, 0.000412, 0.005584
19000, 0.000437, 0.006368
20000, 0.000446, 0.006684
  time    vsz (  diff)    rss (  diff) shared (  diff)   code (  diff)   data (  diff)
     0  94872 ( 94872)   8884 (  8884)   2036 (  2036)   1412 (  1412)   7140 (  7140) before loop
    74 173532 ( 78660)  87472 ( 78588)   2096 (    60)   1412 (     0)  85800 ( 78660) after loop
</pre> <p></p>
<UL> <li> Over the 20,000 iterations, <strong>the run time of read() stays constant at around 0.4 ms</strong>. <li> However, <strong>the run time of the extract call keeps growing, and by the end it is roughly 7 to 8 times what it was at the start</strong>. As a result, the whole script took about 74 seconds in Experiment 2, compared with about 14 seconds in Experiment 1. <li> Memory, too: <strong>after the loop, the data size has grown by about 78.7 MB (78,660 KB)</strong>. That works out to about 4 KB per loop iteration. </UL> <p></p>
<img src="https://gypark.pe.kr/upload/extractMemberWithoutPaths.png" alt="Upload:extractMemberWithoutPaths.png"> <BR> (Comparison of the extract call's run time in Experiment 1 and Experiment 2) <p></p> <p></p>
<a name="S_4"></a><H3><a name='H_1_3' href='#toc'>1.3. </a>Experiment 3</H3> <p></p> As a cross-check, this one does the opposite: only read() is repeated 20,000 times. <p></p> <style type="text/css"> <!-- .Identifier { color: #00ffff; } .Constant { color: #ff6060; } .Comment { color: #8080ff; } .Statement { color: #ffff00; } --> </style> <pre class="vim">
<span class="Comment"># Create the Archive::Zip object</span>
<span class="Statement">my</span> <span class="Identifier">$zip</span> = Archive::Zip-><span class="Statement">new</span>();

<span class="Statement">for</span> <span class="Statement">my</span> <span class="Identifier">$num</span> ( <span class="Constant">1</span> .. <span class="Constant">20000</span> ) {
    <span class="Comment"># Read the archive every time, without extracting</span>
    <span class="Identifier">$zip</span>-><span class="Statement">read</span>(<span class="Identifier">$file</span>);
}
</pre> <p></p> <button class="memo-toggle">Full code - collapsed because it is long</button><div class="memo-area memo" style="display:none;"> <style type="text/css"> <!-- .Comment { color: #8080ff; } .Constant { color: #ff6060; } .Identifier { color: #00ffff; } .Special { color: #ff40ff; } .PreProc { color: #ff40ff; } .Statement { color: #ffff00; } --> </style> <pre class="vim">
<span class="PreProc">#!/usr/bin/env perl</span>
<span class="Statement">use strict</span>;
<span class="Statement">use warnings</span>;
<span class="Comment"># local $| = 1;</span>

<span class="Statement">use </span>Archive::Zip <span class="Constant">qw/</span><span class="Constant">:ERROR_CODES :CONSTANTS</span><span class="Constant">/</span>;
<span class="Statement">use </span>Time::HiRes <span class="Constant">qw/</span><span class="Constant">gettimeofday tv_interval</span><span class="Constant">/</span>;
<span class="Statement">use </span>Memory::Usage;

<span class="Statement">my</span> 
<span class="Identifier">$member</span> = <span class="Constant">'</span><span class="Constant">LIST.txt</span><span class="Constant">'</span>; <span class="Comment"># file to pull out of the ZIP archive</span>
<span class="Statement">my</span> <span class="Identifier">$file</span> = <span class="Constant">'</span><span class="Constant">/home/gypark/temp/gearman/data/00001.zip</span><span class="Constant">'</span>;

<span class="Statement">my</span> <span class="Identifier">$mu</span> = Memory::Usage-><span class="Statement">new</span>;

<span class="Comment"># Create the Archive::Zip object</span>
<span class="Statement">my</span> <span class="Identifier">$zip</span> = Archive::Zip-><span class="Statement">new</span>();

<span class="Statement">my</span> <span class="Identifier">$time_read</span> = <span class="Constant">0</span>;
<span class="Statement">my</span> <span class="Identifier">$time_extract</span> = <span class="Constant">0</span>;
<span class="Statement">my</span> <span class="Identifier">$t0</span>;
<span class="Statement">my</span> <span class="Identifier">$elapsed</span>;

<span class="Identifier">$mu</span>->record(<span class="Constant">'</span><span class="Constant">before loop</span><span class="Constant">'</span>);

<span class="Statement">for</span> <span class="Statement">my</span> <span class="Identifier">$num</span> ( <span class="Constant">1</span> .. 
<span class="Constant">20000</span> ) {
    <span class="Comment"># Read the archive every time, without extracting</span>
    <span class="Identifier">$t0</span> = [ gettimeofday ];
    <span class="Identifier">$zip</span>-><span class="Statement">read</span>(<span class="Identifier">$file</span>);
    <span class="Identifier">$elapsed</span> = tv_interval(<span class="Identifier">$t0</span>);
    <span class="Identifier">$time_read</span> += <span class="Identifier">$elapsed</span>;

    <span class="Statement">if</span> ( <span class="Identifier">$num</span> % <span class="Constant">1000</span> == <span class="Constant">0</span> ) {
        <span class="Statement">printf</span>(<span class="Constant">"</span><span class="Identifier">%d</span><span class="Constant">, %.6f, %.6f</span><span class="Special">\n</span><span class="Constant">"</span>, <span class="Identifier">$num</span>, <span class="Identifier">$time_read</span>/<span class="Constant">1000</span>, <span class="Identifier">$time_extract</span>/<span class="Constant">1000</span>);
        <span class="Identifier">$time_read</span> = <span class="Constant">0</span>;
        <span class="Identifier">$time_extract</span> = <span class="Constant">0</span>;
    }
}

<span class="Identifier">$mu</span>->record(<span class="Constant">'</span><span class="Constant">after loop</span><span class="Constant">'</span>);
<span class="Identifier">$mu</span>-><span class="Statement">dump</span>();
</pre> </div> <p></p> Results: <pre class="code">
1000, 0.000449, 0.000000
2000, 0.000444, 0.000000
3000, 0.000442, 0.000000
4000, 0.000444, 0.000000
5000, 0.000442, 0.000000
6000, 0.000446, 0.000000
7000, 0.000443, 0.000000
8000, 0.000449, 0.000000
9000, 0.000446, 0.000000
10000, 0.000422, 0.000000
11000, 0.000355, 0.000000
12000, 0.000286, 0.000000
13000, 0.000285, 0.000000
14000, 0.000287, 0.000000
15000, 0.000286, 0.000000
16000, 0.000296, 0.000000
17000, 0.000303, 0.000000
18000, 0.000308, 0.000000
19000, 0.000327, 0.000000
20000, 0.000445, 0.000000
  time    vsz (  diff)    rss (  diff) shared (  diff)   code (  diff)   data (  diff)
     0  94868 ( 94868)   8880 (  8880)   2036 (  2036)   1412 (  1412)   7136 (  7136) before loop
     8 171756 ( 76888)  85876 ( 76996)   2076 (    40)   1412 (     0)  84024 ( 76888) after loop
</pre>
<UL> <li> The read() time again stays roughly the same throughout (about 0.3 to 0.45 ms). <li> The extract call was never made, so there is nothing to look at there. <li> <strong>Memory usage grew by about the same amount as in Experiment 2</strong> (76,888 KB here versus 78,660 KB there). </UL> <p></p> <p></p>
<a name="S_5"></a><H3><a name='H_1_4' href='#toc'>1.4. </a>Experiment 4 - creating a new object each time</H3> <p></p> Come to think of it, I had left out the one test that matters most. Let's loop 20,000 times and create a fresh Archive::Zip object on every iteration. <p></p> <style type="text/css"> <!-- .Identifier { color: #00ffff; } .Constant { color: #ff6060; } .Comment { color: #8080ff; } .Statement { color: #ffff00; } --> </style> <pre class="vim">
<span class="Statement">for</span> <span class="Statement">my</span> <span class="Identifier">$num</span> ( <span class="Constant">1</span> .. <span class="Constant">20000</span> ) {
    <span class="Comment"># Create a new Archive::Zip object every time</span>
    <span class="Statement">my</span> <span class="Identifier">$zip</span> = Archive::Zip-><span class="Statement">new</span>();

    <span class="Comment"># Read the archive every time</span>
    <span class="Identifier">$zip</span>-><span class="Statement">read</span>(<span class="Identifier">$file</span>);

    <span class="Comment"># Extract one specific file</span>
    <span class="Identifier">$zip</span>->extractMemberWithoutPaths(<span class="Identifier">$member</span>);
}
</pre> <p></p> <button class="memo-toggle">Full code - collapsed because it is long</button><div class="memo-area memo" style="display:none;"> <style type="text/css"> <!-- .Comment { color: #8080ff; } .Constant { color: #ff6060; } .Identifier { color: #00ffff; } .Special { color: #ff40ff; } .PreProc { color: #ff40ff; } .Statement { color: #ffff00; } --> </style> <pre class="vim">
<span class="PreProc">#!/usr/bin/env perl</span>
<span class="Statement">use strict</span>;
<span class="Statement">use warnings</span>;
<span class="Comment"># local $| = 1;</span>

<span class="Statement">use </span>Archive::Zip <span class="Constant">qw/</span><span class="Constant">:ERROR_CODES 
:CONSTANTS</span><span class="Constant">/</span>;
<span class="Statement">use </span>Time::HiRes <span class="Constant">qw/</span><span class="Constant">gettimeofday tv_interval</span><span class="Constant">/</span>;
<span class="Statement">use </span>Memory::Usage;

<span class="Statement">my</span> <span class="Identifier">$member</span> = <span class="Constant">'</span><span class="Constant">LIST.txt</span><span class="Constant">'</span>; <span class="Comment"># file to pull out of the ZIP archive</span>
<span class="Statement">my</span> <span class="Identifier">$file</span> = <span class="Constant">'</span><span class="Constant">/home/gypark/temp/gearman/data/00001.zip</span><span class="Constant">'</span>;

<span class="Statement">my</span> <span class="Identifier">$mu</span> = Memory::Usage-><span class="Statement">new</span>;

<span class="Statement">my</span> <span class="Identifier">$time_read</span> = <span class="Constant">0</span>;
<span class="Statement">my</span> <span class="Identifier">$time_extract</span> = <span class="Constant">0</span>;
<span class="Statement">my</span> <span class="Identifier">$t0</span>;
<span class="Statement">my</span> <span class="Identifier">$elapsed</span>;

<span class="Identifier">$mu</span>->record(<span class="Constant">'</span><span class="Constant">before loop</span><span class="Constant">'</span>);

<span class="Statement">for</span> <span class="Statement">my</span> <span class="Identifier">$num</span> ( <span class="Constant">1</span> .. <span class="Constant">20000</span> ) {
    <span class="Comment"># Create a new Archive::Zip object every time</span>
    <span class="Statement">my</span> <span class="Identifier">$zip</span> = Archive::Zip-><span class="Statement">new</span>();

    <span class="Comment"># Read the archive every time</span>
    <span class="Identifier">$t0</span> = [ gettimeofday ];
    <span class="Identifier">$zip</span>-><span class="Statement">read</span>(<span class="Identifier">$file</span>);
    <span class="Identifier">$elapsed</span> = tv_interval(<span class="Identifier">$t0</span>);
    <span class="Identifier">$time_read</span> += <span class="Identifier">$elapsed</span>;

    <span class="Comment"># Extract one specific file</span>
    <span class="Identifier">$t0</span> = [ gettimeofday ];
    <span class="Identifier">$zip</span>->extractMemberWithoutPaths(<span class="Identifier">$member</span>);
    <span class="Identifier">$elapsed</span> = tv_interval(<span class="Identifier">$t0</span>);
    <span class="Identifier">$time_extract</span> += <span class="Identifier">$elapsed</span>;

    <span class="Statement">if</span> ( <span class="Identifier">$num</span> % <span class="Constant">1000</span> == <span class="Constant">0</span> ) {
        <span class="Statement">printf</span>(<span class="Constant">"</span><span class="Identifier">%d</span><span class="Constant">, %.6f, %.6f</span><span class="Special">\n</span><span class="Constant">"</span>, <span class="Identifier">$num</span>, <span class="Identifier">$time_read</span>/<span class="Constant">1000</span>, <span class="Identifier">$time_extract</span>/<span class="Constant">1000</span>);
        <span class="Identifier">$time_read</span> = <span class="Constant">0</span>;
        <span class="Identifier">$time_extract</span> = <span class="Constant">0</span>;
    }

    <span class="Comment"># Delete the file that was just extracted</span>
    <span class="Statement">unlink</span> <span class="Identifier">$member</span> <span class="Statement">or</span> <span class="Statement">die</span> <span class="Constant">"</span><span class="Constant">unlink:</span><span 
class="Identifier">$!</span><span class="Constant">"</span>;
}

<span class="Identifier">$mu</span>->record(<span class="Constant">'</span><span class="Constant">after loop</span><span class="Constant">'</span>);
<span class="Identifier">$mu</span>-><span class="Statement">dump</span>();
</pre> </div> <p></p> Results: <pre class="code">
1000, 0.000419, 0.000705
2000, 0.000361, 0.000610
3000, 0.000412, 0.000692
4000, 0.000428, 0.000721
5000, 0.000457, 0.000770
6000, 0.000459, 0.000787
7000, 0.000475, 0.000797
8000, 0.000421, 0.000705
9000, 0.000428, 0.000721
10000, 0.000424, 0.000715
11000, 0.000418, 0.000706
12000, 0.000439, 0.000735
13000, 0.000388, 0.000636
14000, 0.000478, 0.000790
15000, 0.000456, 0.000777
16000, 0.000433, 0.000727
17000, 0.000425, 0.000721
18000, 0.000322, 0.000543
19000, 0.000392, 0.000655
20000, 0.000395, 0.000665
  time    vsz (  diff)    rss (  diff) shared (  diff)   code (  diff)   data (  diff)
     0  94872 ( 94872)   8884 (  8884)   2036 (  2036)   1412 (  1412)   7140 (  7140) before loop
    24  94872 (     0)   8976 (    92)   2096 (    60)   1412 (     0)   7140 (     0) after loop
</pre>
<UL> <li> The read() time stays constant. <li> The extract time stays constant as well. <li> There is no sign of any notable memory leak either. <li> Creating an object on every iteration inevitably adds some overhead, but in terms of total run time it is a clear win: the script took about 24 seconds, compared with about 74 seconds in Experiment 2. </UL> <p></p> <p></p>
<a name="S_6"></a><H3><a name='H_1_5' href='#toc'>1.5. </a>Interim conclusion</H3> <p></p>
<UL> <li> Perhaps read() does not properly release what it read on earlier calls: every call to read() on the same object consumes additional memory. <li> On top of that, extracting content from the archive appears to be affected by how much this memory has grown. <li> I tested only extractTree() and extractMemberWithoutPaths(), but I suspect the same applies to all of the extract... methods. <li> Now that the symptoms are clear, it would be nice to dig into the module source and find the cause... but I have neither the will nor the time, so I will just file a bug report. It may well have been reported already. <li> <strong>In any case, anyone using this module should keep this in mind and, when processing many archives, create a new Archive::Zip object each time.</strong> </UL> <p></p> <p></p> <p></p> <p></p> <a name="S_7"></a><H2><a name='H_2' href='#toc'>2. 
</a>Miscellaneous & Comments</H2> <p></p> <hr noshade style="height:1px"> <a href="https://gypark.pe.kr/wiki/컴퓨터분류" class="wikipagelink" data-editlink="https://gypark.pe.kr/wiki/action=edit&id=컴퓨터분류">컴퓨터분류</a> <HR class='footer'> <DIV class='editguide'><br>Last edited: 2013-12-20 5:52 pm</DIV><HR class='footer'> <DIV class='footer'><a accesskey="x" name="PAGE_BOTTOM" href="#PAGE_TOP">Back to top [t]</a></DIV> </body> </html>