<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Artisan Coder &#187; compiler</title>
	<atom:link href="http://www.artisancoder.com/tag/compiler/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.artisancoder.com</link>
	<description>Software development as a craft</description>
	<lastBuildDate>Mon, 23 Aug 2010 13:19:58 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Syntactic closures in Sly</title>
		<link>http://www.artisancoder.com/2010/05/syntactic-closures-in-sly/</link>
		<comments>http://www.artisancoder.com/2010/05/syntactic-closures-in-sly/#comments</comments>
		<pubDate>Mon, 10 May 2010 19:52:57 +0000</pubDate>
		<dc:creator>Alex</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Scheme]]></category>
		<category><![CDATA[compiler]]></category>
		<category><![CDATA[sly]]></category>
		<category><![CDATA[syntactic closures]]></category>

		<guid isPermaLink="false">http://www.ventonegro.org/?p=159</guid>
		<description><![CDATA[Continuing my quest in mastering Scheme, I have completely revamped Sly&#8216;s front-end and implemented syntactic closures in the source code expander. The initial motivation was to get alpha-renamed identifiers as the result of the expansion phase, to make it easier to apply subsequent transformations on the source code (like Dybvig&#8217;s Fixing Letrec). But it was [...]]]></description>
			<content:encoded><![CDATA[<p>Continuing my quest to master Scheme, I have completely revamped <a href="http://github.com/asandroq/sly" title="Sly Scheme">Sly</a>&#8217;s front-end and implemented <a href="http://community.schemewiki.org/?syntactic-closures" title="Syntactic Closures">syntactic closures</a> in the source code expander. The initial motivation was to get alpha-renamed identifiers out of the expansion phase, which makes it easier to apply subsequent transformations to the source code (like Dybvig&#8217;s <a href="http://www.cs.indiana.edu/~dyb/pubs/fixing-letrec.pdf" title="Fixing Letrec">Fixing Letrec</a>). But it was a worthy effort in itself. All the derived syntax that was previously rewritten internally using <tt>gensym</tt> is now written using syntactic closures, which makes it clearer and much more robust because of the added hygiene.</p>
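<p>To illustrate the kind of accidental capture that hygiene prevents, here is a minimal sketch of a naive, list-based expander for a two-argument <tt>or</tt>. It is not Sly&#8217;s expander: the name <tt>expand-or-naive</tt> and the temporary <tt>t</tt> are made up for this example.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="scheme" style="font-family:monospace;">;; Hypothetical naive expander, for illustration only.
;; Rewrites (or a b) into (let ((t a)) (if t t b)) using a fixed
;; temporary name, with no renaming at all.
(define (expand-or-naive form)
  (let ((a (cadr form))
        (b (caddr form)))
    `(let ((t ,a)) (if t t ,b))))

;; The user's own variable t is captured by the temporary:
(expand-or-naive '(or #f t))
;; evaluates to (let ((t #f)) (if t t t))</pre></td></tr></table></div>

<p>If the user&#8217;s <tt>t</tt> were bound to 5, the original expression would evaluate to 5, but the expansion evaluates to <tt>#f</tt>. A <tt>gensym</tt> temporary avoids this particular clash, while a hygienic expander additionally keeps the macro&#8217;s own uses of <tt>let</tt> and <tt>if</tt> from being rebound at the use site.</p>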
]]></content:encoded>
			<wfw:commentRss>http://www.artisancoder.com/2010/05/syntactic-closures-in-sly/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Adding a garbage collector</title>
		<link>http://www.artisancoder.com/2009/05/adding-a-garbage-collector/</link>
		<comments>http://www.artisancoder.com/2009/05/adding-a-garbage-collector/#comments</comments>
		<pubDate>Tue, 05 May 2009 15:00:12 +0000</pubDate>
		<dc:creator>Alex</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Scheme]]></category>
		<category><![CDATA[compiler]]></category>
		<category><![CDATA[garbage collector]]></category>
		<category><![CDATA[scheme]]></category>

		<guid isPermaLink="false">http://www.ventonegro.org/?p=96</guid>
		<description><![CDATA[I guess it is clear by now that I am writing a toy Scheme compiler and virtual machine. Primarily for learning the techniques, I just touch it from time to time. The last addition to the virtual machine was a Cheney-style copy garbage collector. I wanted to implement a very simple algorithm to get a [...]]]></description>
			<content:encoded><![CDATA[<p>I guess it is clear by now that I am writing a toy Scheme compiler and virtual machine. It is primarily for learning the techniques, so I just touch it from time to time. The latest addition to the virtual machine is a <a href="http://en.wikipedia.org/wiki/Cheney%27s_algorithm" title="Cheney's algorithm">Cheney-style</a> copying garbage collector. I wanted to implement a very simple algorithm to get a functioning system sooner. The simplest algorithms I know of are the Cheney one and the <a href="http://portal.acm.org/citation.cfm?id=363554&#038;dl=GUIDE&#038;coll=GUIDE&#038;CFID=34360320&#038;CFTOKEN=66444289" title="An efficient machine-independent procedure for garbage collection in various list structures">Deutsch-Schorr-Waite</a> mark-and-sweep garbage collector. I chose the copying one because I believe generational garbage collection is the way to go for Scheme virtual machines, given that the rate of allocation is very high (allocation in a copying garbage collector is just moving a pointer) and most data objects die young (how many times did you use <tt>reverse</tt> after a loop?).</p>
<p>I was surprised by how easy it was to implement the collector. It is a very simple one (stop-the-world, two semi-spaces etc.), but I had always thought garbage collectors were somewhat magical, a bunch of elves that worked out of sight to keep your process working perfectly. This is exactly why I am writing such a system: even if it is never let loose on the world, it will have fulfilled its purpose. Of course, if I am satisfied with the result, I will release the beast.</p>
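<p>For readers who want to see the shape of the algorithm, here is a minimal sketch of a Cheney-style collector written in Scheme itself. It is not the collector in my virtual machine: the toy heap holds only two-slot cons cells, pointers are plain integer indices, and every name (<tt>heap-cons</tt>, <tt>forward</tt>, <tt>collect</tt>) is an assumption made for the example.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="scheme" style="font-family:monospace;">;; Toy heap: every object is a cons cell occupying two adjacent slots.
;; A slot holds either a pointer (an exact integer index into the
;; current space) or an immediate (a symbol, a boolean or the empty list).
(define heap-size 16)
(define from-space (make-vector heap-size #f))
(define to-space (make-vector heap-size #f))
(define free 0)                        ; allocation pointer

;; Allocation is just bumping the free pointer (no overflow check here).
(define (heap-cons a d)
  (let ((addr free))
    (vector-set! from-space addr a)
    (vector-set! from-space (+ addr 1) d)
    (set! free (+ free 2))
    addr))

(define (pointer? v) (and (integer? v) (exact? v)))

;; A copied cell leaves (forward . new-address) behind in its first slot.
(define (forwarded? addr) (pair? (vector-ref from-space addr)))

;; Return the to-space address of v, copying the cell on first visit.
(define (forward v)
  (cond ((not (pointer? v)) v)
        ((forwarded? v) (cdr (vector-ref from-space v)))
        (else
         (let ((new free))
           (vector-set! to-space new (vector-ref from-space v))
           (vector-set! to-space (+ new 1) (vector-ref from-space (+ v 1)))
           (set! free (+ free 2))
           (vector-set! from-space v (cons 'forward new))
           new))))

;; Copy the roots, then scan to-space until scan catches up with free.
;; The region between scan and free is the implicit work queue.
(define (collect roots)
  (set! free 0)
  (let ((new-roots (map forward roots)))
    (let scan-loop ((scan 0))
      (if (not (= scan free))
          (begin
            (vector-set! to-space scan (forward (vector-ref to-space scan)))
            (scan-loop (+ scan 1)))))
    (let ((old from-space))            ; flip the two semi-spaces
      (set! from-space to-space)
      (set! to-space old))
    new-roots))

;; One garbage cell and a live two-element list (a b).
(define garbage (heap-cons 'x 'y))
(define tail (heap-cons 'b '()))
(define lst (heap-cons 'a tail))
(define roots (collect (list lst)))</pre></td></tr></table></div>

<p>After the call to <tt>collect</tt>, only the two live cells have been copied, so <tt>free</tt> is 4 and the garbage cell has simply been left behind in the old space.</p>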
]]></content:encoded>
			<wfw:commentRss>http://www.artisancoder.com/2009/05/adding-a-garbage-collector/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Compiling let</title>
		<link>http://www.artisancoder.com/2009/01/compiling-let/</link>
		<comments>http://www.artisancoder.com/2009/01/compiling-let/#comments</comments>
		<pubDate>Wed, 14 Jan 2009 18:54:42 +0000</pubDate>
		<dc:creator>Alex</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Scheme]]></category>
		<category><![CDATA[compiler]]></category>
		<category><![CDATA[scheme]]></category>

		<guid isPermaLink="false">http://www.ventonegro.org/?p=78</guid>
		<description><![CDATA[Following my previous Scheme toy interpreter in Lua, I have already tried twice to write a simple Scheme compiler. I failed mostly because the complexity grew too fast and I could not see any of it working before losing interest. But I then read Ghuloum&#8217;s paper on incremental compiler construction. It&#8217;s a very nice approach, [...]]]></description>
			<content:encoded><![CDATA[<p>Following my previous <a href="http://www.artisancoder.com/2008/06/toy-scheme-interpreter-in-lua/" title="Toy Scheme interpreter in Lua">Scheme toy interpreter in Lua</a>, I have already tried twice to write a simple Scheme compiler. I failed mostly because the complexity grew too fast and I could not see any of it working before losing interest. But I then read <a href="http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf" title="An Incremental Approach to Compiler Construction">Ghuloum&#8217;s paper</a> on incremental compiler construction. It&#8217;s a very nice approach, and I stopped trying to think everything through in advance and instead started to work little by little. Unlike the paper, I&#8217;m not compiling to assembly but to a virtual machine, though.</p>
<p>I have come across one of the many design decisions a compiler writer faces. When <tt>let</tt> forms are compiled, they are first translated to the closed lambda applications they are syntactic sugar for. For instance:</p>

<div class="wp_codebox"><table><tr id="p783"><td class="code" id="p78code3"><pre class="scheme" style="font-family:monospace;"><span style="color: #66cc66;">&#40;</span><span style="color: #b1b100;">let</span> <span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#40;</span>a <span style="color: #ff0000;">&quot;Alex&quot;</span><span style="color: #66cc66;">&#41;</span>
      <span style="color: #66cc66;">&#40;</span><span style="color: #b1b100;">e</span> <span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">+</span> <span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">*</span> pi <span style="color: #cc66cc;">2</span><span style="color: #66cc66;">&#41;</span> <span style="color: #cc66cc;">1</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span>
      <span style="color: #66cc66;">&#40;</span>i <span style="color: #66cc66;">&#40;</span><span style="color: #b1b100;">car</span> <span style="color: #66cc66;">&#40;</span><span style="color: #b1b100;">cons</span> <span style="color: #cc66cc;">1</span> <span style="color: #cc66cc;">0</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span>
  <span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#40;</span><span style="color: #b1b100;">if</span> a <span style="color: #66cc66;">+</span> <span style="color: #66cc66;">-</span><span style="color: #66cc66;">&#41;</span> <span style="color: #b1b100;">e</span> i<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span></pre></td></tr></table></div>

<p>Becomes:</p>

<div class="wp_codebox"><table><tr id="p784"><td class="code" id="p78code4"><pre class="scheme" style="font-family:monospace;"><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#40;</span><span style="color: #b1b100;">lambda</span> <span style="color: #66cc66;">&#40;</span>a <span style="color: #b1b100;">e</span> i<span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#40;</span><span style="color: #b1b100;">if</span> a <span style="color: #66cc66;">+</span> <span style="color: #66cc66;">-</span><span style="color: #66cc66;">&#41;</span> <span style="color: #b1b100;">e</span> i<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span>
 <span style="color: #ff0000;">&quot;Alex&quot;</span> <span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">+</span> <span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">*</span> pi <span style="color: #cc66cc;">2</span><span style="color: #66cc66;">&#41;</span> <span style="color: #cc66cc;">1</span><span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#40;</span><span style="color: #b1b100;">car</span> <span style="color: #66cc66;">&#40;</span><span style="color: #b1b100;">cons</span> <span style="color: #cc66cc;">1</span> <span style="color: #cc66cc;">0</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span></pre></td></tr></table></div>
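
<p>The rewrite itself is mechanical. A minimal sketch of it, assuming the form is a plain (unnamed) <tt>let</tt> represented as a list, could look like this (<tt>let-&gt;application</tt> is a hypothetical name, not a procedure from my compiler):</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="scheme" style="font-family:monospace;">;; Hypothetical helper: turn (let ((name init) ...) body ...)
;; into ((lambda (name ...) body ...) init ...).
;; Named let and error checking are deliberately left out.
(define (let-&gt;application form)
  (let ((bindings (cadr form))
        (body (cddr form)))
    (cons (cons 'lambda (cons (map car bindings) body))
          (map cadr bindings))))

(let-&gt;application '(let ((a 1) (b 2)) (+ a b)))
;; evaluates to ((lambda (a b) (+ a b)) 1 2)</pre></td></tr></table></div>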

<p>There is no need to emit code to create full closures when compiling closed lambda applications, because they cannot be applied multiple times. Instead, the environment is extended during compilation and the body is compiled in this new environment, but it is still considered just a piece of the enclosing body. So far so good, but what happens during execution?</p>
<p>During the execution of the virtual machine, the arguments to full closures are pushed onto the stack (along with a return address, the saved frame pointer and the number of arguments pushed) before jumping to the closure code. Currently, when entering the body of a closed lambda application, a lightweight calling protocol is used: just the frame pointer and the arguments are pushed onto the stack, and they are later popped. But when accessing bindings not local to the current scope, which is very common in <tt>let</tt> forms where one accesses the arguments of the enclosing full closure, the compiler has to emit code that walks up through the activation frames until it reaches the one holding the binding. In other words, the VM instruction for accessing bindings is <tt>LOAD i j</tt>, where <tt>i</tt> is the activation frame level and <tt>j</tt> is the index of the binding inside that activation frame.</p>
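<p>Both halves of that scheme can be sketched in a few lines of Scheme. This is only a model under simplifying assumptions: frames are heap-allocated vectors linked to their parent rather than regions of the VM stack, and the names <tt>lexical-address</tt>, <tt>make-frame</tt> and <tt>vm-load</tt> are invented for the example.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="scheme" style="font-family:monospace;">;; Compile time: the environment is a list of name lists, innermost first.
(define (index-of name names)
  (let loop ((names names) (j 0))
    (cond ((null? names) #f)
          ((eq? (car names) name) j)
          (else (loop (cdr names) (+ j 1))))))

;; Resolve a name to a (LOAD i j) instruction, or #f if it is unbound.
(define (lexical-address name cenv)
  (let loop ((cenv cenv) (i 0))
    (if (null? cenv)
        #f
        (let ((j (index-of name (car cenv))))
          (if j
              (list 'LOAD i j)
              (loop (cdr cenv) (+ i 1)))))))

;; Run time: a frame is modelled as #(parent slot0 slot1 ...).
(define (make-frame parent . values) (apply vector parent values))
(define (frame-parent frame) (vector-ref frame 0))
(define (frame-slot frame j) (vector-ref frame (+ j 1)))

;; LOAD i j: follow i parent links, then read slot j.
;; The cost is linear in i, the depth of the binding.
(define (vm-load frame i j)
  (if (= i 0)
      (frame-slot frame j)
      (vm-load (frame-parent frame) (- i 1) j)))

;; x is an argument of the enclosing closure, seen from inside the let.
(lexical-address 'x '((a e i) (x y)))
;; evaluates to (LOAD 1 0)
(vm-load (make-frame (make-frame #f 'outer-x 'outer-y) "Alex" 7 1) 1 0)
;; evaluates to outer-x</pre></td></tr></table></div>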
<p>Although this works, it is not as fast as it could be. I am planning to use flat environments as described in <a href="http://www.cs.indiana.edu/~dyb/pubs/3imp.pdf" title="Three Implementation Models for Scheme">Dybvig&#8217;s PhD thesis</a>, so that every access to a binding inside a closure takes constant time rather than time linear in the depth of the desired binding. The current compilation of closed lambda applications defeats that optimisation, although it makes bindings somewhat shallower. The alternative is to look for all the bindings introduced by closed lambda applications during compilation and always allocate space for all of them on the stack. This uses (possibly much) more memory because it allocates space for bindings that may never be needed during execution, but accessing any binding is again constant time because all of them are known at compile time.</p>
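<p>To make the idea of a flat environment concrete, here is a minimal sketch in which a closure is a single vector holding its code followed by copies of the values it captured. The representation and the names (<tt>make-flat-closure</tt>, <tt>closure-free</tt>, <tt>apply-closure</tt>) are assumptions for illustration, not the layout from the thesis or from my VM.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="scheme" style="font-family:monospace;">;; A flat closure is modelled as #(code free0 free1 ...).
(define (make-flat-closure code . free-values)
  (apply vector code free-values))
(define (closure-code clo) (vector-ref clo 0))

;; Reading captured value k is one indexed access, whatever the
;; nesting depth of the scope that originally bound it.
(define (closure-free clo k) (vector-ref clo (+ k 1)))

;; The closure is passed to its own code so the code can reach the
;; captured values.
(define (apply-closure clo . args)
  (apply (closure-code clo) clo args))

;; Hand-compiled version of (lambda (x) (+ x a b)) capturing a and b.
(define (make-adder a b)
  (make-flat-closure
   (lambda (self x)
     (+ x (closure-free self 0) (closure-free self 1)))
   a b))

(apply-closure (make-adder 10 20) 1)
;; evaluates to 31</pre></td></tr></table></div>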
<p>As usual, it boils down to memory versus speed. I guess I&#8217;ll leave it as is for now and worry more about speed once the compiler is even able to compile itself.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.artisancoder.com/2009/01/compiling-let/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Does the World need another Scheme system?</title>
		<link>http://www.artisancoder.com/2008/05/does-the-world-need-another-scheme-system/</link>
		<comments>http://www.artisancoder.com/2008/05/does-the-world-need-another-scheme-system/#comments</comments>
		<pubDate>Wed, 21 May 2008 19:27:32 +0000</pubDate>
		<dc:creator>Alex</dc:creator>
				<category><![CDATA[Lisp]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[compiler]]></category>
		<category><![CDATA[scheme]]></category>

		<guid isPermaLink="false">http://www.ventonegro.org/?p=9</guid>
		<description><![CDATA[I am currently reading the third chapter of Lisp in Small Pieces. It is really a wonderful book. By teaching how to implement Lisp, it teaches a lot about using the language too. Moreover, reading it sometimes I feel the urge that almost every Schemer has at one time or another felt: The urge to [...]]]></description>
			<content:encoded><![CDATA[<p>I am currently reading the third chapter of <a href="http://www.amazon.com/gp/product/0521545668/ref=s9qpick_c2_at3-2871_p?pf_rd_m=ATVPDKIKX0DER&#038;pf_rd_s=center-&#038;pf_rd_r=0E22QSHYB0WW14SABE40&#038;pf_rd_t=101&#038;pf_rd_p=278247301&#038;pf_rd_i=507846" title="Lisp in Small Pieces">Lisp in Small Pieces</a>. It is really a wonderful book. By teaching how to implement Lisp, it teaches a lot about using the language too. Moreover, reading it sometimes I feel the urge that almost every Schemer has at one time or another felt: The urge to write his own Scheme system.</p>
<p>OK, I know the World is already sick of Schemes; one look at this <a href="http://community.schemewiki.org/?scheme-faq-standards#implementations" title="What Scheme implementations are there?">list</a> shows why. But I still think I have to do it, if only for my personal learning. There is no need to bore the World to death with another half-done Scheme. Sometimes I think about features that could make my Scheme useful, though. They are probably never going to be implemented, but here they are:</p>
<h3>Full R5RS compliance</h3>
<p>The R5RS is a piece of art (R6RS, on the other hand, is a stain, a mess). Of course it does not describe a terribly useful system, but it beautifully describes the <strong>core</strong> of one. Everything else can then come from there. For more compliance with other systems, the SRFIs can be used as the basis for a standard library.</p>
<h3>Bytecode virtual machine</h3>
<p>Although Scheme is a dynamic language, there is a <em>lot</em> that can be known about the program before executing it: lexical variables, never-assigned variables, closure and continuation creation, etc. This can be processed at compile time, allowing for a simple and fast virtual machine. Interpreting the source code directly is easier but very inefficient for any serious system.</p>
<h3>Incremental garbage collection</h3>
<p>Stop-and-go garbage collectors are easier to implement, but in some applications the pause is unacceptable. For applications like interactive games, a smooth user experience can only be achieved with an incremental garbage collector. It would be nice if it were a moving collector too, to avoid too much fragmentation of the heap (some naive C programmers think the use of <tt>malloc</tt> and <tt>free</tt> can beat a garbage collector in any circumstances, until they are hit in the face by heap fragmentation. There is a reason the Apache web server uses memory pools). If I remember correctly, the CLR garbage collector is very good and compacts the memory without the use of from- and to-spaces, which is cool because it uses less heap memory. Hmm, while I am dreaming about my ideal garbage collector, make it concurrent too (see below).</p>
<h3>Concurrency</h3>
<p>These days the hot topic is concurrency. New personal computers with two and four cores sell more and more every day. The programming languages and environments that make it easy for the programmer to use these new cores are going to be big. <a href="http://www.erlang.org/" title="Erlang">Erlang</a> is the hottest thing right now in this regard, although <a href="http://www.haskell.org/" title="Haskell">Haskell</a> is a strong candidate too. In the mainstream, Java has a concurrent VM, but Java uses the old lock/unlock paradigm of other mainstream languages; <a href="http://clojure.sourceforge.net/" title="Clojure">Clojure</a> is the new language on top of the JVM that leverages that power for the programmer. But although this is all so hot, the only thing that seems to be happening to Scheme is <a href="http://toute.ca/" title="Termite">Termite</a>, which as far as I know can only use green threads. My hypothetical Scheme would have a concurrent VM; <a href="http://www.artisancoder.com/2008/05/concurrent-scheme/" title="Concurrent Scheme">some Scheme compilation techniques</a> already give a start in that direction. There would maybe be a <tt>spawn-continuation</tt> form that would take a continuation captured with <tt>call/cc</tt> and resume it in parallel with the current one, in an M:N model (M continuations spread over N OS threads).</p>
<h3>Just-in-time compilation</h3>
<p>Well, once dreaming, why not go all the way? One of the nice things about Scheme is the bottom-up, incremental development. Developing at the REPL gives instant feedback and eases debugging. What if it could give the speed of compiled-to-CPU languages as well? Another advantage is hot-swapping code in a running application. This can&#8217;t be done with Chicken or Gambit, which are fast but are Scheme-to-C compilers. Larceny, Chez and Ikarus can do this. By using a bytecode VM, the system is portable to wherever there is a C compiler, and by adding JIT support for some architectures it can be made fast too.</p>
<h3>Escape continuations</h3>
<p>Full, first-class continuations with indefinite extent are a powerful and mind-bending feature, but unfortunately they are usually hard to implement and cause pauses when invoked. Besides them, my hypothetical Scheme would have escape continuations. Instead of completely overwriting the execution stack, they simply unwind it to some recorded point, which can be made very fast. Exceptions would then be implemented as escape continuations rather than full ones.</p>
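<p>The usage pattern can already be written in standard Scheme. The sketch below uses the ordinary <tt>call-with-current-continuation</tt>, but only ever calls the continuation while its extent is still active, which is exactly the restricted use an escape continuation would support. The helpers <tt>find-first</tt> and <tt>try</tt> are hypothetical names for this example, not part of any existing system.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="scheme" style="font-family:monospace;">;; Early exit from a traversal: return is only called while
;; find-first is still on the stack, so it merely unwinds.
(define (find-first pred lst)
  (call-with-current-continuation
   (lambda (return)
     (for-each (lambda (x) (if (pred x) (return x))) lst)
     #f)))

(find-first even? '(1 3 4 5 6))
;; evaluates to 4

;; A tiny exception mechanism on the same principle: the body gets a
;; raise procedure that abandons the computation and yields the
;; handler's result as the value of the whole try expression.
(define (try body handler)
  (call-with-current-continuation
   (lambda (escape)
     (body (lambda (condition) (escape (handler condition)))))))

(try (lambda (raise) (+ 1 (raise 'oops)))
     (lambda (condition) (list 'caught condition)))
;; evaluates to (caught oops)</pre></td></tr></table></div>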
<h3>Easy FFI</h3>
<p>I am a long-time <a href="http://www.lua.org/" title="Lua">Lua</a> user, and it is amazing how good they managed to make the C API. Not only is it very easy to add new C/C++ libraries to Lua, it is also trivial to embed the Lua interpreter itself in a legacy application, or in an application that needs to be mostly in C/C++, such as a game. As Scheme, unfortunately, is not one of the World&#8217;s favorite languages, the ability to talk to the outside World is fundamental. And by being embeddable (which is the goal of Guile, but it sucks for Windows), it would be easier to take Scheme to other domains where it is not so popular today.</p>
<p>Well, I guess this is it. Thinking about it, there seems to be no Scheme system today with all these features. Will I have the skills to write something like this after LiSP? I doubt it; the garbage collector alone would take me ages. But it would be nice if such a thing existed&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.artisancoder.com/2008/05/does-the-world-need-another-scheme-system/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
