<p>ionel's codelog: How to run uWSGI, by Ionel Cristian Mărieș (published 2022-03-14, updated 2022-03-23).</p>
<p>Given the <a class="reference external" href="https://uwsgi-docs.readthedocs.io/en/latest/Options.html">cornucopia</a> of options uWSGI offers it's really hard to figure
out what options and settings are good for your typical web app.</p>
<p>Normally you'd just balk and run something simpler with fewer knobs and dials, like mod-wsgi with Apache, but alas, uWSGI is so flexible
and has so many features that mod-wsgi lacks. If only it weren't so tricky to configure...</p>
<p>First off, hands down, this is the most important setting - you should always start your configuration in strict mode. This will save you
lots of pain and suffering if you ever fiddle with options.</p>
<div class="highlight"><pre><span></span><span class="k">[uwsgi]</span><span class="w"></span>
<span class="c1"># Error on unknown options (prevents typos)</span><span class="w"></span>
<span class="na">strict</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
</pre></div>
<p>In general the most reliable concurrency model is processes, with no threads:</p>
<div class="highlight"><pre><span></span><span class="c1"># Formula: cores * 2 + 2</span><span class="w"></span>
<span class="na">processes</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">%(%k * 2 + 2)</span><span class="w"></span>
</pre></div>
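<p>As a rough sanity check, here's what that formula works out to in Python. It's only a sketch: <tt class="docutils literal">worker_count</tt> is a made-up helper, and it assumes <tt class="docutils literal">os.cpu_count()</tt> reports the same core count that uWSGI's <tt class="docutils literal">%k</tt> detects, which may not hold in a container with CPU limits.</p>

```python
import os


def worker_count(cores=None):
    """Return cores * 2 + 2, the same formula as the ini snippet above."""
    if cores is None:
        # os.cpu_count() can return None on exotic platforms, fall back to 1.
        cores = os.cpu_count() or 1
    return cores * 2 + 2
```

<p>For a 4 core box that gives 10 workers, so check that 10 copies of your app actually fit in memory.</p>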
<p>You could enable threads (the <tt class="docutils literal">threads</tt> option) and use fewer <tt class="docutils literal">processes</tt> but that can be problematic for code that is CPU-bound or not
thread-safe. I wouldn't enable the <a class="reference external" href="https://uwsgi-docs.readthedocs.io/en/latest/Gevent.html">gevent plugin</a> - you're just asking for
trouble with all that monkey-patching. Essentially, by sticking to processes you're using more memory to avoid certain problems.</p>
<p>Most of the useful uWSGI features rely on the master process, so it's a pretty mandatory option to have:</p>
<div class="highlight"><pre><span></span><span class="c1"># Most of uWSGI features depend on the master mode</span><span class="w"></span>
<span class="na">master</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
</pre></div>
<p>So now that we have a master process we can either load the application in the master one time or load it in every worker process. If
your project has lots of imports and things going on at import time, loading it once in the master is worth considering, but you need to be wary of how you
manage external resources (like connections, locks and whatnot).</p>
<p>Basically each worker would be a copy of the master process. While the memory is copy-on-write the resources probably aren't.</p>
<p>You can deal with shared FDs by marking them as close-on-exec. These options will make uWSGI mark all the FDs as COE before forking
a worker, and after forking uWSGI's internal FDs will also be COE (in case you ever want to call fork() in your crazy app).</p>
<div class="highlight"><pre><span></span><span class="c1"># Close fds on fork (don't allow subprocess to mess with parent's fds)</span><span class="w"></span>
<span class="na">close-on-exec</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="na">close-on-exec2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
</pre></div>
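<p>If you're curious what that flag looks like from the Python side, this minimal sketch reads and sets it with the stdlib <tt class="docutils literal">fcntl</tt> module. The helper names are made up for illustration, it's Unix-only, and it says nothing about how uWSGI sets the flag internally.</p>

```python
import fcntl


def is_close_on_exec(fd):
    """Return True if the descriptor will be closed by a successful exec()."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    return bool(flags & fcntl.FD_CLOEXEC)


def set_close_on_exec(fd):
    """Turn on FD_CLOEXEC while keeping any other descriptor flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
```

<p>Note that the flag only matters on exec: a plain fork() still hands the child a copy of every open descriptor.</p>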
<p>Locks can't be dealt with automatically. Well, the stdlib tries to, and even though there have been many bug-fixes for logging locks being
improperly shared after a fork, you can always get a very sticky surprise. So essentially you need to ask yourself what's more
important - speed or correctness.</p>
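<p>If you do end up loading the app before the fork, one defensive pattern is to tie each resource to the pid that created it and rebuild it when the pid changes. A minimal sketch, where <tt class="docutils literal">PerProcess</tt> is a hypothetical helper (not something uWSGI or the stdlib provides) and workers are assumed to be single-threaded:</p>

```python
import os


class PerProcess:
    """Build a resource lazily, once per process (assumes no threads)."""

    def __init__(self, factory):
        self._factory = factory
        self._pid = None
        self._resource = None

    def get(self):
        pid = os.getpid()
        if self._pid != pid:
            # First use in this process. Anything inherited from the
            # parent (socket, lock, connection) is dropped, not reused.
            self._resource = self._factory()
            self._pid = pid
        return self._resource
```

<p>You'd wrap something like a database connection factory in it and always go through <tt class="docutils literal">get()</tt>, so a worker never talks over a socket that the master opened.</p>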
<p>If you're prepared to have health checks and rolling deployments, you shouldn't care so much about server boot time - I'm pretty sure
correctness is what you want, thus you should make uWSGI import your code after it has started all the workers. Slower but safer:</p>
<div class="highlight"><pre><span></span><span class="c1"># In case there's some bad global state (pointless to use with need-app = true)</span><span class="w"></span>
<span class="na">lazy-apps</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
</pre></div>
<p>Otherwise, if you load the app before fork you might as well just make the service fail if it can't load the app at all.
You can probably avoid implementing fancy health checks by just using this:</p>
<div class="highlight"><pre><span></span><span class="c1"># Exit if no app can be loaded (pointless to use with lazy-apps = true)</span><span class="w"></span>
<span class="na">need-app</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
</pre></div>
<p>You still need to have threading enabled most of the time, for example if you
use <a class="reference external" href="https://docs.sentry.io/clients/python/advanced/#a-note-on-uwsgi">Sentry</a>:</p>
<div class="highlight"><pre><span></span><span class="c1"># Enable threads for sentry, see:</span><span class="w"></span>
<span class="c1"># https://docs.sentry.io/clients/python/advanced/#a-note-on-uwsgi</span><span class="w"></span>
<span class="na">enable-threads</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
</pre></div>
<p>Assuming you want to run a single project certain things can be disabled:</p>
<div class="highlight"><pre><span></span><span class="c1"># Avoid multiple interpreters (automatically created in case you need mounts)</span><span class="w"></span>
<span class="na">single-interpreter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
</pre></div>
<p>Even if you don't run your app in a Docker container this is a good thing to do. Strangely uWSGI doesn't do this by default - a consequence
of having too many features and use-cases I guess...</p>
<div class="highlight"><pre><span></span><span class="c1"># Respect SIGTERM and do shutdown instead of reload</span><span class="w"></span>
<span class="na">die-on-term</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
</pre></div>
<p>The preferred way to load your app should be <tt class="docutils literal">module</tt> as it forces you to get your application imported correctly.
If you want to keep the configuration file generic you can use an environment variable, example:</p>
<div class="highlight"><pre><span></span><span class="c1"># WSGI module (application callable expected inside)</span><span class="w"></span>
<span class="na">module</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">$(DJANGO_PROJECT_NAME).wsgi</span><span class="w"></span>
</pre></div>
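<p>For reference, this is roughly the least a module needs to contain for that option to work. It's a plain WSGI sketch rather than a real Django <tt class="docutils literal">wsgi.py</tt>, and the response body is obviously made up. As far as I know uWSGI looks for a callable named <tt class="docutils literal">application</tt> unless told otherwise.</p>

```python
def application(environ, start_response):
    """Smallest possible WSGI app: one plain-text response for every path."""
    body = b"Hello from uWSGI\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

<p>If importing the module from a Python shell in your project directory fails, uWSGI won't be able to load it either.</p>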
<p>A bit of process management is necessary most of the time:</p>
<div class="highlight"><pre><span></span><span class="c1"># Respawn processes that take more than ... seconds</span><span class="w"></span>
<span class="na">harakiri</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">300</span><span class="w"></span>
<span class="na">harakiri-verbose</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="c1"># Respawn processes after serving ... requests</span><span class="w"></span>
<span class="na">max-requests</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">5000</span><span class="w"></span>
<span class="c1"># Respawn if processes are bloated</span><span class="w"></span>
<span class="na">reload-on-as</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">1024</span><span class="w"></span>
<span class="na">reload-on-rss</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">512</span><span class="w"></span>
<span class="c1"># We don't expect abuse so let's have fastest respawn possible</span><span class="w"></span>
<span class="na">forkbomb-delay</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">0</span><span class="w"></span>
</pre></div>
<p>I wouldn't use the evil reload variants (<tt class="docutils literal"><span class="pre">evil-reload-on-rss</span></tt> and <tt class="docutils literal"><span class="pre">evil-reload-on-as</span></tt>) as they will kill your workers at unexpected
points and that job is better left to the Linux OOM killer anyway.</p>
<p>Assuming you'll have an Nginx frontend, the best way to connect them is via a Unix domain socket - it has the lowest overhead, and well, it's
better to have a file with the wrong perms than a port open on the wrong interface. Assuming you'll start uWSGI as root:</p>
<div class="highlight"><pre><span></span><span class="c1"># Assuming we start from root we need to create the socket way early</span><span class="w"></span>
<span class="na">shared-socket</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">/var/run/app.uwsgi</span><span class="w"></span>
<span class="na">chmod-socket</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">666</span><span class="w"></span>
<span class="na">socket</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">=0</span><span class="w"></span>
<span class="c1"># Change user after binding the socket</span><span class="w"></span>
<span class="na">uid</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">app</span><span class="w"></span>
<span class="na">gid</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">app</span><span class="w"></span>
</pre></div>
<p>In Nginx all you need is something along these lines.</p>
<div class="highlight"><pre><span></span><span class="k">http</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="c1"># Some fine-tuning</span>
<span class="w"> </span><span class="kn">client_max_body_size</span><span class="w"> </span><span class="mi">10m</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="kn">client_body_buffer_size</span><span class="w"> </span><span class="mi">64k</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="kn">large_client_header_buffers</span><span class="w"> </span><span class="mi">8</span><span class="w"> </span><span class="mi">32k</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="kn">server</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="kn">location</span><span class="w"> </span><span class="s">/</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w"> </span><span class="kn">include</span><span class="w"> </span><span class="s">/etc/nginx/uwsgi_params</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="kn">uwsgi_pass</span><span class="w"> </span><span class="s">unix:/var/run/app.uwsgi</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="kn">uwsgi_ignore_client_abort</span><span class="w"> </span><span class="no">on</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="kn">uwsgi_next_upstream</span><span class="w"> </span><span class="no">off</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="kn">uwsgi_read_timeout</span><span class="w"> </span><span class="mi">300</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="c1"># Prevent nginx discarding large responses.</span>
<span class="w"> </span><span class="kn">uwsgi_buffering</span><span class="w"> </span><span class="no">on</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="c1"># Initial response size (practically headers size)</span>
<span class="w"> </span><span class="kn">uwsgi_buffer_size</span><span class="w"> </span><span class="mi">64k</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="kn">uwsgi_buffers</span><span class="w"> </span><span class="mi">8</span><span class="w"> </span><span class="mi">32k</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="p">}</span><span class="w"></span>
<span class="w"> </span><span class="p">}</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
</pre></div>
<p>Why do we need all these buffer tweaks and limits you wonder? Well you should strive for compatibility and resilience:</p>
<ul class="simple">
<li>Allow requests with lots of cookies, should you need to have cookie session storage.
That means big headers thus we increase some buffer sizes.</li>
<li>Disallow really large uploads. Most apps don't need to take file uploads larger than 10 MB so that's a good default.</li>
<li>Prevent getting DoS-ed by slow-client type of attacks like <a class="reference external" href="https://en.wikipedia.org/wiki/Slowloris_(computer_security)">Slowloris</a> or <a class="reference external" href="https://en.wikipedia.org/wiki/R-U-Dead-Yet">RUDY</a>.
That means the frontend needs to buffer the request body - an acceptable trade-off if we also have a request body limit.</li>
</ul>
<p>With those settings you should fare pretty well, but you should always test anyway -
<a class="reference external" href="https://github.com/shekyan/slowhttptest">slowhttptest</a> is available as a Fedora and Ubuntu package.</p>
<p>Note that each worker will access the socket directly (call <tt class="docutils literal">accept()</tt> on that socket) regardless of protocol (TCP or UDS) thus
some workloads won't be evenly distributed to the uWSGI workers. So if you have an application that has some slow views and some fast views
a good option to consider is this:</p>
<div class="highlight"><pre><span></span><span class="c1"># Enable an accept mutex for a more balanced worker load</span><span class="w"></span>
<span class="na">thunder-lock</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
</pre></div>
<p>You're essentially trading a bit of throughput and minimum latency for way better maximum latency.
Read more about it <a class="reference external" href="https://uwsgi-docs.readthedocs.io/en/latest/articles/SerializingAccept.html">here</a>.</p>
<p>Other useful options:</p>
<div class="highlight"><pre><span></span><span class="c1"># Good for debugging/development</span><span class="w"></span>
<span class="na">auto-procname</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="na">log-5xx</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="na">log-zero</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="na">log-slow</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">1000</span><span class="w"></span>
<span class="na">log-date</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">[%%Y-%%m-%%d %%H:%%M:%%S]</span><span class="w"></span>
<span class="na">log-format</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">%(ftime) "%(method) %(uri)" %(status) %(rsize)+%(hsize) in %(msecs)ms pid:%(pid) worker:%(wid) core:%(core)</span><span class="w"></span>
<span class="na">log-format-strftime</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">[%%Y-%%m-%%d %%H:%%M:%%S]</span><span class="w"></span>
<span class="c1"># Enable the stats service for uwsgitop, pip install uwsgitop, and run:</span><span class="w"></span>
<span class="c1"># uwsgitop /var/run/app.stats</span><span class="w"></span>
<span class="na">stats</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">/var/run/app.stats</span><span class="w"></span>
</pre></div>
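<p>A nice side effect of a fixed <tt class="docutils literal"><span class="pre">log-format</span></tt> is that the request log becomes easy to parse. Here's a minimal sketch that matches the exact format above. The function name is hypothetical and it assumes the URI contains no spaces.</p>

```python
import re

# Mirrors: %(ftime) "%(method) %(uri)" %(status) %(rsize)+%(hsize)
#          in %(msecs)ms pid:%(pid) worker:%(wid) core:%(core)
LOG_LINE = re.compile(
    r'^\[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<uri>\S+)" (?P<status>\d{3}) '
    r"(?P<rsize>\d+)\+(?P<hsize>\d+) in (?P<msecs>\d+)ms "
    r"pid:(?P<pid>\d+) worker:(?P<wid>\d+) core:(?P<core>\d+)$"
)

NUMERIC_FIELDS = ("status", "rsize", "hsize", "msecs", "pid", "wid", "core")


def parse_log_line(line):
    """Return a dict for a request log line, or None if it doesn't match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    for name in NUMERIC_FIELDS:
        record[name] = int(record[name])
    return record
```

<p>Lines that aren't request logs (startup messages, tracebacks) simply come back as <tt class="docutils literal">None</tt>, so you can filter on <tt class="docutils literal">msecs</tt> or <tt class="docutils literal">status</tt> without extra cleanup.</p>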
<p>Another problem that you might care about, especially if you got used to <tt class="docutils literal">apachectl <span class="pre">-k</span> graceful</tt> is, well, waiting for pending requests
at shutdown. uWSGI just kills all the workers by default. You can enable graceful shutdown by having
<a class="reference external" href="https://github.com/unbit/uwsgi/issues/849#issuecomment-118869386">this hook</a>:</p>
<div class="highlight"><pre><span></span><span class="c1"># See: https://github.com/unbit/uwsgi/issues/849#issuecomment-118869386</span><span class="w"></span>
<span class="c1"># Note that SIGTERM is 15 not 1 :-)</span><span class="w"></span>
<span class="na">hook-master-start</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">unix_signal:15 gracefully_kill_them_all</span><span class="w"></span>
</pre></div>
<p>Note that it would make uWSGI always do a graceful shutdown, and you should always have <tt class="docutils literal">harakiri</tt> enabled if you use this. Otherwise
shutdowns and restarts can get stuck.</p>
<p>Another way to do this is to use the master fifo and send a graceful shutdown command, e.g.:</p>
<div class="highlight"><pre><span></span><span class="c1"># For graceful shutdown you can run: echo q > /var/run/fifo.uwsgi</span><span class="w"></span>
<span class="na">master-fifo</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">/var/run/fifo.uwsgi</span><span class="w"></span>
</pre></div>
<p>You can also use this method to do a brutal shutdown/restart and
<a class="reference external" href="https://uwsgi-docs.readthedocs.io/en/latest/MasterFIFO.html">other things</a>.</p>
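<p>If you'd rather drive the fifo from a deploy script than from a shell, something along these lines should do. It's a sketch: <tt class="docutils literal">send_fifo_command</tt> is a made-up helper, and the path is whatever you passed to <tt class="docutils literal"><span class="pre">master-fifo</span></tt>.</p>

```python
import os


def send_fifo_command(fifo_path, command):
    """Write a single-character command (like "q") to the master fifo."""
    if len(command) != 1:
        raise ValueError("master fifo commands are single characters")
    # O_NONBLOCK makes open() fail with OSError instead of hanging forever
    # when nothing has the fifo open for reading (uWSGI isn't running).
    fd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
    try:
        os.write(fd, command.encode("ascii"))
    finally:
        os.close(fd)
```

<p>With the config above, <tt class="docutils literal"><span class="pre">send_fifo_command("/var/run/fifo.uwsgi",</span> "q")</tt> is the equivalent of the echo one-liner.</p>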
<div class="section" id="but-what-if-i-don-t-want-to-run-nginx">
<h2>But what if I don't want to run Nginx?<a class="headerlink" href="#but-what-if-i-don-t-want-to-run-nginx" title="Permalink to this headline">
*</a></h2>
<p>uWSGI certainly makes this possible but alas, it also makes it very hard to get it right. Remember that we need the frontend to protect
the workers from abusive clients?</p>
<p>You'd think that running an <a class="reference external" href="https://uwsgi-docs.readthedocs.io/en/latest/HTTP.html">HTTP router</a> (the <tt class="docutils literal">http</tt> option) as opposed to
having the workers serve HTTP directly (the <tt class="docutils literal"><span class="pre">http-socket</span></tt> option) would protect from <a class="reference external" href="https://en.wikipedia.org/wiki/Slowloris_(computer_security)">Slowloris</a> or <a class="reference external" href="https://en.wikipedia.org/wiki/R-U-Dead-Yet">RUDY</a> (slow request body attack) but you'd be
very wrong.</p>
<p>You can easily test this by running <tt class="docutils literal">slowhttptest <span class="pre">-B</span></tt>. It fails quickly all while Nginx runs like a champ. So is there a way to solve
this? Or, how ugly is it? Funnily enough it's possible, and yes it's ugly and contrived:</p>
<div class="highlight"><pre><span></span><span class="c1"># Same setup as before, allow starting as root and changing user later by using a shared socket</span><span class="w"></span>
<span class="na">shared-socket</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">/var/run/app.uwsgi</span><span class="w"></span>
<span class="na">chmod-socket</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">666</span><span class="w"></span>
<span class="na">uwsgi-socket</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">=0</span><span class="w"></span>
<span class="c1"># This is how a request runs with this setup:</span><span class="w"></span>
<span class="c1"># http request -> http router -> fastrouter -> worker</span><span class="w"></span>
<span class="na">http-to</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">/var/run/app.router</span><span class="w"></span>
<span class="na">http</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">:8000</span><span class="w"></span>
<span class="na">fastrouter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">/var/run/app.router</span><span class="w"></span>
<span class="na">fastrouter-use-pattern</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">/var/run/app.uwsgi</span><span class="w"></span>
<span class="c1"># Buffer in-memory up to 64kb</span><span class="w"></span>
<span class="na">fastrouter-post-buffering</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">%(64 * 1024)</span><span class="w"></span>
<span class="c1"># 10Mb request body limit</span><span class="w"></span>
<span class="na">limit-post</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">%(10 * 1024 * 1024)</span><span class="w"></span>
</pre></div>
<p>It can't be simpler because the <tt class="docutils literal"><span class="pre">post-buffering</span></tt> option (necessary to prevent the workers getting hosed up by slow requests) doesn't apply
to the http router - it applies to the worker. There's no <tt class="docutils literal"><span class="pre">http-post-buffering</span></tt> option thus the only choice is to have the fastrouter as
the buffering middleman.</p>
<p>Note that it's best to leave <tt class="docutils literal"><span class="pre">fastrouter-post-buffering</span></tt> to a small value as buffer handling
<a class="reference external" href="https://github.com/unbit/uwsgi/blob/2.0.20/core/buffer.c#L20">isn't very</a>
<a class="reference external" href="https://github.com/unbit/uwsgi/blob/2.0.20/plugins/http/http.c#L644">well done</a> in uWSGI.</p>
<p>Likely you'll need to serve static files as well:</p>
<div class="highlight"><pre><span></span><span class="na">static-map</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">/static=/var/www/static</span><span class="w"></span>
<span class="c1"># Expire after 24h</span><span class="w"></span>
<span class="na">static-expires</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">.* %(24 * 60 * 60)</span><span class="w"></span>
<span class="na">static-gzip-all</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
</pre></div>
<p>The one tricky bit is the <tt class="docutils literal"><span class="pre">static-gzip-all</span></tt> option - uWSGI doesn't gzip on the fly - it expects .gz files around. There's a really easy
way to build them using <a class="reference external" href="https://pypi.org/project/whitenoise/">whitenoise</a>. Either run <tt class="docutils literal">python <span class="pre">-m</span> whitenoise.compress</tt> or use this
Django setting:</p>
<div class="highlight"><pre><span></span><span class="c1"># This automatically creates a .gz file for each static file</span>
<span class="n">STATICFILES_STORAGE</span> <span class="o">=</span> <span class="s2">"whitenoise.storage.CompressedStaticFilesStorage"</span>
</pre></div>
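<p>If you can't pull in whitenoise for some reason, the same idea can be approximated with the stdlib. To be clear, this is a simplified stand-in and not what whitenoise does internally. The function name and the extension list are assumptions.</p>

```python
import gzip
import shutil
from pathlib import Path


def precompress(root, extensions=(".css", ".js", ".svg", ".html", ".txt")):
    """Write a sibling .gz file for every matching file under root."""
    created = []
    # sorted() snapshots the listing first, so new .gz files aren't revisited.
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in extensions:
            continue
        target = path.with_name(path.name + ".gz")
        with path.open("rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        created.append(target)
    return created
```

<p>You'd run it against the static root (<tt class="docutils literal">/var/www/static</tt> in the example above) after collecting static files and before starting uWSGI.</p>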
<p>Now you might wonder why not also gzip responses. There are two ways of doing it - <strong>both problematic</strong>:</p>
<ul>
<li><p class="first">Use <tt class="docutils literal"><span class="pre">http-auto-gzip</span></tt> like in this <a class="reference external" href="https://ugu.readthedocs.io/en/latest/compress.html">uWSGI guide</a>. Note that:</p>
<ul class="simple">
<li>You have to stop sending <tt class="docutils literal"><span class="pre">Content-Length</span></tt> from your application. You'll end up implementing middleware that removes the
<tt class="docutils literal"><span class="pre">Content-Length</span></tt> that <tt class="docutils literal">django.middleware.common.CommonMiddleware</tt> adds. No, you should not just remove <tt class="docutils literal">CommonMiddleware</tt> for
obvious reasons.</li>
<li>The <tt class="docutils literal"><span class="pre">uWSGI-Encoding</span></tt> header is not removable with this technique
(<tt class="docutils literal"><span class="pre">response-route-run</span> = <span class="pre">delheader:uWSGI-Encoding</span></tt> doesn't actually work).</li>
<li>You cannot tweak the compression level (it's hardcoded at <tt class="docutils literal">9</tt> - not really that efficient CPU-wise).</li>
</ul>
<p>Here's an example that would work in general, with the aforementioned tradeoffs:</p>
<div class="highlight"><pre><span></span><span class="c1"># I wouldn't copy this...</span><span class="w"></span>
<span class="na">http-auto-gzip</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="na">collect-header</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">Content-Type RESPONSE_CONTENT_TYPE</span><span class="w"></span>
<span class="na">response-route-if</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">equal:${RESPONSE_CONTENT_TYPE};application/json addheader:uWSGI-Encoding: gzip</span><span class="w"></span>
<span class="na">response-route-if</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">startswith:${RESPONSE_CONTENT_TYPE};text/ addheader:uWSGI-Encoding: gzip</span><span class="w"></span>
</pre></div>
</li>
<li><p class="first">Use <a class="reference external" href="https://uwsgi-docs.readthedocs.io/en/latest/Transformations.html">transformations</a>.
Although this approach is a bit more flexible, you still cannot tweak the compression level
(same hardcode at <tt class="docutils literal">9</tt> - inefficient CPU-wise) and it's more complex as you can see:</p>
<div class="highlight"><pre><span></span><span class="na">collect-header</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">Content-Type RESPONSE_CONTENT_TYPE</span><span class="w"></span>
<span class="na">collect-header</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">Content-Length RESPONSE_CONTENT_LENGTH</span><span class="w"></span>
<span class="c1"># uWSGI internals are not that smart, thus no content-length means it's 0</span><span class="w"></span>
<span class="na">response-route-if</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">empty:${RESPONSE_CONTENT_LENGTH} goto:no-length</span><span class="w"></span>
<span class="c1"># Don't bother compressing 1kb responses, not worth the trouble</span><span class="w"></span>
<span class="na">response-route-if</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">islower:${RESPONSE_CONTENT_LENGTH};1024 last:</span><span class="w"></span>
<span class="na">response-route-label</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">no-length</span><span class="w"></span>
<span class="c1"># Make sure the client actually wants gzip</span><span class="w"></span>
<span class="na">response-route-if</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">contains:${HTTP_ACCEPT_ENCODING};gzip goto:check-response</span><span class="w"></span>
<span class="na">response-route-run</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">last:</span><span class="w"></span>
<span class="na">response-route-label</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">check-response</span><span class="w"></span>
<span class="c1"># Don't bother compressing non-text stuff, usually not worth it</span><span class="w"></span>
<span class="na">response-route-if</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">equal:${RESPONSE_CONTENT_TYPE};application/json goto:apply-gzip</span><span class="w"></span>
<span class="na">response-route-if</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">startswith:${RESPONSE_CONTENT_TYPE};text/ goto:apply-gzip</span><span class="w"></span>
<span class="na">response-route-run</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">last:</span><span class="w"></span>
<span class="na">response-route-label</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">apply-gzip</span><span class="w"></span>
<span class="na">response-route-run</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">gzip:</span><span class="w"></span>
<span class="c1"># Why apply this filter too you wonder? The gzip transformation is not smart</span><span class="w"></span>
<span class="c1"># enough to chunk the body or set a Content-Length, thus keepalive will be broken</span><span class="w"></span>
<span class="na">http-auto-chunked</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
</pre></div>
<p>Previously this blog post had <tt class="docutils literal"><span class="pre">response-route-run</span> = chunked:</tt> but it appears that <tt class="docutils literal"><span class="pre">http-auto-chunked</span></tt> performs better.</p>
</li>
</ul>
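<p>For the <tt class="docutils literal"><span class="pre">http-auto-gzip</span></tt> route, the middleware that strips <tt class="docutils literal"><span class="pre">Content-Length</span></tt> could look something like this. It's a sketch in the usual Django middleware shape with a made-up class name. I'm assuming it's listed before <tt class="docutils literal">CommonMiddleware</tt> in <tt class="docutils literal">MIDDLEWARE</tt>, so that it sees the response after <tt class="docutils literal">CommonMiddleware</tt> has added the header.</p>

```python
class StripContentLengthMiddleware:
    """Remove Content-Length so the server is free to re-encode the body."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)
        # Django responses support "in" and "del" for header names.
        if "Content-Length" in response:
            del response["Content-Length"]
        return response
```

<p>It has no Django imports, so you can exercise it with any dict-like fake response before wiring it into a project.</p>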
</div>
<div class="section" id="tl-dr">
<h2>TL;DR<a class="headerlink" href="#tl-dr" title="Permalink to this headline">
*</a></h2>
<p>I just want to run uWSGI standalone, just give me my copy-pasta config or I'll copy something really bad from SO!</p>
<p>🙄</p>
<div class="highlight"><pre><span></span><span class="k">[uwsgi]</span><span class="w"></span>
<span class="c1"># Error on unknown options (prevents typos)</span><span class="w"></span>
<span class="na">strict</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="c1"># Formula: cores * 2 + 2</span><span class="w"></span>
<span class="na">processes</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">%(%k * 2 + 2)</span><span class="w"></span>
<span class="c1"># Most of uWSGI features depend on the master mode</span><span class="w"></span>
<span class="na">master</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="c1"># Close fds on fork (don't allow subprocess to mess with parent's fds)</span><span class="w"></span>
<span class="na">close-on-exec</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="na">close-on-exec2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="c1"># In case there's some bad global state (pointless to use with need-app = true)</span><span class="w"></span>
<span class="na">lazy-apps</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="c1"># Enable threads for sentry, see:</span><span class="w"></span>
<span class="c1"># https://docs.sentry.io/clients/python/advanced/#a-note-on-uwsgi</span><span class="w"></span>
<span class="na">enable-threads</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="c1"># Avoid multiple interpreters (automatically created in case you need mounts)</span><span class="w"></span>
<span class="na">single-interpreter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="c1"># Respect SIGTERM and do shutdown instead of reload</span><span class="w"></span>
<span class="na">die-on-term</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="c1"># See: https://github.com/unbit/uwsgi/issues/849#issuecomment-118869386</span><span class="w"></span>
<span class="c1"># Note that SIGTERM is 15 not 1 :-)</span><span class="w"></span>
<span class="na">hook-master-start</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">unix_signal:15 gracefully_kill_them_all</span><span class="w"></span>
<span class="c1"># All the commands: https://uwsgi-docs.readthedocs.io/en/latest/MasterFIFO.html</span><span class="w"></span>
<span class="na">master-fifo</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">/var/run/app.fifo</span><span class="w"></span>
<span class="c1"># Respawn processes that take more than ... seconds</span><span class="w"></span>
<span class="na">harakiri</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">300</span><span class="w"></span>
<span class="na">harakiri-verbose</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="c1"># Respawn processes after serving ... requests</span><span class="w"></span>
<span class="na">max-requests</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">5000</span><span class="w"></span>
<span class="c1"># Respawn if processes are bloated</span><span class="w"></span>
<span class="na">reload-on-as</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">1024</span><span class="w"></span>
<span class="na">reload-on-rss</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">512</span><span class="w"></span>
<span class="c1"># We don't expect abuse so let's have the fastest respawn possible</span><span class="w"></span>
<span class="na">forkbomb-delay</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">0</span><span class="w"></span>
<span class="c1"># Enable an accept mutex for a more balanced worker load</span><span class="w"></span>
<span class="na">thunder-lock</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="c1"># Good for debugging/development</span><span class="w"></span>
<span class="na">auto-procname</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="na">log-5xx</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="na">log-zero</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="na">log-slow</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">1000</span><span class="w"></span>
<span class="na">log-date</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">[%%Y-%%m-%%d %%H:%%M:%%S]</span><span class="w"></span>
<span class="na">log-format</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">%(ftime) "%(method) %(uri)" %(status) %(rsize)+%(hsize) in %(msecs)ms pid:%(pid) worker:%(wid) core:%(core)</span><span class="w"></span>
<span class="na">log-format-strftime</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">[%%Y-%%m-%%d %%H:%%M:%%S]</span><span class="w"></span>
<span class="c1"># Enable the stats service for uwsgitop, pip install uwsgitop, and run:</span><span class="w"></span>
<span class="c1"># uwsgitop /var/run/app.stats</span><span class="w"></span>
<span class="na">stats</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">/var/run/app.stats</span><span class="w"></span>
<span class="c1"># Same setup as before, allow starting as root and changing user later by using a shared socket</span><span class="w"></span>
<span class="na">shared-socket</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">/var/run/app.uwsgi</span><span class="w"></span>
<span class="na">chmod-socket</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">666</span><span class="w"></span>
<span class="na">uwsgi-socket</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">=0</span><span class="w"></span>
<span class="c1"># Change user after binding the socket</span><span class="w"></span>
<span class="na">uid</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">app</span><span class="w"></span>
<span class="na">gid</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">app</span><span class="w"></span>
<span class="c1"># This is how a request runs with this setup:</span><span class="w"></span>
<span class="c1"># http request -> http router -> fastrouter -> worker</span><span class="w"></span>
<span class="na">http-to</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">/var/run/app.router</span><span class="w"></span>
<span class="na">http</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">:8000</span><span class="w"></span>
<span class="na">fastrouter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">/var/run/app.router</span><span class="w"></span>
<span class="na">fastrouter-use-pattern</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">/var/run/app.uwsgi</span><span class="w"></span>
<span class="c1"># Buffer in-memory up to 64kb</span><span class="w"></span>
<span class="na">fastrouter-post-buffering</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">%(64 * 1024)</span><span class="w"></span>
<span class="c1"># 10Mb request body limit</span><span class="w"></span>
<span class="na">limit-post</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">%(10 * 1024 * 1024)</span><span class="w"></span>
<span class="na">static-map</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">/static=/var/www/static</span><span class="w"></span>
<span class="c1"># Expire after 24h</span><span class="w"></span>
<span class="na">static-expires</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">.* %(24 * 60 * 60)</span><span class="w"></span>
<span class="c1"># Don't forget to run python -m whitenoise.compress or similar!</span><span class="w"></span>
<span class="na">static-gzip-all</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
<span class="c1"># Apply conditional gzip encoding</span><span class="w"></span>
<span class="na">collect-header</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">Content-Type RESPONSE_CONTENT_TYPE</span><span class="w"></span>
<span class="na">collect-header</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">Content-Length RESPONSE_CONTENT_LENGTH</span><span class="w"></span>
<span class="c1"># uWSGI internals are not that smart, thus no content-length means it's 0</span><span class="w"></span>
<span class="na">response-route-if</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">empty:${RESPONSE_CONTENT_LENGTH} goto:no-length</span><span class="w"></span>
<span class="c1"># Don't bother compressing 1kb responses, not worth the trouble</span><span class="w"></span>
<span class="na">response-route-if</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">islower:${RESPONSE_CONTENT_LENGTH};1024 last:</span><span class="w"></span>
<span class="na">response-route-label</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">no-length</span><span class="w"></span>
<span class="c1"># Make sure the client actually wants gzip</span><span class="w"></span>
<span class="na">response-route-if</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">contains:${HTTP_ACCEPT_ENCODING};gzip goto:check-response</span><span class="w"></span>
<span class="na">response-route-run</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">last:</span><span class="w"></span>
<span class="na">response-route-label</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">check-response</span><span class="w"></span>
<span class="c1"># Don't bother compressing non-text stuff, usually not worth it</span><span class="w"></span>
<span class="na">response-route-if</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">equal:${RESPONSE_CONTENT_TYPE};application/json goto:apply-gzip</span><span class="w"></span>
<span class="na">response-route-if</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">startswith:${RESPONSE_CONTENT_TYPE};text/ goto:apply-gzip</span><span class="w"></span>
<span class="na">response-route-run</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">last:</span><span class="w"></span>
<span class="na">response-route-label</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">apply-gzip</span><span class="w"></span>
<span class="na">response-route-run</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">gzip:</span><span class="w"></span>
<span class="c1"># Why apply this filter too you wonder? The gzip transformation is not smart</span><span class="w"></span>
<span class="c1"># enough to chunk the body or set a Content-Length, thus keepalive will be broken</span><span class="w"></span>
<span class="na">http-auto-chunked</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">true</span><span class="w"></span>
</pre></div>
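<p>If the gzip routing rules above are hard to follow, here is a rough Python model of the decision they encode. It's only a sketch of how I read those rules: <tt class="docutils literal">should_gzip</tt> is a made-up helper (nothing in uWSGI calls it) and the arguments are assumed to be the raw header values as strings.</p>

```python
def should_gzip(content_length, accept_encoding, content_type):
    """Model of the response-route rules; all arguments are header strings."""
    # "empty:" jumps to the no-length label, which skips the size check.
    if content_length:
        # "islower:...;1024 last:" stops routing for small responses.
        if int(content_length) < 1024:
            return False
    # "contains:${HTTP_ACCEPT_ENCODING};gzip", otherwise "last:".
    if "gzip" not in accept_encoding:
        return False
    # Assumption: "equal:" is an exact string match, so a value such as
    # "application/json; charset=utf-8" would not be compressed.
    return content_type == "application/json" or content_type.startswith("text/")


print(should_gzip("2048", "gzip, deflate", "text/html"))
print(should_gzip("512", "gzip", "text/html"))
```

<p>Note the first branch: a response without a Content-Length skips the 1kb check and can still be compressed.</p>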
<div class="section" id="addendum">
<h3>Addendum<a class="headerlink" href="#addendum" title="Permalink to this headline">
*</a></h3>
<p>Note that this example will make uWSGI create several files in <tt class="docutils literal">/var/run</tt>, so that directory should be writable by the <tt class="docutils literal">app</tt> user.</p>
</div>
</div>
Speeding up Django pagination2020-02-02T00:00:00+02:002020-02-02T00:00:00+02:00Ionel Cristian Mărieștag:blog.ionelmc.ro,2020-02-02:/2020/02/02/speeding-up-django-pagination/<p>I assume you have already read <a class="reference external" href="https://hakibenita.com/optimizing-the-django-admin-paginator">Optimizing the Django Admin Paginator</a>. If
not, this is basically the take-away from that article:</p>
<div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">InfinityPaginator</span><span class="p">(</span><span class="n">Paginator</span><span class="p">):</span>
    <span class="nd">@property</span>
    <span class="k">def</span> <span class="nf">count</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="mi">99999999999</span>


<span class="k">class</span> <span class="nc">MyAdmin</span><span class="p">(</span><span class="n">admin</span><span class="o">.</span><span class="n">ModelAdmin</span><span class="p">):</span>
    <span class="n">paginator</span> <span class="o">=</span> <span class="n">InfinityPaginator</span>
    <span class="n">show_full_result_count</span> <span class="o">=</span> <span class="kc">False</span>
</pre></div>
<p>Though the article has a trick using a <a class="reference external" href="https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-STATEMENT-TIMEOUT">statement_timeout</a>, I think it's pointless. In the real world you
should expect to get that blatant 99999999999 count all over the place. Unless you have some sort of toy project it's very likely your
database will be under load. Add some user/group filtering and you'll always hit the time limit.</p>
<p>What if you could make the count more realistic, but still cheap? Using a random number would be too inconsistent. Strangely enough, someone
decided that it's a good idea to put a <a class="reference external" href="https://wiki.postgresql.org/wiki/Count_estimate">count estimate</a> idea in the PostgreSQL wiki and,
for reasons, I decided to see how hard it is to implement in Django, in a somewhat generalized fashion.</p>
<p>From a series of "<a class="reference external" href="https://blog.ionelmc.ro/presentations/just-because/">Just because you can, you have to try it!</a>", behold
<a class="footnote-reference" href="#the-drawback" id="footnote-reference-1">[1]</a>:</p>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">logging</span>

<span class="kn">from</span> <span class="nn">django.db</span> <span class="kn">import</span> <span class="n">connections</span><span class="p">,</span> <span class="n">models</span>
<span class="kn">from</span> <span class="nn">django.db.models</span> <span class="kn">import</span> <span class="n">Count</span>

<span class="n">logger</span> <span class="o">=</span> <span class="n">logging</span><span class="o">.</span><span class="n">getLogger</span><span class="p">(</span><span class="vm">__name__</span><span class="p">)</span>


<span class="k">class</span> <span class="nc">EstimatedQuerySet</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">QuerySet</span><span class="p">):</span>
    <span class="n">estimate_bias</span> <span class="o">=</span> <span class="mf">1.2</span>
    <span class="n">estimate_threshold</span> <span class="o">=</span> <span class="mi">100</span>

    <span class="k">def</span> <span class="nf">estimated_count</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_result_cache</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
            <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">count</span><span class="p">()</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="n">qs</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">_base_manager</span><span class="o">.</span><span class="n">all</span><span class="p">()</span>
            <span class="n">compiler</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">query</span><span class="o">.</span><span class="n">get_compiler</span><span class="p">(</span><span class="s1">'default'</span><span class="p">)</span>
            <span class="n">where</span><span class="p">,</span> <span class="n">params</span> <span class="o">=</span> <span class="n">compiler</span><span class="o">.</span><span class="n">compile</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">query</span><span class="o">.</span><span class="n">where</span><span class="p">)</span>
            <span class="n">qs</span> <span class="o">=</span> <span class="n">qs</span><span class="o">.</span><span class="n">extra</span><span class="p">(</span><span class="n">where</span><span class="o">=</span><span class="p">[</span><span class="n">where</span><span class="p">]</span> <span class="k">if</span> <span class="n">where</span> <span class="k">else</span> <span class="kc">None</span><span class="p">,</span> <span class="n">params</span><span class="o">=</span><span class="n">params</span><span class="p">)</span>
            <span class="n">cursor</span> <span class="o">=</span> <span class="n">connections</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">db</span><span class="p">]</span><span class="o">.</span><span class="n">cursor</span><span class="p">()</span>
            <span class="n">query</span> <span class="o">=</span> <span class="n">qs</span><span class="o">.</span><span class="n">query</span><span class="o">.</span><span class="n">clone</span><span class="p">()</span>
            <span class="n">query</span><span class="o">.</span><span class="n">add_annotation</span><span class="p">(</span><span class="n">Count</span><span class="p">(</span><span class="s1">'*'</span><span class="p">),</span> <span class="n">alias</span><span class="o">=</span><span class="s1">'__count'</span><span class="p">,</span> <span class="n">is_summary</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
            <span class="n">query</span><span class="o">.</span><span class="n">clear_ordering</span><span class="p">(</span><span class="kc">True</span><span class="p">)</span>
            <span class="n">query</span><span class="o">.</span><span class="n">select_for_update</span> <span class="o">=</span> <span class="kc">False</span>
            <span class="n">query</span><span class="o">.</span><span class="n">select_related</span> <span class="o">=</span> <span class="kc">False</span>
            <span class="n">query</span><span class="o">.</span><span class="n">select</span> <span class="o">=</span> <span class="p">[]</span>
            <span class="n">query</span><span class="o">.</span><span class="n">default_cols</span> <span class="o">=</span> <span class="kc">False</span>
            <span class="n">sql</span><span class="p">,</span> <span class="n">params</span> <span class="o">=</span> <span class="n">query</span><span class="o">.</span><span class="n">sql_with_params</span><span class="p">()</span>
            <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'Running EXPLAIN </span><span class="si">%s</span><span class="s1">'</span><span class="p">,</span> <span class="n">sql</span><span class="p">)</span>
            <span class="n">cursor</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"EXPLAIN </span><span class="si">%s</span><span class="s2">"</span> <span class="o">%</span> <span class="n">sql</span><span class="p">,</span> <span class="n">params</span><span class="p">)</span>
            <span class="n">lines</span> <span class="o">=</span> <span class="n">cursor</span><span class="o">.</span><span class="n">fetchall</span><span class="p">()</span>
            <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'Got EXPLAIN result:</span><span class="se">\n</span><span class="s1">> </span><span class="si">%s</span><span class="s1">'</span><span class="p">,</span>
                        <span class="s1">'</span><span class="se">\n</span><span class="s1">> '</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">line</span> <span class="k">for</span> <span class="n">line</span><span class="p">,</span> <span class="ow">in</span> <span class="n">lines</span><span class="p">))</span>
            <span class="n">marker</span> <span class="o">=</span> <span class="s1">' on </span><span class="si">%s</span><span class="s1"> '</span> <span class="o">%</span> <span class="bp">self</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">_meta</span><span class="o">.</span><span class="n">db_table</span>
            <span class="k">for</span> <span class="n">line</span><span class="p">,</span> <span class="ow">in</span> <span class="n">lines</span><span class="p">:</span>
                <span class="k">if</span> <span class="n">marker</span> <span class="ow">in</span> <span class="n">line</span><span class="p">:</span>
                    <span class="k">for</span> <span class="n">part</span> <span class="ow">in</span> <span class="n">line</span><span class="o">.</span><span class="n">split</span><span class="p">():</span>
                        <span class="k">if</span> <span class="n">part</span><span class="o">.</span><span class="n">startswith</span><span class="p">(</span><span class="s1">'rows='</span><span class="p">):</span>
                            <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'Found size (</span><span class="si">%s</span><span class="s1">) estimate in query EXPLAIN: </span><span class="si">%s</span><span class="s1">'</span><span class="p">,</span>
                                        <span class="n">part</span><span class="p">,</span> <span class="n">line</span><span class="p">)</span>
                            <span class="n">count</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="n">part</span><span class="p">[</span><span class="mi">5</span><span class="p">:])</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">estimate_bias</span><span class="p">)</span>
                            <span class="k">if</span> <span class="n">count</span> <span class="o">&lt;</span> <span class="bp">self</span><span class="o">.</span><span class="n">estimate_threshold</span><span class="p">:</span>
                                <span class="c1"># Unreliable, will make views with lots of filtering</span>
                                <span class="c1"># output confusing results.</span>
                                <span class="c1"># Just do normal count, shouldn't be that slow.</span>
                                <span class="c1"># (well, not much slower than the actual query)</span>
                                <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">count</span><span class="p">()</span>
                            <span class="k">else</span><span class="p">:</span>
                                <span class="k">return</span> <span class="n">count</span>
            <span class="k">return</span> <span class="n">qs</span><span class="o">.</span><span class="n">count</span><span class="p">()</span>
        <span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">exc</span><span class="p">:</span>
            <span class="n">logger</span><span class="o">.</span><span class="n">warning</span><span class="p">(</span><span class="s2">"Failed to estimate queryset count: </span><span class="si">%s</span><span class="s2">"</span><span class="p">,</span> <span class="n">exc</span><span class="p">)</span>
            <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">count</span><span class="p">()</span>
</pre></div>
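<p>The parsing step is easier to reason about in isolation. Below is a minimal sketch of it without Django: <tt class="docutils literal">parse_row_estimate</tt> is a hypothetical helper, and the sample <tt class="docutils literal">EXPLAIN</tt> lines (table name and numbers included) are made up for illustration.</p>

```python
def parse_row_estimate(explain_lines, db_table, bias=1.0):
    """Return the planner's row estimate for db_table, or None if not found."""
    marker = " on %s " % db_table
    for line in explain_lines:
        if marker not in line:
            continue
        for part in line.split():
            if part.startswith("rows="):
                # "rows=10000" -> 10000, scaled by the bias factor
                return int(int(part[5:]) * bias)
    return None


# Hypothetical EXPLAIN output for a COUNT(*) query.
sample = [
    "Aggregate  (cost=483.00..483.01 rows=1 width=8)",
    "  ->  Seq Scan on myapp_mymodel  (cost=0.00..458.00 rows=10000 width=0)",
]
print(parse_row_estimate(sample, "myapp_mymodel"))
```

<p>The <tt class="docutils literal">Aggregate</tt> line is skipped because it doesn't mention the table, which is why the marker check matters: its <tt class="docutils literal">rows=1</tt> is the size of the count result, not of the table scan.</p>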
<p>Because the normal count method is unchanged you can use that QuerySet everywhere.</p>
<div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">MyModel</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">Model</span><span class="p">):</span>
    <span class="o">...</span>
    <span class="n">objects</span> <span class="o">=</span> <span class="n">EstimatedQuerySet</span><span class="o">.</span><span class="n">as_manager</span><span class="p">()</span>
</pre></div>
<p>Now using the <tt class="docutils literal">estimated_count</tt> in the paginator will uncover a problem: sometimes it will underestimate. You can play with the
<tt class="docutils literal">estimate_bias</tt> but it will never work well with edge-cases (like heavy filtering).</p>
<p>A good compromise is to tune it for the general case and for everything else trick the pagination to always increment the page count when
you're looking at the last page.</p>
<div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">django.contrib</span> <span class="kn">import</span> <span class="n">admin</span>
<span class="kn">from</span> <span class="nn">django.core.paginator</span> <span class="kn">import</span> <span class="n">Paginator</span>
<span class="kn">from</span> <span class="nn">django.utils.functional</span> <span class="kn">import</span> <span class="n">cached_property</span>


<span class="k">class</span> <span class="nc">EstimatedPaginator</span><span class="p">(</span><span class="n">Paginator</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">validate_number</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">number</span><span class="p">):</span>
        <span class="k">if</span> <span class="n">number</span> <span class="o">>=</span> <span class="bp">self</span><span class="o">.</span><span class="n">num_pages</span><span class="p">:</span>
            <span class="c1"># noinspection PyPropertyAccess</span>
            <span class="bp">self</span><span class="o">.</span><span class="n">num_pages</span> <span class="o">=</span> <span class="n">number</span> <span class="o">+</span> <span class="mi">1</span>
        <span class="k">return</span> <span class="nb">super</span><span class="p">(</span><span class="n">EstimatedPaginator</span><span class="p">,</span> <span class="bp">self</span><span class="p">)</span><span class="o">.</span><span class="n">validate_number</span><span class="p">(</span><span class="n">number</span><span class="p">)</span>

    <span class="nd">@cached_property</span>
    <span class="k">def</span> <span class="nf">count</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">object_list</span><span class="o">.</span><span class="n">estimated_count</span><span class="p">()</span>


<span class="k">class</span> <span class="nc">MyAdmin</span><span class="p">(</span><span class="n">admin</span><span class="o">.</span><span class="n">ModelAdmin</span><span class="p">):</span>
    <span class="n">paginator</span> <span class="o">=</span> <span class="n">EstimatedPaginator</span>
    <span class="n">show_full_result_count</span> <span class="o">=</span> <span class="kc">False</span>
</pre></div>
<p>If you think that <tt class="docutils literal"># noinspection PyPropertyAccess</tt> is funny it's because it is - <tt class="docutils literal">num_pages</tt> is a <tt class="docutils literal">cached_property</tt> and the following
line destroys PyCharm's assumptions about how <a class="reference external" href="https://docs.python.org/3/reference/datamodel.html#invoking-descriptors">non-data descriptors</a> should work.</p>
<p>It also goes against sane practices like not having unexpected side-effects. But alas, it gets worse. There's another problem there: there's
always going to be a next page even if the current page is empty (or not full). To fix that we mess again with the internals:</p>
<div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">_get_page</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">objects</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
    <span class="c1"># If page ain't full it means that it's the real last page, remove the extra.</span>
    <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">objects</span><span class="p">)</span> <span class="o">&lt;</span> <span class="bp">self</span><span class="o">.</span><span class="n">per_page</span><span class="p">:</span>
        <span class="c1"># noinspection PyPropertyAccess</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">num_pages</span> <span class="o">-=</span> <span class="mi">1</span>
    <span class="k">return</span> <span class="nb">super</span><span class="p">(</span><span class="n">EstimatedPaginator</span><span class="p">,</span> <span class="bp">self</span><span class="p">)</span><span class="o">.</span><span class="n">_get_page</span><span class="p">(</span><span class="n">objects</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">)</span>
</pre></div>
<p>One could still input an out-of-bounds page number through the URL but I think it's pointless to handle that.</p>
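<p>To see how the two overrides interact, here is a pure-Python model of the page count the paginator ends up showing. It's a sketch under assumptions: <tt class="docutils literal">displayed_num_pages</tt> is a hypothetical function, and it approximates Django's page count as a plain ceiling division (no orphans).</p>

```python
import math


def displayed_num_pages(estimated_count, per_page, requested_page, rows_on_page):
    """Model of the page count shown after validate_number and _get_page ran."""
    num_pages = max(1, math.ceil(estimated_count / per_page))
    # validate_number: never let the requested page be the last one.
    if requested_page >= num_pages:
        num_pages = requested_page + 1
    # _get_page: a short page is the real last page, so drop the extra one.
    if rows_on_page < per_page:
        num_pages -= 1
    return num_pages


# Estimate says 250 rows (3 pages of 100), but the table really has 350.
print(displayed_num_pages(250, 100, 3, 100))  # full page 3: one more page offered
print(displayed_num_pages(250, 100, 4, 50))   # short page 4: it is the last one
```

<p>So an underestimate corrects itself one page at a time as you click through to the end.</p>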
<div class="section" id="what-about-that-pypropertyaccess">
<h2>What about that PyPropertyAccess?<a class="headerlink" href="#what-about-that-pypropertyaccess" title="Permalink to this headline">
*</a></h2>
<p>Suppose you have this:</p>
<div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">cached_property</span><span class="p">:</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">func</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">func</span> <span class="o">=</span> <span class="n">func</span>

    <span class="k">def</span> <span class="fm">__get__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">instance</span><span class="p">,</span> <span class="bp">cls</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
        <span class="k">if</span> <span class="n">instance</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
            <span class="k">return</span> <span class="bp">self</span>
        <span class="n">res</span> <span class="o">=</span> <span class="n">instance</span><span class="o">.</span><span class="vm">__dict__</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">func</span><span class="o">.</span><span class="vm">__name__</span><span class="p">]</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">func</span><span class="p">(</span><span class="n">instance</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">res</span>


<span class="k">class</span> <span class="nc">Foobar</span><span class="p">:</span>
    <span class="nd">@cached_property</span>
    <span class="k">def</span> <span class="nf">foo</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="s2">"bar"</span>
</pre></div>
<p>Because <tt class="docutils literal">cached_property</tt> doesn't implement a <tt class="docutils literal">__set__</tt>, assignments will be made through the instance's <tt class="docutils literal">__dict__</tt>:</p>
<div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">x</span> <span class="o">=</span> <span class="n">Foobar</span><span class="p">()</span>
<span class="gp">>>> </span><span class="n">x</span><span class="o">.</span><span class="n">foo</span> <span class="o">=</span> <span class="s1">'123'</span>
<span class="gp">>>> </span><span class="n">x</span><span class="o">.</span><span class="n">foo</span>
<span class="go">'123'</span>
<span class="gp">>>> </span><span class="n">y</span> <span class="o">=</span> <span class="n">Foobar</span><span class="p">()</span>
<span class="gp">>>> </span><span class="n">y</span><span class="o">.</span><span class="n">foo</span> <span class="o">+=</span> <span class="s1">'123'</span>
<span class="gp">>>> </span><span class="n">y</span><span class="o">.</span><span class="n">foo</span>
<span class="go">'bar123'</span>
</pre></div>
<p>I suspect that PyCharm doesn't discern data vs non-data descriptors at all. Or perhaps it's a subtle hint that it's a bad idea to assign to
something that doesn't implement a setter?</p>
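<p>To make the data vs non-data distinction concrete, here is a minimal sketch. The class names (<tt class="docutils literal">NonData</tt>, <tt class="docutils literal">Data</tt>, <tt class="docutils literal">Example</tt>) are made up for illustration and have nothing to do with PyCharm's internals: a descriptor with only <tt class="docutils literal">__get__</tt> is shadowed by the instance's <tt class="docutils literal">__dict__</tt>, while one that also defines <tt class="docutils literal">__set__</tt> intercepts the assignment.</p>

```python
class NonData:
    """Non-data descriptor: defines only __get__."""

    def __get__(self, instance, cls=None):
        if instance is None:
            return self
        return "from descriptor"


class Data(NonData):
    """Data descriptor: also defines __set__, so it takes priority over the instance __dict__."""

    def __set__(self, instance, value):
        raise AttributeError("read-only")


class Example:
    plain = NonData()
    guarded = Data()


obj = Example()
obj.plain = "from instance"  # lands in obj.__dict__ and shadows the descriptor
shadowed = obj.plain

try:
    obj.guarded = "nope"  # intercepted by Data.__set__
except AttributeError as exc:
    blocked = str(exc)
```

<p>After this runs, <tt class="docutils literal">obj.__dict__</tt> only contains <tt class="docutils literal">plain</tt>; the assignment to <tt class="docutils literal">guarded</tt> never reached it.</p>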
<table class="docutils footnote" frame="void" id="the-drawback" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-1">[1]</a></td><td>Though you should be wondering if you want to take a look at this hard-to-test method every time you upgrade Django ...</td></tr>
</tbody>
</table>
</div>
Proxying objects in Python2015-01-12T00:00:00+02:002016-02-17T00:00:00+02:00Ionel Cristian Mărieștag:blog.ionelmc.ro,2015-01-12:/2015/01/12/proxying-objects-in-python/<p>A lazy object proxy is an object that wraps a callable but defers the call until the object is actually required, and
caches the result of said call.</p>
<p>These kinds of objects are useful in resolving various dependency issues. A few examples:</p>
<ul class="simple">
<li>Objects that need to hold circular references to each …</li></ul><p>A lazy object proxy is an object that wraps a callable but defers the call until the object is actually required, and
caches the result of said call.</p>
<p>These kinds of objects are useful in resolving various dependency issues. A few examples:</p>
<ul class="simple">
<li>Objects that need to hold circular references to each other, but at different stages. To instantiate object <cite>Foo</cite> you
need an instance of <cite>Bar</cite>. An instance of <cite>Bar</cite> needs an instance of <cite>Foo</cite> in some of its methods (but not at
construction). Circular imports sound familiar?</li>
<li>Performance-sensitive code. You don't know ahead of time what you're going to use, but you don't want to pay for
allocating all the resources at the start as you usually need just a few of them.</li>
</ul>
<p>There are other examples, I've just made up a couple for context.</p>
<p>If you've used Django you may be familiar with <a class="reference external" href="https://github.com/django/django/blob/stable/1.7.x/django/utils/functional.py#L337">SimpleLazyObject</a>. For simple use-cases it's fine, and if you're already
using Django the choice is obvious. Unfortunately it's missing many magic methods; the most glaring omissions are
<tt class="docutils literal">__iter__</tt>, <tt class="docutils literal">__getslice__</tt>, <tt class="docutils literal">__call__</tt> etc. That's not too bad, as you can just subclass and add them yourself.</p>
<p>But what if you need to have <tt class="docutils literal">__getattr__</tt>? The horrors of the infinite recursive call beckon.</p>
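<p>As a rough illustration of how the recursion can be avoided, here is a minimal sketch. <tt class="docutils literal">LazyProxy</tt>, <tt class="docutils literal">_factory</tt>, <tt class="docutils literal">_target</tt> and <tt class="docutils literal">_resolve</tt> are hypothetical names, and this is neither Django's <tt class="docutils literal">SimpleLazyObject</tt> nor the code from lazy-object-proxy. The idea is to read and write the proxy's own state through <tt class="docutils literal">object.__setattr__</tt> and the instance <tt class="docutils literal">__dict__</tt>, so the proxy's bookkeeping never goes through <tt class="docutils literal">__getattr__</tt>.</p>

```python
class LazyProxy:
    """Hypothetical minimal lazy proxy, for illustration only."""

    def __init__(self, factory):
        # Bypass our own __setattr__ so nothing is forwarded to the target yet.
        object.__setattr__(self, "_factory", factory)

    def _resolve(self):
        # Read the instance __dict__ directly: a plain `self._target` lookup
        # would fall through to __getattr__ while missing and recurse forever.
        state = object.__getattribute__(self, "__dict__")
        if "_target" not in state:
            state["_target"] = state["_factory"]()
        return state["_target"]

    def __getattr__(self, name):
        # Only called when normal lookup fails, so forward to the real object.
        return getattr(self._resolve(), name)

    def __setattr__(self, name, value):
        setattr(self._resolve(), name, value)

    def __str__(self):
        return str(self._resolve())


calls = []


def make_list():
    calls.append("called")
    return [3, 1, 2]


proxy = LazyProxy(make_list)
calls_before_use = list(calls)  # the factory has not run yet
proxy.sort()                    # first use triggers the factory
```

<p>Magic methods such as <tt class="docutils literal">__str__</tt> still have to be spelled out one by one, because Python looks them up on the type and never consults <tt class="docutils literal">__getattr__</tt> for them.</p>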
<p>Meanwhile I've noticed that <a class="reference external" href="https://github.com/GrahamDumpleton/wrapt">wrapt</a> has quite a complete <a class="reference external" href="http://wrapt.readthedocs.org/en/latest/wrappers.html#object-proxy">object proxy</a>. Unfortunately it's not really amenable to adding
a <cite>lazy</cite> behavior in a subclass due to the C extension (I wouldn't make bets on sub-classing the pure-python proxy
implementation either without some unwanted overhead :-).</p>
<p>Thus I forked the code and changed everything to have the <cite>lazy behavior</cite>. You can see the results here:
<a class="reference external" href="https://github.com/ionelmc/python-lazy-object-proxy">https://github.com/ionelmc/python-lazy-object-proxy</a></p>
<p>Part of that is a C extension packaging exercise but that's for another blog-post <a class="footnote-reference" href="#footnote-2" id="footnote-reference-1">[2]</a>.</p>
<p>I've also done some benchmarks (with <a class="reference external" href="https://github.com/ionelmc/pytest-benchmark">pytest-benchmark</a>) <a class="footnote-reference" href="#footnote-1" id="footnote-reference-2">[1]</a>:</p>
<div class="term"><pre class="nowrap">
<span style="color: #cdcd00">-- benchmark: min 5 rounds (of min 25.00us), 30.00s max time, timer: time.perf_counter --</span>
Name (time in ns) Min Max Mean StdDev Rounds Iterations
<span style="color: #cdcd00">-----------------------------------------------------------------------------------------</span>
test_perf[slots] <span style="font-weight: bold"> 606.8182</span><span style="font-weight: bold"> 26084.0909</span><span style="font-weight: bold"> 627.7139</span><span style="font-weight: bold"> 89.5553</span> 1111112 44
test_perf[cext] <span style="color: #00cd00"></span><span style="font-weight: bold; color: #00cd00"> 84.7701</span><span style="color: #00cd00"></span><span style="font-weight: bold; color: #00cd00"> 2830.4598</span><span style="color: #00cd00"></span><span style="font-weight: bold; color: #00cd00"> 86.2741</span><span style="color: #00cd00"></span><span style="font-weight: bold; color: #00cd00"> 9.6827</span> 1006712 348
test_perf[simple] <span style="font-weight: bold"> 328.9474</span><span style="font-weight: bold"> 11456.5790</span><span style="font-weight: bold"> 334.8236</span><span style="font-weight: bold"> 41.8470</span> 1195220 76
test_perf[django] <span style="font-weight: bold"> 409.5238</span><span style="font-weight: bold"> 17969.8413</span><span style="font-weight: bold"> 417.4172</span><span style="font-weight: bold"> 49.9735</span> 1158302 63
test_perf[objproxies] <span style="color: #cd0000"></span><span style="font-weight: bold; color: #cd0000"> 880.0000</span><span style="color: #cd0000"></span><span style="font-weight: bold; color: #cd0000"> 31256.6666</span><span style="color: #cd0000"></span><span style="font-weight: bold; color: #cd0000"> 923.1323</span><span style="color: #cd0000"></span><span style="font-weight: bold; color: #cd0000"> 106.3637</span> 1111112 30
<span style="color: #cdcd00">-----------------------------------------------------------------------------------------</span>
</pre></div><p>The <tt class="docutils literal">slots</tt> and <tt class="docutils literal">cext</tt> implementations are based on <a class="reference external" href="https://github.com/GrahamDumpleton/wrapt">wrapt</a>'s code. I've named the pure Python implementation
<tt class="docutils literal">slots</tt> because that is the distinguishing implementation technique. And that was all I had in the beginning. I've
wondered why Django's <a class="reference external" href="https://github.com/django/django/blob/stable/1.7.x/django/utils/functional.py#L337">SimpleLazyObject</a> is faster, by a significant margin even.</p>
<p>To find out what exactly is different I've made a primitive tracer:</p>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">linecache</span>
<span class="kn">from</span> <span class="nn">lazy_object_proxy.slots</span> <span class="kn">import</span> <span class="n">Proxy</span>
<span class="kn">from</span> <span class="nn">django.utils.functional</span> <span class="kn">import</span> <span class="n">SimpleLazyObject</span>
<span class="k">def</span> <span class="nf">dumbtrace</span><span class="p">(</span><span class="n">frame</span><span class="p">,</span> <span class="n">event</span><span class="p">,</span> <span class="n">args</span><span class="p">):</span>
<span class="n">sys</span><span class="o">.</span><span class="n">stdout</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="s2">"</span><span class="si">%015s</span><span class="s2">:</span><span class="si">%-3s</span><span class="s2"> </span><span class="si">%06s</span><span class="s2"> </span><span class="si">%s</span><span class="s2">"</span> <span class="o">%</span> <span class="p">(</span>
<span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">basename</span><span class="p">(</span><span class="n">frame</span><span class="o">.</span><span class="n">f_code</span><span class="o">.</span><span class="n">co_filename</span><span class="p">),</span>
<span class="n">frame</span><span class="o">.</span><span class="n">f_lineno</span><span class="p">,</span>
<span class="n">event</span><span class="p">,</span>
<span class="n">linecache</span><span class="o">.</span><span class="n">getline</span><span class="p">(</span><span class="n">frame</span><span class="o">.</span><span class="n">f_code</span><span class="o">.</span><span class="n">co_filename</span><span class="p">,</span> <span class="n">frame</span><span class="o">.</span><span class="n">f_lineno</span><span class="p">)</span>
<span class="p">))</span>
<span class="k">return</span> <span class="n">dumbtrace</span> <span class="c1"># "step in"</span>
<span class="k">for</span> <span class="n">Implementation</span> <span class="ow">in</span> <span class="n">Proxy</span><span class="p">,</span> <span class="n">SimpleLazyObject</span><span class="p">:</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"Testing </span><span class="si">%s</span><span class="s2"> ..."</span> <span class="o">%</span> <span class="n">Implementation</span><span class="o">.</span><span class="vm">__name__</span><span class="p">)</span>
<span class="n">obj</span> <span class="o">=</span> <span class="n">Implementation</span><span class="p">(</span><span class="k">lambda</span><span class="p">:</span> <span class="s1">'foobar'</span><span class="p">)</span>
<span class="nb">str</span><span class="p">(</span><span class="n">obj</span><span class="p">)</span>
<span class="n">sys</span><span class="o">.</span><span class="n">settrace</span><span class="p">(</span><span class="n">dumbtrace</span><span class="p">)</span>
<span class="nb">str</span><span class="p">(</span><span class="n">obj</span><span class="p">)</span>
<span class="n">sys</span><span class="o">.</span><span class="n">settrace</span><span class="p">(</span><span class="kc">None</span><span class="p">)</span> <span class="c1"># we don't want to trace other stuff</span>
</pre></div>
<p>And from that I've got:</p>
<div class="term docutils container">
<pre class="code nowrap literal-block">
Testing Proxy ...
slots.py:122 call def __str__(self):
slots.py:123 line return str(self.__wrapped__)
slots.py:74 call @property
slots.py:76 line try:
slots.py:77 line return __getattr__(self, '__target__')
slots.py:77 return return __getattr__(self, '__target__')
slots.py:123 return return str(self.__wrapped__)
Testing SimpleLazyObject ...
functional.py:222 call def inner(self, *args):
functional.py:223 line if self._wrapped is empty:
functional.py:225 line return func(self._wrapped, *args)
functional.py:225 return return func(self._wrapped, *args)
</pre>
</div>
<!-- * -->
<p>Essentially, the biggest difference is an extra function call (the <tt class="docutils literal">__wrapped__</tt> property).</p>
<p>Now I've thought to myself: I can do that too. Using the <a class="reference external" href="https://blog.ionelmc.ro/2014/11/04/an-interesting-python-descriptor-quirk/">cached property technique</a> I could remove the second function call.
But that trick needs a <tt class="docutils literal">__dict__</tt> - it can't work with <tt class="docutils literal">__slots__</tt>. So I've proceeded to make an implementation that
doesn't have that (the "<tt class="docutils literal">simple</tt>" from the previous benchmark table). It was faster indeed but then I finally
understood why Graham Dumpleton used <tt class="docutils literal">__slots__</tt> (while the tests started to fail).</p>
<p>Turns out he had replaced the normal <tt class="docutils literal">__dict__</tt> with a property <a class="footnote-reference" href="#footnote-3" id="footnote-reference-3">[3]</a>, and proxying <tt class="docutils literal">vars(obj)</tt> relies on having
<tt class="docutils literal">__dict__</tt> as a proxy property. In other words, you can't use <tt class="docutils literal">vars</tt> on an object without a <tt class="docutils literal">__dict__</tt> (like most
builtin types).</p>
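<p>A small sketch of that behaviour, using made-up classes (<tt class="docutils literal">Plain</tt>, <tt class="docutils literal">Slotted</tt>, <tt class="docutils literal">SlotsProxy</tt>) rather than wrapt's actual code: <tt class="docutils literal">vars()</tt> fails on an object that only has <tt class="docutils literal">__slots__</tt>, but a slotted proxy can still answer <tt class="docutils literal">vars()</tt> by exposing the target's <tt class="docutils literal">__dict__</tt> through a property.</p>

```python
class Plain:
    def __init__(self):
        self.answer = 42


class Slotted:
    __slots__ = ("answer",)

    def __init__(self):
        self.answer = 42


class SlotsProxy:
    """Hypothetical proxy: no instance __dict__ of its own, only a slot for the target."""

    __slots__ = ("__wrapped__",)

    def __init__(self, wrapped):
        object.__setattr__(self, "__wrapped__", wrapped)

    @property
    def __dict__(self):
        # vars(proxy) looks up the __dict__ attribute and therefore ends up here.
        return self.__wrapped__.__dict__


try:
    vars(Slotted())
    slotted_error = None
except TypeError as exc:
    slotted_error = exc  # no __dict__ attribute at all

proxied = vars(SlotsProxy(Plain()))
```

<p>If the proxy had a regular instance <tt class="docutils literal">__dict__</tt> instead, <tt class="docutils literal">vars(proxy)</tt> would return the proxy's own state rather than the target's.</p>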
<p>Interestingly enough, the implementation with <tt class="docutils literal">__slots__</tt> is much faster on PyPy <a class="footnote-reference" href="#footnote-4" id="footnote-reference-4">[4]</a>:</p>
<div class="term"><pre class="nowrap">
<span style="color: #cdcd00">-- benchmark: 4 tests, min 5 rounds (of min 25.00us), 30.00s max time, timer: monotonic --</span>
Name (time in ns) Min Max Mean StdDev Rounds Iterations
<span style="color: #cdcd00">------------------------------------------------------------------------------------------</span>
test_perf[slots] <span style="color: #00cd00"></span><span style="font-weight: bold; color: #00cd00"> 2.1267</span><span style="color: #00cd00"></span><span style="font-weight: bold; color: #00cd00"> 139.0987</span><span style="color: #00cd00"></span><span style="font-weight: bold; color: #00cd00"> 2.3513</span><span style="color: #00cd00"></span><span style="font-weight: bold; color: #00cd00"> 0.4176</span> 1003345 13824
test_perf[simple] <span style="font-weight: bold"> 24.0000</span><span style="font-weight: bold"> 9981.7000</span><span style="font-weight: bold"> 29.9561</span><span style="color: #cd0000"></span><span style="font-weight: bold; color: #cd0000"> 37.2147</span> 1250001 1000
test_perf[django] <span style="font-weight: bold"> 25.1000</span><span style="color: #cd0000"></span><span style="font-weight: bold; color: #cd0000"> 10186.4000</span><span style="font-weight: bold"> 29.5746</span><span style="font-weight: bold"> 26.3704</span> 1195220 1000
test_perf[objproxies] <span style="color: #cd0000"></span><span style="font-weight: bold; color: #cd0000"> 25.6000</span><span style="font-weight: bold"> 9509.6000</span><span style="color: #cd0000"></span><span style="font-weight: bold; color: #cd0000"> 30.2238</span><span style="font-weight: bold"> 20.0922</span> 1176471 1000
<span style="color: #cdcd00">------------------------------------------------------------------------------------------</span>
</pre></div><hr class="docutils" />
<p>Now I'm a bit broken up about this, which implementation should be the default? Should the <tt class="docutils literal">simple</tt> one be the default
on PyPy?</p>
<table class="docutils footnote" frame="void" id="footnote-1" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-2">[1]</a></td><td>HTML output generated with <tt class="docutils literal">ansi2html <span class="pre">--inline</span> <span class="pre">--scheme=xterm</span></tt>. You can capture output with all the ANSI
escape codes by running <tt class="docutils literal">script <span class="pre">-c</span> "command" output.txt</tt>.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-2" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-1">[2]</a></td><td>You can take a look at <a class="reference external" href="https://github.com/ionelmc/cookiecutter-pylibrary">cookiecutter-pylibrary</a> for now.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-3" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-3">[3]</a></td><td>See: <a class="reference external" href="https://github.com/GrahamDumpleton/wrapt/blob/1.10.2/src/wrappers.py#L49-L51">wrapt/wrappers.py</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-4" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-4">[4]</a></td><td>In case you're wondering what's with the different timer, the tests are done on PyPy (not PyPy3). That means no
<a class="reference external" href="https://www.python.org/dev/peps/pep-0418/#time-perf-counter">high precision timer</a>, so I had to implement my
own using <tt class="docutils literal">clock_gettime(CLOCK_MONOTONIC)</tt> from <tt class="docutils literal">__pypy__.time</tt>.</td></tr>
</tbody>
</table>
Terrible choices: MySQL2014-12-28T00:00:00+02:002017-09-25T00:00:00+03:00Ionel Cristian Mărieștag:blog.ionelmc.ro,2014-12-28:/2014/12/28/terrible-choices-mysql/<p>I've used MySQL for a while now, and there were lots of surprising things I needed to cater for. This is from a <a class="reference external" href="https://www.djangoproject.com/">Django</a> and MySQL 5.5 perspective <a class="footnote-reference" href="#footnote-1" id="footnote-reference-1">[*]</a>. Later on you'll see the horrible things I did to work
around them. It was a terrible experience, as I've also used …</p><p>I've used MySQL for a while now, and there were lots of surprising things I needed to cater for. This is from a <a class="reference external" href="https://www.djangoproject.com/">Django</a> and MySQL 5.5 perspective <a class="footnote-reference" href="#footnote-1" id="footnote-reference-1">[*]</a>. Later on you'll see the horrible things I did to work
around them. It was a terrible experience, as I've also used PostgreSQL ...</p>
<p>Feel free to add your own experiences in the comments section.</p>
<div class="section" id="the-defaults">
<h2>The defaults<a class="headerlink" href="#the-defaults" title="Permalink to this headline">
*</a></h2>
<p>MySQL supports a large part of the <a class="reference external" href="http://en.wikipedia.org/wiki/SQL:1999">ANSI SQL 99</a> standard, however the default
settings are nowhere close to that.</p>
<p>If you've used any other database then it's going to be a very perplexing experience.</p>
<div class="section" id="sql-mode">
<h3>SQL mode<a class="headerlink" href="#sql-mode" title="Permalink to this headline">
*</a></h3>
<p>With the default settings MySQL truncates and does other unspeakable things to data for the sake of not giving errors.
Where this is the correct choice is hard to say.</p>
<p>What the defaults allow:</p>
<ul class="simple">
<li>Storing invalid dates like <tt class="docutils literal"><span class="pre">'0000-00-00'</span></tt> or <tt class="docutils literal"><span class="pre">'2010-01-00'</span></tt> <a class="footnote-reference" href="#footnote-2" id="footnote-reference-2">[1]</a>.</li>
<li>Silently treating errors like:<ul>
<li>Specifying an unavailable storage engine. It will use the default engine, silently <a class="footnote-reference" href="#footnote-3" id="footnote-reference-3">[2]</a>.</li>
<li>Inserting invalid data. Larger strings get truncated to the maximum length. Larger integers get truncated to the
maximum. Other things get converted to NULL if the column allows that. All silently <a class="footnote-reference" href="#footnote-4" id="footnote-reference-4">[3]</a>.</li>
</ul>
</li>
</ul>
<p>And no, ORMs won't save you from this pain by default. For Django you need to have something like this in the settings:</p>
<div class="highlight"><pre><span></span><span class="n">DATABASES</span> <span class="o">=</span> <span class="p">{</span>
<span class="s1">'default'</span><span class="p">:</span> <span class="p">{</span>
<span class="s1">'ENGINE'</span><span class="p">:</span> <span class="s1">'django.db.backends.mysql'</span><span class="p">,</span>
<span class="s1">'NAME'</span><span class="p">:</span> <span class="s1">'whatever'</span><span class="p">,</span>
<span class="s1">'OPTIONS'</span><span class="p">:</span> <span class="p">{</span>
<span class="s1">'sql_mode'</span><span class="p">:</span> <span class="s1">'TRADITIONAL'</span><span class="p">,</span>
<span class="p">}</span> <span class="c1"># Note that later we find out that this is not enough. Read on.</span>
<span class="p">}</span>
<span class="p">}</span>
</pre></div>
<p>There's an open Django <a class="reference external" href="https://code.djangoproject.com/ticket/15940">ticket to set this as the default</a>.</p>
<p>And yes, you can set this in the MySQL settings but you don't really want that. Your app will misbehave and corrupt
data if you ever happen to forget to change the <tt class="docutils literal">/etc/mysql/my.cnf</tt> before deployment. And if you ever forget to do
that, your data is already going to be corrupted by the time you notice something is not quite right.</p>
</div>
<div class="section" id="collations-and-encodings">
<h3>Collations and encodings<a class="headerlink" href="#collations-and-encodings" title="Permalink to this headline">
*</a></h3>
<p>So far I've used <tt class="docutils literal">utf8</tt> as the connection charset, because you never know what the default one is (<a class="reference external" href="http://dev.mysql.com/doc/refman/5.5/en/server-options.html#option_mysqld_character-set-server">probably latin1</a>). However, note that
the charset affects the maximum index size for VARCHAR columns. Ever wonder why you often see this:</p>
<div class="highlight"><pre><span></span><span class="n">myfield</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">CharField</span><span class="p">(</span><span class="n">max_length</span><span class="o">=</span><span class="mi">255</span><span class="p">,</span> <span class="n">db_index</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
</pre></div>
<p>Yes, the index on <tt class="docutils literal">utf8</tt> columns is limited to 255 characters <a class="footnote-reference" href="#footnote-5" id="footnote-reference-5">[4]</a>. If you need to store emoticons like an angry face
<span class="bigger">😠</span> (and most probably you'd want to, given the predicament) then you're out of luck: you need a different
charset, <tt class="docutils literal">utf8mb4</tt> <a class="footnote-reference" href="#footnote-6" id="footnote-reference-6">[5]</a>. But that means a smaller index, only 191 characters <a class="footnote-reference" href="#footnote-12" id="footnote-reference-7">[11]</a>.</p>
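<p>The 255 and 191 figures fall out of simple arithmetic if you assume the 767-byte limit that InnoDB places on an indexed column prefix with the default settings of that era. That limit is an assumption written into this sketch, not something it queries from a server, and the constant and function names are made up.</p>

```python
# Assumed InnoDB index key prefix limit, in bytes (default row format).
INNODB_MAX_KEY_BYTES = 767

# Worst-case bytes per character, as in the Maxlen column of SHOW CHARACTER SET.
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8": 3, "utf8mb4": 4}


def max_indexable_chars(charset):
    """How many characters fit in the index if every character takes the maximum width."""
    return INNODB_MAX_KEY_BYTES // MAX_BYTES_PER_CHAR[charset]


limits = {charset: max_indexable_chars(charset) for charset in MAX_BYTES_PER_CHAR}

# The angry face needs 4 bytes in UTF-8, which is why MySQL's 3-byte utf8 can't store it.
angry_face_bytes = len("\U0001F620".encode("utf-8"))
```

<p>Here <tt class="docutils literal">limits</tt> comes out as 767 characters for <tt class="docutils literal">latin1</tt>, 255 for <tt class="docutils literal">utf8</tt> and 191 for <tt class="docutils literal">utf8mb4</tt>.</p>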
<p>But that's not that bad, you should get some errors if you fail to set the correct encoding. Collations however, are
more trippy. A collation is a set of rules for comparing characters in a character set <a class="footnote-reference" href="#footnote-7" id="footnote-reference-8">[6]</a>.</p>
<p>Notice a pattern here? Note the <tt class="docutils literal">_ci</tt> suffix:</p>
<pre class="literal-block">
mysql> SHOW CHARACTER SET;
+----------+-----------------------------+---------------------+--------+
| Charset | Description | Default collation | Maxlen |
+----------+-----------------------------+---------------------+--------+
| latin1 | cp1252 West European | latin1_swedish_ci | 1 |
| ascii | US ASCII | ascii_general_ci | 1 |
| utf8 | UTF-8 Unicode | utf8_general_ci | 3 |
| utf8mb4 | UTF-8 Unicode | utf8mb4_general_ci | 4 |
| utf32 | UTF-32 Unicode | utf32_general_ci | 4 |
+----------+-----------------------------+---------------------+--------+
</pre>
<p>All the default collations are case insensitive. This means your queries and indexes are also going to be case
insensitive <a class="footnote-reference" href="#footnote-7" id="footnote-reference-9">[6]</a>. I'd say they are just <cite>insensitive</cite>. To your pain.</p>
<p>If you have a case insensitive collation all sorts of queries that aren't a text search will behave strangely <a class="footnote-reference" href="#footnote-8" id="footnote-reference-10">[7]</a>.
You'll notice that <a class="reference external" href="https://docs.djangoproject.com/en/1.7/ref/models/querysets/#get-or-create">get_or_create</a> doesn't work as expected when you have accents or a different case.</p>
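<p>To see the effect without a MySQL server at hand, here is a rough analogy using SQLite's <tt class="docutils literal">NOCASE</tt> collation from the standard library. This is SQLite, not MySQL, and the <tt class="docutils literal">get_or_create</tt> function below is a hypothetical stand-in for the "look up, else insert" pattern, not Django's implementation. <tt class="docutils literal">NOCASE</tt> also only folds ASCII case, whereas the MySQL collations fold accents too.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag_ci (name TEXT COLLATE NOCASE)")   # case insensitive
conn.execute("CREATE TABLE tag_bin (name TEXT COLLATE BINARY)")  # byte-wise comparison


def get_or_create(table, name):
    """Hypothetical helper: return (stored_name, created)."""
    # Table names come from the hardcoded calls below, never from user input.
    row = conn.execute("SELECT name FROM %s WHERE name = ?" % table, (name,)).fetchone()
    if row is not None:
        return row[0], False
    conn.execute("INSERT INTO %s (name) VALUES (?)" % table, (name,))
    return name, True


get_or_create("tag_ci", "Django")
get_or_create("tag_bin", "Django")

# Same lookup with a different case:
ci_result = get_or_create("tag_ci", "django")    # finds the existing "Django" row
bin_result = get_or_create("tag_bin", "django")  # creates a separate "django" row
```

<p>With the case insensitive table you ask for <tt class="docutils literal">django</tt> and get back a row that says <tt class="docutils literal">Django</tt>, which is exactly the kind of surprise that breaks code assuming the returned object matches the lookup.</p>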
<p>The solution is to use the <tt class="docutils literal">utf8_bin</tt> collation and specify a different one only when you need special accent handling and
case folding. The settings now look like this:</p>
<div class="highlight"><pre><span></span><span class="n">DATABASES</span> <span class="o">=</span> <span class="p">{</span>
<span class="s1">'default'</span><span class="p">:</span> <span class="p">{</span>
<span class="s1">'ENGINE'</span><span class="p">:</span> <span class="s1">'django.db.backends.mysql'</span><span class="p">,</span>
<span class="s1">'NAME'</span><span class="p">:</span> <span class="s1">'mydatabase'</span><span class="p">,</span>
<span class="s1">'OPTIONS'</span><span class="p">:</span> <span class="p">{</span>
<span class="s1">'sql_mode'</span><span class="p">:</span> <span class="s1">'TRADITIONAL'</span><span class="p">,</span>
<span class="s1">'charset'</span><span class="p">:</span> <span class="s1">'utf8'</span><span class="p">,</span>
<span class="s1">'init_command'</span><span class="p">:</span> <span class="s1">'SET '</span>
<span class="s1">'storage_engine=INNODB,'</span>
<span class="s1">'character_set_connection=utf8,'</span>
<span class="s1">'collation_connection=utf8_bin'</span>
<span class="p">}</span> <span class="c1"># Note that later we find out that this is still not enough. Read on.</span>
<span class="p">}</span>
<span class="p">}</span>
</pre></div>
<p>Unfortunately I didn't know this from the start. So I had to fix the existing data and installation scripts. When you
create a database, make sure that you specify the encoding and collation:</p>
<div class="highlight"><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="k">DATABASE</span><span class="w"> </span><span class="n">mydatabase</span><span class="w"> </span><span class="k">CHARACTER</span><span class="w"> </span><span class="k">SET</span><span class="w"> </span><span class="n">utf8</span><span class="w"> </span><span class="k">COLLATE</span><span class="w"> </span><span class="n">utf8_bin</span><span class="p">;</span><span class="w"></span>
</pre></div>
<p>To fix the existing data, a <a class="reference external" href="http://south.aeracode.org/">south</a> migration would look like:</p>
<div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">south.db</span> <span class="kn">import</span> <span class="n">db</span>
<span class="kn">from</span> <span class="nn">south.v2</span> <span class="kn">import</span> <span class="n">DataMigration</span>
<span class="k">class</span> <span class="nc">Migration</span><span class="p">(</span><span class="n">DataMigration</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">forwards</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">orm</span><span class="p">):</span>
<span class="nb">print</span><span class="p">(</span><span class="s1">''</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="s1">' Altering database ...'</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"ALTER DATABASE CHARACTER SET utf8 COLLATE utf8_bin;"</span><span class="p">)</span>
<span class="k">for</span> <span class="n">table</span><span class="p">,</span> <span class="ow">in</span> <span class="n">db</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s1">'SHOW TABLES'</span><span class="p">):</span>
<span class="nb">print</span><span class="p">(</span><span class="s1">' Altering table </span><span class="si">%s</span><span class="s1"> ...'</span> <span class="o">%</span> <span class="n">table</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span>
<span class="s2">"ALTER TABLE </span><span class="si">%s</span><span class="s2"> CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin"</span> <span class="o">%</span> <span class="n">table</span>
<span class="p">)</span>
<span class="k">def</span> <span class="nf">backwards</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">orm</span><span class="p">):</span>
<span class="c1"># Altering the tables takes lots of time and</span>
<span class="c1"># locks the tables, since it copies all the data.</span>
<span class="k">raise</span> <span class="ne">RuntimeError</span><span class="p">(</span>
<span class="s2">"This migration probably took 2 hours, you don't really want to rollback ..."</span>
<span class="p">)</span>
</pre></div>
<p>With the new migrations in Django 1.7:</p>
<div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">django.db</span> <span class="kn">import</span> <span class="n">migrations</span>
<span class="k">def</span> <span class="nf">forwards</span><span class="p">(</span><span class="n">apps</span><span class="p">,</span> <span class="n">schema_editor</span><span class="p">):</span>
<span class="k">with</span> <span class="n">schema_editor</span><span class="o">.</span><span class="n">connection</span><span class="o">.</span><span class="n">cursor</span><span class="p">()</span> <span class="k">as</span> <span class="n">cursor</span><span class="p">:</span>
<span class="nb">print</span><span class="p">(</span><span class="s1">''</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="s1">' Altering database ...'</span><span class="p">)</span>
<span class="n">cursor</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"ALTER DATABASE CHARACTER SET utf8 COLLATE utf8_bin;"</span><span class="p">)</span>
<span class="n">cursor</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="s2">"SHOW TABLES;"</span><span class="p">)</span>
<span class="k">for</span> <span class="n">table</span><span class="p">,</span> <span class="ow">in</span> <span class="n">cursor</span><span class="o">.</span><span class="n">fetchall</span><span class="p">():</span>
<span class="nb">print</span><span class="p">(</span><span class="s1">' Altering table </span><span class="si">%s</span><span class="s1"> ...'</span> <span class="o">%</span> <span class="n">table</span><span class="p">)</span>
<span class="n">cursor</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span>
<span class="s2">"ALTER TABLE </span><span class="si">%s</span><span class="s2"> CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin"</span> <span class="o">%</span> <span class="n">table</span>
<span class="p">)</span>
<span class="k">def</span> <span class="nf">backwards</span><span class="p">(</span><span class="n">apps</span><span class="p">,</span> <span class="n">schema_editor</span><span class="p">):</span>
<span class="c1"># Altering the tables takes lots of time and</span>
<span class="c1"># locks the tables, since it copies all the data.</span>
<span class="k">raise</span> <span class="ne">RuntimeError</span><span class="p">(</span>
<span class="s2">"This migration probably took 2 hours, you don't really want to rollback ..."</span>
<span class="p">)</span>
<span class="k">class</span> <span class="nc">Migration</span><span class="p">(</span><span class="n">migrations</span><span class="o">.</span><span class="n">Migration</span><span class="p">):</span>
<span class="n">dependencies</span> <span class="o">=</span> <span class="p">[</span>
<span class="c1"># Needs to be filled, to figure it out run:</span>
<span class="c1"># django-admin makemigrations myapp --empty</span>
<span class="p">]</span>
<span class="n">operations</span> <span class="o">=</span> <span class="p">[</span><span class="n">migrations</span><span class="o">.</span><span class="n">RunPython</span><span class="p">(</span><span class="n">forwards</span><span class="p">,</span> <span class="n">backwards</span><span class="p">)]</span>
</pre></div>
<p>Note that this will take a long time to run (MySQL seems to copy all the tables for some reason) and it will change the
encoding and collation for all the columns <a class="footnote-reference" href="#footnote-9" id="footnote-reference-11">[8]</a>.</p>
</div>
<div class="section" id="the-transaction-isolation-level">
<h3>The transaction isolation level<a class="headerlink" href="#the-transaction-isolation-level" title="Permalink to this headline">
*</a></h3>
<p>Unfortunately getting the collation and encodings right won't make <a class="reference external" href="https://docs.djangoproject.com/en/1.7/ref/models/querysets/#get-or-create">get_or_create</a> work flawlessly <a class="footnote-reference" href="#footnote-10" id="footnote-reference-12">[9]</a>. The default
transaction isolation level for MySQL is <a class="reference external" href="http://dev.mysql.com/doc/refman/5.5/en/set-transaction.html#isolevel_repeatable-read">REPEATABLE READ</a>, which means reads are consistent within the same transaction - they will
return the same result, even if the data has changed outside the transaction. In other words, <a class="reference external" href="http://dev.mysql.com/doc/refman/5.5/en/set-transaction.html#isolevel_repeatable-read">REPEATABLE READ</a> will
break <a class="reference external" href="https://docs.djangoproject.com/en/1.7/ref/models/querysets/#get-or-create">get_or_create</a> in a transaction - code in a transaction may not "see" an object created
outside the transaction (for example, in another process).</p>
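<p>To make the failure mode concrete, here is a toy model in plain Python. It is not MySQL and not Django: the <tt class="docutils literal">ToyTable</tt>, <tt class="docutils literal">ToyTransaction</tt> and <tt class="docutils literal">racy_get_or_create</tt> names are made up for illustration, and inserts are assumed to commit immediately. It only mimics the two behaviours that matter here: reads served from a snapshot taken at the first read, and uniqueness checked against the latest committed data.</p>

```python
class DuplicateKey(Exception):
    """Stands in for the IntegrityError raised on a unique-key clash."""


class ToyTable:
    """Committed rows of a table with a unique key (toy model, not MySQL)."""

    def __init__(self):
        self.committed = set()


class ToyTransaction:
    def __init__(self, table, isolation):
        self.table = table
        self.isolation = isolation
        self.snapshot = None

    def _visible(self):
        if self.isolation == "READ COMMITTED":
            # Every read sees the latest committed data.
            return self.table.committed
        if self.snapshot is None:
            # REPEATABLE READ: the view is frozen at the first read.
            self.snapshot = frozenset(self.table.committed)
        return self.snapshot

    def get(self, key):
        return key in self._visible()

    def insert(self, key):
        # Uniqueness is checked against current data, not the snapshot.
        if key in self.table.committed:
            raise DuplicateKey(key)
        self.table.committed.add(key)


def racy_get_or_create(isolation):
    """Follow the get / create / get-again flow with a competing writer."""
    table = ToyTable()
    txn = ToyTransaction(table, isolation)
    if txn.get("stuff"):
        return "found"
    table.committed.add("stuff")  # another process commits the same row
    try:
        txn.insert("stuff")
        return "created"
    except DuplicateKey:
        # The retry only helps if the transaction can see the new row.
        return "found on retry" if txn.get("stuff") else "DoesNotExist"


results = {
    level: racy_get_or_create(level)
    for level in ("REPEATABLE READ", "READ COMMITTED")
}
```

<p>In this sketch the insert fails under both levels, but only the <tt class="docutils literal">READ COMMITTED</tt> transaction finds the row when it looks again.</p>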
<p>Unfortunately the defaults are hard to change without breaking existing apps <a class="footnote-reference" href="#footnote-11" id="footnote-reference-13">[10]</a>, so you have to work this out in your
connection settings. Now the settings are:</p>
<div class="highlight"><pre><span></span><span class="n">DATABASES</span> <span class="o">=</span> <span class="p">{</span>
<span class="s1">'default'</span><span class="p">:</span> <span class="p">{</span>
<span class="s1">'ENGINE'</span><span class="p">:</span> <span class="s1">'django.db.backends.mysql'</span><span class="p">,</span>
<span class="s1">'NAME'</span><span class="p">:</span> <span class="s1">'mydatabase'</span><span class="p">,</span>
<span class="s1">'OPTIONS'</span><span class="p">:</span> <span class="p">{</span>
<span class="s1">'sql_mode'</span><span class="p">:</span> <span class="s1">'TRADITIONAL'</span><span class="p">,</span>
<span class="s1">'charset'</span><span class="p">:</span> <span class="s1">'utf8'</span><span class="p">,</span>
<span class="s1">'init_command'</span><span class="p">:</span> <span class="s1">'SET '</span>
<span class="s1">'storage_engine=INNODB,'</span>
<span class="s1">'character_set_connection=utf8,'</span>
<span class="s1">'collation_connection=utf8_bin,'</span>
<span class="s1">'SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED'</span><span class="p">,</span>
<span class="p">}</span> <span class="c1"># Now we have a mild degree of confidence :-)</span>
<span class="p">}</span>
<span class="p">}</span>
</pre></div>
</div>
<div class="section" id="ddl-statements">
<h3>DDL statements<a class="headerlink" href="#ddl-statements" title="Permalink to this headline">
*</a></h3>
<p><a class="reference external" href="http://en.wikipedia.org/wiki/Data_definition_language#SQL">DDL</a> statements in MySQL are not only slow and lock
tables (that means downtime), they also ignore transactions. The almighty InnoDB can't save your sanity when an
<tt class="docutils literal">ALTER</tt> is used in a transaction. Thus, migrations in MySQL must be approached with great care (test them well) and to
avoid downtime you need to use specific external tools.</p>
<p>Annoying, even for development:</p>
<pre class="literal-block">
! Error found during real run of migration! Aborting.
! Since you have a database that does not support running
! schema-altering statements in transactions, we have had
! to leave it in an interim state between migrations.
! The South developers regret this has happened, and would
! like to gently persuade you to consider a slightly
! easier-to-deal-with DBMS.
</pre>
<p>If this is the same with the migration system in 1.7 (it seems it is, but you only get a plain traceback) then it's ideal
to keep migrations small in scope if you have <a class="reference external" href="https://docs.djangoproject.com/en/1.7/ref/migration-operations/#django.db.migrations.operations.RunSQL">custom SQL</a> that can
easily fail, and <a class="reference external" href="https://docs.djangoproject.com/en/1.7/topics/migrations/#squashing-migrations">squash them</a> later,
after you have successfully run them.</p>
</div>
</div>
<div class="section" id="the-inflexibility">
<h2>The inflexibility<a class="headerlink" href="#the-inflexibility" title="Permalink to this headline">
*</a></h2>
<p>There's something wrong with the query optimizer: it tends to use temporary tables for most queries that have a handful
of joins and a <tt class="docutils literal">GROUP BY</tt>. This is quite common with Django but the <cite>why and how</cite> is too large a topic to tackle here.
Maybe later ...</p>
<p>For now, a story of grief.</p>
<p>I have a model like this:</p>
<div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">Item</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">Model</span><span class="p">):</span>
<span class="k">class</span> <span class="nc">Meta</span><span class="p">:</span>
<span class="n">unique_together</span> <span class="o">=</span> <span class="s2">"type"</span><span class="p">,</span> <span class="s2">"name"</span><span class="p">,</span> <span class="s2">"parent"</span>
<span class="nb">type</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">CharField</span><span class="p">(</span><span class="n">max_length</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span> <span class="n">db_index</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<span class="n">name</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">CharField</span><span class="p">(</span><span class="n">max_length</span><span class="o">=</span><span class="mi">200</span><span class="p">,</span> <span class="n">db_index</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<span class="n">parent</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">ForeignKey</span><span class="p">(</span>
<span class="s2">"self"</span><span class="p">,</span>
<span class="n">null</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="c1"># Notice anything peculiar?</span>
<span class="n">blank</span><span class="o">=</span><span class="kc">True</span>
<span class="p">)</span>
</pre></div>
<p>For reasons unknown I went ahead with this model. Why it was designed like that and why it remained like that need not
be questioned. But what is wrong with it?</p>
<p>You see, creating a unique index on nullable columns is a bad idea as <tt class="docutils literal">NULL</tt> values are never considered equal to each other by the uniqueness check. The
database will disallow inserting <tt class="docutils literal">"foobar"</tt> two times but will allow inserting a boundless number of <tt class="docutils literal">NULL</tt> values.
For example, this would be allowed:</p>
<div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">Item</span><span class="o">.</span><span class="n">objects</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">"stuff"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="s2">"foobar"</span><span class="p">)</span><span class="o">.</span><span class="n">pk</span>
<span class="go">1L</span>
<span class="gp">>>> </span><span class="n">Item</span><span class="o">.</span><span class="n">objects</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">"stuff"</span><span class="p">,</span> <span class="nb">type</span><span class="o">=</span><span class="s2">"foobar"</span><span class="p">)</span><span class="o">.</span><span class="n">pk</span>
<span class="go">2L</span>
</pre></div>
<p>Given unfavorable conditions (parallelism), <a class="reference external" href="https://docs.djangoproject.com/en/1.7/ref/models/querysets/#get-or-create">get_or_create</a> would create duplicate objects - because the database lets
it.</p>
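<p>You can reproduce the effect without a MySQL server. The sketch below uses Python's built-in <tt class="docutils literal">sqlite3</tt> module purely as a stand-in (my assumption, not part of the original setup); SQLite also treats <tt class="docutils literal">NULL</tt> values as distinct in a unique index. The table layout is a simplified guess at what Django would generate for <tt class="docutils literal">Item</tt>, and <tt class="docutils literal">insert_item</tt> is a hypothetical helper.</p>

```python
import sqlite3

# In-memory SQLite database standing in for the real one (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myapp_item ("
    " id INTEGER PRIMARY KEY,"
    " type TEXT NOT NULL,"
    " name TEXT NOT NULL,"
    " parent_id INTEGER NULL,"
    " UNIQUE (type, name, parent_id))"
)


def insert_item(type_, name, parent_id=None):
    """Insert one row and return its primary key."""
    cur = conn.execute(
        "INSERT INTO myapp_item (type, name, parent_id) VALUES (?, ?, ?)",
        (type_, name, parent_id),
    )
    return cur.lastrowid


# Two identical rows are accepted because parent_id is NULL in both.
first = insert_item("foobar", "stuff")
second = insert_item("foobar", "stuff")

# With a non-NULL parent the same unique constraint does its job.
insert_item("foobar", "child", parent_id=first)
try:
    insert_item("foobar", "child", parent_id=first)
    duplicate_child_rejected = False
except sqlite3.IntegrityError:
    duplicate_child_rejected = True

null_parent_rows = conn.execute(
    "SELECT COUNT(*) FROM myapp_item WHERE parent_id IS NULL"
).fetchone()[0]
```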
<p>This is something normal in SQL and ideally you'd redesign the model and MySQL is not to be blamed, right? But, alas,
no. You see, I was using an ORM and I really liked the convenience of the <a class="reference external" href="https://docs.djangoproject.com/en/1.7/ref/models/fields/#foreignkey">ForeignKey field</a>. Pity me for my weakness for I was like a
thirsty fool in the desert.</p>
<p>The options I didn't have:</p>
<ul class="simple">
<li>Keep the constraint but replace the <tt class="docutils literal">parent</tt> in the index with a computed column. But MySQL doesn't have computed
columns!</li>
<li>Use a conditional index. Nope, MySQL doesn't allow conditional indexes.</li>
<li>Create a view. Make the index on the view. No, MySQL doesn't allow creating indexes on views.</li>
</ul>
<p>To add insult to injury, PostgreSQL had all these.</p>
<p>Creating some workaround on the client side was not an option. The application was highly parallel - this had to be
handled in the database. But I fought on and found a solution: TRIGGERS.</p>
<p>In my mad quest to solve this quickly I brought this into existence:</p>
<div class="highlight"><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="k">TRIGGER</span><span class="w"> </span><span class="n">check_unique_on_item</span><span class="w"></span>
<span class="k">BEFORE</span><span class="w"> </span><span class="k">INSERT</span><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="n">myapp_item</span><span class="w"></span>
<span class="k">FOR</span><span class="w"> </span><span class="k">EACH</span><span class="w"> </span><span class="k">ROW</span><span class="w"> </span><span class="k">BEGIN</span><span class="w"></span>
<span class="w"> </span><span class="k">IF</span><span class="w"> </span><span class="k">NEW</span><span class="p">.</span><span class="n">parent_id</span><span class="w"> </span><span class="k">IS</span><span class="w"> </span><span class="no">NULL</span><span class="w"></span>
<span class="w"> </span><span class="k">THEN</span><span class="w"></span>
<span class="w"> </span><span class="k">IF</span><span class="w"> </span><span class="p">(</span><span class="w"></span>
<span class="w"> </span><span class="k">SELECT</span><span class="w"> </span><span class="nf">COUNT</span><span class="p">(</span><span class="o">*</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">myapp_item</span><span class="w"> </span><span class="n">item</span><span class="w"></span>
<span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">item</span><span class="p">.</span><span class="n">parent_id</span><span class="w"> </span><span class="k">IS</span><span class="w"> </span><span class="no">NULL</span><span class="w"> </span><span class="k">AND</span><span class="w"></span>
<span class="w"> </span><span class="n">item</span><span class="p">.</span><span class="k">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">NEW</span><span class="p">.</span><span class="k">name</span><span class="w"> </span><span class="k">AND</span><span class="w"></span>
<span class="w"> </span><span class="n">item</span><span class="p">.</span><span class="k">type</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">NEW</span><span class="p">.</span><span class="k">type</span><span class="w"></span>
<span class="w"> </span><span class="p">)</span><span class="w"> </span><span class="o">></span><span class="w"> </span><span class="mi">0</span><span class="w"></span>
<span class="w"> </span><span class="k">THEN</span><span class="w"></span>
<span class="w"> </span><span class="k">SET</span><span class="w"> </span><span class="k">NEW</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'Error: Cannot insert this item. There is already an existing entry.'</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="k">END</span><span class="w"> </span><span class="k">IF</span><span class="p">;</span><span class="w"></span>
<span class="w"> </span><span class="k">END</span><span class="w"> </span><span class="k">IF</span><span class="p">;</span><span class="w"></span>
<span class="k">END</span><span class="p">;</span><span class="w"></span>
</pre></div>
<p>Because I was doing something invalid, just to stop the insert, <a class="reference external" href="https://docs.djangoproject.com/en/1.7/ref/models/querysets/#get-or-create">get_or_create</a> had to deal with a new type of error:
<tt class="docutils literal">DatabaseError</tt>. Thus the <tt class="docutils literal">Item</tt> model needed a custom manager:</p>
<div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">django.db.utils</span> <span class="kn">import</span> <span class="n">DatabaseError</span>
<span class="k">class</span> <span class="nc">ItemManager</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">Manager</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">get_or_create</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">**</span><span class="n">lookups</span><span class="p">):</span>
<span class="k">try</span><span class="p">:</span>
<span class="k">return</span> <span class="nb">super</span><span class="p">(</span><span class="n">ItemManager</span><span class="p">,</span> <span class="bp">self</span><span class="p">)</span><span class="o">.</span><span class="n">get_or_create</span><span class="p">(</span><span class="o">**</span><span class="n">lookups</span><span class="p">)</span>
<span class="k">except</span> <span class="n">DatabaseError</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
<span class="k">try</span><span class="p">:</span>
<span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="o">**</span><span class="n">lookups</span><span class="p">),</span> <span class="kc">False</span>
<span class="k">except</span> <span class="bp">self</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">DoesNotExist</span><span class="p">:</span>
<span class="k">raise</span> <span class="n">e</span>
</pre></div>
<p>Now if you look closely at the trigger you'll notice that it's not a real constraint and thus not atomic. I
realized during load tests that it doesn't really work ...</p>
<p>So I went back to the drawing board and came up with a new idea: create a shadow table that has all the constraints,
and triggers to update that table before the real one gets changed.</p>
<p>So I removed the <tt class="docutils literal">unique_together</tt> from the <tt class="docutils literal">Item</tt> model and created this:</p>
<div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">ItemUniqueFixup</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">Model</span><span class="p">):</span>
<span class="k">class</span> <span class="nc">Meta</span><span class="p">:</span>
<span class="n">unique_together</span> <span class="o">=</span> <span class="s2">"type"</span><span class="p">,</span> <span class="s2">"name"</span><span class="p">,</span> <span class="s2">"parent"</span>
<span class="nb">type</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">CharField</span><span class="p">(</span><span class="n">max_length</span><span class="o">=</span><span class="mi">20</span><span class="p">)</span>
<span class="n">name</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">CharField</span><span class="p">(</span><span class="n">max_length</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
<span class="n">parent</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">PositiveIntegerField</span><span class="p">()</span>
</pre></div>
<p>And created the triggers and filled the shadow table:</p>
<div class="highlight"><pre><span></span><span class="k">DROP</span><span class="w"> </span><span class="k">TRIGGER</span><span class="w"> </span><span class="k">IF</span><span class="w"> </span><span class="k">EXISTS</span><span class="w"> </span><span class="n">check_unique_on_item</span><span class="p">;</span><span class="w"></span>
<span class="k">TRUNCATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">myapp_itemuniquefixup</span><span class="p">;</span><span class="w"></span>
<span class="k">INSERT</span><span class="w"> </span><span class="k">INTO</span><span class="w"> </span><span class="n">myapp_itemuniquefixup</span><span class="w"> </span><span class="p">(</span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="k">type</span><span class="p">,</span><span class="w"> </span><span class="k">name</span><span class="p">,</span><span class="w"> </span><span class="n">parent</span><span class="p">)</span><span class="w"></span>
<span class="k">SELECT</span><span class="w"> </span><span class="k">old</span><span class="p">.</span><span class="n">id</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="k">old</span><span class="p">.</span><span class="k">type</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="k">type</span><span class="p">,</span><span class="w"> </span><span class="k">old</span><span class="p">.</span><span class="k">name</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="k">name</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="k">IF</span><span class="p">(</span><span class="k">old</span><span class="p">.</span><span class="n">parent_id</span><span class="w"> </span><span class="k">IS</span><span class="w"> </span><span class="no">NULL</span><span class="p">,</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w"> </span><span class="k">old</span><span class="p">.</span><span class="n">parent_id</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="n">parent</span><span class="w"></span>
<span class="k">FROM</span><span class="w"> </span><span class="n">myapp_item</span><span class="w"> </span><span class="k">old</span><span class="p">;</span><span class="w"></span>
<span class="k">CREATE</span><span class="w"> </span><span class="k">TRIGGER</span><span class="w"> </span><span class="n">check_unique_on_item_insert</span><span class="w"></span>
<span class="k">BEFORE</span><span class="w"> </span><span class="k">INSERT</span><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="n">myapp_item</span><span class="w"></span>
<span class="k">FOR</span><span class="w"> </span><span class="k">EACH</span><span class="w"> </span><span class="k">ROW</span><span class="w"> </span><span class="k">BEGIN</span><span class="w"></span>
<span class="w"> </span><span class="k">INSERT</span><span class="w"> </span><span class="k">INTO</span><span class="w"> </span><span class="n">myapp_itemuniquefixup</span><span class="w"> </span><span class="p">(</span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="k">type</span><span class="p">,</span><span class="w"> </span><span class="k">name</span><span class="p">,</span><span class="w"> </span><span class="n">parent</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="k">VALUES</span><span class="w"> </span><span class="p">(</span><span class="w"></span>
<span class="w"> </span><span class="k">NEW</span><span class="p">.</span><span class="n">id</span><span class="p">,</span><span class="w"> </span><span class="k">NEW</span><span class="p">.</span><span class="k">type</span><span class="p">,</span><span class="w"> </span><span class="k">NEW</span><span class="p">.</span><span class="k">name</span><span class="p">,</span><span class="w"> </span><span class="k">IF</span><span class="p">(</span><span class="k">NEW</span><span class="p">.</span><span class="n">parent_id</span><span class="w"> </span><span class="k">IS</span><span class="w"> </span><span class="no">NULL</span><span class="p">,</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w"> </span><span class="k">NEW</span><span class="p">.</span><span class="n">parent_id</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="p">);</span><span class="w"></span>
<span class="k">END</span><span class="p">;</span><span class="w"></span>
<span class="k">CREATE</span><span class="w"> </span><span class="k">TRIGGER</span><span class="w"> </span><span class="n">check_unique_on_item_update</span><span class="w"></span>
<span class="k">BEFORE</span><span class="w"> </span><span class="k">UPDATE</span><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="n">myapp_item</span><span class="w"></span>
<span class="k">FOR</span><span class="w"> </span><span class="k">EACH</span><span class="w"> </span><span class="k">ROW</span><span class="w"> </span><span class="k">BEGIN</span><span class="w"></span>
<span class="w"> </span><span class="k">UPDATE</span><span class="w"> </span><span class="n">myapp_itemuniquefixup</span><span class="w"> </span><span class="n">fix</span><span class="w"></span>
<span class="w"> </span><span class="k">SET</span><span class="w"> </span><span class="n">fix</span><span class="p">.</span><span class="k">type</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">NEW</span><span class="p">.</span><span class="k">type</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">fix</span><span class="p">.</span><span class="k">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">NEW</span><span class="p">.</span><span class="k">name</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">fix</span><span class="p">.</span><span class="n">parent</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">IF</span><span class="p">(</span><span class="k">NEW</span><span class="p">.</span><span class="n">parent_id</span><span class="w"> </span><span class="k">IS</span><span class="w"> </span><span class="no">NULL</span><span class="p">,</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w"> </span><span class="k">NEW</span><span class="p">.</span><span class="n">parent_id</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="k">NEW</span><span class="p">.</span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">fix</span><span class="p">.</span><span class="n">id</span><span class="p">;</span><span class="w"></span>
<span class="k">END</span><span class="p">;</span><span class="w"></span>
<span class="k">CREATE</span><span class="w"> </span><span class="k">TRIGGER</span><span class="w"> </span><span class="n">check_unique_on_item_delete</span><span class="w"></span>
<span class="k">BEFORE</span><span class="w"> </span><span class="k">DELETE</span><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="n">myapp_item</span><span class="w"></span>
<span class="k">FOR</span><span class="w"> </span><span class="k">EACH</span><span class="w"> </span><span class="k">ROW</span><span class="w"> </span><span class="k">BEGIN</span><span class="w"></span>
<span class="w"> </span><span class="k">DELETE</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">myapp_itemuniquefixup</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">OLD</span><span class="p">.</span><span class="n">id</span><span class="p">;</span><span class="w"></span>
<span class="k">END</span><span class="p">;</span><span class="w"></span>
</pre></div>
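<p>The core of the shadow table idea can be tried out in isolation. This is a reduced sketch using Python's <tt class="docutils literal">sqlite3</tt> module as a stand-in database (an assumption; the triggers above are MySQL). It keeps only the insert trigger, drops the <tt class="docutils literal">id</tt> bookkeeping needed for updates and deletes, and uses <tt class="docutils literal">COALESCE</tt> in place of <tt class="docutils literal">IF(... IS NULL, 0, ...)</tt>. The <tt class="docutils literal">try_insert</tt> helper is hypothetical.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE myapp_item (
    id INTEGER PRIMARY KEY,
    type TEXT NOT NULL,
    name TEXT NOT NULL,
    parent_id INTEGER NULL
);
-- Shadow table: parent is never NULL, so the unique constraint bites.
CREATE TABLE myapp_itemuniquefixup (
    type TEXT NOT NULL,
    name TEXT NOT NULL,
    parent INTEGER NOT NULL,
    UNIQUE (type, name, parent)
);
-- A NULL parent is stored as 0 in the shadow table before the real insert.
CREATE TRIGGER check_unique_on_item_insert
BEFORE INSERT ON myapp_item
FOR EACH ROW BEGIN
    INSERT INTO myapp_itemuniquefixup (type, name, parent)
    VALUES (NEW.type, NEW.name, COALESCE(NEW.parent_id, 0));
END;
""")


def try_insert(type_, name, parent_id=None):
    """Return True if the row was inserted, False if the shadow table refused it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO myapp_item (type, name, parent_id) VALUES (?, ?, ?)",
                (type_, name, parent_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


outcomes = [
    try_insert("foobar", "stuff"),               # first root item
    try_insert("foobar", "stuff"),               # duplicate root item
    try_insert("foobar", "stuff", parent_id=1),  # same name, but with a parent
]
item_count = conn.execute("SELECT COUNT(*) FROM myapp_item").fetchone()[0]
shadow_count = conn.execute(
    "SELECT COUNT(*) FROM myapp_itemuniquefixup"
).fetchone()[0]
```

<p>Because the uniqueness check is now a real constraint on the shadow table, the database itself rejects the duplicate instead of a racy <tt class="docutils literal">SELECT COUNT(*)</tt>.</p>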
<p>Now you see, this was very tricky to get right. With conditional indexes in PostgreSQL this would have been as easy as:</p>
<div class="highlight"><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="k">UNIQUE</span><span class="w"> </span><span class="k">INDEX</span><span class="w"> </span><span class="n">myapp_item_noparent</span><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="n">myapp_item</span><span class="p">(</span><span class="k">type</span><span class="p">,</span><span class="w"> </span><span class="k">name</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">parent_id</span><span class="w"> </span><span class="k">IS</span><span class="w"> </span><span class="k">NULL</span><span class="p">;</span><span class="w"></span>
<span class="k">CREATE</span><span class="w"> </span><span class="k">UNIQUE</span><span class="w"> </span><span class="k">INDEX</span><span class="w"> </span><span class="n">myapp_item_withparent</span><span class="w"> </span><span class="k">ON</span><span class="w"> </span><span class="n">myapp_item</span><span class="p">(</span><span class="k">type</span><span class="p">,</span><span class="w"> </span><span class="k">name</span><span class="p">,</span><span class="w"> </span><span class="n">parent_id</span><span class="p">)</span><span class="w"></span>
<span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">parent_id</span><span class="w"> </span><span class="k">IS</span><span class="w"> </span><span class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span class="p">;</span><span class="w"></span>
</pre></div>
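<p>For comparison, the pair of conditional indexes can be exercised with Python's <tt class="docutils literal">sqlite3</tt> module, since SQLite happens to support partial indexes too. This is only a stand-in for PostgreSQL (an assumption on my part), the column is called <tt class="docutils literal">parent_id</tt> as Django would name it, and <tt class="docutils literal">try_insert</tt> is a hypothetical helper.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE myapp_item (
    id INTEGER PRIMARY KEY,
    type TEXT NOT NULL,
    name TEXT NOT NULL,
    parent_id INTEGER NULL
);
-- Root items: unique on (type, name) only.
CREATE UNIQUE INDEX myapp_item_noparent ON myapp_item(type, name)
    WHERE parent_id IS NULL;
-- Child items: unique on (type, name, parent_id).
CREATE UNIQUE INDEX myapp_item_withparent ON myapp_item(type, name, parent_id)
    WHERE parent_id IS NOT NULL;
""")


def try_insert(type_, name, parent_id=None):
    """Return True if the row was inserted, False on a uniqueness violation."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO myapp_item (type, name, parent_id) VALUES (?, ?, ?)",
                (type_, name, parent_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


outcomes = [
    try_insert("foobar", "stuff"),               # first root item
    try_insert("foobar", "stuff"),               # duplicate root item
    try_insert("foobar", "stuff", parent_id=1),  # first child of item 1
    try_insert("foobar", "stuff", parent_id=1),  # duplicate child of item 1
]
```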
</div>
<div class="section" id="in-hindsight">
<h2>In hindsight ...<a class="headerlink" href="#in-hindsight" title="Permalink to this headline">
*</a></h2>
<p>Had I known all this from the start, maybe the ride would have been in <cite>easy mode</cite>. But the more you go on the more you
can't stop thinking <cite>"what if I had used PostgreSQL"</cite>. You can't shake off the dirty sensation of <cite>"I have made a terrible
choice"</cite> if you have to explain these awful quirks of MySQL to every new programmer. In a way this is technical debt -
you have to pay the cognitive burden to correctly use MySQL.</p>
<table class="docutils footnote" frame="void" id="footnote-1" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-1">[*]</a></td><td>If these are problems of the past, I don't care. It has been a terrible journey and it needs to be told.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-2" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-2">[1]</a></td><td>See: <a class="reference external" href="http://dev.mysql.com/doc/refman/5.5/en/sql-mode.html#sqlmode_no_zero_date">http://dev.mysql.com/doc/refman/5.5/en/sql-mode.html#sqlmode_no_zero_date</a> and
<a class="reference external" href="http://dev.mysql.com/doc/refman/5.5/en/sql-mode.html#sqlmode_no_zero_in_date">http://dev.mysql.com/doc/refman/5.5/en/sql-mode.html#sqlmode_no_zero_in_date</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-3" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-3">[2]</a></td><td>See: <a class="reference external" href="http://dev.mysql.com/doc/refman/5.5/en/sql-mode.html#sqlmode_no_engine_substitution">http://dev.mysql.com/doc/refman/5.5/en/sql-mode.html#sqlmode_no_engine_substitution</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-4" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-4">[3]</a></td><td>See: <a class="reference external" href="http://dev.mysql.com/doc/refman/5.5/en/sql-mode.html#sqlmode_strict_all_tables">http://dev.mysql.com/doc/refman/5.5/en/sql-mode.html#sqlmode_strict_all_tables</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-5" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-5">[4]</a></td><td>See: <a class="reference external" href="http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-upgrading.html">http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-upgrading.html</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-6" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-6">[5]</a></td><td>See: <a class="reference external" href="https://code.djangoproject.com/ticket/18392">https://code.djangoproject.com/ticket/18392</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-7" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label">[6]</td><td><em>(<a class="fn-backref" href="#footnote-reference-8">1</a>, <a class="fn-backref" href="#footnote-reference-9">2</a>)</em> See: <a class="reference external" href="http://dev.mysql.com/doc/refman/5.5/en/charset-general.html">http://dev.mysql.com/doc/refman/5.5/en/charset-general.html</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-8" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-10">[7]</a></td><td><p class="first">More details: <a class="reference external" href="http://mzsanford.wordpress.com/2010/12/28/mysql-and-unicode/">http://mzsanford.wordpress.com/2010/12/28/mysql-and-unicode/</a></p>
<p>Also, worth checking out:</p>
<ul class="last simple">
<li><a class="reference external" href="https://wiki.postgresql.org/wiki/Things_to_find_out_about_when_moving_from_MySQL_to_PostgreSQL">https://wiki.postgresql.org/wiki/Things_to_find_out_about_when_moving_from_MySQL_to_PostgreSQL</a></li>
<li><a class="reference external" href="http://www.postgresql.org/docs/9.1/static/collation.html">http://www.postgresql.org/docs/9.1/static/collation.html</a></li>
</ul>
</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-9" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-11">[8]</a></td><td>See: <a class="reference external" href="http://dev.mysql.com/doc/refman/5.5/en/alter-table.html">http://dev.mysql.com/doc/refman/5.5/en/alter-table.html</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-10" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-12">[9]</a></td><td>See: <a class="reference external" href="http://jmoiron.net/blog/innodb-transaction-isolation/">http://jmoiron.net/blog/innodb-transaction-isolation/</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-11" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-13">[10]</a></td><td>See: <a class="reference external" href="https://code.djangoproject.com/ticket/13906">#13906</a> <a class="reference external" href="https://code.djangoproject.com/ticket/14026">#14026</a></td></tr>
</tbody>
</table>
<!-- http://sqlblog.com/blogs/alexander_kuznetsov/archive/2013/11/15/learning-postgresql-differences-in-implementation-of-constraints.aspx -->
<table class="docutils footnote" frame="void" id="footnote-12" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-7">[11]</a></td><td>See: <a class="reference external" href="http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-upgrading.html">http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-upgrading.html</a></td></tr>
</tbody>
</table>
</div>
Ramblings about data-structures2014-09-22T00:00:00+03:002016-05-02T00:00:00+03:00Ionel Cristian Mărieștag:blog.ionelmc.ro,2014-09-22:/2014/09/22/ramblings-about-data-structures/<p>I was reading <a class="reference external" href="http://www.occasionalinspiration.com/2014/09/100-functions.html">this</a> the other day. The article
presents this peculiar quote:</p>
<blockquote class="highlights">
<p>"It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data
structures."</p>
<p class="attribution">—Alan J. Perlis <a class="footnote-reference" href="#footnote-1" id="footnote-reference-1">[1]</a></p>
</blockquote>
<p>It then raises some thoughts and questions about the quote that seemed a bit ridiculous. But what does this seemingly
ambiguous quote really mean?</p>
<p>After some thought, I've concluded that it's an apologia for Lisp's <em>lists everywhere</em> philosophy.</p>
<p>You can reduce the quote to:</p>
<blockquote class="highlights">
It's better to have one extremely generic type than 10 incompatible types.</blockquote>
<p>Python already does this: every object implements, surprise, an object interface that boils down to a bunch of magic
methods.</p>
<p>On the other hand, Lisp has <a class="reference external" href="http://en.wikipedia.org/wiki/Generic_function">generic functions</a>. What this means is that
there is a dispatch system that binds certain function implementations to certain data-structures. A data structure that
is bound with specific actions (aka functions or methods) is what I call a <em>type</em>.</p>
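<p>As a rough illustration of such a dispatch system, here is a minimal sketch using Python's <tt class="docutils literal">functools.singledispatch</tt>. The <tt class="docutils literal">describe</tt> function and its messages are made up for this example, and unlike Lisp's generic functions it only dispatches on the type of the first argument:</p>

```python
from functools import singledispatch


@singledispatch
def describe(value):
    """Fallback used when no more specific implementation is registered."""
    return "some object"


@describe.register
def _(value: list):
    # Implementation bound to the list data-structure.
    return f"a list of {len(value)} items"


@describe.register
def _(value: dict):
    # Implementation bound to the dict data-structure.
    return f"a mapping with keys {sorted(value)}"
```

<p>Calling <tt class="docutils literal">describe</tt> with a list, a dict or anything else picks the implementation bound to that data-structure, which is the "type" in the sense used above.</p>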
<p>The idea of not having specialised data-structures is an illusion - if a function takes something as an input then you
have assumed a specific data structure, not just a mere list. This is why I think it's worthwhile designing the
data-structures before the actions you'll want to have in the system.</p>
<p>Sounds counter-intuitive, as actions make the system run, not the data-structures - it seems natural to map out the
actions first. Alas, starting with data-structures first lets you easily see what actions you could have and what
actions you can't. In other words, it gives you perspective on many things: hard constraints, data flow and dependencies.</p>
<p>Actions alone don't give you perspective on dependencies. Dependencies imply state. State implies data. Nor do actions
give you perspective on what the system can and can't do - actions depend on inputs, <em>data</em> in other words. To put it
another way, data is the limiting factor on what actions you could have and what actions you could not.</p>
<p>Given Python's lax access to the innards of objects, a large part of your type's API is just the data-structure. Also,
given Python's support for properties <a class="footnote-reference" href="#footnote-2" id="footnote-reference-2">[2]</a>, a large part of your API could be something that <em>looks like</em> a
data-structure. So it's worthwhile to look really hard at this aspect of software design in the early stages of your
project.</p>
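<p>A small sketch of both points, assuming a hypothetical <tt class="docutils literal">Invoice</tt> class (the class and its fields are invented for illustration): <tt class="docutils literal">lines</tt> is plain data exposed directly, while <tt class="docutils literal">total</tt> is a property that merely looks like data.</p>

```python
class Invoice:
    def __init__(self, lines):
        # Plain data attribute: a list of (description, amount) pairs.
        # Consumers can read and modify it directly, so it is part of the API.
        self.lines = list(lines)

    @property
    def total(self):
        # Computed on every access, but read like a stored field.
        return sum(amount for _, amount in self.lines)


invoice = Invoice([("widget", 10), ("gadget", 5)])
```

<p>From the consumer's side <tt class="docutils literal">invoice.lines</tt> and <tt class="docutils literal">invoice.total</tt> are indistinguishable in kind, so the shape of the data ends up being the shape of the API.</p>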
<div class="section" id="generality-of-types">
<h2>Generality of types<a class="headerlink" href="#generality-of-types" title="Permalink to this headline">
*</a></h2>
<img alt="A graph with few examples of Utility/Reach proportions" src="https://blog.ionelmc.ro/2014/09/22/ramblings-about-data-structures/data-structures.png" />
<p>There are two main properties of types:</p>
<ul class="simple">
<li><span class="uppercase">Utility</span>: How well does the type support the consumer of said type? Does it have all the required
actions? Is the API well suited, or does it make the consumer handle things they should not be concerned with? Those are
the key questions.</li>
<li><span class="uppercase">Reach</span>: How many distinct consumers can use this type? Does the type bring unwanted dependencies or
concerns to the consumer? Does the type have many actions that go unused, and can't be used, by a large part of all
the possible consumers?</li>
</ul>
<p>To give a few examples:</p>
<ul class="simple">
<li>A <cite>list</cite> has a very large <span class="uppercase">reach</span> and most of the time it suits consumers that just need a
sequence-like type fairly well. However, the <span class="uppercase">utility</span> is very limited - you wouldn't use a list where you would use a
higher level type, like an <cite>invoice</cite>.</li>
<li>An <cite>invoice</cite> has a very limited <span class="uppercase">reach</span>, you can only use it in billing code. But the <span class="uppercase">utility</span>
of it is tremendous - you wouldn't want to use a mere <cite>list</cite> - you'd burden your payment code with concerns that are
better encapsulated in the <cite>invoice</cite> type.</li>
</ul>
<p>There's a tradeoff in having both <span class="uppercase">reach</span> and <span class="uppercase">utility</span>: complexity vs re-usability. Something with
<span class="uppercase">reach</span> and <span class="uppercase">utility</span> can be used in many places. However, complexity is bound to occur - handling
all those disparate use-cases is taxing.</p>
<p>I'd even go as far as to argue there's a hard limit to reaching both goals - pushing one goal limits the other, from the
perspective of what you can have in an API.</p>
<p>If you can afford to change things later it's best to start with good <span class="uppercase">utility</span> and then move towards
<span class="uppercase">reach</span> as use-cases arise. Otherwise you're at risk of over-engineering and wasting time both on
development and maintenance.</p>
<div class="section" id="what-about-the-javascript-array">
<h3>What about the JavaScript Array?<a class="headerlink" href="#what-about-the-javascript-array" title="Permalink to this headline">
*</a></h3>
<p>The <cite>Array</cite> object (like any other object in JavaScript) is a very interesting problem from the perspective of the iterator
interface. A <tt class="docutils literal">for (var element in variable)</tt> block will iterate over the keys of whatever is there, both attribute names and the indexes of the
actual sequence. From this perspective the <cite>Array</cite> increases complexity (there's a cognitive burden, both on the
implementers of the <cite>Array</cite> object and the users of it). If the <cite>Array</cite> did not allow attributes then this wouldn't be
such an issue. But then the <span class="uppercase">reach</span> would be smaller.</p>
<p>On the other hand, you could in theory use an <cite>Array</cite> object as an <cite>invoice</cite> substitute, you could just slap the <cite>invoice</cite>
fields like buyer, seller, total value, reference number etc on the <cite>Array</cite> object. So from this perspective it has higher
<span class="uppercase">utility</span> than a plain <cite>list</cite> where you can't have arbitrary attributes (<tt class="docutils literal">AttributeError: 'list' object has no
attribute 'buyer'</tt>).</p>
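<p>The Python side of that comparison can be sketched as follows. <tt class="docutils literal">AttributeList</tt> and <tt class="docutils literal">try_set_buyer</tt> are hypothetical names used only for this example: a plain <cite>list</cite> rejects arbitrary attributes, while a trivial subclass behaves more like the JavaScript <cite>Array</cite>.</p>

```python
class AttributeList(list):
    """A list subclass; its instances get a __dict__, so attributes can be set."""


def try_set_buyer(sequence, buyer):
    """Return True if an arbitrary attribute could be attached to the sequence."""
    try:
        sequence.buyer = buyer
    except AttributeError:
        # Plain list instances have no __dict__ to store new attributes in.
        return False
    return True
```

<p>The subclass still works everywhere a <cite>list</cite> does, but consumers now have to know which attributes may or may not be present, which is the same cognitive burden described above.</p>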
</div>
</div>
<div class="section" id="importance-of-data-structures">
<h2>Importance of data-structures<a class="headerlink" href="#importance-of-data-structures" title="Permalink to this headline">
*</a></h2>
<p>So, you see, designing data-structures is tricky. There are tradeoffs to be made. If your data is wrongly designed then
you'll have to compensate for those flaws in your code. That means more code, crappy code, to maintain.</p>
<p>Interestingly, this has been well put almost 40 years ago:</p>
<blockquote class="highlights">
<p>"Show me your tables, and I won't usually need your flowchart; it'll be obvious."</p>
<p class="attribution">—Fred Brooks, in <cite>Chapter 9</cite> of <cite>The Mythical Man-Month</cite> <a class="footnote-reference" href="#footnote-3" id="footnote-reference-3">[3]</a></p>
</blockquote>
<p>The same thing, in more modern language:</p>
<blockquote class="highlights">
<p>"Show me your data structures, and I won't usually need your code; it'll be obvious."</p>
<p class="attribution">—Eric Raymond, in <cite>The Cathedral and the Bazaar</cite>, paraphrasing Brooks <a class="footnote-reference" href="#footnote-3" id="footnote-reference-4">[3]</a></p>
</blockquote>
<p>Though it seems this isn't instilled in everyone's mind: some people actually think that you can accurately reason about
business logic before reasoning about the data <a class="footnote-reference" href="#footnote-4" id="footnote-reference-5">[4]</a>. You can't reliably reason about logic if you don't know what data
dependencies said logic has.</p>
</div>
<div class="section" id="chicken-and-egg">
<h2>Chicken and egg<a class="headerlink" href="#chicken-and-egg" title="Permalink to this headline">
*</a></h2>
<p>There are situations where you can't know what data you need to store before you know what the actions are. So you'd be
inclined to start thinking about all the actions first.</p>
<p>I prefer to sketch out a minimalistic data-structure and refine it as I become aware of what actions I need to support,
making a note of what actions I have so far. This works reasonably well and allows me to easily see any redundancy or
potential to generalize, or the need to specialize for that matter. In a way, this is similar to <a class="reference external" href="http://en.wikipedia.org/wiki/Class-responsibility-collaboration_card">CRC cards</a> but more profound.</p>
</div>
<div class="section" id="redundancy">
<h2>Redundancy<a class="headerlink" href="#redundancy" title="Permalink to this headline">
*</a></h2>
<p>Starting with the data first allows you to easily see any redundancy in the data and make intelligent <a class="reference external" href="http://en.wikipedia.org/wiki/Database_normalization">normalization</a> choices.</p>
<p>Duplicated data-structures, especially ones that are only slightly different yet distinct, are a special kind of evil.
They will frequently encourage, and sometimes even force the programmer to produce duplicated code, or code that <cite>tries</cite>
to handle all the variances. It's very tempting because they are so similar. But alas, you can only think of so many
things at the same time.</p>
<p>Even if you don't want to normalize your data, starting with the data first can result in <cite>synergy</cite>: the data from
<cite>this</cite> place mixes or adapts well to the data from <cite>that other</cite> place. This <cite>synergy</cite> will reduce the amount of
boilerplate and adapter code.</p>
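<p>To make the cost of near-duplicate structures concrete, here is a small sketch. The two order shapes, the field names and the <tt class="docutils literal">Order</tt> type are all invented for illustration:</p>

```python
from dataclasses import dataclass

# Two near-duplicate shapes for the same concept force every consumer to branch.
web_order = {"customer": "alice", "total_cents": 1250}
phone_order = {"client_name": "bob", "total": 12.5}


def total_cents_branching(order):
    """Code that tries to handle all the variances of the duplicated shapes."""
    if "total_cents" in order:
        return order["total_cents"]
    return round(order["total"] * 100)


@dataclass
class Order:
    """A single normalized shape; consumers no longer need to branch."""
    customer: str
    total_cents: int


def normalize(order):
    """Adapt either legacy shape once, at the boundary."""
    customer = order.get("customer", order.get("client_name"))
    return Order(customer=customer, total_cents=total_cents_branching(order))
```

<p>With a single <tt class="docutils literal">Order</tt> shape the branching lives in one adapter instead of leaking into every function that touches an order.</p>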
</div>
<div class="section" id="closing-thoughts">
<h2>Closing thoughts<a class="headerlink" href="#closing-thoughts" title="Permalink to this headline">
*</a></h2>
<p>Alan J. Perlis' principle can't be applied in all situations as the <span class="uppercase">reach</span> property is not always the
imperative of software design and it has some disadvantages as illustrated above.</p>
<p>There are situations where none of these ideas apply (this list is not meant to be complete):</p>
<ul class="simple">
<li>You don't have any data or your application is not data-centric, or there are other more pressing things to consider
first.</li>
<li>You live in the perfect world of frozen requirements that are known before implementation.</li>
</ul>
<table class="docutils footnote" frame="void" id="footnote-1" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-1">[1]</a></td><td>9th quote: <a class="reference external" href="http://www.cs.yale.edu/homes/perlis-alan/quotes.html">http://www.cs.yale.edu/homes/perlis-alan/quotes.html</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-2" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-2">[2]</a></td><td>Python allows you to implement transparent getters and setters via descriptors or the <tt class="docutils literal">property</tt> builtin (which
implements the descriptor interface).</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-3" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label">[3]</td><td><em>(<a class="fn-backref" href="#footnote-reference-3">1</a>, <a class="fn-backref" href="#footnote-reference-4">2</a>)</em> <a class="reference external" href="http://www.dreamsongs.com/ObjectsHaveNotFailedNarr.html">http://www.dreamsongs.com/ObjectsHaveNotFailedNarr.html</a></td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-4" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-5">[4]</a></td><td><p class="first">A most peculiar remark: "<cite>As a symptom of this, I've interviewed candidates who, when presented with a simple OO
business logic exercise, start by writing Django models. Please don't do this.</cite>"</p>
<p class="last">—<a class="reference external" href="http://mauveweb.co.uk/posts/2014/08/organising-django-projects.html">http://mauveweb.co.uk/posts/2014/08/organising-django-projects.html</a></p>
</td></tr>
</tbody>
</table>
</div>
Packaging a python library2014-05-25T00:00:00+03:002019-09-30T00:00:00+03:00Ionel Cristian Mărieștag:blog.ionelmc.ro,2014-05-25:/2014/05/25/python-packaging/<div class="admonition note">
<p class="first admonition-title">Note</p>
<p>This is about packaging <em>libraries</em>, <strong>not</strong> <em>applications</em>.</p>
<p>⸻</p>
<p class="last">All the advice here is implemented in a project template (with full support for C extensions):
<a class="reference external" href="https://github.com/ionelmc/cookiecutter-pylibrary">cookiecutter-pylibrary</a> (<a class="reference external" href="https://blog.ionelmc.ro/2014/08/08/a-python-package-template/">introduction</a>).</p>
</div>
<p>I think the packaging best practices should be revisited: there are lots of good tools nowadays that are either unused or
underused. It's generally a good thing to re-evaluate best practices all the time.</p>
<p>I assume here that your package is to be tested on multiple Python versions, with different combinations of dependency
versions, settings etc.</p>
<p>And a few principles that I like to follow when packaging:</p>
<ul class="simple">
<li>If there's a tool that can help with testing use it. Don't waste time building a custom test runner if you can just
use <a class="reference external" href="http://pytest.org/latest/">py.test</a> or <a class="reference external" href="http://nose.readthedocs.org/en/latest/">nose</a>. They come with a large
ecosystem of plugins that can improve your testing.</li>
<li>When possible, prevent issues early. This is mostly a matter of strictness and exhaustive testing. Design things to
prevent common mistakes.</li>
<li>Collect all the coverage data. Record it. Identify regressions.</li>
<li>Test all the possible configurations.</li>
</ul>
<div class="section" id="the-structure">
<h2>The structure<a class="headerlink" href="#the-structure" title="Permalink to this headline">
*</a></h2>
<p>This is fairly important, everything revolves around this. I prefer this sort of layout:</p>
<pre class="literal-block">
├─ src
│  └─ packagename
│     ├─ __init__.py
│     └─ ...
├─ tests
│  └─ ...
└─ setup.py
</pre>
<p>The <tt class="docutils literal">src</tt> directory is a better approach because:</p>
<ul>
<li><p class="first">You get <cite>import parity</cite>. The current directory is implicitly included in <tt class="docutils literal">sys.path</tt>, but not when installing and
importing from <tt class="docutils literal"><span class="pre">site-packages</span></tt>. Users will never have the same current working directory as you do.</p>
<p>This constraint has beneficial implications in both testing and packaging:</p>
<ul class="simple">
<li>You will be forced to test the installed code (e.g.: by installing in a virtualenv). This will ensure that the deployed code
works (it's packaged correctly) - otherwise your tests will fail. Early. Before you can publish a broken
distribution.</li>
<li>You will be forced to install the distribution. If you ever uploaded a distribution on <a class="reference external" href="https://pypi.python.org/pypi">PyPI</a> with missing modules or broken dependencies it's because you didn't test the
installation. Just being able to successfully build the <tt class="docutils literal">sdist</tt> doesn't guarantee it will actually install!</li>
</ul>
</li>
<li><p class="first">It prevents you from readily importing your code in the <tt class="docutils literal">setup.py</tt> script. This is a bad practice because it will
always blow up if importing the main package or module triggers additional imports for dependencies (which may not be
available <a class="footnote-reference" href="#footnote-5" id="footnote-reference-1">[5]</a>). Best to not make it possible in the first place.</p>
</li>
<li><p class="first">Simpler packaging code and <a class="reference external" href="https://docs.python.org/2/distutils/sourcedist.html#commands">manifest</a>. It makes
manifests very simple to write (e.g.: you package a Django app that has templates or static files). Also, zero fuss
for large libraries that have multiple packages. Clear separation of code being packaged and code doing the packaging.</p>
<p>Without <tt class="docutils literal">src</tt> writing a <tt class="docutils literal">MANIFEST.in</tt> is tricky <a class="footnote-reference" href="#footnote-6" id="footnote-reference-2">[6]</a>. If your manifest is broken your tests will fail. It's much
easier with a <tt class="docutils literal">src</tt> directory: just add <tt class="docutils literal">graft src</tt> in <tt class="docutils literal">MANIFEST.in</tt>.</p>
<p>Publishing a broken package to PyPI is not fun.</p>
</li>
<li><p class="first">Without <tt class="docutils literal">src</tt> you get <a class="reference external" href="https://github.com/ionelmc/python-packaging-blunders">messy</a> editable installs ("<tt class="docutils literal">setup.py
develop</tt>" or "<tt class="docutils literal">pip install <span class="pre">-e</span></tt>"). Having no separation (no <tt class="docutils literal">src</tt> dir) will force setuptools to put your project's
root on <tt class="docutils literal">sys.path</tt> - with all the junk in it (e.g.: <tt class="docutils literal">setup.py</tt> and other test or configuration scripts will
unwittingly become importable).</p>
</li>
<li><p class="first">There are better tools. You don't need to deal with installing packages just to run the tests anymore. Just use <a class="reference external" href="https://testrun.org/tox/latest/">tox</a> -
it will install the package for you <a class="footnote-reference" href="#footnote-2" id="footnote-reference-3">[2]</a> automatically, zero fuss, zero friction.</p>
</li>
<li><p class="first">Less chance for user mistakes - they <a class="reference external" href="https://blog.ionelmc.ro/2014/06/25/python-packaging-pitfalls/">will happen</a> -
assume nothing!</p>
</li>
<li><p class="first">Less chance for tools to mix up code with non-code.</p>
</li>
</ul>
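<p>One way to check that a test run really exercises the installed code rather than the checkout is to look at where the import system finds the package. The helpers below are a minimal sketch, not part of any tool mentioned here; the function names are hypothetical, and the check assumes packages are installed into the directories reported by <tt class="docutils literal">sysconfig</tt> (some distributions use additional directories):</p>

```python
import importlib.util
import sysconfig
from pathlib import Path


def module_origin(module_name):
    """Return the resolved file path a module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or not spec.origin:
        return None
    return Path(spec.origin).resolve()


def is_installed_copy(module_name):
    """Return True if the module would be imported from a site-packages directory."""
    origin = module_origin(module_name)
    if origin is None:
        return False
    site_dirs = {
        Path(sysconfig.get_paths()[key]).resolve()
        for key in ("purelib", "platlib")
    }
    # The module counts as installed if one of its parent directories is a site dir.
    return any(site_dir in origin.parents for site_dir in site_dirs)
```

<p>A test suite could, for instance, assert <tt class="docutils literal"><span class="pre">is_installed_copy("packagename")</span></tt> once at start-up, so that a run which silently picked up the working directory fails immediately.</p>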
<p>Another way to put it, <em>flat is better than nested</em> <a class="footnote-reference" href="#footnote-7" id="footnote-reference-4">[*]</a> - <strong>but not for data</strong>. A file-system is just data after all -
and cohesive, well normalized data structures are desirable.</p>
<p>You'll notice that I don't include the tests in the installed packages. Because:</p>
<ul>
<li><p class="first">Module discovery tools will trip over your test modules. Strange things usually happen in test modules. The <tt class="docutils literal">help</tt>
builtin does module discovery. E.g.:</p>
<pre class="literal-block">
>>> help('modules')
Please wait a moment while I gather a list of all available modules...
__future__ antigravity html select
...
</pre>
</li>
<li><p class="first">Tests usually require additional dependencies to run, so they aren't useful on their own - you can't run them
directly.</p>
</li>
<li><p class="first">Tests are concerned with development, not usage.</p>
</li>
<li><p class="first">It's extremely unlikely that the user of the library will run the tests instead of the library's developer. E.g.: you
don't run the tests for Django while testing your apps - Django is already tested.</p>
</li>
</ul>
<div class="section" id="alternatives">
<h3>Alternatives<a class="headerlink" href="#alternatives" title="Permalink to this headline">
*</a></h3>
<p>You could use <tt class="docutils literal">src</tt>-less layouts. A few examples:</p>
<table border="1" class="docutils">
<colgroup>
<col width="50%" />
<col width="50%" />
</colgroup>
<thead valign="bottom">
<tr><th class="head">Tests in package</th>
<th class="head">Tests outside package</th>
</tr>
</thead>
<tbody valign="top">
<tr><td><pre class="first last literal-block">
├─ packagename
│  ├─ __init__.py
│  ├─ ...
│  └─ tests
│     └─ ...
└─ setup.py
</pre>
</td>
<td><pre class="first last literal-block">
├─ packagename
│  ├─ __init__.py
│  └─ ...
├─ tests
│  └─ ...
└─ setup.py
</pre>
</td>
</tr>
</tbody>
</table>
<p>These two layouts became popular because packaging had many problems a few years ago, so it wasn't feasible to install the
package just to test it. People still recommend <a class="reference external" href="https://github.com/audreyr/cookiecutter-pypackage">them</a> <a class="footnote-reference" href="#footnote-4" id="footnote-reference-5">[4]</a>
<a rel="nofollow" href="http://blog.habnab.it/blog/2013/07/21/python-packages-and-you/">even</a> if that advice is based on <a rel="nofollow" href="http://as.ynchrono.us/2007/12/filesystem-structure-of-python-project_21.html">old</a> and <a rel="nofollow" href="http://guide.python-distribute.org/example.html?highlight=src">outdated</a> assumptions.</p>
<p>Most projects use them incorrectly, as all the test runners except Twisted's <tt class="docutils literal">trial</tt> have incorrect defaults for the
<em>current working directory</em> - you're going to test the wrong code if you don't test the installed code. <tt class="docutils literal">trial</tt> does
the right thing by changing the <em>working directory</em> to something temporary, but most projects don't use <tt class="docutils literal">trial</tt>.</p>
</div>
</div>
<div class="section" id="the-setup-script">
<h2>The setup script<a class="headerlink" href="#the-setup-script" title="Permalink to this headline">
*</a></h2>
<p>Unfortunately with the current packaging tools, there are many <a class="reference external" href="https://blog.ionelmc.ro/2014/06/25/python-packaging-pitfalls/">pitfalls</a>. The <tt class="docutils literal">setup.py</tt> script should be as <a class="reference external" href="https://github.com/ionelmc/python-nameless/blob/test-pure/setup.py">simple as
possible</a>:</p>
<div class="highlight"><pre><span></span><span class="ch">#!/usr/bin/env python</span>
<span class="c1"># -*- encoding: utf-8 -*-</span>
<span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">absolute_import</span>
<span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">print_function</span>
<span class="kn">import</span> <span class="nn">io</span>
<span class="kn">import</span> <span class="nn">re</span>
<span class="kn">from</span> <span class="nn">glob</span> <span class="kn">import</span> <span class="n">glob</span>
<span class="kn">from</span> <span class="nn">os.path</span> <span class="kn">import</span> <span class="n">basename</span>
<span class="kn">from</span> <span class="nn">os.path</span> <span class="kn">import</span> <span class="n">dirname</span>
<span class="kn">from</span> <span class="nn">os.path</span> <span class="kn">import</span> <span class="n">join</span>
<span class="kn">from</span> <span class="nn">os.path</span> <span class="kn">import</span> <span class="n">splitext</span>
<span class="kn">from</span> <span class="nn">setuptools</span> <span class="kn">import</span> <span class="n">find_packages</span>
<span class="kn">from</span> <span class="nn">setuptools</span> <span class="kn">import</span> <span class="n">setup</span>
<span class="k">def</span> <span class="nf">read</span><span class="p">(</span><span class="o">*</span><span class="n">names</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
    <span class="k">with</span> <span class="n">io</span><span class="o">.</span><span class="n">open</span><span class="p">(</span>
        <span class="n">join</span><span class="p">(</span><span class="n">dirname</span><span class="p">(</span><span class="vm">__file__</span><span class="p">),</span> <span class="o">*</span><span class="n">names</span><span class="p">),</span>
        <span class="n">encoding</span><span class="o">=</span><span class="n">kwargs</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'encoding'</span><span class="p">,</span> <span class="s1">'utf8'</span><span class="p">)</span>
    <span class="p">)</span> <span class="k">as</span> <span class="n">fh</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">fh</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>


<span class="n">setup</span><span class="p">(</span>
    <span class="n">name</span><span class="o">=</span><span class="s1">'nameless'</span><span class="p">,</span>
    <span class="n">version</span><span class="o">=</span><span class="s1">'1.753.10'</span><span class="p">,</span>
    <span class="n">license</span><span class="o">=</span><span class="s1">'BSD-2-Clause'</span><span class="p">,</span>
    <span class="n">description</span><span class="o">=</span><span class="s1">'An example package. Generated with cookiecutter-pylibrary.'</span><span class="p">,</span>
    <span class="n">long_description</span><span class="o">=</span><span class="s1">'</span><span class="si">%s</span><span class="se">\n</span><span class="si">%s</span><span class="s1">'</span> <span class="o">%</span> <span class="p">(</span>
        <span class="n">re</span><span class="o">.</span><span class="n">compile</span><span class="p">(</span><span class="s1">'^.. start-badges.*^.. end-badges'</span><span class="p">,</span> <span class="n">re</span><span class="o">.</span><span class="n">M</span> <span class="o">|</span> <span class="n">re</span><span class="o">.</span><span class="n">S</span><span class="p">)</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="s1">''</span><span class="p">,</span> <span class="n">read</span><span class="p">(</span><span class="s1">'README.rst'</span><span class="p">)),</span>
        <span class="n">re</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="s1">':[a-z]+:`~?(.*?)`'</span><span class="p">,</span> <span class="sa">r</span><span class="s1">'``\1``'</span><span class="p">,</span> <span class="n">read</span><span class="p">(</span><span class="s1">'CHANGELOG.rst'</span><span class="p">))</span>
    <span class="p">),</span>
    <span class="n">author</span><span class="o">=</span><span class="s1">'Ionel Cristian Mărieș'</span><span class="p">,</span>
    <span class="n">author_email</span><span class="o">=</span><span class="s1">'contact@ionelmc.ro'</span><span class="p">,</span>
    <span class="n">url</span><span class="o">=</span><span class="s1">'https://github.com/ionelmc/python-nameless'</span><span class="p">,</span>
    <span class="n">packages</span><span class="o">=</span><span class="n">find_packages</span><span class="p">(</span><span class="s1">'src'</span><span class="p">),</span>
    <span class="n">package_dir</span><span class="o">=</span><span class="p">{</span><span class="s1">''</span><span class="p">:</span> <span class="s1">'src'</span><span class="p">},</span>
    <span class="n">py_modules</span><span class="o">=</span><span class="p">[</span><span class="n">splitext</span><span class="p">(</span><span class="n">basename</span><span class="p">(</span><span class="n">path</span><span class="p">))[</span><span class="mi">0</span><span class="p">]</span> <span class="k">for</span> <span class="n">path</span> <span class="ow">in</span> <span class="n">glob</span><span class="p">(</span><span class="s1">'src/*.py'</span><span class="p">)],</span>
    <span class="n">include_package_data</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
    <span class="n">zip_safe</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
    <span class="n">classifiers</span><span class="o">=</span><span class="p">[</span>
        <span class="c1"># complete classifier list: http://pypi.python.org/pypi?%3Aaction=list_classifiers</span>
        <span class="s1">'Development Status :: 5 - Production/Stable'</span><span class="p">,</span>
        <span class="s1">'Intended Audience :: Developers'</span><span class="p">,</span>
        <span class="s1">'License :: OSI Approved :: BSD License'</span><span class="p">,</span>
        <span class="s1">'Operating System :: Unix'</span><span class="p">,</span>
        <span class="s1">'Operating System :: POSIX'</span><span class="p">,</span>
        <span class="s1">'Operating System :: Microsoft :: Windows'</span><span class="p">,</span>
<span class="s1">'Programming Language :: Python'</span><span class="p">,</span>
<span class="s1">'Programming Language :: Python :: 2.7'</span><span class="p">,</span>
<span class="s1">'Programming Language :: Python :: 3'</span><span class="p">,</span>
<span class="s1">'Programming Language :: Python :: 3.5'</span><span class="p">,</span>
<span class="s1">'Programming Language :: Python :: 3.6'</span><span class="p">,</span>
<span class="s1">'Programming Language :: Python :: 3.7'</span><span class="p">,</span>
<span class="s1">'Programming Language :: Python :: 3.8'</span><span class="p">,</span>
<span class="s1">'Programming Language :: Python :: 3.9'</span><span class="p">,</span>
<span class="s1">'Programming Language :: Python :: Implementation :: CPython'</span><span class="p">,</span>
<span class="s1">'Programming Language :: Python :: Implementation :: PyPy'</span><span class="p">,</span>
<span class="c1"># uncomment if you test on these interpreters:</span>
<span class="c1"># 'Programming Language :: Python :: Implementation :: IronPython',</span>
<span class="c1"># 'Programming Language :: Python :: Implementation :: Jython',</span>
<span class="c1"># 'Programming Language :: Python :: Implementation :: Stackless',</span>
<span class="s1">'Topic :: Utilities'</span><span class="p">,</span>
<span class="p">],</span>
<span class="n">project_urls</span><span class="o">=</span><span class="p">{</span>
<span class="s1">'Changelog'</span><span class="p">:</span> <span class="s1">'https://github.com/ionelmc/python-nameless/blob/master/CHANGELOG.rst'</span><span class="p">,</span>
<span class="s1">'Issue Tracker'</span><span class="p">:</span> <span class="s1">'https://github.com/ionelmc/python-nameless/issues'</span><span class="p">,</span>
<span class="p">},</span>
<span class="n">keywords</span><span class="o">=</span><span class="p">[</span>
<span class="c1"># eg: 'keyword1', 'keyword2', 'keyword3',</span>
<span class="p">],</span>
<span class="n">python_requires</span><span class="o">=</span><span class="s1">'>=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*'</span><span class="p">,</span>
<span class="n">install_requires</span><span class="o">=</span><span class="p">[</span>
<span class="c1"># eg: 'aspectlib==1.1.1', 'six>=1.7',</span>
<span class="p">],</span>
<span class="n">extras_require</span><span class="o">=</span><span class="p">{</span>
<span class="c1"># eg:</span>
<span class="c1"># 'rst': ['docutils>=0.11'],</span>
<span class="c1"># ':python_version=="2.6"': ['argparse'],</span>
<span class="p">},</span>
<span class="n">setup_requires</span><span class="o">=</span><span class="p">[</span>
<span class="s1">'pytest-runner'</span><span class="p">,</span>
<span class="p">],</span>
<span class="n">entry_points</span><span class="o">=</span><span class="p">{</span>
<span class="s1">'console_scripts'</span><span class="p">:</span> <span class="p">[</span>
<span class="s1">'nameless = nameless.cli:main'</span><span class="p">,</span>
<span class="p">]</span>
<span class="p">},</span>
<span class="p">)</span>
</pre></div>
<p>What's special about this:</p>
<ul class="simple">
<li>No <tt class="docutils literal">exec</tt> or <tt class="docutils literal">import</tt> trickery.</li>
<li>Includes everything from <tt class="docutils literal">src</tt>: packages <em>or</em> root-level modules.</li>
<li>Explicit encodings.</li>
</ul>
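<p>To make the <tt class="docutils literal">long_description</tt> part less magic, here is a minimal sketch of what the two regular expressions do. The sample README and changelog strings are made up for illustration; only the two patterns come from the <tt class="docutils literal">setup.py</tt> above.</p>

```python
import re

# Made-up stand-ins for README.rst and CHANGELOG.rst.
readme = """\
Overview
========

.. start-badges

|travis| |codecov|

.. end-badges

An example package.
"""
changelog = "* Fixed :func:`nameless.cli.main` and :class:`~nameless.core.Thing`.\n"

# Drop everything between the badge markers (PyPI can't render most of it anyway).
badges = re.compile('^.. start-badges.*^.. end-badges', re.M | re.S)
stripped = badges.sub('', readme)

# Turn Sphinx roles like :func:`x` into plain literals, since plain
# reStructuredText (without Sphinx) doesn't know those roles.
plain = re.sub(':[a-z]+:`~?(.*?)`', r'``\1``', changelog)

print(stripped)
print(plain)
```

<p>The point of both substitutions is that the long description should still be valid plain reStructuredText once Sphinx-only markup and badge substitutions are out of the picture.</p>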
</div>
<div class="section" id="running-the-tests">
<h2>Running the tests</h2>
<p>Again, it seems people <a class="reference external" href="http://www.jeffknupp.com/blog/2013/08/16/open-sourcing-a-python-project-the-right-way/">fancy the idea</a> of running <tt class="docutils literal">python setup.py
test</tt> to run the package's tests. I think that's not worth doing - <tt class="docutils literal">setup.py test</tt> is a failed experiment to
replicate some of <a class="reference external" href="http://www.cpan.org/">CPAN</a>'s <a class="reference external" href="http://wiki.cpantesters.org/wiki/WhatIsCPANTesters">test system</a>.
Python doesn't have a common test result protocol so it serves no purpose to have a common test command <a class="footnote-reference" href="#footnote-1" id="footnote-reference-6">[1]</a>. At least
not for now - we'd need someone to build specifications and services that make this worthwhile, and champion them. I
think it's important in general to recognize failure where it exists and go back to the drawing board when that's
necessary - there are absolutely no services or tools that use the <tt class="docutils literal">setup.py test</tt> command in a way that brings added
value. Something is definitely wrong here.</p>
<p>I believe it's too late now for PyPI to do anything about it: <a class="reference external" href="https://travis-ci.org/">Travis</a> is already a solid,
reliable, extremely flexible and free alternative. It integrates very well with <a class="reference external" href="http://github.com/">GitHub</a> - builds
will be run automatically for each Pull Request.</p>
<p>To test locally, <a class="reference external" href="https://testrun.org/tox/latest/">tox</a> is a very good way to run all the possible testing configurations (each configuration will be a
<a class="reference external" href="https://testrun.org/tox/latest/">tox</a> environment). I like to organize the tests into a matrix with these additional environments:</p>
<ul class="simple">
<li><tt class="docutils literal">check</tt> - check package metadata (e.g.: whether the reStructuredText in your long description is valid)</li>
<li><tt class="docutils literal">clean</tt> - clean coverage data</li>
<li><tt class="docutils literal">report</tt> - make coverage report for all the accumulated data</li>
<li><tt class="docutils literal">docs</tt> - build Sphinx docs</li>
</ul>
<p>I also like to have environments with <em>and</em> without coverage measurement and run them all the time. Race conditions are
usually timing-sensitive and you're unlikely to catch them if you only ever run with the overhead of coverage measurement.</p>
<div class="section" id="the-test-matrix">
<h3>The test matrix</h3>
<p>Depending on your dependencies you'll usually end up with a huge number of combinations of Python versions, dependency
versions and different settings. Generally people just hard-code everything in <tt class="docutils literal">tox.ini</tt> or only in <tt class="docutils literal">.travis.yml</tt>.
They end up with incomplete local tests, or test configurations that run serially in Travis. I've tried that, didn't
like it. I've tried duplicating the environments in both <tt class="docutils literal">tox.ini</tt> and <tt class="docutils literal">.travis.yml</tt>. Still didn't like it.</p>
<div class="admonition note">
<p class="first admonition-title">Note</p>
<p>This <tt class="docutils literal">bootstrap.py</tt> technique is a bit outdated now. It still works fine but for simple matrices you can use a
<a class="reference external" href="https://tox.readthedocs.io/en/latest/config.html#generative-envlist">tox generative envlist</a> (it was implemented after
I wrote this blog post, unfortunately).</p>
<p class="last">See <a class="reference external" href="https://github.com/ionelmc/python-nameless">python-nameless</a> for an example using that.</p>
</div>
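<p>For a feel of what a generative envlist amounts to, here is a rough sketch in plain Python. The <tt class="docutils literal">expand_envlist</tt> helper is hypothetical and heavily simplified (one spec, no conditional factors, no whitespace handling); it is not how tox implements it, it only shows the idea of multiplying out the brace groups.</p>

```python
import re
from itertools import product


def expand_envlist(spec):
    """Expand a spec like 'py{27,39}-{cover,nocov}' into concrete env names."""
    # re.split with a capturing group keeps the brace contents at odd indexes.
    parts = re.split(r"\{([^}]*)\}", spec)
    choices = [part.split(",") if index % 2 else [part] for index, part in enumerate(parts)]
    return ["".join(combination) for combination in product(*choices)]


names = expand_envlist("py{27,39}-{cover,nocov}")
for name in names:
    print(name)
```

<p>Four environments come out of a single line, which is why the generative envlist covers the simple cases that used to need a generator script.</p>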
<p>As there were no readily usable alternatives to generate the configuration, I've implemented a generator script that
uses templates to generate <tt class="docutils literal">tox.ini</tt> and <tt class="docutils literal">.travis.yml</tt>. This is way better: it's <a class="reference external" href="http://en.wikipedia.org/wiki/Don%27t_repeat_yourself">DRY</a>, you can easily skip running tests on specific configurations
(e.g.: skip Django 1.4 on Python 3) and there's less work to change things.</p>
<p>The essentials (<a class="reference external" href="https://github.com/ionelmc/python-nameless/tree/test-matrix-separate-cext">full code</a>):</p>
<div class="section" id="setup-cfg">
<h4><tt class="docutils literal">setup.cfg</tt></h4>
<p>The generator script uses a configuration file (<tt class="docutils literal">setup.cfg</tt> for convenience):</p>
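<div class="highlight"><pre><span></span><span class="k">[tool:pytest]</span><span class="w"></span>
<span class="na">norecursedirs</span><span class="w"> </span><span class="o">=</span><span class="w"></span>
<span class="w"> </span><span class="na">dist</span><span class="w"></span>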
<span class="w"> </span><span class="na">build</span><span class="w"></span>
<span class="w"> </span><span class="na">migrations</span><span class="w"></span>
<span class="na">python_files</span><span class="w"> </span><span class="o">=</span><span class="w"></span>
<span class="w"> </span><span class="na">test_*.py</span><span class="w"></span>
<span class="w"> </span><span class="na">*_test.py</span><span class="w"></span>
<span class="w"> </span><span class="na">tests.py</span><span class="w"></span>
<span class="na">addopts</span><span class="w"> </span><span class="o">=</span><span class="w"></span>
<span class="w"> </span><span class="na">-ra</span><span class="w"></span>
<span class="w"> </span><span class="na">--strict</span><span class="w"></span>
<span class="w"> </span><span class="na">--ignore</span><span class="o">=</span><span class="s">docs/conf.py</span><span class="w"></span>
<span class="w"> </span><span class="na">--ignore</span><span class="o">=</span><span class="s">setup.py</span><span class="w"></span>
<span class="w"> </span><span class="na">--ignore</span><span class="o">=</span><span class="s">ci</span><span class="w"></span>
<span class="w"> </span><span class="na">--ignore</span><span class="o">=</span><span class="s">.eggs</span><span class="w"></span>
<span class="w"> </span><span class="na">--doctest-modules</span><span class="w"></span>
<span class="w"> </span><span class="na">--doctest-glob</span><span class="o">=</span><span class="s">\*.rst</span><span class="w"></span>
<span class="w"> </span><span class="na">--tb</span><span class="o">=</span><span class="s">short</span><span class="w"></span>
<span class="na">testpaths</span><span class="w"> </span><span class="o">=</span><span class="w"></span>
<span class="w"> </span><span class="na">tests</span><span class="w"></span>
<span class="k">[tool:isort]</span><span class="w"></span>
<span class="na">force_single_line</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">True</span><span class="w"></span>
<span class="na">line_length</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">120</span><span class="w"></span>
<span class="na">known_first_party</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">nameless</span><span class="w"></span>
<span class="na">default_section</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">THIRDPARTY</span><span class="w"></span>
<span class="na">forced_separate</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">test_nameless</span><span class="w"></span>
<span class="na">skip</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">.tox,.eggs,ci/templates,build,dist</span><span class="w"></span>
<span class="k">[matrix]</span><span class="w"></span>
<span class="c1"># This is the configuration for the `ci/bootstrap.py` script.</span><span class="w"></span>
<span class="c1"># It generates `.travis.yml`, `tox.ini` and `.appveyor.yml`.</span><span class="w"></span>
<span class="c1">#</span><span class="w"></span>
<span class="c1"># Syntax: [alias:] value [!variable[glob]] [&variable[glob]]</span><span class="w"></span>
<span class="c1">#</span><span class="w"></span>
<span class="c1"># alias:</span><span class="w"></span>
<span class="c1"># - is used to generate the tox environment</span><span class="w"></span>
<span class="c1"># - it's optional</span><span class="w"></span>
<span class="c1"># - if not present the alias will be computed from the `value`</span><span class="w"></span>
<span class="c1"># value:</span><span class="w"></span>
<span class="c1"># - a value of "-" means empty</span><span class="w"></span>
<span class="c1"># !variable[glob]:</span><span class="w"></span>
<span class="c1"># - exclude the combination of the current `value` with</span><span class="w"></span>
<span class="c1"># any value matching the `glob` in `variable`</span><span class="w"></span>
<span class="c1"># - can use as many as you want</span><span class="w"></span>
<span class="c1"># &variable[glob]:</span><span class="w"></span>
<span class="c1"># - only include the combination of the current `value`</span><span class="w"></span>
<span class="c1"># when there's a value matching `glob` in `variable`</span><span class="w"></span>
<span class="c1"># - can use as many as you want</span><span class="w"></span>
<span class="na">python_versions</span><span class="w"> </span><span class="o">=</span><span class="w"></span>
<span class="w"> </span><span class="na">py27</span><span class="w"></span>
<span class="w"> </span><span class="na">py35</span><span class="w"></span>
<span class="w"> </span><span class="na">py36</span><span class="w"></span>
<span class="w"> </span><span class="na">py37</span><span class="w"></span>
<span class="w"> </span><span class="na">py38</span><span class="w"></span>
<span class="w"> </span><span class="na">py39</span><span class="w"></span>
<span class="w"> </span><span class="na">pypy</span><span class="w"></span>
<span class="w"> </span><span class="na">pypy3</span><span class="w"></span>
<span class="na">dependencies</span><span class="w"> </span><span class="o">=</span><span class="w"></span>
<span class="c1"># 1.4: Django==1.4.16 !python_versions[py3*]</span><span class="w"></span>
<span class="c1"># 1.5: Django==1.5.11</span><span class="w"></span>
<span class="c1"># 1.6: Django==1.6.8</span><span class="w"></span>
<span class="c1"># 1.7: Django==1.7.1 !python_versions[py26]</span><span class="w"></span>
<span class="c1"># Deps commented above are provided as examples. That's what you would use in a Django project.</span><span class="w"></span>
<span class="na">coverage_flags</span><span class="w"> </span><span class="o">=</span><span class="w"></span>
<span class="w"> </span><span class="na">cover: true</span><span class="w"></span>
<span class="w"> </span><span class="na">nocov: false</span><span class="w"></span>
<span class="na">environment_variables</span><span class="w"> </span><span class="o">=</span><span class="w"></span>
<span class="w"> </span><span class="na">-</span><span class="w"></span>
</pre></div>
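<p>To illustrate the exclusion syntax, here is a minimal sketch of how a rule like <tt class="docutils literal">!python_versions[py3*]</tt> prunes the combinations. This is not the <tt class="docutils literal">matrix</tt> library's code or API: the data structures, the <tt class="docutils literal">expand</tt> function and the way environment names are joined are all assumptions made up for this example.</p>

```python
from fnmatch import fnmatchcase
from itertools import product

# Hypothetical, hand-written stand-in for the [matrix] section: each dependency
# entry is alias -> (value, {variable: glob to exclude}).
python_versions = ["py27", "py39", "pypy3"]
dependencies = {
    "1.4": ("Django==1.4.16", {"python_versions": "py3*"}),
    "1.7": ("Django==1.7.1", {}),
}
coverage_flags = {"cover": "true", "nocov": "false"}


def expand(python_versions, dependencies, coverage_flags):
    """Return {env_name: settings} for every combination that is not excluded."""
    environments = {}
    for python, dep_alias, cov_alias in product(python_versions, dependencies, coverage_flags):
        value, excludes = dependencies[dep_alias]
        glob = excludes.get("python_versions")
        if glob is not None and fnmatchcase(python, glob):
            continue  # e.g. skip Django 1.4 on Python 3
        name = "-".join([python, dep_alias, cov_alias])
        environments[name] = {
            "dependencies": value,
            "coverage_flags": coverage_flags[cov_alias],
        }
    return environments


environments = expand(python_versions, dependencies, coverage_flags)
for name in sorted(environments):
    print(name)
```

<p>Note that the glob is matched against the alias, so <tt class="docutils literal">py3*</tt> excludes <tt class="docutils literal">py39</tt> but not <tt class="docutils literal">pypy3</tt>.</p>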
</div>
<div class="section" id="ci-bootstrap-py">
<h4><tt class="docutils literal">ci/bootstrap.py</tt></h4>
<p>This is the generator script. You run this whenever you want to regenerate the configuration:</p>
<div class="highlight"><pre><span></span><span class="ch">#!/usr/bin/env python</span>
<span class="c1"># -*- coding: utf-8 -*-</span>
<span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">absolute_import</span>
<span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">print_function</span>
<span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">unicode_literals</span>
<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">subprocess</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">from</span> <span class="nn">os.path</span> <span class="kn">import</span> <span class="n">abspath</span>
<span class="kn">from</span> <span class="nn">os.path</span> <span class="kn">import</span> <span class="n">dirname</span>
<span class="kn">from</span> <span class="nn">os.path</span> <span class="kn">import</span> <span class="n">exists</span>
<span class="kn">from</span> <span class="nn">os.path</span> <span class="kn">import</span> <span class="n">join</span>
<span class="n">base_path</span> <span class="o">=</span> <span class="n">dirname</span><span class="p">(</span><span class="n">dirname</span><span class="p">(</span><span class="n">abspath</span><span class="p">(</span><span class="vm">__file__</span><span class="p">)))</span>

<span class="k">def</span> <span class="nf">check_call</span><span class="p">(</span><span class="n">args</span><span class="p">):</span>
    <span class="nb">print</span><span class="p">(</span><span class="s2">"+"</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">)</span>
    <span class="n">subprocess</span><span class="o">.</span><span class="n">check_call</span><span class="p">(</span><span class="n">args</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">exec_in_env</span><span class="p">():</span>
    <span class="n">env_path</span> <span class="o">=</span> <span class="n">join</span><span class="p">(</span><span class="n">base_path</span><span class="p">,</span> <span class="s2">".tox"</span><span class="p">,</span> <span class="s2">"bootstrap"</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">sys</span><span class="o">.</span><span class="n">platform</span> <span class="o">==</span> <span class="s2">"win32"</span><span class="p">:</span>
        <span class="n">bin_path</span> <span class="o">=</span> <span class="n">join</span><span class="p">(</span><span class="n">env_path</span><span class="p">,</span> <span class="s2">"Scripts"</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">bin_path</span> <span class="o">=</span> <span class="n">join</span><span class="p">(</span><span class="n">env_path</span><span class="p">,</span> <span class="s2">"bin"</span><span class="p">)</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">exists</span><span class="p">(</span><span class="n">env_path</span><span class="p">):</span>
        <span class="nb">print</span><span class="p">(</span><span class="s2">"Making bootstrap env in: </span><span class="si">{0}</span><span class="s2"> ..."</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">env_path</span><span class="p">))</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="n">check_call</span><span class="p">([</span><span class="n">sys</span><span class="o">.</span><span class="n">executable</span><span class="p">,</span> <span class="s2">"-m"</span><span class="p">,</span> <span class="s2">"venv"</span><span class="p">,</span> <span class="n">env_path</span><span class="p">])</span>
        <span class="k">except</span> <span class="n">subprocess</span><span class="o">.</span><span class="n">CalledProcessError</span><span class="p">:</span>
            <span class="k">try</span><span class="p">:</span>
                <span class="n">check_call</span><span class="p">([</span><span class="n">sys</span><span class="o">.</span><span class="n">executable</span><span class="p">,</span> <span class="s2">"-m"</span><span class="p">,</span> <span class="s2">"virtualenv"</span><span class="p">,</span> <span class="n">env_path</span><span class="p">])</span>
            <span class="k">except</span> <span class="n">subprocess</span><span class="o">.</span><span class="n">CalledProcessError</span><span class="p">:</span>
                <span class="n">check_call</span><span class="p">([</span><span class="s2">"virtualenv"</span><span class="p">,</span> <span class="n">env_path</span><span class="p">])</span>
        <span class="nb">print</span><span class="p">(</span><span class="s2">"Installing `jinja2`, `tox` and `matrix` into bootstrap environment..."</span><span class="p">)</span>
        <span class="n">check_call</span><span class="p">([</span><span class="n">join</span><span class="p">(</span><span class="n">bin_path</span><span class="p">,</span> <span class="s2">"pip"</span><span class="p">),</span> <span class="s2">"install"</span><span class="p">,</span> <span class="s2">"jinja2"</span><span class="p">,</span> <span class="s2">"tox"</span><span class="p">,</span> <span class="s2">"matrix"</span><span class="p">])</span>
    <span class="n">python_executable</span> <span class="o">=</span> <span class="n">join</span><span class="p">(</span><span class="n">bin_path</span><span class="p">,</span> <span class="s2">"python"</span><span class="p">)</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">exists</span><span class="p">(</span><span class="n">python_executable</span><span class="p">):</span>
        <span class="n">python_executable</span> <span class="o">+=</span> <span class="s1">'.exe'</span>
    <span class="nb">print</span><span class="p">(</span><span class="s2">"Re-executing with: </span><span class="si">{0}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">python_executable</span><span class="p">))</span>
    <span class="nb">print</span><span class="p">(</span><span class="s2">"+ exec"</span><span class="p">,</span> <span class="n">python_executable</span><span class="p">,</span> <span class="vm">__file__</span><span class="p">,</span> <span class="s2">"--no-env"</span><span class="p">)</span>
    <span class="n">os</span><span class="o">.</span><span class="n">execv</span><span class="p">(</span><span class="n">python_executable</span><span class="p">,</span> <span class="p">[</span><span class="n">python_executable</span><span class="p">,</span> <span class="vm">__file__</span><span class="p">,</span> <span class="s2">"--no-env"</span><span class="p">])</span>

<span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
    <span class="kn">import</span> <span class="nn">jinja2</span>
    <span class="kn">import</span> <span class="nn">matrix</span>
    <span class="nb">print</span><span class="p">(</span><span class="s2">"Project path: </span><span class="si">{0}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">base_path</span><span class="p">))</span>
    <span class="n">jinja</span> <span class="o">=</span> <span class="n">jinja2</span><span class="o">.</span><span class="n">Environment</span><span class="p">(</span>
        <span class="n">loader</span><span class="o">=</span><span class="n">jinja2</span><span class="o">.</span><span class="n">FileSystemLoader</span><span class="p">(</span><span class="n">join</span><span class="p">(</span><span class="n">base_path</span><span class="p">,</span> <span class="s2">"ci"</span><span class="p">,</span> <span class="s2">"templates"</span><span class="p">)),</span>
        <span class="n">trim_blocks</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
        <span class="n">lstrip_blocks</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
        <span class="n">keep_trailing_newline</span><span class="o">=</span><span class="kc">True</span>
    <span class="p">)</span>
    <span class="n">tox_environments</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="k">for</span> <span class="p">(</span><span class="n">alias</span><span class="p">,</span> <span class="n">conf</span><span class="p">)</span> <span class="ow">in</span> <span class="n">matrix</span><span class="o">.</span><span class="n">from_file</span><span class="p">(</span><span class="n">join</span><span class="p">(</span><span class="n">base_path</span><span class="p">,</span> <span class="s2">"setup.cfg"</span><span class="p">))</span><span class="o">.</span><span class="n">items</span><span class="p">():</span>
        <span class="n">deps</span> <span class="o">=</span> <span class="n">conf</span><span class="p">[</span><span class="s2">"dependencies"</span><span class="p">]</span>
        <span class="n">tox_environments</span><span class="p">[</span><span class="n">alias</span><span class="p">]</span> <span class="o">=</span> <span class="p">{</span>
            <span class="s2">"deps"</span><span class="p">:</span> <span class="n">deps</span><span class="o">.</span><span class="n">split</span><span class="p">(),</span>
        <span class="p">}</span>
        <span class="k">if</span> <span class="s2">"coverage_flags"</span> <span class="ow">in</span> <span class="n">conf</span><span class="p">:</span>
            <span class="n">cover</span> <span class="o">=</span> <span class="p">{</span><span class="s2">"false"</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span> <span class="s2">"true"</span><span class="p">:</span> <span class="kc">True</span><span class="p">}[</span><span class="n">conf</span><span class="p">[</span><span class="s2">"coverage_flags"</span><span class="p">]</span><span class="o">.</span><span class="n">lower</span><span class="p">()]</span>
            <span class="n">tox_environments</span><span class="p">[</span><span class="n">alias</span><span class="p">]</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="n">cover</span><span class="o">=</span><span class="n">cover</span><span class="p">)</span>
        <span class="k">if</span> <span class="s2">"environment_variables"</span> <span class="ow">in</span> <span class="n">conf</span><span class="p">:</span>
            <span class="n">env_vars</span> <span class="o">=</span> <span class="n">conf</span><span class="p">[</span><span class="s2">"environment_variables"</span><span class="p">]</span>
            <span class="n">tox_environments</span><span class="p">[</span><span class="n">alias</span><span class="p">]</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="n">env_vars</span><span class="o">=</span><span class="n">env_vars</span><span class="o">.</span><span class="n">split</span><span class="p">())</span>
    <span class="c1"># List the templates relative to the project, not the current directory.</span>
    <span class="k">for</span> <span class="n">name</span> <span class="ow">in</span> <span class="n">os</span><span class="o">.</span><span class="n">listdir</span><span class="p">(</span><span class="n">join</span><span class="p">(</span><span class="n">base_path</span><span class="p">,</span> <span class="s2">"ci"</span><span class="p">,</span> <span class="s2">"templates"</span><span class="p">)):</span>
        <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">join</span><span class="p">(</span><span class="n">base_path</span><span class="p">,</span> <span class="n">name</span><span class="p">),</span> <span class="s2">"w"</span><span class="p">)</span> <span class="k">as</span> <span class="n">fh</span><span class="p">:</span>
            <span class="n">fh</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">jinja</span><span class="o">.</span><span class="n">get_template</span><span class="p">(</span><span class="n">name</span><span class="p">)</span><span class="o">.</span><span class="n">render</span><span class="p">(</span><span class="n">tox_environments</span><span class="o">=</span><span class="n">tox_environments</span><span class="p">))</span>
        <span class="nb">print</span><span class="p">(</span><span class="s2">"Wrote </span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">name</span><span class="p">))</span>
    <span class="nb">print</span><span class="p">(</span><span class="s2">"DONE."</span><span class="p">)</span>

<span class="k">if</span> <span class="vm">__name__</span> <span class="o">==</span> <span class="s2">"__main__"</span><span class="p">:</span>
    <span class="n">args</span> <span class="o">=</span> <span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">:]</span>
    <span class="k">if</span> <span class="n">args</span> <span class="o">==</span> <span class="p">[</span><span class="s2">"--no-env"</span><span class="p">]:</span>
        <span class="n">main</span><span class="p">()</span>
    <span class="k">elif</span> <span class="ow">not</span> <span class="n">args</span><span class="p">:</span>
        <span class="n">exec_in_env</span><span class="p">()</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="nb">print</span><span class="p">(</span><span class="s2">"Unexpected arguments </span><span class="si">{0}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">args</span><span class="p">),</span> <span class="n">file</span><span class="o">=</span><span class="n">sys</span><span class="o">.</span><span class="n">stderr</span><span class="p">)</span>
        <span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
</pre></div>
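<p>The interesting part of <tt class="docutils literal">main()</tt> is the conversion from a matrix entry to what the templates see. Here is that step pulled out as a standalone sketch; the <tt class="docutils literal">to_tox_environment</tt> name and the sample entry are made up, and I'm assuming each entry arrives as a dict of strings where a <tt class="docutils literal">-</tt> value has already become an empty string.</p>

```python
def to_tox_environment(conf):
    """Convert one matrix entry (a dict of strings) into the dict the templates receive."""
    environment = {"deps": conf["dependencies"].split()}
    if "coverage_flags" in conf:
        # Only the literal strings "true" and "false" are accepted (case-insensitive).
        environment["cover"] = {"false": False, "true": True}[conf["coverage_flags"].lower()]
    if "environment_variables" in conf:
        environment["env_vars"] = conf["environment_variables"].split()
    return environment


# Made-up entry for illustration.
sample = {
    "dependencies": "Django==1.7.1 six>=1.7",
    "coverage_flags": "true",
    "environment_variables": "",
}
result = to_tox_environment(sample)
print(result)
```

<p>Anything other than <tt class="docutils literal">true</tt> or <tt class="docutils literal">false</tt> in <tt class="docutils literal">coverage_flags</tt> fails loudly with a <tt class="docutils literal">KeyError</tt>, which is what you want from a generator script.</p>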
</div>
<div class="section" id="ci-templates-travis-yml">
<h4><tt class="docutils literal"><span class="pre">ci/templates/.travis.yml</span></tt></h4>
<p>This has some goodies in it: the very useful <a class="reference external" href="http://blog.andrew.net.au/2007/08/15/">libSegFault.so trick</a>.</p>
<p>It basically just runs tox.</p>
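<p>One detail of the template that follows: for regular CPython environments the Travis <tt class="docutils literal">python</tt> version is derived from the environment name with a format string. A small sketch of that expression in plain Python (the <tt class="docutils literal">travis_python</tt> wrapper and the sample names are only for illustration):</p>

```python
def travis_python(env):
    """Mimic the template's else branch: characters 2 and 3 of the env name form the version."""
    return "{0[2]}.{0[3]}".format(env)


version = travis_python("py27-cover")
print(version)
```

<p>This relies on single-digit minor versions: a name like <tt class="docutils literal">py310-cover</tt> would come out as <tt class="docutils literal">3.1</tt>, so the expression would need changing for newer Pythons.</p>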
<div class="highlight"><pre><span></span><span class="x">language: python</span>
<span class="x">dist: xenial</span>
<span class="x">virt: lxd</span>
<span class="x">cache: false</span>
<span class="x">env:</span>
<span class="x">  global:</span>
<span class="x">    - LD_PRELOAD=/lib/x86_64-linux-gnu/libSegFault.so</span>
<span class="x">    - SEGFAULT_SIGNALS=all</span>
<span class="x">    - LANG=en_US.UTF-8</span>
<span class="x">matrix:</span>
<span class="x">  include:</span>
<span class="x">    - python: '3.6'</span>
<span class="x">      env:</span>
<span class="x">        - TOXENV=check</span>
<span class="cp">{%</span>- <span class="k">for</span> <span class="nv">env</span><span class="o">,</span> <span class="nv">config</span> <span class="k">in</span> <span class="nv">tox_environments</span><span class="o">|</span><span class="nf">dictsort</span> <span class="cp">%}{{</span> <span class="s1">''</span> <span class="cp">}}</span><span class="x"></span>
<span class="x">    - env:</span>
<span class="x">        - TOXENV=</span><span class="cp">{{</span> <span class="nv">env</span> <span class="cp">}}{%</span> <span class="k">if</span> <span class="nv">config.cover</span> <span class="cp">%}</span><span class="x">,codecov,extension-coveralls,coveralls</span><span class="cp">{%</span> <span class="k">endif</span> <span class="cp">%}</span><span class="x"></span>
<span class="cp">{%</span>- <span class="k">if</span> <span class="nv">env.startswith</span><span class="o">(</span><span class="s1">'pypy3'</span><span class="o">)</span> <span class="cp">%}{{</span> <span class="s1">''</span> <span class="cp">}}</span><span class="x"></span>
<span class="x">        - TOXPYTHON=pypy3</span>
<span class="x">      python: 'pypy3'</span>
<span class="cp">{%</span>- <span class="k">elif</span> <span class="nv">env.startswith</span><span class="o">(</span><span class="s1">'pypy'</span><span class="o">)</span> <span class="cp">%}{{</span> <span class="s1">''</span> <span class="cp">}}</span><span class="x"></span>
<span class="x">      python: 'pypy'</span>
<span class="cp">{%</span>- <span class="k">else</span> <span class="cp">%}{{</span> <span class="s1">''</span> <span class="cp">}}</span><span class="x"></span>
<span class="x">      python: '</span><span class="cp">{{</span> <span class="s1">'{0[2]}.{0[3]}'</span><span class="nv">.format</span><span class="o">(</span><span class="nv">env</span><span class="o">)</span> <span class="cp">}}</span><span class="x">'</span>
<span class="cp">{%</span>- <span class="k">endif</span> <span class="cp">%}{{</span> <span class="s1">''</span> <span class="cp">}}</span><span class="x"></span>
<span class="cp">{%</span>- <span class="k">endfor</span> <span class="cp">%}{{</span> <span class="s1">''</span> <span class="cp">}}</span><span class="x"></span>
<span class="x">before_install:</span>
<span class="x">  - python --version</span>
<span class="x">  - uname -a</span>
<span class="x">  - lsb_release -a || true</span>
<span class="x">install:</span>
<span class="x">  - python -mpip install --progress-bar=off tox -rci/requirements.txt</span>
<span class="x">  - virtualenv --version</span>
<span class="x">  - easy_install --version</span>
<span class="x">  - pip --version</span>
<span class="x">  - tox --version</span>
<span class="x">script:</span>
<span class="x">  - tox -v</span>
<span class="x">after_failure:</span>
<span class="x">  - cat .tox/log/*</span>
<span class="x">  - cat .tox/*/log/*</span>
<span class="x">notifications:</span>
<span class="x">  email:</span>
<span class="x">    on_success: never</span>
<span class="x">    on_failure: always</span>
</pre></div>
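<p>To make the <tt class="docutils literal">python:</tt> lines in the template above less cryptic, here is a minimal Python sketch of the same <tt class="docutils literal">if</tt>/<tt class="docutils literal">elif</tt>/<tt class="docutils literal">else</tt> chain. The function name <tt class="docutils literal">travis_python</tt> and the sample environment names are made up for illustration; they are not part of the bootstrap script.</p>

```python
def travis_python(env):
    """Return the value the template would put under `python:` for a tox env name."""
    if env.startswith("pypy3"):
        return "pypy3"
    elif env.startswith("pypy"):
        return "pypy"
    else:
        # "{0[2]}" picks the third character of the name, "{0[3]}" the fourth,
        # so "py36-cover" becomes "3.6".
        return "{0[2]}.{0[3]}".format(env)


# Hypothetical environment names, only for demonstration.
for name in ("py27-nocov", "py36-cover", "pypy-cover", "pypy3-nocov"):
    print(name, travis_python(name))
```

<p>The format string only looks at the third and fourth characters of the name, so this naming scheme assumes single-digit minor versions (a name starting with <tt class="docutils literal">py310</tt> would come out as <tt class="docutils literal">3.1</tt>).</p>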
</div>
<div class="section" id="ci-templates-tox-ini">
<h4><tt class="docutils literal">ci/templates/tox.ini</tt><a class="headerlink" href="#ci-templates-tox-ini" title="Permalink to this headline">
*</a></h4>
<div class="highlight"><pre><span></span><span class="x">[tox]</span>
<span class="x">envlist =</span>
<span class="x"> clean,</span>
<span class="x"> check,</span>
<span class="cp">{%</span> <span class="k">for</span> <span class="nv">env</span> <span class="k">in</span> <span class="nv">tox_environments</span><span class="o">|</span><span class="nf">sort</span> <span class="cp">%}</span><span class="x"></span>
<span class="x"> </span><span class="cp">{{</span> <span class="nv">env</span> <span class="cp">}}</span><span class="x">,</span>
<span class="cp">{%</span> <span class="k">endfor</span> <span class="cp">%}</span><span class="x"></span>
<span class="x"> report</span>
<span class="x">[testenv]</span>
<span class="x">basepython =</span>
<span class="x"> {bootstrap,clean,check,report,codecov,coveralls,extension-coveralls}: {env:TOXPYTHON:python3}</span>
<span class="x">setenv =</span>
<span class="x"> PYTHONPATH={toxinidir}/tests</span>
<span class="x"> PYTHONUNBUFFERED=yes</span>
<span class="x">passenv =</span>
<span class="x"> *</span>
<span class="x">deps =</span>
<span class="x"> pytest</span>
<span class="x"> pytest-travis-fold</span>
<span class="x">commands =</span>
<span class="x"> python setup.py clean --all build_ext --force --inplace</span>
<span class="x"> {posargs:pytest -vv --ignore=src}</span>
<span class="x">[testenv:bootstrap]</span>
<span class="x">deps =</span>
<span class="x"> jinja2</span>
<span class="x"> matrix</span>
<span class="x">skip_install = true</span>
<span class="x">commands =</span>
<span class="x"> python ci/bootstrap.py --no-env</span>
<span class="x">[testenv:check]</span>
<span class="x">deps =</span>
<span class="x"> docutils</span>
<span class="x"> check-manifest</span>
<span class="x"> flake8</span>
<span class="x"> readme-renderer</span>
<span class="x"> pygments</span>
<span class="x"> isort</span>
<span class="x">skip_install = true</span>
<span class="x">commands =</span>
<span class="x"> python setup.py check --strict --metadata --restructuredtext</span>
<span class="x"> check-manifest {toxinidir}</span>
<span class="x"> flake8</span>
<span class="x"> isort --verbose --check-only --diff --filter-files .</span>
<span class="x">[testenv:coveralls]</span>
<span class="x">deps =</span>
<span class="x"> coveralls</span>
<span class="x">skip_install = true</span>
<span class="x">commands =</span>
<span class="x"> coveralls --merge=extension-coveralls.json []</span>
<span class="x">[testenv:extension-coveralls]</span>
<span class="x">deps =</span>
<span class="x"> cpp-coveralls</span>
<span class="x">skip_install = true</span>
<span class="x">commands =</span>
<span class="x"> coveralls --build-root=. --include=src --dump=extension-coveralls.json []</span>
<span class="x">[testenv:codecov]</span>
<span class="x">deps =</span>
<span class="x"> codecov</span>
<span class="x">skip_install = true</span>
<span class="x">commands =</span>
<span class="x"> codecov --gcov-root=. []</span>
<span class="x">[testenv:report]</span>
<span class="x">deps = coverage</span>
<span class="x">skip_install = true</span>
<span class="x">commands =</span>
<span class="x"> coverage report</span>
<span class="x"> coverage html</span>
<span class="x">[testenv:clean]</span>
<span class="x">commands = coverage erase</span>
<span class="x">skip_install = true</span>
<span class="x">deps = coverage</span>
<span class="cp">{%</span> <span class="k">for</span> <span class="nv">env</span><span class="o">,</span> <span class="nv">config</span> <span class="k">in</span> <span class="nv">tox_environments</span><span class="o">|</span><span class="nf">dictsort</span> <span class="cp">%}</span><span class="x"></span>
<span class="x">[testenv:</span><span class="cp">{{</span> <span class="nv">env</span> <span class="cp">}}</span><span class="x">]</span>
<span class="x">basepython = {env:TOXPYTHON:</span><span class="cp">{{</span> <span class="nv">env.split</span><span class="o">(</span><span class="s2">"-"</span><span class="o">)[</span><span class="m">0</span><span class="o">]</span> <span class="k">if</span> <span class="nv">env.startswith</span><span class="o">(</span><span class="s2">"pypy"</span><span class="o">)</span> <span class="k">else</span> <span class="s2">"python{0[2]}.{0[3]}"</span><span class="nv">.format</span><span class="o">(</span><span class="nv">env</span><span class="o">)</span> <span class="cp">}}</span><span class="x">}</span>
<span class="cp">{%</span> <span class="k">if</span> <span class="nv">config.cover</span> <span class="k">or</span> <span class="nv">config.env_vars</span> <span class="cp">%}</span><span class="x"></span>
<span class="x">setenv =</span>
<span class="x"> {[testenv]setenv}</span>
<span class="cp">{%</span> <span class="k">endif</span> <span class="cp">%}</span><span class="x"></span>
<span class="cp">{%</span> <span class="k">for</span> <span class="nv">var</span> <span class="k">in</span> <span class="nv">config.env_vars</span> <span class="cp">%}</span><span class="x"></span>
<span class="x"> </span><span class="cp">{{</span> <span class="nv">var</span> <span class="cp">}}</span><span class="x"></span>
<span class="cp">{%</span> <span class="k">endfor</span> <span class="cp">%}</span><span class="x"></span>
<span class="cp">{%</span> <span class="k">if</span> <span class="nv">config.cover</span> <span class="cp">%}</span><span class="x"></span>
<span class="x"> SETUP_PY_EXT_COVERAGE=yes</span>
<span class="x">usedevelop = true</span>
<span class="x">commands =</span>
<span class="x"> python setup.py clean --all build_ext --force --inplace</span>
<span class="x"> {posargs:pytest --cov --cov-report=term-missing -vv}</span>
<span class="cp">{%</span> <span class="k">endif</span> <span class="cp">%}</span><span class="x"></span>
<span class="cp">{%</span> <span class="k">if</span> <span class="nv">config.cover</span> <span class="k">or</span> <span class="nv">config.deps</span> <span class="cp">%}</span><span class="x"></span>
<span class="x">deps =</span>
<span class="x"> {[testenv]deps}</span>
<span class="cp">{%</span> <span class="k">endif</span> <span class="cp">%}</span><span class="x"></span>
<span class="cp">{%</span> <span class="k">if</span> <span class="nv">config.cover</span> <span class="cp">%}</span><span class="x"></span>
<span class="x"> pytest-cov</span>
<span class="cp">{%</span> <span class="k">endif</span> <span class="cp">%}</span><span class="x"></span>
<span class="cp">{%</span> <span class="k">for</span> <span class="nv">dep</span> <span class="k">in</span> <span class="nv">config.deps</span> <span class="cp">%}</span><span class="x"></span>
<span class="x"> </span><span class="cp">{{</span> <span class="nv">dep</span> <span class="cp">}}</span><span class="x"></span>
<span class="cp">{%</span> <span class="k">endfor</span> -<span class="cp">%}</span><span class="x"></span>
<span class="cp">{%</span> <span class="k">endfor</span> -<span class="cp">%}</span><span class="x"></span>
</pre></div>
</div>
<div class="section" id="ci-templates-appveyor-ini">
<h4><tt class="docutils literal"><span class="pre">ci/templates/.appveyor.yml</span></tt><a class="headerlink" href="#ci-templates-appveyor-ini" title="Permalink to this headline">
*</a></h4>
<p>For Windows-friendly projects:</p>
<div class="highlight"><pre><span></span><span class="x">version: '{branch}-{build}'</span>
<span class="x">build: off</span>
<span class="x">environment:</span>
<span class="x">  global:</span>
<span class="x">    COVERALLS_EXTRAS: '-v'</span>
<span class="x">    COVERALLS_REPO_TOKEN: IoRlAEvnKbwbhBJ2jrWPqzAnE2jobA0I3</span>
<span class="x">  matrix:</span>
<span class="x">    - TOXENV: check</span>
<span class="x">      TOXPYTHON: C:\Python36\python.exe</span>
<span class="x">      PYTHON_HOME: C:\Python36</span>
<span class="x">      PYTHON_VERSION: '3.6'</span>
<span class="x">      PYTHON_ARCH: '32'</span>
<span class="cp">{%</span> <span class="k">for</span> <span class="nv">env</span><span class="o">,</span> <span class="nv">config</span> <span class="k">in</span> <span class="nv">tox_environments</span><span class="o">|</span><span class="nf">dictsort</span> <span class="cp">%}</span><span class="x"></span>
<span class="cp">{%</span> <span class="k">if</span> <span class="nv">env.startswith</span><span class="o">((</span><span class="s1">'py2'</span><span class="o">,</span> <span class="s1">'py3'</span><span class="o">))</span> <span class="cp">%}</span><span class="x"></span>
<span class="x">    - TOXENV: </span><span class="cp">{{</span> <span class="nv">env</span> <span class="cp">}}{%</span> <span class="k">if</span> <span class="nv">config.cover</span> <span class="cp">%}</span><span class="x">,codecov,coveralls</span><span class="cp">{%</span> <span class="k">endif</span> <span class="cp">%}{{</span> <span class="s2">""</span> <span class="cp">}}</span><span class="x"></span>
<span class="x">      TOXPYTHON: C:\Python</span><span class="cp">{{</span> <span class="nv">env</span><span class="o">[</span><span class="m">2</span><span class="o">:</span><span class="m">4</span><span class="o">]</span> <span class="cp">}}</span><span class="x">\python.exe</span>
<span class="x">      PYTHON_HOME: C:\Python</span><span class="cp">{{</span> <span class="nv">env</span><span class="o">[</span><span class="m">2</span><span class="o">:</span><span class="m">4</span><span class="o">]</span> <span class="cp">}}</span><span class="x"></span>
<span class="x">      PYTHON_VERSION: '</span><span class="cp">{{</span> <span class="nv">env</span><span class="o">[</span><span class="m">2</span><span class="o">]</span> <span class="cp">}}</span><span class="x">.</span><span class="cp">{{</span> <span class="nv">env</span><span class="o">[</span><span class="m">3</span><span class="o">]</span> <span class="cp">}}</span><span class="x">'</span>
<span class="x">      PYTHON_ARCH: '32'</span>
<span class="cp">{%</span> <span class="k">if</span> <span class="s1">'nocov'</span> <span class="k">in</span> <span class="nv">env</span> <span class="cp">%}</span><span class="x"></span>
<span class="x">      WHEEL_PATH: .tox/dist</span>
<span class="cp">{%</span> <span class="k">endif</span> <span class="cp">%}</span><span class="x"></span>
<span class="x">    - TOXENV: </span><span class="cp">{{</span> <span class="nv">env</span> <span class="cp">}}{%</span> <span class="k">if</span> <span class="nv">config.cover</span> <span class="cp">%}</span><span class="x">,codecov,coveralls</span><span class="cp">{%</span> <span class="k">endif</span> <span class="cp">%}{{</span> <span class="s2">""</span> <span class="cp">}}</span><span class="x"></span>
<span class="x">      TOXPYTHON: C:\Python</span><span class="cp">{{</span> <span class="nv">env</span><span class="o">[</span><span class="m">2</span><span class="o">:</span><span class="m">4</span><span class="o">]</span> <span class="cp">}}</span><span class="x">-x64\python.exe</span>
<span class="x">      PYTHON_HOME: C:\Python</span><span class="cp">{{</span> <span class="nv">env</span><span class="o">[</span><span class="m">2</span><span class="o">:</span><span class="m">4</span><span class="o">]</span> <span class="cp">}}</span><span class="x">-x64</span>
<span class="x">      PYTHON_VERSION: '</span><span class="cp">{{</span> <span class="nv">env</span><span class="o">[</span><span class="m">2</span><span class="o">]</span> <span class="cp">}}</span><span class="x">.</span><span class="cp">{{</span> <span class="nv">env</span><span class="o">[</span><span class="m">3</span><span class="o">]</span> <span class="cp">}}</span><span class="x">'</span>
<span class="x">      PYTHON_ARCH: '64'</span>
<span class="cp">{%</span> <span class="k">if</span> <span class="s1">'nocov'</span> <span class="k">in</span> <span class="nv">env</span> <span class="cp">%}</span><span class="x"></span>
<span class="x">      WHEEL_PATH: .tox/dist</span>
<span class="cp">{%</span> <span class="k">endif</span> <span class="cp">%}</span><span class="x"></span>
<span class="cp">{%</span> <span class="k">if</span> <span class="nv">env.startswith</span><span class="o">(</span><span class="s1">'py2'</span><span class="o">)</span> <span class="cp">%}</span><span class="x"></span>
<span class="x">      WINDOWS_SDK_VERSION: v7.0</span>
<span class="cp">{%</span> <span class="k">endif</span> <span class="cp">%}</span><span class="x"></span>
<span class="cp">{%</span> <span class="k">endif</span> <span class="cp">%}{%</span> <span class="k">endfor</span> <span class="cp">%}</span><span class="x"></span>
<span class="x">init:</span>
<span class="x">  - ps: echo $env:TOXENV</span>
<span class="x">  - ps: ls C:\Python*</span>
<span class="x">install:</span>
<span class="x">  - '%PYTHON_HOME%\python -mpip install --progress-bar=off tox -rci/requirements.txt'</span>
<span class="x">  - '%PYTHON_HOME%\Scripts\virtualenv --version'</span>
<span class="x">  - '%PYTHON_HOME%\Scripts\easy_install --version'</span>
<span class="x">  - '%PYTHON_HOME%\Scripts\pip --version'</span>
<span class="x">  - '%PYTHON_HOME%\Scripts\tox --version'</span>
<span class="x">test_script:</span>
<span class="x">  - cmd /E:ON /V:ON /C .\ci\appveyor-with-compiler.cmd %PYTHON_HOME%\Scripts\tox</span>
<span class="x">on_failure:</span>
<span class="x">  - ps: dir "env:"</span>
<span class="x">  - ps: get-content .tox\*\log\*</span>
<span class="x">### To enable remote debugging uncomment this (also, see: http://www.appveyor.com/docs/how-to/rdp-to-build-worker):</span>
<span class="x"># on_finish:</span>
<span class="x"># - ps: $blockRdp = $true; iex ((new-object net.webclient).DownloadString('https://raw.githubusercontent.com/appveyor/ci/master/scripts/enable-rdp.ps1'))</span>
</pre></div>
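<p>The AppVeyor template expands every CPython environment into two matrix entries, one for the 32-bit interpreter and one for the 64-bit one. A rough Python sketch of that expansion follows. The helper <tt class="docutils literal">appveyor_entries</tt> is hypothetical, and it leaves out the <tt class="docutils literal">WHEEL_PATH</tt> and <tt class="docutils literal">WINDOWS_SDK_VERSION</tt> extras to stay short:</p>

```python
def appveyor_entries(env, cover=False):
    """Build the two matrix entries the template emits for one tox env name."""
    # The template skips anything that is not a CPython environment (e.g. pypy).
    if not env.startswith(("py2", "py3")):
        return []
    digits = env[2:4]                              # "py36-cover" gives "36"
    version = "{0}.{1}".format(env[2], env[3])     # "36" gives "3.6"
    toxenv = env + (",codecov,coveralls" if cover else "")
    entries = []
    for suffix, arch in (("", "32"), ("-x64", "64")):
        home = "C:\\Python" + digits + suffix
        entries.append({
            "TOXENV": toxenv,
            "TOXPYTHON": home + "\\python.exe",
            "PYTHON_HOME": home,
            "PYTHON_VERSION": version,
            "PYTHON_ARCH": arch,
        })
    return entries


# Hypothetical environment name, only for demonstration.
for entry in appveyor_entries("py36-cover", cover=True):
    print(entry["PYTHON_ARCH"], entry["TOXPYTHON"], entry["TOXENV"])
```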
<p>If you've been patient enough to read through that you'll notice:</p>
<ul class="simple">
<li>The <a class="reference external" href="https://travis-ci.org/">Travis</a> configuration uses tox for each item in the matrix. This makes testing in <a class="reference external" href="https://travis-ci.org/">Travis</a> consistent with testing
locally.</li>
<li>The environment order for <a class="reference external" href="https://testrun.org/tox/latest/">tox</a> is <tt class="docutils literal">clean</tt>, <tt class="docutils literal">check</tt>, <tt class="docutils literal"><span class="pre">2.6-1.3</span></tt>, <tt class="docutils literal"><span class="pre">2.6-1.4</span></tt>, ..., <tt class="docutils literal">report</tt>.</li>
<li>The environments with coverage measurement run the code without installing (<tt class="docutils literal">usedevelop = true</tt>) so that coverage
can combine all the measurements at the end.</li>
<li>The environments without coverage build an sdist and install it into a virtualenv (<a class="reference external" href="https://testrun.org/tox/latest/">tox</a>'s default behavior <a class="footnote-reference" href="#footnote-2" id="footnote-reference-7">[2]</a>) so that
packaging issues are caught early.</li>
<li>The <tt class="docutils literal">report</tt> environment combines all the runs at the end into a single report.</li>
</ul>
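<p>The environment order mentioned in the list above falls straight out of the <tt class="docutils literal">envlist</tt> loop in the <tt class="docutils literal">tox.ini</tt> template. A rough Python equivalent, assuming lowercase environment names (Jinja's <tt class="docutils literal">sort</tt> filter ignores case by default, Python's <tt class="docutils literal">sorted</tt> does not); <tt class="docutils literal">build_envlist</tt> and the sample names are hypothetical:</p>

```python
def build_envlist(tox_environments):
    """Return the tox envlist in run order: clean, check, sorted envs, report."""
    # `clean` erases old coverage data first and `report` combines the results
    # last, so the generated environments have to sit between them.
    return ["clean", "check"] + sorted(tox_environments) + ["report"]


# Hypothetical environments, only for demonstration.
environments = {"py36-cover": {}, "py27-nocov": {}, "py27-cover": {}}
print(",".join(build_envlist(environments)))
```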
<p>Having the complete list of environments in tox.ini is a huge advantage:</p>
<ul class="simple">
<li>You can run everything in parallel locally (if your tests don't need strict isolation) with <a class="reference external" href="https://pypi.python.org/pypi/detox/">detox</a>. And you can still run
everything in parallel if you want to use <a class="reference external" href="https://drone.io/">drone.io</a> instead of <a class="reference external" href="https://travis-ci.org/">Travis</a>.</li>
<li>You can measure cumulative coverage for everything (merge the coverage measurements for all the environments into a
single one) locally.</li>
</ul>
</div>
</div>
<div class="section" id="test-coverage">
<h3>Test coverage<a class="headerlink" href="#test-coverage" title="Permalink to this headline">
*</a></h3>
<p><a class="reference external" href="https://coveralls.io/">Coveralls</a> is a nice way to track coverage over time and across multiple builds. It automatically comments
on GitHub pull requests about changes in coverage.</p>
</div>
</div>
<div class="section" id="tl-dr">
<h2>TL;DR<a class="headerlink" href="#tl-dr" title="Permalink to this headline">
*</a></h2>
<ul class="simple">
<li>Put code in <tt class="docutils literal">src</tt>.</li>
<li>Use <a class="reference external" href="https://testrun.org/tox/latest/">tox</a> and <a class="reference external" href="https://pypi.python.org/pypi/detox/">detox</a>.</li>
<li>Test <strong>both</strong> <em>with</em> coverage measurements and <em>without</em>.</li>
<li>Use a generator script for <tt class="docutils literal">tox.ini</tt> and <tt class="docutils literal">.travis.yml</tt>.</li>
<li>Run the tests in <a class="reference external" href="https://travis-ci.org/">Travis</a> with <a class="reference external" href="https://testrun.org/tox/latest/">tox</a> to keep things consistent with local testing.</li>
</ul>
<p>Too complicated? Just use a <a class="reference external" href="https://github.com/ionelmc/cookiecutter-pylibrary">python package template</a>.</p>
<p>Not convincing enough? Read <a class="reference external" href="https://hynek.me/articles/testing-packaging/">Hynek's post about the src layout</a>.</p>
<hr class="docutils" />
<p>Also worth checking out this <a class="reference external" href="https://blog.ionelmc.ro/2014/06/25/python-packaging-pitfalls/">short list of packaging pitfalls</a>.</p>
<table class="docutils footnote" frame="void" id="footnote-1" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-6">[1]</a></td><td>There's <a class="reference external" href="https://pypi.python.org/pypi/python-subunit/">subunit</a> and probably others but they are not widely used.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-2" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label">[2]</td><td><em>(<a class="fn-backref" href="#footnote-reference-3">1</a>, <a class="fn-backref" href="#footnote-reference-7">2</a>)</em> See <a class="reference external" href="https://testrun.org/tox/latest/example/basic.html?highlight=install">example</a>.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-3" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label">[3]</td><td>There is a feature specification/proposal in <a class="reference external" href="https://testrun.org/tox/latest/">tox</a> for <a class="reference external" href="http://tox.readthedocs.org/en/latest/config-v2.html">multi-dimensional configuration</a> but it still doesn't solve the problem of generating the
<tt class="docutils literal">.travis.yml</tt> file. There's also <a class="reference external" href="https://github.com/slafs/tox-matrix">tox-matrix</a> but it's not flexible
enough.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-4" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-5">[4]</a></td><td><a class="reference external" href="https://github.com/audreyr/cookiecutter-pypackage">cookiecutter-pypackage</a> is acceptable at the surface level
(tests outside, correct MANIFEST) but still has the core problem (lack of <tt class="docutils literal">src</tt> separation) and gives the wrong
idea to glancing users.</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-5" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-1">[5]</a></td><td><p class="first">It's a chicken-and-egg problem: how can <tt class="docutils literal">pip</tt> know what dependencies to install if running the <tt class="docutils literal">setup.py</tt>
script requires unknowable dependencies?</p>
<p class="last">There are so many weird corners you can get into by having the power to run arbitrary code in the <tt class="docutils literal">setup.py</tt>
script. This is why people <a class="reference external" href="http://legacy.python.org/dev/peps/pep-0390/">tried to</a> <a class="reference external" href="https://pypi.python.org/pypi/d2to1">change</a> <tt class="docutils literal">setup.py</tt> to pure metadata.</p>
</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-6" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-2">[6]</a></td><td>Did you know the order of the rules in <tt class="docutils literal">MANIFEST.in</tt> matters?</td></tr>
</tbody>
</table>
<table class="docutils footnote" frame="void" id="footnote-7" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#footnote-reference-4">[*]</a></td><td><a class="reference external" href="http://legacy.python.org/dev/peps/pep-0020/">PEP-20</a>'s 5th aphorism: Flat is better than nested.</td></tr>
</tbody>
</table>
</div>