{ inercia }

May 10

PacketFence: a network access control solution -

May 09

How do you estimate on an Agile project? -

Estimation can be a difficult beast to deal with; more so on an Agile project. How do you estimate when you don’t have a list of requirements that is complete or signed-off by the customer? Or a nailed-down schedule? What should your currency of estimation be? How do you estimate on a distributed team? Is it worth estimating at all?

May 01

Dependency Inversion in practice

Apr 26

Copperhead -

Data parallelism brought to Python from Nvidia Research.

Apr 24

Strategy: Using Lots of RAM Often Cheaper than Using a Hadoop Cluster -

For smaller problems, memory has reached a GB/$ ratio where it is technically and financially feasible to use a single server with 100s of GB of DRAM rather than a cluster. Given the majority of analytics jobs do not process huge data sets, a cluster doesn’t need to be your first option. Scaling up RAM saves on programmer time, reduces programmer effort, improves accuracy, and reduces hardware costs.

Apr 01

objgraph: hunting memory leaks

Mar 21

How we use Python at Spotify

Mar 20

Redis persistence -

I was under the impression that Redis did not implement “reliable” persistence. I’ve always used it as an in-memory key-value store, mostly from Python. So I was looking for persistence solutions when I found this explanation by one of the developers, where he tries to demystify Redis persistence.

He goes over the common aspects to consider when persisting a database, explaining the interactions with the OS and why you can’t be sure all data has reached the disk unless you perform all your I/O operations synchronously (which is rather expensive). Then he explains the two main persistence alternatives: snapshots and AOF (append-only) files.

Snapshots just dump the whole dataset to a disk image, but they are expensive operations and cannot be performed too often (they are usually done every 5 minutes or so). In contrast, AOF files need little I/O: every write operation is appended to a file as it happens, in the same order the server executes it, so the log can be replayed after a crash. In my view, and depending on the fsync policy you use (which determines how often data is forced to disk), AOF files can provide the same reliability as a conventional database (e.g., PostgreSQL). And, as the author explains, both methods can be combined for increased safety:

From a more practical point of view Redis provides both AOF and RDB snapshots, that can be enabled simultaneously (this is the advised setup, when in doubt), offering at the same time easy of operations and data durability. 
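
To make the append-only idea concrete, here is a toy sketch in Python. It is not Redis code and it does not use the Redis AOF format: the class name `TinyAOFStore`, the tab-separated log lines and the `fsync_always` flag are all made up for illustration. It only shows the principle described above: log each write in order, optionally force it to disk with `os.fsync`, and rebuild the state on startup by replaying the log.

```python
import os
import tempfile


class TinyAOFStore:
    """Toy in-memory key-value store with an append-only log (hypothetical, not Redis)."""

    def __init__(self, path, fsync_always=True):
        self.path = path
        self.fsync_always = fsync_always  # illustrative stand-in for an fsync policy
        self.data = {}
        self._replay()
        self._log = open(path, "a", encoding="utf-8")

    def _replay(self):
        # Rebuild the state by re-executing the logged operations in order.
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as log:
            for line in log:
                parts = line.rstrip("\n").split("\t")
                if parts[0] == "SET" and len(parts) == 3:
                    self.data[parts[1]] = parts[2]
                elif parts[0] == "DEL" and len(parts) == 2:
                    self.data.pop(parts[1], None)

    def _append(self, *fields):
        # Keys and values must not contain tabs or newlines in this toy format.
        self._log.write("\t".join(fields) + "\n")
        self._log.flush()  # hand the bytes to the OS
        if self.fsync_always:
            os.fsync(self._log.fileno())  # ask the OS to push them to disk

    def set(self, key, value):
        self._append("SET", key, value)
        self.data[key] = value

    def delete(self, key):
        self._append("DEL", key)
        self.data.pop(key, None)

    def close(self):
        self._log.close()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy.aof")
        store = TinyAOFStore(path)
        store.set("a", "1")
        store.set("b", "2")
        store.delete("a")
        store.close()  # pretend the process went away

        recovered = TinyAOFStore(path)  # a "restart" replays the log
        state = dict(recovered.data)
        recovered.close()
        return state
```

Calling `demo()` writes three operations, reopens the file and returns whatever the replay reconstructed. With `fsync_always=False` the writes only reach the OS buffers, which is cheaper but is exactly the window in which data can be lost on a power failure.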

Mar 19

Async I/O for Python 3 -

A presentation by Guido van Rossum on how Tulip is going to replace gevent, Tornado and all those incompatible libraries…

Using Redis as LINE storage -

How they used Redis as storage in LINE, and how they scaled up with the help of Hadoop, HBase and some other tools…