

Sustainable Living: One man's trash...

Recent posts

Freely Speaking: On the need to act with urgency.

I just read this article on the Great Barrier Reef suffering irreversible damage from climate disruption. It moved me so much that I just had to quickly post an appeal to anyone who happened to be reading this blog:

The changes happening to our environment are real, massive, and definitely caused in very large part by human action (e.g. burning fossil fuels for transportation and energy, deforestation, etc.) and made worse by inaction (e.g. governments twiddling their thumbs and ignoring the problem, or being afraid of shaking up the status quo).

There is some good news in all of this, though: since it is humans causing this problem, it is also up to us to do everything in our power to fix it. And since Earth Week is coming up, I would like to appeal to everyone to take action.

Freely Speaking: Programming Biology - Eric Klavins

Today's post refers to an inspiring talk by Eric Klavins, who talks about programming the biology lab in order to program biological entities. I'll leave it to the reader to work out the implications of this vision.

Freely Speaking: What does SciPy have to do with bio-based ideas??!

I was recently asked the above question. And it's a totally valid question as SciPy is somewhat outside of what I usually write about (biological topics, sustainable topics). There is a logical connection though, and it has to do with what I do at work.

Building biological entities is difficult because, unlike cars, biological entities like to "misbehave". I say misbehave, but it actually only means that we don't understand microorganisms well enough to model them perfectly.

This is where data science comes in, of course. Through the collection of large data sets, data science and related fields can help us uncover patterns not seen before. These patterns then help make a better yeast model. Better models = faster product development. Faster product development = faster route to a sustainable business = more products with a positive impact.

So there is a link. Simple, right?

Recap: SciPy 2015 - Synopsis

SciPy 2015 has come and gone. If I step back, what are some of the lessons learned?

There were certain themes that recurred from talk to talk:

Speed. One of the perceived limitations of Python seems to be speed of execution, which matters when processing very large datasets. Many talks dealt with this topic in various ways. Some of the approaches included process parallelization (Dask, DistArray), GPU acceleration (VisPy), and acceleration via some means of compilation, sometimes just-in-time (Numba). With these tools, Python is no longer slow. It's impressive that the combination of these approaches has enabled data scientists to process 60+ GB data sets as if they were loaded into memory on one small laptop that actually has only 8-16 GB of memory.

Visualization was a theme. So many talks dealt with making complex data sets visible, and they did so to address different issues: Serialization to enable interactivity (Bokeh, matplotlib), visualizing large dynamic datase…

Recap: SciPy 2015 - Day 3

Jake VanderPlas, a contributor to everything from NumPy to SciPy, scikit-learn, mpld3, etc., gave the keynote on the third day of the conference, talking about the state of scientific computing in Python. As a side note, Jake is a senior scientist and Director of Research at the eScience Institute at the University of Washington. I hear that he is currently involved in developing the data analysis pipeline for the LSST as well.

Recap: SciPy 2015 - Day 2

Wes McKinney gave the keynote on the second day. This talk was more of a retrospective, a personal journey with a view on the future of Python and the greater data science community. Interestingly, some of the tools seem to have started a "long time ago": 2008. Wes talked about 2011 being the year when pandas development took off again. Thinking about my own history, I joined Amyris in 2011 as part of the Enzymology department, which doesn't feel that long ago. pandas bug and design fixes and data wrangling capabilities were implemented from June 2011 to July 2012, which is just 3 months before I joined the software engineering department, and that feels really recent.