As a part of setting up the website for the Docathon I've had to re-learn all of my date string formatting rules. It's one of those little problems you don't really think about - turning an arbitrary string into something structured like a date - until you've actually got to do it.
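For a flavor of what that looks like, here's a minimal sketch using Python's standard `datetime` module. The input string and both format patterns are made up for illustration; they aren't taken from the Docathon site itself.

```python
from datetime import datetime

# Hypothetical input string; the real site data may use a different layout
raw = "2017-03-06 09:30"

# strptime turns the string into a structured datetime,
# given a format pattern that matches the string exactly
parsed = datetime.strptime(raw, "%Y-%m-%d %H:%M")

# strftime goes the other way: structured datetime back to a string
pretty = parsed.strftime("%B %d, %Y at %H:%M")

print(parsed.year, parsed.month, parsed.day)
print(pretty)
```

If the pattern doesn't match the string, `strptime` raises a `ValueError`, which is usually where the re-learning starts.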
Every now and then I come across a nifty feature in Matplotlib that I wish I'd known about earlier. The MPL documentation can be a beast to get through, and as a result you miss some cool stuff sometimes.
This is a quick demo of one such feature: the cycler.
Have you ever had to loop through a number of plotting parameters in matplotlib? Say you have two datasets and you'd like to compare them to one another. Maybe something like this:
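Here's a rough sketch of that situation, first looping over the parameters by hand and then handing the same job to a cycler. The datasets, colors, and linestyles are invented for the example, and the `Agg` backend is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption so this runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from cycler import cycler

# Two made-up datasets to compare
x = np.linspace(0, 1, 50)
datasets = [x, x ** 2]

fig, (ax_manual, ax_cycler) = plt.subplots(1, 2, figsize=(8, 3))

# The manual way: keep parallel lists of parameters and zip through them
colors = ["k", "r"]
linestyles = ["-", "--"]
for data, color, ls in zip(datasets, colors, linestyles):
    ax_manual.plot(x, data, color=color, linestyle=ls)

# The cycler way: adding two cyclers pairs their values up element-wise,
# and the axis then steps through the combined styles on each plot call
style_cycle = cycler(color=["k", "r"]) + cycler(linestyle=["-", "--"])
ax_cycler.set_prop_cycle(style_cycle)
for data in datasets:
    ax_cycler.plot(x, data)

fig.savefig("cycler_demo.png")
```

Both panels should end up looking the same; the difference is that the cycler version keeps the styling in one object instead of spread across the loop.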
This is a quick demo of how I created this video. Check it out below, or read on to see the code that made it!
Scraping publication amounts at biorxiv
Per a recent request somebody posted on Twitter, I thought it'd be fun to write a quick scraper for the biorxiv, an excellent new tool for posting pre-prints of articles before they're locked down with a publisher embargo.
A big benefit of open science is the ability to use modern technologies (like web scraping) to make new use of data that would originally be unavailable to the public. One simple example of this is information and metadata about published articles. While we're not going to dive too deeply here, maybe this will serve as inspiration for somebody else interested in scraping the web.
This article is now interactive! Check out a live Binder instance here
In the next few months, I'll try to take some time to talk about the things I learn as I make my way through this literature. While it's easy to make one-off complaints to one another about how "science is broken" without really diving into the details, it's important to learn about how
I've finally decompressed after my first go-around with Scipy. For those who haven't heard of this conference before, Scipy is an annual meeting where members of the scientific community get together to discuss their love of Python, scientific programming, and open science. It spans both academics and people from industry, making it a unique place in terms of how software interfaces with scientific research. (if you're interested in the full set of Scipy conferences, check out here
As a scientist, watching the Brexit vote was a little bit painful. Though probably not for the reason you're thinking. No, it wasn't the politics that bothered me, but the method for making such an incredibly important decision. Let me explain...
Scientists are a bit obsessed with the concept of error. In the context of collecting data and analyzing it, this takes the form of our "confidence" in the results. If all the data say the same thing, then we are usually pretty confident in the overall message. If the data is more complicated than this (and it always is), then we need to define how confident
When we discuss "computational efficiency", you often hear people throw around phrases like $O(n^2)$ or $O(n \log n)$. We talk about them in the abstract, and it can be hard to appreciate what these distinctions mean and how important they are. So let's take a quick look at what computational efficiency looks like in the context of a very famous algorithm: The Fourier Transform.
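To make the two complexity classes concrete, here's a small pure-Python sketch. It assumes the comparison is between a direct discrete Fourier transform, which costs on the order of $n^2$ operations, and a textbook radix-2 Cooley-Tukey FFT, which costs on the order of $n \log n$. The function names and the test signal are made up for illustration, and the FFT here only handles lengths that are a power of two.

```python
import cmath
import math

def naive_dft(x):
    """Direct DFT: every output bin sums over every input sample, so O(n^2)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def fft(x):
    """Recursive radix-2 FFT, O(n log n). Assumes len(x) is a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])  # transform of the even-indexed samples
    odd = fft(x[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

# Both approaches should agree on a small made-up signal
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
slow = naive_dft(signal)
fast = fft(signal)
max_diff = max(abs(a - b) for a, b in zip(slow, fast))

# Rough operation counts for a longer signal
n = 1024
ops_naive = n ** 2
ops_fft = n * int(math.log2(n))
print(max_diff, ops_naive, ops_fft)
```

For 1024 samples that's roughly a million operations against roughly ten thousand, and the gap keeps widening as the signal gets longer.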
NIH Fellowship Success Rates
As I'm entering the final years of graduate school, I've been applying for a few typical "pre-doc" fellowships. One of these is the NRSA, which is notorious for requiring you to wade through forests of bureaucratic documents (seriously, their "guidelines" for writing an NRSA are over 100 pages!). Doing so ends up taking a LOT of time.
Using Craigslist to compare prices in the Bay Area
In the last post I showed how to use a simple Python bot to scrape data from Craigslist. This is a quick follow-up to take a peek at the data.
Note - data that you scrape from Craigslist is pretty limited. They tend to clear out old posts, and you can only scrape from recent posts anyway to avoid them blocking you.