The blog has gone a little quiet as I converge on what is hopefully The End of dissertating, which has been taking up an awful lot of time. But a recent post by DrugMonkey on their perceptions of the hurdles crowdfunding science faces inspired me to write a post. Because, while I disagree with some of the specific points DM makes, I tend to agree – crowdfunding’s time has not yet come.

This makes me really, really sad. More after the jump.

Continue reading ‘Crowdfunding Science: An Idea I Want To Love’

In a previous post I reviewed my year with the Python programming language and its community, following the New Year’s Python Meme that’s making the rounds on Twitter and the like.

And now, I’m shamelessly stealing it to look at how I used R in 2012. I figure I should probably let the #2012pythonmeme stay Python-only, so I’m going to try #Rin2012.

1. What’s the coolest R application, package, or library you have discovered in 2012?

This one is like choosing between your favorite children: RStudio and plyr are both new discoveries to me that have massively altered the way I work with R, for the better. If I had to pick one, I’d probably go with RStudio, just because of the amount of work that got done this year thanks to an R-specific IDE. I’m pretty excited about RStudio Server too; I just need a system to run it on. Anyone want to donate one?

2. What new programming technique did you learn in 2012?

I’m going to cheat: since plyr didn’t get the coolest R package award, I’m going to give it credit here. Parallelism in R was the new programming technique I learned this year, and for simple “Apply this function by a grouping variable” tasks, plyr and its connection to doMC saved some serious time. And it’s always gratifying to see all the cores on your machine go to 100%.

3. Which open source project did you contribute to the most in 2012? What did you do?

I didn’t specifically contribute to any open source projects. However, I did publish the code needed for a workshop I taught on mathematical epidemiology on GitHub, and one of the papers I published has a freely available electronic appendix hosted there as well. I’ve got some plans for next year – stay tuned.

4. Which R blog or website did you read the most in 2012?

CrossValidated and StackOverflow for websites. The amazingly useful R-Bloggers lets me cheat in terms of blogs and say “All of them”.

5. What are the top things you want to learn in 2013?

I’m interested in doing some social network analysis with R, and there’s a project that I’ve got an extensive codebase for in SAS that I want to translate over to R, but haven’t the faintest clue on how to get started.

6. What is the top software, application, or library you wish someone would write in 2013?

Besides the magical R faeries leaving me a copy of the above-mentioned SAS code in R? I’d really like to see a replacement for GillespieSSA designed for intensive, research-grade projects.

Want to do your own list? Here’s how:

  • copy-paste the questions and answer them in your blog
  • tweet it with #Rin2012 hashtag

I ran across this meme on a number of different sites: a look back over the year in terms of my interaction with the Python programming language, the open source community, etc. In a slight twist, I’m going to do the same thing for R in a second post.

1. What’s the coolest Python application, framework, or library you have discovered in 2012?

PiCloud. I’ve been following their growth and expansion, and while for the moment I’ve used them mostly for toy examples and experiments, I think they represent a really promising platform for the kind of on-demand, embarrassingly parallel simulation work I’ve been doing recently. Some of my Python programming got shelved this year while other stuff got done, so I’m looking forward to revisiting the platform and seeing how it’s developed. The PiCloud Notebook has me pretty excited.

2. What new programming technique did you learn in 2012?

Honestly, 2012 was the first year I really pushed myself in terms of programming, so I could probably say “All of them” and be right. Generally speaking, just thinking “I could write a function for that” was a huge step.

3. Which open source project did you contribute to the most in 2012? What did you do?

Nothing really. That isn’t strictly true for R (see the post to follow this one), but my Python skills are pretty dubious. I do have some ideas bouncing around in my head, however, so hopefully 2013 will be the year of a contribution or two.

4. Which Python blog or website did you read the most in 2012?

StackOverflow and Python Weekly.

5. What are the top things you want to learn in 2013?

I’d like to push myself a little more in terms of object-oriented programming. It’s something that I can see the utility of, and it would make some of the bookkeeping tasks I need to do in a project or two vastly easier, but it’s still not a technique I’m comfortable with.

I’d also like to dabble in Django, moving from “This result comes out on the command line” into something useful to the world at large.

6. What is the top software, application, or library you wish someone would write in 2013?

What’s left of my dissertation. Hopefully that someone is me, or I’m screwed.

Want to do your own list? Here’s how:

  • copy-paste the questions and answer them in your blog
  • tweet it with #2012pythonmeme hashtag

Retraction Watch has a post on the Elsevier Editorial System (EES) being hacked at some point in the last month, which has led to some paper withdrawals because the reviews for those papers were faked. Sadly, I am not surprised – some of the security measures taken by journals are a touch out-of-date.

Continue reading ‘Elsevier Hacked: Can we get some basic security for journal editorial systems?’

As you may (or may not) have noticed, this blog has been fairly quiet recently – I had entered a period of what a family member referred to as “Radio Silence”. A combination of deadlines, trying to get projects out the door, some life-related stress and two major conferences in a month will do that.

I’m back, hopefully, and should be updating content again regularly soon – I’m hoping to write some more coding-related posts and some science posts, and I’ve joined the O’Reilly Books review network, so there are probably some reviews of their books from the perspective of an epidemiologist coming down the pipeline.

Or at least that’s the plan.

Almost no one will contest that being able to reproduce the findings from scientific studies is key to advancing science – I say almost no one because in my experience you can always find one person to disagree with anything if you look hard enough. We all acknowledge it’s possible, have entire sessions devoted to fretting about John Ioannidis’ paper (which has, ironically, gotten extended past the actual support in the paper, in my opinion), and nod sagely when people talk about making code available, writing clear methods sections, etc.

So when press releases and news reports about the Reproducibility Initiative started making the rounds on various blogs I read, I looked it over with interest. The concept is simple: Reproducible results are good, and should be rewarded. Validate your study through the initiative, and you’ll get a ‘Certificate of Reproducibility’ (of whatever worth that might be to you). More importantly for most career scientists, the replicated results can be published as an independent paper in the PLOS Reproducibility Collection, and the original study will be marked as reproduced in the parent journal if it’s one of the Initiative’s supporters.

That all sounds great…but as with all things, there’s a “but…” coming. Or, to my mind, several. More after the jump.

Continue reading ‘Repeatability, Replication and the Reproducibility Initiative’

Breaking my dissertation- and administrata-induced silence for a small rant combining two of my favorite things – Apple Computer Inc, and simulation. Recently, the New York Times featured the article ‘Apple Confronts the Law of Large Numbers’. The fundamental assertion? That the earnings growth and stock price of Apple cannot continue their rapid rise. The justification? The Law of Large Numbers, and the idea that as Apple grows larger, each additional % increase in earnings, profit, etc. represents a bigger and bigger step in terms of the absolute dollar amount.

One problem: That’s not how the Law of Large Numbers works. More after the jump.
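For reference, the textbook Law of Large Numbers is a statement about averages: the mean of many independent draws of a random variable settles toward its expected value as the number of draws grows. Here is a minimal Python sketch of that, assuming rolls of a fair six-sided die as the random variable. The function names, sample sizes, and seed are made up for illustration and are not from the original post.

```python
import random


def sample_mean_of_die_rolls(n_rolls, rng):
    """Return the average of n_rolls rolls of a fair six-sided die."""
    total = sum(rng.randint(1, 6) for _ in range(n_rolls))
    return total / n_rolls


def lln_demo(sample_sizes=(10, 1_000, 100_000), seed=2012):
    """Map each sample size to |sample mean - expected value|.

    The expected value of a fair die roll is 3.5. The seed is an
    arbitrary choice so that repeated runs give the same numbers.
    """
    rng = random.Random(seed)
    expected = 3.5
    return {
        n: abs(sample_mean_of_die_rolls(n, rng) - expected)
        for n in sample_sizes
    }


if __name__ == "__main__":
    for n, err in lln_demo().items():
        print(f"n={n:>7}: |sample mean - 3.5| = {err:.4f}")
```

With the largest sample size the gap to 3.5 should be tiny. Note that nothing in this says anything about percentage growth getting harder to sustain as a quantity gets bigger.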

Continue reading ‘That’s Not How the “Law of Large Numbers” Works’

So Labguru recently had a blog post entitled 5 Best Mobile Apps for Research Scientists. It’s a decent list, though it’s actually the four best, since your brand new iPad app isn’t something I’m sure you can actually count in an impartial list, though it does look cool.

It’s actually a better list than most. I find myself getting irked when “Science” is taken to invariably mean either “Physics” or, more commonly in the life science blogs and the like that I read, wet-lab biology/biochemistry. What about us poor theorists? Or population-level empiricists? Do we really need a list dominated by timers to make sure you take your samples out of the water bath in time?

After the jump are my Top 5 apps, hopefully not terribly biased toward my own research. And absolutely not featuring my own (nonexistent) app.

Continue reading ‘My Top 5 Mobile Apps for Scientists’

A previous post of mine described how Revolution R 5.0, which does support Red Hat Enterprise Linux, refused to work on Fedora 16, despite the two being extremely similar operating systems and there being no clear reason why. The installation failed, dependencies could not be installed, tech support was singularly unhelpful because I wasn’t using RHEL 5.0, and I essentially said “nuts to this” and went back to my trusty, working, free installation of R sitting in my beloved native OS X. But today, the plot thickened…

Continue reading ‘Revolution R and Fedora: Revisited’


I ran across this post at The Tree of Life entitled ‘Interesting new metagenomics paper w/ one big big big caveat – critical software not available’.

The long and short of it? Paper appears in Science, has fancy new methodology, lacks the software for someone else to use their methodology. Blog author understandably annoyed. But I have some sympathy with the authors of the paper itself, as much as I prefer the code for an analysis to be available for publication. My thoughts after the jump.

Continue reading ‘On Unpublished Software’