Genomics Programming
streaming merge of sorted objects

A lot of software still seems to rely on being able to read big-ish data into memory. This is not possible (or at least not desirable) for much of the data that I work with. There are very nice tools in Python that allow operating on chunks of data at a time. When combined with a decent data layout, this can be very powerful, and simpler even than reading...
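
As a rough illustration of the chunk-at-a-time idea, here is a minimal sketch using the standard library's `heapq.merge`, which lazily merges inputs that are already sorted and keeps only one pending item per input in memory. The tab-delimited "chrom, start, end" layout and the helper names `parse_intervals` and `merge_sorted_streams` are assumptions made for this example; they are not taken from the post.

```python
import heapq


def parse_intervals(lines):
    """Lazily parse tab-delimited 'chrom<TAB>start<TAB>end' lines into tuples.

    The record layout is an assumption for this sketch.
    """
    for line in lines:
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        yield chrom, int(start), int(end)


def merge_sorted_streams(*streams):
    """Merge already-sorted iterables without loading them into memory.

    heapq.merge returns an iterator and holds one item per input stream.
    """
    return heapq.merge(*streams)


# Two small in-memory stand-ins for sorted files; open file handles
# could be passed to parse_intervals in the same way.
a = ["chr1\t10\t20\n", "chr1\t50\t60\n", "chr2\t5\t9\n"]
b = ["chr1\t15\t25\n", "chr2\t1\t4\n"]

merged = list(merge_sorted_streams(parse_intervals(a), parse_intervals(b)))
for record in merged:
    print(record)
```

Each input must already be sorted by the same ordering (here, plain tuple comparison on chromosome name, then start, then end); `heapq.merge` does not check this.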

Wed Dec 4, 2013 18:14
Adding bed/wig data to dalliance genome browser

I have been playing a bit with the Dalliance genome browser. It is quite useful, and I have started using it to generate links to send to researchers to show regions of interest we find from bioinformatics analyses. I added a document to my GitHub repo describing how to display a bed file in the browser. That rst is here and displayed inline below....

Wed Apr 13, 2011 04:40
(bloom) filter-ing repeated reads

In this post, I'll talk a bit about using a bloom filter as a pre-filter for large amounts of data, specifically some next-gen sequencing reads. Bloom Filters: A Bloom filter is a memory-efficient way of determining if an element is in a set. It can have false positives, but not false negatives. A while ago, I wrote a Cython/Python wrapper for the C code...
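
To make the false-positive/no-false-negative property concrete, here is a minimal pure-Python sketch of a Bloom filter used as a pre-filter for repeated reads. It is a stand-in for illustration only, not the Cython/C wrapper mentioned in the post; the class `BloomFilter`, the helper `possibly_repeated`, and the default sizes are all assumptions.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a bit array plus several derived hash positions."""

    def __init__(self, n_bits=1 << 20, n_hashes=4):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, item):
        # Derive n_hashes positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.n_hashes):
            yield (h1 + i * h2) % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7))
            for pos in self._positions(item)
        )


def possibly_repeated(reads, bloom):
    """Yield reads the filter reports as already seen (may include false positives)."""
    for read in reads:
        if read in bloom:
            yield read
        else:
            bloom.add(read)


reads = ["ACGT", "TTGA", "ACGT", "GGCC", "TTGA"]
candidates = list(possibly_repeated(reads, BloomFilter()))
print(candidates)
```

Because the filter can only err toward "seen", every truly repeated read ends up in `candidates`; an exact structure such as a set or dict can then be run over that much smaller list to discard any false positives.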

Fri Oct 22, 2010 19:12
filtering paired end reads (high throughput sequencing)

NOTE: I don't recommend using this code. It is not supported and currently does not work for some sets of reads. If you use it, be prepared to fix it. I wrote last time about a pipeline for high-throughput sequence data. In it, I mentioned that the fastx toolkit works well for filtering but does not handle paired end reads. The problem is that you...

Tue Sep 21, 2010 02:34
ngs / high-throughput sequencing pipeline

This is the minimal set of preprocessing steps I run on high-throughput sequencing data (mostly from the Illumina sequencers) and then how I prep and view the alignments. If there's something I should add or consider, please let me know. I'll put it in the form of a shell script that assumes you've got this software installed. I'll also assume your...

Mon Sep 13, 2010 04:21
GSNAP

Aligners: Since starting the methylcoder project, I've been using the bowtie short read aligner. It's very fast, uses very little memory, aligns Illumina, SOLiD, and colorspace reads, and has enough options to keep you busy (including my favorite: --try-hard). There's a new short-read aligner in my feed-reader each week. I wish, as a service, they'd tell...

Wed Jul 14, 2010 22:50
