Rogue Scholar

Biology, English

Identifying Pathogens in Sequencing Data

Published June 21, 2012

I just read an interesting paper on pathogen discovery using next-generation sequencing data, recommended to me by Nick Loman. A previously described algorithm (PathSeq, Kostic et al.) for discovering microbes by deep-sequencing human tissue uses computational subtraction, whereby the initial collection of reads is depleted of human DNA by consecutive alignment to the human reference using MAQ and BLAST.

1000 Genomes, Bioinformatics, Databases, ENCODE, Biology, English

The HaploREG Database for Functional Annotation of SNPs

https://doi.org/10.59350/q0g3e-4vt59

Published June 11, 2012

Author Stephen Turner

The ENCODE project continues to generate massive numbers of data points on how genes are regulated. This data will be of incredible use for understanding the role of genetic variation, both in altering low-level cellular phenotypes (like gene expression or splicing) and in complex disease phenotypes. While it is all deposited into the UCSC browser, ENCODE data is not always easy to access or manipulate.

Bioinformatics, Noteworthy Blogs, R, Recommended Reading, RSS, Biology, English

How to Stay Current in Bioinformatics/Genomics

https://doi.org/10.59350/wqd5d-f4y96

Published May 29, 2012

Author Stephen Turner

A few folks have asked me how I get my news and stay on top of what's going on in my field, so I thought I'd share my strategy.

Bioinformatics, PLINK, R, Software, Biology, English

Stepping Outside My Open-Source Comfort Zone: A First Look at Golden Helix SVS

https://doi.org/10.59350/9pr8r-4pk96

Published May 16, 2012

Author Stephen Turner

I'm a huge supporter of the Free and Open Source Software movement. I've written more about R than anything else on this blog, all the code I post here is free and open-source, and a while back I invited you to steal this blog under a cc-by-sa license. Every now and then, however, something comes along that just might be worth paying for.

Bioinformatics, Biology, English

Video Tip: Use Ensembl BioMart to Quickly Get Ortholog Information

https://doi.org/10.59350/dndjf-sja47

Published May 11, 2012

Author Stephen Turner

A few weeks ago I showed you how to convert gene IDs with BioMart. Yesterday I hosted a workshop on the Ensembl Genome Browser, given by Dr. Bert Overduin from EMBL-EBI. He gave several examples of useful tasks that you can do quickly and easily using BioMart. One, in particular, is something that I'm doing for a client in the core right now.

Announcements, Bioinformatics, R, Biology, English

NSF BIGDATA webinar

https://doi.org/10.59350/3gph9-few41

Published May 1, 2012

Author Stephen Turner

If you're doing any kind of big data analysis - genomics, transcriptomics, proteomics, bioinformatics - then unless you've been on vacation the last few weeks you've no doubt heard about the NSF/NIH BIGDATA Initiative (here's the NSF solicitation and here's the New York Times article about the funding opportunity). The solicitation "aims to advance core scientific and technological means of managing, analyzing, visualizing, and

Bioinformatics, Biology, English

Awk Command to Count Total, Unique, and the Most Abundant Read in a FASTQ file

https://doi.org/10.59350/b4y4g-1rr89

Published April 18, 2012

Author Stephen Turner

I was reading through a paper on comparative ChIP-Seq when I found this awk gem that lets you get some very basic stats very quickly on next generation sequencing reads. To use, simply cat the fastq file (or gunzip -c) and pipe that to this awk command: cat myfile.fq | awk '((NR-2)%4==0){read=$1;total++;count[read]++}END{for(read in count){if(!max||count[read]>max)
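The command in the excerpt above is cut off, so what follows is not the original one-liner. It is a rough, independent sketch of the same idea: count total reads, unique reads, and the most abundant read. It assumes every FASTQ record is exactly four lines with the sequence on the second line, and `fastq_stats` is a hypothetical helper name chosen for illustration.

```shell
#!/bin/sh
# Sketch only: assumes 4-line FASTQ records (sequence on line 2 of each record).
# fastq_stats is a hypothetical helper name, not part of any tool.
fastq_stats() {
    awk 'NR % 4 == 2 {
             total++          # one sequence line per record
             count[$1]++      # tally occurrences of each distinct read
         }
         END {
             max = 0
             best = ""
             unique = 0
             for (read in count) {
                 unique++
                 if (count[read] > max) {
                     max = count[read]
                     best = read
                 }
             }
             printf "total=%d unique=%d most_abundant=%s count=%d\n", total, unique, best, max
         }'
}

# Usage (file names are placeholders):
#   cat myfile.fq | fastq_stats
#   gunzip -c myfile.fq.gz | fastq_stats
```

If several reads tie for the highest count, which one is reported depends on awk's array traversal order, so treat the "most abundant" read as one of the top reads in that case.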

Bioinformatics, R, Twitter, Biology, English

RNA-Seq Methods & March Twitter Roundup

https://doi.org/10.59350/zv7sm-p1x03

Published April 6, 2012

Author Stephen Turner

There were lots of interesting developments this month that didn't work their way into a full blog post. Here is an incomplete list of what I've been tweeting about over the last few weeks.

Bioinformatics, R, Biology, English

Video Tip: Convert Gene IDs with Biomart

https://doi.org/10.59350/d2h43-d8y56

Published March 14, 2012

Author Stephen Turner

I get asked frequently how to convert from one gene identifier to another. This can be tricky, especially when relying on gene symbols, as Will pointed out in a previous post a few years ago. There are several tools that can do this, including DAVID and the previously mentioned new Biomart ID Converter, but I still prefer using the Ensembl Biomart for this because of its added flexibility and annotation.

Announcements, Biology, English

Redesign by Subtraction

https://doi.org/10.59350/84qhp-smg59

Published March 13, 2012

Author Stephen Turner

GGD has a new look. I was inspired by Gina Trapani (Smarterware, Lifehacker) to remove any extra lines, links, and other "ink" that doesn't serve any purpose, and I hope the site appears cleaner and easier to read. I also wanted the extra horizontal space for larger images and to avoid the dreaded side-scrolling in posts with lots of code like this one.

Bioinformatics, Linux, R, Biology, English

find | xargs ... Like a Boss

https://doi.org/10.59350/dy3rn-xgm60

Published March 9, 2012

Author Stephen Turner

*Edit March 12* Be sure to look at the comments, especially the commentary on Hacker News - you can supercharge the find|xargs idea by using find|parallel instead.

Do you ever discover a trick to do something better, faster, or easier, and wish you could reclaim all the wasted time and effort before your discovery?
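For readers who have not seen the find | xargs pattern, here is a minimal sketch of how it is commonly used. It is not taken from the post itself. `run_on_matches` is a hypothetical helper name, and the `-print0`, `-0`, and `-P` options are assumed to be available, as they are in the GNU and BSD versions of find and xargs.

```shell
#!/bin/sh
# Sketch only: run a command once per matching file, up to 4 jobs at a time.
# run_on_matches is a hypothetical helper; -print0/-0/-P assume GNU or BSD tools.
# Assumes at least one file matches; GNU xargs runs the command once on empty input.
run_on_matches() {
    dir=$1
    pattern=$2
    shift 2
    # -print0 and -0 pass file names NUL-delimited, so names with spaces survive.
    # -n 1 gives each invocation a single file; -P 4 runs up to four in parallel.
    find "$dir" -type f -name "$pattern" -print0 | xargs -0 -n 1 -P 4 "$@"
}

# Usage (directory and pattern are placeholders):
#   run_on_matches . '*.fastq' gzip
```

The NUL-delimited hand-off is the main design choice here: plain `find | xargs` splits on whitespace, which breaks on file names containing spaces or newlines.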

Getting Genetics Done
