About me

I recently graduated from Chapman University with a B.S. in Mathematics and a B.S. in Computer Science. In Fall 2010 I will be entering a Master's program in computer science at the University of California, Irvine.

I am currently on summer vacation and available for hire! I am well suited for positions involving software engineering, bioinformatics, and database-driven web applications. Please see my resume for complete qualifications and contact me for further information.

For your convenience, my resume is available as a PDF.

Bioinformatics

I am currently working in computational molecular phylogenetics, studying the co-evolution of organisms. To facilitate this, I have developed a web service for identifying orthologs: proteins in different species that are related through speciation. These proteins are important because they can be compared to one another to build a phylogenetic tree. The web service runs on Chapman University's computing infrastructure and is available at http://ortholog.us.

In the near future the service will be expanded to support protein-nucleotide and nucleotide-protein searches, enabling the identification of orthologs using Expressed Sequence Tags (ESTs). This will make it easy for phylogenetic trees to include species that are underrepresented in traditional sequence databases but have strong EST data available.

We will also be expanding on the service's API by adding Python, Java, and C++ bindings to simplify searching for orthologs from within custom software applications. In the meantime we offer a JSON API so developers can make queries and retrieve results using a JSON library, such as the one included in Python 2.6.

Machine Learning

My main experience with machine learning has been using Markov chains to model Java naming conventions in SourceForge projects. This model was incorporated in an Eclipse plug-in to help novice programmers adhere to standard naming conventions. The accompanying paper was presented at ACM's SIGCSE 2010.
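To give a feel for the idea, here is a minimal sketch of a first-order Markov chain over identifier names. It is not the model from the paper or the plug-in: the choice of character classes as states, the add-one smoothing, the tiny training list, and the function names train and score are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Possible "next" states in the chain (hypothetical choice of states).
NEXT_STATES = ["lower", "upper", "digit", "underscore", "other", "end"]


def char_class(ch):
    """Map a character to a coarse class used as a Markov state."""
    if ch.islower():
        return "lower"
    if ch.isupper():
        return "upper"
    if ch.isdigit():
        return "digit"
    if ch == "_":
        return "underscore"
    return "other"


def to_states(name):
    """Turn an identifier into a state sequence with start/end markers."""
    return ["start"] + [char_class(c) for c in name] + ["end"]


def train(names):
    """Count state-to-state transitions over a list of example names."""
    counts = defaultdict(lambda: defaultdict(int))
    for name in names:
        seq = to_states(name)
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def score(counts, name):
    """Average log-probability per transition, with add-one smoothing."""
    seq = to_states(name)
    total = 0.0
    for prev, nxt in zip(seq, seq[1:]):
        row = counts.get(prev, {})
        numerator = row.get(nxt, 0) + 1
        denominator = sum(row.values()) + len(NEXT_STATES)
        total += math.log(numerator / denominator)
    return total / (len(seq) - 1)


# Made-up training data standing in for names mined from real projects.
TRAINING_NAMES = ["getName", "setValue", "userCount",
                  "isEmpty", "parseInput", "toString"]
model = train(TRAINING_NAMES)

camel = score(model, "getUserName")
snake = score(model, "get_user_name")
print(camel > snake)
```

A name that follows the conventions seen in training receives a higher average log-probability than one that does not, so a tool could flag low-scoring names for review.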

I have also done extensive research using Markov Chain Monte Carlo for high-dimensional numerical integration. This is perhaps most commonly used to sample from complex posterior distributions in Bayesian inference. For further information please see my in-depth introductory paper.
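As a small illustration of the technique, the sketch below uses a random-walk Metropolis sampler to estimate an integral as an expectation. It is a one-dimensional toy rather than a high-dimensional problem, and the function name metropolis_expectation, the step size, the burn-in length, and the target (a standard normal) are all assumptions chosen for the example, not taken from the paper.

```python
import math
import random


def metropolis_expectation(log_density, f, n_samples=50000, burn_in=1000,
                           step=1.0, x0=0.0, seed=0):
    """Estimate E[f(X)] where X has an unnormalized log density log_density."""
    rng = random.Random(seed)
    x = x0
    log_p = log_density(x)
    total = 0.0
    for i in range(n_samples + burn_in):
        # Propose a symmetric random-walk move around the current point.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        # Only average over samples drawn after the burn-in period.
        if i >= burn_in:
            total += f(x)
    return total / n_samples


def standard_normal_log_density(x):
    """Log density of N(0, 1), up to an additive constant."""
    return -0.5 * x * x


# E[X^2] under N(0, 1) is exactly 1, so the estimate should land near 1.
estimate = metropolis_expectation(standard_normal_log_density,
                                  lambda x: x * x)
print(estimate)
```

The sampler only needs the density up to a normalizing constant, which is why the same approach carries over to posteriors whose normalizer is intractable.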

Finally, I have used techniques such as TF-IDF, Latent Semantic Analysis, and Naive Bayes to classify song lyrics into genres for an undergraduate artificial intelligence course.

Database Application Development

I freelance for several small companies, doing custom web application development. I usually stick to the Django web framework for the rapid development of structured systems. I have been using this framework since before version 1.0 and am familiar with many of its intricacies. If you are interested in hiring me, please contact me through the form on this site or at the email address on my resume.