FlyBase Homework Assignment

FlyBase (flybase.org) is a valuable community resource for Drosophila (and non-Drosophila) geneticists. The purpose of this exercise is to help you become familiar with some of the functions of FlyBase so that you can put them to use. Note: you don’t need to pay the web access fee.

A great deal can be done with FlyBase, and I hope you learn some of it by answering the following questions.

Start by watching these quick tutorials:

  • How to find all data related to a gene

https://www.youtube.com/watch?v=OlHv3zWt5cw

  • Finding related genes in FlyBase: Gene Groups

https://www.youtube.com/watch?v=GQ_2X29Gx-E

  • FlyBase 2.0

https://www.youtube.com/watch?v=DGUjKmeTlKE

  • Gene Snapshots

https://www.youtube.com/watch?v=6IDlJGXdIP8

Acp26Aa is a well-studied gene.

What is the full name of this gene?

What is its annotation symbol?

Tell me a bit about what it does.

What is the approximate location of this gene in the genome?

Paste the decorated FASTA sequence below.  Note: keep the colors that are used on the website.

What do the different colors and capitalization indicate?

Download the coding sequence (CDS) as a FASTA file. In the sequence box, change “gene region” to “CDS” and click “Get Sequence”. On the next page, click “Download FASTA” and open the sequence in a text viewer such as WordPad.

How many nucleotides long is the coding sequence?
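Counting by hand in a text viewer is error-prone, so here is a minimal Python sketch for measuring the length of a sequence in a FASTA file. The function name `read_fasta` and the example record are made up for illustration; the example is a placeholder, not the real Acp26Aa sequence. Paste the contents of your downloaded file in its place.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A line starting with ">" begins a new record.
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines may be wrapped, so collect all of them.
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


# Made-up placeholder record, NOT the real Acp26Aa sequence.
example = """>example_CDS hypothetical header
ATGAAACTG
GGTTAA
"""

for header, seq in read_fasta(example).items():
    print(header, len(seq))
```

The line breaks inside a FASTA record are only for display, which is why the sketch joins the wrapped lines before counting.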

Download the protein translation in FASTA and paste it below. Each letter in the FASTA sequence is the one-letter abbreviation of an amino acid (https://en.wikipedia.org/wiki/Amino_acid).

How many amino acids long is the protein?

What are the first and last amino acids?
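As a sanity check on your answers, the CDS and protein lengths should agree: each amino acid is encoded by one codon of three nucleotides. The sketch below assumes the downloaded CDS includes the stop codon (check your file, and set `includes_stop=False` if it does not). The function names and the example sequences are made up for illustration and are not the real Acp26Aa data.

```python
def expected_protein_length(cds, includes_stop=True):
    """Return the number of amino acids a CDS of this length should encode."""
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = len(cds) // 3
    # The stop codon does not code for an amino acid.
    return codons - 1 if includes_stop else codons


def first_and_last(protein):
    """Return the first and last residues, ignoring a trailing '*' stop marker."""
    protein = protein.strip().rstrip("*")
    return protein[0], protein[-1]


# Made-up placeholder sequences, NOT the real Acp26Aa data.
cds = "ATGAAACTGGGTTAA"   # ATG AAA CTG GGT TAA
protein = "MKLG"

print(expected_protein_length(cds))  # number of codons minus the stop codon
print(len(protein))
print(first_and_last(protein))
```

If the two lengths disagree, the usual causes are a CDS without a stop codon or a header line accidentally counted as sequence.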

What are the Gene Ontology categories for Acp26Aa?

In what tissue is Acp26Aa expressed based on Northern Blots?

What is the first citation in the references section?

BLAST the D. melanogaster Acp26Aa coding DNA sequence against the other species in the Sophophora subgenus (exclude D. melanogaster). Use “Genome Assembly” and blastn NT->NT. In general, hits with E-values < 10^-10 are considered significant.

What is BLAST?

What is an E-value?

Based on a cut-off of 10^-10, what other species show strong evidence for homologous genes?
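Applying the cut-off amounts to keeping each species whose best (smallest) E-value is below 10^-10. A minimal Python sketch, assuming you have copied the species and E-value of each hit from the results table by hand; the species names and E-values below are placeholders, not real results.

```python
E_VALUE_CUTOFF = 1e-10  # the 10^-10 threshold used in this assignment


def significant_species(hits, cutoff=E_VALUE_CUTOFF):
    """Return the species whose best (smallest) E-value is below the cutoff.

    hits: iterable of (species, evalue) pairs copied from the BLAST results.
    """
    best = {}
    for species, evalue in hits:
        # Keep only the smallest E-value seen for each species.
        if species not in best or evalue < best[species]:
            best[species] = evalue
    return sorted(s for s, e in best.items() if e < cutoff)


# Placeholder hits, NOT real BLAST output.
hits = [
    ("species_A", 3e-45),
    ("species_A", 2e-4),
    ("species_B", 0.5),
    ("species_C", 1e-12),
]

print(significant_species(hits))
```

Note that a smaller E-value means a stronger hit, so the comparison keeps values below the threshold.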

In many cases, you want to be able to identify orthologous genes rather than just homologous genes. What is the difference between orthologous genes and homologous genes?

One way to test for orthologous genes is called reciprocal best BLAST hits. Take the gene from D. melanogaster and BLAST it against another species, then take the top hit that you get and BLAST it back against the D. melanogaster genome. Do this with the top hit from your Acp26Aa search. Do you get the D. melanogaster Acp26Aa back as the top hit? (Hints: to get the sequence from the top BLAST hit, click the subject FASTA button on the results table. To see more details about the hit you get back, click on GBrowse; this will take you to the region of the genome where the hit is located, and you can see any features in that area.)
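The logic of the reciprocal best hit test can be sketched in a few lines of Python. This assumes you have recorded each hit as a (label, E-value) pair by hand, with the label for a genome hit taken from the feature you see in GBrowse. All names and values below are hypothetical placeholders, not real results.

```python
def best_hit(hits):
    """Return the label of the hit with the smallest E-value, or None if empty."""
    if not hits:
        return None
    return min(hits, key=lambda hit: hit[1])[0]


def is_reciprocal_best_hit(query_gene, forward_hits, reverse_hits):
    """Check whether the reverse search returns the original query gene.

    forward_hits: hits from BLASTing query_gene against the other species.
    reverse_hits: hits from BLASTing the forward top hit back against
                  the D. melanogaster genome.
    """
    if best_hit(forward_hits) is None:
        return False  # no forward hit, so there is nothing to reciprocate
    return best_hit(reverse_hits) == query_gene


# Placeholder hits, NOT real BLAST output.
forward_hits = [("other_species_locus_1", 1e-40), ("other_species_locus_2", 1e-8)]
reverse_hits = [("Acp26Aa", 1e-38), ("some_other_gene", 1e-5)]

print(is_reciprocal_best_hit("Acp26Aa", forward_hits, reverse_hits))
```

If the reverse search returned a different gene as its top hit, the two genes would still be homologous, but the test would not support calling them orthologs.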

Are there any known human orthologs of Acp26Aa or any human disease model data?

CG4637 is another interesting gene. What is this gene’s name?

What does CG4637 do (according to the Gene Snapshot)?

Does CG4637 have any human orthologs, and if so, what are they?

What role might these orthologs play in human disease according to:

  • FlyBase Human Disease Model Reports?

  • The Automated Description in the Summary?
