[rat-forum] Announcing change to RGD GENES ftp file

Kowalski, George gkowalski at mcw.edu
Tue May 8 15:00:39 CDT 2007


All, 

 

Because of internal database changes, and inconsistencies found while
trying to read the rat GENES file into Biomart, RGD has stopped updating
this file effective immediately. This GENES file is currently located
at:

 

ftp://rgd.mcw.edu/pub/data_release/GENES

 

This file will remain available at the site above for at least two months,
after which it will be removed. 

 

 

We are now generating three new files: 

 

ftp://rgd.mcw.edu/pub/data_release/GENES_RAT

 

ftp://rgd.mcw.edu/pub/data_release/GENES_HUMAN

 

ftp://rgd.mcw.edu/pub/data_release/GENES_MOUSE

 

 

The format of these files is similar to that of the GENES file, except for
a number of small changes. These changes are:

 

 

1) Split the start and stop positions into separate fields. The old data
were not machine readable if one of these values was not present.

 

Removed:

 

START_POS, STOP_POS fields

 

and the data has been moved into the fields:  

 

START_POS_31  STOP_POS_31   START_POS_34  STOP_POS_34

 

2) We've added the fields: STRAND_31  and STRAND_34 containing the
strand information. 

 

3) The SWISSPROT_ID field has been renamed to UNIPROT_ID.


 

4) Changed the following field names for consistency; they no longer
contain spaces (for human readability):

 

The "ENTREZ GENE" field is now ENTREZ_GENE

 

The "GDB ID" field is now GDB_ID

 

5) Removed the following fields, as we are generating separate files for
Human and Mouse:

 

MOUSE_HOMOLOG_RGD_ID       

MOUSE_HOMOLOG_SYMBOL      

MOUSE_HOMOLOG_NAME          

MOUSE_CHROMOSOME  

MGD_ID            

HUMAN_HOMOLOG_RGD_ID       

HUMAN_HOMOLOG_SYMBOL      

HUMAN_HOMOLOG_NAME          

HUMAN_CHROMOSOME

 

 

Below is a complete list of fields in the new files and in the current
file.

 

Fields in the new files: 

 

GENE_RGD_ID     

SYMBOL  

NAME    

GENE_DESC       

CHROMOSOME      

FISH_BAND       

START_POS_31    

STOP_POS_31     

STRAND_31       

START_POS_34    

STOP_POS_34     

STRAND_34       

CURATED_REF_RGD_ID      

CURATED_REF_PUBMED_ID   

UNCURATED_PUBMED_ID     

RATMAP_ID       

ENTREZ_GENE     

UNIPROT_ID      

RHDB_ID 

UNCURATED_REF_MEDLINE_ID        

GENBANK_NUCLEOTIDE      

TIGR_ID 

GENBANK_PROTEIN 

UNIGENE_ID      

GDB_ID  

SSLP_RGD_ID     

SSLP_SYMBOL     

ALIAS_VALUE     

ALIAS_TYPES     

QTL_RGD_ID      

QTL_SYMBOL      

NOMENCLATURE_STATUS     

SPLICE_RGD_ID   

SPLICE_SYMBOL   

GENE_TYPE       

ENSEMBL_ID
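
For anyone updating a parser, here is a minimal Python sketch of reading the new per-version position fields. It assumes, and this announcement does not state it, that the files are tab-delimited with a header line and that a missing position appears as an empty string. The sample rows are made up for illustration, and `parse_pos` and `read_positions` are hypothetical helper names.

```python
import csv
import io


def parse_pos(value):
    """Return the position as an int, or None when the field is empty."""
    value = (value or "").strip()
    return int(value) if value else None


def read_positions(handle):
    """Yield (symbol, start_31, stop_31, start_34, stop_34) for each row.

    Assumption: tab-delimited input whose first line holds the field names.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        yield (
            row["SYMBOL"],
            parse_pos(row.get("START_POS_31")),
            parse_pos(row.get("STOP_POS_31")),
            parse_pos(row.get("START_POS_34")),
            parse_pos(row.get("STOP_POS_34")),
        )


# Made-up sample data, not taken from the real files.
SAMPLE = (
    "SYMBOL\tSTART_POS_31\tSTOP_POS_31\tSTART_POS_34\tSTOP_POS_34\n"
    "GeneA\t100\t200\t150\t250\n"
    "GeneB\t\t\t300\t400\n"
)

rows = list(read_positions(io.StringIO(SAMPLE)))
```

Because each position has its own field, a gene with no value for one set of positions (GeneB above) still parses cleanly, with the missing values read as `None`.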

 

Fields in the current file (GENES):

 

GENE_RGD_ID        
SYMBOL     
NAME       
GENE_DESC  
CHROMOSOME 
FISH_BAND  
START_POS  
STOP_POS   
CURATED_REF_RGD_ID 
CURATED_REF_PUBMED_ID       
UNCURATED_PUBMED_ID 
RATMAP_ID  
ENTREZ GENE        
SWISSPROT_ID       
RHDB_ID    
UNCURATED_REF_MEDLINE_ID    
GENBANK_NUCLEOTIDE 
TIGR_ID    
GENBANK_PROTEIN    
UNIGENE_ID 
MOUSE_HOMOLOG_RGD_ID        
MOUSE_HOMOLOG_SYMBOL        
MOUSE_HOMOLOG_NAME 
MOUSE_CHROMOSOME   
MGD_ID     
HUMAN_HOMOLOG_RGD_ID        
HUMAN_HOMOLOG_SYMBOL        
HUMAN_HOMOLOG_NAME 
HUMAN_CHROMOSOME   
GDB ID     
SSLP_RGD_ID        
SSLP_SYMBOL        
ALIAS_VALUE        
ALIAS_TYPES        
QTL_RGD_ID 
QTL_SYMBOL 
NOMENCLATURE_STATUS 
SPLICE_RGD_ID      
SPLICE_SYMBOL      
GENE_TYPE  
ENSEMBL_ID
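
The field changes described in this message can be summarized as a small header mapping. The Python sketch below is illustrative only: it converts the list of old GENES field names into the list of new field names, and it does not convert any data, since the layout of the old START_POS and STOP_POS values is not described here. The names `RENAMED`, `REMOVED`, `POSITION_FIELDS` and `migrate_header` are hypothetical.

```python
# Field names in the old GENES file, in the order listed in this message.
OLD_FIELDS = [
    "GENE_RGD_ID", "SYMBOL", "NAME", "GENE_DESC", "CHROMOSOME", "FISH_BAND",
    "START_POS", "STOP_POS", "CURATED_REF_RGD_ID", "CURATED_REF_PUBMED_ID",
    "UNCURATED_PUBMED_ID", "RATMAP_ID", "ENTREZ GENE", "SWISSPROT_ID",
    "RHDB_ID", "UNCURATED_REF_MEDLINE_ID", "GENBANK_NUCLEOTIDE", "TIGR_ID",
    "GENBANK_PROTEIN", "UNIGENE_ID", "MOUSE_HOMOLOG_RGD_ID",
    "MOUSE_HOMOLOG_SYMBOL", "MOUSE_HOMOLOG_NAME", "MOUSE_CHROMOSOME",
    "MGD_ID", "HUMAN_HOMOLOG_RGD_ID", "HUMAN_HOMOLOG_SYMBOL",
    "HUMAN_HOMOLOG_NAME", "HUMAN_CHROMOSOME", "GDB ID", "SSLP_RGD_ID",
    "SSLP_SYMBOL", "ALIAS_VALUE", "ALIAS_TYPES", "QTL_RGD_ID", "QTL_SYMBOL",
    "NOMENCLATURE_STATUS", "SPLICE_RGD_ID", "SPLICE_SYMBOL", "GENE_TYPE",
    "ENSEMBL_ID",
]

# Fields that were renamed.
RENAMED = {
    "SWISSPROT_ID": "UNIPROT_ID",
    "ENTREZ GENE": "ENTREZ_GENE",
    "GDB ID": "GDB_ID",
}

# Fields dropped from the new files (STOP_POS is covered by POSITION_FIELDS).
REMOVED = {
    "STOP_POS",
    "MOUSE_HOMOLOG_RGD_ID", "MOUSE_HOMOLOG_SYMBOL", "MOUSE_HOMOLOG_NAME",
    "MOUSE_CHROMOSOME", "MGD_ID",
    "HUMAN_HOMOLOG_RGD_ID", "HUMAN_HOMOLOG_SYMBOL", "HUMAN_HOMOLOG_NAME",
    "HUMAN_CHROMOSOME",
}

# Fields that take the place of START_POS and STOP_POS.
POSITION_FIELDS = [
    "START_POS_31", "STOP_POS_31", "STRAND_31",
    "START_POS_34", "STOP_POS_34", "STRAND_34",
]


def migrate_header(old_fields):
    """Map a list of old field names to the corresponding new field names."""
    new_fields = []
    for name in old_fields:
        if name == "START_POS":
            new_fields.extend(POSITION_FIELDS)
        elif name in REMOVED:
            continue
        else:
            new_fields.append(RENAMED.get(name, name))
    return new_fields


new_header = migrate_header(OLD_FIELDS)
```

A script that looks up fields by name could use a mapping like this to find which of its lookups need to change.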

 

We will be updating these new extracts weekly. Please contact me if you
have any questions regarding these files. I am also on the rat-forum
email list. 

 

George Kowalski

Medical College of Wisconsin - Project Lead RGD Database

Human and Molecular Genetics Center

414.456.5746    gkowalski at mcw.edu

 
