There are three things to take care of:
The two most important things are absolute consistency in naming image files, and knowing the actual sequences which were used to generate each image.
The closer you come to perfection the easier it will be for me to integrate your images into our distributed database.
And don’t be put off even if you only have 20 or 30 images - it’s all useful data.
In general, your image file names should be composed of two parts: <source-sequence>_<image-specific-data>.gif (or whatever image format you use).
The first part (<source-sequence>) should refer exactly to the sequence you used to generate the image.
Typically this will be a clone name or a cDNA accession number, but it may also be (say) a 5’ EST accession number, or it could be your own made-up ID, especially if you are using un-submitted sequences.
The second part (<image-specific-data>) is the bit that tells us what the image is of.
It typically contains a sequence of short codes which, again, must be used consistently.
These codes may include the following :
The name may also contain other elements that you feel would be useful to communicate to a viewer.
The parts of the name should be separated by ’-’ or ’_’. Please don’t use blanks, ’.’, or other difficult characters such as =, +, |, >, etc.
The parts of the name should always be in the same order for your whole set of data, and all the parts you use should be present for all genes, even if they are ’empty’ or ’NA’ for some.
E.g.
If this is done in such a way that I can retrieve the sequences from these names, and understand your codes, then that’s about all it needs.
If an image file is named inconsistently it may not be searchable, although this will probably become clear in the error checking phase.
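If you would like to catch naming problems before sending anything, a rough check along the following lines may help. It is only a sketch, not the error checking I actually run: it assumes that a valid name contains only letters, digits, ’-’ and ’_’ plus a single extension, and that every name in your set should split into the same number of parts. The function names and the example file names are made up for illustration.

```python
import re
from collections import Counter

# Assumption: a valid name is letters, digits, '-' and '_' followed by
# one extension (e.g. ".gif"). Blanks, extra dots, =, +, |, > etc. fail.
ALLOWED = re.compile(r"^[A-Za-z0-9_-]+\.[A-Za-z0-9]+$")


def split_name(filename):
    """Return the parts of the file name (extension removed), split on '-' or '_'."""
    stem = filename.rsplit(".", 1)[0]
    return re.split(r"[-_]", stem)


def check_names(filenames):
    """Return (filename, problem) pairs for names that look inconsistent."""
    problems = []
    valid = []
    for name in filenames:
        if ALLOWED.match(name):
            valid.append(name)
        else:
            problems.append((name, "contains blanks, dots or other difficult characters"))

    # Treat the most common number of parts as the convention for the set.
    counts = Counter(len(split_name(name)) for name in valid)
    if counts:
        expected = counts.most_common(1)[0][0]
        for name in valid:
            found = len(split_name(name))
            if found != expected:
                problems.append((name, f"has {found} parts, expected {expected}"))
    return problems


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    example = [
        "AB123456_st10_lat.gif",
        "AB123457_st12_dor.gif",
        "AB123458 st10.gif",
        "AB123459_st10.gif",
    ]
    for name, problem in check_names(example):
        print(f"{name}: {problem}")
```

Note that an ’empty’ part (two separators in a row) still counts as a part here, which matches the advice to keep every part present even when it has no value.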
Now you will need to generate a list of all your image files - but probably you have this already. There are ways of generating a list automatically (-ish) if the images are all in one folder.
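As one possible way of generating that list, here is a minimal sketch in Python. It assumes all the images sit directly in a single folder and that they can be recognised by their file extension; the set of extensions, the function names and the `image_file` column header are my own choices for illustration, so adjust them to your data.

```python
import csv
from pathlib import Path

# Assumption: these extensions cover your image files; edit as needed.
IMAGE_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def list_images(folder):
    """Return the sorted names of the image files found directly in `folder`."""
    return sorted(
        path.name
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


def write_list(names, out_path):
    """Write the file names to a one-column CSV file that opens in a spreadsheet."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["image_file"])  # hypothetical column header
        for name in names:
            writer.writerow([name])
```

For example, `write_list(list_images("my_images"), "image_list.csv")` would write the names of the images in a folder called `my_images` (a placeholder name) to `image_list.csv`.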
In addition, you may need to construct one or two simple spreadsheet data files to send to me, although these may not be necessary in all cases.
List 1. image to sequence reference table
If you have used an indirect reference to a sequence in the first part of your image name, then it will need to be in this data, with each image reference (the
So either :
(your images were named NP-AX-100024-etc...)
OR
(your images were named OB-000234-etc...)
Or you may provide a .fasta file [1] of your sequences.
N.B. If you are using your own sequences, say amplified from genomic or mRNA material, then much the best strategy is to submit these sequences to GenBank before (finally) naming the images, or generating the image data. The process is quite quick, and has the advantages that (a) you will have an accession number when you come to publish the data, and (b) it will be easier to incorporate the image in the Search Engine.
List 2. short code translations
This is very simple, and just enables me to get your images described correctly from your embedded codes.
E.g.
etc.
As many as you need - preferably in groups.
ALTERNATIVELY (tho’ not recommended)
You can provide a complete spreadsheet list of your image file names with a description for display and either the accession number to retrieve each sequence or the sequences themselves, e.g.
There are two options here :
put them in a folder on a local web-server (or any other that you can get access to !)
send them to me
I prefer the first option, and most of you will have some sort of Institute web server which should be happy to host your image data in a publicly accessible folder (that’s the optimistic view). This should be very easy with a bit of co-operation from your IT folk. If that’s not possible, just send them to me on a CD.
2003-2010 © Metamorphosys - All rights reserved
Last updated: Monday 3 May 2010