Data sources in mx are files that you would like to store on the server. They can be linked to data in mx, shared with other project members, and so on.
Translating a Nexus data source into mx
Data sources that are Nexus files exported from Mesquite can be translated into the relational format used by mx. This allows you to query and display your matrices on the web using mx.
To load a file using the web interface:
- Click on the Phylo tab, then Data sources
- Click `new` at the top right, then fill out the form, choosing the file you'd like to load
- After the record is created, show it; the options appear on the left. Click `convert` and you will see a report and some options for how to convert the file into the database.
- If you don't see a report after clicking `convert`, there has been a problem parsing the file (see the debugging section below for ways to solve this).
Using rake tasks to translate Nexus files into mx
Rake tasks are run from the command line (shell), from anywhere within the Rails application directory.
Debugging files against the parser
    rake mx:matrix:debug file=<full path to file>
If the matrix parses ok you'll see a small report about the number of characters, OTUs, etc. If it doesn't you'll get an error. Some errors are nicely captured, others not (yet).
Things to look for in files that aren't parsing
- Is the file exported from Mesquite? Other legal Nexus files may load, but chances are they contain issues we haven't come across yet that the parser can't handle.
- Labels of any type can't begin with "#" (yet)
- Do you have back to back single quotes in the file? These are almost always a problem. Mesquite allows these in labels but the parser in mx doesn't.
- Single and double quotes (" or ') - Mesquite apparently does not wrap its labels (things like character state or taxon names) in quotes in a consistent manner. This is likely to be the single biggest problem with the parser; such labels will have to be fixed manually. Some examples:

      '''this is a legal Mesquite label'' '   # BAD - space between quotes
      'This can't be handled either'          # BAD - single quote within single quotes
      'Andy's_taxon_name'                     # BAD - single quote within single quotes
      ~                                       # BAD - a tilde is bad
      "While that can't this can!"            # GOOD - single quote within double quotes
      'Do not "quote" me on that.'            # GOOD - double quotes within singles
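
A quick pre-check can catch some of these quoting problems before running the debug task. The following is a minimal Ruby sketch and is not part of mx: `suspect_label_problems` is a hypothetical helper, and its patterns are rough heuristics based only on the examples above. It does not check for labels beginning with "#", and it may report false positives.

```ruby
# Hypothetical pre-check for Nexus text; heuristics only, NOT the mx parser.
# Patterns applied to each line after double-quoted segments are blanked out,
# since single quotes inside double quotes are reported to parse fine.
SUSPECT_QUOTE_PATTERNS = {
  "back-to-back single quotes" => /''/,
  "apostrophe inside a word" => /\w'\w/
}.freeze

# Returns an array of [line_number, description] pairs for suspicious lines.
def suspect_label_problems(text)
  problems = []
  text.each_line.with_index(1) do |line, number|
    # A tilde is reported as bad wherever it occurs.
    problems << [number, "tilde"] if line.include?("~")

    # Ignore the contents of double-quoted labels for the quote checks.
    outside_double_quotes = line.gsub(/"[^"]*"/, '""')
    SUSPECT_QUOTE_PATTERNS.each do |description, pattern|
      problems << [number, description] if outside_double_quotes.match?(pattern)
    end
  end
  problems
end
```

To scan a file, pass its contents as a string, for example `suspect_label_problems(File.read(path))`, and fix the reported lines by hand before converting.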
Loading multiple matrices at once
Multiple matrices can be loaded at once from a rake task.
    rake mx:matrix:load datapath=/data/my_matrices metafile=foo.txt project=23 person=4 RAILS_ENV=[production|development]
- datapath - the full path to the folder containing the matrices and the metafile
- project - a legal mx project id
- person - a person in that project; will be used as the creator etc.
- metafile - a YAML file in the format described below
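
When the load is scripted, the arguments can be assembled in Ruby rather than typed by hand. This is only a sketch under stated assumptions: `matrix_load_argv` is a hypothetical helper (not something mx provides), and it assumes that project and person ids are integers, as in the example command above.

```ruby
# Hypothetical helper, not part of mx: builds the argument list for the
# mx:matrix:load rake task from the four parameters described above.
def matrix_load_argv(datapath:, metafile:, project:, person:, rails_env: "development")
  [
    "rake", "mx:matrix:load",
    "datapath=#{datapath}",
    "metafile=#{metafile}",
    # Assumption: ids are integers; Integer() raises ArgumentError otherwise.
    "project=#{Integer(project)}",
    "person=#{Integer(person)}",
    "RAILS_ENV=#{rails_env}"
  ]
end

argv = matrix_load_argv(datapath: "/data/my_matrices", metafile: "foo.txt",
                        project: 23, person: 4, rails_env: "production")
puts argv.join(" ")
# To actually run the task from within the Rails application: system(*argv)
```

Passing the list to `system(*argv)` avoids shell quoting problems with paths that contain spaces.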
The YAML metadata file
Format will likely change!!
    ---
    MX_test_01.nex:                           # the filename
      :data_source_ref_id: 12                 # referenced everywhere a ref_id is needed
      :title: "foo bar a licious 2"           # the matrix title
      :data_source_name: "test 1231233555"    # the data source title (essentially the same thing in this case)
      :notes: "Some note here."               # does nothing (but it should)
      :generate_short_chr_name: false         # should work
      :generate_otu_name_with_ds_id: false    # should work
      :generate_chr_name_with_ds_id: false    # should work
      :match_otu_to_db_using_name: false      # should work, but careful, attempts to code identical states currently fail
      :match_otu_to_db_using_matrix_name: false
      :match_chr_to_db_using_name: false
      :generate_chr_with_ds_ref_id: false     # needs a valid ref_id (not tested)
      :generate_otu_with_ds_ref_id: false     # needs a valid ref_id (not tested)
      :generate_tags_from_notes: false        # creates a Tag for notes
      :generate_tag_with_note: false          # if not false, appends the included text to the notes of the Tag
    Another_file.nex:
      ...
    Yet_another_file:
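
Before running the load task it can help to confirm that the metafile parses and that every file it names is present in the data folder. The sketch below is an illustration, not mx code: `metafile_entries` and `missing_matrices` are hypothetical helpers. It assumes a Ruby version whose `YAML.safe_load` accepts the `permitted_classes:` keyword, and that option keys written with a leading colon should load as Ruby symbols.

```ruby
require "yaml"

# Hypothetical helper, not part of mx: parses metafile text into a Hash of
# filename => options. Symbol is permitted so keys such as :title load as symbols.
def metafile_entries(yaml_text)
  data = YAML.safe_load(yaml_text, permitted_classes: [Symbol])
  raise ArgumentError, "metafile must map filenames to options" unless data.is_a?(Hash)
  data
end

# Returns the filenames listed in the metafile that are not files in datapath.
def missing_matrices(entries, datapath)
  entries.keys.reject { |name| File.file?(File.join(datapath, name)) }
end

example = <<~YAML
  ---
  MX_test_01.nex:
    :data_source_ref_id: 12
    :title: "foo bar a licious 2"
    :generate_short_chr_name: false
YAML

entries = metafile_entries(example)
puts entries["MX_test_01.nex"][:title]
```

A call such as `missing_matrices(entries, "/data/my_matrices")` then lists any matrices named in the metafile that still need to be copied into the folder.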