Quick introduction
Quick start
Enter a search term in the "Search for a protein" field and press search. Click the "view" button for the protein of interest.
Input options
Protein search
The "Search" field is linked to the UniProt protein search engine (Fig. 1). Use any term to search for the protein in the UniParc database (e.g. Cyclin-dependent kinase inhibitor 1B). To increase the relevance of the retrieved UniParc entries, additional terms such as species and gene name can be added. If the retrieved list doesn't include the protein of interest, the UniProt accession (e.g. P46527) or UniProt ID (e.g. CDKN1B) can be used. After the search results are retrieved, select the protein of interest from the result list. Valid searches are for example:
Custom input
The "Custom alignment" input option allows the user to enter plain text or upload a file. The input
has to be in FASTA format and can be either a single protein or a multiple sequence alignment. In a
multiple sequence alignment, the first sequence is used as the query sequence.
Fig. 2. ProViz custom alignment input. (1) Text field to input protein sequence or multiple
sequence alignment in FASTA format. (2) Submit button to create the ProViz visualisation with the
alignment provided. (3) File uploader to provide a protein sequence or multiple sequence alignment
by uploading a FASTA file.
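A minimal sketch of how a custom alignment could be checked before submission, assuming only the rules stated above (FASTA format, first sequence is the query). The `parse_fasta` helper and the example sequences are hypothetical and are not part of ProViz.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up aligned sequences, for illustration only.
example = """>query_protein
MSNVRVSNGS-PSLE
>homologue_1
MSNVRLSNGSRPSLE
"""

records = parse_fasta(example)
# The first record is the one ProViz would treat as the query sequence.
query_header, query_sequence = records[0]
# In a multiple sequence alignment every row has the same length.
is_aligned = len({len(seq) for _, seq in records}) == 1
print(query_header, is_aligned)
```

The same text can then be pasted into the text field or saved to a file for the uploader.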
URL input
To access the main visualisation directly a set of URL options can be used. An extensive list is available in the "URL options" section.
Main visualisation
Overview
Alignment
The alignment section of the main visualisation shows the query sequence and, if available, a homologue alignment from Quest for Orthologues or GeneTree. The left panel shows the name of each species with links to UniProt or Ensembl. The right side displays the alignment coloured in the ClustalX scheme.
Features
The feature section shows tracks containing information associated with the query protein. These tracks are represented in rows which are grouped by type of data shown, with the left part describing the type of data shown and the right showing the data in one of three available formats. Features mapping to a continuous segment of the query protein (e.g. domains or transmembrane regions) are displayed as horizontal bars spanning the corresponding residues. Peptide tracks are similar to bar tracks, but display amino acids aligned with the corresponding residue in the query protein. Histogram tracks display quantitative data for the protein on a residue by residue basis. Data is displayed as vertical bars corresponding to the value given to the residue.
Tool bar
The name of the protein and gene, together with the species, is displayed on the top left. On the right-hand side the user can choose between different alignments. The buttons to the right of the alignment selector give access to additional functions: uncollapsing the alignment, searching for residues by regular expression, showing the compact view, highlighting areas of interest, showing the protein architecture overview and recolouring the alignment. Reset and home buttons are at the end.
The options panel contains options to modify different aspects of the main visualisation. The session section allows the user to create or reset a session and download a PDF of the visualisation. The sequence section contains the compact view option. The architecture overview can be enabled in the architecture section. The selection area contains controls for the slider, focus and resize options. The add new track section allows the user to add a custom feature file by drag and drop or file upload. Download options for the sequence data are available in the download section.
The protein tab of the sidebar contains a list of all hidden proteins and allows the user to restore each individually or hide and show all sequences at once.
The features tab of the sidebar contains all available features and feature groups. Each feature or feature group can be hidden or shown by the toggle buttons next to it. The show and hide all buttons will switch all features on or off.
The help tab of the sidebar displays the help for the main visualisation.
The about tab of the options sidebar contains the about section describing ProViz and key features.
Databases and programs
Databases
ProViz utilises many databases to give the user a wide range of information about the protein of interest. Multiple sequence alignments are retrieved from Quest for Orthologues and GeneTree. Protein modularity data are provided by ELM, Pfam and Phospho.ELM. For structural information, PDB, DSSP and homology models from SWISS-MODEL are used. Genomics data are retrieved from DbSNP and 1000 genomes, and isoforms from UniProt. Additional curated data are available from UniProt and Switches.ELM.
Predictions
ProViz includes data from various predictive programs and databases, like conservation, ELM,
MobiDB, IUPred, PsiPred and Anchor.
Data sources table
Name | Description | PMID | URL |
Multiple sequence alignments | |||
GeneTree | Homo/Para/orthologue alignments and gene duplication information | 19029536 | www.ensembl.org |
GOPHER | Orthologue alignments by reciprocal best hit | 17576682 | bioware.ucd.ie |
Quest for orthologues | Datasets of homologous genes | 18819722 | questfororthologs.org |
Protein modularity | |||
ELM | Manually curated linear motifs | 26615199 | elm.eu.org |
Pfam | Functional regions and binding domains | 24288371 | pfam.xfam.org |
Phospho.ELM | Experimentally verified phosphorylation sites | 21062810 | phospho.elm.eu.org |
Structural information | |||
PDB | Experimentally resolved protein tertiary structures | 10592235 | www.rcsb.org |
DSSP | Secondary structure derived from PDB tertiary structures | 25352545 | swift.cmbi.ru.nl/gv/dssp |
Homology models/ SWISS-MODEL | Assigned tertiary structure by sequence similarity to resolved structure | 24782522 | swissmodel.expasy.org |
Genomic data | |||
DbSNP | Single-nucleotide polymorphism with disease association and genotype information | 11125122 | www.ncbi.nlm.nih.gov/SNP |
1000 genomes | Single-nucleotide polymorphism | 23128226 | www.1000genomes.org |
Isoforms | Alternative splicing | 25348405 | www.uniprot.org |
Additional curated data | |||
Mutagenesis | Experimentally validated point mutations and effect | 25348405 | www.uniprot.org |
Regions of interest | Experimentally validated functional areas | 25348405 | www.uniprot.org |
Switches.ELM | Experimentally validated motif-based molecular switches | 23550212 | switches.elm.eu.org |
Prediction | |||
MobiDB | Collection of various disorder prediction methods | 25361972 | mobidb.bio.unipd.it |
IUPred | Intrinsically disordered regions | 15769473 | iupred.enzim.hu |
PsiPred | Secondary structure for human proteins | 23748958 | bioinf.cs.ucl.ac.uk/psipred |
Anchor | Binding sites in disordered regions | 19412530 | anchor.enzim.hu |
ELM | Linear motifs by regular expression | 26615199 | elm.eu.org |
Conservation | Conservation of residues across the alignment | 22977176 | bioware.ucd.ie |
PDF download
The PDF download is available via buttons in the options sidebar or in the toolbar on the top right. It downloads a PDF document of the ProViz visualisation.
URL options
Users can construct URLs to access and customise protein visualisations using the URL options below:
URL options table
URL option | Description | Input type | Example |
uniprot_acc | UniProt accession of the protein to be visualised | string: UniProt accession | uniprot_acc=P46527 |
alignment | Type of homology alignment to be displayed | string: QFO,[TaxonID] | alignment=Metazoa |
ali_start | Starting residue for the scope of the alignment | integer > 0 | ali_start=10 |
ali_end | Ending residue for the scope of the alignment. Will be set to protein length, if greater than protein length. | integer > 0 | ali_end=20 |
disable | Disables feature tracks by providing the names of features. This prevents loading of data for the named features; they cannot be activated without reloading the page. | string: motif, elm, modification, phospho, mutagenesis, pfam, structure, PDB, homology, splice_variant, SNP, chain, dna_binding, region, metal_binding, site, cross_link, iupred | disable=motif,modification,SNP |
collapse | Collapses feature groups by providing the names of feature groups. Features are loaded and hidden, but can be activated in the options panel. | string: alignment, switch, motif, modification, mutation, structure, PDB, isoform, snp, feature, disorder | collapse=alignment,PDB |
hideAln | Hide proteins by providing accessions separated by commas | string: UniProt accession | hideAln=H2Q5H2,F6Z4RO |
showAln | Show proteins by providing accessions separated by commas. All other sequences will be hidden | string: UniProt accession | showAln=H2Q5H2,F6Z4RO |
genetree_mode | If alignment is set to GeneTree select paralog, ortholog or all | string: paralog, ortholog, all | genetree_mode=paralog |
url_rest | Providing a URL pointing to a custom track file will load the visualisation for the custom data automatically | string: file URL | rest_url=http://slim.ucd.ie/proviz/help/custom_track/track.xml |
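The options in the table above can also be combined programmatically. The following is a minimal sketch, assuming the base address shown in this help page; `build_proviz_url` is a hypothetical helper name, not part of ProViz, and only option names from the table are used.

```python
from urllib.parse import urlencode

# Base address taken from the URL example in this help page.
PROVIZ_BASE = "http://slim.ucd.ie/proviz/proviz.php"


def build_proviz_url(uniprot_acc, **options):
    """Return a ProViz URL for one protein (hypothetical helper).

    List or tuple values are joined with commas, the separator used by
    the disable, collapse, hideAln and showAln options.
    """
    params = {"uniprot_acc": uniprot_acc}
    for key, value in options.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        params[key] = value
    # safe="," keeps commas readable instead of encoding them as %2C.
    return PROVIZ_BASE + "?" + urlencode(params, safe=",")


url = build_proviz_url(
    "P46527",
    alignment=33208,
    disable=["PDB"],
    ali_start=140,
    ali_end=198,
)
print(url)
```

Passing a list such as `disable=["motif", "modification", "SNP"]` yields the comma-separated form shown in the table.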
URL example
http://slim.ucd.ie/proviz/proviz.php?uniprot_acc=P46527&alignment=33208&disable=PDB&ali_start=140&ali_end=198
Use of custom data
Users are able to add custom data to any existing ProViz visualisation. This can be achieved by providing a file in either XML, CSV or JSON format by drag and drop, file upload or link to a server providing the file via REST service.
XML
The XML file has to start and end with the "tracks" tag. Each track starts and ends with the
"track" tag, which has the mandatory option "type" (feature, peptide or histogram) and also
accepts the options "name", "position", "colour" and "opacity". The "track" tag accepts multiple
"entry" tags. The "entry" tag requires the "start" and "end" options for type feature or peptide,
or the "position" option for histogram. It also accepts "text" for feature, "value" for histogram
and "sequence" for peptide, as well as "hover", "link", "text_colour", "colour" and "opacity".
An example of the format is available at:
XML
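As an illustration of the structure described above, the following sketch builds a small track file with Python's standard library. It assumes that the "options" are XML attributes and that "position" takes the same -1 / 1 values as in the CSV format; both are assumptions, and all track names and values are made up.

```python
import xml.etree.ElementTree as ET

# Root element: the file starts and ends with the "tracks" tag.
tracks = ET.Element("tracks")

# A feature track; "type" is the mandatory option (assumed to be an attribute).
feature_track = ET.SubElement(
    tracks,
    "track",
    {"type": "feature", "name": "Example regions", "position": "1", "colour": "#3366CC"},
)
# Feature entries need "start" and "end"; "text" and "hover" are optional.
ET.SubElement(
    feature_track,
    "entry",
    {"start": "10", "end": "25", "text": "Region A", "hover": "Made-up region"},
)

# A histogram track; its entries use "position" and "value" instead.
histogram_track = ET.SubElement(
    tracks, "track", {"type": "histogram", "name": "Example scores", "position": "1"}
)
for position, value in [(10, 0.2), (11, 0.8), (12, 0.5)]:
    ET.SubElement(
        histogram_track, "entry", {"position": str(position), "value": str(value)}
    )

xml_text = ET.tostring(tracks, encoding="unicode")
print(xml_text)
```

A file produced this way should still be checked against the XML schema before upload.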
XML schema
An XML schema can be used to design and validate a ProViz readable XML file and is available for download here: XML schema
CSV
The CSV file consists of two parts, the header line and the data lines which both have to be
comma separated. The header line specifies the available fields and is mandatory, but the fields
can be in any order. The data lines contain the data to be visualised and each line represents
one element called entry. Fields starting with "t_" are track fields and only have to be defined
once and will automatically be applied to all other lines with the same track number. Each track
requires the "track", "t_type" and "t_position" fields, all other track fields are optional. The
remaining fields define the shown element (entry fields). Required entry fields are "entry",
either "start" and "end" or "position" as well as one of "text", "value" and "sequence".
An example of the format is available at:
CSV
CSV options
Field name | Description | Input type |
track | Numbering of tracks | integer > 0 |
t_name | Track name | string |
t_type | Track type | string: feature, peptide, histogram |
t_position | Track position: -1 displays the track above the main sequence and alignment, 1 displays it below | integer: -1 / 1 |
t_colour | Default colour of all elements in the given track | string: #000000 - #FFFFFF |
t_opacity | Default opacity of all elements in the given track | double: 0 - 1 |
t_text_colour | Default colour of the text of all elements in the given track | string: #000000 - #FFFFFF |
t_help | Help tooltip for the track | string |
entry | Numbering of entries | integer > 0 |
text | Text displayed for the entry. For feature tracks only. | string |
value | Value determining the height of the bars in a histogram. For histogram tracks only. | double |
sequence | Sequence of the peptide. For peptide tracks only. | string |
hover | Tooltip for the given entry | string |
link | Link activated by clicking the given entry | string: URL |
start | Start position of the entry. Only for feature and peptide tracks. Needs an end value. | integer > 0 |
end | End position of the entry. Only for feature and peptide tracks. Needs a start value. | integer > 0 |
position | Position of the entry. Only for histogram tracks. | integer > 0 |
colour | Colour of the given entry. Overrides the "t_colour" property. | string: #000000 - #FFFFFF |
opacity | Opacity of the given entry. Overrides the "t_opacity" property. | double: 0 - 1 |
text_colour | Text colour of the given entry. Overrides the "t_text_colour" property. | string: #000000 - #FFFFFF |
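The CSV layout described above can be produced with Python's `csv` module. This is a sketch using only field names from the table; the track names and values are made up, and it assumes that fields which do not apply to a line may be left empty.

```python
import csv
import io

# Header fields; the order is free according to the format description.
fields = [
    "track", "t_name", "t_type", "t_position", "t_colour",
    "entry", "start", "end", "position", "text", "value",
]

rows = [
    # Track 1 (feature): the t_ fields are given once, on the first line.
    {"track": 1, "t_name": "Example regions", "t_type": "feature",
     "t_position": 1, "t_colour": "#3366CC",
     "entry": 1, "start": 10, "end": 25, "text": "Region A"},
    {"track": 1, "entry": 2, "start": 40, "end": 55, "text": "Region B"},
    # Track 2 (histogram): entries use "position" and "value".
    {"track": 2, "t_name": "Example scores", "t_type": "histogram",
     "t_position": 1, "entry": 1, "position": 10, "value": 0.2},
    {"track": 2, "entry": 2, "position": 11, "value": 0.8},
]

buffer = io.StringIO()
# restval="" writes an empty cell for every field missing from a row.
writer = csv.DictWriter(buffer, fieldnames=fields, restval="")
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

Writing `csv_text` to a file gives something that can be tried via drag and drop or file upload.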
JSON
The JSON file contains a standard JSON object. The object is an array filled with one dictionary
per track. These dictionaries contain the "type", "position", "name", "colour", "opacity",
"text_colour" variables which represent the default values for all entries in the track and the
"data" array filled with one dictionary per entry. The entry dictionary provides the variables
"start", "end", "text", "link", "colour", "opacity", "hover" and "text_colour" to customise the
entry further.
An example of the format is available at:
JSON
JSON options
Field name | Description | Input type |
name | Track name | string |
type | Track type | string: feature, peptide, histogram |
position | Track position: -1 displays the track above the main sequence and alignment, 1 displays it below | integer: -1 / 1 |
colour | Default colour of all elements in the given track | string: #000000 - #FFFFFF |
opacity | Default opacity of all elements in the given track | double: 0 - 1 |
text_colour | Default colour of the text of all elements in the given track | string: #000000 - #FFFFFF |
help | Help tooltip for the track | string |
text | Text displayed for the entry. For feature tracks only. | string |
value | Value determining the height of the bars in a histogram. For histogram tracks only. | double |
sequence | Sequence of the peptide. For peptide tracks only. | string |
hover | Tooltip for the given entry | string |
link | Link activated by clicking the given entry | string: URL |
start | Start position of the entry. Only for feature and peptide tracks. Needs an end value. | integer > 0 |
end | End position of the entry. Only for feature and peptide tracks. Needs a start value. | integer > 0 |
position | Position of the entry. Only for histogram tracks. | integer > 0 |
colour | Colour of the given entry. Overrides the track-level "colour" property. | string: #000000 - #FFFFFF |
opacity | Opacity of the given entry. Overrides the track-level "opacity" property. | double: 0 - 1 |
text_colour | Text colour of the given entry. Overrides the track-level "text_colour" property. | string: #000000 - #FFFFFF |
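The JSON structure described above (an array of track dictionaries, each with a "data" array of entry dictionaries) can be sketched as follows. Only field names from the table are used; all names and values are made up for illustration.

```python
import json

tracks = [
    {
        # Track-level defaults for a feature track.
        "name": "Example regions",
        "type": "feature",
        "position": 1,
        "colour": "#3366CC",
        "opacity": 0.8,
        # One dictionary per entry.
        "data": [
            {"start": 10, "end": 25, "text": "Region A", "hover": "Made-up region"},
            # An entry-level colour overrides the track default.
            {"start": 40, "end": 55, "text": "Region B", "colour": "#CC3333"},
        ],
    },
    {
        # Histogram entries use "position" and "value" instead of a range.
        "name": "Example scores",
        "type": "histogram",
        "position": 1,
        "data": [
            {"position": 10, "value": 0.2},
            {"position": 11, "value": 0.8},
        ],
    },
]

json_text = json.dumps(tracks, indent=2)
print(json_text)
```

Building the structure as Python data and serialising it with `json.dumps` avoids hand-written quoting and bracket errors.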
Legal