PDBest (PDB Enhanced Structures Toolkit) is a user-friendly, freely available platform for acquiring, manipulating and normalizing protein structures in a high-throughput and seamless fashion. The platform has an intuitive graphical interface that allows researchers and students with no programming background to download and manipulate their files without using the command line. The platform can also save protocols, enabling users to easily share PDB searching and filtering data and improving the reproducibility of subsequent analyses.
The software platform was developed in C++ on the Qt framework, providing high performance on all major operating systems: Windows, Linux, and Mac OS X.
In PDBest, users can provide input files (1) from a local repository, (2) by downloading them from the RCSB Protein Data Bank mirror via an online query built from their own search parameters, or (3) through a combination of both.
Using the "Online Query" option, users can search for biomolecules using all parameters available at the RCSB Protein Data Bank, combining the search criteria with the logical operators "and"/"or" in any combination, which allows very specific and sophisticated queries to be performed.
Furthermore, files can be removed according to different sequence similarity thresholds using the "Remove Similar Sequence at" list box.
Users can compose their queries via the box menu, add each query option with the "Insert" button, and then add or modify its parameters. A query option can be removed using the "x" icon on the window, or the whole query can be reset with the "clear" button. The "Submit" button sends the query to the RCSB database, and the list of matching PDB files is shown so that it can be analysed before the processing step. The query can be changed or refined at any time before processing. The PDB identifiers can be viewed via the "Show" button, and the list can be saved.
The "Add" button can be used to manually include PDB identifiers.
There is no limit on the number of molecules that can be acquired and processed by PDBest.
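The "and"/"or" combination of search criteria described above can be pictured as a small tree of criteria. The following is a minimal sketch of that idea only, evaluated against in-memory records; it is not PDBest's implementation and it does not contact the RCSB service. The class names, the metadata field names and the sample entries (including their identifiers) are all hypothetical.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

Entry = Dict[str, object]  # hypothetical metadata record for one PDB entry


@dataclass
class Criterion:
    """A single search criterion: a field, a comparison and a value."""
    field: str
    test: Callable[[object, object], bool]
    value: object

    def matches(self, entry: Entry) -> bool:
        # An entry lacking the field never satisfies the criterion.
        return self.field in entry and self.test(entry[self.field], self.value)


@dataclass
class Group:
    """Criteria (or nested groups) joined by the logical operator 'and' or 'or'."""
    logic: str
    nodes: List[Union[Criterion, "Group"]]

    def matches(self, entry: Entry) -> bool:
        if self.logic not in ("and", "or"):
            raise ValueError(f"unknown logical operator: {self.logic!r}")
        results = (node.matches(entry) for node in self.nodes)
        return all(results) if self.logic == "and" else any(results)


# Made-up entries for illustration; the identifiers and values are not real data.
entries: List[Entry] = [
    {"id": "AAAA", "method": "X-RAY", "resolution": 1.8, "organism": "Mus musculus"},
    {"id": "BBBB", "method": "NMR", "organism": "Homo sapiens"},
    {"id": "CCCC", "method": "X-RAY", "resolution": 2.9, "organism": "Homo sapiens"},
    {"id": "DDDD", "method": "X-RAY", "resolution": 3.1, "organism": "Escherichia coli"},
]

# method == X-RAY and (resolution <= 2.0 or organism == Homo sapiens)
query = Group("and", [
    Criterion("method", operator.eq, "X-RAY"),
    Group("or", [
        Criterion("resolution", operator.le, 2.0),
        Criterion("organism", operator.eq, "Homo sapiens"),
    ]),
])

hits = [entry["id"] for entry in entries if query.matches(entry)]
```

Nesting a `Group` inside another `Group` is what allows "and" and "or" to be mixed in any combination.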
The input PDB files can be provided from a local repository through three option buttons:
PDB files must have unique names, even if they are in different directories; otherwise only one instance will be considered. Files shown in the list box are sent to the processing section.
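The unique-name rule can be illustrated with a short sketch that gathers PDB files from several directories and keeps one path per file name. The function name is hypothetical, and two details are assumptions rather than documented PDBest behaviour: that the first instance encountered is the one kept, and that only files ending in `.pdb` are collected.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def collect_unique_pdb_files(directories: Iterable[Path]) -> List[Path]:
    """Return one path per distinct file name found under the given directories."""
    seen: Dict[str, Path] = {}
    for directory in directories:
        # Sorted so the result does not depend on file-system listing order.
        for path in sorted(Path(directory).rglob("*.pdb")):
            # Assumption: the first instance of a name wins; later ones are ignored.
            seen.setdefault(path.name, path)
    return list(seen.values())
```

With this rule, `a/4hhb.pdb` and `b/4hhb.pdb` would count as the same input, so renaming one of them is the only way to process both.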
The "Configurations" section includes two main options, "Filtering Standards" and "Output". After loading the files, users can choose among many filtering options to apply, as well as decide where to store the filtered files and which naming conventions to use.
PDBest can manipulate PDB files by applying a series of filters or processing parameters, which allow users to select the information relevant to their analyses. A new file is created with the selected records, and the original file is preserved.
The PDB file format definition establishes a set of standards to be followed during structure deposition, and its sections are extensively described at the wwPDB. The PDBest filtering criteria are divided into the following sections. The user can choose to keep or discard records by selecting the corresponding check boxes:
On the residue component tab, users can choose to select or remove specific residues based on their 3-letter code. It is also possible to filter atoms with multiple occupancies (keep the greater or the lower occupancy, or keep atoms without occupancy) and to renumber residues.
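To make these residue-level filters concrete, the sketch below applies them to the fixed-column ATOM/HETATM records of a PDB file. It is an illustration under stated assumptions, not PDBest's code: the function names are hypothetical, the input is assumed to hold a single model, a blank occupancy field is treated as 0.0, and other record types (including ANISOU and TER) are passed through untouched.

```python
from typing import Dict, Iterable, List, Set, Tuple


def is_atom_record(line: str) -> bool:
    return line.startswith(("ATOM  ", "HETATM"))


def drop_residues(lines: Iterable[str], names: Set[str]) -> List[str]:
    """Remove atoms whose 3-letter residue code (columns 18-20) is in `names`."""
    return [l for l in lines if not (is_atom_record(l) and l[17:20].strip() in names)]


def keep_highest_occupancy(lines: Iterable[str]) -> List[str]:
    """Among alternate locations of the same atom, keep the greatest occupancy."""
    lines = list(lines)
    best: Dict[Tuple[str, str, str], Tuple[float, int]] = {}
    for index, line in enumerate(lines):
        if not is_atom_record(line):
            continue
        # Same atom = same chain, residue number + insertion code, atom name.
        key = (line[21], line[22:27], line[12:16])
        text = line[54:60].strip()
        occupancy = float(text) if text else 0.0  # assumption: blank counts as 0.0
        if key not in best or occupancy > best[key][0]:
            best[key] = (occupancy, index)  # ties keep the first occurrence
    winners = {index for _, index in best.values()}
    return [l for i, l in enumerate(lines) if not is_atom_record(l) or i in winners]


def renumber_residues(lines: Iterable[str], start: int = 1) -> List[str]:
    """Number residues consecutively within each chain, clearing insertion codes."""
    counters: Dict[str, int] = {}
    last_seen: Dict[str, str] = {}
    out: List[str] = []
    for line in lines:
        if not is_atom_record(line):
            out.append(line)
            continue
        chain, original = line[21], line[22:27]
        if last_seen.get(chain) != original:
            counters[chain] = counters.get(chain, start - 1) + 1
            last_seen[chain] = original
        # Columns 23-26 hold the residue number, column 27 the insertion code.
        out.append(f"{line[:22]}{counters[chain]:>4} {line[27:]}")
    return out
```

Applying `keep_highest_occupancy` before `renumber_residues` keeps the numbering independent of which alternate location survives.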
On the "other informations" tab, it is possible to filter models (for instance, from structures determined by NMR), remove solvent molecules, and filter TER, ANISOU and HETATM records.
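The record-level filters on this tab can be sketched as a single pass over the lines of a PDB file. This is an illustrative sketch rather than PDBest's implementation: the function and parameter names are hypothetical, and "solvent" is assumed here to mean water only (residue names HOH and DOD).

```python
from typing import AbstractSet, Iterable, List, Optional

SOLVENT_NAMES = frozenset({"HOH", "DOD"})  # assumption: water and heavy water only


def filter_records(
    lines: Iterable[str],
    keep_model: Optional[int] = None,
    drop_records: AbstractSet[str] = frozenset(),
    drop_solvent: bool = False,
) -> List[str]:
    """Keep one model, drop whole record types and, optionally, solvent atoms."""
    out: List[str] = []
    current_model: Optional[int] = None  # None while outside a MODEL/ENDMDL block
    for line in lines:
        record = line[:6].strip()
        if record == "MODEL":
            current_model = int(line[6:].split()[0])
        unwanted_model = (
            keep_model is not None
            and current_model is not None
            and current_model != keep_model
        )
        if record == "ENDMDL":
            current_model = None  # reset only after the check above
        if unwanted_model:
            continue
        if record in drop_records:
            continue
        is_atom = record in ("ATOM", "HETATM")
        if drop_solvent and is_atom and line[17:20].strip() in SOLVENT_NAMES:
            continue
        out.append(line)
    return out
```

For an NMR ensemble, a call such as `filter_records(lines, keep_model=1, drop_records={"ANISOU"}, drop_solvent=True)` would keep only the first model, without ANISOU records or water.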
The Output configurations allow users to select the directory to which PDB files are downloaded from the PDB online repository when online queries are performed, and the directory where processed files are stored after the filters are applied. The user may also include an expression in the output file names and choose the output format. For example, with the expression ".filtered", every processed file name will contain this term, as in 4HHB.filtered.pdb.
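The naming convention in this example can be written as a one-line rule. The sketch below assumes that the expression is inserted between the original file stem and the extension, which is what the 4HHB.filtered.pdb example suggests; the function name, its parameters and the directory names are hypothetical.

```python
from pathlib import Path


def output_path(source: str, output_dir: str, expression: str = "",
                extension: str = ".pdb") -> Path:
    """Build the processed file's path: <output_dir>/<stem><expression><extension>."""
    stem = Path(source).stem  # "4HHB" for "downloads/4HHB.pdb"
    return Path(output_dir) / f"{stem}{expression}{extension}"


result = output_path("downloads/4HHB.pdb", "processed", ".filtered")
```

Because the result is a new path, the original file is left in place, consistent with the description of filtering above.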
After submitting queries to the RCSB Protein Data Bank and/or loading local files and choosing filtering criteria, the files are listed on a grid in the Process section. Up to a hundred file identifiers can be seen at once. The grid shows the origin of each file (online/local) and its status (to be downloaded/ready). Files with the "To be downloaded" status are downloaded by pressing the "Start Download" button.
Files can be selected and removed from the grid before processing.
Users can check for inconsistencies in the file format with the Verify Inconsistencies button. Files with errors are marked on the grid. The issues that PDBest checks for in the structures are missing atoms or residues, atoms with multiple occupancy, and non-standard residues.
The Open Detailed file button opens a window with a complete report on the inconsistencies found. The report can also be saved to a file.
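A rough idea of such a check can be given with a short sketch that scans ATOM/HETATM records and collects findings into a report. It is a simplified heuristic, not PDBest's algorithm: possibly missing residues are inferred only from gaps in residue numbering within a chain, missing atoms are not checked at all, "standard" is assumed to mean the 20 common amino acids, and water (HOH) is not reported. The function name and report keys are hypothetical.

```python
from typing import Dict, Iterable, List

# Assumption: "standard" means the 20 common amino acids.
STANDARD_RESIDUES = frozenset(
    "ALA ARG ASN ASP CYS GLN GLU GLY HIS ILE "
    "LEU LYS MET PHE PRO SER THR TRP TYR VAL".split()
)


def find_inconsistencies(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Report alternate locations, non-standard residues and numbering gaps."""
    report: Dict[str, List[str]] = {
        "alternate_locations": [],
        "non_standard_residues": [],
        "numbering_gaps": [],
    }
    last_number: Dict[str, int] = {}  # chain -> last ATOM residue number seen
    for line in lines:
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        res_name = line[17:20].strip()
        chain = line[21]
        number = int(line[22:26])
        label = f"{res_name} {chain}{number}"
        # A non-blank column 17 marks an alternate location (multiple occupancy).
        if line[16] != " " and label not in report["alternate_locations"]:
            report["alternate_locations"].append(label)
        if (res_name not in STANDARD_RESIDUES and res_name != "HOH"
                and label not in report["non_standard_residues"]):
            report["non_standard_residues"].append(label)
        # Heuristic: a jump in ATOM residue numbering suggests missing residues.
        if line.startswith("ATOM  "):
            previous = last_number.get(chain)
            if previous is not None and number > previous + 1:
                report["numbering_gaps"].append(f"chain {chain}: {previous} -> {number}")
            last_number[chain] = number
    return report
```

The returned dictionary can be printed or written to a file, mirroring the idea of a saved report.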
The "Process" button applies the selected filters, generating the new files.