Jul 22, 2025
Horokhovskyi, Yehor; Roetschke, Hanna P; Cormican, John A.; Pašen, Martin; Mishto, Michele; Liepe, Juliane, 2025, "Source Data from "An automated workflow to address proteome complexity and the large-search-space problem in proteomics and HLA-I immunopeptidomics."", https://doi.org/10.17617/3.2M9RDY, Edmond, V1
The Source Data associated with the publication. Provided are the gene fusion detection results, predictions of peptide binding to MHC-I, MS search spaces, search engine outputs, tool parameters and post-aggregated analysis results of the search space size and multimapping statistics.
ZIP Archive - 6.4 GB - MD5: 78db2330172cd21cd3fc15663b4907cf
Sequoia constructed reference databases. This file contains GENCODE-driven and RNA-seq-informed reference databases in FASTA format and ORF annotations in tabular format (37.8 GB on disk). RNA-seq data were derived from the K562 cell line and used for reference-guided transcriptome assembly, quantification and ORF prediction. Provided are the GENCODE v...
MS Excel Spreadsheet - 3.7 MB - MD5: 336716d13899598876da4869ed42544d
Peptide multi-mapping within strata. This file contains the numbers of peptides per number of multimapping origins within proteogenomic strata. For exhaustive transcriptomic and genomic ORFs, the peptide mapping was checked at the ORF level; for reference proteome, CDS off-frame and cis25-spliced peptides, the mapping was also checked at the gene level. Both...
MS Excel Spreadsheet - 13.4 KB - MD5: b67a7ed20bcc54bc467f4e736bb91f30
Comparison between theoretical and computationally explored strata sizes. The numbers of unique unspecific peptide sequences and of theoretical peptide sequences for RNA-seq-informed and GENCODE-driven strata are shown per peptide length. This table refers to Supplementary Fig. 7e. The data refer to the K562 cell line.
MS Excel Spreadsheet - 1.5 MB - MD5: d0a2e1291f7666686916c6dbe1dc694d
Peptide multi-mapping across strata pairs. This file contains the pairwise comparisons of distinct peptide sets between proteogenomic strata. For every pair of databases, peptide length and catalytic rule, the numbers of peptides distinct to each stratum are provided, as well as the intersection size and the ratios of the intersection to each set. Unfilter...
MS Excel Spreadsheet - 30.9 KB - MD5: dea2726c20e431f766a2ffcb969202c8
Strata search space sizes across SPIsnake filtering steps in B721.221. Number of unique peptides derived by tryptic or proteasomal digestion from RNA-seq-informed and uninformed search spaces for all peptide lengths. Strata reduction during data-driven filtering is reported for the B721.221 cell line expressing HLA-A*02:01 or HLA-B*07:02.
ZIP Archive - 20.0 GB - MD5: b74511615010607270b9a8f42025ebe8
FASTA Files of MW-filtered canonical and expanded reference databases. The zip file contains two FASTA files of target and decoy peptide sequences for canonical and expanded databases. Decoy sequences are indicated by the label “rev_” in the FASTA header.
ZIP Archive - 7.3 GB - MD5: 45447fd1fec537a575ee6ba49189da98
FASTA Files of MW-RT-filtered canonical and expanded reference databases. The zip file contains two FASTA files of target and decoy peptide sequences for canonical and expanded databases. Decoy sequences are indicated by the label “rev_” in the FASTA header.
ZIP Archive - 260.0 MB - MD5: ab5ae2c2969311bc6123d094dfdd5de1
FASTA Files of MW-RT-affinity-filtered canonical and expanded reference databases. The zip file contains two FASTA files of target and decoy peptide sequences for canonical and expanded databases. Decoy sequences are indicated by the label “rev_” in the FASTA header.
ZIP Archive - 292.9 MB - MD5: 441e90e56b3dd8c54590ea305a756c25
MSFragger original search engine outputs. The zip file contains subfolders each containing ‘pepXML’ output files from MSFragger for the MS raw files which were searched. Folder titles indicate the model system, whether the search results come from the canonical or expanded reference database search and which SPIsnake filters were applied.
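Each file in the listing above is shown with an MD5 checksum, which can be compared against a downloaded copy to detect an incomplete or corrupted transfer. The following is a minimal sketch, assuming the file has already been downloaded to local disk; the function names `md5_of_file` and `matches_listed_md5` are illustrative and not part of any repository tooling.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use flat even for multi-gigabyte archives.
    """
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed_md5):
    """Compare a file's MD5 with the value shown in the listing.

    The comparison ignores surrounding whitespace and letter case.
    """
    return md5_of_file(path) == listed_md5.strip().lower()
```

For example, `matches_listed_md5("source_data.zip", "78db2330172cd21cd3fc15663b4907cf")` would check a local copy of the first ZIP archive; the local filename here is hypothetical, since the listing does not show file names.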