.. _run_examples:

Run examples
==================================

This section gives examples of how to use modCnet.

Train ac4C model using IVT datasets
***********************************

To train an ac4C detection model, ac4C-modified and unmodified samples are required. The ac4C-modified and unmodified IVT datasets generated in this study have been uploaded to the GEO database under the accession numbers `GSE227087 `_ and `GSE267558 `_. This demo demonstrates how to train an ac4C detection model from scratch using the IVT datasets.

**1. Guppy basecalling**

Basecalling converts the raw signal generated by Oxford Nanopore sequencing into a DNA/RNA sequence. Guppy is used for basecalling in this step. In some nanopore datasets, the sequence information is already contained within the FAST5 files; in such cases, the basecalling step can be skipped because the sequence data is readily available.

::

    #ac4C-modified
    guppy_basecaller -i demo_data/IVT_ac4C -s demo_data/IVT_ac4C_guppy --num_callers 40 --recursive --fast5_out --config rna_r9.4.1_70bps_hac.cfg

    #unmodified
    guppy_basecaller -i demo_data/IVT_unmod -s demo_data/IVT_unmod_guppy --num_callers 40 --recursive --fast5_out --config rna_r9.4.1_70bps_hac.cfg

**2. Multi-read FAST5 files to single-read FAST5 files**

Convert multi-read FAST5 files to single-read FAST5 files. If the data generated by the sequencing device is already in single-read format, this step can be skipped.

::

    #ac4C-modified
    multi_to_single_fast5 -i demo_data/IVT_ac4C_guppy -s demo_data/IVT_ac4C_guppy_single --recursive

    #unmodified
    multi_to_single_fast5 -i demo_data/IVT_unmod_guppy -s demo_data/IVT_unmod_guppy_single --recursive

**3. Tombo resquiggling**

In this step, the sequence obtained from basecalling is mapped to a reference genome or a known sequence, and the corrected sequence is then associated with the corresponding current signals. The resquiggling process is performed in-place; no separate files are generated in this step.
::

    #ac4C-modified
    tombo resquiggle --overwrite --basecall-group Basecall_1D_000 demo_data/IVT_ac4C_guppy_single demo_data/IVT_DRS.reference.fasta --processes 40 --fit-global-scale --include-event-stdev

    #unmodified
    tombo resquiggle --overwrite --basecall-group Basecall_1D_000 demo_data/IVT_unmod_guppy_single demo_data/IVT_DRS.reference.fasta --processes 40 --fit-global-scale --include-event-stdev

**4. Map reads to reference**

minimap2 is used to map basecalled sequences to the reference transcripts. The output SAM file serves as the input for the subsequent feature extraction step.

::

    #ac4C-modified
    cat demo_data/IVT_ac4C_guppy/pass/*.fastq >demo_data/IVT_ac4C.fastq
    minimap2 -ax map-ont demo_data/IVT_DRS.reference.fasta demo_data/IVT_ac4C.fastq >demo_data/IVT_ac4C.sam

    #unmodified
    cat demo_data/IVT_unmod_guppy/pass/*.fastq >demo_data/IVT_unmod.fastq
    minimap2 -ax map-ont demo_data/IVT_DRS.reference.fasta demo_data/IVT_unmod.fastq >demo_data/IVT_unmod.sam

**5. Feature extraction**

Extract features from the resquiggled fast5 files using the ``feature_extraction.py`` script in the GitHub repository.

::

    #ac4C-modified
    python script/feature_extraction.py --input demo_data/IVT_ac4C_guppy_single \
        --reference demo_data/IVT_DRS.reference.fasta \
        --sam demo_data/IVT_ac4C.sam \
        --output demo_data/IVT_ac4C.feature.tsv \
        --clip 10 \
        --motif NNCNN

    #unmodified
    python script/feature_extraction.py --input demo_data/IVT_unmod_guppy_single \
        --reference demo_data/IVT_DRS.reference.fasta \
        --sam demo_data/IVT_unmod.sam \
        --output demo_data/IVT_unmod.feature.tsv \
        --clip 10 \
        --motif NNCNN

In the feature extraction step, the motif pattern should be provided through the ``--motif`` argument. The base symbols of the motif follow the IUB code standard.
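As a quick illustration of how an IUB-coded motif such as ``NNCNN`` constrains which k-mers are extracted, here is a small standalone helper. It is not part of modCnet; it is only a sketch of how IUB codes expand to concrete bases:

```python
# Standalone illustration (not part of modCnet): expand IUB (IUPAC) base
# codes and test whether a k-mer matches an IUB-coded motif like "NNCNN".
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}

def matches_motif(kmer: str, motif: str) -> bool:
    """Return True if every base of `kmer` is allowed by `motif`."""
    if len(kmer) != len(motif):
        return False
    return all(base in IUB[code] for base, code in zip(kmer.upper(), motif.upper()))

# NNCNN accepts any 5-mer with C at the centre position:
print(matches_motif("AACGA", "NNCNN"))  # True
print(matches_motif("AAGGA", "NNCNN"))  # False
```

With ``--motif NNCNN``, features are therefore extracted around every cytosine with two flanking bases on each side.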
Here is the full definition of the IUB base symbols:

+-------------+-------------+
| IUB Base    | Expansion   |
+=============+=============+
| A           | A           |
+-------------+-------------+
| C           | C           |
+-------------+-------------+
| G           | G           |
+-------------+-------------+
| T           | T           |
+-------------+-------------+
| M           | AC          |
+-------------+-------------+
| V           | ACG         |
+-------------+-------------+
| R           | AG          |
+-------------+-------------+
| H           | ACT         |
+-------------+-------------+
| W           | AT          |
+-------------+-------------+
| D           | AGT         |
+-------------+-------------+
| S           | CG          |
+-------------+-------------+
| B           | CGT         |
+-------------+-------------+
| Y           | CT          |
+-------------+-------------+
| N           | ACGT        |
+-------------+-------------+
| K           | GT          |
+-------------+-------------+

**6. Train-test split**

The train-test split is performed randomly, ensuring that the data points in each set are representative of the overall dataset. The default split ratio is 80% for training and 20% for testing; it can be customized with the ``--train_ratio`` argument to accommodate the specific requirements of the problem and the size of the dataset. The training set is used to train the model, allowing it to learn the patterns and relationships present in the data. The testing set, on the other hand, is used to assess the model's performance on new, unseen data. It serves as an independent evaluation set to measure how well the trained model generalizes to data it has not encountered before. By evaluating the model on the testing set, we can estimate its performance, detect overfitting (when the model performs well on the training set but poorly on the testing set) and assess its ability to make accurate predictions on new data.

::

    usage: train_test_split.py [-h] [--input_file INPUT_FILE]
                               [--train_file TRAIN_FILE] [--test_file TEST_FILE]
                               [--train_ratio TRAIN_RATIO]

    Split a feature file into training and testing sets.
    optional arguments:
      -h, --help            show this help message and exit
      --input_file INPUT_FILE
                            Path to the input feature file
      --train_file TRAIN_FILE
                            Path to the train feature file
      --test_file TEST_FILE
                            Path to the test feature file
      --train_ratio TRAIN_RATIO
                            Ratio of instances to use for training (default: 0.8)

    #ac4C-modified
    python script/train_test_split.py --input_file demo_data/IVT_ac4C.feature.tsv --train_file demo_data/IVT_ac4C.feature.train.tsv --test_file demo_data/IVT_ac4C.feature.test.tsv --train_ratio 0.8

    #unmodified
    python script/train_test_split.py --input_file demo_data/IVT_unmod.feature.tsv --train_file demo_data/IVT_unmod.feature.train.tsv --test_file demo_data/IVT_unmod.feature.test.tsv --train_ratio 0.8

**7. Train ac4C model**

To train the modCnet model from scratch using your own dataset, set the ``--run_mode`` argument to "train" and the ``--model_type`` argument to "C/ac4C". modCnet accepts both modified and unmodified feature files as input. Additionally, test feature files are necessary to evaluate the model's performance. You can specify the model save path with the ``--new_model`` argument. The number of training epochs can be set with the ``--epoch`` argument, and the model state is saved at the end of each epoch. modCnet preferentially uses the ``GPU`` for training if CUDA is available on your device; otherwise, it falls back to ``CPU`` mode. The duration of training varies with the size of your dataset and the available computational capacity, and may last several hours.
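The save-at-every-epoch behaviour described above can be illustrated with a minimal, framework-free sketch. This is hypothetical code, not modCnet's actual training loop; it only shows the pattern of persisting model state with ``pickle`` at the end of each epoch so that training can be resumed or the best epoch kept:

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a model: a dict of parameters.
def train_one_epoch(params):
    params["w"] += 0.1  # placeholder "update" of a single weight
    return params

def train(epochs, model_path):
    params = {"w": 0.0}
    for _ in range(epochs):
        params = train_one_epoch(params)
        # modCnet-style behaviour: persist model state after every epoch.
        with open(model_path, "wb") as fh:
            pickle.dump(params, fh)
    return params

path = os.path.join(tempfile.gettempdir(), "demo_model.pkl")
final = train(epochs=3, model_path=path)
with open(path, "rb") as fh:
    restored = pickle.load(fh)
print(round(restored["w"], 1))  # 0.3
```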
::

    python script/modCnet.py --run_mode train \
        --model_type C/ac4C \
        --new_model demo_data/model/C_ac4C.IVT.demo.pkl \
        --train_data_C demo_data/IVT_unmod.feature.train.tsv \
        --train_data_ac4C demo_data/IVT_ac4C.feature.train.tsv \
        --test_data_C demo_data/IVT_unmod.feature.test.tsv \
        --test_data_ac4C demo_data/IVT_ac4C.feature.test.tsv \
        --epoch 100

During the training process, the following output can be used to monitor and evaluate the performance of the model:

::

    device= cpu
    train process.
    data loaded.
    start training...
    Epoch 0-0 Train acc: 0.522000,Test Acc: 0.500000,time0:00:24.898431
    Epoch 1-0 Train acc: 0.756000,Test Acc: 0.750000,time0:00:42.953740
    Epoch 2-0 Train acc: 0.824000,Test Acc: 0.769750,time0:00:27.752530
    Epoch 3-0 Train acc: 0.804000,Test Acc: 0.790500,time0:00:29.946116
    Epoch 4-0 Train acc: 0.816000,Test Acc: 0.797250,time0:00:24.155293
    Epoch 5-0 Train acc: 0.816000,Test Acc: 0.793250,time0:00:23.675549
    Epoch 6-0 Train acc: 0.830000,Test Acc: 0.823000,time0:00:27.202119
    Epoch 7-0 Train acc: 0.852000,Test Acc: 0.834000,time0:00:36.018639
    Epoch 8-0 Train acc: 0.830000,Test Acc: 0.823250,time0:00:27.230856
    Epoch 9-0 Train acc: 0.836000,Test Acc: 0.846250,time0:00:58.296155
    Epoch 10-0 Train acc: 0.832000,Test Acc: 0.830250,time0:00:22.394222
    Epoch 11-0 Train acc: 0.858000,Test Acc: 0.857500,time0:00:18.485811

After the data processing and model training, the following files should have been generated by modCnet. The trained model ``C_ac4C.IVT.demo.pkl`` is saved in the ``./demo_data/model/`` folder; you can use this model for making predictions in the future.

::

    .
    ├── ac4C.feature.test.tsv
    ├── ac4C.feature.train.tsv
    ├── C.feature.test.tsv
    ├── C.feature.train.tsv
    ├── IVT_DRS.reference.fasta
    ├── IVT_fast5
    │   └── batch_0.fast5
    ├── IVT_fast5_guppy
    │   ├── fail
    │   │   └── fastq_runid_71d544d3bd9e1fe7886a5d176c756a576d30ed50_0_0.fastq
    │   ├── guppy_basecaller_log-2024-05-20_21-21-06.log
    │   ├── pass
    │   │   └── fastq_runid_71d544d3bd9e1fe7886a5d176c756a576d30ed50_0_0.fastq
    │   ├── sequencing_summary.txt
    │   ├── sequencing_telemetry.js
    │   └── workspace
    │       └── batch_0.fast5
    ├── IVT_fast5_guppy_single
    │   ├── 0
    │   │   ├── 00007b91-98f4-41c3-9eab-39f40625d550.fast5
    │   │   ├── 00104315-e8fa-4031-a122-3741b7531396.fast5
    │   │   ├── 0020eb7c-89f8-44bf-aeaf-acb2ea776b2c.fast5
    │   │   ├── 0045dcf9-ac50-4e2e-b8dc-ea7a9157b2c4.fast5
    │   │   ├── 005c48b0-72d1-4898-9fb2-00bebca69828.fast5
    │   │   ├── 0433af9f-ec17-476e-93ff-6d77f8ff6e62.fast5
    │   │   ├── 04343c9a-c88b-46e6-9b7d-1f97f7a28128.fast5
    │   │   ├── 0b84f368-b4b9-4c63-af9c-7574f9a12d43.fast5
    │   │   └── 0b8898ca-a2cc-4687-a53a-15fc159ceb3b.fast5
    │   └── filename_mapping.txt
    ├── IVT.fastq
    ├── IVT.feature
    ├── IVT.sam
    ├── m5C.feature.test.tsv
    ├── m5C.feature.train.tsv
    ├── model
    │   └── C_ac4C.IVT.demo.pkl
    └── test.feature.tsv

Train m5C model using IVT datasets
**********************************

The m5C-modified and unmodified IVT datasets are publicly available at the GEO database under the accession code `GSE227087 `_.

**1. Guppy basecalling**

Basecalling converts the raw signal generated by Oxford Nanopore sequencing into a DNA/RNA sequence. Guppy is used for basecalling in this step. In some nanopore datasets, the sequence information is already contained within the FAST5 files; in such cases, the basecalling step can be skipped because the sequence data is readily available.
::

    #m5C-modified
    guppy_basecaller -i demo_data/IVT_m5C -s demo_data/IVT_m5C_guppy --num_callers 40 --recursive --fast5_out --config rna_r9.4.1_70bps_hac.cfg

    #unmodified
    guppy_basecaller -i demo_data/IVT_unmod -s demo_data/IVT_unmod_guppy --num_callers 40 --recursive --fast5_out --config rna_r9.4.1_70bps_hac.cfg

**2. Multi-read FAST5 files to single-read FAST5 files**

Convert multi-read FAST5 files to single-read FAST5 files. If the data generated by the sequencing device is already in single-read format, this step can be skipped.

::

    #m5C-modified
    multi_to_single_fast5 -i demo_data/IVT_m5C_guppy -s demo_data/IVT_m5C_guppy_single --recursive

    #unmodified
    multi_to_single_fast5 -i demo_data/IVT_unmod_guppy -s demo_data/IVT_unmod_guppy_single --recursive

**3. Tombo resquiggling**

In this step, the sequence obtained from basecalling is mapped to a reference genome or a known sequence, and the corrected sequence is then associated with the corresponding current signals. The resquiggling process is performed in-place; no separate files are generated in this step.

::

    #m5C-modified
    tombo resquiggle --overwrite --basecall-group Basecall_1D_000 demo_data/IVT_m5C_guppy_single demo_data/IVT_DRS.reference.fasta --processes 40 --fit-global-scale --include-event-stdev

    #unmodified
    tombo resquiggle --overwrite --basecall-group Basecall_1D_000 demo_data/IVT_unmod_guppy_single demo_data/IVT_DRS.reference.fasta --processes 40 --fit-global-scale --include-event-stdev

**4. Map reads to reference**

minimap2 is used to map basecalled sequences to the reference transcripts. The output SAM file serves as the input for the subsequent feature extraction step.
::

    #m5C-modified
    cat demo_data/IVT_m5C_guppy/pass/*.fastq >demo_data/IVT_m5C.fastq
    minimap2 -ax map-ont demo_data/IVT_DRS.reference.fasta demo_data/IVT_m5C.fastq >demo_data/IVT_m5C.sam

    #unmodified
    cat demo_data/IVT_unmod_guppy/pass/*.fastq >demo_data/IVT_unmod.fastq
    minimap2 -ax map-ont demo_data/IVT_DRS.reference.fasta demo_data/IVT_unmod.fastq >demo_data/IVT_unmod.sam

**5. Feature extraction**

Extract signals and features from the resquiggled fast5 files using the following Python script.

::

    #m5C-modified
    python script/feature_extraction.py --input demo_data/IVT_m5C_guppy_single \
        --reference demo_data/IVT_DRS.reference.fasta \
        --sam demo_data/IVT_m5C.sam \
        --output demo_data/IVT_m5C.feature.tsv \
        --clip 10 \
        --motif NNCNN

    #unmodified
    python script/feature_extraction.py --input demo_data/IVT_unmod_guppy_single \
        --reference demo_data/IVT_DRS.reference.fasta \
        --sam demo_data/IVT_unmod.sam \
        --output demo_data/IVT_unmod.feature.tsv \
        --clip 10 \
        --motif NNCNN

In the feature extraction step, the motif pattern should be provided through the ``--motif`` argument. The base symbols of the motif follow the IUB code standard.

**6. Train-test split**

The train-test split is performed randomly, ensuring that the data points in each set are representative of the overall dataset. The default split ratio is 80% for training and 20% for testing; it can be customized with the ``--train_ratio`` argument to accommodate the specific requirements of the problem and the size of the dataset. The training set is used to train the model, allowing it to learn the patterns and relationships present in the data. The testing set, on the other hand, is used to assess the model's performance on new, unseen data. It serves as an independent evaluation set to measure how well the trained model generalizes to data it has not encountered before.
By evaluating the model on the testing set, we can estimate its performance, detect overfitting (when the model performs well on the training set but poorly on the testing set) and assess its ability to make accurate predictions on new data.

::

    usage: train_test_split.py [-h] [--input_file INPUT_FILE]
                               [--train_file TRAIN_FILE] [--test_file TEST_FILE]
                               [--train_ratio TRAIN_RATIO]

    Split a feature file into training and testing sets.

    optional arguments:
      -h, --help            show this help message and exit
      --input_file INPUT_FILE
                            Path to the input feature file
      --train_file TRAIN_FILE
                            Path to the train feature file
      --test_file TEST_FILE
                            Path to the test feature file
      --train_ratio TRAIN_RATIO
                            Ratio of instances to use for training (default: 0.8)

    #m5C-modified
    python script/train_test_split.py --input_file demo_data/IVT_m5C.feature.tsv --train_file demo_data/IVT_m5C.feature.train.tsv --test_file demo_data/IVT_m5C.feature.test.tsv --train_ratio 0.8

    #unmodified
    python script/train_test_split.py --input_file demo_data/IVT_unmod.feature.tsv --train_file demo_data/IVT_unmod.feature.train.tsv --test_file demo_data/IVT_unmod.feature.test.tsv --train_ratio 0.8

**7. Train m5C model**

To train the modCnet model from scratch using your own dataset, set the ``--run_mode`` argument to "train" and the ``--model_type`` argument to "C/m5C". modCnet accepts both modified and unmodified feature files as input. Additionally, test feature files are necessary to evaluate the model's performance. You can specify the model save path with the ``--new_model`` argument. The number of training epochs can be set with the ``--epoch`` argument, and the model state is saved at the end of each epoch. modCnet preferentially uses the ``GPU`` for training if CUDA is available on your device; otherwise, it falls back to ``CPU`` mode. The duration of training varies with the size of your dataset and the available computational capacity, and may last several hours.
::

    python script/modCnet.py --run_mode train \
        --model_type C/m5C \
        --new_model demo_data/model/C_m5C.IVT.demo.pkl \
        --train_data_C demo_data/IVT_unmod.feature.train.tsv \
        --train_data_m5C demo_data/IVT_m5C.feature.train.tsv \
        --test_data_C demo_data/IVT_unmod.feature.test.tsv \
        --test_data_m5C demo_data/IVT_m5C.feature.test.tsv \
        --epoch 100

During the training process, the following output can be used to monitor and evaluate the performance of the model:

::

    device= cpu
    train process.
    data loaded.
    start training...
    Epoch 0-0 Train acc: 0.512000,Test Acc: 0.500000,time0:08:16.780508
    Epoch 1-0 Train acc: 0.754000,Test Acc: 0.738250,time0:04:33.946534
    Epoch 2-0 Train acc: 0.786000,Test Acc: 0.775250,time0:04:57.815192
    Epoch 3-0 Train acc: 0.756000,Test Acc: 0.804750,time0:04:31.987233
    Epoch 4-0 Train acc: 0.818000,Test Acc: 0.813000,time0:04:55.408595
    Epoch 5-0 Train acc: 0.814000,Test Acc: 0.820000,time0:04:31.761226
    Epoch 6-0 Train acc: 0.854000,Test Acc: 0.833250,time0:04:15.148943
    Epoch 7-0 Train acc: 0.834000,Test Acc: 0.833250,time0:04:42.237964
    Epoch 8-0 Train acc: 0.836000,Test Acc: 0.825000,time0:04:35.039245
    Epoch 9-0 Train acc: 0.814000,Test Acc: 0.804250,time0:04:52.260900
    Epoch 10-0 Train acc: 0.862000,Test Acc: 0.842750,time0:04:57.368643
    Epoch 11-0 Train acc: 0.846000,Test Acc: 0.847750,time0:05:24.563390
    Epoch 12-0 Train acc: 0.872000,Test Acc: 0.850250,time0:04:59.518973
    Epoch 13-0 Train acc: 0.840000,Test Acc: 0.867000,time0:01:40.365091

After the data processing and model training, the following files should have been generated by modCnet. The trained model ``C_m5C.IVT.demo.pkl`` is saved in the ``./demo_data/model/`` folder; you can use this model for making predictions in the future.

Predict ac4C sites in human cell line
*************************************

HeLa nanopore data is publicly available and can be downloaded from the GEO database under the accession code `GSE211759 `_.
In this demo, a subset of the HeLa nanopore data is used for demonstration purposes because of the large size of the original datasets. The demo datasets are located under the ``./demo_data/HeLa/HeLa_fast5/`` directory.

::

    demo_data
    └── HeLa
        └── HeLa_fast5
            └── batch0.fast5

**1. Guppy basecalling**

Basecalling converts the raw signal generated by Oxford Nanopore sequencing into a DNA/RNA sequence. Guppy is used for basecalling in this step. In some nanopore datasets, the sequence information is already contained within the FAST5 files; in such cases, the basecalling step can be skipped because the sequence data is readily available.

::

    guppy_basecaller -i demo_data/HeLa/HeLa_fast5 -s demo_data/HeLa/HeLa_fast5_guppy --num_callers 40 --recursive --fast5_out --config rna_r9.4.1_70bps_hac.cfg

**2. Multi-read FAST5 files to single-read FAST5 files**

Convert multi-read FAST5 files to single-read FAST5 files. If the data generated by the sequencing device is already in single-read format, this step can be skipped.

::

    multi_to_single_fast5 -i demo_data/HeLa/HeLa_fast5_guppy -s demo_data/HeLa/HeLa_fast5_guppy_single --recursive

**3. Tombo resquiggling**

In this step, the sequence obtained from basecalling is mapped to a reference genome or a known sequence, and the corrected sequence is then associated with the corresponding current signals. The resquiggling process is performed in-place; no separate files are generated in this step. The GRCh38 transcripts file can be downloaded `here `_.

::

    tombo resquiggle --overwrite --basecall-group Basecall_1D_000 demo_data/HeLa/HeLa_fast5_guppy_single demo_data/GRCh38_subset_reference.fa --processes 40 --fit-global-scale --include-event-stdev

**4. Map reads to reference**

minimap2 is used to map basecalled sequences to the reference transcripts. The output SAM file serves as the input for the subsequent feature extraction step.
::

    cat demo_data/HeLa/HeLa_fast5_guppy/pass/*.fastq >demo_data/HeLa/HeLa.fastq
    minimap2 -ax map-ont demo_data/GRCh38_subset_reference.fa demo_data/HeLa/HeLa.fastq >demo_data/HeLa/HeLa.sam

**5. Feature extraction**

Extract signals and features from the resquiggled fast5 files using the following Python script.

::

    python script/feature_extraction.py --input demo_data/HeLa/HeLa_fast5_guppy_single \
        --reference demo_data/GRCh38_subset_reference.fa \
        --sam demo_data/HeLa/HeLa.sam \
        --output demo_data/HeLa/HeLa.feature.tsv \
        --clip 10 \
        --motif NNCNN

In the feature extraction step, the motif pattern should be provided through the ``--motif`` argument.

**6. Predict ac4C sites**

To predict ac4C sites in the HeLa nanopore data using a pretrained model, set the ``--run_mode`` argument to "predict". You can specify the pretrained model with the ``--pretrained_model`` argument.

::

    python script/modCnet.py --run_mode predict \
        --pretrained_model model/C_ac4C.pkl \
        --feature_file demo_data/HeLa/HeLa.feature.tsv \
        --predict_result demo_data/HeLa/HeLa.prediction.tsv

During the prediction process, modCnet generates the following files. The prediction result file is named ``HeLa.prediction.tsv``.

::

    demo_data
    ├── GRCh38_subset_reference.fa
    ├── HeLa
    │   ├── HeLa_fast5
    │   ├── HeLa_fast5_guppy
    │   ├── HeLa_fast5_guppy_single
    │   ├── HeLa.fastq
    │   ├── HeLa.feature.tsv
    │   ├── HeLa.prediction.tsv
    │   └── HeLa.sam

The prediction result ``demo_data/HeLa/HeLa.prediction.tsv`` provides prediction labels along with the corresponding modification probabilities, which can be utilized for further analysis.
::

    transcript_id   site  motif  read_id                               prediction  probability
    NM_001349947.2  552   AACCA  320a1a8b-7709-4335-8f6a-84f09ba6592a  unmod       0.00014777448
    XM_006720125.3  2437  ACCAG  53dd21de-f74b-44db-baa3-06c68772b7e1  unmod       0.062309794
    NM_001321485.2  498   TGCTG  1f8ce6a2-5fac-4a2f-ae25-0abdb0de412e  unmod       0.17353779
    NM_001199673.2  2972  ATCAA  5781a0c4-ede0-452e-8789-9a43740451ab  unmod       0.26891512
    NM_014364.5     1233  GACAA  47f7b914-a51e-4eab-adb2-e500d8a46fd1  unmod       0.029849814
    NM_001321485.2  515   GCCTC  31fe54e8-7724-40c6-aaa2-025ab5de7754  unmod       0.004975981
    NM_001136267.2  1780  GACTA  62b6ab58-5ee0-4871-95d5-5db66a9c56c7  unmod       0.0018304548
    NM_001143883.4  714   TGCAG  4fb0be9b-9628-46aa-9ba4-40a6456d7d52  unmod       0.1989807
    NM_006012.4     1058  ATCTT  7c7ff067-1ead-4838-97c8-5fca91fdfe8a  unmod       0.06284212
    NM_001143883.4  714   TGCAG  13493367-a9ab-4f20-9f62-ad32c2cc6c2e  unmod       0.022585329
    NM_001369747.1  920   ATCAT  5d2b59a7-4946-40b0-9c0e-16ba009ad4f5  unmod       0.0009560142
    NM_001321485.2  515   GCCTC  1cbc2a9b-02d5-4906-b292-63fe6a30baaa  unmod       0.0013002371
    XR_949965.1     271   GTCAA  5db89b35-738e-462d-b92b-7cded1ed2c21  unmod       0.005573378
    NM_005566.4     1652  ACCTT  5fd3dff6-0a1e-4f22-9a10-cb439cf41393  unmod       0.03093134
    NM_001024630.4  5513  TTCAA  0f39d0bc-63ac-4c55-a08c-6c88c2f1fcca  unmod       0.083354354
    NM_001997.5     473   GGCTT  2f62c329-8d4e-4a2e-b9f8-11290e077d8f  unmod       0.09690974
    NR_003286.4     1355  AGCGA  49c7e639-5681-473e-936b-c2a01eb94c6f  mod         0.7482356
    NM_001997.5     112   AACGG  31ec8d67-a62d-4085-8983-75a5c6833b17  unmod       0.01882868
    NM_001144943.1  1298  TTCTT  133fa83c-cf3a-4b10-9575-81f298fd0839  unmod       0.13784541
    XM_017004733.1  2098  CCCTC  0fca07db-bfa9-4974-8cde-aa746a76301c  unmod       0.0036647602
    NM_213725.2     421   TTCAA  3e5efe25-6e79-439d-8c9e-26bfd59216da  mod         0.8380922

The execution time for each demonstration is estimated to be approximately 3-10 minutes.
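Because modCnet reports one prediction per read, downstream analyses often aggregate read-level calls into per-site modification rates. The following standalone sketch (not part of modCnet) assumes only the six tab-separated columns shown above and groups the output by transcript and site; the column names and the small inline example are illustrative:

```python
import csv
import io
from collections import defaultdict

def site_mod_rates(tsv_text):
    """Aggregate read-level predictions into per-site modified-read fractions.

    Assumes the tab-separated column layout shown above:
    transcript_id, site, motif, read_id, prediction, probability.
    """
    counts = defaultdict(lambda: [0, 0])  # (modified reads, total reads)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        key = (row["transcript_id"], int(row["site"]))
        counts[key][1] += 1
        if row["prediction"] == "mod":
            counts[key][0] += 1
    return {key: mod / total for key, (mod, total) in counts.items()}

# Tiny illustrative input with hypothetical read IDs:
demo = "\n".join([
    "transcript_id\tsite\tmotif\tread_id\tprediction\tprobability",
    "NM_001143883.4\t714\tTGCAG\tread1\tunmod\t0.199",
    "NM_001143883.4\t714\tTGCAG\tread2\tunmod\t0.023",
    "NR_003286.4\t1355\tAGCGA\tread3\tmod\t0.748",
])
rates = site_mod_rates(demo)
print(rates[("NM_001143883.4", 714)])  # 0.0
print(rates[("NR_003286.4", 1355)])    # 1.0
```

Sites covered by few reads will have noisy rates, so it is common to also require a minimum read depth per site before interpreting the fraction.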