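This article shows how to use bedtools to count the base (nucleotide) content of sequences, with a short worked example.
Counting base content of sequences with bedtools
The nuc subcommand of bedtools reports the nucleotide content of the sequences in given intervals. Its usage is as follows: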
Tool:    bedtools nuc (aka nucBed)
Version: v2.25.0
Summary: Profiles the nucleotide content of intervals in a fasta file.

Usage:   bedtools nuc [OPTIONS] -fi <fasta> -bed <bed/gff/vcf>

Options:
    -fi          Input FASTA file
    -bed         BED/GFF/VCF file of ranges to extract from -fi
    -s           Profile the sequence according to strand.
    -seq         Print the extracted sequence
    -pattern     Report the number of times a user-defined sequence
                 is observed (case-sensitive).
    -C           Ignore case when matching -pattern. By defaulty, case matters.
    -fullHeader  Use full fasta header.
                 - By default, only the word before the first space or tab is used.

Output format:
    The following information will be reported after each BED entry:
        1) %AT content
        2) %GC content
        3) Number of As observed
        4) Number of Cs observed
        5) Number of Gs observed
        6) Number of Ts observed
        7) Number of Ns observed
        8) Number of other bases observed
        9) The length of the explored sequence/interval.
        10) The seq. extracted from the FASTA file. (opt., if -seq is used)
        11) The number of times a user's pattern was observed.
            (opt., if -pattern is used.)
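To make the nine default output fields concrete, here is a minimal Python sketch that computes the same quantities for a single sequence string. It is not bedtools code: the function name `profile_sequence` is hypothetical, and the sketch assumes that counting ignores case and that the AT and GC fractions are taken over the full interval length.

```python
from collections import Counter


def profile_sequence(seq):
    """Compute the nine default fields listed above for one sequence.

    Assumptions (not taken from the bedtools source): bases are counted
    case-insensitively, and pct_at / pct_gc are fractions of the full
    sequence length, including Ns and other symbols.
    """
    counts = Counter(seq.upper())
    a, c, g, t, n = (counts[base] for base in "ACGTN")
    length = len(seq)
    other = length - (a + c + g + t + n)  # anything that is not A/C/G/T/N
    return {
        "pct_at": (a + t) / length if length else 0.0,
        "pct_gc": (g + c) / length if length else 0.0,
        "num_A": a,
        "num_C": c,
        "num_G": g,
        "num_T": t,
        "num_N": n,
        "num_oth": other,
        "seq_len": length,
    }


if __name__ == "__main__":
    # Print each field for a short made-up sequence.
    for name, value in profile_sequence("TGTGTGTGTA").items():
        print(name, value)
```

The percentages are reported as fractions between 0 and 1, which matches the style of the pct_at and pct_gc columns that bedtools prints.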
An example follows.
BED file (id.bed):
CM004359.1	0	10
CM004359.1	100	200
CM004359.1	1000	1050
Run the command:
$ bedtools nuc -fi GCA_001651475.1_Ler_Assembly_genomic.fna -bed id.bed
#1_usercol	2_usercol	3_usercol	4_pct_at	5_pct_gc	6_num_A	7_num_C	8_num_G	9_num_T	10_num_N	11_num_oth	12_seq_len
CM004359.1	0	10	0.600000	0.400000	1	0	4	5	0	0	10
CM004359.1	100	200	0.580000	0.420000	14	0	42	44	0	0	100
CM004359.1	1000	1050	0.660000	0.340000	10	3	14	23	0	0	50
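For downstream use, the table above can be read into Python. The following sketch parses the default output (no -seq or -pattern) into one dictionary per interval. The function name `parse_nuc_output` and the parameter `n_user_cols` are hypothetical, and the sketch assumes whitespace-separated fields with three BED columns preceding the nine statistics.

```python
STAT_NAMES = ["pct_at", "pct_gc", "num_A", "num_C", "num_G",
              "num_T", "num_N", "num_oth", "seq_len"]


def parse_nuc_output(text, n_user_cols=3):
    """Parse default `bedtools nuc` output into a list of dicts.

    Assumes the first n_user_cols fields are the original BED columns
    and the next nine are the statistics named in the header line.
    """
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header
        fields = line.split()
        if len(fields) < n_user_cols + len(STAT_NAMES):
            raise ValueError("unexpected number of fields: " + line)
        stats = fields[n_user_cols:n_user_cols + len(STAT_NAMES)]
        row = {"user_cols": fields[:n_user_cols]}
        for name, value in zip(STAT_NAMES, stats):
            # The two percentage columns are floats; the rest are counts.
            row[name] = float(value) if name.startswith("pct_") else int(value)
        rows.append(row)
    return rows


if __name__ == "__main__":
    sample = (
        "#1_usercol 2_usercol 3_usercol 4_pct_at 5_pct_gc 6_num_A 7_num_C "
        "8_num_G 9_num_T 10_num_N 11_num_oth 12_seq_len\n"
        "CM004359.1 0 10 0.600000 0.400000 1 0 4 5 0 0 10\n"
    )
    for row in parse_nuc_output(sample):
        print(row["user_cols"], row["pct_gc"], row["seq_len"])
```

In the sample rows, pct_at equals (num_A + num_T) / seq_len and pct_gc equals (num_G + num_C) / seq_len. For the third interval, for instance, (10 + 23) / 50 = 0.66 and (14 + 3) / 50 = 0.34.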