HomeVolumesContestsSectionsForumsUsersPrintHelpAbout

Contests > CEN303_2016Questions > problem:


HW071. 51042 - The most frequent k-mer

CEN303_2016Questions

Start: Oct.28.2016 at 05:00:00 PM
Finish: Nov.01.2016 at 05:00:00 AM
The contest is finished!
• Contest scoreboard

Guest
• Review clarifications (1)

Contest problems

• HW023. 50741 - DNA Distance
• HW031. 51015 - Student Scholarships
• HW032. 51014 - Nine Men's Morris g...
• HW033. 50925 - Optimizing Elevator...
• HW051. 51019 - Finding the hidden...
• HW052. 51020 - Number of nodes r...
• HW061. 50448 - Paint Buckets
• HW062. 51021 - Number of Nodes
• HW071. 51042 - The most freq...
• HW072. 51043 - Genome Sequencing
• HW081. 51044 - Number of Trees
• HW082. 50699 - Fighting Vampires
• HW083. 50671 - Phalanx
• HW091. 51061 - The Longest Path
• HW101. 50682 - Hotel Durres
• HW102. 51067 - Jumping frog
• HW111. 50506 - The Biggest Island

Feedback

If you notice incorrect translations in Contester, please let author know.

Time limit 2000/4000/4000/4000 ms. Memory limit 65000/65000/65000/65000 Kb.

The most frequent k-mer

Adapted from Genome sequencing by Phillip E. C. Compeau, and prepared by Ibrahim Mesecan
Question: You are working on a genome project. And you are given gene fragments. And, you need to find the most frequent k-mer (DNA sequence or a DNA word wkth the length k). When producing k-mers, you take k chars from the beginning of the string and you proceed one char at a time. For example, if you are given the DNA sequence CGTCGTCGT and k is 3; the first 3-mer is CGT, the second 3-mer is GTC. And TCG, CGT, GTC and TCG are the following four 3-mers. Write a program that reads a DNA sequence, and returns the number of occurences of the most frequent k-mer.

Input specification: You will be given two integers in the beginning,

  • the number of DNA fragments (n): an integer not greater than 2,000,
  • the length of k-mer (k): an integer between 1 and 20
Then, each of the following n lines will have gene fragments. where gene fragments contain only the letters (G, A, C or T and no space or any other char). Every line is not longer than 200 letters. Assume that the DNA fragments given in multiple line forms just one continuous DNA sequence.

Output specification: Show one integer (the number of appearance of the most frequent k-mer).

Sample Input I
2 3
CGTC
GTCGT
Sample Input II
3 3
CGCCTAGC
CAGCCGCCCGGAGCG
CCATGGCCTAGCC
Sample Output I
3
Sample Output II
7

Explanation for sample input 1: DNA sequence is given in two lines of DNA fragments: CGTCGTCGC. In which there are three k-mers:

  • GTC appears 2 times
  • TCG appears 2 times
  • CGT appears 3 times
So, CGT is the most frequent 3-mer.


Для отправки решений необходимо выполнить вход.

www.contester.ru