HomeVolumesContestsSectionsForumsUsersPrintHelpAbout

Sections > Linear Data Structures: Arrays > problem:


51043 - Genome Sequencing

Guest
• Review clarifications (5)

Section problems

• 51111 - Even Numbers
• 51115 - Pascal Triangle
• 51043 - Genome Sequencing
• 51116 - The Star Triangle
• 51042 - The most frequent k-mer
• 50271 - Factorial
• 50382 - Parkside's Other Triangle
• 51171 Question 3
• 51093 - rgb2gray
• 51165 Question 1
• 51175 Question 2

Feedback

If you notice incorrect translations in Contester, please let author know.

Time limit 2000/4000/4000/4000 ms. Memory limit 65000/65000/65000/65000 Kb.

Genome Sequencing

Adapted from Genome sequencing by Phillip E. C. Compeau, and prepared by Ibrahim Mesecan
You are working on a genome project. Gene sequences use nucleic acid sequences from the composition: GACT. You are given a genome fragments, and you are asked to merge them. Gene fragments are merged in such a way that, if the last m nucleic acid sequences of gene fragment A matches with the first m nucleic acid sequence of B. These gene fragments A and B can be merged. For example, if m is 2 and you are given TAA as starting gene then you are also given the following nucleic acid sequences: ATG, TGC, TGC, AAACT, AAT and CCG. TAA can be merged with AAT and form TAAT. TAAT is ending with AT and thus it can be merged with ATG and form the sequence TAATG. Then similarly, TAATGC can be formed.

Question: Write a program that reads n gene fragments and after sorting gene fragments, using Sequential search on the sorted list finds the matching sequence. By adding the sequences one after another forms the longest DNA sequence.

Input specification: You will be two integers in the beginning,

  • number of gene fragments (n): an integer between 1 and 10,000,
  • the number nucleic acids used in merge process (m): an integer between 1 and 20
Then, each of the following n lines will have gene fragments where gene fragments contain only the letters (G, A, C or T) and are not longer than 100 letters. The first sequence is the starting sequence and 0 ≤ n ≤ 10,000.

Output specification: Show the fragment which was formed.

Sample Input
7 2
TAA ATG ATG TGC AAACT AAT CTG
Sample Output I
TAAACTGC

Explanation: TAA is the starting DNA fragment, ending with AA. There are two gene fragments starting with AA: AAACT and AAT. Using Sequential search AAACT has been selected as the second gene fragment. Now the new gene fragment is TAAACT, ending with CT. There is only one gene fragment starting with CT: CTG. So, the new gene TAAACTG and it is merged with TGC.



Для отправки решений необходимо выполнить вход.

www.contester.ru