## пятница, 10 января 2014 г.

### Codility. Train. Genomic-range-query ★★★

A non-empty zero-indexed string S is given. String S consists of N characters from the set of upper-case English letters A, C, G, T.
This string actually represents a DNA sequence, and the upper-case letters represent single nucleotides.
You are also given non-empty zero-indexed arrays P and Q consisting of M integers. These arrays represent queries about minimal nucleotides. We represent the letters of string S as integers 1, 2, 3, 4 in arrays P and Q, where A = 1, C = 2, G = 3, T = 4, and we assume that A < C < G < T.
Query K requires you to find the minimal nucleotide from the range (P[K], Q[K]), 0 ≤ P[i] ≤ Q[i] < N.
For example, consider string S = GACACCATA and arrays P, Q such that:
```    P = 0    Q = 8
P = 0    Q = 2
P = 4    Q = 5
P = 7    Q = 7```
The minimal nucleotides from these ranges are as follows:
• (0, 8) is A identified by 1,
• (0, 2) is A identified by 1,
• (4, 5) is C identified by 2,
• (7, 7) is T identified by 4.
Write a function:
def solution(S, P, Q)
that, given a non-empty zero-indexed string S consisting of N characters and two non-empty zero-indexed arrays P and Q consisting of M integers, returns an array consisting of M characters specifying the consecutive answers to all queries.
The sequence should be returned as:
• a Results structure (in C), or
• a vector of integers (in C++), or
• a Results record (in Pascal), or
• an array of integers (in any other programming language).
For example, given the string S = GACACCATA and arrays P, Q such that:
```    P = 0    Q = 8
P = 0    Q = 2
P = 4    Q = 5
P = 7    Q = 7```
the function should return the values [1, 1, 2, 4], as explained above.
Assume that:
• N is an integer within the range [1..100,000];
• M is an integer within the range [1..50,000];
• each element of array P, Q is an integer within the range [0..N − 1];
• P[i] ≤ Q[i];
• string S consists only of upper-case English letters A, C, G, T.
Complexity:
• expected worst-case time complexity is O(N+M);
• expected worst-case space complexity is O(N), beyond input storage (not counting the storage required for input arguments).
Elements of input arrays can be modified.

Solution:
```def solution(S, P, Q):
vMap = dict(A = 1, C = 2, G = 3, T = 4)
vLst = dict(A =[], C =[], G =[], T =[])
for i in xrange(len(S)):
vLst[S[i]].append(i)
for i in xrange(len(P)):
for j in sorted(vLst):
iF, iL = 0, len(vLst[j])
if iL == iF or \
vLst[j][iF] > Q[i] or \
vLst[j][iL - 1] < P[i]:
continue
while iF < iL:
iM = iF + (iL - iF) / 2
if P[i] <= vLst[j][iM]:
if vLst[j][iM] <= Q[i]:
P[i] = vMap[j]
break
iL = iM
else:
iF = iM + 1
else:
continue
break
return P```