| Bytes | Lang | Time | Link |
|---|---|---|---|
| 042 | J | 250425T111033Z | Galen Iv |
| 111 | Tcl | 250423T221754Z | sergiol |
| 090 | Python 3.8 prerelease | 210923T130734Z | Hunaphu |
| 121 | PowerShell Core | 210922T025104Z | Julian |
| 117 | Python 3 | 210923T000252Z | Hunaphu |
| 095 | Factor + grouping.extras | 210919T014945Z | chunes |
| 017 | Vyxal | 210918T224802Z | emanresu |
| 415 | Common Lisp Lispworks | 160711T053404Z | sadfaf |
| 130 | Python 3 | 160502T192427Z | Morgan T |
| 138 | Hoon | 160519T033605Z | RenderSe |
| 058 | Perl 6 | 160518T174940Z | null |
| 138 | Perl | 160516T203251Z | Denis Ib |
| 052 | J | 160515T015759Z | ljeabmre |
| 103 | Javascript ES7 | 160515T190638Z | BusyBein |
| 021 | MATL | 160516T102817Z | Luis Men |
| 135 | Python 2.7 | 160503T194501Z | deustice |
| 103 | Python 2 | 160514T205702Z | Dennis |
| 126 | Python 3 | 160515T021516Z | Hunter V |
| 023 | CJam | 160502T190647Z | Martin E |
| 077 | Julia 0.4 | 160503T205006Z | Dennis |
| 114 | Groovy | 160503T173657Z | Krzyszto |
| 023 | Pyth | 160503T143342Z | Leaky Nu |
| 059 | Ruby | 160503T113454Z | xsot |
Tcl, 111 bytes
proc D a {string map {00 A 01 C 10 G 11 T 0 A 1 G} [format %llb [join [lmap c [split $a ""] {scan $c %c}] ""]]}
Python 3.8 (pre-release), 92 90 bytes
Does not handle padding, but is under 100.
lambda x,s='':f(x,'ACGT'[v%4]+s)if (v:=int(''.join(map(str,x.encode())))>>len(s)*2) else s
Python 3, 113 110 bytes
Spends almost a fifth of its bytes on padding.
def f(x):
 v=int(''.join(map(str,x.encode())));v<<=len(bin(v))%2;c=''
 while v:c='ACGT'[v%4]+c;v>>=2
 return c
Python 3.8 (pre-release), 110 bytes
f=lambda x,s='':f(x,'ACGT'[v%4]+s)if(v:=(u:=int(''.join(map(str,x.encode()))))<<len(bin(u))%2>>len(s)*2)else s
PowerShell Core, 143 121 bytes
filter d{if($_){"$(($_-($i=$_%2))/2|d)$i"}}-join([regex]"(..)"|% s* "$(-join($args|%{+$_})|d)0"|?{$_[1]}|%{"ACGT"[$_%8]})
Explanations
Takes the input string as a char array, passed using splatting.
First, a filter converts a decimal number to binary, because PowerShell can't do that natively for numbers larger than a long:
filter d{if($_){"$(($_-($i=$_%2))/2|d)$i"}}
The second part converts the char array to an int array, joins the values into one string, converts that to a binary string with the filter, appends the padding 0, and splits the result into strings of length 2.
It then reads each substring as a number and, using modulus 8, finds the correct char in the ACGT string.
-join([regex]"(..)"|% s* "$(-join($args|%{+$_})|d)0"|?{$_[1]}|%{"ACGT"[$_%8]})
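The two ideas in this explanation (binary conversion by repeated halving, and the modulus-8 lookup) can be sketched in Python. This is only an illustration: the names `to_binary` and `pair_to_base` are hypothetical, and Python integers are already arbitrary precision, so the recursion is not needed there.

```python
def to_binary(n: int) -> str:
    """Binary digits of n by repeated halving; yields '' for 0, like the filter."""
    if n == 0:
        return ""
    bit = n % 2
    return to_binary((n - bit) // 2) + str(bit)


def pair_to_base(pair: str) -> str:
    """Read a two-bit string as a DECIMAL number and reduce it modulo 8.

    "00", "01", "10", "11" read as decimal are 0, 1, 10, 11,
    which leave remainders 0, 1, 2, 3 modulo 8.
    """
    return "ACGT"[int(pair) % 8]
```

The modulus trick avoids a real base-2 parse: the decimal readings of the four pairs happen to fall on distinct remainders in the right order.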
Python 3, 117 bytes
This answer ignores padding. Looks like it could be shorter.
lambda x:''.join('ACGT'[h//4]+'ACGT'[h%4]for h in [int(h,16)for h in hex(int(''.join(map(str,x.encode()))))[2:]])[1:]
Factor + grouping.extras, 95 bytes
[ [ present ] f map-as concat dec> >bin 2 48 pad-groups 2 group [ bin> "ACGT" nth ] "" map-as ]
Doesn't work on TIO because pad-groups postdates build 1525, the one TIO uses. It does run in build 2101.
There is no build where whitespace can be omitted after strings and pad-groups exists, but pad-groups makes it worth it.
Explanation:
It's a quotation (anonymous function) that accepts a string from the data stack as input and leaves a string on the data stack as output. Assuming "ppcg" is on the data stack when this quotation is called...
| Snippet | Output |
|---|---|
| [ present ] f map-as | { "112" "112" "99" "103" } |
| concat | "11211299103" |
| dec> | 11211299103 |
| >bin | "1010011100001111101101100100011111" |
| 2 48 pad-groups | "1010011100001111101101100100011111" (append zeros so length is divisible by 2.) |
| 2 group | { "10" "10" "01" "11" "00" "00" "11" "11" "10" "11" "01" "10" "01" "00" "01" "11" "11" } |
| [ bin> ] map | { 2 2 1 3 0 0 3 3 2 3 1 2 1 0 1 3 3 } |
| [ "ACGT" nth ] "" map-as | "GGCTAATTGTCGCACTT" |
Vyxal, 17 bytes
Cṅb₅∷[0J]B`ACGT`τ
A mess.
C # Charcodes
ṅ # Concatenated
b # To binary
₅∷[ ] # If length is odd
0J # Append a 0
B # To base10
`ACGT`τ # Convert to custom base `ACGT` (convert to base4, replace 0-3 with corresponding item of "ACGT")
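For readers who do not know Vyxal, the same steps can be sketched in Python. This is an illustrative reading of the explanation above, not a translation checked against the Vyxal interpreter; the helper names `to_custom_base` and `dna_custom_base` are hypothetical.

```python
def to_custom_base(n: int, alphabet: str) -> str:
    """Write n in base len(alphabet), using alphabet[i] as the digit for value i."""
    digits = ""
    while n:
        n, r = divmod(n, len(alphabet))
        digits = alphabet[r] + digits
    return digits


def dna_custom_base(s: str) -> str:
    # Charcodes, concatenated, to binary
    bits = format(int("".join(str(ord(c)) for c in s)), "b")
    if len(bits) % 2:        # if the length is odd
        bits += "0"          # append a 0
    # Back to a number, then to the custom base "ACGT"
    return to_custom_base(int(bits, 2), "ACGT")
```

Because the padded bit string has even length and starts with 1, the leading base-4 digit is never 0, so no leading A is lost in the conversion.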
Common Lisp (Lispworks), 415 bytes
(defun f(s)(labels((p(e f)(concatenate'string e f)))(let((b"")(d""))(dotimes(i(length s))(setf b(p b(write-to-string(char-int(elt s i))))))(setf b(write-to-string(parse-integer b):base 2))(if(oddp #1=(length b))(setf b(p b"0")))(do((j 0(+ j 2)))((= j #1#)d)(let((c(subseq b j(+ j 2))))(cond((#2=string="00"c)(setf d(p d"A")))((#2#"01"c)(setf d(p d"C")))((#2#"10"c)(setf d(p d"G")))((#2#"11"c)(setf d(p d"T")))))))))
ungolfed:
(defun f (s)
  (labels ((p (e f)
             (concatenate 'string e f)))
    (let ((b "") (d ""))
      (dotimes (i (length s))
        (setf b
              (p b
                 (write-to-string
                  (char-int (elt s i))))))
      (setf b (write-to-string (parse-integer b) :base 2))
      (if (oddp #1=(length b))
          (setf b (p b "0")))
      (do ((j 0 (+ j 2)))
          ((= j #1#) d)
        (let ((c (subseq b j (+ j 2))))
          (cond ((#2=string= "00" c)
                 (setf d (p d "A")))
                ((#2# "01" c)
                 (setf d (p d "C")))
                ((#2# "10" c)
                 (setf d (p d "G")))
                ((#2# "11" c)
                 (setf d (p d "T")))))))))
Usage:
CL-USER 2060 > (f "}")
"TTGG"
CL-USER 2061 > (f "golf")
"TAAAAATTATCCATAAATA"
Python 3, 130 bytes.
Saved 2 bytes thanks to vaultah.
Saved 6 bytes thanks to Kevin Lau - not Kenny.
I hate how hard it is to convert to binary in Python.
def f(x):c=bin(int(''.join(map(str,map(ord,x)))))[2:];return''.join('ACGT'[int(z+y,2)]for z,y in zip(*[iter(c+'0'*(len(c)%2))]*2))
Test cases:
assert f('codegolf') == 'GGCTTGCGGCCGGAGACGCGGTCTGACGCCTTGTAAATA'
assert f('ppcg') == 'GGCTAATTGTCGCACTT'
Hoon, 148 138 bytes
|*
*
=+
(scan (reel +< |=({a/@ b/tape} (weld <a> b))) dem)
`tape`(flop (turn (rip 1 (mul - +((mod (met 0 -) 2)))) |=(@ (snag +< "ACGT"))))
"abc" is a list of atoms. Interpolate them into strings (<a>) while folding over the list, joining them together into a new string. Parse the number with ++dem to get it back to an atom.
Multiply the number by (bit length % 2) + 1 to pad it: an odd bit length doubles the number, which appends a 0 bit. Use ++rip to disassemble the atom into a list of two-bit chunks, map over the list using each chunk as an index into the string "ACGT", and reverse the result with ++flop.
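The padding-by-multiplication step and the low-to-high disassembly can be sketched in Python. This is a rough analogue written from the explanation above, not a port of the Hoon code; the function name `dna_lsb_first` is hypothetical.

```python
def dna_lsb_first(s: str) -> str:
    n = int("".join(str(ord(c)) for c in s))
    # Multiply by 1 or 2: doubling appends a single 0 bit when the bit length is odd.
    n *= 1 + n.bit_length() % 2
    chunks = []
    while n:
        chunks.append(n % 4)   # take the lowest two bits first
        n //= 4
    chunks.reverse()           # most significant chunk first
    return "".join("ACGT"[c] for c in chunks)
```

Padding arithmetically means the bits never have to exist as a string, which is why the same idea shows up in several of the shorter answers.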
> =a |*
*
=+
(scan (reel +< |=({a/@ b/tape} (weld <a> b))) dem)
`tape`(flop (turn (rip 1 (mul - +((mod (met 0 -) 2)))) |=(@ (snag +< "ACGT"))))
> (a "codegolf")
"GGCTTGCGGCCGGAGACGCGGTCTGACGCCTTGTAAATA"
> (a "ppcg")
"GGCTAATTGTCGCACTT"
> (a "}")
"TTGG"
Perl 6, 57 + 1 (-p flag) = 58 bytes
$_=(+[~] .ords).base(2);s:g/..?/{<A G C T>[:2($/.flip)]}/
Step by step explanation:
-p flag makes the Perl 6 interpreter run the code once per input line: it puts the current line in $_, and at the end prints $_ back out.
.ords - If there is nothing before a period, the method is called on $_. The ords method returns the list of codepoints in a string.
[~] - [ ] is the reduction meta-operator; the operator to reduce with goes between the brackets. In this case it is ~, the string concatenation operator. For example, [~] 1, 2, 3 is equivalent to 1 ~ 2 ~ 3.
+ converts its argument to a number, which is needed because the base method is only defined for integers.
.base(2) - converts an integer to a string in base 2.
$_= - assigns the result to $_.
s:g/..?/{...}/ - a regex substitution that replaces every (:g, global mode) match of ..? (two characters, or one if only one is left). The second part is the replacement, which in this case is code (in Perl 6, curly brackets in strings and replacement patterns are executed as code).
$/ - the regex match variable.
.flip - reverses a string, implicitly converting $/ (a regex match object) to a string first. The reversal is needed because a lone final character 1 should be read as 10, not 01. Because of the flip, the array has G and C swapped.
:2(...) - parses a base-2 string into an integer.
<A G C T> - an array of four elements.
...[...] - the array access operator.
What does that mean? The program takes the list of all codepoints in the string, concatenates them, and converts the resulting number to base 2. Then it replaces every group of two characters (or a lone final one) with one of the letters A, G, C, T, chosen by the flipped binary value of the group.
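The substitution step has a close Python analogue using `re.sub` with a callback. The sketch below assumes the same pipeline as described in the list above; the function name `dna_flip` is hypothetical.

```python
import re


def dna_flip(s: str) -> str:
    bits = format(int("".join(str(ord(c)) for c in s)), "b")
    # '..?' matches two characters when available, otherwise a lone final one.
    # Reversing the chunk makes a lone '1' read as binary 1, and the reordered
    # table "AGCT" maps 1 to G, the letter for the padded pair '10'.
    return re.sub(r"..?", lambda m: "AGCT"[int(m.group()[::-1], 2)], bits)
```

No explicit padding is needed: the reversed reading of a lone bit gives the same value as reading the padded pair in the normal order with the table "ACGT".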
Perl, 155 148 137 + 1 (-p flag) = 138 bytes
#!perl -p
s/./ord$&/sge;while($_){/.$/;$s=$&%2 .$s;$t=$v="";$t.=$v+$_/2|0,$v=$_%2*5
for/./g;s/^0// if$_=$t}$_=$s;s/(.)(.)?/([A,C],[G,T])[$1][$2]/ge
Test it on Ideone.
J, 52 bytes
3 :'''ACGT''{~#._2,\#:".,&''x''":(,&:(":"0))/3&u:y'
Usage: 3 :'''ACGT''{~#._2,\#:".,&''x''":(,&:(":"0))/3&u:y' 'codegolf' ==> GGCTTGCGGCCGGAGACGCGGTCTGACGCCTTGTAAATA
Javascript ES7, 105 103 bytes
s=>((+[for(c of s)c.charCodeAt()].join``).toString(2)+'0').match(/../g).map(x=>"ACGT"['0b'+x-0]).join``
The ES7 part is the for(c of s) part.
ES6 version, 107 105 bytes
s=>((+[...s].map(c=>c.charCodeAt()).join``).toString(2)+'0').match(/../g).map(x=>"ACGT"['0b'+x-0]).join``
Ungolfed code
dna = (str)=>{
  var codes = +[for(c of str)c.charCodeAt()].join``;
  var binaries = (codes.toString(2)+'0').match(/../g);
  return binaries.map(x=>"ACGT"['0b'+x-0]).join``
}
This is my first try at golfing on PPCG, feel free to correct me if something's wrong.
Thanks @AlexA for the small improvement.
MATL, 21 bytes
'CGTA'joV4Y2HZa2e!XB)
Explanation
'CGTA' % Push string to be indexed into
j % Take input string
o % Convert each char to its ASCII code
V % Convert to string (*). Numbers are separated by spaces
4Y2 % Push the string '0123456789'
H % Push number 2
Za % Convert string (*) from base '0123456789' to base 2, ignoring spaces
2e % Reshape into a 2-column matrix, padding with a trailing 0 if needed
! % Transpose
XB % Convert from binary to decimal
) % Index into string with the DNA letters. Indexing is 1-based and modular
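The last step relies on 1-based modular indexing, which is why the lookup string is 'CGTA' rather than 'ACGT'. A Python sketch of that indexing rule follows; it imitates the behaviour described in the comments above and is not a port of the MATL program. The function name `dna_modular` is hypothetical.

```python
def dna_modular(s: str) -> str:
    bits = [int(b) for b in format(int("".join(str(ord(c)) for c in s)), "b")]
    bits += [0] * (len(bits) % 2)   # pad with a trailing 0, as the reshape step does
    values = [2 * hi + lo for hi, lo in zip(bits[0::2], bits[1::2])]
    table = "CGTA"
    # 1-based modular indexing: 1 -> C, 2 -> G, 3 -> T, and 0 wraps around to A.
    return "".join(table[(v - 1) % len(table)] for v in values)
```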
Python 2.7, 135 bytes
def f(A):g=''.join;B=bin(int(g(map(str,map(ord,A)))))[2:];B+=len(B)%2*'0';return g('ACGT'[int(B[i:i+2],2)] for i in range(len(B))[::2])
Ungolfed:
def f(A):
    g = ''.join
    B = bin(int(g(map(str,map(ord,A)))))[2:] # convert string input to binary
    B += len(B)%2 * '0' # add extra 0 if necessary
    return g('ACGT'[int(B[i:i+2],2)] for i in range(len(B))[::2]) # map every two characters into 'ACGT'
Output
f('codegolf')
'GGCTTGCGGCCGGAGACGCGGTCTGACGCCTTGTAAATA'
Python 2, 109 103 bytes
lambda s,j=''.join:j('ACGT'[int(j(t),2)]for t in
zip(*[iter(bin(int(j(`ord(c)`for c in s))*2)[2:])]*2))
Test it on Ideone.
Python 3, 126 bytes
lambda v:"".join(["ACGT"[int(x,2)]for x in map(''.join,zip(*[iter((bin(int("".join([str(ord(i))for i in v])))+"0")[2:])]*2))])
CJam, 24 23 bytes
Thanks to Dennis for saving 1 byte in a really clever way. :)
l:isi2b2/Wf%2fb"AGCT"f=
Explanation
Very direct implementation of the specification. The only interesting bit is the padding to an even number of bits (which was actually Dennis's idea). Instead of treating the digits in each pair in the usual order, we make the second bit the most significant one. That means that ending in a single bit is identical to appending a zero to it, so we don't have to append the zero at all.
l e# Read input.
:i e# Convert to character codes.
si e# Convert to flat string and back to integer.
2b e# Convert to binary.
2/ e# Split into pairs.
Wf% e# Reverse each pair.
2fb e# Convert each pair back from binary, to get a value in [0 1 2 3].
"AGCT"f= e# Select corresponding letter for each number.
Julia 0.4, 77 bytes
s->replace(bin(BigInt(join(int(s)))),r"..?",t->"AGCT"[1+int("0b"reverse(t))])
This anonymous function takes a character array as input and returns a string.
Groovy, 114 bytes
{s->'ACGT'[(new BigInteger(((Byte[])s).join())*2).toString(2).toList().collate(2)*.with{0.parseInt(it.join(),2)}]}
Explanation:
{s->
    'ACGT'[                              //access character from string
        (new BigInteger(                 //create Big Integer from string
            ((Byte[])s).join()           //split string to bytes and then join to string
        ) * 2)                           //multiply by 2 to add 0 at the end in binary
        .toString(2)                     //change to binary string
        .toList()                        //split to characters
        .collate(2)                      //group characters by two
        *.with{
            0.parseInt(it.join(),2)      //join every group and parse to decimal
        }
    ]
}
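The multiply-by-2 idea (always append one 0 bit, then keep only complete pairs) can be sketched in Python. This is an illustration of the idea only: the pairing below deliberately drops an unpaired final bit, and how Groovy's collate treats an incomplete last group is not examined here. The function name `dna_double` is hypothetical.

```python
def dna_double(s: str) -> str:
    n = int("".join(str(ord(c)) for c in s)) * 2   # always append one 0 bit
    bits = format(n, "b")
    # zip over one shared iterator yields only complete pairs, so when the
    # original bit length was already even the extra 0 is silently dropped.
    it = iter(bits)
    return "".join("ACGT"[int(a + b, 2)] for a, b in zip(it, it))
```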
Pyth, 23 bytes
sm@"AGCT"i_d2c.BsjkCMQ2
Explanation
Borrowing the trick from Dennis' Jelly answer.
sm@"AGCT"i_d2c.BsjkCMQ2
                   CMQ    convert each character to its byte value
                sjk       convert to a string and then to integer
              .B          convert to binary
             c        2   chop into pairs
 m        d               for each pair:
          _                 reverse it
         i  2               convert from binary to integer
  @"AGCT"                   index into "AGCT" with it
s                         join the string
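Why reversing each pair removes the need for padding can be checked exhaustively, since there are only six possible chunks: four full pairs and two lone final bits. The short Python check below is a sketch; the helper names `padded_letter` and `reversed_letter` are hypothetical.

```python
def padded_letter(chunk: str) -> str:
    """Spec order: pad a lone bit with '0', read MSB first, index 'ACGT'."""
    return "ACGT"[int(chunk.ljust(2, "0"), 2)]


def reversed_letter(chunk: str) -> str:
    """Reversed order: flip the chunk, read it as binary, index 'AGCT'."""
    return "AGCT"[int(chunk[::-1], 2)]


chunks = ["00", "01", "10", "11", "0", "1"]
assert all(padded_letter(c) == reversed_letter(c) for c in chunks)
```

Swapping C and G in the table compensates for the reversed bit order of full pairs, and a lone bit reads the same whether it is reversed or not.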
Ruby, 59 bytes
$_='%b0'.%$_.bytes*''
gsub(/../){:ACGT[$&.hex%7]}
chomp'0'
A full program. Run with the -p flag.
