Bytes  Lang                Time            Link
136    Python 3            251010T054539Z  Random D
114    Python              251010T182813Z  97.100.9
072    Red                 251010T071853Z  Galen Iv
318    Java                250929T153647Z  Mark
025    Vyxal 3             240214T185323Z  pacman25
067    JavaScript Node.js  240306T062244Z  l4m2
nan    Scala 2             240215T112740Z  138 Aspe
040    Perl 5 n            240214T200506Z  Xcali
057    MATL                160117T035119Z  Luis Men
185    Python 2            160116T031741Z  TanMath
062    Ruby                160116T230351Z  Flambino
069    JavaScript          160116T230344Z  Benjamin
030    Retina              160116T020205Z  Martin E
112    Haskell             160116T014918Z  nimi

Python 3, 143 145 136 bytes

Thanks to @97.100.97.109 for finding an error

Edit: Removed a few bytes. Inspired by @97.100.97.109's use of the walrus operator.

def x(y):
 z=y.find("AUG")
 if z<0:return[]
 return [a for i in range(z,len(y),3)if len(a:=y[i:i+3])==3 and y[i:i+3]not in"UAA UAG UGA"]

Ungolfed:

def parse(rna):
    rna_start = rna.find('AUG')
    codons = []

    if rna_start == -1:
        return []

    for i in range(rna_start, len(rna), 3):
        codon = rna[i:i+3]
        # Check the current codon; codons[-1] would fail on the empty list
        if codon in ('UAA', 'UAG', 'UGA'):
            return codons
        if len(codon) == 3:
            codons.append(codon)
    return codons

I essentially just combined the two conditionals and put them in a list comprehension. Other than that, I just renamed the variables and shortened a condition.

Python, 114 bytes

def x(y):
 for i in range(z:=y.find("AUG"),(len(y)-z)//3*3+z,3):
  if(c:=y[i:i+3])in"UAA UAG UGA":break
  print(c)

Attempt This Online!

Heavily modified version of @Random Dude's code which fixes the error causing incorrect outputs. This version outputs the codons to stdout as opposed to returning them. If you prefer your code to be functional rather than imperative, here's an alternative:

Python, 117 bytes

def x(y,q=0):z=y.find("AUG");return[c for i in range(z,(len(y)-z)//3*3+z,3)if(q:=((c:=y[i:i+3])in"UAA UAG UGA")+q)<1]

Attempt This Online!
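
For readers who find the walrus chain hard to follow, here is a rough ungolfed sketch of the same idea: a flag that latches once a stop codon has been seen, so every later codon is rejected. The names `codons_until_stop`, `STOP_CODONS` and `seen_stop` are mine, and the early return for a missing AUG is an assumption that the golfed version does not make.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def codons_until_stop(rna):
    start = rna.find("AUG")
    if start < 0:
        # Assumption: no start codon means no output.
        return []
    # Exclusive end index that covers only complete three-letter codons.
    end = start + (len(rna) - start) // 3 * 3
    seen_stop = False
    result = []
    for i in range(start, end, 3):
        codon = rna[i:i + 3]
        # Once this becomes True it stays True, like the running sum q.
        seen_stop = seen_stop or codon in STOP_CODONS
        if not seen_stop:
            result.append(codon)
    return result

print(codons_until_stop("ACAUGGAUGGACUGUAACCCCAUGC"))
```

The loop still visits every codon after the stop codon, just as the comprehension does; it only stops collecting them.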

Red, 72 bytes

func[b][parse b[collect[to"AUG"any[not["UAA"|"UAG"|"UGA"]keep 3 skip]]]]

Try it online!

Java, 318 bytes

String p(String v){int s=v.indexOf("AUG");List<String>g=new ArrayList<>();if(s==-1)return"";Matcher m=Pattern.compile(".{1,3}").matcher(v.substring(s));while(m.find()){if(m.group().length()<3||m.group().equals("UAA")||m.group().equals("UAG")||m.group().equals("UGA"))break;g.add(m.group());}return String.join(",",g);}
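
Ungolfed:

import java.util.*;
import java.util.regex.*;

String p(String v) {
    int s = v.indexOf("AUG");            // index of the start codon
    List<String> g = new ArrayList<>();
    if (s == -1) return "";              // no start codon: empty output
    // Walk the rest of the string in chunks of up to three characters.
    Matcher m = Pattern.compile(".{1,3}").matcher(v.substring(s));
    while (m.find()) {
        String c = m.group();
        // Stop at an incomplete trailing chunk or at a stop codon.
        if (c.length() < 3 || c.equals("UAA") || c.equals("UAG") || c.equals("UGA"))
            break;
        g.add(c);
    }
    return String.join(",", g);
}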

Vyxal 3, 25 bytes

"ᶠx„ẋİ⁻/3Ŀ:ƛ'u"\ᵇ„o+c]Ṙƒh

Try it Online!

There's my 26 25 byter

old explanation:

"ᶠx„ẋİ⁻/3Ŀ:ƛ'u"\ᵇ„o+=a]Ṙƒh
"ᶠx„ẋ                       # The first index of "aug"
     İ                      # Slice from here to the end
      ⁻                     # Split into parts of length 3
       /3Ŀ:                 # Keep only those whose length is 3 and duplicate
           ƛ        =a]     # for each codon, does it equal any of...
                  o         # overlapping pairs of...
              "\ᵇ„          # The string "aaga" --> ["aa", "ag", "ga"]
            'u              # with a "u" prepended to each ["uaa", "uag", "uga"]
                       Ṙƒh  # partition before truthy indices, take the first item.
💎

Created with the help of Luminespire.
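
The stop-codon trick in the explanation (overlapping pairs of a four-letter string, each with a "u" prepended) can be checked outside Vyxal. This is only a Python sketch of that one step, not a port of the answer; the function name and its parameters are made up for illustration, and I use uppercase letters to match the challenge input.

```python
def stop_codons_from_pairs(seed="AAGA", prefix="U"):
    # zip(seed, seed[1:]) yields the overlapping pairs AA, AG, GA.
    return [prefix + a + b for a, b in zip(seed, seed[1:])]

print(stop_codons_from_pairs())
```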

JavaScript (Node.js), 67 bytes

s=>[/AUG(...)*?(?=UA[AG]|UGA|.?.?$)|$/.exec(s)[0].match(/.../g)]+''

Try it online!

Scala 2, 170 156 bytes

A port of @nimi's Haskell answer in Scala.

Saved 14 bytes thanks to @pacman256


Golfed version. Attempt This Online!

s=>{val q=s.indexOf("AUG");if(q>=0){val c=s.drop(q).grouped(3).toSeq;c.takeWhile(c=>c.size==3&& !Seq("UAA","UAG","UGA").contains(c))}else Seq.empty[String]}

Ungolfed version. Attempt This Online!

object RNASequenceProcessor {

  def main(args: Array[String]): Unit = {
    val rnaSequence = "AUGCUUAUGAAUGGCAUGUACUAAUAGACUCACUUAAGCGGUGAUGAA"
    val codingRegion = findCodingRegion(rnaSequence)
    println(codingRegion.mkString("[", ",", "]"))
  }

  def findCodingRegion(sequence: String): Seq[String] = {
    // Find the start of the coding region (first occurrence of "AUG")
    val startOfCoding = sequence.indexOf("AUG")
    if (startOfCoding != -1) {
      // Extract the sequence from "AUG" onward
      val codingSequence = sequence.drop(startOfCoding)
      
      // Split the sequence into codons (chunks of 3 nucleotides)
      val codons = codingSequence.grouped(3).toSeq
      
      // Take codons until a stop codon is encountered or the codon length is less than 3
      codons.takeWhile(codon => codon.length == 3 && !Seq("UAA", "UAG", "UGA").contains(codon))
    } else {
      Seq.empty[String] // Return an empty sequence if "AUG" is not found
    }
  }
}

Perl 5 -n, 40 bytes

map/UAA|UAG|UGA/?last:say,/AUG|\B\G.../g

Try it online!

MATL, 57 bytes

j'AUG(...)*?(?=(UAA|UAG|UGA|.?.?$))'XXtn?1X)tnt3\-:)3[]e!

This uses the current version (9.3.1) of the language/compiler.

Input and output are through stdin and stdout. The output is separated by linebreaks.

Example

>> matl
 > j'AUG(...)*?(?=(UAA|UAG|UGA|.?.?$))'XXtn?1X)tnt3\-:)3[]e!
 >
> ACAUGGAUGGACUGUAACCCCAUGC
AUG
GAU
GGA
CUG

EDIT (June 12, 2016): to adapt to changes in the language, [] should be removed. The link below includes that modification.

Try it online!

Explanation

The code is based on the regular expression

AUG(...)*?(?=(UAA|UAG|UGA|.?.?$))

This matches substrings starting with AUG, containing groups of three characters (...) and ending in either UAA, UAG, or UGA; or ending at the end of the string, and in this case there may be one last incomplete group (.?.?$). Lookahead ((?=...)) is used so that the stop codons are not part of the match. The matching is lazy (*?) in order to finish at the first stop codon found, if any.

j                                     % input string
'AUG(...)*?(?=(UAA|UAG|UGA|.?.?$))'   % regex
XX                                    % apply it. Push cell array of matched substrings
tn?                                   % if non-empty
1X)                                   % get first substring
tnt3\-:)                              % make length the largest possible multiple of 3
3[]e!                                 % reshape into rows of 3 columns
                                      % implicit endif
                                      % implicit display
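
The regular expression described above is not specific to MATL. As a rough check, here is the same pattern run through Python's `re` module; the function name `codons_by_regex` is hypothetical, and returning an empty list when nothing matches is my assumption about the desired behaviour.

```python
import re

# Same pattern as in the MATL answer: lazy groups of three after AUG,
# ending just before a stop codon or a short tail at the end of the input.
PATTERN = re.compile(r"AUG(...)*?(?=(UAA|UAG|UGA|.?.?$))")

def codons_by_regex(rna):
    match = PATTERN.search(rna)
    if match is None:
        # Assumption: no AUG means no codons.
        return []
    # Cut the matched region into three-letter pieces.
    return re.findall(r"...", match.group(0))

print(codons_by_regex("ACAUGGAUGGACUGUAACCCCAUGC"))
```

On the example input from this answer it yields the same four codons as the MATL session.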

Python 2, 185 bytes

i=input()
o=[]
if i.find('AUG')>=0:i=map(''.join,zip(*[iter(i[i.find('AUG'):])]*3))
else:print "";exit()
for j in i:
 if j not in['UGA','UAA','UAG']:o+=[j]
 else:break
print ','.join(o)

Explanation: Set i to the input. Take the substring from 'AUG' to the end. Split it into strings of three. Check each one for a stop codon, and cut there.
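
The "split into strings of three" step relies on the `zip(*[iter(...)]*3)` idiom, which can look opaque. A small Python 3 sketch of just that step follows; `chunk_complete` is a name I made up, and the rest of the answer is not reproduced here.

```python
def chunk_complete(seq, size=3):
    # One shared iterator, repeated `size` times: zip pulls `size`
    # consecutive items per tuple and drops an incomplete final group.
    it = iter(seq)
    return ["".join(group) for group in zip(*[it] * size)]

print(chunk_complete("AUGGAUGG"))
```

Because `zip` stops at the shortest input, a trailing one- or two-letter fragment is discarded automatically.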

Try it here

Ruby, 97 95 78 75 62 bytes

->(r){r.scan(/AUG|\B\G.../).join(?,).sub(/,U(AA|AG|GA).*/,'')}

I don't golf much, so I'm sure it can be improved.

Edit: Borrowed Martin Büttner's excellent \B\G trick.

JavaScript, 88 82 70 69 chars

s=>/AUG(...)+?(?=(U(AA|AG|GA)|$))/.exec(s)[0].match(/.../g).join(",")

Usage Example:

(s=>/AUG(...)+?(?=(U(AA|AG|GA)|$))/.exec(s)[0].match(/.../g).join(","))("ACAUGGAUGGACUGUAACCCCAUGC")

Retina, 39 38 32 30 bytes

M!`AUG|\B\G...
U(AA|AG|GA)\D*

The trailing linefeed is significant.

Output as a linefeed-separated list.

Try it online.

Explanation

M!`AUG|\B\G...

This is a match stage which turns the input into a linefeed-separated list of all matches (due to the !). The regex itself matches every codon starting from the first AUG. We achieve this with two separate options. AUG matches unconditionally, so that it can start the list of matches. The second match can be any codon (... matches any three characters), but the \G is a special anchor which ensures that this can only match right after another match. The only problem is that \G also matches at the beginning of the string, which we don't want. Since the input consists only of word characters, we use \B (any position that is not a word boundary) to ensure that this match is not used at the beginning of the input.

U(AA|AG|GA)\D*

This finds the first stop codon, matched as U(AA|AG|GA) as well as everything after it and removes it from the string. Since the first stage split the codons into separate lines, we know that this match is properly aligned with the start codon. We use \D (non-digits) to match any character, since . wouldn't go past the linefeeds, and the input won't contain digits.
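
To experiment with the two stages outside Retina, here is a rough Python sketch. Python's `re` module has no \G anchor, so the first stage is emulated with an anchored `match` at a moving position; that substitution, and the function names, are mine rather than part of the answer. The second stage uses the same regex as above.

```python
import re

def match_stage(rna):
    # Emulates M!`AUG|\B\G... : start at the first AUG, then keep taking
    # three characters, each match beginning where the previous one ended.
    start = rna.find("AUG")
    if start < 0:
        return ""
    codon = re.compile(r"...")
    pieces, pos = [], start
    while True:
        m = codon.match(rna, pos)
        if m is None:
            break
        pieces.append(m.group())
        pos = m.end()
    return "\n".join(pieces)

def replace_stage(listing):
    # Same regex as the second stage: remove the first stop codon and
    # everything after it (\D also matches the linefeeds).
    return re.sub(r"U(AA|AG|GA)\D*", "", listing)

def retina_like(rna):
    return replace_stage(match_stage(rna)).split()

print(retina_like("ACAUGGAUGGACUGUAACCCCAUGC"))
```

Since each codon sits on its own line after the first stage, U(AA|AG|GA) cannot match across two codons, which is the alignment argument made above.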

Haskell, 115 112 bytes

import Data.Lists
fst.break(\e->elem e["UAA","UAG","UGA"]||length e<3).chunksOf 3.snd.spanList((/="AUG").take 3)

Usage example:

*Main> ( fst.break(\e->elem e["UAA","UAG","UGA"]||length e<3).chunksOf 3.snd.spanList((/="AUG").take 3) ) "AUGCUUAUGAAUGGCAUGUACUAAUAGACUCACUUAAGCGGUGAUGAA"
["AUG","CUU","AUG","AAU","GGC","AUG","UAC"]

How it works:

                spanList((/="AUG").take 3)  -- split input at the first "AUG"
             snd                            -- take 2nd part ("AUG" + rest)
     chunksOf 3                             -- split into 3 element lists
fst.break(\e->                              -- take elements from this list
           elem e["UAA","UAG","UGA"]||      -- as long as we don't see end codons
           length e<3)                      -- or run out of full codons
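
For comparison, the same pipeline (split at the first AUG, chunk into threes, take while no stop codon and the chunk is complete) can be sketched in Python with `itertools.takewhile`. This is an illustration of the steps above rather than a golfed port; `coding_region` and `STOP_CODONS` are names I chose.

```python
from itertools import takewhile

STOP_CODONS = ("UAA", "UAG", "UGA")

def coding_region(rna):
    start = rna.find("AUG")
    if start < 0:
        # Nothing after the split, so nothing to chunk.
        return []
    rest = rna[start:]
    # Chunks of three; the last one may be shorter.
    chunks = [rest[i:i + 3] for i in range(0, len(rest), 3)]
    # Keep chunks until a stop codon or an incomplete chunk appears.
    return list(takewhile(lambda c: c not in STOP_CODONS and len(c) == 3, chunks))

print(coding_region("AUGCUUAUGAAUGGCAUGUACUAAUAGACUCACUUAAGCGGUGAUGAA"))
```

On the usage example above this gives the same seven codons as the Haskell session.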