| Bytes | Lang | Time | Link |
|---|---|---|---|
| 560 | LÖVE2D | 170130T002300Z | ATaco |
| 316 | Bash+ImageMagick+tesseract | 140307T014126Z | Digital |
| n/a | Python | 140306T081257Z | grovesNL |
| 085 | APL | 140306T081156Z | marinus |
| 089 | JavaScript ES6 | 140306T113147Z | Florent |
LÖVE2D, 560 Bytes
t=...;g=love.graphics g.setNewFont(124)g.setBackgroundColor(255,255,255)A=g.newCanvas()B=g.newCanvas()x=1 y=1 g.setColor(255,255,255)g.setCanvas(B)g.clear(0,0,0)for i=1,#t do x=x+1 if t:sub(i,i)=="\n"then x=1 y=y+1 end if t:sub(i,i)=="*"then g.rectangle("fill",x*16,y*16,16,16)end end u=B:newImageData()g.setCanvas(A)S={}for i=0,9 do g.clear(0,0,0,0)g.print(i,48,0)r=A:newImageData()s={i=i,s=0}for x=0,16*8 do for y=0,16*8 do a=u:getPixel(x,y)b=r:getPixel(x,y)s.s=s.s+math.abs(a-b)end end S[i+1]=s end table.sort(S,function(a,b)return a.s<b.s end)print(S[1].i)
This first draws a blocky rendering of the input text. Then, for each digit 0-9, it renders that digit, sums the pixel differences against the blocky rendering, and prints the digit with the smallest difference. Very basic OCR. It matches all the test cases and performs reasonably well with mutations.
Call with:
love.exe "" "INPUT"
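The comparison step can be sketched without a graphics library. The following is a minimal Python sketch, not the author's code: it assumes plain 0/1 grids instead of rendered pixels, and the names `to_grid`, `difference`, `closest_template` and the tiny 3x3 `TEMPLATES` are hypothetical, chosen only for illustration.

```python
def to_grid(text, width, height):
    """Turn ASCII art into a height x width grid of 0/1 cells ('*' is 1)."""
    lines = text.split("\n")
    grid = []
    for y in range(height):
        row = lines[y] if y < len(lines) else ""
        grid.append([1 if x < len(row) and row[x] == "*" else 0
                     for x in range(width)])
    return grid


def difference(a, b):
    """Sum of absolute cell differences between two equally sized grids."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def closest_template(text, templates, width, height):
    """Return the key of the template with the smallest difference."""
    grid = to_grid(text, width, height)
    return min(
        templates,
        key=lambda k: difference(grid, to_grid(templates[k], width, height)),
    )


# Hypothetical 3x3 templates, only for illustration (not real digit shapes).
TEMPLATES = {
    "bar": " * \n * \n * ",
    "box": "***\n* *\n***",
}

# One cell differs from "bar", six cells differ from "box".
print(closest_template(" * \n * \n** ", TEMPLATES, 3, 3))  # prints "bar"
```

Picking the minimum of a summed absolute difference mirrors the `table.sort` on `s.s` in the golfed code, which also takes the entry with the smallest score.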
Bash+ImageMagick+tesseract, 316 chars
Here's a stab at an OCR solution. It's not very accurate though, even when telling tesseract that we have just one char and that it is a digit. Moderately golfed, but still somewhat readable:
w=0
c()((w=${#2}>w?${#2}:w))
mapfile -c1 -Cc -t l
h=${#l[@]}
{
echo "# ImageMagick pixel enumeration: $w,$h,1,gray"
for y in ${!l[@]};{
for((x=0;x<w;x++));{
[ "${l[$y]:$x:1}" != " " ]
echo "$x,$y: ($?,$?,$?)"
}
}
}|convert txt:- i.png
tesseract i.png o -psm 10 <(echo "tessedit_char_whitelist 0123456789")
cat o.txt
The script takes input from stdin, so we can pipe from the test script.
Note I have put tee >( cat 1>&2 ) in the pipeline just so we can see what the test script actually generated.
Example output (This was a pretty good run with only 1 incorrect char out of 6):
$ python ./asciitest.py | tee >(cat 1>&2 ) | ./scanascii.sh
***
** **
* **
** *
****
***
****
Tesseract Open Source OCR Engine v3.02 with Leptonica
9
$ python ./asciitest.py | tee >(cat 1>&2 ) | ./scanascii.sh
*
*** *
*
*
*
*
*****
Tesseract Open Source OCR Engine v3.02 with Leptonica
1
$ python ./asciitest.py | tee >(cat 1>&2 ) | ./scanascii.sh
***
** **
** **
** **
** **
* **
***
Tesseract Open Source OCR Engine v3.02 with Leptonica
0
$ python ./asciitest.py | tee >(cat 1>&2 ) | ./scanascii.sh
*****
**
****
*
*
** *
***
Tesseract Open Source OCR Engine v3.02 with Leptonica
5
$ python ./asciitest.py | tee >(cat 1>&2 ) | ./scanascii.sh
****
**
*****
* *
*** ***
** **
****
Tesseract Open Source OCR Engine v3.02 with Leptonica
5
$ python ./asciitest.py | tee >(cat 1>&2 ) | ./scanascii.sh
***
* **
*
**
***
**
******
Tesseract Open Source OCR Engine v3.02 with Leptonica
2
$
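For readers who find the golfed Bash hard to follow, the text image that the script pipes into `convert txt:-` can be sketched in Python. This is an illustrative sketch only: the function name `to_pixel_enumeration` is hypothetical, and padding short rows with background is an assumption here that may differ from how the script treats rows shorter than the widest one.

```python
def to_pixel_enumeration(lines):
    """Build an ImageMagick text image from rows of ASCII art."""
    width = max((len(line) for line in lines), default=0)
    height = len(lines)
    # Header as in the script: width, height, max value 1, gray colorspace.
    out = ["# ImageMagick pixel enumeration: %d,%d,1,gray" % (width, height)]
    for y, line in enumerate(lines):
        padded = line.ljust(width)  # assumption: pad short rows with spaces
        for x, ch in enumerate(padded):
            # A space becomes 1 (white); any other character becomes 0 (black).
            v = 1 if ch == " " else 0
            out.append("%d,%d: (%d,%d,%d)" % (x, y, v, v, v))
    return "\n".join(out) + "\n"


print(to_pixel_enumeration([" *", "**"]), end="")
```

In the script the pixel value comes from the exit status of the `[ ... != " " ]` test, which is 0 for a non-space character, so the asterisks end up black on a white background.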
Python
I'm sure there will be OCR solutions, but the probability of mine being accurate is much higher.
import difflib as x;r=range;s='2***3**1**1**3****3****3**1**1**3***23*4***6*6*6*6*4*****12***3*2**6*5**4**4**4******2***3*2**6*3***7*2*2**3***23**4***3*1**2*2**2******4**5**21*****2**5****7*6*2*3*3***22****2**5*****2*3*2**2**1**2*3****11*****5**5**4**5**4**4**42****2**2**1**2**2****2**2**1**2**2****12***3**1**1**3**1**2*3****5**2****2'
for c in r(8):s=s.replace(str(c),' '*c)
s=map(''.join,zip(*[iter(s)]*7));a=[raw_input("") for i in r(7)];l=[[x.SequenceMatcher('','|'.join(a),'|'.join(s[i*7:(i+1)*7])).ratio()] for i in r(10)];print l.index(max(l))
Input one line of text at a time.
Not sure of a better way to deal with the asterisks without increasing the character count.
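The two ideas in the golfed code, expanding digits into runs of spaces and scoring candidates with `difflib.SequenceMatcher`, can be sketched in ungolfed Python 3. This is a hedged illustration, not a port of the answer: the helper names `expand` and `best_match` and the two 3x3 templates are hypothetical, and the real answer uses 7x7 digit shapes.

```python
import difflib


def expand(packed):
    """Replace each digit 0-7 with that many spaces, as the golfed loop does."""
    for count in range(8):
        packed = packed.replace(str(count), " " * count)
    return packed


def best_match(rows, templates):
    """Index of the template whose '|'-joined rows best match the input."""
    joined = "|".join(rows)
    scores = [
        difflib.SequenceMatcher(None, joined, "|".join(t)).ratio()
        for t in templates
    ]
    return scores.index(max(scores))


# Two hypothetical 3x3 templates stored in the same packed form.
packed = "1*11*11*1" + "****1****"
flat = expand(packed)
templates = [[flat[i + j:i + j + 3] for j in (0, 3, 6)] for i in (0, 9)]

print(templates)  # [[' * ', ' * ', ' * '], ['***', '* *', '***']]
print(best_match([" * ", " * ", "** "], templates))  # prints 0
```

Because `ratio()` measures similarity rather than exact equality, an input with a few changed cells can still score highest against its own template.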
APL (85)
1-⍨⊃⍒(,↑{7↑'*'=⍞}¨⍳7)∘(+.=)¨{49↑,(16/2)⊤⎕UCS⍵}¨↓10 3⍴'嵝䍝뫂傁ဣ␋䠁䊫낫䢝䊅넂垵僡ᑨ嘙쐅嘹䜝䪀슪퀪岹亝尵䌧뮢'
Explanation:
Each possible ASCII number is encoded in 48 bits. (The 49th bit is always zero anyway). The string 嵝䍝뫂傁ဣ␋䠁䊫낫䢝䊅넂垵僡ᑨ嘙쐅嘹䜝䪀슪퀪岹亝尵䌧뮢 has three characters per ASCII number, each of which encodes 16 bits.
- ↓10 3⍴: split the data string into 10 three-character groups, each of which encodes a digit.
- {...}¨: for each of the groups:
  - (16/2)⊤⎕UCS⍵: get the first 16 bits of each of the three characters
  - ,: concatenate the bit arrays into one array
  - 49↑: take the first 49 elements. There are only 48, so this is equivalent to adding a 0 at the end.

- ,↑{7↑'*'=⍞}¨⍳7: read 7 lines of 7 characters from the keyboard, make a bit array for each line where 1 means the character was a *, and join them together.
- (+.=)¨: for each possible digit, calculate how many bits the input has in common with the digit.
- ⍒: get the indices for a downwards sort of that list, so that the first item in the result is the index of the largest number in the previous list.
- ⊃: take the first item, which is the index of the digit.
- 1-⍨: subtract one, because APL indices are 1-based.
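The packing and matching described above can be sketched in Python. This is an illustration under stated assumptions rather than a translation of the APL: the function names are hypothetical, the templates are made-up bit patterns instead of the real digit data, and `classify` returns a 0-based index directly, so no final subtraction is needed.

```python
def pack(bits):
    """Pack the first 48 of 49 glyph bits into three 16-bit integers."""
    words = []
    for i in range(0, 48, 16):
        word = 0
        for b in bits[i:i + 16]:
            word = (word << 1) | b  # most significant bit first
        words.append(word)
    return words


def unpack(words):
    """Expand three 16-bit integers to 49 bits; the 49th is padded with 0."""
    bits = []
    for word in words:
        bits.extend((word >> shift) & 1 for shift in range(15, -1, -1))
    return bits + [0]


def matching_bits(a, b):
    """Count positions where two bit lists agree (the role of +.= above)."""
    return sum(1 for p, q in zip(a, b) if p == q)


def classify(glyph_bits, packed_templates):
    """Index of the packed template sharing the most bits with the input."""
    scores = [matching_bits(glyph_bits, unpack(w)) for w in packed_templates]
    return scores.index(max(scores))


# Two hypothetical 49-bit templates: alternating bits and all zeros.
striped = [1, 0] * 24 + [0]
blank = [0] * 49
packed_templates = [pack(striped), pack(blank)]

noisy = list(striped)
noisy[0] = 0  # flip one bit of the striped template
print(classify(noisy, packed_templates))  # prints 0
```

Counting agreeing bits means a glyph with a few flipped cells still lands on its nearest template, which is why this approach tolerates small mutations.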
JavaScript (ES6), 89
f=n=>(a=1,[a=(a+a^c.charCodeAt())%35 for(c of n)],[4,25,5,16,0,11,32,13,10,1].indexOf(a))
Usage:
> f(" *** \n * ** \n * \n ** \n ** \n ** \n ******")
2
Un-golfed version:
f = (n) => (
  // Initialize the digit's hash.
  a=1,
  // Hash the digit.
  // 35 is used because the resulting hash is unique for the first ten digits.
  // Moreover, it generates 4 1-digit hashes.
  [a = (a + a ^ c.charCodeAt()) % 35 for(c of n)],
  // Compare the hash to pre-computed digit hash.
  // The matching hash index is the digit.
  [4,25,5,16,0,11,32,13,10,1].indexOf(a)
)
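The same hash can be sketched in Python for readers without an engine that accepts the syntax above. This is a hedged port: the names `digit_hash`, `classify` and `DIGIT_HASHES` are hypothetical, the table is copied from the answer, and `ord` is assumed to agree with `charCodeAt` because the input is plain ASCII. The exact spacing of the test glyphs is not reproduced here, so no digit results are claimed.

```python
# Table copied from the JavaScript answer; position in the list is the digit.
DIGIT_HASHES = [4, 25, 5, 16, 0, 11, 32, 13, 10, 1]


def digit_hash(text):
    """Fold every character into a running hash modulo 35."""
    a = 1
    for ch in text:
        # Addition binds tighter than XOR, so this is (a + a) ^ code.
        a = ((a + a) ^ ord(ch)) % 35
    return a


def classify(text):
    """Return the digit whose stored hash matches, or -1 if none does."""
    h = digit_hash(text)
    return DIGIT_HASHES.index(h) if h in DIGIT_HASHES else -1


print(digit_hash("*"))  # (1 + 1) ^ 42 = 40, and 40 % 35 = 5
```

Unlike the template-matching answers, an exact hash lookup has no notion of closeness: any input that differs from the ten known glyphs either maps to an unrelated digit or to -1.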