| Bytes | Lang | Time | Author |
|---|---|---|---|
| 429 | JavaScript (Node.js) | 250828T043552Z | Arnauld |
| 386 | sqlite3 | 250830T185948Z | DPenner1 |
| 433 | Python 3 | 250829T014821Z | Ted |
| 222 | Jelly | 250831T172333Z | Jonathan |
| 269 | Charcoal | 250828T095410Z | Neil |
JavaScript (Node.js), 429 bytes
Outputs the table in Simplified Chinese.
f=(k=j=129,a=(B=Buffer)("gQOYACg+A/Hnsp58QAaDBwp0Fd6nvqQA3G8CqmBJssMw3zRyIkte3UHTqAD60gIeEDDydwlIvdokfodHwmNkAIiLnCKJXNxCSB33d61Nl8e8oPSizhDFCnl3KRYIJ+eEBq+3VyuUAztrF1OvH/YRUnTjsbSRvbY/VbpRt20TfrOvu4qzm7a8rae3JoyrYiwqmjrbRCKTLO4NL9upe36t8/2bGHYGkFyVpIQCq6ywsbqgoaKjh5KTlJW/","base64"))=>k--?f(k)+(n=a[k]*4^a[552+k>>2]>>k%4*2,n<9?n?' '.repeat(n*6|8):`
`:B(["553339999"[i=n&15]*5%31^233,a[i+171],n/16+128,...i<2?[a[j++]]:[]])):""
Encoding
For each element name, the Unicode code point is broken down into:
- a 2-byte prefix
- a 3rd byte in the range \$[128\dots191]\$
- a 4th byte in the same range if the UTF-8 encoding starts with 0xF0
There are 15 distinct prefixes.
Hence the following 10-bit encoding:
vvvvvvpppp
\____/\__/
| |
| '--> prefix identifier
'-------> value of 3rd byte, minus 128
The upper 8 bits and the lower 2 bits of these 10-bit values are packed into two separate byte streams.
Special values lower than 9 are used for line-feeds and repeated whitespace.
If the prefix identifier is either 0 or 1, a 4th byte is taken from a separate lookup table.
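A minimal Python sketch of this packing, with hypothetical helper names (`pack`, `split_streams`); it illustrates the stream layout only and ignores the extra XOR mixing the actual encoder applies to the 8-bit stream:

```python
# 10-bit value: upper 6 bits = third byte minus 128, lower 4 bits = prefix id
def pack(third_byte, prefix_id):
    assert 128 <= third_byte <= 191 and 0 <= prefix_id < 15
    return (third_byte - 128) << 4 | prefix_id

# Split 10-bit values into the two byte streams described above:
# the upper 8 bits go to one stream, and the lower 2 bits of four
# consecutive values are packed into a single byte of the other stream.
def split_streams(values):
    upper = [v >> 2 for v in values]
    lower = [sum((values[i + j] & 3) << (2 * j)
                 for j in range(min(4, len(values) - i)))
             for i in range(0, len(values), 4)]
    return upper, lower

v = pack(0x96, 7)                       # third byte 0x96, prefix id 7
upper, lower = split_streams([v, v, v, v])
```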
Byte array
All data streams are concatenated into a single byte array, which is encoded in Base64:
| Offset | Size | Description | Pointer variable |
|---|---|---|---|
| 0 | 129 | the upper 8 bits of the 10-bit values | k |
| 129 | 9 | the fourth bytes | j |
| 138 | 33 | the lower 2 bits of the 10-bit values | k |
| 171 | 15 | the 2nd bytes of the prefixes | i |
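A sketch of slicing the decoded array by these offsets (Python, with a hypothetical `slice_streams` helper and dummy data of the right total length):

```python
# Offsets and sizes from the table above: 129 + 9 + 33 + 15 = 186 bytes total
def slice_streams(data):
    assert len(data) == 186
    return {
        "upper8": data[0:129],    # upper 8 bits of the 10-bit values (k)
        "byte4":  data[129:138],  # the fourth bytes (j)
        "lower2": data[138:171],  # lower 2 bits, four per byte (k)
        "byte2":  data[171:186],  # 2nd bytes of the prefixes (i)
    }

parts = slice_streams(bytes(range(186)))  # dummy data, not the real stream
```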
First bytes of the prefixes
The first bytes of the prefixes are not stored in the byte array. They're computed with a dedicated and slightly shorter formula instead:
"553339999"[i]*5%31^233
5, 3 and 9 are converted to 0xF0 (240), 0xE6 (230) and 0xE7 (231) respectively. For \$9\le i\le14\$, the lookup falls off the end of the string and the expression defaults to 0xE9 (233).
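The same formula, transcribed to Python (the out-of-range branch is made explicit here; in JavaScript the lookup yields NaN, and NaN ^ 233 is 233):

```python
# First byte of each prefix, from the digit string in the formula above.
def first_byte(i):
    s = "553339999"
    if i < len(s):
        return int(s[i]) * 5 % 31 ^ 233
    return 233  # in JS: undefined * 5 % 31 is NaN, and NaN ^ 233 == 233

assert first_byte(0) == 0xF0  # '5' -> 240
assert first_byte(2) == 0xE6  # '3' -> 230
assert first_byte(5) == 0xE7  # '9' -> 231
assert first_byte(12) == 0xE9 # default for out-of-range indices
```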
Encoder
This code generates the Base64 data string, along with the offsets and sizes of each part.
const A = [...
  "氢 氦\n" +
  "锂铍 硼碳氮氧氟氖\n" +
  "钠镁 铝硅磷硫氯氩\n" +
  "钾钙 钪钛钒铬锰铁钴镍铜锌镓锗砷硒溴氪\n" +
  "铷锶 钇锆铌钼锝钌铑钯银镉铟锡锑碲碘氙\n" +
  "铯钡镧铈镨钕钷钐铕钆铽镝钬铒铥镱镥铪钽钨铼锇铱铂金汞铊铅铋钋砹氡\n" +
  "钫镭锕钍镤铀镎钚镅锔锫锎锿镄钔锘铹𬬻𬭊𬭳𬭛𬭶鿏𫟼𬬭鿔鿭𫓧镆𫟷鿬鿫"
];

// collect the distinct UTF-8 prefixes
const pfxSet = new Set;
for(const c of A) {
  if(c != ' ' && c != '\n') {
    pfxSet.add(getPrefixKey(Buffer.from(c)));
  }
}
const pfx = [...pfxSet].sort();

// build the 10-bit values and the stream of 4th bytes
const _10bit = [];
const byte4 = [];
for(let i = 0; i < A.length; i++) {
  const c = A[i];
  if(c == ' ') {
    let j = i;
    while(A[i + 1] == ' ') {
      i++;
    }
    _10bit.push(({ 14: 1, 24: 4, 30: 5 })[i + 1 - j]);
  }
  else if(c == '\n') {
    _10bit.push(0);
  }
  else {
    const a = Buffer.from(c);
    const pfxId = pfx.indexOf(getPrefixKey(a));
    _10bit.push(pfxId | a[2] - 128 << 4);
    if(pfxId < 2) {
      byte4.push(a[3]);
    }
  }
}

// pack the lower 2 bits (4 values per byte) and the upper 8 bits
const _2bit = _10bit.flatMap((_, i) =>
  i & 3 ? [] : [ 3, 2, 1, 0 ].reduce((p, v) => p << 2 | _10bit[i + v] & 3, 0)
);
const _8bit = _10bit.map((n, i) => n >> 2 ^ _2bit[i >> 2] >> i % 4 * 2 + 2);
const byte2 = pfx.map(s => +s.split("/")[2]);

// concatenate all streams and report offsets / sizes
const data = [ _8bit, byte4, _2bit, byte2 ];
let ptr = 0;
data.forEach((arr, i) => {
  console.log(
    [ "8-bit", "4th bytes", "2-bit", "2nd bytes" ][i].padEnd(9) + ": " +
    "offset = " + ptr.toString().padStart(3) + ", " +
    "size = " + arr.length.toString().padStart(3)
  );
  ptr += arr.length;
});
console.log("\n" + Buffer.from(data.flat()).toString("base64"));

function getPrefixKey(a) {
  return [ a.length == 4 ? 0 : 1, a[0], a[1] ].join("/");
}
sqlite3, 386 bytes
Edit: New solution. It turns out it's hard to beat pre-existing compression (though the CLI's built-in zipfile was beatable). I've left my previous custom solution below.
Simplified Chinese table with double ASCII spaces. Uses the Brotli compression from the sqlite-compressions extension (v0.3.7); place libsqlite_compressions.so in the relevant library path.
.load libsqlite_compressions
SELECT brotli_decode(base85('.fofES<?GrK&%y$?Eu7B5O<+#[1JPi7\P?9Ky7hCrcLVMoC$Zt18`x1DzZb/2_qe;Jr2,L*xLeo9UVzW+-E?]rZ=1De;0iyoMOmW+*k+Y6uk<xFiPw1TEcj/vb/mTSOsx&]tRan#WZ-b&0W7Mu=V=1]h7d?Dxk;eHc>JS;Z2r84B1;c$$<]bu8?,S/<>J]WE_5Ss$IUNQH*5sXztnj*\x[Qnl60n+a.9B`dTOGECzFTnB[=tSGNh[Qe*f53MGX928OmAF:MbZS/&m049;2xA[o&^Qyd]/vqd,:u*^_5B&DV]k1E2^ao27gSzSCRHwCP;>T6a'));
(Edit: I'm not a regular; it took me a bit to understand the general rules on using external libraries and byte counting. I think I've got it now, but let me know if I've done anything wrong.)
sqlite3 - custom compression solution - 497 494 482 479 bytes
Traditional Chinese table with ideographic spaces. Uses the regexp extension; place regexp.so in the relevant library path.
.load regexp
PRAGMA encoding=utf16;SELECT unhex(regexp_replace(hex(base85(format('6>ce^%.53c/O^gHU$T=t<a%.41ciqQX4n*On5q*02VEb.?%.43czWwEe716Er,&Yl5:J%.26cEOQrpslhfdgnE35slYO`?h8r1ubnV\xYg,Kt[QG%.22c*K:rm9XvH$QqW#AAwEVq>Vj.r_fzP1Qsj\zVR-,0D$:TbDnw\iV[qI*X,6YwLx,[FOhCH/psbO^f-yi:GqW,xZObTuJkc&BrGyE96jkJG`rP9\L]2-LYxP&s1]zGz.D=Qba#l&>&&wq@w>_RPu0X8_5Jm=s2Llm3%%%%8mi^9PmNC11mghx3UV;tZUF&/fU5G0vU8jm1$p','#','#','#','#','#'))),'88.*A|(\w)(\w\w)(?=.*88.*\1(\w\w)A)','${1:+$2$3}'));
Explanation
I thought a UTF-16(LE) based approach might be more efficient for Chinese characters. The Traditional Chinese table's UTF-16 code units have 16 distinct leading bytes, so I compressed each code unit to 1.5 bytes (3 hex digits), using a lookup table. Crucially, 0x30 is mapped to 0 so that the ideographic spaces are just consecutive zeroes for the base85 encoding + string format.
The last part of the periodic table, where surrogate pairs start, contains rare leading bytes and was more efficient to leave as-is. The lookup table is inserted before this part, and a regular expression performs the substitutions. The lookup table starts with `88`, which does not otherwise appear in the encoded text, and its entries are separated by `A`, which does not occur past the lookup table.
Notably, this compressed string is 5 bytes shorter than the base85 Brotli string.
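The nibble scheme described above can be sketched in Python. This is illustrative only: the lookup table here is built from the input text itself rather than taken from the answer, relying on the fact that 0x30 (the high byte of U+3000, the ideographic space) sorts below the high bytes of CJK characters:

```python
# Each UTF-16 code unit -> 3 hex digits: one nibble indexing the high
# byte in a <=16-entry table, plus two hex digits for the low byte.
def compress(text):
    units = [ord(c) for c in text]
    table = sorted({u >> 8 for u in units})
    assert len(table) <= 16
    # 0x30 sorts first among CJK text, so U+3000 (0x30 0x00) compresses
    # to "000" and runs of spaces become runs of zero digits
    return table, "".join(f"{table.index(u >> 8):x}{u & 0xff:02x}" for u in units)

table, digits = compress("\u3000\u3000\u6c22")  # two ideographic spaces + 氢
```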
Possible areas for improvement
- Encoding: The sqlite3 CLI documentation led me to believe that UTF-16 would be the default console output on Windows, but I was not able to get that to work myself, so the `PRAGMA` was necessary. Additionally, the utf16 pragma defaults to native endianness. Big endian might shave one byte, as it resulted in one fewer `%` that needed to be escaped in the surrogate part.
- Based on the pragma, the library import and `regexp_replace` not being that short, I started to wonder whether `perl` or some `sqlite3` + `perl` combination might achieve a lower byte count with my method. Feel free to steal this idea; I might not get to it for a bit!
Python 3, 433 bytes
import zlib,base64
exit(zlib.decompress(base64.b85decode('c$}q>TTTK&5Jmr4g-aD8qC8Zpx)prEpa?R9G(sRMI=ku4`C|#z3QUNV+@v0<Q#TdE=;$92!%GEMLpz6Vrfio+s~Fy5n8YxM;ktqh@zp~?=tIit>1ms?JcdOKuLk--`hPCT47o!DEpP#?B5QmHy~nljin~ceLrVNy$1tm)TioVvWCM4Kwvip~0coRMWPxt*=jZ^J;ks$`o<_YG`i5K}8GeGE;irg4w#XG~iGjZHMk!D;ImR#XW7KSaBU99FG|(klqtT7aN3@C7(H7E5!;cs;dqa+&;|$lqKcmO^3*;7G$2nY%yTFzB6U53M`mryHuYECZnpUPe(}wbLpAU96no~N;`99YP{?wXV%7QZg1=<=NJp')).decode())
Simplified Chinese
An obvious answer, but quite efficient on bytes! (The uncompressed table is 687 bytes.)
Note: the output on TIO doesn't handle the spaces and the Chinese characters properly, so it looks wonky. I am using two ASCII spaces instead of each ideographic space. When I run it in a normal terminal, it looks correct.
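For reference, the encoding side is just the reverse pipeline. A hypothetical round trip (using a two-row stand-in instead of the full 687-byte table):

```python
import zlib, base64

table = "氢  氦\n锂铍  硼碳氮氧氟氖\n"  # stand-in text, not the full table
blob = base64.b85encode(zlib.compress(table.encode(), 9))

# Decoding mirrors the answer's zlib.decompress(base64.b85decode(...))
assert zlib.decompress(base64.b85decode(blob)).decode() == table
```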
Jelly, 222 bytes
“V&ḟhẆɠWƒ©Ñ?~ŒufṪ'Æw{_ṅµ®v(Ṅọ8Ẉɦ*ṗUÑȮḄȤẒọ6⁹ṣµʠṗ⁵ȥ~’ṃ“nƬqƲⱮċɲḅẎ{‘>200ḤżƊ;"“ṾṚ^ⱮĖẈṂȥƑ]m'ḳMḞẒ0Ẏ:+"|ĊQDọlƈẉʂı4ZȦN)çƝ\LḌ÷a?⁶ḳoḲƙƘzḌ⁷ݶ1ẋXẏ%G eæ⁽ḣ<bu¬ṫzM8⁼Ƥ¹ROṗZU[ññỌ;¤ʠøṡPṛ*ḍƥẈƒẆḅ$ƭ⁹crɓ³ƒƙḄUɼḂ⁵ẹʂȦẒ‘ḅØ⁵“¤¿Œ%7W‘œṖz⁽-%Zṙ"“¢©©ÑÑ‘ỌY
A niladic Link that yields a list of characters, or a full program that prints to stdout.
How?
“...’ṃ“...‘>200ḤżƊ;"“...‘ḅØ⁵“...‘œṖz⁽-%Zṙ"“...‘ỌY - Link: no arguments
“...’ - 1100111122200011222222222222220031222222222222222001222222222222222224222222225122220124222222222222222666667867792877
ṃ“...‘ - base-decompress using [110,152,113,153,149,232,163,212,209,123] as digits [1,...,9,0]
-> #250s = [110,110,123,123,110,...,212,163,163]
Ɗ - last three links as a monad - f(#250s):
>200 - {#250s} greater than 200?
-> [0,0,0,0,0,...,1,0,0]
Ḥ - double -> #62500s
ż - zip with {#250s}
-> [[0,110],[0,110],[0,123],[0,123],[0,110],...,[2,212],[0,163],[0,163]]
;"“...‘ - zip-wise concatenate with #1s = [186,182,94,149,194,187,179,170,146,93,109,39,217,77,195,189,48,209,58,43,34,124,192,81,68,221,108,156,227,167,25,52,90,190,78,41,23,150,92,76,173,28,97,63,134,217,111,177,161,148,122,173,135,198,127,49,247,88,248,37,71,32,101,22,141,237,60,98,117,7,245,122,77,56,140,151,129,82,79,242,90,85,91,27,27,181,59,3,165,29,244,80,222,42,213,164,187,158,207,212,36,168,137,99,114,155,131,158,161,172,85,166,191,133,214,167,190,189]
ḅØ⁵ - convert {these} from base 250
“...‘œṖ - partition {that} at 1-indices [3,11,19,37,55,87]
z⁽-% - transpose with filler 12288 (Ideographic Space code-point)
Z - transpose
ṙ"“...‘ - zip-wise rotate-left by [1,6,6,16,16] (and two implicit, trailing zeros)
Ọ - cast to characters
Y - join with newline characters
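The initial ṃ (base-decompression) step can be sketched in Python, following the digit mapping described above: digits 1–9 select table entries 1–9, and 0 selects the last entry.

```python
def base_decompress(digits, table):
    # digit d -> table[d - 1], with 0 wrapping around to the last entry
    return [table[(int(d) - 1) % len(table)] for d in digits]

table = [110, 152, 113, 153, 149, 232, 163, 212, 209, 123]
assert base_decompress("11001", table) == [110, 110, 123, 123, 110]
```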
Charcoal, 275 269 bytes
⪪⭆I⪪”}““p⟦|nD9⁼KUU⊟V≕λGN¦Φ⌈∨﹪2_↔U⊖R#5⎚⮌⟧εy∧≧Vü>C\Wγ*⁴vÀ“§⦄℅JX?‖ρ⁷q⦃3∨∕ï'¹ZIl,)✂⁷◧⁵↙h″◨ρ∕GΠmHCI↧⁷D➙C№r2ïa⌕XH↙→﹪b¬κς⌊⌈R[⁵C″⊙▷↥,}~jψ≔=⁵§ς⁶¦¤hl↶⦃∕b⬤W⁹ γτT↶QM<Y⁻QJd@ 9⁴✂'ê≦↔In⊗Πⅉ⟧´A∕]«ez≧¦$⟧⌕ia6σ/↙⌈~Þ…Yr⎚↨ηü¹XZ0?V&ê¡ê\BIHyoⅉ(;H~ς⦃¦À⧴Fς⁴êy↗¿w⟧Sa@)8⬤XêY⧴℅⊕0“dξ⁷∨4◧A▶⍘yV\”⁶℅ι³²
Try it online! Link is to verbose version of code. Outputs in Traditional Chinese. Explanation:
”...” Compressed string of code points
⪪ ⁶ Split into length 6 substrings
I Cast to integer
⭆ Map over code points and join
ι Current code point
℅ Convert to Unicode
⪪ ³² Split into length 32 substrings
Implicitly print
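In Python, the same decoding pipeline might look like this (using a hypothetical three-character input rather than the real compressed string):

```python
def decode(digits, width=32):
    # split into 6-digit decimal code points, convert, wrap at `width`
    chars = "".join(chr(int(digits[i:i+6])) for i in range(0, len(digits), 6))
    return "\n".join(chars[i:i+width] for i in range(0, len(chars), width))

# 氢 (U+6C22 = 27682), ideographic space (12288), 氦 (U+6C26 = 27686)
assert decode("027682012288027686") == "氢\u3000氦"
```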
Edit: Saved 6 bytes by removing the spaces from my compressed string. Notes on the previous 275-byte version: Simplified Chinese is 9 bytes longer. I can print null bytes and then set the background to the ideographic space, but that doesn't save any bytes. I even tried printing vertically (because that way I don't have to print the null bytes at all), but that was actually 10 bytes longer. I also tried encoding the differences between successive characters, but that was 26 bytes longer.