g | x | w | all
Bytes Lang Time Link
044AWK250331T174806Zxrs
nan230613T002214Z138 Aspe
nan230613T002732Z138 Aspe
072Python220205T135146ZOliver
033C171213T235720Zuser2301
nan120110T214352ZPaul R
nan141227T224236Zpgy
nan141226T224556ZAmigoNic
916Haskell120121T144512ZJeff Bur
nan120110T062056Zuser unk
nan120110T011820Zgnibbler
nan120109T235300Zratchet

AWK, 44 bytes

{for(;$1;$1=rshift($1,1))and($1,1)&&x++}$0=x

Attempt This Online!

Rust

Rust has its built-in popcount function: (num as u128).count_ones();

It's also worth noting that \$\color{blue}{\text{this function operates in constant time}}\$. That is, regardless of the input value, it always takes the same amount of time to execute. This is because the number of bits in an integer is fixed, so no matter the value, the same number of bits need to be examined.

u128

Try it online!

use std::time::Instant;

fn main() {
    let numbers = vec![5, 1254, 56465];
    let answers = vec![2, 6, 8];

    for (i, &num) in numbers.iter().enumerate() {
        let start = Instant::now();
        let result = (num as u128).count_ones();
        let duration = start.elapsed();

        assert_eq!(result, answers[i], "Failed test for {}", num);
        println!("Number: {}, Ones: {}, Time taken: {:?}", num, result, duration);
    }
}

u256

/src/main.rs

use uint::construct_uint;
use std::time::Instant;

construct_uint! {
    pub struct U256(4);
}

fn count_ones(n: U256) -> u32 {
    n.as_ref().iter().map(|&b| b.count_ones()).sum()
}

fn main() {
    let numbers = vec![
        U256::from(5u64),
        U256::from(1254u64),
        U256::from(56465u64),
        U256::from(18446744073709551615u64),
        U256::from_dec_str("115792089237316195423570985008687907853269984665640564039457584007913129639935").unwrap(),  // 2^256 - 1
    ];

    let answers = vec![2, 6, 8, 64, 256];

    for (i, num) in numbers.iter().enumerate() {
        let start = Instant::now();
        let result = count_ones(num.clone());
        let duration = start.elapsed();

        assert_eq!(result, answers[i], "Failed test for {}", num);
        println!("Number: {}, Ones: {}, Time taken: {:?}", num, result, duration);
    }
}

Cargo.toml

[package]
name = "rust_hello"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
uint = "0.9"
# bitvec = "1.0.1"

Build and running

# win10 environment

$ cargo build
$ target\debug\rust_hello.exe


Number: 5, Ones: 2, Time taken: 600ns
Number: 1254, Ones: 6, Time taken: 700ns
Number: 56465, Ones: 8, Time taken: 400ns
Number: 18446744073709551615, Ones: 64, Time taken: 700ns
Number: 115792089237316195423570985008687907853269984665640564039457584007913129639935, Ones: 256, Time taken: 800ns

Mathematica

Mathematica has its built-in function DigitCount.

Try it online!

popCount[n_Integer] := DigitCount[n, 2, 1];
(ans=popCount[115792089237316195423570985008687907853269984665640564039457584007913129639935])//AbsoluteTiming//Print

Python (72 bytes)

x=bin(int(input()));n=0
for i in x:
    if i=='1':
        n+=1
print(n)

Converts str, to int, to bin, then iterates over the string to count.

C, 33 bytes

I just saw the challenge today, but thought of a recursive solution.

B(n){return (n&1)+(n?B(n/2):0);};

Edited to note that I assumed it was a smallest size challenge. The speed is linear to the number of bits required to contain the number. O(N) = k * Log2(N).

C/C++

int popcnt(int n)
{
    return __builtin_popcount(n);
}

For some CPUs __builtin_popcount compiles to a single instruction (e.g. POPCNT on x86).

Note: requires gcc or gcc-compatible compiler (e.g. ICC).

Erlang

p(N)->p(N,0).
p(0,S)->S;p(N,S)->p(N div 2, S+N rem 2).

The p function divides the number with 2, adds the remainder to S, and calls itself with N/2 (tail) recursively.

While it takes log2(N) steps (division and function call) to calculate the result, it works with arbitrary large numbers.

Here is rachet freak's submission (the 32-bit version) converted to Scala (thanks RF!):

def num1Bits(x: Int) = {
  var i = x
  i = (i & 0x55555555) + ((i>> 1) & 0x55555555)
  i = (i & 0x33333333) + ((i>> 2) & 0x33333333)
  i = (i & 0x0f0f0f0f) + ((i>> 4) & 0x0f0f0f0f)
  i = (i & 0x00ff00ff) + ((i>> 8) & 0x00ff00ff)
      (i & 0x0000ffff) + ((i>>16) & 0x0000ffff)
}

Haskell, 9 (+16)

There was a popCount routine added to the Data.Bits module for GHC v7.2.1/v7.4.1 this summer (see tickets concerning the primop and binding).

import Data.Bits
popCount

There is a good arbitrary precision popcount function in the GNU Multiple Precision Arithmetic Library (GMP), which GHC uses by default, but most other serious languages offer bindings to :

Python, 16 (+12)

import gmpy;
gmpy.popcount(...);

Perl, 11 (+22)

use GMP::Mpz qw(:all);
popcount(...);

scala:

@tailrec
def bits (i: Long, sofar: Int): Int = if (i==0) sofar else bits (i >> 1, (1 & i).toInt + sofar) 

computes on a 2Ghz single core for about 1.500.000 64bit Long values the number of bits set. A straight forward recursive method, comparing the last bit, and shifting the number towards zero. The @tailrec is just a hint to the compiler to warn me, if it can't optimize the tail recursive call - it is not necessary, for the optimization to take place.

It's about 10times faster than the simplest method, using the library function:

def f(n:Long)=n.toBinaryString.count(_=='1')

Translating them to c++ might not lead to similar results.

Using an 8-bit lookup table in the python shell

>>> tab=[0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8]
>>> N=123456789
>>> tab[N>>24]+tab[N>>16&255]+tab[N>>8&255]+tab[N&255]
16

fastest for 32 bit numbers:

int bitcount(int i){
    i=i&0x55555555 + (i>>1)&0x55555555;
    i=i&0x33333333 + (i>>2)&0x33333333;
    i=i&0x0f0f0f0f + (i>>4)&0x0f0f0f0f;
    i=i&0x00ff00ff + (i>>8)&0x00ff00ff;
    return i&0x0000ffff + (i>>16)&0x0000ffff;
}

for 64 bits:

int bitcount(long i){
    i=i&0x5555555555555555 + (i>>1)&0x5555555555555555;
    i=i&0x3333333333333333 + (i>>2)&0x3333333333333333;
    i=i&0x0f0f0f0f0f0f0f0f + (i>>4)&0x0f0f0f0f0f0f0f0f;
    i=i&0x00ff00ff00ff00ff + (i>>8)&0x00ff00ff00ff00ff;
    i=i&0x0000ffff0000ffff + (i>>16)&0x0000ffff0000ffff;
    return i&0x00000000ffffffff + (i>>32)&0x00000000ffffffff;
}

these are all for C-based languages that define the binary operators (C, C++, C#, D, java, ...) (barring the int long type, translate as needed)

for arbitrary bit sizes:

int bitcount(BigInt i){
    int c=0;
    BigInt _2_pow_64 = BigInt(2).pow(64);
    while(i>0){
       long rem = i.mod(_2_pow_64);//or AND with 0xffffffffffffffff
       c+=bitcount(rem);//as defined above
       i=i.div(_2_pow_64);//or rshift with 64
    }
    return c;
}

BigInt being the class for integers of arbitrary size, and the .div, .pow, .mod being the 'divide by', 'to the power of' and 'remainder of' functions reps.