They built a model from accounts that willingly linked their Hacker News profiles to their LinkedIn profiles (n ≈ 990).
They could “deanonymise” about 67% of those accounts from that n=990 candidate pool (alpha = 0.1) using their model (they already knew who the account owners were; otherwise, how could they verify a correct match?).
When they threw in a bunch of accounts that had nothing to do with those first accounts (89k total accounts), accuracy dropped to around 45%-55%, depending on the choice of technique.
First thing: those HN accounts they trained on weren’t trying to be anonymous. They linked to their LinkedIn profiles. So, lie on the internet, I guess.
This is just a starting point anyway, and it's cheap and fast. That’s what to worry about: $1-$4 per account you’re trying to doxx like this.
This headline sucks.
Just an interesting paper.