Programming@programming.dev · 20 hours ago

4.5 Million (Suspected) Fake Stars in GitHub: A Growing Spiral of Popularity Contests, Scams, and Malware

arxiv.org

cross-posted to:
technology@beehaw.org

102

4.5 Million (Suspected) Fake Stars in GitHub: A Growing Spiral of Popularity Contests, Scams, and Malware

arxiv.org

mox@lemmy.sdf.org to

Programming@programming.dev · 20 hours ago

cross-posted to:
technology@beehaw.org

GitHub, the de-facto platform for open-source software development, provides a set of social-media-like features to signal high-quality repositories. Among them, the star count is the most widely used popularity signal, but it is also at risk of being artificially inflated (i.e., faked), decreasing its value as a decision-making signal and posing a security risk to all GitHub users. In this paper, we present a systematic, global, and longitudinal measurement study of fake stars in GitHub. To this end, we build StarScout, a scalable tool able to detect anomalous starring behaviors (i.e., low activity and lockstep) across the entire GitHub metadata. Analyzing the data collected using StarScout, we find that: (1) fake-star-related activities have rapidly surged since 2024; (2) the user profile characteristics of fake stargazers are not distinct from average GitHub users, but many of them have highly abnormal activity patterns; (3) the majority of fake stars are used to promote short-lived malware repositories masquerading as pirating software, game cheats, or cryptocurrency bots; (4) some repositories may have acquired fake stars for growth hacking, but fake stars only have a promotion effect in the short term (i.e., less than two months) and become a burden in the long term. Our study has implications for platform moderators, open-source practitioners, and supply chain security researchers.

cross-posted from: https://beehaw.org/post/17683690

Archived version

Download study (pdf)

GitHub, the de-facto platform for open-source software development, provides a set of social-media-like features to signal high-quality repositories. Among them, the star count is the most widely used popularity signal, but it is also at risk of being artificially inflated (i.e., faked), decreasing its value as a decision-making signal and posing a security risk to all GitHub users.

A recent paper by Cornell University published on Arxiv, the researchers present a systematic, global, and longitudinal measurement study of fake stars in GitHub: StarScout, a scalable tool able to detect anomalous starring behaviors (i.e., low activity and lockstep) across the entire GitHub metadata.

Analyzing the data collected using StarScout, they find that:

(1) fake-star-related activities have rapidly surged since 2024

(2) the user profile characteristics of fake stargazers are not distinct from average GitHub users, but many of them have highly abnormal activity patterns

(3) the majority of fake stars are used to promote short-lived malware repositories masquerading as pirating software, game cheats, or cryptocurrency bots

(4) some repositories may have acquired fake stars for growth hacking, but fake stars only have a promotion effect in the short term (i.e., less than two months) and become a burden in the long term.

The study has implications for platform moderators, open-source practitioners, and supply chain security researchers.

Chat

catch22@programming.dev
link
fedilink
arrow-up
3·
edit-2
6 hours ago
No big surprise here, stars/star reviews are in general completely worthless. I don’t really even bother with them anymore.

Programming@programming.dev

programming@programming.dev

Create a post

You are not logged in. However you can subscribe from another Fediverse account, for example Lemmy or Mastodon. To do this, paste the following into the search field of your instance: !programming@programming.dev

Welcome to the main community in programming.dev! Feel free to post anything relating to programming here!

Cross posting is strongly encouraged in the instance. If you feel your post or another person’s post makes sense in another community cross post into it.

Hope you enjoy the instance!

Rules

Follow the programming.dev instance rules
Keep content related to programming in some way
If you’re posting long videos try to add in some form of tldr for those who don’t want to watch videos

Wormhole

Follow the wormhole through a path of communities !webdev@programming.dev

Visibility: Public

This community can be federated to other instances and be posted/commented in by their users.

430 users / day
1.14K users / week
2.33K users / month
9.05K users / 6 months
1 local subscriber
17.7K subscribers
1.35K Posts
19.4K Comments
Modlog