So, Lemmy is sometimes missing content. I don't regret switching from Reddit to Lemmy but, especially for niche communities, the content isn't always here.
My idea to fix this is a Fediverse-based content relay named Relly.
Relly allows you to select RSS feeds, Mastodon users, Mastodon hashtags and Mastodon instances (so, the top posts on that instance) as sources for content, and post them to your favourite Lemmy community.
There are several features which make Relly better and help keep it from being spammy:
- Limits for a source (example: only up to 5 posts a day from this RSS feed)
- Limits for a community (example: only up to 5 posts a day to !archlinux)
- Global limits (example: only up to 10 posts made each day)
- Opt-out for servers & communities (instance and community moderators will be able to ask to be put in the UNLIST, which blocks Relly on their instance/community by default; this isn't an anti-spam measure so much as a way to keep regular users from using Relly in a malicious and spammy way)
- Order posts (so, if I have 10 RSS posts and 10 Mastodon posts and a global limit of 15 posts, I can either have the 10 RSS posts and the 5 most upvoted Mastodon posts, or some RSS posts and some Mastodon posts [always the most upvoted])
- Multiple communities (post the same content to different communities, or set up a fraction [ex. 50%], so that each post has a certain chance of being posted to a certain community)
- Dynamic limits: You can set an objective of active users/posts made in the last 24 hours, so that the limits (either for a specific source, a specific community or globally) are reduced as the community gets closer to it. Example: if you set an objective of 50 posts, and 25 are made, the limits of Relly will be 50% of what they were originally set to be; this allows Relly to completely stop posting on a community if the objective was already reached.
- Do not repeat: before posting a link, Relly checks if it was already posted in the community in a specific time period (by default, 48 hours)
- Modularity: new post sources and post outputs can be implemented (an example could be an e-mail output, so that you can run Relly locally and receive an e-mail every day with your favourite news)
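To make the limits a bit more concrete, here is a rough Rust sketch of how the per-source, per-community and global limits could combine with the dynamic limits. Nothing here exists yet: `Limits`, `Counts`, `dynamic_limit` and `may_post` are names made up for illustration, and I'm assuming the limit shrinks linearly with the share of the objective that is still missing (rounded down), which matches the 50 posts / 25 made / 50% example above. For simplicity the same objective is applied to all three levels.

```rust
/// Hypothetical daily limits; `None` means "no limit at this level".
#[derive(Debug, Clone, Copy)]
pub struct Limits {
    pub per_source: Option<u32>,
    pub per_community: Option<u32>,
    pub global: Option<u32>,
}

/// Posts Relly has already made today at each level (hypothetical bookkeeping).
#[derive(Debug, Clone, Copy)]
pub struct Counts {
    pub source: u32,
    pub community: u32,
    pub global: u32,
}

/// Scale a configured limit by the share of the objective that is still missing.
/// Assumption: linear scaling, rounded down. An objective of 0 means "not set".
pub fn dynamic_limit(base: u32, objective: u32, recent_posts: u32) -> u32 {
    if objective == 0 {
        return base; // no objective configured: keep the limit as it was set
    }
    // How many posts the community still needs to reach the objective (never negative).
    let missing = objective.saturating_sub(recent_posts);
    // base * missing / objective, done in u64 so the multiplication cannot overflow.
    ((base as u64 * missing as u64) / objective as u64) as u32
}

/// A post is allowed only if every configured level is still under its scaled limit.
pub fn may_post(limits: Limits, counts: Counts, objective: u32, recent_posts: u32) -> bool {
    let under = |limit: Option<u32>, used: u32| match limit {
        Some(base) => used < dynamic_limit(base, objective, recent_posts),
        None => true,
    };
    under(limits.per_source, counts.source)
        && under(limits.per_community, counts.community)
        && under(limits.global, counts.global)
}
```

With an objective of 50 and 25 posts already made, `dynamic_limit(10, 50, 25)` gives 5, and once the objective is reached it gives 0, so Relly stops posting there. Rounding down is a deliberate choice in this sketch: when in doubt, Relly posts less.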
Relly is designed to be used by moderators of communities, but users can also use it. A user should always ask the moderator if it is OK to use it. A moderator should always ask the admins if it is OK to use it. Moderators, if they are the ones using it, should also make the list of sources public, and allow the community to discuss possible edits to the list. The admins should note in the sidebar whether Relly is OK to use for moderators of communities.
At the moment, Relly is just the idea that I presented here; I want to hear the community's feedback, and if the community is OK with this project being made, I will start working on it (I will make it in Rust and release under the MIT License).
Is scraping reddit's HTML without using an API doable? I'm not sure if the reddit RSS feed has any notion of upvotes/popularity.
I should check, but if I remember correctly, I had some subreddits that I read on newsboat using some kind of option in the RSS link in order to get the top posts (something like ?top=24hrs or like that).
I had to enter reddit (eeewww..) but I found it: https://www.reddit.com/r/rss/comments/e3mx1j/how_to_get_rss_feed_of_a_subreddit_with_top_posts/ Check the first comment.
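For what it's worth, here is a small Rust sketch of how a Relly source could build such a feed link. The URL pattern (`/top/.rss` plus a `t` time-window parameter) is my assumption about how Reddit's listing feeds look, not something copied from the linked comment, so double-check it there before relying on it. `TopWindow` and `top_feed_url` are made-up names.

```rust
/// Time windows for a subreddit's "top" listing.
/// Assumption: these map to the values of Reddit's `t` query parameter.
#[derive(Debug, Clone, Copy)]
pub enum TopWindow {
    Hour,
    Day,
    Week,
    Month,
    Year,
    All,
}

impl TopWindow {
    /// The string used in the query part of the URL.
    fn as_str(self) -> &'static str {
        match self {
            TopWindow::Hour => "hour",
            TopWindow::Day => "day",
            TopWindow::Week => "week",
            TopWindow::Month => "month",
            TopWindow::Year => "year",
            TopWindow::All => "all",
        }
    }
}

/// Build the RSS link for the top posts of a subreddit (assumed URL pattern).
/// `subreddit` is the bare name, without the "r/" prefix.
pub fn top_feed_url(subreddit: &str, window: TopWindow) -> String {
    format!(
        "https://www.reddit.com/r/{}/top/.rss?t={}",
        subreddit,
        window.as_str()
    )
}
```

So `top_feed_url("archlinux", TopWindow::Day)` would give a link that can be dropped into newsboat or used as a Relly RSS source, with the popularity sorting done on Reddit's side.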