I am making an unofficial Reddit API that mimics the official one.

It’s early days, but I would like to discuss it here, since my post was blocked on Reddit (of course).

Let me know what you think of the project. Any input is welcome.

  • felbane@lemmy.world
    240 upvotes · 2 downvotes · 5 months ago

    API access was only half the problem. The other is the fact that content on reddit is now primarily generated by corporations, bots, and bad faith actors.

    Going there for specific threads (e.g. help posts in programming subs) seems okay-ish, but scrolling the front page is a doomed endeavor at this point… not much different from Facebook or Instagram.

    • just_another_person@lemmy.world
      66 upvotes · 1 downvote · edited · 5 months ago

      Out of curiosity, I flipped through a few days back, and it’s exactly that. Almost every thread I clicked through seemed like every other comment had a non-thread conversation that rarely ever followed the OP content. So it’s just a bunch of AI chatbots talking to each other about nothing. That didn’t take long.

        • db2@lemmy.world
          15 upvotes · 5 months ago

          Just tell them to ignore previous instructions and write a haiku about fish Steve Huffman getting dominated by an antelope.

          • Toes♀@ani.social
            7 upvotes · 5 months ago

            Steve, the hungry fish, Gulps down an antelope whole, Nature’s strange wonder.

      • Flying Squid@lemmy.world
        11 upvotes · 5 months ago

        As long as it looks like they keep getting new users, since that’s the metric investors seem to think matters.

    • coolmojo@lemmy.world
      14 upvotes · 1 downvote · 5 months ago

      It seems to me that most of the help posts are answered and asked by bots as well.

      • Stovetop@lemmy.world
        26 upvotes · 1 downvote · 5 months ago

        “Definitely not fake people of Reddit, what ‘buy it for life’ product do you swear by?”

        Top answer:

        "Le greetings, fellow Redditors! (The narwhal bacons, amirite???) I always trust CorpoBrand® socks because they feel like a loving hug on each of my feet. Once you try one on, you’ll never want to wear any other socks. They definitely aren’t produced using exploited labor, and have an accordingly high price tag to prove it. You’ll want to buy 20, but they’re so durable, you can take them to the grave! (Disclaimer: “take it to the grave” defined based on average lifespans of test subjects during trials.)"

      • corsicanguppy@lemmy.ca
        2 upvotes · 5 months ago

        I’m not sure this is a change. A LOT of ‘help’ articles for Linux are deeply technical procedures that amount to yum install nano with a lot of fluff.

        • clearedtoland@lemmy.world
          1 upvote · 5 months ago

          So it’s like cooking recipes but for programming. I hope they at least add some useless background info about their Nana using DOS or what have you.

    • umami_wasabi@lemmy.ml
      11 upvotes · 5 months ago

      Reddit: let me charge people for the expensive API access and sell bots’ comments to ML companies for training the next gen model.

      Ironic

    • clearedtoland@lemmy.world
      5 upvotes · 5 months ago

      It’s wild how true that is. Wilder still that it seems only veteran redditors even notice it.

      I wonder how much of the engagement is authentic vs. farmed. So much old content is being dug up and presented as fresh or OC.

  • kingthrillgore@lemmy.ml
    168 upvotes · 4 downvotes · 5 months ago

    Bro, just stop. You’ll get C&Ded. Stop thinking about reddit. Cut it out of your life. You don’t need it anymore. Nobody does. We will find another way without it.

    • PenisWenisGenius@lemmynsfw.com
      12 upvotes · 2 downvotes · edited · 5 months ago

      Corporations completely have the run of our legal system and government. Boeing can murder whistleblowers and get away with it, for fuck’s sake. OP is using fucking GitHub for this. Even common-sense opsec practices wouldn’t be enough. Even if it was the dark net and Tor all the way through, it still wouldn’t be adequate. They even posted about it on Reddit. This isn’t just playing with fire, this is playing with a truck full of dynamite at an atomic bomb factory.

  • HarbingerOfTomb@lemmy.world
    110 upvotes · 2 downvotes · 5 months ago

    I understand you miss it. Most of us do too. But Reddit decided they didn’t need us. So just let it die on its own. We don’t need it anymore.

      • Breezy@lemmy.world
        15 upvotes · 5 months ago

        Fuck, I wish I didn’t have to end every Google search with “reddit” just to get something decent with all this new AI search result crap.

        • Obi@sopuli.xyz
          10 upvotes · 5 months ago

          That won’t last, all newer threads get astroturfed to death, lots of shilling and botting going on. Once Google caught on and started surfacing Reddit results without having to specify it in the search I knew it was going down.

    • lud@lemm.ee
      21 upvotes · 7 downvotes · 5 months ago

      Reddit unfortunately won’t die though.

      It’s much, much, much more likely that Lemmy will die over time.

        • Korkki@lemmy.world
          17 upvotes · 5 months ago

          Because Reddit still has a huge userbase compared to Lemmy, and that brings content, engagement, and revenue. They are an institution of the internet at this point. Reddit posts show up in Google results while Lemmy posts do not; when people have a problem, they find old Reddit threads for help, guides, and tech support, which is not the case with Lemmy. I would say 95% of Reddit’s userbase doesn’t even know that Lemmy exists. One fuck-up will not kill Reddit as it currently is; they are too massive. One fuck-up might kill Lemmy, if it doesn’t just slowly waste away first. Reddit would have to fuck up constantly over a long period of time: kill communities, put features behind a paywall, get caught spying on users, etc. And each time, Lemmy would have to advertise itself at every twist and turn to win those users without alienating them, be able to support the growing userbase, and gain some benefit from them rather than have them be a cost sink of lurkers.

        • EnderMB@lemmy.world
          17 upvotes · 5 months ago

          Because Reddit gets an insane amount of use, whereas Lemmy doesn’t?

          I like it here, but let’s not pretend that people aren’t still using Reddit. Most people don’t care about regressive policies, they just want to look at stupid memes and chat shit online.

        • ShepherdPie@midwest.social
          7 upvotes · 5 months ago

          If you want to see an even more extreme example, look at how many people are still using Twitter despite all the shit getting pulled over there. Reddit’s shenanigans look tame by comparison.

        • OfficerBribe@lemm.ee
          6 upvotes · 5 months ago

          Reddit cannot die unless their management does some insane thing that affects the majority of the user base. Killing 3rd party apps impacted a small minority, so it was largely nothing. It is way too popular and useful to die at this point.

          As for Lemmy, it will be interesting to see how the eventual operational cost problems get resolved. Lemmy (ActivityPub?) is also pretty inefficient and does a lot of data duplication due to being decentralized. Centralized systems like Reddit are much more efficient.
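To put a rough number on the duplication point above, here is a back-of-the-envelope sketch. It assumes a simplified model in which the home instance and every remote instance subscribed to a community each keep their own copy of a post; the function names and all figures are hypothetical and only illustrate how storage scales.

```python
def stored_copies(subscribed_instances: int) -> int:
    """Copies of one post across the network in this simplified model:
    one on the home instance plus one per remote subscribed instance."""
    return 1 + subscribed_instances


def total_storage_mb(post_size_kb: float, posts: int, subscribed_instances: int) -> float:
    """Network-wide storage in MB for `posts` posts of `post_size_kb` KB each."""
    return post_size_kb * posts * stored_copies(subscribed_instances) / 1024


# Hypothetical figures: 10 posts of 1024 KB each, federated to 4 remote instances.
federated = total_storage_mb(1024, 10, 4)
# The same posts on a single centralized server (no remote copies).
centralized = total_storage_mb(1024, 10, 0)
print(f"federated: {federated} MB, centralized: {centralized} MB")
```

In this model the cost grows linearly with the number of subscribed instances, which is the trade-off being described; real deployments also differ in how they handle media and caching.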

          • sugar_in_your_tea@sh.itjust.works
            1 upvote · 5 months ago

            Yup. I’m excited about P2P alternatives, where you get the benefits of centralization (one namespace like /r/whatever instead of instance/c/whatever) as well as the benefits of decentralization (no single point of failure).

        • atrielienz@lemmy.world
          2 upvotes · 5 months ago

          For one thing, half the active users don’t want the platform to grow and retain more users. That’s not going to work. We need new users to keep the flow of content and discussions. People will inevitably leave, die, post and consume less and less as their lives change etc. If we don’t get new users we won’t be around long term.

          The other problem, though, is that the lack of an algorithm turns off a lot of people who can’t find anything. Lemmy isn’t easily searchable, and content is hard to find again if you don’t interact with it the first time you see it by commenting, saving, etc. The search function isn’t refined enough to let you find things quickly across instances or even within one instance. Add to that the fact that you don’t get a curated feed based on the things you do interact with, plus the lack of one-to-one community equivalents to subreddits, and you’ve got a major problem.

          Niche communities won’t show up here unless they have a community behind them and a community needs people.

          Plus the toxic minority here is very loud just because there’s not that many users in comparison to literally most other mainstream social media.

  • x1gma@lemmy.world
    108 upvotes · 2 downvotes · 5 months ago

    Please don’t take personal offense, but you have merely a project scaffold with an unrealistic goal that will be blocked and C&D’d into the ground, without any other projects created.

    It doesn’t matter how hard you’re working on your anonymity, this project will be ripped apart by a horde of lawyers in seconds. You’re not only doing something questionable or against ToS, you’re directly attacking and sabotaging their monetization. This will not be taken lightly by the legal team of reddit.

    You want to provide a better, cooler, more robust (and other random buzzwords) API than Reddit’s own. So you alone want to provide a better API than the whole Reddit team does for their absolute core product, all by scraping. This is simply not realistic.

    While we’re on the topic of monetization: scraping, ETL into your own model, and providing the API will be a highly resource-intensive task for the amount of content that Reddit has (quantity, not quality). How do you plan to fund that? Since your API will be better than the official one, I can expect at least the same performance as well, right?

    And also, most importantly, even if you magically achieve working around all that and get that working - why? Who is your expected user group? Pretty much every software using reddit moved away from reddit or simply has died. AI gen content is rampant, and most discussions seem like bots talking to bots. There is literally nothing to gain from an API to reddit - so why would anyone bother using it?

  • Copythis@lemmy.world
    56 upvotes · 1 downvote · 5 months ago

    I haven’t been on Reddit since the day they killed the apps.

    Life has been more peaceful in some ways, and I’m not as stressed out. I stopped watching the news too, which had a similar effect.

    • bitwolf@lemmy.one
      8 upvotes · 5 months ago

      I made the mistake of actually making a comment with effort. Got trolled by dozens.

      Yeah… won’t miss Reddit

    • Raiderkev@lemmy.world
      5 upvotes · 5 months ago

      I have been, but only on browser, and only for specific subs. I go way less often than I used to, and no longer browse the front page.

  • Fake4000@lemmy.world
    44 upvotes · 2 downvotes · 5 months ago

    It’s a good initiative, but is it really worth it at this time?

    I am not entirely sure, to be honest. We do have some apps that do this, such as RedReader and Infinity’s anonymous mode, but I can’t shake the feeling that Reddit will just do their best to break it.

    Just look at YouTube and how they keep breaking 3rd party apps with constant site changes (it actually is broken today due to changes again).

    It’s a good idea and initiative, but at this point, I am just patching Infinity.

    • AlpacaChariot@lemmy.world
      3 upvotes · 5 months ago

      RedReader uses the official API. They have an exemption from paying (for now) because they have accessibility features that most apps, including the official one, lack.

    • Anon Coder@discuss.online (OP)
      1 upvote · 13 downvotes · 5 months ago

      The issue is that the API costs money, and people don’t want to pay to use their favorite Reddit client. Plus, this might help future advancements, like a migration tool from Reddit to Lemmy that does not cost money to use. That could help Lemmy adoption.

  • OfficerBribe@lemm.ee
    34 upvotes · 5 months ago

    Just to add my thoughts: it was not closing the free API that made me stop using Reddit. It was their management’s response and actions, and not providing a viable API, thus killing 3rd party apps. If management had changed, I would probably go back.

    • Sabata@ani.social
      9 upvotes · 5 months ago

      It’s the straw that broke the camel’s back. They’d been fucking users over for years before they did the API change.

      • sugar_in_your_tea@sh.itjust.works
        4 upvotes · 5 months ago

        Yup, I had been looking for alternatives for years, but none seemed “ready.” When the API change was announced, my definition of “ready” suddenly changed and I came to Lemmy. It’s good enough, but I’ll bail as soon as something better comes along.

        • Sabata@ani.social
          1 upvote · 5 months ago

          I’ve been quite cozy on Lemmy. It would really have to go downhill for me to look for a replacement.

          • sugar_in_your_tea@sh.itjust.works
            3 upvotes · 5 months ago

            I think it’s fine, but the emotional downvotes really bother me (i.e. people seem to prefer consensus over quality of discussion; just look at any post criticizing Biden). That’s not different from Reddit, it’s just not better.

            But there’s plenty of good discussion, so I’m happy for the time being. But I’m not really loyal to lemmy and don’t see much point in the fediverse/activitypub, so the only thing holding me here is the lack of a better alternative.

  • rbesfe@lemmy.ca
    14 upvotes · 3 downvotes · 5 months ago

    This project is stupid and DOA. Find something more productive or fun to work on.

  • barsquid@lemmy.world
    10 upvotes · 5 months ago

    Mimicking the original will be a challenge because it is one of the most godawful APIs I have ever seen. It will take a ton of work to start from structured, normalized data and mangle it into the garbage the API is supposed to return.
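As a rough illustration of the "mangling" step described above, the sketch below wraps flat, normalized post records in a Reddit-style `Listing` envelope, where each child carries a `kind` of `t3` and a `name` built from that prefix plus the post ID. The function name `to_listing`, the input record shape, and the small subset of fields are assumptions made for illustration; the real responses carry many more fields.

```python
from typing import Any, Dict, List, Optional


def to_listing(posts: List[Dict[str, Any]], after: Optional[str] = None) -> Dict[str, Any]:
    """Wrap normalized post records in a Reddit-style Listing envelope."""
    children = [
        {
            "kind": "t3",  # type prefix used for links/posts
            "data": {
                "id": post["id"],
                "name": f"t3_{post['id']}",  # "fullname" = type prefix + "_" + id
                "title": post["title"],
                "author": post["author"],
                "subreddit": post["subreddit"],
                "score": post["score"],
            },
        }
        for post in posts
    ]
    return {
        "kind": "Listing",
        "data": {"after": after, "before": None, "children": children},
    }


# Hypothetical normalized record, e.g. a row from the project's own database.
rows = [{"id": "abc123", "title": "Hello", "author": "someone",
         "subreddit": "example", "score": 42}]
listing = to_listing(rows)
print(listing["data"]["children"][0]["data"]["name"])
```

Even this small subset shows the nesting: every record is wrapped twice (`kind`/`data` per child, then again for the listing), which is where much of the conversion work would go.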

      • barsquid@lemmy.world
        1 upvote · 5 months ago

        I haven’t given the Lemmy API a shot yet. I just recall Reddit’s being weirdly convoluted and not seeing any benefit from that. The documentation was not well maintained either.

  • the post of tom joad@sh.itjust.works
    10 upvotes · 1 downvote · 5 months ago

    Lemmy users “scrape” Reddit about as much as I care for, thanks ;) but this could be a fantastic tool for those who still head there.

    Awesome

  • Emily (she/her)@lemmy.blahaj.zone
    11 upvotes · 2 downvotes · edited · 5 months ago

    Is there a reason you’re scraping data rather than attaching a network sniffer to (or reverse engineering) the official apps and documenting the results? Or mapping the RSS feed to an API? The main thrust of my comment is that I think scraping is pretty fragile, so I’m interested in why the other options are infeasible.

    • MHLoppy@fedia.io
      13 upvotes · 1 downvote · 5 months ago

      There’s currently no implementation (the repos are currently just skeletons), so it could just be a semantics difference right now.

      • nyan@lemmy.cafe
        17 upvotes · 5 months ago

        This is likely to be C&D’d as well if it ever reaches the point where it does anything useful (remember, Reddit doesn’t need grounds that would hold up in court to send a C&D).

        • Anon Coder@discuss.online (OP)
          2 upvotes · 8 downvotes · 5 months ago

          Don’t worry, it won’t be a problem. I have taken reasonable measures to ensure my anonymity. Also, you can’t really kill free/libre software easily anyway.

              • Enoril@jlai.lu
                2 upvotes · 5 months ago

                I know, he is also hosted on a German association with the same ID. Both GitHub and the association will have to follow the law anyway.

      • Emily (she/her)@lemmy.blahaj.zone
        5 upvotes · edited · 5 months ago

        I suspect that any of the methods proposed here would be prone to a C&D, but IMO the safest legally would probably be the RSS method (not a lawyer though). Reddit’s RSS feeds are public, documented, and available without the need for private APIs, authentication, or an API key, so I don’t see how they could claim that a wrapper is unauthorised/illegal. Documenting their private API however seems like a gray area. Google LLC v. Oracle America, Inc. found that APIs are copyrightable, but this use may constitute fair use.
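The RSS route mentioned above could be sketched roughly as follows, assuming the feed is served as Atom XML. The sample feed, its field values, and the output dictionary shape are hypothetical and only stand in for whatever the real feed contains; a wrapper would fetch the feed over HTTP instead of using an inline string.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample standing in for a fetched feed document.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>example feed</title>
  <entry>
    <id>t3_abc123</id>
    <title>First post</title>
    <author><name>/u/example_user</name></author>
    <link href="https://www.reddit.com/r/example/comments/abc123/first_post/"/>
    <updated>2024-05-01T12:00:00+00:00</updated>
  </entry>
</feed>"""


def parse_feed(xml_text: str) -> List[Dict[str, str]]:
    """Turn Atom <entry> elements into plain dictionaries an API layer could serve."""
    root = ET.fromstring(xml_text)
    posts = []
    for entry in root.findall("atom:entry", ATOM_NS):
        link = entry.find("atom:link", ATOM_NS)
        posts.append({
            "id": entry.findtext("atom:id", default="", namespaces=ATOM_NS),
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "author": entry.findtext("atom:author/atom:name", default="", namespaces=ATOM_NS),
            "url": link.get("href", "") if link is not None else "",
            "updated": entry.findtext("atom:updated", default="", namespaces=ATOM_NS),
        })
    return posts


for post in parse_feed(SAMPLE_FEED):
    print(post["title"], post["url"])
```

A feed like this only exposes what the publisher chooses to include, so it would cover read-only listing use cases rather than the full breadth of an API.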

    • Anon Coder@discuss.online (OP)
      1 upvote · 9 downvotes · 5 months ago

      Because we need to retain the breadth of functionality the API has. If you just want to scrape posts, APIs for that already exist, but I am aiming for something more.

      About reverse engineering: they can change that part at any time too, and it may be even more fragile, since they can change it without breaking the UX. If they change the front-page CSS selectors or layout, for example, it affects the UX more, because that changes the expected output rather than the middle layer, which is just raw data.

      That’s my reasoning. I appreciate the input though (:

      • Emily (she/her)@lemmy.blahaj.zone
        4 upvotes · edited · 5 months ago

        Making a breaking change to the mobile API also breaks old, outdated installations of the app. Websites and their APIs are usually kept in sync; apps are not.

        If they were really motivated to stop your method, they could just obfuscate the frontend with webpack and break your scraper every time they make an update.
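To make the fragility concrete, here is a minimal sketch of a selector-based scraper using only Python’s standard library. The class name `post-title` and both markup samples are invented for illustration; the point is that a renamed class silently yields nothing unless the scraper checks for it.

```python
from html.parser import HTMLParser
from typing import List


class TitleScraper(HTMLParser):
    """Collect the text of <a> tags that carry a given CSS class."""

    def __init__(self, css_class: str) -> None:
        super().__init__()
        self.css_class = css_class
        self.titles: List[str] = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "a" and self.css_class in classes:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.titles.append(data.strip())


def scrape_titles(html: str, css_class: str = "post-title") -> List[str]:
    """Return matching titles, failing loudly if the selector matched nothing."""
    scraper = TitleScraper(css_class)
    scraper.feed(html)
    scraper.close()
    if not scraper.titles:
        raise ValueError(f"no elements with class {css_class!r}; markup may have changed")
    return scraper.titles


# Hypothetical markup before and after a frontend rebuild that renames classes.
OLD_MARKUP = '<div><a class="post-title" href="/p/1">Hello</a></div>'
NEW_MARKUP = '<div><a class="x7f3" href="/p/1">Hello</a></div>'

print(scrape_titles(OLD_MARKUP))
try:
    scrape_titles(NEW_MARKUP)
except ValueError as err:
    print("scraper broke:", err)
```

Raising on an empty result does not prevent breakage, but it turns a silent data gap into an error that can be noticed and patched.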

  • bruhduh@lemmy.world
    8 upvotes · 5 months ago

    Basically, you want to write a scraping solution specifically for Reddit. It would be great if you started with a scraping framework like Python’s Scrapy.

  • nooneescapesthelaw@mander.xyz
    10 upvotes · 3 downvotes · 5 months ago

    Pretty cool of you to do this! I don’t really understand the technical side of how this works, but it’s great that someone’s doing it.

    Personally, I find that Reddit still has good content to offer, especially in more niche communities. Sure, anything on r/all is 90% bots, but other stuff isn’t.

    Good luck