AustralianSimon@lemmy.world to Selfhosted@lemmy.world · English · 6 months ago

FOSS webscraper

GitHub - jaypyles/Scraperr: Self-hosted webscraper.
Self-hosted webscraper. Contribute to jaypyles/Scraperr development by creating an account on GitHub.

Not OP. This was posted to r/selfhosted on Reddit and might be useful to some.

Original post - https://www.reddit.com/r/selfhosted/comments/1glf06d/comment/lw1e4zd/

  • MaggiWuerze@feddit.org
    ↑ 10 · English · 6 months ago

    Scraperr is a self-hosted web application that allows users to scrape data from web pages by specifying elements via XPath. Users can submit URLs and the corresponding elements to be scraped, and the results will be displayed in a table.
    From the table, users can download an Excel sheet of the job’s results, along with an option to rerun the job.
    View the docs.

    • GravitySpoiled@lemmy.ml
      ↑ 4 · English · 6 months ago

      An excel sheet? …

      • MaggiWuerze@feddit.org
        ↑ 2 · English · 6 months ago

        🤷‍♂️

      • ChapulinColorado@lemmy.world
        ↑ 1 · English · 6 months ago

        Maybe it’s just a CSV?

  • t҉̠̙ǵ̣̞̄ͪ͜x̸̱͚̳ͫ͐̑̈ͯͣ̚n̒͌҉͉̦̜̝ͅ@lemmy.tgxn.net
    ↑ 5 · English · 6 months ago

    The project is here: https://github.com/jaypyles/Scraperr

  • Blxter@lemmy.zip
    ↑ 1 · English · 6 months ago

    Yes, looks very interesting.
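
For anyone wondering what "specifying elements via XPath" and exporting the results looks like in practice, here is a rough sketch of the general idea. It is not Scraperr's code and makes several assumptions: the page is a hypothetical, well-formed inline snippet rather than something fetched over HTTP, the `scrape` and `to_csv` helpers are made-up names, and it uses Python's standard-library `xml.etree.ElementTree`, which only supports a limited XPath subset. It also writes CSV rather than a real Excel file, which ties in with the CSV guess above.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page. A real scraper would download HTML
# and would need a more forgiving parser for messy markup.
SAMPLE_PAGE = """<html><body>
<h1>Products</h1>
<ul>
<li class="item"><span class="name">Widget</span><span class="price">9.99</span></li>
<li class="item"><span class="name">Gadget</span><span class="price">19.50</span></li>
</ul>
</body></html>"""


def scrape(page: str, elements: dict[str, str]) -> list[dict[str, str]]:
    """Return one row per node matched by each named XPath expression."""
    root = ET.fromstring(page)
    rows = []
    for name, xpath in elements.items():
        # ElementTree understands only a subset of XPath 1.0.
        for node in root.findall(xpath):
            rows.append({
                "element": name,
                "xpath": xpath,
                "text": (node.text or "").strip(),
            })
    return rows


def to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize the result table as CSV text, which spreadsheet apps can open."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["element", "xpath", "text"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    job = {
        "name": ".//span[@class='name']",
        "price": ".//span[@class='price']",
    }
    print(to_csv(scrape(SAMPLE_PAGE, job)))
```

Rerunning a job in this sketch just means calling `scrape` again with the same page source and element mapping.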
