mandy archivist bot - https://notabug.org/mandy/about/wiki/

Mandy

More information: About Mandy or the oversized blog post. Read those for background.

This README is currently under construction. I haven't even seen it rendered with a proper markdown viewer yet. Suggestions welcome.

Dependencies:

Hard:

  • python3
  • python3-pip
    • beautifulsoup4
    • internetarchive
    • tubeup
    • youtube-dl (installed and updated through pip3 in mandy.sh; the pip version is used so that an obsolete version does not cause errors)
  • sh (should be compatible with your shell, tested with Bash)

Soft:

  • cron, used to schedule regular runs
  • git, used to publish lists of archived media
  • mutt, used to send error reports to you via email
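
A quick way to see which of these tools are missing is sketched below. The function name missing_commands is made up for this example and is not part of the repo. It only checks for commands on PATH.

```shell
# Print every given command that is not on PATH, one per line.
# Returns non-zero if at least one command is missing.
missing_commands() {
    rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `missing_commands python3 pip3 ia tubeup youtube-dl` covers the hard dependencies that ship a command, and `missing_commands crontab git mutt` covers the soft ones. beautifulsoup4 has no command of its own, so check it with `python3 -c 'import bs4'` instead.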

Setup:

Clone repo

Set up git, if you haven't already:

git config --global user.email "you@example.com"

git config --global user.name "Your Name"

Run setup.sh. This generates the expected subdirectories and files.

Clone or create a git repo in mandy/common/public_lists containing:

  • errorList

  • geoblockList

  • ignoredList

  • successList

  • unavailableList
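
If you are creating the repo from scratch, something like the following sketch should do. The function name init_public_lists is hypothetical. It assumes the five lists are plain files that may start out empty, and it skips the git step when git (a soft dependency) is not installed.

```shell
# Create the five list files in the given directory and make it a git repo.
# Existing files and an existing repository are left untouched.
init_public_lists() {
    lists_dir=$1
    mkdir -p "$lists_dir" || return 1
    for list_name in errorList geoblockList ignoredList successList unavailableList; do
        # Only create a list that is absent; never truncate existing data.
        [ -e "$lists_dir/$list_name" ] || : > "$lists_dir/$list_name"
    done
    if [ -d "$lists_dir/.git" ]; then
        return 0
    fi
    if command -v git >/dev/null 2>&1; then
        git -C "$lists_dir" init -q
    else
        echo "git not found, skipping repository setup" >&2
    fi
}
```

Run it from the directory that contains the clone as `init_public_lists mandy/common/public_lists`, then add a remote of your own.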

This repo is intended to be published. My repo pushes to https://notabug.org/mandy/lists. If you intend to merge with my list, let me know first!

Make sure a script can commit and push automatically; this will probably require setting up SSH.
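
To test unattended pushing, a sketch along these lines may help. It is not the code mandy.sh uses. The function name publish_lists, the commit message and the remote name origin are all assumptions.

```shell
# Stage everything in the repo, commit if anything changed, then push.
publish_lists() {
    publish_repo=$1
    git -C "$publish_repo" add -A || return 1
    # An empty index diff means there is nothing new to publish.
    if git -C "$publish_repo" diff --cached --quiet; then
        return 0
    fi
    git -C "$publish_repo" commit -q -m "Update lists" || return 1
    # BatchMode makes ssh fail instead of prompting, so a cron run never hangs.
    GIT_SSH_COMMAND="ssh -o BatchMode=yes" git -C "$publish_repo" push -q origin HEAD
}
```

If `publish_lists mandy/common/public_lists` completes without asking for a password or passphrase, a script should be able to do the same.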

Either install and configure mutt or edit common/report/reportSender.sh to use your own email client. Alternatively, you can rip out the report sender feature.

Change the email address in common/report/reportSender.sh to one of your own.

The setup script assumes pip3 exists. If needed, replace pip3 with pip if pip defaults to Python 3. Make sure youtube-dl is added to PATH.

pip3 install --user youtube-dl

Install and configure the internetarchive Python module (requires an archive.org account):

pip3 install --user internetarchive

ia configure

Install and patch the TubeUp Python module. You may wish to modify and/or update the patches; diff could be useful:

pip3 install --user tubeup

patch -u [location of TubeUp.py] -i TubeUp.patch # could be in ~/.local/lib/python#.#/site-packages/tubeup

patch -u [location of __main__.py] -i __main__.patch
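
If you are not sure where pip put the files, Python can tell you. The helper below is a sketch, and the name python_package_dir is made up for this example. It assumes TubeUp.py and __main__.py both sit in the tubeup package directory, as the path hint above suggests.

```shell
# Print the directory of an installed Python package without importing it.
# Exits non-zero and prints nothing when the package cannot be found.
python_package_dir() {
    python3 -c '
import importlib.util
import sys

spec = importlib.util.find_spec(sys.argv[1])
if spec is None or not spec.submodule_search_locations:
    sys.exit(1)
print(list(spec.submodule_search_locations)[0])
' "$1"
}
```

With that, `tubeup_dir=$(python_package_dir tubeup)` gives the directory, and the patches can be applied to `"$tubeup_dir/TubeUp.py"` and `"$tubeup_dir/__main__.py"`.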

Install Beautiful Soup 4

pip3 install --user beautifulsoup4

In tubeFeeder.py, edit the lines that check whether "Mandy, Defeater of Death" has uploaded the video. They are simply a hack that lets me easily fix corrupted uploads. Trying to reupload a video already uploaded by me will result in a permissions error.

Run mandy.sh (warning: this runs apt-get update)

This is intended to be later run as a daily cron job in order to:

  • safely merge and sort master lists into the public list repo

  • commit changes to the public list repo

  • update dependencies

  • compile and send a health report via email
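
To give an idea of what a safe merge can look like, here is a minimal sketch. It is not the code in mandy.sh. It assumes each list holds one entry per line, and the function name merge_list is hypothetical. The merged result is written to a temporary file first, so a failed sort never leaves a half-written public list.

```shell
# Merge a master list into a public list, sorted and without duplicates.
merge_list() {
    merge_master=$1
    merge_public=$2
    # A missing public list is treated as empty.
    [ -e "$merge_public" ] || : > "$merge_public"
    merge_tmp=$(mktemp "$merge_public.XXXXXX") || return 1
    if LC_ALL=C sort -u "$merge_master" "$merge_public" > "$merge_tmp"; then
        # Replace the public list only after the merge succeeded.
        mv "$merge_tmp" "$merge_public"
    else
        rm -f "$merge_tmp"
        return 1
    fi
}
```

LC_ALL=C keeps the sort order identical on every machine, which avoids spurious diffs in the public repo.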

Set up a cron job to regularly change directory to /mandy and run mandy.sh.
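
A crontab line for this could be generated as sketched below. The 04:00 run time and the function name cron_entry are assumptions; pick whatever schedule suits you. The path must not contain spaces or % characters, since cron treats % specially.

```shell
# Print a crontab line that runs mandy.sh daily at 04:00 from the given directory.
cron_entry() {
    printf '0 4 * * * cd %s && sh mandy.sh\n' "$1"
}
```

To append it to your existing crontab, run `(crontab -l 2>/dev/null; cron_entry /mandy) | crontab -`.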

Launch bot

For each site you want to archive, execute the manager.sh script in its own directory.
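
Launching several sites at once could look like the sketch below. The function name launch_sites is hypothetical, and it is an assumption that the managers can run side by side. Each one is started from inside its own directory, as described above.

```shell
# Start manager.sh for every named site directory under a base directory.
launch_sites() {
    launch_base=$1
    shift
    for launch_site in "$@"; do
        if [ ! -f "$launch_base/$launch_site/manager.sh" ]; then
            echo "no manager.sh in $launch_base/$launch_site" >&2
            continue
        fi
        # The subshell keeps each cd local; & lets the managers run in parallel.
        ( cd "$launch_base/$launch_site" && exec sh manager.sh ) &
    done
    # Stay in the foreground until every manager has exited.
    wait
}
```

For example, `launch_sites /mandy lma lmw` would start the managers for lma and lmw, assuming those directories each contain a manager.sh.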